Implementing process attachment tools#
Overview#
This topic provides the technical details needed to implement a process attachment tool similar to rocprofv3 --attach. Process attachment allows profiling tools to attach dynamically to running GPU applications without restarting them. An implementation can use either the provided Python script or the exported C functions.
Direct Python execution#
The Python file rocprof-attach can be directly called to attach to a specific Process ID (PID) and use custom tools within the attachment target.
$ rocprof-attach -p 12345 -t path/to/your-tool-library.so -d 5000
In the preceding example, rocprof-attach attaches to the process with PID 12345, and ROCprofiler-SDK loads the library path/to/your-tool-library.so from within that process. Detachment is triggered after 5000 milliseconds, and rocprof-attach exits when detachment is complete.
By default, rocprof-attach attaches to the target process and all of its descendant processes. To attach only to the specified PID, pass --attach-children=false:
$ rocprof-attach -p 12345 -t path/to/your-tool-library.so --attach-children=false
For more information, run rocprof-attach -h.
Python functions#
The Python file rocprof-attach defines an attach function that can be used for attachment:
def attach(
    pid,
    attach_tool_library,
    attach_duration_msec,
    attach_library=ROCPROF_ATTACH_LIBRARY,
    attach_children=True,
):
Function details
The attach function performs the entire attachment process, including attachment and detachment, and supports custom tools through the attach_tool_library parameter. Its control flow is deliberately simple and intended for direct calls from Python. For more complex control, use the explicit attach and detach functions exported by librocprofiler-sdk-rocattach.so instead.
Parameters
- pid: Required - PID of the process to attach to.
- attach_tool_library: Colon-delimited list of tool libraries to use.
- attach_duration_msec: Optional - Profiling duration in milliseconds.
If unspecified, attachment runs until Enter is pressed or SIGINT (Ctrl+C) is received.
- attach_library: Optional - Tool library to use for attachment and detachment.
If unspecified, defaults to the absolute path of librocprofiler-sdk-rocattach.so.
The default works for nearly all applications.
- attach_children: Optional - Specifies whether to attach to the target process and all of its descendant processes.
Defaults to True; pass False to attach only to the specified PID.
C Functions#
The C library librocprofiler-sdk-rocattach.so defines attach and detach functions that can be used for attachment:
extern "C" {
// Attach to a process and all of its descendant processes
rocattach_status_t rocattach_attach_tree(int pid) ROCATTACH_API;
// Attach to a single process only
rocattach_status_t rocattach_attach(int pid) ROCATTACH_API;
// Detach from a process and all of its descendant processes
rocattach_status_t rocattach_detach_tree(int pid) ROCATTACH_API;
// Detach from a single process (or all sessions when pid=0)
rocattach_status_t rocattach_detach(int pid) ROCATTACH_API;
}
Function Details:
- rocattach_attach_tree(int pid): Attaches to a process and all of its descendants.
Enumerates the full process tree rooted at pid via /proc before attaching.
Attachment proceeds in breadth-first order from the root.
If attachment to an individual child process fails, the error is logged and attachment continues with the remaining processes; the return status reflects the last error seen.
The process tree is snapshotted at the time of the call; processes spawned after this point are not included.
- rocattach_attach(int pid): Attaches to a single process only.
Takes the target process ID as parameter.
Doesn’t attach to child processes.
When profiling applications that spawn child processes, use rocattach_attach_tree instead.
- rocattach_detach_tree(int pid): Detaches from a process and all of its descendants.
Enumerates the process tree rooted at pid via /proc at the time of the call.
Only processes with an active attachment session are detached; others are silently skipped.
Symmetric counterpart to rocattach_attach_tree; use these two together.
Reentrant: the sessions lock is acquired and released per-process and isn’t held across the /proc traversal, so concurrent calls from multiple threads are safe.
- rocattach_detach(int pid): Detaches from a single process.
Takes the target process ID as a parameter.
Cleans up attachment resources and terminates profiling.
A PID of 0 can be specified to detach from all the current sessions.
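The tree-attachment behavior described above (breadth-first order from the root, continuing past individual failures, and returning the last error seen) can be illustrated with a small sketch. This is not the library's implementation: SketchStatus, breadth_first_order, and attach_all are hypothetical names, the process tree is passed in as an in-memory snapshot instead of being read from /proc, and the per-process attach call is supplied by the caller.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <vector>

// Hypothetical stand-in for rocattach_status_t; the real values are defined in rocattach.h.
enum class SketchStatus { Success, Error };

// Returns the PIDs of a parent -> children snapshot in breadth-first order,
// starting at root. A process tree has no cycles, so no visited set is kept.
std::vector<int> breadth_first_order(const std::map<int, std::vector<int>>& children, int root)
{
    std::vector<int> order;
    std::queue<int>  pending;
    pending.push(root);
    while (!pending.empty()) {
        const int pid = pending.front();
        pending.pop();
        order.push_back(pid);
        const auto it = children.find(pid);
        if (it == children.end()) continue;  // leaf process
        for (int child : it->second) pending.push(child);
    }
    return order;
}

// Calls attach_one for every PID in order. A failure does not stop the loop;
// the returned status is the last non-success status seen, if any.
SketchStatus attach_all(const std::vector<int>&                 order,
                        const std::function<SketchStatus(int)>& attach_one)
{
    SketchStatus result = SketchStatus::Success;
    for (int pid : order) {
        const SketchStatus status = attach_one(pid);
        if (status != SketchStatus::Success) result = status;
    }
    return result;
}
```

Because the snapshot is taken once, a process spawned after breadth_first_order returns never appears in the list, which mirrors the limitation noted for rocattach_attach_tree.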
Function call sequence#
Initial attachment sequence#
The initial attachment process roughly follows this sequence:
rocattach_attach(pid) ← Your tool calls this
ptrace calls rocprofiler_register_attach(env_buffer)
tool_library::rocprofiler_configure(…)
tool_library::rocprofiler_configure_attach(…)
tool_library::tool_init(…)
tool_library::tool_attach(…)
[Profiling and data collection…]
rocattach_detach(pid) ← Your tool calls this
ptrace calls rocprofiler_register_detach()
tool_library::tool_detach(…)
[Program ends]
tool_library::tool_fini(…)
Reattachment sequence#
For reattachment to a previously attached process:
rocattach_attach(pid) ← Your tool calls this again
ptrace calls rocprofiler_register_attach(env_buffer)
tool_library::tool_attach(…)
[Continued profiling and data collection…]
rocattach_detach(pid) ← Your tool calls this
ptrace calls rocprofiler_register_detach()
tool_library::tool_detach(…)
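The difference between the two sequences above is that the one-time setup callbacks run only on the first attachment, while tool_attach and tool_detach run on every cycle. The following sketch models that ordering for a tool author who wants to reason about where to place initialization code. AttachLifecycleModel is a hypothetical class that only records callback names; it does not call ROCprofiler-SDK, and it does not model what happens if the program ends while still attached.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the callback order shown in the sequences above.
class AttachLifecycleModel {
public:
    // Models rocattach_attach(pid): setup callbacks fire once, tool_attach every time.
    void attach()
    {
        if (attached_) return;
        if (!initialized_) {
            log_.push_back("rocprofiler_configure");
            log_.push_back("rocprofiler_configure_attach");
            log_.push_back("tool_init");
            initialized_ = true;
        }
        log_.push_back("tool_attach");
        attached_ = true;
    }

    // Models rocattach_detach(pid): only tool_detach fires.
    void detach()
    {
        if (!attached_) return;
        log_.push_back("tool_detach");
        attached_ = false;
    }

    // Models the end of the target program: tool_fini fires once.
    void program_end()
    {
        if (initialized_ && !finalized_) {
            log_.push_back("tool_fini");
            finalized_ = true;
        }
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    std::vector<std::string> log_;
    bool initialized_ = false;
    bool attached_    = false;
    bool finalized_   = false;
};
```

Calling attach, detach, attach, detach, and program_end in that order records the setup callbacks once, two attach/detach pairs, and a single tool_fini at the end.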
Environment variable configuration#
This section lists the environment variables required for process attachment.
Required variables#
The target process must have ROCP_TOOL_ATTACH=1 set, or be using a version of rocprofiler-register configured with the CMake flag ROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT=ON.
export ROCP_TOOL_ATTACH=1
OR
cmake /path/to/rocprofiler-register -DROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT=ON
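Before attaching, a tool may want to check whether the target was started with ROCP_TOOL_ATTACH=1. On Linux, one way to do this is to read /proc/<pid>/environ, which holds the NUL-separated environment the process was started with. The sketch below assumes that approach; environ_block_has and target_has_attach_env are hypothetical helper names, not part of ROCprofiler-SDK. A negative result is not conclusive, because the target may instead use a rocprofiler-register build with default attachment enabled, and reading another user's environ file requires sufficient privileges.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <sys/types.h>

// Returns true if a NUL-separated environment block (the layout of
// /proc/<pid>/environ on Linux) contains the exact entry "key=value".
bool environ_block_has(const std::string& block, const std::string& key, const std::string& value)
{
    const std::string wanted = key + "=" + value;
    std::size_t       pos    = 0;
    while (pos < block.size()) {
        std::size_t end = block.find('\0', pos);
        if (end == std::string::npos) end = block.size();
        // Compare the whole entry so "KEY=10" does not match "KEY=1".
        if (block.compare(pos, end - pos, wanted) == 0) return true;
        pos = end + 1;
    }
    return false;
}

// Hypothetical helper: checks the initial environment of the target process.
// Returns false if the file cannot be read or the entry is absent.
bool target_has_attach_env(pid_t pid)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/environ", std::ios::binary);
    if (!in) return false;
    const std::string block((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return environ_block_has(block, "ROCP_TOOL_ATTACH", "1");
}
```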
Tool library configuration#
The attachment system can use any tool library. If the ROCPROF_ATTACH_TOOL_LIBRARY environment variable is not set, librocprofiler-sdk-tool.so is used.
// Attachment libraries to be used
setenv("ROCPROF_ATTACH_TOOL_LIBRARY", "example-tool-1.so:example-tool-2.so", 1);
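When the library list is assembled at run time, the colon-delimited value can be built from a list of paths. The following is a minimal sketch; join_tool_libraries and set_tool_libraries are hypothetical helper names, and only the ROCPROF_ATTACH_TOOL_LIBRARY variable name and the colon delimiter come from this topic.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <string>
#include <vector>

// Joins library paths with ':' as expected by ROCPROF_ATTACH_TOOL_LIBRARY.
// Empty entries are skipped so the result never contains "::".
std::string join_tool_libraries(const std::vector<std::string>& libs)
{
    std::string joined;
    for (const auto& lib : libs) {
        if (lib.empty()) continue;
        if (!joined.empty()) joined += ':';
        joined += lib;
    }
    return joined;
}

// Sets the variable in the attaching process. Returns false when there is
// nothing to set (the variable is left untouched so the default tool library
// applies) or when setenv fails.
bool set_tool_libraries(const std::vector<std::string>& libs)
{
    const std::string value = join_tool_libraries(libs);
    if (value.empty()) return false;
    return setenv("ROCPROF_ATTACH_TOOL_LIBRARY", value.c_str(), 1) == 0;
}
```

Set the variable before calling the attach function so that it is in place when the attachment starts.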
Using the attachment functions#
This is a simplified example of how to use these functions in your own attachment tool:
Basic attachment implementation#
#include <rocattach.h>
#include <dlfcn.h>
#include <signal.h>     // kill
#include <sys/types.h>  // pid_t
#include <chrono>
#include <cstdint>      // uint32_t
#include <iostream>
#include <thread>

class ROCprofilerAttachmentTool {
private:
    void* attach_lib_handle = nullptr;
    rocattach_status_t (*attach_func)(int) = nullptr;
    rocattach_status_t (*detach_func)(int) = nullptr;

public:
    bool initialize() {
        // Load the attachment library
        attach_lib_handle = dlopen("librocprofiler-sdk-rocattach.so", RTLD_NOW);
        if (!attach_lib_handle) {
            std::cerr << "Failed to load librocprofiler-sdk-rocattach: " << dlerror() << std::endl;
            return false;
        }
        // Get the attachment function pointers.
        // Use rocattach_attach_tree/rocattach_detach_tree to attach to the process and all
        // its descendants, or rocattach_attach/rocattach_detach for a single process only.
        attach_func = (rocattach_status_t(*)(int))dlsym(attach_lib_handle, "rocattach_attach_tree");
        detach_func = (rocattach_status_t(*)(int))dlsym(attach_lib_handle, "rocattach_detach_tree");
        if (!attach_func || !detach_func) {
            std::cerr << "Failed to find attachment functions" << std::endl;
            return false;
        }
        return true;
    }

    bool attach_to_process(pid_t pid, uint32_t duration_ms) {
        // Validate the target process. Signal 0 sends nothing; it only checks
        // that the process exists and that this process may signal it.
        if (kill(pid, 0) != 0) {
            std::cerr << "Target process " << pid << " is not accessible" << std::endl;
            return false;
        }
        std::cout << "Attaching to process " << pid << std::endl;
        // Start attachment - this will handle all ptrace operations.
        // NOTE: the checks below treat a zero status as failure. Confirm the success
        // value of rocattach_status_t in rocattach.h and adjust the comparison if needed.
        if (!attach_func(pid)) {
            return false;
        }
        // Profile for specified duration
        std::cout << "Profiling for " << duration_ms << " milliseconds..." << std::endl;
        std::this_thread::sleep_for(std::chrono::milliseconds(duration_ms));
        // Stop profiling
        if (!detach_func(pid)) {
            return false;
        }
        std::cout << "Profiling completed" << std::endl;
        return true;
    }

    ~ROCprofilerAttachmentTool() {
        if (attach_lib_handle) {
            dlclose(attach_lib_handle);
        }
    }
};
Main implementation#
#include <sys/types.h>  // pid_t
#include <cstdint>      // uint32_t
#include <cstdlib>      // setenv
#include <iostream>
#include <string>       // std::stoi

// Assumes the ROCprofilerAttachmentTool class from the previous listing is
// defined earlier in the same file.
int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " <PID> [duration_ms]" << std::endl;
        std::cerr << "  PID: Process ID to attach to" << std::endl;
        std::cerr << "  duration_ms: Optional profiling duration in milliseconds" << std::endl;
        return 1;
    }
    // std::stoi throws on non-numeric input; this example does not handle that case.
    pid_t target_pid = std::stoi(argv[1]);
    uint32_t duration = (argc > 2) ? std::stoi(argv[2]) : 1000;

    // For this example, the tool library "librocprofiler-sdk-tool.so" is used by
    // default because ROCPROF_ATTACH_TOOL_LIBRARY is not set. These environment
    // variables are used to communicate profiling options to rocprofiler-sdk-tool.
    setenv("ROCPROF_HIP_RUNTIME_API_TRACE", "1", 1);
    setenv("ROCPROF_KERNEL_TRACE", "1", 1);
    setenv("ROCPROF_MEMORY_COPY_TRACE", "1", 1);
    setenv("ROCPROF_OUTPUT_PATH", "./attachment-output", 1);
    setenv("ROCPROF_OUTPUT_FILE_NAME", "attached_profile", 1);

    // Initialize and run attachment tool
    ROCprofilerAttachmentTool tool;
    if (!tool.initialize()) {
        std::cerr << "Failed to initialize attachment tool" << std::endl;
        return 1;
    }
    if (!tool.attach_to_process(target_pid, duration)) {
        std::cerr << "Attachment failed" << std::endl;
        return 1;
    }
    std::cout << "Attachment completed successfully" << std::endl;
    return 0;
}
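The main function above parses its arguments with std::stoi, which throws on non-numeric input and accepts trailing characters such as "123abc". A stricter alternative, sketched here under the assumption that C++17 is available, uses std::from_chars and rejects anything that is not a positive integer. parse_pid is a hypothetical helper name.

```cpp
#include <charconv>
#include <cstring>
#include <limits>
#include <optional>
#include <sys/types.h>
#include <system_error>

// Parses a strictly positive PID from text. Returns std::nullopt for empty
// input, trailing characters, non-positive values, or values that do not fit
// in pid_t. Never throws.
std::optional<pid_t> parse_pid(const char* text)
{
    long        value = 0;
    const char* end   = text + std::strlen(text);
    const auto [ptr, ec] = std::from_chars(text, end, value);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    if (value <= 0 || value > std::numeric_limits<pid_t>::max()) return std::nullopt;
    return static_cast<pid_t>(value);
}
```

In main, a std::nullopt result can be reported with the usage message instead of letting an exception terminate the tool. The helper rejects 0 on purpose: 0 is only meaningful for rocattach_detach, not as an attachment target.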