Implementing Process Attachment Tools#
Overview#
This document provides the technical details needed to implement a process attachment tool similar to rocprofv3 --attach. Process attachment allows profiling tools to attach dynamically to running GPU applications without restarting them.
The implementation uses specific exported C functions and involves low-level process manipulation using ptrace, environment variable injection, library loading, and coordination with the ROCprofiler-SDK registration system.
Exported C Functions for Attachment#
The attachment functionality provides the following exported C functions that tools can use:
ROCprofiler-Attach Functions#
These functions are exported from the rocprofiler-attach binary:
extern "C" {
// Start attachment to a target process
void attach(uint32_t pid) ROCPROFILER_EXPORT;
// Detach from target process and cleanup
void detach() ROCPROFILER_EXPORT;
}
Function Details:
``attach(uint32_t pid)``: Main entry point for starting attachment to a process
- Takes the target process ID as parameter
- Initiates the ptrace-based attachment sequence
- Spawns a background thread for ptrace operations
``detach()``: Entry point for detaching from the target process
- Cleans up attachment resources and terminates profiling
- Joins the ptrace thread and releases resources
ROCprofiler-Register Functions#
These functions are exported from the librocprofiler-register.so library and are called via ptrace:
extern "C" {
// Activate profiling in target process (called via ptrace)
rocprofiler_register_error_code_t
rocprofiler_register_attach(const char* environment_buffer, const char* tool_lib_path)
ROCPROFILER_REGISTER_PUBLIC_API;
// Deactivate profiling in target process (called via ptrace)
rocprofiler_register_error_code_t
rocprofiler_register_detach()
ROCPROFILER_REGISTER_PUBLIC_API;
// Reattach to previously attached process (experimental)
rocprofiler_register_error_code_t
rocprofiler_register_invoke_reattach()
ROCPROFILER_REGISTER_PUBLIC_API;
// Client callback functions for reattachment support
void rocprofiler_call_client_reattach(void)
ROCPROFILER_REGISTER_PUBLIC_API;
void rocprofiler_call_client_detach(void)
ROCPROFILER_REGISTER_PUBLIC_API;
}
Function Details:
``rocprofiler_register_attach(const char* environment_buffer, const char* tool_lib_path)``:
- Called via ptrace from the attachment system
- Receives serialized environment variables for profiling configuration
- Receives the tool library path to load (defaults to “librocprofiler-sdk-tool.so” if NULL)
- Loads the specified tool library and activates profiling services
- Returns rocprofiler_register_error_code_t status
``rocprofiler_register_detach()``:
- Called via ptrace to stop profiling in the target process
- Calls the tool’s detach function and cleans up resources
- Returns rocprofiler_register_error_code_t status
``rocprofiler_register_invoke_reattach()``: (EXPERIMENTAL)
- Called to reattach profiling to a previously attached process
- Invokes client reattach callbacks without full re-initialization
- Used for resuming profiling after temporary detachment
- Returns rocprofiler_register_error_code_t status
``rocprofiler_call_client_reattach()`` and ``rocprofiler_call_client_detach()``:
- C wrapper functions for client tool reattachment callbacks
- Automatically resolved and called by the registration system
- Enable tools to handle dynamic attach/detach cycles
Function Call Sequence#
Initial Attachment Sequence#
The initial attachment process follows this sequence:
Tool Implementation
|
v
attach(pid) ← Your tool calls this
|
v
Ptrace attachment & environment setup
|
v
rocprofiler_register_attach(env_buffer) ← Called via ptrace in target
|
v
Profiling active in target process
|
v
[Profiling data collection...]
|
v
rocprofiler_register_detach() ← Called via ptrace in target
|
v
detach() ← Your tool calls this
|
v
Cleanup complete
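The sequence above pairs every attach(pid) call with a later detach() call. One way to make that pairing hard to forget is a small scope guard. The following is only a sketch: `AttachSession` is a hypothetical helper, not part of ROCprofiler-SDK, and it assumes the two callables wrap the `attach` and `detach` function pointers that the tool resolves itself.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical RAII helper (not an SDK type): calls attach on construction
// and guarantees that detach runs exactly once, even on an early return.
class AttachSession {
public:
    AttachSession(std::function<void(uint32_t)> attach_fn,
                  std::function<void()> detach_fn,
                  uint32_t pid)
        : detach_fn_(std::move(detach_fn)) {
        attach_fn(pid);
        active_ = true;
    }
    AttachSession(const AttachSession&) = delete;
    AttachSession& operator=(const AttachSession&) = delete;

    // Detach early; later calls (and the destructor) become no-ops.
    void detach() {
        if (active_) {
            active_ = false;
            detach_fn_();
        }
    }

    ~AttachSession() { detach(); }

private:
    std::function<void()> detach_fn_;
    bool active_ = false;
};
```

With this guard, a tool can create the session, sleep for the profiling duration, and rely on scope exit to trigger the detach step.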
Reattachment Sequence (Experimental)#
For reattachment to a previously attached process:
Tool Implementation
|
v
attach(pid) ← Your tool calls this again
|
v
Ptrace attachment & environment setup
|
v
rocprofiler_register_attach(env_buffer) ← Detects previous attachment
|
v
rocprofiler_register_invoke_reattach() ← Calls client reattach callbacks
|
v
Profiling resumed in target process
|
v
[Continued profiling data collection...]
|
v
rocprofiler_register_detach() ← Called via ptrace in target
|
v
detach() ← Your tool calls this
|
v
Cleanup complete
Using the Attachment Functions#
Here’s how to use these functions in your own attachment tool:
Basic Attachment Tool Implementation#
#include <dlfcn.h>
#include <signal.h>     // kill()
#include <sys/types.h>  // pid_t
#include <chrono>
#include <cstdint>
#include <iostream>
#include <thread>
class ROCprofilerAttachmentTool {
private:
void* attach_lib_handle = nullptr;
void (*attach_func)(uint32_t) = nullptr;
void (*detach_func)() = nullptr;
public:
bool initialize() {
// Load the rocprofiler-attach library/binary
attach_lib_handle = dlopen("librocprofiler-attach.so", RTLD_NOW);
if (!attach_lib_handle) {
std::cerr << "Failed to load rocprofiler-attach: " << dlerror() << std::endl;
return false;
}
// Get the attachment function pointers
attach_func = (void(*)(uint32_t))dlsym(attach_lib_handle, "attach");
detach_func = (void(*)())dlsym(attach_lib_handle, "detach");
if (!attach_func || !detach_func) {
std::cerr << "Failed to find attachment functions" << std::endl;
return false;
}
return true;
}
bool attach_to_process(pid_t pid, uint32_t duration_ms = 0) {
// Validate the target process
if (kill(pid, 0) != 0) {
std::cerr << "Target process " << pid << " is not accessible" << std::endl;
return false;
}
std::cout << "Attaching to process " << pid << std::endl;
// Start attachment - this will handle all ptrace operations
attach_func(pid);
if (duration_ms > 0) {
// Profile for specified duration
std::cout << "Profiling for " << duration_ms << " milliseconds..." << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(duration_ms));
// Stop profiling
detach_func();
} else {
std::cout << "Profiling until process ends or manual detach..." << std::endl;
// Monitor process or wait for external signal to detach
while (kill(pid, 0) == 0) {
std::this_thread::sleep_for(std::chrono::seconds(1));
}
detach_func();
}
std::cout << "Profiling completed" << std::endl;
return true;
}
~ROCprofilerAttachmentTool() {
if (attach_lib_handle) {
dlclose(attach_lib_handle);
}
}
};
Complete Tool Example#
#include <iostream>
#include <vector>
#include <string>
#include <cstdlib>
int main(int argc, char* argv[]) {
if (argc < 2) {
std::cerr << "Usage: " << argv[0] << " <PID> [duration_ms]" << std::endl;
std::cerr << " PID: Process ID to attach to" << std::endl;
std::cerr << " duration_ms: Optional profiling duration in milliseconds" << std::endl;
return 1;
}
pid_t target_pid = std::stoi(argv[1]);
uint32_t duration = (argc > 2) ? std::stoi(argv[2]) : 0;
// Set up profiling environment variables before attachment
setenv("ROCP_TOOL_ATTACH", "1", 1);
// Note: The attachment system now uses the hardcoded default tool library path
// "librocprofiler-sdk-tool.so" and no longer uses environment variables for tool selection
setenv("ROCPROF_HIP_API_TRACE", "1", 1);
setenv("ROCPROF_KERNEL_TRACE", "1", 1);
setenv("ROCPROF_MEMORY_COPY_TRACE", "1", 1);
setenv("ROCPROF_OUTPUT_PATH", "./attachment-output", 1);
setenv("ROCPROF_OUTPUT_FILE_NAME", "attached_profile", 1);
// Initialize and run attachment tool
ROCprofilerAttachmentTool tool;
if (!tool.initialize()) {
std::cerr << "Failed to initialize attachment tool" << std::endl;
return 1;
}
if (!tool.attach_to_process(target_pid, duration)) {
std::cerr << "Attachment failed" << std::endl;
return 1;
}
std::cout << "Attachment completed successfully" << std::endl;
return 0;
}
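The example above converts its arguments with std::stoi, which throws on non-numeric input and silently accepts trailing characters or negative values. A stricter alternative is sketched below using std::from_chars (C++17). The helper name `parse_positive_u32` is an illustration, not something the SDK provides.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parse a strictly positive decimal integer that fits in uint32_t.
// Returns std::nullopt for empty input, signs, trailing characters,
// zero, or values that overflow.
std::optional<uint32_t> parse_positive_u32(std::string_view text) {
    uint32_t value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last || value == 0) {
        return std::nullopt;
    }
    return value;
}
```

A tool could call this on argv[1] and print the usage message when it returns std::nullopt, instead of letting an exception terminate the program.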
Experimental Reattachment API#
ROCprofiler-SDK now provides experimental support for reattachment, allowing tools to handle dynamic attach/detach cycles more efficiently.
Tool Configuration for Reattachment#
Tools that support reattachment should implement the experimental configuration structure:
#include <rocprofiler-sdk/registration.h>
// Experimental reattachment callbacks
void tool_reattach(void* tool_data) {
// Reinitialize contexts and resume profiling
// This is called when reattaching to a previously profiled process
}
void tool_detach(void* tool_data) {
// Suspend profiling operations temporarily
// This is called during detachment, but contexts may be preserved
}
extern "C" rocprofiler_tool_configure_result_experimental_t*
rocprofiler_configure_experimental(uint32_t version,
const char* runtime_version,
uint32_t prio,
rocprofiler_client_id_t* client_id)
{
static auto cfg = rocprofiler_tool_configure_result_experimental_t {
.size = sizeof(rocprofiler_tool_configure_result_experimental_t),
.initialize = &tool_init,
.finalize = &tool_fini,
.tool_data = nullptr,
.tool_reattach = &tool_reattach, // Experimental reattachment support
.tool_detach = &tool_detach // Experimental detachment support
};
return &cfg;
}
Client Callback Functions#
The registration system automatically provides C wrapper functions:
// These are automatically generated and called by rocprofiler-register
extern "C" void rocprofiler_call_client_reattach(void) {
// Calls the tool's reattach callback with stored tool_data
}
extern "C" void rocprofiler_call_client_detach(void) {
// Calls the tool's detach callback with stored tool_data
}
Reattachment Environment Variables#
When using reattachment, set this additional environment variable:
// Indicates that the tool was loaded via attachment (not LD_PRELOAD)
setenv("ROCPROFILER_REGISTER_TOOL_ATTACHED", "1", 1);
This helps the registration system differentiate between initial attachment and reattachment cycles.
Environment Variable Configuration#
Before calling the attachment functions, set up environment variables that will be injected into the target process:
Required Variables#
// Essential for attachment functionality
setenv("ROCP_TOOL_ATTACH", "1", 1);
Tool Library Configuration#
The attachment system now uses a hardcoded default tool library path:
// The attachment system automatically uses "librocprofiler-sdk-tool.so"
// No environment variable configuration is needed or supported
Tracing Options#
// Enable different types of tracing
setenv("ROCPROF_HIP_API_TRACE", "1", 1); // HIP API calls
setenv("ROCPROF_HSA_API_TRACE", "1", 1); // HSA API calls
setenv("ROCPROF_KERNEL_TRACE", "1", 1); // Kernel dispatches
setenv("ROCPROF_MEMORY_COPY_TRACE", "1", 1); // Memory operations
setenv("ROCPROF_MEMORY_ALLOCATION_TRACE", "1", 1); // Memory allocations
setenv("ROCPROF_SCRATCH_MEMORY_TRACE", "1", 1); // Scratch memory
setenv("ROCPROF_MARKER_TRACE", "1", 1); // ROCTx markers
Output Configuration#
// Control output location and format
setenv("ROCPROF_OUTPUT_PATH", "/path/to/output", 1);
setenv("ROCPROF_OUTPUT_FILE_NAME", "profile_name", 1);
setenv("ROCPROF_OUTPUT_FORMAT", "csv", 1); // or "json", "pftrace", etc.
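The setenv calls in this section ignore the return value. A minimal sketch of applying the same variables from a list while checking for failures follows. `EnvList` and `apply_environment` are hypothetical names chosen for this example, and the variable names are the ones listed above.

```cpp
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

using EnvList = std::vector<std::pair<std::string, std::string>>;

// Apply every name/value pair with setenv(), overwriting existing values.
// Stops and returns false on the first failure (for example, a name that
// is empty or contains '=').
bool apply_environment(const EnvList& vars) {
    for (const auto& [name, value] : vars) {
        if (setenv(name.c_str(), value.c_str(), 1) != 0) {
            std::cerr << "setenv(" << name << ") failed: "
                      << std::strerror(errno) << std::endl;
            return false;
        }
    }
    return true;
}
```

Keeping the configuration in one list also makes it easy to log exactly which variables were prepared before the attach call.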
Build Configuration#
To build a tool using the attachment functions:
CMakeLists.txt#
cmake_minimum_required(VERSION 3.16)
project(my_rocprofiler_attach_tool)
set(CMAKE_CXX_STANDARD 17)
# Find ROCprofiler SDK (for headers and linking)
find_package(rocprofiler-sdk REQUIRED)
add_executable(my_attach_tool
main.cpp
attachment_tool.cpp
)
# Link with required libraries
target_link_libraries(my_attach_tool
rocprofiler-sdk::rocprofiler-sdk
dl # for dlopen/dlsym operations
)
# Set capabilities for ptrace operations
add_custom_command(TARGET my_attach_tool POST_BUILD
COMMAND sudo setcap cap_sys_ptrace+ep $<TARGET_FILE:my_attach_tool>
COMMENT "Setting ptrace capability"
)
Error Handling#
When using the attachment functions, handle these common error conditions:
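#include <signal.h>   // kill()
#include <unistd.h>   // geteuid()
#include <cstdlib>    // getenv()
#include <fstream>
#include <iostream>
#include <string>
class AttachmentErrorHandler {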
public:
static bool validate_target_process(pid_t pid) {
// Check if process exists
if (kill(pid, 0) != 0) {
std::cerr << "Process " << pid << " not found or not accessible" << std::endl;
return false;
}
// Check if it's a GPU application
std::string maps_path = "/proc/" + std::to_string(pid) + "/maps";
std::ifstream maps(maps_path);
std::string line;
bool has_gpu_libs = false;
while (std::getline(maps, line)) {
if (line.find("libamdhip64.so") != std::string::npos ||
line.find("libhsa-runtime64.so") != std::string::npos) {
has_gpu_libs = true;
break;
}
}
if (!has_gpu_libs) {
std::cerr << "Process " << pid << " does not appear to use GPU APIs" << std::endl;
return false;
}
return true;
}
static void handle_attachment_errors() {
// Check for common permission issues
if (geteuid() != 0) {
std::cerr << "Warning: Not running as root. Ensure CAP_SYS_PTRACE capability is set." << std::endl;
}
// Check if rocprofiler libraries are available
if (getenv("LD_LIBRARY_PATH") == nullptr ||
std::string(getenv("LD_LIBRARY_PATH")).find("/opt/rocm/lib") == std::string::npos) {
std::cerr << "Warning: /opt/rocm/lib may not be in LD_LIBRARY_PATH" << std::endl;
}
}
};
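Beyond the checks above, a frequent cause of ptrace failures on Linux is the Yama security module, whose ptrace_scope setting restricts which processes may attach. The sketch below reads that setting and turns it into a hint for the user. The function names are hypothetical, and the file is only present on kernels with Yama enabled.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Read the Yama ptrace_scope value. Returns std::nullopt when the file is
// missing or unreadable (for example, when Yama is not enabled).
std::optional<int> read_ptrace_scope(
    const std::string& path = "/proc/sys/kernel/yama/ptrace_scope") {
    std::ifstream in(path);
    int scope = 0;
    if (!(in >> scope)) {
        return std::nullopt;
    }
    return scope;
}

// Translate the documented ptrace_scope values into a short explanation.
const char* describe_ptrace_scope(int scope) {
    switch (scope) {
        case 0: return "classic: a process may attach to others with the same UID";
        case 1: return "restricted: only ancestors or CAP_SYS_PTRACE holders may attach";
        case 2: return "admin-only: attaching requires CAP_SYS_PTRACE";
        case 3: return "disabled: no process may attach";
        default: return "unknown ptrace_scope value";
    }
}
```

A tool could print this description when PTRACE_ATTACH fails with EPERM, so users can tell a capability problem from a system policy.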
Architecture Overview#
Process attachment consists of several cooperating components:
Attachment Tool (your implementation)
|
v
1. Process Discovery & Validation
|
v
2. Ptrace Attachment & Control
|
v
3. Environment Variable Injection
|
v
4. Library Loading (rocprofiler-register)
|
v
5. Profiling Service Activation
|
v
6. Data Collection & Management
|
v
7. Detachment & Cleanup
Theoretical Implementation Details#
Core Implementation Components#
1. Process Discovery and Validation#
Target Process Requirements:
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>
#include <fstream>
#include <string>
bool validate_target_process(pid_t pid) {
// Check if process exists and is accessible
if (kill(pid, 0) != 0) {
return false; // Process doesn't exist or no permission
}
// Verify it's a GPU application by checking loaded libraries
std::string maps_path = "/proc/" + std::to_string(pid) + "/maps";
std::ifstream maps(maps_path);
std::string line;
bool has_hip = false, has_hsa = false;
while (std::getline(maps, line)) {
if (line.find("libamdhip64.so") != std::string::npos) has_hip = true;
if (line.find("libhsa-runtime64.so") != std::string::npos) has_hsa = true;
}
return has_hip || has_hsa; // Must use HIP or HSA
}
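The kill(pid, 0) probe above folds "no such process" and "no permission" into one failure. Distinguishing them through errno gives better diagnostics. This is a sketch with hypothetical names (`ProcessStatus`, `probe_process`). Note that the probe also succeeds for zombie processes, so success does not prove the target is still running.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

enum class ProcessStatus { Accessible, NotFound, NoPermission, OtherError };

// Probe a PID with signal 0, which performs the existence and permission
// checks without delivering a signal.
ProcessStatus probe_process(pid_t pid) {
    if (pid <= 0) {
        // 0 and negative values address process groups, not one process.
        return ProcessStatus::NotFound;
    }
    if (kill(pid, 0) == 0) {
        return ProcessStatus::Accessible;
    }
    switch (errno) {
        case ESRCH: return ProcessStatus::NotFound;
        case EPERM: return ProcessStatus::NoPermission;
        default:    return ProcessStatus::OtherError;
    }
}
```

A NoPermission result means the process exists but the caller may not signal it, which usually also means a ptrace attach will need elevated privileges.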
2. Ptrace-Based Process Control#
Core Ptrace Operations:
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <sys/user.h>
#include <cstdio>  // perror(), fprintf()
class ProcessAttachment {
private:
pid_t target_pid;
bool attached = false;
public:
bool attach(pid_t pid) {
target_pid = pid;
// Attach to the target process
if (ptrace(PTRACE_ATTACH, target_pid, nullptr, nullptr) == -1) {
perror("ptrace PTRACE_ATTACH failed");
return false;
}
// Mark as attached immediately so that the detach() calls on the
// error paths below actually release the process
attached = true;
// Wait for the process to stop (PTRACE_ATTACH sends SIGSTOP to the target)
int status;
if (waitpid(target_pid, &status, 0) == -1) {
perror("waitpid failed");
detach();
return false;
}
if (!WIFSTOPPED(status)) {
fprintf(stderr, "Process did not stop after attach\n");
detach();
return false;
}
return true;
}
bool detach() {
if (!attached) return true;
// Detach and allow process to continue
if (ptrace(PTRACE_DETACH, target_pid, nullptr, nullptr) == -1) {
perror("ptrace PTRACE_DETACH failed");
return false;
}
attached = false;
return true;
}
};
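When the WIFSTOPPED check above fails, it helps to report what waitpid actually returned. The following sketch decodes a wait status with the standard macros. `describe_wait_status` is a hypothetical helper for diagnostics only.

```cpp
#include <csignal>
#include <string>
#include <sys/wait.h>

// Turn a waitpid() status into a short human-readable description.
std::string describe_wait_status(int status) {
    if (WIFSTOPPED(status)) {
        return "stopped by signal " + std::to_string(WSTOPSIG(status));
    }
    if (WIFEXITED(status)) {
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        return "killed by signal " + std::to_string(WTERMSIG(status));
    }
    return "unknown status";
}
```

After a successful PTRACE_ATTACH, the expected description is a stop caused by SIGSTOP. Anything else suggests the target exited or was killed while the tool was attaching.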
3. Environment Variable Injection#
Environment Variable Management:
#include <cstdint>
#include <string>
#include <vector>
class EnvironmentInjector {
public:
struct EnvironmentVar {
std::string name;
std::string value;
};
// Prepare environment variables for profiling
std::vector<EnvironmentVar> prepare_profiling_env(
const std::vector<std::string>& trace_options,
const std::string& output_path,
const std::string& output_file) {
std::vector<EnvironmentVar> env_vars;
// Essential attachment variable
env_vars.push_back({"ROCP_TOOL_ATTACH", "1"});
// Configure tracing based on options
for (const auto& option : trace_options) {
if (option == "hip-trace") {
env_vars.push_back({"ROCPROF_HIP_API_TRACE", "1"});
}
if (option == "kernel-trace") {
env_vars.push_back({"ROCPROF_KERNEL_TRACE", "1"});
}
if (option == "hsa-trace") {
env_vars.push_back({"ROCPROF_HSA_API_TRACE", "1"});
}
if (option == "memory-copy-trace") {
env_vars.push_back({"ROCPROF_MEMORY_COPY_TRACE", "1"});
}
}
// Output configuration
env_vars.push_back({"ROCPROF_OUTPUT_PATH", output_path});
env_vars.push_back({"ROCPROF_OUTPUT_FILE_NAME", output_file});
return env_vars;
}
// Serialize environment for injection
std::vector<uint8_t> serialize_environment(const std::vector<EnvironmentVar>& vars) {
std::vector<uint8_t> buffer(4); // Start with count
uint32_t count = vars.size();
// Store count in first 4 bytes
buffer[0] = count & 0xFF;
buffer[1] = (count >> 8) & 0xFF;
buffer[2] = (count >> 16) & 0xFF;
buffer[3] = (count >> 24) & 0xFF;
// Add each variable as null-terminated name and value
for (const auto& var : vars) {
// Add variable name
for (char c : var.name) {
buffer.push_back(c);
}
buffer.push_back(0); // Null terminate name
// Add variable value
for (char c : var.value) {
buffer.push_back(c);
}
buffer.push_back(0); // Null terminate value
}
return buffer;
}
};
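To validate the buffer layout produced by serialize_environment, it is useful to have the inverse operation. The sketch below decodes the layout used in this guide's example (a 4-byte little-endian count followed by NUL-terminated name and value strings). It assumes that layout only. The format actually expected by rocprofiler_register_attach may differ, and `deserialize_environment` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using EnvPairs = std::vector<std::pair<std::string, std::string>>;

// Read one NUL-terminated string starting at pos and advance pos past the NUL.
static bool read_cstring(const std::vector<uint8_t>& buf, size_t& pos, std::string& out) {
    size_t end = pos;
    while (end < buf.size() && buf[end] != 0) ++end;
    if (end == buf.size()) return false;  // terminator missing
    out.assign(reinterpret_cast<const char*>(buf.data() + pos), end - pos);
    pos = end + 1;
    return true;
}

// Decode [count: u32 little-endian][name\0value\0]... into name/value pairs.
// Returns std::nullopt if the buffer is truncated or has trailing bytes.
std::optional<EnvPairs> deserialize_environment(const std::vector<uint8_t>& buf) {
    if (buf.size() < 4) return std::nullopt;
    uint32_t count = uint32_t(buf[0]) | (uint32_t(buf[1]) << 8) |
                     (uint32_t(buf[2]) << 16) | (uint32_t(buf[3]) << 24);
    EnvPairs vars;
    size_t pos = 4;
    for (uint32_t i = 0; i < count; ++i) {
        std::string name, value;
        if (!read_cstring(buf, pos, name) || !read_cstring(buf, pos, value)) {
            return std::nullopt;
        }
        vars.emplace_back(std::move(name), std::move(value));
    }
    if (pos != buf.size()) return std::nullopt;
    return vars;
}
```

Round-tripping a buffer through both functions before injection is a cheap way to catch encoding mistakes on the tool side.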
4. Memory Manipulation and Library Loading#
Remote Memory Operations:
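#include <sys/mman.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <algorithm>  // std::min
#include <cerrno>
#include <cstdint>
#include <cstring>    // memcpy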
class RemoteMemoryManager {
private:
pid_t target_pid;
public:
RemoteMemoryManager(pid_t pid) : target_pid(pid) {}
// Allocate memory in remote process
void* remote_mmap(size_t length, int prot, int flags) {
// Find a suitable location for injection
struct user_regs_struct regs;
if (ptrace(PTRACE_GETREGS, target_pid, nullptr, &regs) == -1) {
return nullptr;
}
// Save original registers
struct user_regs_struct orig_regs = regs;
// Set up mmap syscall
regs.rax = 9; // __NR_mmap
regs.rdi = 0; // addr (let kernel choose)
regs.rsi = length;
regs.rdx = prot;
regs.r10 = flags;
regs.r8 = -1; // fd
regs.r9 = 0; // offset
if (ptrace(PTRACE_SETREGS, target_pid, nullptr, &regs) == -1) {
return nullptr;
}
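// Execute the syscall. NOTE: this sketch assumes regs.rip already points at a
// `syscall` instruction in the target; a real tool must locate one first.
// PTRACE_SYSCALL stops twice: once at syscall entry and once at syscall exit.
int status;
for (int stop = 0; stop < 2; ++stop) {
if (ptrace(PTRACE_SYSCALL, target_pid, nullptr, nullptr) == -1) {
return nullptr;
}
if (waitpid(target_pid, &status, 0) == -1 || !WIFSTOPPED(status)) {
return nullptr;
}
}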
// Get result
if (ptrace(PTRACE_GETREGS, target_pid, nullptr, &regs) == -1) {
return nullptr;
}
void* result = (void*)regs.rax;
// Restore original registers
ptrace(PTRACE_SETREGS, target_pid, nullptr, &orig_regs);
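// A raw syscall reports failure as -errno (a value in [-4095, -1]), not as MAP_FAILED
return ((unsigned long)result >= (unsigned long)-4095L) ? nullptr : result;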
}
// Write data to remote process memory
bool write_memory(void* addr, const void* data, size_t size) {
const uint8_t* bytes = static_cast<const uint8_t*>(data);
size_t written = 0;
while (written < size) {
long word = 0;
size_t to_copy = std::min(sizeof(long), size - written);
// For partial words, read existing content first
if (to_copy < sizeof(long)) {
errno = 0;
word = ptrace(PTRACE_PEEKDATA, target_pid,
(uint8_t*)addr + written, nullptr);
if (errno != 0) return false;
}
// Copy new data into word
memcpy(&word, bytes + written, to_copy);
// Write word to remote process
if (ptrace(PTRACE_POKEDATA, target_pid,
(uint8_t*)addr + written, word) == -1) {
return false;
}
written += to_copy;
}
return true;
}
};
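The partial-word handling in write_memory is the subtle part: PTRACE_POKEDATA always writes a full word, so the last chunk must be merged with the bytes already present in the target. The sketch below isolates that packing logic so it can be tested without ptrace. `pack_words` is a hypothetical helper, and `existing_tail` stands in for the word that PTRACE_PEEKDATA would return.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Split `size` bytes into long-sized words the way a PTRACE_POKEDATA loop
// would. When the final chunk is shorter than a word, the remaining bytes
// are taken from existing_tail so that they are preserved on write.
std::vector<long> pack_words(const void* data, size_t size, long existing_tail) {
    const auto* bytes = static_cast<const uint8_t*>(data);
    std::vector<long> words;
    for (size_t offset = 0; offset < size; offset += sizeof(long)) {
        size_t chunk = std::min(sizeof(long), size - offset);
        long word = (chunk < sizeof(long)) ? existing_tail : 0;
        std::memcpy(&word, bytes + offset, chunk);
        words.push_back(word);
    }
    return words;
}
```

Copying with memcpy rather than shifting keeps the byte order identical to the source buffer regardless of the host's endianness.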
5. Library Injection and Symbol Resolution#
Dynamic Library Loading:
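#include <dlfcn.h>
#include <link.h>
#include <sys/mman.h>  // PROT_*, MAP_*
#include <cstdint>
#include <cstdio>      // fprintf()
#include <cstring>     // strlen()
#include <fstream>
#include <string>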
class LibraryInjector {
private:
pid_t target_pid;
RemoteMemoryManager memory_manager;
public:
LibraryInjector(pid_t pid) : target_pid(pid), memory_manager(pid) {}
// Inject rocprofiler-register library
bool inject_register_library() {
const char* lib_path = "/opt/rocm/lib/librocprofiler-register.so";
// Find dlopen in target process
void* dlopen_addr = find_function_address("dlopen");
if (!dlopen_addr) {
fprintf(stderr, "Could not find dlopen in target process\n");
return false;
}
// Allocate memory for library path
void* path_addr = memory_manager.remote_mmap(
strlen(lib_path) + 1,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS);
if (!path_addr) return false;
// Write library path to remote memory
if (!memory_manager.write_memory(path_addr, lib_path, strlen(lib_path) + 1)) {
return false;
}
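// Call dlopen in target process (call_remote_function is a helper not shown here;
// it must load the argument registers and run the call via ptrace)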
return call_remote_function(dlopen_addr,
{(uint64_t)path_addr, RTLD_NOW | RTLD_GLOBAL});
}
void* find_function_address(const char* function_name) {
// Parse /proc/PID/maps to find loaded libraries
std::string maps_path = "/proc/" + std::to_string(target_pid) + "/maps";
std::ifstream maps(maps_path);
std::string line;
while (std::getline(maps, line)) {
if (line.find("libc.so") != std::string::npos) {
// Extract base address of libc
size_t dash = line.find('-');
std::string base_addr_str = line.substr(0, dash);
void* base_addr = (void*)std::stoull(base_addr_str, nullptr, 16);
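// Resolve the symbol locally and compute its offset from the local libc load base.
// Assumes the target maps the same libc file as this tool, and that the symbol
// lives in libc (dlopen does on glibc 2.34 and later).
void* handle = dlopen("libc.so.6", RTLD_LAZY);
if (handle) {
void* func_addr = dlsym(handle, function_name);
Dl_info info;
if (func_addr && dladdr(func_addr, &info) && info.dli_fbase) {
// Address in target = target libc base + offset of the symbol within libc
void* remote_addr = (uint8_t*)base_addr + ((uint8_t*)func_addr - (uint8_t*)info.dli_fbase);
dlclose(handle);
return remote_addr;
}
dlclose(handle);
}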
}
}
return nullptr;
}
};
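The address calculation in find_function_address rests on two small steps: reading a load base from a /proc/PID/maps line, and applying the rule remote = remote_base + (local_symbol - local_base). The sketch below separates them into pure functions. The names are hypothetical, and the approach assumes both processes map the same library file.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <string>

// Parse the start address from a maps line such as
// "7f1c2a400000-7f1c2a428000 r--p 00000000 08:01 1234 /usr/lib/libc.so.6".
std::optional<uint64_t> parse_maps_start(const std::string& line) {
    size_t dash = line.find('-');
    if (dash == std::string::npos || dash == 0) return std::nullopt;
    try {
        size_t used = 0;
        uint64_t start = std::stoull(line.substr(0, dash), &used, 16);
        if (used != dash) return std::nullopt;  // non-hex characters before '-'
        return start;
    } catch (const std::exception&) {
        return std::nullopt;
    }
}

// Translate a symbol address from the local process into the target process.
uint64_t translate_address(uint64_t local_symbol, uint64_t local_base, uint64_t remote_base) {
    return remote_base + (local_symbol - local_base);
}
```

Keeping the arithmetic in one function makes the assumption explicit: the offset of a symbol from its library's load base is the same in both processes only when the library files are identical.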
6. ROCprofiler-Register Communication Protocol#
Attachment Protocol Implementation:
extern "C" {
// Function signatures exported by rocprofiler-attach (see "ROCprofiler-Attach Functions")
typedef void (*attach_func_t)(uint32_t pid);
typedef void (*detach_func_t)();
}
class ROCprofilerAttachment {
private:
pid_t target_pid;
void* register_handle = nullptr;
attach_func_t attach_func = nullptr;
detach_func_t detach_func = nullptr;
public:
bool initialize() {
// Load the rocprofiler-attach library, which exports attach() and detach()
register_handle = dlopen("librocprofiler-attach.so", RTLD_NOW);
if (!register_handle) {
fprintf(stderr, "Failed to load rocprofiler-attach: %s\n", dlerror());
return false;
}
// Get attachment functions
attach_func = (attach_func_t)dlsym(register_handle, "attach");
detach_func = (detach_func_t)dlsym(register_handle, "detach");
if (!attach_func || !detach_func) {
fprintf(stderr, "Failed to find attachment functions\n");
return false;
}
return true;
}
bool attach_to_process(pid_t pid, const std::vector<uint8_t>& env_buffer) {
target_pid = pid;
// Set up environment for rocprofiler-register
// This involves injecting the environment buffer into the target process
// Call the attach function
attach_func(pid);
return true;
}
void detach_from_process() {
if (detach_func) {
detach_func();
}
}
};
Complete Attachment Tool Implementation#
Main Attachment Tool Structure:
#include <iostream>
#include <vector>
#include <string>
#include <chrono>
#include <thread>
class ROCprofilerAttachTool {
private:
ProcessAttachment process_control;
EnvironmentInjector env_injector;
// LibraryInjector needs the target PID, so it is constructed in attach_and_profile()
ROCprofilerAttachment rocprof_attachment;
public:
struct AttachmentConfig {
pid_t target_pid;
std::vector<std::string> trace_options;
std::string output_path = "./rocprof-attachment-output";
std::string output_filename = "attached_profile";
uint32_t duration_msec = 0; // 0 = until process ends
};
bool attach_and_profile(const AttachmentConfig& config) {
// 1. Validate target process
if (!validate_target_process(config.target_pid)) {
std::cerr << "Invalid or inaccessible target process: " << config.target_pid << std::endl;
return false;
}
// 2. Initialize rocprofiler attachment system
if (!rocprof_attachment.initialize()) {
std::cerr << "Failed to initialize rocprofiler attachment system" << std::endl;
return false;
}
// 3. Attach to target process
if (!process_control.attach(config.target_pid)) {
std::cerr << "Failed to attach to process " << config.target_pid << std::endl;
return false;
}
// 4. Prepare environment variables
auto env_vars = env_injector.prepare_profiling_env(
config.trace_options,
config.output_path,
config.output_filename);
auto env_buffer = env_injector.serialize_environment(env_vars);
// 5. Inject rocprofiler-register library
LibraryInjector injector(config.target_pid);
if (!injector.inject_register_library()) {
std::cerr << "Failed to inject rocprofiler-register library" << std::endl;
process_control.detach();
return false;
}
// 6. Activate profiling
if (!rocprof_attachment.attach_to_process(config.target_pid, env_buffer)) {
std::cerr << "Failed to activate profiling" << std::endl;
process_control.detach();
return false;
}
// 7. Allow process to continue with profiling active
if (!process_control.detach()) {
std::cerr << "Warning: Failed to detach cleanly" << std::endl;
}
// 8. Wait for specified duration or until process ends
if (config.duration_msec > 0) {
std::cout << "Profiling for " << config.duration_msec << " milliseconds..." << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(config.duration_msec));
// Stop profiling and release attachment resources
rocprof_attachment.detach_from_process();
} else {
std::cout << "Profiling until process ends..." << std::endl;
// Monitor process and wait for it to end
while (kill(config.target_pid, 0) == 0) {
std::this_thread::sleep_for(std::chrono::seconds(1));
}
}
std::cout << "Profiling completed. Output saved to: "
<< config.output_path << "/" << config.output_filename << std::endl;
return true;
}
};
// Example usage
int main(int argc, char* argv[]) {
if (argc < 2) {
std::cerr << "Usage: " << argv[0] << " <PID> [options]" << std::endl;
return 1;
}
ROCprofilerAttachTool::AttachmentConfig config;
config.target_pid = std::stoi(argv[1]);
config.trace_options = {"hip-trace", "kernel-trace", "memory-copy-trace"};
config.duration_msec = 5000; // 5 seconds
ROCprofilerAttachTool tool;
if (!tool.attach_and_profile(config)) {
std::cerr << "Attachment and profiling failed" << std::endl;
return 1;
}
return 0;
}
Required System Permissions and Setup#
Permission Requirements:
# Your attachment tool will need:
# 1. Ptrace permissions (may require root or capabilities)
sudo setcap cap_sys_ptrace+ep your_attachment_tool
# 2. Access to /proc filesystem
# Usually available by default
# 3. Ability to load shared libraries
# Ensure ROCm libraries are in LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
Build Requirements:
# CMakeLists.txt for your attachment tool
cmake_minimum_required(VERSION 3.16)
project(rocprofiler_attach_tool)
set(CMAKE_CXX_STANDARD 17)
find_package(rocprofiler-sdk REQUIRED)
add_executable(rocprofiler_attach_tool
main.cpp
process_attachment.cpp
environment_injection.cpp
library_injection.cpp
)
target_link_libraries(rocprofiler_attach_tool
rocprofiler-sdk::rocprofiler-sdk
dl # for dlopen/dlsym
)
Error Handling and Debugging#
Common Issues and Solutions:
Ptrace Permissions: Use strace to debug ptrace failures
Library Loading: Check /proc/PID/maps to verify library injection
Environment Variables: Validate environment buffer format
Process State: Monitor target process status during attachment
Debugging Techniques:
// Enable debug logging
setenv("ROCPROF_LOGGING_LEVEL", "trace", 1);
// Monitor attachment progress
bool debug_attachment(pid_t pid) {
std::cout << "Target process memory maps:" << std::endl;
std::string cmd = "cat /proc/" + std::to_string(pid) + "/maps";
system(cmd.c_str());
std::cout << "Target process environment:" << std::endl;
cmd = "cat /proc/" + std::to_string(pid) + "/environ | tr '\\0' '\\n'";
system(cmd.c_str());
return true;
}
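As an alternative to shelling out with system(), the environ file can be read and split in-process. The sketch below handles the NUL-separated format. `split_environ` is a hypothetical helper. Keep in mind that /proc/PID/environ reflects the environment the process was started with, so variables set later inside the target are not expected to appear there.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split the raw contents of /proc/<pid>/environ, where "NAME=value" entries
// are separated by NUL bytes. Empty entries are skipped.
std::vector<std::string> split_environ(const std::string& raw) {
    std::vector<std::string> entries;
    size_t start = 0;
    while (start < raw.size()) {
        size_t end = raw.find('\0', start);
        if (end == std::string::npos) end = raw.size();
        if (end > start) entries.emplace_back(raw, start, end - start);
        start = end + 1;
    }
    return entries;
}
```

The raw contents can be obtained by opening the file in binary mode and reading it into a std::string, which preserves the embedded NUL bytes.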
This implementation guide provides the foundation needed to build a complete process attachment tool for ROCprofiler-SDK. The actual rocprofv3 implementation uses similar techniques with additional optimizations and error handling.