Skip to content

Reapply " [XRay] Add support for instrumentation of DSOs on x86_64 (#90959)" #113548

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 25, 2024

Conversation

sebastiankreutzer
Copy link
Contributor

This fixes remaining issues in my previous PR #90959.

Changes:

  • Removed dependency on LLVM header in xray_interface.cpp
  • Fixed XRay patching for some targets due to missing changes in architecture-specific patching functions
  • Addressed some remaining compiler warnings that I missed in the previous patch
  • Formatting

I have tested these changes on x86_64 (natively), as well as ppc64le, aarch64 and arm32 (cross-compiled and emulated using qemu).

Original description:

This PR introduces shared library (DSO) support for XRay based on a revised version of the implementation outlined in this RFC.
The feature enables the patching and handling of events from DSOs, supporting both libraries linked at startup or explicitly loaded, e.g. via dlopen.
This patch adds the following:

  • The -fxray-shared flag to enable the feature (turned off by default)
  • A small runtime library that is linked into every instrumented DSO, providing position-independent trampolines and code to register with the main XRay runtime
  • Changes to the XRay runtime to support management and patching of multiple objects

These changes are fully backward compatible, i.e. running without instrumented DSOs will produce identical traces (in terms of recorded function IDs) to the previous implementation.

Due to my limited ability to test on other architectures, this feature is only implemented and tested with x86_64. Extending support to other architectures is fairly straightforward, requiring only a position-independent implementation of the architecture-specific trampoline implementation (see compiler-rt/lib/xray/xray_trampoline_x86_64.S for reference).

This patch does not include any functionality to resolve function IDs from DSOs for the provided logging/tracing modes. These modes still work and will record calls from DSOs, but symbol resolution for these functions in not available. Getting this to work properly requires recording information about the loaded DSOs and should IMO be discussed in a separate RFC, as there are mulitple feasible approaches.

@llvmbot llvmbot added clang Clang issues not falling into any other category compiler-rt clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" xray labels Oct 24, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2024

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Sebastian Kreutzer (sebastiankreutzer)

Changes

This fixes remaining issues in my previous PR #90959.

Changes:

  • Removed dependency on LLVM header in xray_interface.cpp
  • Fixed XRay patching for some targets due to missing changes in architecture-specific patching functions
  • Addressed some remaining compiler warnings that I missed in the previous patch
  • Formatting

I have tested these changes on x86_64 (natively), as well as ppc64le, aarch64 and arm32 (cross-compiled and emulated using qemu).

Original description:

This PR introduces shared library (DSO) support for XRay based on a revised version of the implementation outlined in this RFC.
The feature enables the patching and handling of events from DSOs, supporting both libraries linked at startup or explicitly loaded, e.g. via dlopen.
This patch adds the following:

  • The -fxray-shared flag to enable the feature (turned off by default)
  • A small runtime library that is linked into every instrumented DSO, providing position-independent trampolines and code to register with the main XRay runtime
  • Changes to the XRay runtime to support management and patching of multiple objects

These changes are fully backward compatible, i.e. running without instrumented DSOs will produce identical traces (in terms of recorded function IDs) to the previous implementation.

Due to my limited ability to test on other architectures, this feature is only implemented and tested with x86_64. Extending support to other architectures is fairly straightforward, requiring only a position-independent implementation of the architecture-specific trampoline implementation (see compiler-rt/lib/xray/xray_trampoline_x86_64.S for reference).

This patch does not include any functionality to resolve function IDs from DSOs for the provided logging/tracing modes. These modes still work and will record calls from DSOs, but symbol resolution for these functions in not available. Getting this to work properly requires recording information about the loaded DSOs and should IMO be discussed in a separate RFC, as there are mulitple feasible approaches.


Patch is 87.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113548.diff

29 Files Affected:

  • (modified) clang/include/clang/Basic/CodeGenOptions.def (+2)
  • (modified) clang/include/clang/Driver/Options.td (+5)
  • (modified) clang/include/clang/Driver/XRayArgs.h (+2)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+8-4)
  • (modified) clang/lib/Driver/XRayArgs.cpp (+21)
  • (added) clang/test/Driver/XRay/xray-shared.cpp (+17)
  • (modified) compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake (+1)
  • (modified) compiler-rt/cmake/config-ix.cmake (+4)
  • (modified) compiler-rt/include/xray/xray_interface.h (+56-9)
  • (modified) compiler-rt/lib/xray/CMakeLists.txt (+76-10)
  • (modified) compiler-rt/lib/xray/xray_AArch64.cpp (+12-7)
  • (modified) compiler-rt/lib/xray/xray_arm.cpp (+12-7)
  • (added) compiler-rt/lib/xray/xray_dso_init.cpp (+62)
  • (modified) compiler-rt/lib/xray/xray_hexagon.cpp (+12-7)
  • (modified) compiler-rt/lib/xray/xray_init.cpp (+157-26)
  • (modified) compiler-rt/lib/xray/xray_interface.cpp (+218-64)
  • (modified) compiler-rt/lib/xray/xray_interface_internal.h (+62-14)
  • (modified) compiler-rt/lib/xray/xray_loongarch64.cpp (+16-7)
  • (modified) compiler-rt/lib/xray/xray_mips.cpp (+16-7)
  • (modified) compiler-rt/lib/xray/xray_mips64.cpp (+16-7)
  • (modified) compiler-rt/lib/xray/xray_powerpc64.cpp (+24-6)
  • (modified) compiler-rt/lib/xray/xray_trampoline_x86_64.S (+17-7)
  • (modified) compiler-rt/lib/xray/xray_x86_64.cpp (+18-12)
  • (added) compiler-rt/test/xray/TestCases/Posix/basic-mode-dso.cpp (+47)
  • (added) compiler-rt/test/xray/TestCases/Posix/clang-xray-shared.cpp (+14)
  • (added) compiler-rt/test/xray/TestCases/Posix/dlopen.cpp (+107)
  • (added) compiler-rt/test/xray/TestCases/Posix/dso-dep-chains.cpp (+197)
  • (added) compiler-rt/test/xray/TestCases/Posix/patch-premain-dso.cpp (+45)
  • (added) compiler-rt/test/xray/TestCases/Posix/patching-unpatching-dso.cpp (+75)
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index eac831278ee20d..e45370bde74a5d 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -136,6 +136,8 @@ CODEGENOPT(XRayIgnoreLoops , 1, 0)
 ///< Emit the XRay function index section.
 CODEGENOPT(XRayFunctionIndex , 1, 1)
 
+///< Set when -fxray-shared is enabled
+CODEGENOPT(XRayShared , 1, 0)
 
 ///< Set the minimum number of instructions in a function to determine selective
 ///< XRay instrumentation.
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 2ddb2f5312148e..0a21aaf3c9d670 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2948,6 +2948,11 @@ def fxray_selected_function_group :
   HelpText<"When using -fxray-function-groups, select which group of functions to instrument. Valid range is 0 to fxray-function-groups - 1">,
   MarshallingInfoInt<CodeGenOpts<"XRaySelectedFunctionGroup">, "0">;
 
+defm xray_shared : BoolFOption<"xray-shared",
+  CodeGenOpts<"XRayShared">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option],
+          "Enable shared library instrumentation with XRay">,
+  NegFlag<SetFalse>>;
 
 defm fine_grained_bitfield_accesses : BoolOption<"f", "fine-grained-bitfield-accesses",
   CodeGenOpts<"FineGrainedBitfieldAccesses">, DefaultFalse,
diff --git a/clang/include/clang/Driver/XRayArgs.h b/clang/include/clang/Driver/XRayArgs.h
index bdd3d979547eed..1b5c4a4c42f12a 100644
--- a/clang/include/clang/Driver/XRayArgs.h
+++ b/clang/include/clang/Driver/XRayArgs.h
@@ -27,6 +27,7 @@ class XRayArgs {
   XRayInstrSet InstrumentationBundle;
   llvm::opt::Arg *XRayInstrument = nullptr;
   bool XRayRT = true;
+  bool XRayShared = false;
 
 public:
   /// Parses the XRay arguments from an argument list.
@@ -35,6 +36,7 @@ class XRayArgs {
                llvm::opt::ArgStringList &CmdArgs, types::ID InputType) const;
 
   bool needsXRayRt() const { return XRayInstrument && XRayRT; }
+  bool needsXRayDSORt() const { return XRayInstrument && XRayRT && XRayShared; }
   llvm::ArrayRef<std::string> modeList() const { return Modes; }
   XRayInstrSet instrumentationBundle() const { return InstrumentationBundle; }
 };
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 91605a67a37fc0..1c3c8c816594e5 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1623,10 +1623,14 @@ bool tools::addSanitizerRuntimes(const ToolChain &TC, const ArgList &Args,
 }
 
 bool tools::addXRayRuntime(const ToolChain&TC, const ArgList &Args, ArgStringList &CmdArgs) {
-  if (Args.hasArg(options::OPT_shared))
-    return false;
-
-  if (TC.getXRayArgs().needsXRayRt()) {
+  if (Args.hasArg(options::OPT_shared)) {
+    if (TC.getXRayArgs().needsXRayDSORt()) {
+      CmdArgs.push_back("--whole-archive");
+      CmdArgs.push_back(TC.getCompilerRTArgString(Args, "xray-dso"));
+      CmdArgs.push_back("--no-whole-archive");
+      return true;
+    }
+  } else if (TC.getXRayArgs().needsXRayRt()) {
     CmdArgs.push_back("--whole-archive");
     CmdArgs.push_back(TC.getCompilerRTArgString(Args, "xray"));
     for (const auto &Mode : TC.getXRayArgs().modeList())
diff --git a/clang/lib/Driver/XRayArgs.cpp b/clang/lib/Driver/XRayArgs.cpp
index 8c5134e2501358..d0bb5d4887c184 100644
--- a/clang/lib/Driver/XRayArgs.cpp
+++ b/clang/lib/Driver/XRayArgs.cpp
@@ -63,6 +63,23 @@ XRayArgs::XRayArgs(const ToolChain &TC, const ArgList &Args) {
         << XRayInstrument->getSpelling() << Triple.str();
   }
 
+  if (Args.hasFlag(options::OPT_fxray_shared, options::OPT_fno_xray_shared,
+                   false)) {
+    XRayShared = true;
+
+    // DSO instrumentation is currently limited to x86_64
+    if (Triple.getArch() != llvm::Triple::x86_64) {
+      D.Diag(diag::err_drv_unsupported_opt_for_target)
+          << "-fxray-shared" << Triple.str();
+    }
+
+    unsigned PICLvl = std::get<1>(tools::ParsePICArgs(TC, Args));
+    if (!PICLvl) {
+      D.Diag(diag::err_opt_not_valid_without_opt) << "-fxray-shared"
+                                                  << "-fPIC";
+    }
+  }
+
   // Both XRay and -fpatchable-function-entry use
   // TargetOpcode::PATCHABLE_FUNCTION_ENTER.
   if (Arg *A = Args.getLastArg(options::OPT_fpatchable_function_entry_EQ))
@@ -177,6 +194,10 @@ void XRayArgs::addArgs(const ToolChain &TC, const ArgList &Args,
   Args.addOptOutFlag(CmdArgs, options::OPT_fxray_function_index,
                      options::OPT_fno_xray_function_index);
 
+  if (XRayShared)
+    Args.addOptInFlag(CmdArgs, options::OPT_fxray_shared,
+                      options::OPT_fno_xray_shared);
+
   if (const Arg *A =
           Args.getLastArg(options::OPT_fxray_instruction_threshold_EQ)) {
     int Value;
diff --git a/clang/test/Driver/XRay/xray-shared.cpp b/clang/test/Driver/XRay/xray-shared.cpp
new file mode 100644
index 00000000000000..215854e1fc7cef
--- /dev/null
+++ b/clang/test/Driver/XRay/xray-shared.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fPIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fpic -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+// RUN: not %clang -### --target=x86_64-unknown-linux-gnu -fno-PIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-PIC
+// RUN: not %clang -### --target=x86_64-unknown-linux-gnu -fno-pic -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-PIC
+
+// On 64 bit darwin, PIC is always enabled
+// RUN: %clang -### --target=x86_64-apple-darwin -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+
+// Check unsupported targets
+// RUN: not %clang -### --target=aarch64-pc-freebsd -fPIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-TARGET
+// RUN: not %clang -### --target=arm64-apple-macos -fPIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-TARGET
+
+// CHECK: "-cc1" {{.*}}"-fxray-instrument" {{.*}}"-fxray-shared"
+// ERR-TARGET:   error: unsupported option '-fxray-shared' for target
+// ERR-PIC:   error: option '-fxray-shared' cannot be specified without '-fPIC'
+
diff --git a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
index d00d39518104bf..fb4dfa7bd09dfe 100644
--- a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+++ b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
@@ -104,6 +104,7 @@ else()
 set(ALL_XRAY_SUPPORTED_ARCH ${X86_64} ${ARM32} ${ARM64} ${MIPS32} ${MIPS64}
 		powerpc64le ${HEXAGON} ${LOONGARCH64})
 endif()
+set(ALL_XRAY_DSO_SUPPORTED_ARCH ${X86_64})
 set(ALL_SHADOWCALLSTACK_SUPPORTED_ARCH ${ARM64})
 
 if (UNIX)
diff --git a/compiler-rt/cmake/config-ix.cmake b/compiler-rt/cmake/config-ix.cmake
index a494e0532a50bc..431f544e8ad6a7 100644
--- a/compiler-rt/cmake/config-ix.cmake
+++ b/compiler-rt/cmake/config-ix.cmake
@@ -668,6 +668,9 @@ if(APPLE)
   list_intersect(XRAY_SUPPORTED_ARCH
     ALL_XRAY_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
+  list_intersect(XRAY_DSO_SUPPORTED_ARCH
+    ALL_XRAY_DSO_SUPPORTED_ARCH
+    SANITIZER_COMMON_SUPPORTED_ARCH)
   list_intersect(SHADOWCALLSTACK_SUPPORTED_ARCH
     ALL_SHADOWCALLSTACK_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
@@ -702,6 +705,7 @@ else()
   filter_available_targets(CFI_SUPPORTED_ARCH ${ALL_CFI_SUPPORTED_ARCH})
   filter_available_targets(SCUDO_STANDALONE_SUPPORTED_ARCH ${ALL_SCUDO_STANDALONE_SUPPORTED_ARCH})
   filter_available_targets(XRAY_SUPPORTED_ARCH ${ALL_XRAY_SUPPORTED_ARCH})
+  filter_available_targets(XRAY_DSO_SUPPORTED_ARCH ${ALL_XRAY_DSO_SUPPORTED_ARCH})
   filter_available_targets(SHADOWCALLSTACK_SUPPORTED_ARCH
     ${ALL_SHADOWCALLSTACK_SUPPORTED_ARCH})
   filter_available_targets(GWP_ASAN_SUPPORTED_ARCH ${ALL_GWP_ASAN_SUPPORTED_ARCH})
diff --git a/compiler-rt/include/xray/xray_interface.h b/compiler-rt/include/xray/xray_interface.h
index 727431c04e4f73..675ea0cbc48c83 100644
--- a/compiler-rt/include/xray/xray_interface.h
+++ b/compiler-rt/include/xray/xray_interface.h
@@ -93,31 +93,78 @@ enum XRayPatchingStatus {
   FAILED = 3,
 };
 
-/// This tells XRay to patch the instrumentation points. See XRayPatchingStatus
-/// for possible result values.
+/// This tells XRay to patch the instrumentation points in all currently loaded
+/// objects. See XRayPatchingStatus for possible result values.
 extern XRayPatchingStatus __xray_patch();
 
+/// This tells XRay to patch the instrumentation points in the given object.
+/// See XRayPatchingStatus for possible result values.
+extern XRayPatchingStatus __xray_patch_object(int32_t ObjId);
+
 /// Reverses the effect of __xray_patch(). See XRayPatchingStatus for possible
 /// result values.
 extern XRayPatchingStatus __xray_unpatch();
 
-/// This patches a specific function id. See XRayPatchingStatus for possible
+/// Reverses the effect of __xray_patch_object. See XRayPatchingStatus for
+/// possible result values.
+extern XRayPatchingStatus __xray_unpatch_object(int32_t ObjId);
+
+/// This unpacks the given (packed) function id and patches
+/// the corresponding function.  See XRayPatchingStatus for possible
 /// result values.
 extern XRayPatchingStatus __xray_patch_function(int32_t FuncId);
 
-/// This unpatches a specific function id. See XRayPatchingStatus for possible
+/// This patches a specific function in the given object. See XRayPatchingStatus
+/// for possible result values.
+extern XRayPatchingStatus __xray_patch_function_in_object(int32_t FuncId,
+                                                          int32_t ObjId);
+
+/// This unpacks the given (packed) function id and unpatches
+/// the corresponding function. See XRayPatchingStatus for possible
 /// result values.
 extern XRayPatchingStatus __xray_unpatch_function(int32_t FuncId);
 
-/// This function returns the address of the function provided a valid function
-/// id. We return 0 if we encounter any error, even if 0 may be a valid function
-/// address.
+/// This unpatches a specific function in the given object.
+/// See XRayPatchingStatus for possible result values.
+extern XRayPatchingStatus __xray_unpatch_function_in_object(int32_t FuncId,
+                                                            int32_t ObjId);
+
+/// This function unpacks the given (packed) function id and returns the address
+/// of the corresponding function. We return 0 if we encounter any error, even
+/// if 0 may be a valid function address.
 extern uintptr_t __xray_function_address(int32_t FuncId);
 
-/// This function returns the maximum valid function id. Returns 0 if we
-/// encounter errors (when there are no instrumented functions, etc.).
+/// This function returns the address of the function in the given object
+/// provided valid function and object ids. We return 0 if we encounter any
+/// error, even if 0 may be a valid function address.
+extern uintptr_t __xray_function_address_in_object(int32_t FuncId,
+                                                   int32_t ObjId);
+
+/// This function returns the maximum valid function id for the main executable
+/// (object id = 0). Returns 0 if we encounter errors (when there are no
+/// instrumented functions, etc.).
 extern size_t __xray_max_function_id();
 
+/// This function returns the maximum valid function id for the given object.
+/// Returns 0 if we encounter errors (when there are no instrumented functions,
+/// etc.).
+extern size_t __xray_max_function_id_in_object(int32_t ObjId);
+
+/// This function returns the number of previously registered objects
+/// (executable + loaded DSOs). Returns 0 if XRay has not been initialized.
+extern size_t __xray_num_objects();
+
+/// Unpacks the function id from the given packed id.
+extern int32_t __xray_unpack_function_id(int32_t PackedId);
+
+/// Unpacks the object id from the given packed id.
+extern int32_t __xray_unpack_object_id(int32_t PackedId);
+
+/// Creates and returns a packed id from the given function and object ids.
+/// If the ids do not fit within the reserved number of bits for each part, the
+/// high bits are truncated.
+extern int32_t __xray_pack_id(int32_t FuncId, int32_t ObjId);
+
 /// Initialize the required XRay data structures. This is useful in cases where
 /// users want to control precisely when the XRay instrumentation data
 /// structures are initialized, for example when the XRay library is built with
diff --git a/compiler-rt/lib/xray/CMakeLists.txt b/compiler-rt/lib/xray/CMakeLists.txt
index cf7b5062aae32d..f38c07420c9abf 100644
--- a/compiler-rt/lib/xray/CMakeLists.txt
+++ b/compiler-rt/lib/xray/CMakeLists.txt
@@ -10,6 +10,10 @@ set(XRAY_SOURCES
   xray_utils.cpp
   )
 
+set(XRAY_DSO_SOURCES
+  xray_dso_init.cpp
+  )
+
 # Implementation files for all XRay modes.
 set(XRAY_FDR_MODE_SOURCES
   xray_fdr_flags.cpp
@@ -33,6 +37,11 @@ set(x86_64_SOURCES
   xray_trampoline_x86_64.S
   )
 
+set(x86_64_DSO_SOURCES
+   xray_trampoline_x86_64.S
+   )
+
+
 set(arm_SOURCES
   xray_arm.cpp
   xray_trampoline_arm.S
@@ -128,10 +137,12 @@ set(XRAY_IMPL_HEADERS
 # consumption by tests.
 set(XRAY_ALL_SOURCE_FILES
   ${XRAY_SOURCES}
+  ${XRAY_DSO_SOURCES}
   ${XRAY_FDR_MODE_SOURCES}
   ${XRAY_BASIC_MODE_SOURCES}
   ${XRAY_PROFILING_MODE_SOURCES}
   ${x86_64_SOURCES}
+  ${x86_64_DSO_SOURCES}
   ${arm_SOURCES}
   ${armhf_SOURCES}
   ${hexagon_SOURCES}
@@ -162,6 +173,9 @@ set(XRAY_CFLAGS
   ${COMPILER_RT_CXX_CFLAGS})
 set(XRAY_COMMON_DEFINITIONS SANITIZER_COMMON_NO_REDEFINE_BUILTINS XRAY_HAS_EXCEPTIONS=1)
 
+# DSO trampolines need to be compiled with GOT addressing
+set(XRAY_COMMON_DEFINITIONS_DSO ${XRAY_COMMON_DEFINITIONS} XRAY_PIC)
+
 # Too many existing bugs, needs cleanup.
 append_list_if(COMPILER_RT_HAS_WNO_FORMAT -Wno-format XRAY_CFLAGS)
 
@@ -201,7 +215,16 @@ if (APPLE)
     CFLAGS ${XRAY_CFLAGS}
     DEFS ${XRAY_COMMON_DEFINITIONS}
     DEPS ${XRAY_DEPS})
+  add_compiler_rt_object_libraries(RTXrayDSO
+    OS ${XRAY_SUPPORTED_OS}
+    ARCHS ${XRAY_DSO_SUPPORTED_ARCH}
+    SOURCES ${XRAY_DSO_SOURCES}
+    ADDITIONAL_HEADERS ${XRAY_IMPL_HEADERS}
+    CFLAGS ${XRAY_CFLAGS}
+    DEFS ${XRAY_COMMON_DEFINITIONS_DSO}
+    DEPS ${XRAY_DEPS})
   set(XRAY_RTXRAY_ARCH_LIBS "")
+  set(XRAY_DSO_RTXRAY_ARCH_LIBS "")
   foreach(arch ${XRAY_SUPPORTED_ARCH})
     if(NOT ${arch} IN_LIST XRAY_SOURCE_ARCHS)
       continue()
@@ -215,6 +238,17 @@ if (APPLE)
       DEFS ${XRAY_COMMON_DEFINITIONS}
       DEPS ${XRAY_DEPS})
     list(APPEND XRAY_RTXRAY_ARCH_LIBS RTXray_${arch})
+    if (${arch} IN_LIST XRAY_DSO_SUPPORTED_ARCH)
+      add_compiler_rt_object_libraries(RTXrayDSO_${arch}
+        OS ${XRAY_SUPPORTED_OS}
+        ARCHS ${XRAY_DSO_SUPPORTED_ARCH}
+        SOURCES ${${arch}_DSO_SOURCES}
+        ADDITIONAL_HEADERS ${XRAY_IMPL_HEADERS}
+        CFLAGS ${XRAY_CFLAGS}
+        DEFS ${XRAY_COMMON_DEFINITIONS_DSO}
+        DEPS ${XRAY_DEPS})
+      list(APPEND XRAY_DSO_RTXRAY_ARCH_LIBS RTXrayDSO_${arch})
+    endif()
   endforeach()
   add_compiler_rt_object_libraries(RTXrayFDR
     OS ${XRAY_SUPPORTED_OS}
@@ -252,6 +286,17 @@ if (APPLE)
     LINK_FLAGS ${XRAY_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}
     LINK_LIBS ${XRAY_LINK_LIBS}
     PARENT_TARGET xray)
+  add_compiler_rt_runtime(clang_rt.xray-dso
+    STATIC
+    OS ${XRAY_SUPPORTED_OS}
+    ARCHS ${XRAY_DSO_SUPPORTED_ARCH}
+    OBJECT_LIBS RTXrayDSO ${XRAY_DSO_RTXRAY_ARCH_LIBS}
+    CFLAGS ${XRAY_CFLAGS}
+    DEFS ${XRAY_COMMON_DEFINITIONS}
+    LINK_FLAGS ${XRAY_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}
+    LINK_LIBS ${XRAY_LINK_LIBS}
+    PARENT_TARGET xray)
+
   add_compiler_rt_runtime(clang_rt.xray-fdr
     STATIC
     OS ${XRAY_SUPPORTED_OS}
@@ -346,16 +391,37 @@ else() # not Apple
       DEFS ${XRAY_COMMON_DEFINITIONS}
       OBJECT_LIBS RTXrayBASIC
       PARENT_TARGET xray)
-   # Profiler Mode runtime
-   add_compiler_rt_runtime(clang_rt.xray-profiling
-     STATIC
-     ARCHS ${arch}
-     CFLAGS ${XRAY_CFLAGS}
-     LINK_FLAGS ${XRAY_LINK_FLAGS}
-     LINK_LIBS ${XRAY_LINK_LIBS}
-     DEFS ${XRAY_COMMON_DEFINITIONS}
-     OBJECT_LIBS RTXrayPROFILING
-     PARENT_TARGET xray)
+    # Profiler Mode runtime
+    add_compiler_rt_runtime(clang_rt.xray-profiling
+      STATIC
+      ARCHS ${arch}
+      CFLAGS ${XRAY_CFLAGS}
+      LINK_FLAGS ${XRAY_LINK_FLAGS}
+      LINK_LIBS ${XRAY_LINK_LIBS}
+      DEFS ${XRAY_COMMON_DEFINITIONS}
+      OBJECT_LIBS RTXrayPROFILING
+      PARENT_TARGET xray)
+
+    if (${arch} IN_LIST XRAY_DSO_SUPPORTED_ARCH)
+      # TODO: Only implemented for X86 at the moment
+      add_compiler_rt_object_libraries(RTXrayDSO
+        ARCHS ${arch}
+        SOURCES ${XRAY_DSO_SOURCES} ${${arch}_DSO_SOURCES} 
+        ADDITIONAL_HEADERS ${XRAY_IMPL_HEADERS}
+        CFLAGS ${XRAY_CFLAGS}
+        DEFS ${XRAY_COMMON_DEFINITIONS_DSO}
+        DEPS ${XRAY_DEPS})
+      # DSO runtime archive
+      add_compiler_rt_runtime(clang_rt.xray-dso
+        STATIC
+        ARCHS ${arch}
+        CFLAGS ${XRAY_CFLAGS}
+        LINK_FLAGS ${XRAY_LINK_FLAGS}
+        LINK_LIBS ${XRAY_LINK_LIBS}
+        DEFS ${XRAY_COMMON_DEFINITIONS}
+        OBJECT_LIBS RTXrayDSO
+        PARENT_TARGET xray)
+    endif()
   endforeach()
 endif() # not Apple
 
diff --git a/compiler-rt/lib/xray/xray_AArch64.cpp b/compiler-rt/lib/xray/xray_AArch64.cpp
index c1d77758946edf..2b151162ce4b9b 100644
--- a/compiler-rt/lib/xray/xray_AArch64.cpp
+++ b/compiler-rt/lib/xray/xray_AArch64.cpp
@@ -89,18 +89,23 @@ inline static bool patchSled(const bool Enable, const uint32_t FuncId,
 
 bool patchFunctionEntry(const bool Enable, const uint32_t FuncId,
                         const XRaySledEntry &Sled,
-                        void (*Trampoline)()) XRAY_NEVER_INSTRUMENT {
+                        const XRayTrampolines &Trampolines,
+                        bool LogArgs) XRAY_NEVER_INSTRUMENT {
+  auto Trampoline =
+      LogArgs ? Trampolines.LogArgsTrampoline : Trampolines.EntryTrampoline;
   return patchSled(Enable, FuncId, Sled, Trampoline);
 }
 
-bool patchFunctionExit(const bool Enable, const uint32_t FuncId,
-                       const XRaySledEntry &Sled) XRAY_NEVER_INSTRUMENT {
-  return patchSled(Enable, FuncId, Sled, __xray_FunctionExit);
+bool patchFunctionExit(
+    const bool Enable, const uint32_t FuncId, const XRaySledEntry &Sled,
+    const XRayTrampolines &Trampolines) XRAY_NEVER_INSTRUMENT {
+  return patchSled(Enable, FuncId, Sled, Trampolines.ExitTrampoline);
 }
 
-bool patchFunctionTailExit(const bool Enable, const uint32_t FuncId,
-                           const XRaySledEntry &Sled) XRAY_NEVER_INSTRUMENT {
-  return patchSled(Enable, FuncId, Sled, __xray_FunctionTailExit);
+bool patchFunctionTailExit(
+    const bool Enable, const uint32_t FuncId, const XRaySledEntry &Sled,
+    const XRayTrampolines &Trampolines) XRAY_NEVER_INSTRUMENT {
+  return patchSled(Enable, FuncId, Sled, Trampolines.TailExitTrampoline);
 }
 
 // AArch64AsmPrinter::LowerPATCHABLE_EVENT_CALL generates this code sequence:
diff --git a/compiler-rt/lib/xray/xray_arm.cpp b/compiler-rt/lib/xray/xray_arm.cpp
index e1818555906c35..e318bb7070e802 100644
--- a/compiler-rt/lib/xray/xray_arm.cpp
+++ b/compiler-rt/lib/xray/xray_arm.cpp
@@ -128,18 +128,23 @@ inline static bool patchSled(const bool Enable, const uint32_t FuncId,
 
 bool patchFunctionEntry(const bool Enable, const uint32_t FuncId,
                         const XRaySledEntry &Sled,
-                        void (*Trampoline)()) XRAY_NEVER_INSTRUMENT {
+                        const XRayTrampolines &Trampolines,
+                        bool LogArgs) XRAY_NEVER_INSTRUMENT {
+  auto Trampoline =
+      LogArgs ? Trampolines.LogArgsTrampoline : Trampolines.EntryTrampoline;
   return patchSled(Enable, FuncId, Sled, Trampoline);
 }
 
-bool patchFunctionExit(const bool Enable, const uint32_t FuncId,
-                       const XRaySledEntry &Sled) XRAY_NEVER_INSTRUMENT {
-  return patchSled(Enable, FuncId, Sled, __xray_FunctionExit);
+bool patchFunctionExit(
+    const bool Enable, const uint32_t FuncId, const XRaySledEntry &Sled,
+    c...
[truncated]

@sebastiankreutzer
Copy link
Contributor Author

sebastiankreutzer commented Oct 24, 2024

Tagging reviewers of previous PR: @jplehr @MaskRay @androm3da @deanberris @petrhosek

@sebastiankreutzer
Copy link
Contributor Author

Buildkite failure is due to unrelated errors in libcxx/test/libcxx/headers_in_modulemap.sh.py

Copy link
Contributor

@jplehr jplehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jplehr jplehr merged commit e738a5d into llvm:main Oct 25, 2024
12 of 14 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 25, 2024

LLVM Buildbot has detected a new failure on builder clang-m68k-linux-cross running on suse-gary-m68k-cross while building clang,compiler-rt at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/1011

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
...
[295/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/CommentHandlerTest.cpp.o
[296/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/ASTSelectionTest.cpp.o
[297/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/ExecutionTest.cpp.o
[298/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/LookupTest.cpp.o
[299/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RangeSelectorTest.cpp.o
[300/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/QualTypeNamesTest.cpp.o
[301/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/BitfieldInitializer.cpp.o
[302/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/Attr.cpp.o
[303/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CXXMethodDecl.cpp.o
[304/1129] Building CXX object tools/clang/unittests/AST/CMakeFiles/ASTTests.dir/ASTImporterTest.cpp.o
FAILED: tools/clang/unittests/AST/CMakeFiles/ASTTests.dir/ASTImporterTest.cpp.o 
/usr/bin/c++ -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/tools/clang/unittests/AST -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/AST -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/tools/clang/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/llvm/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/third-party/unittest/googletest/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/third-party/unittest/googlemock/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG -std=c++17  -Wno-variadic-macros -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -Wno-suggest-override -MD -MT tools/clang/unittests/AST/CMakeFiles/ASTTests.dir/ASTImporterTest.cpp.o -MF tools/clang/unittests/AST/CMakeFiles/ASTTests.dir/ASTImporterTest.cpp.o.d -o tools/clang/unittests/AST/CMakeFiles/ASTTests.dir/ASTImporterTest.cpp.o -c /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/AST/ASTImporterTest.cpp
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
[305/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/InitListExprPostOrder.cpp.o
[306/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/Concept.cpp.o
[307/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/InitListExprPreOrderNoQueue.cpp.o
[308/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CXXBoolLiteralExpr.cpp.o
[309/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/IntegerLiteral.cpp.o
[310/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CXXOperatorCallExprTraverser.cpp.o
[311/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/Class.cpp.o
[312/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/ImplicitCtor.cpp.o
[313/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/DeductionGuide.cpp.o
[314/1129] Building CXX object tools/clang/unittests/ASTMatchers/CMakeFiles/ASTMatchersTests.dir/ASTMatchersNarrowingTest.cpp.o
[315/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/DeclRefExpr.cpp.o
[316/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/ConstructExpr.cpp.o
[317/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/LexicallyOrderedRecursiveASTVisitorTest.cpp.o
[318/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/ImplicitCtorInitializer.cpp.o
[319/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/MemberPointerTypeLoc.cpp.o
[320/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/LambdaExpr.cpp.o
[321/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/LambdaTemplateParams.cpp.o
[322/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CXXMemberCall.cpp.o
[323/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/LambdaDefaultCapture.cpp.o
[324/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CallbacksCompoundAssignOperator.cpp.o
[325/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/ParenExpr.cpp.o
[326/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/InitListExprPostOrderNoQueue.cpp.o
[327/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CallbacksCallExpr.cpp.o
[328/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/NestedNameSpecifiers.cpp.o
[329/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/TemplateArgumentLocTraverser.cpp.o
[330/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CallbacksLeaf.cpp.o
[331/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/InitListExprPreOrder.cpp.o
[332/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CallbacksBinaryOperator.cpp.o
[333/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/CallbacksUnaryOperator.cpp.o
[334/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTestPostOrderVisitor.cpp.o
[335/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTests/TraversalScope.cpp.o
[336/1129] Building CXX object tools/clang/unittests/ASTMatchers/CMakeFiles/ASTMatchersTests.dir/ASTMatchersTraversalTest.cpp.o
[337/1129] Building CXX object tools/clang/unittests/Tooling/CMakeFiles/ToolingTests.dir/RecursiveASTVisitorTestDeclVisitor.cpp.o
ninja: build stopped: subcommand failed.

@frobtech frobtech mentioned this pull request Oct 25, 2024
@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 25, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-hwasan running on sanitizer-buildbot12 while building clang,compiler-rt at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/3071

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 83113 tests, 48 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.
FAIL: lld :: ELF/cgprofile-shared-warn.s (80861 of 83113)
******************** TEST 'lld :: ELF/cgprofile-shared-warn.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-unknown-linux /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/lld/test/ELF/cgprofile-shared-warn.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-unknown-linux /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/lld/test/ELF/cgprofile-shared-warn.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o
RUN: at line 4: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld --shared /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o -o /dev/null 2>&1 | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/count 0
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld --shared /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o -o /dev/null
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/count 0
Expected 0 lines, got 1.

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
89.40s: Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret.c
85.83s: Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret-bfloat.c
75.30s: Clang :: Driver/fsanitize.c
74.95s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
71.86s: Clang :: Headers/arm-neon-header.c
70.66s: Clang :: CodeGen/X86/rot-intrinsics.c
65.81s: Clang :: CodeGen/X86/avx-builtins.c
65.39s: Clang :: CodeGen/X86/sse2-builtins.c
63.11s: Clang :: Analysis/runtime-regression.c
55.64s: Clang :: CodeGen/X86/avx2-builtins.c
53.01s: Clang :: Driver/arm-cortex-cpus-2.c
52.13s: Clang :: Driver/arm-cortex-cpus-1.c
51.46s: Clang :: CodeGen/X86/fma-builtins-constrained.c
49.93s: Clang :: CodeGen/X86/x86_64-xsave.c
47.96s: Clang :: CodeGen/X86/sse-builtins-constrained.c
47.82s: Clang :: CodeGen/X86/sse41-builtins.c
45.30s: Clang :: CodeGen/X86/sse42-builtins.c
43.17s: Clang :: OpenMP/target_defaultmap_codegen_01.cpp
43.00s: Clang :: CodeGen/X86/avx512f-builtins.c
41.65s: Clang :: Analysis/a_flaky_crash.cpp
Step 10 (stage2/hwasan check) failure: stage2/hwasan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:513: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 83113 tests, 48 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.
FAIL: lld :: ELF/cgprofile-shared-warn.s (80861 of 83113)
******************** TEST 'lld :: ELF/cgprofile-shared-warn.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-unknown-linux /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/lld/test/ELF/cgprofile-shared-warn.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llvm-mc -filetype=obj -triple=x86_64-unknown-linux /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/lld/test/ELF/cgprofile-shared-warn.s -o /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o
RUN: at line 4: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld --shared /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o -o /dev/null 2>&1 | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/count 0
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld --shared /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test/ELF/Output/cgprofile-shared-warn.s.tmp.o -o /dev/null
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/count 0
Expected 0 lines, got 1.

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
89.40s: Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret.c
85.83s: Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret-bfloat.c
75.30s: Clang :: Driver/fsanitize.c
74.95s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
71.86s: Clang :: Headers/arm-neon-header.c
70.66s: Clang :: CodeGen/X86/rot-intrinsics.c
65.81s: Clang :: CodeGen/X86/avx-builtins.c
65.39s: Clang :: CodeGen/X86/sse2-builtins.c
63.11s: Clang :: Analysis/runtime-regression.c
55.64s: Clang :: CodeGen/X86/avx2-builtins.c
53.01s: Clang :: Driver/arm-cortex-cpus-2.c
52.13s: Clang :: Driver/arm-cortex-cpus-1.c
51.46s: Clang :: CodeGen/X86/fma-builtins-constrained.c
49.93s: Clang :: CodeGen/X86/x86_64-xsave.c
47.96s: Clang :: CodeGen/X86/sse-builtins-constrained.c
47.82s: Clang :: CodeGen/X86/sse41-builtins.c
45.30s: Clang :: CodeGen/X86/sse42-builtins.c
43.17s: Clang :: OpenMP/target_defaultmap_codegen_01.cpp
43.00s: Clang :: CodeGen/X86/avx512f-builtins.c
41.65s: Clang :: Analysis/a_flaky_crash.cpp

rupprecht pushed a commit that referenced this pull request Oct 29, 2024
Compiling with `-fxray-shared` requires position-independent code
(introduced in #113548).
Some tests do not explicitly specify this, thus falling back to the
compiler default.
If, for example, Clang is compiled with
`-DCLANG_DEFAULT_PIE_ON_LINUX=OFF`, these checks fail.

This patch addresses this issue in two tests:
- Removing a check in `xray-shared.cpp` that only tests default PIC
behavior
- Adding `-fPIC` explicitly in `clang-xray-shared.cpp`
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…lvm#90959)" (llvm#113548)

This fixes remaining issues in my previous PR llvm#90959.

Changes:
- Removed dependency on LLVM header in `xray_interface.cpp`
- Fixed XRay patching for some targets due to missing changes in
architecture-specific patching functions
- Addressed some remaining compiler warnings that I missed in the
previous patch
- Formatting

I have tested these changes on `x86_64` (natively), as well as
`ppc64le`, `aarch64` and `arm32` (cross-compiled and emulated using
qemu).

**Original description:**

This PR introduces shared library (DSO) support for XRay based on a
revised version of the implementation outlined in [this
RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000).
The feature enables the patching and handling of events from DSOs,
supporting both libraries linked at startup or explicitly loaded, e.g.
via `dlopen`.
This patch adds the following:
- The `-fxray-shared` flag to enable the feature (turned off by default)
- A small runtime library that is linked into every instrumented DSO,
providing position-independent trampolines and code to register with the
main XRay runtime
- Changes to the XRay runtime to support management and patching of
multiple objects

These changes are fully backward compatible, i.e. running without
instrumented DSOs will produce identical traces (in terms of recorded
function IDs) to the previous implementation.

Due to my limited ability to test on other architectures, this feature
is only implemented and tested with x86_64. Extending support to other
architectures is fairly straightforward, requiring only a
position-independent implementation of the architecture-specific
trampoline implementation (see
`compiler-rt/lib/xray/xray_trampoline_x86_64.S` for reference).

This patch does not include any functionality to resolve function IDs
from DSOs for the provided logging/tracing modes. These modes still work
and will record calls from DSOs, but symbol resolution for these
functions in not available. Getting this to work properly requires
recording information about the loaded DSOs and should IMO be discussed
in a separate RFC, as there are mulitple feasible approaches.

---------

Co-authored-by: Sebastian Kreutzer <[email protected]>
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
)

Compiling with `-fxray-shared` requires position-independent code
(introduced in llvm#113548).
Some tests do not explicitly specify this, thus falling back to the
compiler default.
If, for example, Clang is compiled with
`-DCLANG_DEFAULT_PIE_ON_LINUX=OFF`, these checks fail.

This patch addresses this issue in two tests:
- Removing a check in `xray-shared.cpp` that only tests default PIC
behavior
- Adding `-fPIC` explicitly in `clang-xray-shared.cpp`
MaskRay pushed a commit that referenced this pull request Nov 5, 2024
This patch adds support for `-fxray-shared` on AArch64. This feature,
introduced in #113548 for x86_64, enables the instrumentation of shared
libraries with XRay.

Changes:
- Adds AArch64 to the list of targets supporting `-fxray-shared`
- Introduces PIC versions of the AArch64 XRay trampolines 
- Adjusts relevant XRay tests
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
This patch adds support for `-fxray-shared` on AArch64. This feature,
introduced in llvm#113548 for x86_64, enables the instrumentation of shared
libraries with XRay.

Changes:
- Adds AArch64 to the list of targets supporting `-fxray-shared`
- Introduces PIC versions of the AArch64 XRay trampolines 
- Adjusts relevant XRay tests
MaskRay pushed a commit that referenced this pull request Nov 8, 2024
This patch implements support for `-fxray-shared` on AArch64 and fixes a
remaining issue in the previous PR #114431.

A bug in the XRay `CMakeLists.txt` caused the XRay assembly sources to
be built for every architecture in `XRAY_DSO_SUPPORTED_ARCH` on Apple.
This led to the compiler trying to compile AArch64 assembly for X86
targets and vice versa.
This is addressed here by ensuring that assembly sources are only built
for the matching architecture (see fixup commit).

**Original PR description:**
This patch adds support for `-fxray-shared` on AArch64. This feature,
introduced in #113548 for x86_64, enables the instrumentation of shared
libraries with XRay.

Changes:
- Adds AArch64 to the list of targets supporting `-fxray-shared`
- Introduces PIC versions of the AArch64 XRay trampolines 
- Adjusts relevant XRay tests
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…15300)

This patch implements support for `-fxray-shared` on AArch64 and fixes a
remaining issue in the previous PR llvm#114431.

A bug in the XRay `CMakeLists.txt` caused the XRay assembly sources to
be built for every architecture in `XRAY_DSO_SUPPORTED_ARCH` on Apple.
This led to the compiler trying to compile AArch64 assembly for X86
targets and vice versa.
This is addressed here by ensuring that assembly sources are only built
for the matching architecture (see fixup commit).

**Original PR description:**
This patch adds support for `-fxray-shared` on AArch64. This feature,
introduced in llvm#113548 for x86_64, enables the instrumentation of shared
libraries with XRay.

Changes:
- Adds AArch64 to the list of targets supporting `-fxray-shared`
- Introduces PIC versions of the AArch64 XRay trampolines 
- Adjusts relevant XRay tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category compiler-rt xray
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants