Would it be good to create a library object and use `LD_PRELOAD` to support self-profiling? #1

yskelg · 2024-08-07T18:06:19Z

Wow, this is a really great idea. Thank you for the inspiration, @ThinkOpenly.

Using `LD_PRELOAD` to start profiling from a constructor at program start and stop it at exit would be very convenient for profiling other programs!
If we also consider the interface of that library, we could measure specific functions as well.
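
To make the idea concrete, here is a minimal sketch of such a preload library, assuming Linux and GCC or Clang. The file name, function names, and output format are hypothetical and are not taken from this project; the counter is opened with the `perf_event_open` system call, and if that fails (for example, for lack of permission) the library silently does nothing.

```c
/* preload_sketch.c: hypothetical sketch, not this project's source.
 * Build: gcc -shared -fPIC -o preload_sketch.so preload_sketch.c
 * Use:   LD_PRELOAD=./preload_sketch.so ./some_program
 */
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static int cycles_fd = -1; /* -1 means the counter is not open */

/* Runs when the dynamic loader maps the library, before main(). */
__attribute__((constructor))
static void sketch_profile_begin(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;       /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1; /* count user-space cycles only */
    attr.exclude_hv = 1;

    /* glibc provides no wrapper, so go through syscall(2):
     * pid = 0 (this process), cpu = -1 (any), group_fd = -1, flags = 0. */
    cycles_fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    if (cycles_fd < 0)
        return; /* counter unavailable: leave profiling off */

    ioctl(cycles_fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(cycles_fd, PERF_EVENT_IOC_ENABLE, 0);
}

/* Runs at normal process exit (return from main() or exit()). */
__attribute__((destructor))
static void sketch_profile_end(void)
{
    uint64_t count = 0;

    if (cycles_fd < 0)
        return;
    ioctl(cycles_fd, PERF_EVENT_IOC_DISABLE, 0);
    if (read(cycles_fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
        fprintf(stderr, "cycles: %llu\n", (unsigned long long)count);
    close(cycles_fd);
    cycles_fd = -1;
}
```

With this shape the target program needs no source changes, but the measurement also covers everything that runs between library load and exit, not only the code of interest.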


ThinkOpenly · 2024-08-07T20:43:27Z

> Wow, this is a really great idea. Thank you for the inspiration, @ThinkOpenly.

I'm pleased that you like it!

> Using `LD_PRELOAD` to start profiling from a constructor at program start and stop it at exit would be very convenient for profiling other programs!

Are you suggesting creating a library to be pre-loaded with a constructor that implements PROFILE_BEGIN and a destructor that implements PROFILE_END? That could work, but the constructor should set a flag that the other API methods would need to check every time, just in case LD_PRELOAD was not specified. This adds a bit of additional overhead.
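
One possible reading of that flag check, as a minimal sketch: the constructor records that setup happened, and every other entry point tests the flag before doing any work. All names here are hypothetical stand-ins, not the project's API, and the counter setup itself is omitted.

```c
#include <stdbool.h>

/* Set only by the constructor; stays false if setup never ran. */
static bool sketch_initialized = false;
static unsigned long sketch_regions = 0;

__attribute__((constructor))
static void sketch_begin(void)
{
    /* Counter setup would go here. */
    sketch_initialized = true;
}

/* Hypothetical stand-in for a start call.
 * Returns 0 when a region was started, -1 when profiling is inactive. */
int sketch_region_start(void)
{
    if (!sketch_initialized) /* this branch is the extra per-call cost */
        return -1;
    sketch_regions++;
    return 0;
}
```

The check is a single predictable branch, so the overhead is small, but it is paid on every call whether or not profiling is active.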

Do you see significant advantage to using LD_PRELOAD in place of PROFILE_BEGIN/PROFILE_END?

> If we also consider the interface of that library, we could measure specific functions as well.

Tell me more about what you are suggesting here.

The current implementation requires that the code to be profiled be instrumented with PROFILE_START/PROFILE_STOP. Are you suggesting there is a way to avoid having to instrument/compile/link by using LD_PRELOAD?

This code is a simple implementation of my idea, with a focus on making the self-profile portable. It seems useful, even if there's a call to the main function, because this method seems to reduce overhead compared to using "perf record" directly. We can directly insert the code according to its original purpose.

Here are the test results from my Raspberry Pi 5, gcc version 12.2.0 (Debian 12.2.0-14):

    $ uname -a
    Linux paran 6.10.1-v8-16k+ #1 SMP PREEMPT Sat Jul 27 17:52:03 KST 2024 aarch64 GNU/Linux
    $ make run
    export PERF_COUNT_HW_CPU_CYCLES=1; ./test_profile
    Sorting...
    00: { "H", 107, 0.900000 }
    01: { "I", 111, 0.900000 }
    02: { "G", 117, 0.900000 }
    03: { "E", 127, 0.900000 }
    04: { "F", 147, 0.900000 }
    05: { "A", 157, 0.900000 }
    06: { "K", 157, 0.900000 }
    07: { "L", 157, 0.900000 }
    08: { "M", 157, 0.900000 }
    09: { "N", 157, 0.900000 }
    10: { "O", 157, 0.900000 }
    11: { "P", 157, 0.900000 }
    12: { "Z", 157, 0.900000 }
    13: { "C", 175, 0.900000 }
    14: { "J", 227, 0.900000 }
    15: { "B", 517, 0.900000 }
    16: { "D", 571, 0.900000 }
    PERF_COUNT_HW_CPU_CYCLES(0): 7970
    export PERF_COUNT_HW_CPU_CYCLES=1; LD_PRELOAD=self-profile.so ./preload_test_profile
    Sorting...
    PERF_COUNT_HW_CPU_CYCLES(0): 7444
    export PERF_COUNT_HW_CPU_CYCLES=1; LD_PRELOAD=self-profile.so ./bsearch
    Sorting...
    PERF_COUNT_HW_CPU_CYCLES(0): 6779

Signed-off-by: Yunseong Kim <[email protected]>

yskelg · 2024-08-09T02:08:05Z

Thank you @ThinkOpenly for your comments, which have helped me to articulate the self-profiling project more clearly.

One of the key strengths of this project, in my opinion, is the ability to focus profiling specifically on the code where it's needed most.

As you know, if this is used alongside production code, I believe we can split the activation between macros and build options, similar to how static trace points are activated with ftrace in the Linux kernel.
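
As a rough illustration of the build-option half of that idea, the sketch below compiles a probe in only when a flag is passed to the compiler. The macro `SKETCH_PROFILE`, the probe function, and `sketch_sum` are all hypothetical names. Unlike kernel trace points, which are compiled in and switched on at run time, this version is decided entirely at compile time.

```c
#include <stdio.h>

unsigned long sketch_probe_hits = 0;

/* Build with -DSKETCH_PROFILE to compile the probes in. */
#ifdef SKETCH_PROFILE
#define SKETCH_PROFILE_ENABLED 1
static void sketch_probe_hit(const char *label)
{
    sketch_probe_hits++;
    fprintf(stderr, "probe: %s\n", label);
}
#define SKETCH_PROBE(label) sketch_probe_hit(label)
#else
#define SKETCH_PROFILE_ENABLED 0
#define SKETCH_PROBE(label) ((void)0) /* expands to nothing useful */
#endif

/* Example of production code that carries a probe. */
int sketch_sum(const int *values, int n)
{
    int total = 0;

    SKETCH_PROBE("sketch_sum");
    for (int i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

A default build leaves no trace of the probe in the generated code; `gcc -DSKETCH_PROFILE ...` turns it on without touching the call sites.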

This project has reminded me of the importance of understanding the underlying principles to explore new approaches, rather than always relying on existing tools passively.

> Do you see significant advantage to using LD_PRELOAD in place of PROFILE_BEGIN/PROFILE_END?

My focus is on portability to other executable programs. My PR is a proof of concept based on what I’ve implemented so far, and I’m happy to update it with any additional ideas you might have. In #2, I implemented the ability to measure the original main function.

> Are you suggesting there is a way to avoid having to instrument/compile/link by using LD_PRELOAD?

If there’s a specific function the user wants to profile, similar to the main function in self-profile.c, this implementation allows for that. I focused on the reusability of the main function for now.
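
For a function that the target program calls through the dynamic linker, one general way to measure it from a preloaded library is symbol interposition: define a function with the same name, look up the original with `dlsym(RTLD_NEXT, ...)`, and time the forwarded call. The sketch below does this for `qsort` as an example. It is not the code from #2; the names are hypothetical, and wall-clock time from `clock_gettime` stands in for the cycle counter to keep the example short.

```c
/* Build: gcc -shared -fPIC -o wrap_sketch.so wrap_sketch.c
 * (older glibc versions also need -ldl) */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef void (*sketch_qsort_fn)(void *, size_t, size_t,
                                int (*)(const void *, const void *));

static unsigned long sketch_qsort_calls = 0;
static long long sketch_qsort_ns = 0;

/* Look up the next definition of qsort after this object (normally libc's).
 * The lazy lookup is not thread-safe; a real library would need to guard it. */
static sketch_qsort_fn sketch_real_qsort(void)
{
    static sketch_qsort_fn real = NULL;

    if (real == NULL)
        real = (sketch_qsort_fn)dlsym(RTLD_NEXT, "qsort");
    return real;
}

/* Same name and signature as libc's qsort, so the loader binds callers here. */
void qsort(void *base, size_t nmemb, size_t size,
           int (*compar)(const void *, const void *))
{
    struct timespec t0, t1;
    sketch_qsort_fn real = sketch_real_qsort();

    if (real == NULL)
        abort(); /* nothing to forward to */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    real(base, nmemb, size, compar);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    sketch_qsort_calls++;
    sketch_qsort_ns += (t1.tv_sec - t0.tv_sec) * 1000000000LL
                     + (t1.tv_nsec - t0.tv_nsec);
}

/* Print the totals at exit, but only if qsort was actually called. */
__attribute__((destructor))
static void sketch_report(void)
{
    if (sketch_qsort_calls > 0)
        fprintf(stderr, "qsort: %lu calls, %lld ns\n",
                sketch_qsort_calls, sketch_qsort_ns);
}
```

This only reaches calls that are resolved at run time; functions that are static, inlined, or linked statically into the executable are not affected by `LD_PRELOAD`.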

P.S.
If there are any additional features you'd like to see implemented, please feel free to open an issue or leave a comment.

Once again, thank you for the inspiration.

yskelg mentioned this issue Aug 8, 2024

self-profile: add LD_PRELOAD Support #2

Open
