-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't observe performance improvement for built-in tests with propeller #3
Comments
Hi Uttam,
The main.cc callee.cc example was to demonstrate how Propeller does
basic block reordering. I would be really surprised if this improved
performance noticeably in any manner. This is a simple "Hello world"
where you can verify via nm that Propeller is indeed doing what you would
expect with the expected basic blocks being reordered.
Propeller is really effective on programs which are front end bound like
the clang benchmark.
Thanks
Sri
…On Thu, Oct 24, 2019 at 3:54 PM Uttam Pawar ***@***.***> wrote:
Hi,
I'm not able to observe the performance benefit due to propeller toolchain
for the included test program (main.cc, callee.cc). Followed the steps
given in Propeller_RFC.pdf.
High level observations:
1. Elapsed time doesn't show any improvement.
2. cycles and instruction, branch mispredicts are almost same
3. overall cache-misses are lower but L1-icache-load-misses are similar
$ time ./a.out.orig.labels 1000000000 2 >& /dev/null
real 0m21.094s
user 0m20.489s
sys 0m0.604s
$ time ./a.out.labels 1000000000 2 >& /dev/null
real 0m20.357s
user 0m19.908s
sys 0m0.448s
Elapsed time varies from 1 to 5%.
Perf data
$ perf stat -e
cycles,instructions,cache-misses,L1-icache-load-misses,br_misp_retired.all_branches,br_inst_retired.all_branches,icache_64b.iftag_stall
./a.out.o
rig.labels 1000000000 1> /dev/null
Performance counter stats for './a.out.orig.labels 1000000000':
80,231,347,233 cycles (66.67%)
243,314,361,618 instructions # 3.03 insn per cycle (83.33%)
22,522 cache-misses (83.33%)
2,644,077 L1-icache-load-misses (83.33%)
20,400,061 br_misp_retired.all_branches (83.33%)
53,442,616,374 br_inst_retired.all_branches (83.34%)
68,554,744 icache_64b.iftag_stall (57.14%)
21.191516400 seconds time elapsed
Optimized binary
$ perf stat -e
cycles,instructions,cache-misses,L1-icache-load-misses,br_misp_retired.all_branches,br_inst_retired.all_branches,icache_64b.iftag_stall
./a.out.l
abels 1000000000 1> /dev/null
Performance counter stats for './a.out.labels 1000000000':
81,446,698,907 cycles (66.66%)
243,218,220,681 instructions # 2.99 insn per cycle (83.33%)
14,907 cache-misses (83.34%)
2,533,002 L1-icache-load-misses (83.34%)
20,571,010 br_misp_retired.all_branches (83.34%)
53,455,580,211 br_inst_retired.all_branches (83.33%)
68,847,492 icache_64b.iftag_stall (57.14%)
21.512644234 seconds time elapsed
The referenced paper doesn't mention the benefit for the included test
program. What is expected improvement for the included test?
Please see more details (build, runtime steps, etc.) in following gist.
https://gist.github.com/uttampawar/5407f998bc3f02f58c4b83b0b4dc20fe
Any hint is appreciated.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#3?email_source=notifications&email_token=AJJPQRZPEYHF7PUBRKMD6Q3QQIRRVA5CNFSM4JE4OKA2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HUH27DQ>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJJPQR7BOV2NCAR67JXU3STQQIRRVANCNFSM4JE4OKAQ>
.
|
@tmsri Thanks, this makes sense. Across, I'm going to apply this methodology to couple of runtimes (PHP and Node.js/V8) where I've seen > 35% CPU front-end bound bottleneck. Here is an specific example of running Ghost.js (Node.js workload), I see large front-end bound stalls. These numbers are derived using the TMAM Methodology We have fixed or at least reduced the ITLB_Misses by 3% with use of large_pages (Node.js optimizations) which also reduced Frontend_stall by the same amount. I want to apply propeller toolchain to see if it can help reduce these stalls with optimal code layout. |
On Fri, Oct 25, 2019 at 9:57 AM Uttam Pawar ***@***.***> wrote:
@tmsri <https://github.com/tmsri> Thanks, this makes sense. I'm going to
apply this methodology to couple of runtimes (PHP and Node.js/V8) where
I've seen > 35% CPU front-end bound bottleneck.
Sounds good, please let us know how this goes. We are happy to help in
case you run into issues.
Sri
… —
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3?email_source=notifications&email_token=AJJPQR3RWW2VCKWJLWYHBXLQQMQP7A5CNFSM4JE4OKA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECI6DKA#issuecomment-546431400>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJJPQRYRXFN4PLPD5CK4YOLQQMQP7ANCNFSM4JE4OKAQ>
.
|
During register coalescing, we update the live-intervals on-the-fly. To do that we are in this strange mode where the live-intervals can be slightly out-of-sync (more precisely they are forward looking) compared to what the IR actually represents. This happens because the register coalescer only updates the IR when it is done with updating the live-intervals and it has to do it this way because updating the IR on-the-fly would actually clobber some information on how the live-ranges that are being updated look like. This is problematic for updates that rely on the IR to accurately represents the state of the live-ranges. Right now, we have only one of those: stripValuesNotDefiningMask. To reconcile this need of out-of-sync IR, this patch introduces a new argument to LiveInterval::refineSubRanges that allows the code doing the live range updates to reason about how the code should look like after the coalescer will have rewritten the registers. Essentially this captures how a subregister index with be offseted to match its position in a new register class. E.g., let say we want to merge: V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32> We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32> overlap, i.e., by choosing a class where we can find "offset + 1 == 3". Put differently we align V2's sub3 with V1's sub1: V2: sub0 sub1 sub2 sub3 V1: <offset> sub0 sub1 This offset will look like a composed subregidx in the the class: V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> => V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> Now if we didn't rewrite the uses and def of V1, all the checks for V1 need to account for this offset to match what the live intervals intend to capture. Prior to this patch, we would fail to recognize the uses and def of V1 and would end up with machine verifier errors: No live segment at def. This could lead to miscompile as we would drop some live-ranges and thus, miss some interferences. For this problem to trigger, we need to reach stripValuesNotDefiningMask while having a mismatch between the IR and the live-ranges (i.e., we have to apply a subreg offset to the IR.) This requires the following three conditions: 1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1> 2. An update with Tuple registers with a possibility to coalesce the subreg index: e.g., v1.dsub_1 == v2.dsub_3 3. Subreg liveness enabled. looking at the IR to decide what is alive and what is not, i.e., calling stripValuesNotDefiningMask. coalescer maintains for the live-ranges information. None of the targets that currently use subreg liveness (i.e., the targets that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and and #2, so this patch also artificial enables subreg liveness for ARM, so that a nice test case can be attached.
Note: I tried to use propeller optimization technique on two workloads but don't see any benefit. Here is a new data point for the 'node' binary which includes 'd8' JavaScript engine which is built with llvm-propeller and a "webtooling" workload (https://github.com/v8/web-tooling-benchmark). Performance data using "perf" is in the gist at, On the side note, can someone provide me instructions on how to use 'propeller' steps to build clang compiler and verify the benefit with 'clang' compiler as described in the paper? |
Hi Uttam,
On Mon, Nov 25, 2019 at 6:04 PM Uttam Pawar ***@***.***> wrote:
Note: I tried to use propeller optimization technique on two workloads but
don't see any benefit.
Here is a new data point for the 'node' binary which includes 'd8'
JavaScript engine which is built with llvm-propeller and a "webtooling"
workload (https://github.com/v8/web-tooling-benchmark).
Performance data using "perf" is in the gist at,
https://gist.github.com/uttampawar/759078fa749170b6b75815874a81162a
We will try this out too and let you know. Do you know how front end
bound this benchmark is?
On the side note, can someone provide me instructions on how to use
'propeller' steps to build clang compiler and verify the benefit with
'clang' compiler as described in the paper?
There is a "plo" directory that is checked in as part of
google/llvm-propeller. Do the following:
$ cd plo
$ make check-performance
This will build clang with propeller and you should be able to verify the
performance.
Thanks
Sri
… I appreciate the help. TIA.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3?email_source=notifications&email_token=AJJPQR7HO4QE7Y7DVGTAJFDQVR72HA5CNFSM4JE4OKA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFEOHDA#issuecomment-558424972>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJJPQRZCQTTSJTJHARLZQ4TQVR72HANCNFSM4JE4OKAQ>
.
|
Sri, Following are the details about frontend-bound stalls (calculated using TMAM methodology while workload is in steady state) And Ghost.js (original): |
@tmsri make check-performance failed. cherEmitter3runERN4llvm11raw_ostreamE.$45fc0800c800f0679e2102212a27a313.llvm.6216348528576891080)
...
ld.lld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors) BTW, is there is a way to test/verify the benefit just between default clang and clang with propeller (without PGO or LTO)? |
On Tue, Nov 26, 2019 at 10:43 AM Uttam Pawar ***@***.***> wrote:
@tmsri <https://github.com/tmsri> make check-performance failed.
cherEmitter3runERN4llvm11raw_ostreamE.$45fc0800c800f0679e2102212a27a313.llvm.6216348528576891080)
referenced 378 more times
...
ld.lld: error: undefined symbol:
llvm::Record::getValueAsInt(llvm::StringRef) const
referenced by AsmMatcherEmitter.cpp
/mnt/sdb1/upawar/llvm-propeller/pl
Sorry about that, I think we synced yesterday and this caused a break. We
will fix this asap.
… ..
...
dcd29.tmp.o:((anonymous
namespace)::CallingConvEmitter::EmitAction(llvm::Record*, unsigned int,
llvm::raw_ostream&) (.$4422eb9397b83d587c5eecd7b780bf63))
referenced 97 more times
ld.lld: error: too many errors emitted, stopping now (use -error-limit=0
to see all errors)
clang-10: error: linker command failed with exit code 1 (use -v to see
invocation)
ninja: build stopped: subcommand failed.
Makefile:189: recipe for target 'pgo/build/bin/clang-10' failed
make: *** [pgo/build/bin/clang-10] Error 1
This looks like error during LTO build.
BTW, is there is a way to test/verify the benefit just between default
clang and clang with propeller (without PGO or LTO)?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3?email_source=notifications&email_token=AJJPQR6ME57GJ7AJEN3GS3DQVVU6HA5CNFSM4JE4OKA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFHBL6I#issuecomment-558765561>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJJPQRZLYUD6GQTH2KCRXTLQVVU6HANCNFSM4JE4OKAQ>
.
|
@uttampawar |
Cool. I'll give it a try. Thanks. |
@rlavaee The performance test with "make check-performance" progressed but didn't succeed completely. It looks like pgo-labels build failed. Looking into plo/pgo-labels I found, $ cd plo Other binaries I found in "plo" ... $ ls -l stage-pgo-labels/build/bin/clang-10 $ ls -l pgo-vanilla/build/bin/clang-10 (0 byte file) $ ls -l stage-pgo-vanilla/build/bin/clang-10 My environment: Change in paths.mk file, Am I missing something or environmental issue? Any help is appreciated. TIA. |
@tmsri @rlavaee See complete log in a gist at, |
Hi, Uttam, sorry for the late reply, just got back from Thanksgiving.
I see lots of gold linker plugin errors in building pgo-lables. In theory,
gold linker should never be involved in the whole process. Let me dig a
little bit to see how this could happen.
Thanks,
Han
…On Wed, Nov 27, 2019 at 12:04 PM Uttam Pawar ***@***.***> wrote:
@tmsri <https://github.com/tmsri> @rlavaee <https://github.com/rlavaee>
See complete log in a gist at,
https://gist.github.com/uttampawar/8f6d1ec0c9627d50066cb2e0ca35859a
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#3?email_source=notifications&email_token=AELA2QUWWFGZIKQ3XPMPYALQV3HEPA5CNFSM4JE4OKA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFKTT7I#issuecomment-559233533>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AELA2QUHS5NOP2BNC5JMJR3QV3HEPANCNFSM4JE4OKAQ>
.
--
Han Shen | Software Engineer | [email protected] | +1-650-440-3330
|
@shenhanc78 Okay. Thanks for the followup. |
@shenhanc78 Any update? TIA. |
Hi Uttam, yup. I've just pushed a new version which forces lld to be used
along all builds. You may pull / sync and have another try.
I'm happy to help if any further problems.
Thanks!
-Han
…On Fri, Dec 6, 2019 at 10:32 AM Uttam Pawar ***@***.***> wrote:
@shenhanc78 <https://github.com/shenhanc78> Any update? TIA.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3?email_source=notifications&email_token=AELA2QWS6O4IYEUL4E6R2PDQXKLDTA5CNFSM4JE4OKA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEGE6ZZA#issuecomment-562687204>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AELA2QWZ3DR544MZVRE5Z3DQXKLDTANCNFSM4JE4OKAQ>
.
--
Han Shen | Software Engineer | [email protected] | +1-650-440-3330
|
That's great. I'll give it a try. Thanks. |
…instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
…t binding This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30): PASS: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325 frame #1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295 frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame #3: 0x000000000040113a a.out`func_b at main.c:18:2 frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame #6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243 frame #7: 0x000000000040106e a.out`_start + 46 vs. FAIL - unrecognized abort() function: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325 frame #1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295 frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame #3: 0x000000000040113a a.out`func_b at main.c:18:2 frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame #6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243 frame #7: 0x000000000040106e a.out`.annobin_init.c.hot + 46 The extra ELF symbols are there due to Annobin (I did not investigate why this problem happened specifically since F-30 and not since F-28). It is due to: Symbol table '.dynsym' contains 2361 entries: Valu e Size Type Bind Vis Name 0000000000022769 5 FUNC LOCAL DEFAULT _nl_load_domain.cold 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_abort.c.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_loadmsgcat.c_end.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_textdomain.c_end.unlikely 000000000002276e 548 FUNC GLOBAL DEFAULT abort 000000000002276e 548 FUNC GLOBAL DEFAULT abort@@GLIBC_2.2.5 000000000002276e 548 FUNC LOCAL DEFAULT __GI_abort 0000000000022992 0 NOTYPE LOCAL HIDDEN .annobin_abort.c_end.unlikely GDB has some more complicated preferences between overlapping and/or sharing address symbols, I have made here so far the most simple fix for this case. Differential revision: https://reviews.llvm.org/D63540
…imizing part. Summary: This is the next portion of patches for dsymutil. Create DwarfEmitter interface to generate all debug info tables. Put DwarfEmitter into DwarfLinker library and make tools/dsymutil/DwarfStreamer to be child of DwarfEmitter. It passes check-all testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: merge_guards_bot, hiraditya, thegameg, probinson, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D72476
The test is currently failing on some systems with ASAN enabled due to: ``` ==22898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003da4 at pc 0x00010951c33d bp 0x7ffee6709e00 sp 0x7ffee67095c0 READ of size 5 at 0x603000003da4 thread T0 #0 0x10951c33c in wrap_memmove+0x16c (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x1833c) #1 0x7fff4a327f57 in CFDataReplaceBytes+0x1ba (CoreFoundation:x86_64+0x13f57) #2 0x7fff4a415a44 in __CFDataInit+0x2db (CoreFoundation:x86_64+0x101a44) #3 0x1094f8490 in main main.m:424 #4 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084) 0x603000003da4 is located 0 bytes to the right of 20-byte region [0x603000003d90,0x603000003da4) allocated by thread T0 here: #0 0x109547c02 in wrap_calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x43c02) #1 0x7fff763ad3ef in class_createInstance+0x52 (libobjc.A.dylib:x86_64+0x73ef) #2 0x7fff4c6b2d73 in NSAllocateObject+0x12 (Foundation:x86_64+0x1d73) #3 0x7fff4c6b5e5f in -[_NSPlaceholderData initWithBytes:length:copy:deallocator:]+0x40 (Foundation:x86_64+0x4e5f) #4 0x7fff4c6d4cf1 in -[NSData(NSData) initWithBytes:length:]+0x24 (Foundation:x86_64+0x23cf1) #5 0x1094f8245 in main main.m:404 #6 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084) ``` The reason is that we create a string "HELLO" but get the size wrong (it's 5 bytes instead of 4). Later on we read the buffer and pretend it is 5 bytes long, causing an OOB read which ASAN detects. In general this test probably needs some cleanup as it produces on macOS 10.15 around 100 compiler warnings which isn't great, but let's first get the bot green.
This reverts commit e57a9ab. Parser/cxx2a-placeholder-type-constraint.cpp has MSan failures. Present at 7b81c3f: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/17133/steps/check-clang%20msan/logs/stdio not present at eaa594f: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/17132/steps/check-clang%20msan/logs/stdio Stack trace: ``` ==57032==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xccfe016 in clang::AutoTypeLoc::getLocalSourceRange() const /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/AST/TypeLoc.h:2036:19 #1 0xcc56758 in CheckDeducedPlaceholderConstraints(clang::Sema&, clang::AutoType const&, clang::AutoTypeLoc, clang::QualType) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4505:56 #2 0xcc550ce in clang::Sema::DeduceAutoType(clang::TypeLoc, clang::Expr*&, clang::QualType&, llvm::Optional<unsigned int>, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4707:11 #3 0xcc52407 in clang::Sema::DeduceAutoType(clang::TypeSourceInfo*, clang::Expr*&, clang::QualType&, llvm::Optional<unsigned int>, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4457:10 #4 0xba38332 in clang::Sema::deduceVarTypeFromInitializer(clang::VarDecl*, clang::DeclarationName, clang::QualType, clang::TypeSourceInfo*, clang::SourceRange, bool, clang::Expr*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:11351:7 #5 0xba3a8a9 in clang::Sema::DeduceVariableDeclarationType(clang::VarDecl*, bool, clang::Expr*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:11385:26 #6 0xba3c520 in clang::Sema::AddInitializerToDecl(clang::Decl*, clang::Expr*, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:11725:9 #7 0xb39c498 in clang::Parser::ParseDeclarationAfterDeclaratorAndAttributes(clang::Declarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2399:17 #8 0xb394d80 in clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::SourceLocation*, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2128:21 #9 0xb383bbf in clang::Parser::ParseSimpleDeclaration(clang::DeclaratorContext, clang::SourceLocation&, clang::Parser::ParsedAttributesWithRange&, bool, clang::Parser::ForRangeInit*, clang::SourceLocation*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:1848:10 #10 0xb383129 in clang::Parser::ParseDeclaration(clang::DeclaratorContext, clang::SourceLocation&, clang::Parser::ParsedAttributesWithRange&, clang::SourceLocation*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/PointerUnion.h #11 0xb53a388 in clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*, clang::Parser::ParsedAttributesWithRange&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:221:13 #12 0xb539309 in clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:106:20 #13 0xb55610e in clang::Parser::ParseCompoundStatementBody(bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:1079:11 #14 0xb559529 in clang::Parser::ParseFunctionStatementBody(clang::Decl*, clang::Parser::ParseScope&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:2204:21 #15 0xb33c13e in clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::LateParsedAttrList*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:1339:10 #16 0xb394703 in clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::SourceLocation*, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2068:11 #17 0xb338e52 in clang::Parser::ParseDeclOrFunctionDefInternal(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:1099:10 #18 0xb337674 in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:1115:12 #19 0xb334a96 in clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:935:12 #20 0xb32f12a in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:686:12 #21 0xb31e193 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:158:20 #22 0x80263f0 in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:936:8 #23 0x7f2a257 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:965:33 #24 0x8288bef in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:290:25 #25 0xad44c2 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/tools/driver/cc1_main.cpp:239:15 #26 0xacd76a in ExecuteCC1Tool(llvm::ArrayRef<char const*>) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/tools/driver/driver.cpp:325:12 #27 0xacc9fd in main /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/tools/driver/driver.cpp:398:12 #28 0x7f7d82cdb2e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #29 0xa4dde9 in _start (/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan/bin/clang-11+0xa4dde9) ```
This reverts commit dfecec6. Merging the change revealed that there is a failure on the memory sanitizer bots. Command Output (stderr): -- ==3569==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x1d71bff in llvm::AVRSubtarget::ParseSubtargetFeatures(llvm::StringRef, llvm::StringRef) /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/lib/Target/AVR/AVRGenSubtargetInfo.inc:471:7 #1 0x1d721f8 in initializeSubtargetDependencies /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AVR/AVRSubtarget.cpp:50:3 #2 0x1d721f8 in llvm::AVRSubtarget::AVRSubtarget(llvm::Triple const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, llvm::AVRTargetMachine const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AVR/AVRSubtarget.cpp:33:18 #3 0x1d3077f in llvm::AVRTargetMachine::AVRTargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, llvm::Optional<llvm::Reloc::Model>, llvm::Optional<llvm::CodeModel::Model>, llvm::CodeGenOpt::Level, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AVR/AVRTargetMachine.cpp:52:7 #4 0x1d3169d in llvm::RegisterTargetMachine<llvm::AVRTargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, llvm::Optional<llvm::Reloc::Model>, llvm::Optional<llvm::CodeModel::Model>, llvm::CodeGenOpt::Level, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/TargetRegistry.h:1121:16 #5 0x86662f in createTargetMachine /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/TargetRegistry.h:402:12 #6 0x86662f in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:473:52 #7 0x861f42 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:356:22 #8 0x7f76f7b072e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #9 0x7ebbc9 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x7ebbc9) SUMMARY: MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/lib/Target/AVR/AVRGenSubtargetInfo.inc:471:7 in llvm::AVRSubtarget::ParseSubtargetFeatures(llvm::StringRef, llvm::StringRef) Exiting FileCheck error: '<stdin>' is empty. -- The patch wiill be re-committed once fixed.
hi @uttampawar can you share how you built the node binary for your optimization experiment? |
Summary: Previously `AtosSymbolizer` would set the PID to examine in the constructor which is called early on during sanitizer init. This can lead to incorrect behaviour in the case of a fork() because if the symbolizer is launched in the child it will be told examine the parent process rather than the child. To fix this the PID is determined just before the symbolizer is launched. A test case is included that triggers the buggy behaviour that existed prior to this patch. The test observes the PID that `atos` was called on. It also examines the symbolized stacktrace. Prior to this patch `atos` failed to symbolize the stacktrace giving output that looked like... ``` #0 0x100fc3bb5 in __sanitizer_print_stack_trace asan_stack.cpp:86 #1 0x10490dd36 in PrintStack+0x56 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_shared_lib.dylib:x86_64+0xd36) #2 0x100f6f986 in main+0x4a6 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_loader:x86_64+0x100001986) #3 0x7fff714f1cc8 in start+0x0 (/usr/lib/system/libdyld.dylib:x86_64+0x1acc8) ``` After this patch stackframes `#1` and `#2` are fully symbolized. This patch is also a pre-requisite refactor for rdar://problem/58789439. Reviewers: kubamracek, yln Subscribers: #sanitizers, llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D77623
Summary: Previously `AtosSymbolizer` would set the PID to examine in the constructor which is called early on during sanitizer init. This can lead to incorrect behaviour in the case of a fork() because if the symbolizer is launched in the child it will be told examine the parent process rather than the child. To fix this the PID is determined just before the symbolizer is launched. A test case is included that triggers the buggy behaviour that existed prior to this patch. The test observes the PID that `atos` was called on. It also examines the symbolized stacktrace. Prior to this patch `atos` failed to symbolize the stacktrace giving output that looked like... ``` #0 0x100fc3bb5 in __sanitizer_print_stack_trace asan_stack.cpp:86 #1 0x10490dd36 in PrintStack+0x56 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_shared_lib.dylib:x86_64+0xd36) #2 0x100f6f986 in main+0x4a6 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_loader:x86_64+0x100001986) #3 0x7fff714f1cc8 in start+0x0 (/usr/lib/system/libdyld.dylib:x86_64+0x1acc8) ``` After this patch stackframes `#1` and `#2` are fully symbolized. This patch is also a pre-requisite refactor for rdar://problem/58789439. Reviewers: kubamracek, yln Subscribers: #sanitizers, llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D77623
Summary: crash stack: ``` lang: tools/clang/include/clang/AST/AttrImpl.inc:1490: unsigned int clang::AlignedAttr::getAlignment(clang::ASTContext &) const: Assertion `!isAlignmentDependent()' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: ./bin/clang -cc1 -std=c++1y -ast-dump -frecovery-ast -fcxx-exceptions /tmp/t4.cpp 1. /tmp/t4.cpp:3:31: current parser token ';' #0 0x0000000002530cff llvm::sys::PrintStackTrace(llvm::raw_ostream&) llvm-project/llvm/lib/Support/Unix/Signals.inc:564:13 #1 0x000000000252ee30 llvm::sys::RunSignalHandlers() llvm-project/llvm/lib/Support/Signals.cpp:69:18 #2 0x000000000253126c SignalHandler(int) llvm-project/llvm/lib/Support/Unix/Signals.inc:396:3 #3 0x00007f86964d0520 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x13520) #4 0x00007f8695f9ff61 raise /build/glibc-oCLvUT/glibc-2.29/signal/../sysdeps/unix/sysv/linux/raise.c:51:1 #5 0x00007f8695f8b535 abort /build/glibc-oCLvUT/glibc-2.29/stdlib/abort.c:81:7 #6 0x00007f8695f8b40f _nl_load_domain /build/glibc-oCLvUT/glibc-2.29/intl/loadmsgcat.c:1177:9 #7 0x00007f8695f98b92 (/lib/x86_64-linux-gnu/libc.so.6+0x32b92) #8 0x0000000004503d9f llvm::APInt::getZExtValue() const llvm-project/llvm/include/llvm/ADT/APInt.h:1623:5 #9 0x0000000004503d9f clang::AlignedAttr::getAlignment(clang::ASTContext&) const llvm-project/build/tools/clang/include/clang/AST/AttrImpl.inc:1492:0 ``` Reviewers: sammccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78085
Bitcode file alignment is only 32-bit so 64-bit offsets need special handling. /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6327:28: runtime error: load of misaligned address 0x7fca2bcfe54c for type 'const uint64_t' (aka 'const unsigned long'), which requires 8 byte alignment 0x7fca2bcfe54c: note: pointer points here 00 00 00 00 5a a6 01 00 00 00 00 00 19 a7 01 00 00 00 00 00 48 a7 01 00 00 00 00 00 7d a7 01 00 ^ #0 0x3be2fe4 in clang::ASTReader::TypeCursorForIndex(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6327:28 #1 0x3be30a0 in clang::ASTReader::readTypeRecord(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6348:24 #2 0x3bd3d4a in clang::ASTReader::GetType(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6985:26 #3 0x3c5d9ae in clang::ASTDeclReader::Visit(clang::Decl*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:533:31 #4 0x3c91cac in clang::ASTReader::ReadDeclRecord(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:4045:10 #5 0x3bd4fb1 in clang::ASTReader::GetDecl(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:7352:5 #6 0x3bce2f9 in clang::ASTReader::ReadASTBlock(clang::serialization::ModuleFile&, unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:3625:22 #7 0x3bd6d75 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, llvm::SmallVectorImpl<clang::ASTReader::ImportedSubmodule>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:4230:32 #8 0x3a6b415 in clang::CompilerInstance::createPCHExternalASTSource(llvm::StringRef, llvm::StringRef, bool, bool, clang::Preprocessor&, clang::InMemoryModuleCache&, clang::ASTContext&, clang::PCHContainerReader const&, llvm::ArrayRef<std::shared_ptr<clang::ModuleFileExtension> >, llvm::ArrayRef<std::shared_ptr<clang::DependencyCollector> >, void*, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:539:19 #9 0x3a6b00e in clang::CompilerInstance::createPCHExternalASTSource(llvm::StringRef, bool, bool, void*, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:501:18 #10 0x3abac80 in clang::FrontendAction::BeginSourceFile(clang::CompilerInstance&, clang::FrontendInputFile const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:865:12 #11 0x3a6e61c in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:972:13 #12 0x3ba74bf in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:282:25 #13 0xa3f753 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/cc1_main.cpp:240:15 #14 0xa3a68a in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:330:12 #15 0xa37f31 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:407:12 #16 0x7fca2a7032e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #17 0xa21029 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/clang-11+0xa21029) This reverts commit 30d5946.
Since G_ICMP can be selected to a SUBS, we can fold shifts into such compares. E.g. ``` cmp w1, w0, lsl #3 cmp w1, w0, lsr #3 cmp w1, w0, asr #3 ``` This is done the same way as for adds and subtracts, using `selectShiftedRegister`. This gives some minor code size savings on CTMark. https://reviews.llvm.org/D79365
Summary: The previous code tries to strip out parentheses and anything in between them. I'm guessing the idea here was to try to drop any listed arguments for the function being symbolized. Unfortunately this approach is broken in several ways. * Templated functions may contain parentheses. The existing approach messes up these names. * In C++ argument types are part of a function's signature for the purposes of overloading so removing them could be confusing. Fix this simply by not trying to adjust the function name that comes from `atos`. A test case is included. Without the change the test case produced output like: ``` WRITE of size 4 at 0x6060000001a0 thread T0 #0 0x10b96614d in IntWrapper<void >::operator=> const&) asan-symbolize-templated-cxx.cpp:10 #1 0x10b960b0e in void writeToA<IntWrapper<void > >>) asan-symbolize-templated-cxx.cpp:30 #2 0x10b96bf27 in decltype>)>> >)) std::__1::__invoke<void >), IntWrapper<void > >>), IntWrapper<void >&&) type_traits:4425 #3 0x10b96bdc1 in void std::__1::__invoke_void_return_wrapper<void>::__call<void >), IntWrapper<void > >>), IntWrapper<void >&&) __functional_base:348 #4 0x10b96bd71 in std::__1::__function::__alloc_func<void >), std::__1::allocator<void >)>, void >)>::operator>&&) functional:1533 #5 0x10b9684e2 in std::__1::__function::__func<void >), std::__1::allocator<void >)>, void >)>::operator>&&) functional:1707 #6 0x10b96cd7b in std::__1::__function::__value_func<void >)>::operator>&&) const functional:1860 #7 0x10b96cc17 in std::__1::function<void >)>::operator>) const functional:2419 #8 0x10b960ca6 in Foo<void >), IntWrapper<void > >::doCall>) asan-symbolize-templated-cxx.cpp:44 #9 0x10b96088b in main asan-symbolize-templated-cxx.cpp:54 #10 0x7fff6ffdfcc8 in start (in libdyld.dylib) + 0 ``` Note how the symbol names for the frames are messed up (e.g. #8, #1). With the patch the output looks like: ``` WRITE of size 4 at 0x6060000001a0 thread T0 #0 0x10005214d in IntWrapper<void (int)>::operator=(IntWrapper<void (int)> const&) asan-symbolize-templated-cxx.cpp:10 #1 0x10004cb0e in void writeToA<IntWrapper<void (int)> >(IntWrapper<void (int)>) asan-symbolize-templated-cxx.cpp:30 #2 0x100057f27 in decltype(std::__1::forward<void (*&)(IntWrapper<void (int)>)>(fp)(std::__1::forward<IntWrapper<void (int)> >(fp0))) std::__1::__invoke<void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)> >(void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)>&&) type_traits:4425 #3 0x100057dc1 in void std::__1::__invoke_void_return_wrapper<void>::__call<void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)> >(void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)>&&) __functional_base:348 #4 0x100057d71 in std::__1::__function::__alloc_func<void (*)(IntWrapper<void (int)>), std::__1::allocator<void (*)(IntWrapper<void (int)>)>, void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>&&) functional:1533 #5 0x1000544e2 in std::__1::__function::__func<void (*)(IntWrapper<void (int)>), std::__1::allocator<void (*)(IntWrapper<void (int)>)>, void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>&&) functional:1707 #6 0x100058d7b in std::__1::__function::__value_func<void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>&&) const functional:1860 #7 0x100058c17 in std::__1::function<void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>) const functional:2419 #8 0x10004cca6 in Foo<void (IntWrapper<void (int)>), IntWrapper<void (int)> >::doCall(IntWrapper<void (int)>) asan-symbolize-templated-cxx.cpp:44 #9 0x10004c88b in main asan-symbolize-templated-cxx.cpp:54 #10 0x7fff6ffdfcc8 in start (in libdyld.dylib) + 0 ``` rdar://problem/58887175 Reviewers: kubamracek, yln Subscribers: #sanitizers, llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D79597
sanitizer-x86_64-linux-autoconf has failed after the previous tsan commit: FAIL: ThreadSanitizer-x86_64 :: java_finalizer2.cpp (245 of 403) ******************** TEST 'ThreadSanitizer-x86_64 :: java_finalizer2.cpp' FAILED ******************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-autoconf/build/tsan_debug_build/./bin/clang --driver-mode=g++ -fsanitize=thread -Wall -m64 -gline-tables-only -I/b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/test/tsan/../ -std=c++11 -I/b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/test/tsan/../ -nostdinc++ -I/b/sanitizer-x86_64-linux-autoconf/build/tsan_debug_build/tools/clang/runtime/compiler-rt-bins/lib/tsan/libcxx_tsan_x86_64/include/c++/v1 -O1 /b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/test/tsan/java_finalizer2.cpp -o /b/sanitizer-x86_64-linux-autoconf/build/tsan_debug_build/tools/clang/runtime/compiler-rt-bins/test/tsan/X86_64Config/Output/java_finalizer2.cpp.tmp && /b/sanitizer-x86_64-linux-autoconf/build/tsan_debug_build/tools/clang/runtime/compiler-rt-bins/test/tsan/X86_64Config/Output/java_finalizer2.cpp.tmp 2>&1 | FileCheck /b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/test/tsan/java_finalizer2.cpp -- Exit Code: 1 Command Output (stderr): -- /b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/test/tsan/java_finalizer2.cpp:82:11: error: CHECK: expected string not found in input // CHECK: DONE ^ <stdin>:1:1: note: scanning from here FATAL: ThreadSanitizer CHECK failed: /b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/lib/tsan/rtl/tsan_sync.cpp:69 "((*meta)) == ((0))" (0x4000003e, 0x0) ^ <stdin>:5:12: note: possible intended match here #3 __tsan::OnUserAlloc(__tsan::ThreadState*, unsigned long, unsigned long, unsigned long, bool) /b/sanitizer-x86_64-linux-autoconf/build/llvm-project/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:225:16 (java_finalizer2.cpp.tmp+0x4af407) ^ http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/51143/steps/test%20tsan%20in%20debug%20compiler-rt%20build/logs/stdio Fix heap object overlap by offsetting java heap as other tests are doing.
Summary: crash stack: ``` llvm-project/clang/lib/AST/ASTContext.cpp:2248: clang::TypeInfo clang::ASTContext::getTypeInfoImpl(const clang::Type *) const: Assertion `!A->getDeducedType().isNull() && "cannot request the size of an undeduced or dependent auto type"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: #0 0x00000000025bb0bf llvm::sys::PrintStackTrace(llvm::raw_ostream&) llvm-project/llvm/lib/Support/Unix/Signals.inc:564:13 #1 0x00000000025b92b0 llvm::sys::RunSignalHandlers() llvm-project/llvm/lib/Support/Signals.cpp:69:18 #2 0x00000000025bb535 SignalHandler(int) llvm-project/llvm/lib/Support/Unix/Signals.inc:396:3 #3 0x00007f9ef9298110 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14110) #4 0x00007f9ef8d72761 raise /build/glibc-M65Gwz/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:51:1 #5 0x00007f9ef8d5c55b abort /build/glibc-M65Gwz/glibc-2.30/stdlib/abort.c:81:7 #6 0x00007f9ef8d5c42f get_sysdep_segment_value /build/glibc-M65Gwz/glibc-2.30/intl/loadmsgcat.c:509:8 #7 0x00007f9ef8d5c42f _nl_load_domain /build/glibc-M65Gwz/glibc-2.30/intl/loadmsgcat.c:970:34 #8 0x00007f9ef8d6b092 (/lib/x86_64-linux-gnu/libc.so.6+0x34092) #9 0x000000000458abe0 clang::ASTContext::getTypeInfoImpl(clang::Type const*) const llvm-project/clang/lib/AST/ASTContext.cpp:0:5 ``` Reviewers: sammccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81384
…RM` was undefined after definition. `PP->getMacroInfo()` returns nullptr for undefined macro, which leads to null-dereference at `MI->tockens().back()`. Stack dump: ``` #0 0x000000000217d15a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x217d15a) #1 0x000000000217b17c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x217b17c) #2 0x000000000217b2e3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x217b2e3) #3 0x00007f39be5b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) #4 0x0000000000593532 clang::tidy::bugprone::BadSignalToKillThreadCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x593532) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85401
…RM` is not a literal. If `SIGTERM` is not a literal (e.g. `#define SIGTERM ((unsigned)15)`) bugprone-bad-signal-to-kill-thread check crashes. Stack dump: ``` #0 0x000000000217d15a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x217d15a) #1 0x000000000217b17c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x217b17c) #2 0x000000000217b2e3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x217b2e3) #3 0x00007f6a7efb1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) #4 0x000000000212ac9b llvm::StringRef::getAsInteger(unsigned int, llvm::APInt&) const (/llvm-project/build/bin/clang-tidy+0x212ac9b) #5 0x0000000000593501 clang::tidy::bugprone::BadSignalToKillThreadCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x593501) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85398
The following bpf linux kernel selftest failed with latest llvm: $ ./test_progs -n 7/10 ... The sequence of 8193 jumps is too complex. verification time 126272 usec stack depth 320 processed 114799 insns (limit 1000000) ... libbpf: failed to load object 'pyperf600_nounroll.o' test_bpf_verif_scale:FAIL:110 #7/10 pyperf600_nounroll.o:FAIL #7 bpf_verif_scale:FAIL After some investigation, I found the following llvm patch https://reviews.llvm.org/D84108 is responsible. The patch disabled hoisting common instructions in SimplifyCFG by default. Later on, the code changes and a SimplifyCFG phase with hoisting on cannot do the work any more. A test is provided to demonstrate the problem. The IR before simplifyCFG looks like: for.cond: %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] %cmp = icmp ult i32 %i.0, 6 br i1 %cmp, label %for.body, label %for.cond.cleanup for.cond.cleanup: %2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2 %cmp2 = icmp eq i8* %2, null %conv = zext i1 %cmp2 to i32 call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3 call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3 ret i32 %conv for.body: %3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2 %tobool.not = icmp eq i8* %3, null br i1 %tobool.not, label %for.inc, label %land.lhs.true The first two insns of `for.cond.cleanup` and `for.body`, load and icmp, can be hoisted to `for.cond` block. With Patch D84108, the optimization is delayed. But unfortunately, later on loop rotation added addition phi nodes to `for.body` and hoisting cannot be done any more. Note such a hoisting is beneficial to bpf programs as bpf verifier does path sensitive analysis and verification. The hoisting preverts reloading from stack which will assume conservative value and increase exploited insns. In this case, it caused verifier failure. To fix this problem, I added an IR pass from bpf target to performance additional simplifycfg with hoisting common inst enabled. Differential Revision: https://reviews.llvm.org/D85434
… when `__STDC_WANT_LIB_EXT1__` was undefined after definition. PP->getMacroInfo() returns nullptr for undefined macro, so we need to check this return value before dereference. Stack dump: ``` #0 0x0000000002185e6a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x2185e6a) #1 0x0000000002183e8c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x2183e8c) #2 0x0000000002183ff3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x2183ff3) #3 0x00007f37df9b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) #4 0x000000000052054e clang::tidy::bugprone::NotNullTerminatedResultCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x52054e) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85523
… when `__STDC_WANT_LIB_EXT1__` is not a literal. If `__STDC_WANT_LIB_EXT1__` is not a literal (e.g. `#define __STDC_WANT_LIB_EXT1__ ((unsigned)1)`) bugprone-not-null-terminated-result check crashes. Stack dump: ``` #0 0x0000000002185e6a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x2185e6a) #1 0x0000000002183e8c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x2183e8c) #2 0x0000000002183ff3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x2183ff3) #3 0x00007f08d91b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) #4 0x00000000021338bb llvm::StringRef::getAsInteger(unsigned int, llvm::APInt&) const (/llvm-project/build/bin/clang-tidy+0x21338bb) #5 0x000000000052051c clang::tidy::bugprone::NotNullTerminatedResultCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x52051c) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85525
When `Target::GetEntryPointAddress()` calls `exe_module->GetObjectFile()->GetEntryPointAddress()`, and the returned `entry_addr` is valid, it can immediately be returned. However, just before that, an `llvm::Error` value has been setup, but in this case it is not consumed before returning, like is done further below in the function. In https://bugs.freebsd.org/248745 we got a bug report for this, where a very simple test case aborts and dumps core: ``` * thread #1, name = 'testcase', stop reason = breakpoint 1.1 frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5 1 int main(int argc, char *argv[]) 2 { -> 3 return 0; 4 } (lldb) p argc Program aborted due to an unhandled Error: Error value was Success. (Note: Success values must still be checked prior to being destroyed). Thread 1 received signal SIGABRT, Aborted. thr_kill () at thr_kill.S:3 3 thr_kill.S: No such file or directory. (gdb) bt #0 thr_kill () at thr_kill.S:3 #1 0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52 #2 0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67 #3 0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112 #4 0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267 #5 0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67 #6 0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114 #7 0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97 #8 0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604 #9 0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347 #10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383 #11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301 #12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331 #13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190 #14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372 #15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414 #16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646 #17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003 #18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762 #19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760 #20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548 #21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903 #22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946 #23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169 #24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675 #25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890``` Fix the incorrect error catch by only instantiating an `Error` object if it is necessary. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D86355
Hi,
I'm not able to observe the performance benefit due to propeller toolchain for the included test program (main.cc, callee.cc). Followed the steps given in Propeller_RFC.pdf.
High level observations:
$ time ./a.out.orig.labels 1000000000 2 >& /dev/null
real 0m21.094s
user 0m20.489s
sys 0m0.604s
$ time ./a.out.labels 1000000000 2 >& /dev/null
real 0m20.357s
user 0m19.908s
sys 0m0.448s
Elapsed time varies from 1 to 5%.
Perf data
$ perf stat -e cycles,instructions,cache-misses,L1-icache-load-misses,br_misp_retired.all_branches,br_inst_retired.all_branches,icache_64b.iftag_stall ./a.out.o
rig.labels 1000000000 1> /dev/null
Performance counter stats for './a.out.orig.labels 1000000000':
243,314,361,618 instructions # 3.03 insn per cycle (83.33%)
22,522 cache-misses (83.33%)
2,644,077 L1-icache-load-misses (83.33%)
20,400,061 br_misp_retired.all_branches (83.33%)
53,442,616,374 br_inst_retired.all_branches (83.34%)
68,554,744 icache_64b.iftag_stall (57.14%)
Optimized binary
$ perf stat -e cycles,instructions,cache-misses,L1-icache-load-misses,br_misp_retired.all_branches,br_inst_retired.all_branches,icache_64b.iftag_stall ./a.out.l
abels 1000000000 1> /dev/null
Performance counter stats for './a.out.labels 1000000000':
243,218,220,681 instructions # 2.99 insn per cycle (83.33%)
14,907 cache-misses (83.34%)
2,533,002 L1-icache-load-misses (83.34%)
20,571,010 br_misp_retired.all_branches (83.34%)
53,455,580,211 br_inst_retired.all_branches (83.33%)
68,847,492 icache_64b.iftag_stall (57.14%)
The referenced paper doesn't mention the benefit for the included test program. What is expected improvement for the included test?
Please see more details (build, runtime steps, etc.) in following gist.
https://gist.github.com/uttampawar/5407f998bc3f02f58c4b83b0b4dc20fe
Any hint is appreciated.
The text was updated successfully, but these errors were encountered: