-
Notifications
You must be signed in to change notification settings - Fork 401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add compressbits and expandbits opcodes #7259
Add compressbits and expandbits opcodes #7259
Conversation
@r30shah fyi |
bcf5ea4
to
eb12ac0
Compare
Instruction selection log excerpts: Powerexpandbits
compressbits
x86expandbits
compressbits
|
eb12ac0
to
da34bc0
Compare
This depends on #7564 to avoid a bug in the x86 codegen |
@zl-wang For some reason I can't request a review from you through the GUI. Can I get your review on the P codegen changes here? |
higher level comments:
|
TR::Register* | ||
OMR::X86::I386::TreeEvaluator::iexpandbitsEvaluator(TR::Node *node, TR::CodeGenerator *cg) | ||
{ | ||
return TR::TreeEvaluator::badILOpEvaluator(node, cg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure these 32-bit opcodes are valid on 32-bit
I'm not sure I understand the question.
There are a couple other potential uses for these operations (hashing [1], varint encode and decode idioms [2]), so adding them as IL opcodes lays the groundwork for accelerating those usecases in the future in a more platform-generic (and hopefully easier) way. [1] #7172 (comment) |
is the sign-extended result the correct expectation? for example, bcompressbits can result in 0xFFFFFFFF? after all, these routines sound like bit-logical operations. on the other hand, hardware instruction result is naturally zero-extended in reality (i.e. if it is defined to be zero-extended, you definitely don't need those |
8259023
to
4f37d83
Compare
4f37d83
to
c0e3e08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the changes for x86 look good. I just have one small comment. I also need to double-check the uses of decNodeReferenceCounts
versus decReferenceCount
.
c0e3e08
to
ed3d9f5
Compare
0b8e73e
to
6bf9b44
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
6bf9b44
to
7151061
Compare
7151061
to
b9b1f48
Compare
Jenkins build all |
b9b1f48
to
42ec290
Compare
Jenkins build all |
The failure on x86 is an instruction encoding failure
Odd that |
You have to force VEX encoding in your test. Otherwise it tries to use legacy encoding on hardware that doesn't support FMA for which that instruction does not exist in legacy mode. |
Signed-off-by: Spencer Comin <[email protected]>
Signed-off-by: Spencer Comin <[email protected]>
This commit adds opcodes for bit compress compress and expand operations (alternatively referred to as gather/scatter or extract/deposit) for byte, short, int, and long datatypes. Codegen support for Power and x86-64 is implemented with the pdepd/pextd and pdep/pext instructions, respectively. Signed-off-by: Spencer Comin <[email protected]>
42ec290
to
e378869
Compare
Jenkins build all |
lcompressbits
,icompressbits
,scompressbits
,bcompressbits
lexpandbits
,iexpandbits
,sexpandbits
,bexpandbits
See #7172