Add compressbits and expandbits opcodes #7259

Merged
merged 4 commits into from
Mar 6, 2025


@Spencer-Comin Spencer-Comin commented Feb 8, 2024

  • Adds the following opcodes:
    • lcompressbits, icompressbits, scompressbits, bcompressbits
    • lexpandbits, iexpandbits, sexpandbits, bexpandbits
  • Adds evaluators for Power and x86
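
For readers unfamiliar with these operations, here is a minimal software sketch of what the compress and expand opcodes are expected to compute. It assumes the first child is the source value and the second child is the mask, which is how the pext/pdep operands appear in the logs on this PR. The names `compressBits`, `expandBits`, and `checkCompressExpand` are hypothetical and not part of OMR; the real evaluators emit a single hardware instruction rather than looping.

```cpp
#include <cassert>
#include <cstdint>

// compress (gather / extract): collect the bits of `value` selected by `mask`
// and pack them contiguously into the low-order bits of the result.
constexpr std::uint64_t compressBits(std::uint64_t value, std::uint64_t mask)
{
   std::uint64_t result = 0;
   // `out` walks the result bits; each iteration consumes the lowest set mask bit.
   for (std::uint64_t out = 1; mask != 0; mask &= mask - 1, out <<= 1)
   {
      std::uint64_t lowest = mask & (~mask + 1); // isolate lowest set bit of mask
      if (value & lowest)
         result |= out;
   }
   return result;
}

// expand (scatter / deposit): take the low-order bits of `value` and place them
// at the positions of the set bits in `mask`; all other result bits are zero.
constexpr std::uint64_t expandBits(std::uint64_t value, std::uint64_t mask)
{
   std::uint64_t result = 0;
   for (std::uint64_t in = 1; mask != 0; mask &= mask - 1, in <<= 1)
   {
      std::uint64_t lowest = mask & (~mask + 1);
      if (value & in)
         result |= lowest;
   }
   return result;
}

// Small usage check: the upper nibble of 0xB6 is gathered down, then scattered back.
inline void checkCompressExpand()
{
   assert(compressBits(0xB6, 0xF0) == 0x0B);
   assert(expandBits(0x0B, 0xF0) == 0xB0);
}
```

Expanding a compressed value with the same mask recovers `value & mask`, which is a convenient property to test against.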

See #7172

@Spencer-Comin

@r30shah fyi

@Spencer-Comin

Instruction selection log excerpts:

Power

expandbits

------------------------------
 n3n      (  0)  ireturn                                                                              [      0xab8d6f3930] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    b2i (in GPR_0017)                                                                  [      0xab8d6f3970] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=11920 nc=1
 n5n      (  0)      bexpandbits (in GPR_0017)                                                        [      0xab8d6f39b0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=11920 nc=2
 n6n      (  0)        bload  Parm  0<parm 0 B>[#389  Parm] [flags 0xc0000101 0x0 ]                   [      0xab8d6f39f0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=0
 n7n      (  0)        bload  Parm  1<parm 1 B>[#390  Parm] [flags 0xc0000101 0x0 ] (in GPR_0016)     [      0xab8d6f3a30] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=11488 nc=0
------------------------------

 [      0xab8d772e00]   0       lbz     GPR_0016, [gr1, 8]              # SymRef  Parm  1<parm 1 B>[#390  Parm] [flags 0xc0000101 0x0 ]
 [      0xab8d773060]   0       lbz     GPR_0018, [gr1, 0]              # SymRef  Parm  0<parm 0 B>[#389  Parm] [flags 0xc0000101 0x0 ]
 [      0xab8d7730f0]   0       pdepd   GPR_0017, GPR_0018, GPR_0016
 [      0xab8d7731d0]   0       retn
 [      0xab8d773250]   0       blr
------------------------------
 n3n      (  0)  ireturn                                                                              [     0x8441b195000] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    s2i (in GPR_0016)                                                                  [     0x8441b195040] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=17328 nc=1
 n5n      (  0)      sexpandbits (in GPR_0018)                                                        [     0x8441b195080] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=17936 nc=2
 n6n      (  0)        sload  Parm  0<parm 0 C>[#389  Parm] [flags 0xc0000102 0x0 ]                   [     0x8441b1950c0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=0
 n7n      (  0)        sload  Parm  1<parm 1 C>[#390  Parm] [flags 0xc0000102 0x0 ] (in GPR_0017)     [     0x8441b195100] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=17504 nc=0
------------------------------

 [     0x8441b214580]   0       lha     GPR_0017, [gr1, 8]              # SymRef  Parm  1<parm 1 C>[#390  Parm] [flags 0xc0000102 0x0 ]
 [     0x8441b2147e0]   0       lhz     GPR_0019, [gr1, 0]              # SymRef  Parm  0<parm 0 C>[#389  Parm] [flags 0xc0000102 0x0 ]
 [     0x8441b214870]   0       pdepd   GPR_0018, GPR_0019, GPR_0017
 [     0x8441b214900]   0       extsh   GPR_0016, GPR_0018
 [     0x8441b2149d0]   0       retn
 [     0x8441b214a50]   0       blr
------------------------------
 n3n      (  0)  ireturn                                                                              [     0x4e19246f050] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    iexpandbits (in GPR_0017)                                                          [     0x4e19246f090] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=38496 nc=2
 n5n      (  0)      iload  Parm  0<parm 0 I>[#389  Parm] [flags 0xc0000103 0x0 ]                     [     0x4e19246f0d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=0
 n6n      (  0)      iload  Parm  1<parm 1 I>[#390  Parm] [flags 0xc0000103 0x0 ] (in GPR_0016)       [     0x4e19246f110] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=38064 nc=0
------------------------------

 [     0x4e1925495d0]   0       lwz     GPR_0016, [gr1, 8]              # SymRef  Parm  1<parm 1 I>[#390  Parm] [flags 0xc0000103 0x0 ]
 [     0x4e192549830]   0       lwz     GPR_0018, [gr1, 0]              # SymRef  Parm  0<parm 0 I>[#389  Parm] [flags 0xc0000103 0x0 ]
 [     0x4e1925498c0]   0       pdepd   GPR_0017, GPR_0018, GPR_0016
 [     0x4e1925499a0]   0       retn
 [     0x4e192549a20]   0       blr
------------------------------
 n3n      (  0)  lreturn                                                                              [     0x605f1cb43d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    lexpandbits (in GPR_0017)                                                          [     0x605f1cb4410] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=22304 nc=2
 n5n      (  0)      lload  Parm  0<parm 0 J>[#389  Parm] [flags 0xc0000104 0x0 ] (in GPR_0018)       [     0x605f1cb4450] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=22480 nc=0
 n6n      (  0)      lload  Parm  1<parm 1 J>[#390  Parm] [flags 0xc0000104 0x0 ] (in GPR_0016)       [     0x605f1cb4490] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=21872 nc=0
------------------------------

 [     0x605f1d25690]   0       ld      GPR_0016, [gr1, 8]              # SymRef  Parm  1<parm 1 J>[#390  Parm] [flags 0xc0000104 0x0 ]
 [     0x605f1d258f0]   0       ld      GPR_0018, [gr1, 0]              # SymRef  Parm  0<parm 0 J>[#389  Parm] [flags 0xc0000104 0x0 ]
 [     0x605f1d25980]   0       pdepd   GPR_0017, GPR_0018, GPR_0016
 [     0x605f1d25a60]   0       retn

compressbits

------------------------------
 n3n      (  0)  ireturn                                                                              [     0x72202073310] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    b2i (in GPR_0017)                                                                  [     0x72202073350] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=18384 nc=1
 n5n      (  0)      bcompressbits (in GPR_0017)                                                      [     0x72202073390] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=18384 nc=2
 n6n      (  0)        bload  Parm  0<parm 0 B>[#389  Parm] [flags 0xc0000101 0x0 ] (in GPR_0016)     [     0x722020733d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=17952 nc=0
 n7n      (  0)        bload  Parm  1<parm 1 B>[#390  Parm] [flags 0xc0000101 0x0 ]                   [     0x72202073410] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=0
------------------------------

 [     0x722020e4740]   0       lbz     GPR_0016, [gr1, 0]              # SymRef  Parm  0<parm 0 B>[#389  Parm] [flags 0xc0000101 0x0 ]
 [     0x722020e49a0]   0       lbz     GPR_0018, [gr1, 8]              # SymRef  Parm  1<parm 1 B>[#390  Parm] [flags 0xc0000101 0x0 ]
 [     0x722020e4a30]   0       pextd   GPR_0017, GPR_0016, GPR_0018
 [     0x722020e4b10]   0       retn
 [     0x722020e4b90]   0       blr
------------------------------
 n3n      (  0)  ireturn                                                                              [     0xb90c1793310] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    s2i (in GPR_0016)                                                                  [     0xb90c1793350] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=17952 nc=1
 n5n      (  0)      scompressbits (in GPR_0018)                                                      [     0xb90c1793390] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=18560 nc=2
 n6n      (  0)        sload  Parm  0<parm 0 C>[#389  Parm] [flags 0xc0000102 0x0 ] (in GPR_0017)     [     0xb90c17933d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=18128 nc=0
 n7n      (  0)        sload  Parm  1<parm 1 C>[#390  Parm] [flags 0xc0000102 0x0 ]                   [     0xb90c1793410] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=0
------------------------------

 [     0xb90c18047f0]   0       lha     GPR_0017, [gr1, 0]              # SymRef  Parm  0<parm 0 C>[#389  Parm] [flags 0xc0000102 0x0 ]
 [     0xb90c1804a50]   0       lhz     GPR_0019, [gr1, 8]              # SymRef  Parm  1<parm 1 C>[#390  Parm] [flags 0xc0000102 0x0 ]
 [     0xb90c1804ae0]   0       pextd   GPR_0018, GPR_0017, GPR_0019
 [     0xb90c1804b70]   0       extsh   GPR_0016, GPR_0018
 [     0xb90c1804c40]   0       retn
 [     0xb90c1804cc0]   0       blr
------------------------------
 n3n      (  0)  ireturn                                                                              [     0x2927bd51d60] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    icompressbits (in GPR_0017)                                                        [     0x2927bd51da0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=50032 nc=2
 n5n      (  0)      iload  Parm  0<parm 0 I>[#389  Parm] [flags 0xc0000103 0x0 ] (in GPR_0016)       [     0x2927bd51de0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=49600 nc=0
 n6n      (  0)      iload  Parm  1<parm 1 I>[#390  Parm] [flags 0xc0000103 0x0 ]                     [     0x2927bd51e20] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=0
------------------------------

 [     0x2927be2c2e0]   0       lwz     GPR_0016, [gr1, 0]              # SymRef  Parm  0<parm 0 I>[#389  Parm] [flags 0xc0000103 0x0 ]
 [     0x2927be2c540]   0       lwz     GPR_0018, [gr1, 8]              # SymRef  Parm  1<parm 1 I>[#390  Parm] [flags 0xc0000103 0x0 ]
 [     0x2927be2c5d0]   0       pextd   GPR_0017, GPR_0016, GPR_0018
 [     0x2927be2c6b0]   0       retn
 [     0x2927be2c730]   0       blr
------------------------------
 n3n      (  0)  lreturn                                                                              [     0xd7490497440] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    lcompressbits (in GPR_0017)                                                        [     0xd7490497480] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=34704 nc=2
 n5n      (  0)      lload  Parm  0<parm 0 J>[#389  Parm] [flags 0xc0000104 0x0 ] (in GPR_0016)       [     0xd74904974c0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=34272 nc=0
 n6n      (  0)      lload  Parm  1<parm 1 J>[#390  Parm] [flags 0xc0000104 0x0 ] (in GPR_0018)       [     0xd7490497500] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=34880 nc=0
------------------------------

 [     0xd7490508700]   0       ld      GPR_0016, [gr1, 0]              # SymRef  Parm  0<parm 0 J>[#389  Parm] [flags 0xc0000104 0x0 ]
 [     0xd7490508960]   0       ld      GPR_0018, [gr1, 8]              # SymRef  Parm  1<parm 1 J>[#390  Parm] [flags 0xc0000104 0x0 ]
 [     0xd74905089f0]   0       pextd   GPR_0017, GPR_0016, GPR_0018
 [     0xd7490508ad0]   0       retn
 [     0xd7490508b50]   0       blr

x86

expandbits

------------------------------
 n3n      (  0)  ireturn                                                                              [0x55bfd9bcb390] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    b2i (in GPR_0017)                                                                  [0x55bfd9bcb3d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=51760 nc=1
 n5n      (  0)      bexpandbits (in GPR_0017)                                                        [0x55bfd9bcb410] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=51760 nc=2
 n6n      (  0)        bload  Parm  0<parm 0 B>[#407  Parm] [flags 0xc0000101 0x0 ] (in GPR_0016)     [0x55bfd9bcb450] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=51296 nc=0
 n7n      (  0)        bload  Parm  1<parm 1 B>[#408  Parm] [flags 0xc0000101 0x0 ] (in GPR_0017)     [0x55bfd9bcb490] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=51760 nc=0
------------------------------

 [0x55bfd9c3c920]       mov     GPR_0016, byte ptr [vfp]                # L1RegMem, SymRef  Parm  0<parm 0 B>[#409  Parm] [flags 0xc0000101 0x0 ]
 [0x55bfd9c3caf0]       mov     GPR_0017, byte ptr [vfp+0x8]            # L1RegMem, SymRef  Parm  1<parm 1 B>[#410  Parm] [flags 0xc0000101 0x0 ]
 [0x55bfd9c3cb70]       movzx   GPR_0017, GPR_0017              # MOVZXReg4Reg1
 [0x55bfd9c3cbf0]       pdep    GPR_0017, GPR_0016, GPR_0017            # PDEP4RegRegReg
 [0x55bfd9c3cc70]       movsx   GPR_0017, GPR_0017              # MOVSXReg4Reg1
 [0x55bfd9c3cf70]       assocreg                        # assocreg
        POST: None
 [0x55bfd9c3cd40]       ret                             # RET
         PRE: [GPR_0017 : eax]
------------------------------
 n3n      (  0)  ireturn                                                                              [0x55eefe6e8390] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    s2i (in GPR_0017)                                                                  [0x55eefe6e83d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39472 nc=1
 n5n      (  0)      sexpandbits (in GPR_0017)                                                        [0x55eefe6e8410] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39472 nc=2
 n6n      (  0)        sload  Parm  0<parm 0 C>[#407  Parm] [flags 0xc0000102 0x0 ] (in GPR_0016)     [0x55eefe6e8450] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39008 nc=0
 n7n      (  0)        sload  Parm  1<parm 1 C>[#408  Parm] [flags 0xc0000102 0x0 ] (in GPR_0017)     [0x55eefe6e8490] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39472 nc=0
------------------------------

 [0x55eefe759920]       mov     GPR_0016, word ptr [vfp]                # L2RegMem, SymRef  Parm  0<parm 0 C>[#409  Parm] [flags 0xc0000102 0x0 ]
 [0x55eefe759af0]       mov     GPR_0017, word ptr [vfp+0x8]            # L2RegMem, SymRef  Parm  1<parm 1 C>[#410  Parm] [flags 0xc0000102 0x0 ]
 [0x55eefe759b70]       movzx   GPR_0017, GPR_0017              # MOVZXReg4Reg2
 [0x55eefe759bf0]       pdep    GPR_0017, GPR_0016, GPR_0017            # PDEP4RegRegReg
 [0x55eefe759c70]       movsx   GPR_0017, GPR_0017              # MOVSXReg4Reg2
 [0x55eefe759f70]       assocreg                        # assocreg
        POST: None
 [0x55eefe759d40]       ret                             # RET
         PRE: [GPR_0017 : eax]
------------------------------
 n3n      (  0)  ireturn                                                                              [0x55d8aab60910] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    iexpandbits (in GPR_0017)                                                          [0x55d8aab60950] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=7536 nc=2
 n5n      (  0)      iload  Parm  0<parm 0 I>[#407  Parm] [flags 0xc0000103 0x0 ] (in GPR_0016)       [0x55d8aab60990] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=7072 nc=0
 n6n      (  0)      iload  Parm  1<parm 1 I>[#408  Parm] [flags 0xc0000103 0x0 ] (in GPR_0017)       [0x55d8aab609d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=7536 nc=0
------------------------------

 [0x55d8aabb1c60]       mov     GPR_0016, dword ptr [vfp]               # L4RegMem, SymRef  Parm  0<parm 0 I>[#409  Parm] [flags 0xc0000103 0x0 ]
 [0x55d8aabb1e30]       mov     GPR_0017, dword ptr [vfp+0x8]           # L4RegMem, SymRef  Parm  1<parm 1 I>[#410  Parm] [flags 0xc0000103 0x0 ]
 [0x55d8aabb1eb0]       pdep    GPR_0017, GPR_0016, GPR_0017            # PDEP4RegRegReg
 [0x55d8aabb21b0]       assocreg                        # assocreg
        POST: None
 [0x55d8aabb1f80]       ret                             # RET
         PRE: [GPR_0017 : eax]
------------------------------
 n3n      (  0)  lreturn                                                                              [0x55da76332750] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    lexpandbits (in GPR_0017)                                                          [0x55da76332790] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=15280 nc=2
 n5n      (  0)      lload  Parm  0<parm 0 J>[#407  Parm] [flags 0xc0000104 0x0 ] (in GPR_0016)       [0x55da763327d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=14816 nc=0
 n6n      (  0)      lload  Parm  1<parm 1 J>[#408  Parm] [flags 0xc0000104 0x0 ] (in GPR_0017)       [0x55da76332810] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=15280 nc=0
------------------------------

 [0x55da76383aa0]       mov     GPR_0016, qword ptr [vfp]               # L8RegMem, SymRef  Parm  0<parm 0 J>[#409  Parm] [flags 0xc0000104 0x0 ]
 [0x55da76383c70]       mov     GPR_0017, qword ptr [vfp+0x8]           # L8RegMem, SymRef  Parm  1<parm 1 J>[#410  Parm] [flags 0xc0000104 0x0 ]
 [0x55da76383cf0]       pdep    GPR_0017, GPR_0016, GPR_0017            # PDEP8RegRegReg
 [0x55da76383ff0]       assocreg                        # assocreg
        POST: None
 [0x55da76383dc0]       ret                             # RET
         PRE: [GPR_0017 : eax]

compressbits

------------------------------
 n3n      (  0)  ireturn                                                                              [0x558c59f18580] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    b2i (in GPR_0016)                                                                  [0x558c59f185c0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39504 nc=1
 n5n      (  0)      bcompressbits (in GPR_0016)                                                      [0x558c59f18600] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39504 nc=2
 n6n      (  0)        bload  Parm  0<parm 0 B>[#407  Parm] [flags 0xc0000101 0x0 ] (in GPR_0016)     [0x558c59f18640] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39504 nc=0
 n7n      (  0)        bload  Parm  1<parm 1 B>[#408  Parm] [flags 0xc0000101 0x0 ] (in GPR_0017)     [0x558c59f18680] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=39968 nc=0
------------------------------

 [0x558c59f89b10]       mov     GPR_0016, byte ptr [vfp]                # L1RegMem, SymRef  Parm  0<parm 0 B>[#409  Parm] [flags 0xc0000101 0x0 ]
 [0x558c59f89ce0]       mov     GPR_0017, byte ptr [vfp+0x8]            # L1RegMem, SymRef  Parm  1<parm 1 B>[#410  Parm] [flags 0xc0000101 0x0 ]
 [0x558c59f89d60]       movzx   GPR_0016, GPR_0016              # MOVZXReg4Reg1
 [0x558c59f89de0]       pext    GPR_0016, GPR_0016, GPR_0017            # PEXT4RegRegReg
 [0x558c59f89e60]       movsx   GPR_0016, GPR_0016              # MOVSXReg4Reg1
 [0x558c59f8a160]       assocreg                        # assocreg
        POST: None
 [0x558c59f89f30]       ret                             # RET
         PRE: [GPR_0016 : eax]
------------------------------
 n3n      (  0)  ireturn                                                                              [0x556122e39580] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    s2i (in GPR_0016)                                                                  [0x556122e395c0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=43600 nc=1
 n5n      (  0)      scompressbits (in GPR_0016)                                                      [0x556122e39600] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=43600 nc=2
 n6n      (  0)        sload  Parm  0<parm 0 C>[#407  Parm] [flags 0xc0000102 0x0 ] (in GPR_0016)     [0x556122e39640] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=43600 nc=0
 n7n      (  0)        sload  Parm  1<parm 1 C>[#408  Parm] [flags 0xc0000102 0x0 ] (in GPR_0017)     [0x556122e39680] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=44064 nc=0
------------------------------

 [0x556122eaab10]       mov     GPR_0016, word ptr [vfp]                # L2RegMem, SymRef  Parm  0<parm 0 C>[#409  Parm] [flags 0xc0000102 0x0 ]
 [0x556122eaace0]       mov     GPR_0017, word ptr [vfp+0x8]            # L2RegMem, SymRef  Parm  1<parm 1 C>[#410  Parm] [flags 0xc0000102 0x0 ]
 [0x556122eaad60]       movzx   GPR_0016, GPR_0016              # MOVZXReg4Reg2
 [0x556122eaade0]       pext    GPR_0016, GPR_0016, GPR_0017            # PEXT4RegRegReg
 [0x556122eaae60]       movsx   GPR_0016, GPR_0016              # MOVSXReg4Reg2
 [0x556122eab160]       assocreg                        # assocreg
        POST: None
 [0x556122eaaf30]       ret                             # RET
         PRE: [GPR_0016 : eax]
------------------------------
 n3n      (  0)  ireturn                                                                              [0x563da6491c10] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    icompressbits (in GPR_0016)                                                        [0x563da6491c50] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=11936 nc=2
 n5n      (  0)      iload  Parm  0<parm 0 I>[#407  Parm] [flags 0xc0000103 0x0 ] (in GPR_0016)       [0x563da6491c90] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=11936 nc=0
 n6n      (  0)      iload  Parm  1<parm 1 I>[#408  Parm] [flags 0xc0000103 0x0 ] (in GPR_0017)       [0x563da6491cd0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=12400 nc=0
------------------------------

 [0x563da64e2f60]       mov     GPR_0016, dword ptr [vfp]               # L4RegMem, SymRef  Parm  0<parm 0 I>[#409  Parm] [flags 0xc0000103 0x0 ]
 [0x563da64e3130]       mov     GPR_0017, dword ptr [vfp+0x8]           # L4RegMem, SymRef  Parm  1<parm 1 I>[#410  Parm] [flags 0xc0000103 0x0 ]
 [0x563da64e31b0]       pext    GPR_0016, GPR_0016, GPR_0017            # PEXT4RegRegReg
 [0x563da64e34b0]       assocreg                        # assocreg
        POST: None
 [0x563da64e3280]       ret                             # RET
         PRE: [GPR_0016 : eax]
------------------------------
 n3n      (  0)  lreturn                                                                              [0x5590368bb850] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=- nc=1
 n4n      (  0)    lcompressbits (in GPR_0016)                                                        [0x5590368bb890] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=38032 nc=2
 n5n      (  0)      lload  Parm  0<parm 0 J>[#407  Parm] [flags 0xc0000104 0x0 ] (in GPR_0016)       [0x5590368bb8d0] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=38032 nc=0
 n6n      (  0)      lload  Parm  1<parm 1 J>[#408  Parm] [flags 0xc0000104 0x0 ] (in GPR_0017)       [0x5590368bb910] bci=[-1,0,-] rc=0 vc=35 vn=- li=2 udi=38496 nc=0
------------------------------

 [0x559036919550]       mov     GPR_0016, qword ptr [vfp]               # L8RegMem, SymRef  Parm  0<parm 0 J>[#409  Parm] [flags 0xc0000104 0x0 ]
 [0x559036919720]       mov     GPR_0017, qword ptr [vfp+0x8]           # L8RegMem, SymRef  Parm  1<parm 1 J>[#410  Parm] [flags 0xc0000104 0x0 ]
 [0x5590369197a0]       pext    GPR_0016, GPR_0016, GPR_0017            # PEXT8RegRegReg
 [0x559036919aa0]       assocreg                        # assocreg
        POST: None
 [0x559036919870]       ret                             # RET
         PRE: [GPR_0016 : eax]

@Spencer-Comin

This depends on #7564 to avoid a bug in the x86 codegen

@Spencer-Comin Spencer-Comin marked this pull request as ready for review November 28, 2024 14:53
@Spencer-Comin

@zl-wang For some reason I can't request a review from you through the GUI. Can I get your review on the P codegen changes here?

@zl-wang

zl-wang commented Jan 7, 2025

higher level comments:

  1. since we are dealing with individual bits in these routines, semantically b2i/s2i (implying sign-extension), instead of bu2i/su2i (implying zero-extension, more likely resulting in a nop) are right?
  2. what is the advantage of adding these one-off IL opcode vs. recognizing the specific methods?

TR::Register*
OMR::X86::I386::TreeEvaluator::iexpandbitsEvaluator(TR::Node *node, TR::CodeGenerator *cg)
{
return TR::TreeEvaluator::badILOpEvaluator(node, cg);

I am pretty sure these 32-bit opcodes are valid on 32-bit

@Spencer-Comin

@zl-wang

  1. since we are dealing with individual bits in these routines, semantically b2i/s2i (implying sign-extension), instead of bu2i/su2i (implying zero-extension, more likely resulting in a nop) are right?

I'm not sure I understand the question.

  2. what is the advantage of adding these one-off IL opcode vs. recognizing the specific methods?

There are a couple of other potential uses for these operations (hashing [1], varint encode and decode idioms [2]), so adding them as IL opcodes lays the groundwork for accelerating those use cases in the future in a more platform-generic (and hopefully easier) way.

[1] #7172 (comment)
[2] https://protobuf.dev/programming-guides/encoding/#varints
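
As an illustration of the varint decode idiom mentioned above, here is a rough sketch of how a compress operation can strip the continuation bits from a protobuf-style varint in one step. This is not code from the PR. `decodeVarint` and `compressBits` are hypothetical names, the compress is modeled in software, and the sketch only handles varints of up to 8 bytes (56 payload bits).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Software model of compress: gather the bits of `value` selected by `mask`
// into the low-order bits of the result.
constexpr std::uint64_t compressBits(std::uint64_t value, std::uint64_t mask)
{
   std::uint64_t result = 0;
   for (std::uint64_t out = 1; mask != 0; mask &= mask - 1, out <<= 1)
   {
      std::uint64_t lowest = mask & (~mask + 1); // isolate lowest set bit of mask
      if (value & lowest)
         result |= out;
   }
   return result;
}

// Decode a varint that occupies at most 8 bytes. Each byte carries 7 payload
// bits (least significant group first) plus a continuation flag in bit 7.
inline std::uint64_t decodeVarint(const std::uint8_t *bytes, std::size_t available)
{
   assert(bytes != nullptr);
   std::uint64_t word = 0;
   std::size_t length = 0;
   // Assemble the encoded bytes into a little-endian word, stopping after the
   // first byte whose continuation flag is clear.
   while (length < available && length < 8)
   {
      std::uint8_t b = bytes[length];
      word |= static_cast<std::uint64_t>(b) << (8 * length);
      ++length;
      if ((b & 0x80) == 0)
         break;
   }
   // One compress with a 0x7F-per-byte mask drops every continuation flag and
   // packs the 7-bit groups together.
   return compressBits(word, 0x7F7F7F7F7F7F7F7FULL);
}
```

For example, the two-byte encoding `0x96 0x01` decodes to 150, matching the worked example in the protobuf encoding guide.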

@zl-wang

zl-wang commented Jan 7, 2025

is the sign-extended result the correct expectation? for example, bcompressbits can result in 0xFFFFFFFF? after all, these routines sound like bit-logical operations. on the other hand, hardware instruction result is naturally zero-extended in reality (i.e. if it is defined to be zero-extended, you definitely don't need those exts* instructions for better performance).
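
The distinction being asked about can be made concrete with a small sketch. The helpers below are hypothetical C++ models named after the IL conversions, not OMR code: `b2i` sign-extends a byte to 32 bits and `bu2i` zero-extends it. With an all-ones byte and an all-ones mask, the byte-wide compress result is 0xFF, which becomes 0xFFFFFFFF under `b2i` but stays 0x000000FF under `bu2i`.

```cpp
#include <cassert>
#include <cstdint>

// Byte-wide software model of compress: gather the bits of `value` selected
// by `mask` into the low-order bits of the result.
constexpr std::uint8_t compressByte(std::uint8_t value, std::uint8_t mask)
{
   unsigned result = 0;
   unsigned out = 1;
   for (unsigned bit = 1; bit <= 0x80u; bit <<= 1)
   {
      if (mask & bit)
      {
         if (value & bit)
            result |= out;
         out <<= 1;
      }
   }
   return static_cast<std::uint8_t>(result);
}

// Model of b2i: interpret the byte as signed and widen (sign-extension).
constexpr std::int32_t b2i(std::uint8_t b)
{
   return (b & 0x80) ? static_cast<std::int32_t>(b) - 256 : static_cast<std::int32_t>(b);
}

// Model of bu2i: interpret the byte as unsigned and widen (zero-extension).
constexpr std::int32_t bu2i(std::uint8_t b)
{
   return static_cast<std::int32_t>(b);
}

inline void checkExtensionChoice()
{
   std::uint8_t packed = compressByte(0xFF, 0xFF); // all eight bits selected
   assert(b2i(packed) == -1);   // 0xFFFFFFFF as a 32-bit pattern
   assert(bu2i(packed) == 255); // 0x000000FF
}
```

The two conversions only differ when the top bit of the narrow result is set, which for compress requires a mask with every bit of the narrow type selected.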

@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from 8259023 to 4f37d83 Compare February 20, 2025 20:56
@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from 4f37d83 to c0e3e08 Compare February 21, 2025 15:21

@hzongaro hzongaro left a comment

I think the changes for x86 look good. I just have one small comment. I also need to double-check the uses of decNodeReferenceCounts versus decReferenceCount.

@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from c0e3e08 to ed3d9f5 Compare February 21, 2025 22:33
@hzongaro hzongaro self-assigned this Feb 28, 2025
@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch 2 times, most recently from 0b8e73e to 6bf9b44 Compare March 3, 2025 14:07

@r30shah r30shah left a comment

LGTM

@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from 6bf9b44 to 7151061 Compare March 5, 2025 14:15
@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from 7151061 to b9b1f48 Compare March 5, 2025 19:57
@hzongaro

hzongaro commented Mar 5, 2025

Jenkins build all

@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from b9b1f48 to 42ec290 Compare March 5, 2025 20:50
@hzongaro

hzongaro commented Mar 5, 2025

Jenkins build all

@Spencer-Comin

The failure on x86 is an instruction encoding failure

16:11:07  51: [----------] 16 tests from RegRegRegParallelBitOpsEncTest/XRegRegRegEncEncodingTest
16:11:07  51: /home/jenkins/workspace/Build/fvtest/compilerunittest/x/BinaryEncoder.cpp:567: Failure
16:11:07  51:       Expected: std::get<5>(GetParam())
16:11:07  51:       Which is: [ c4 e2 73 f5 c3 ]
16:11:07  51: To be equal to: encodeInstruction(instr)
16:11:07  51:       Which is: [ f2 0f 30 f5 c3 ]
16:11:07  51: [  FAILED  ] RegRegRegParallelBitOpsEncTest/XRegRegRegEncEncodingTest.encode/0, where GetParam() = (707, 1, 3, 2, 2, [ c4 e2 73 f5 c3 ]) (0 ms)

Odd that pdep is getting encoded wrong here, but all my testing passed... I'm looking into this

@BradleyWood

BradleyWood commented Mar 5, 2025

You have to force VEX encoding in your test. Otherwise, on hardware that doesn't support FMA, it tries to use the legacy encoding, and this instruction does not exist in legacy mode.
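
To make the encoding point concrete: the expected bytes in the failing test, `c4 e2 73 f5 c3`, begin with a three-byte VEX prefix. The sketch below decodes that prefix using the field layout from the Intel SDM as I understand it. The `Vex3Fields` struct and the `decodeVex3` helper are hypothetical and are not part of the OMR binary encoder or its unit tests.

```cpp
#include <cassert>
#include <cstdint>

// Fields of a three-byte VEX prefix (escape byte 0xC4 plus two payload bytes).
struct Vex3Fields
{
   bool isVex3;        // first byte is the 0xC4 escape
   unsigned opcodeMap; // mmmmm: 1 = 0F, 2 = 0F 38, 3 = 0F 3A
   bool w;             // VEX.W: selects the 64-bit operand size for pdep/pext
   unsigned vvvv;      // additional source register, stored inverted in the prefix
   bool l;             // VEX.L: must be 0 for pdep/pext
   unsigned pp;        // implied legacy prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2
};

inline Vex3Fields decodeVex3(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2)
{
   Vex3Fields f;
   f.isVex3 = (b0 == 0xC4);
   f.opcodeMap = b1 & 0x1Fu;
   f.w = (b2 & 0x80u) != 0;
   f.vvvv = ((b2 >> 3) & 0xFu) ^ 0xFu; // undo the one's-complement storage
   f.l = (b2 & 0x04u) != 0;
   f.pp = b2 & 0x03u;
   return f;
}

// Decode the prefix of the expected pdep encoding from the test log.
inline void checkExpectedPdepPrefix()
{
   Vex3Fields f = decodeVex3(0xC4, 0xE2, 0x73);
   // Map 0F 38, implied F2 prefix, W = 0 (32-bit form), second operand in register 1.
   assert(f.isVex3 && f.opcodeMap == 2 && !f.w && f.vvvv == 1 && !f.l && f.pp == 3);
}
```

The remaining two bytes are the opcode (`f5`) and the ModRM byte (`c3`). The F2 prefix and the 0F 38 map live inside the VEX prefix rather than being emitted as separate bytes, which is why a non-VEX encoder cannot produce this sequence.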

BradleyWood and others added 3 commits March 6, 2025 09:55
This commit adds opcodes for bit compress and expand operations
(alternatively referred to as gather/scatter or extract/deposit) for byte,
short, int, and long datatypes. Codegen support for Power and x86-64 is
implemented with the pdepd/pextd and pdep/pext instructions, respectively.

Signed-off-by: Spencer Comin <[email protected]>
@Spencer-Comin Spencer-Comin force-pushed the compress-expand-opcodes branch from 42ec290 to e378869 Compare March 6, 2025 14:55
@hzongaro

hzongaro commented Mar 6, 2025

Jenkins build all

@hzongaro hzongaro merged commit d070be1 into eclipse-omr:master Mar 6, 2025
13 checks passed