-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
riscv64/x390: add *_overflow #9214
base: main
Are you sure you want to change the base?
Conversation
f504e73
to
b4c2cc0
Compare
b4c2cc0
to
242d45d
Compare
@bjorn3 can you take a look? also, I haven't found umul equivalent (or mul with zext*) in s390x. do you know what are their names? |
Hi @ghostway0, a few comments on the s390x part:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋 Hey,
I don't know if this is ready for review yet, but It's a great start!
A few comments for the RISC-V part. I didn't check the lowerings in a lot of detail, mostly just spotting a few things that could be shorter.
Thanks for working on this!
Edit: I also ran the fuzzer (with these changes) and it pointed out this testcase.
Fuzzer testcase
Testcase:
test interpret
test run
target riscv64gc
target x86_64
function %a(i8) -> i8 {
block0(v0: i8):
v1, v2 = smul_overflow v0, v0
return v2
}
; run: %a(-15) == 1
Result:
ERROR cranelift_filetests::concurrent > FAIL: run
FAIL ./test.clif: run
Caused by:
Failed test: run: %a(-15) == 1, actual: 0
1 tests
Error: 1 failure
(let ((hi XReg (rv_mulhu x y)) | ||
(res XReg (rv_mul x y)) | ||
(one XReg (imm $I8 1)) | ||
(of XReg (gen_select_xreg (cmp_eqz hi) (zero_reg) one))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to use rv_snez
here instead of a select between one and zero.
(res XReg (rv_mul tmp_x tmp_y)) | ||
(hi XReg (rv_srli res (imm12_const (ty_bits ty)))) | ||
(one XReg (imm $I8 1)) | ||
(of XReg (gen_select_xreg (cmp_eqz hi) (zero_reg) one))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here
(rule 1 (lower (has_type $I64 (uadd_overflow x y))) | ||
(let ((sum XReg (rv_add x y)) | ||
(one XReg (imm $I8 1)) | ||
(of XReg (gen_select_xreg (int_compare (IntCC.UnsignedLessThan) sum x) one (zero_reg)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to use rv_sltu
here instead of a select between one and zero. The RISC-V comparision functions already return a zero or one, and they are a lot shorter than our current implementation of select_xreg
(high_tmp XReg (rv_add (value_regs_get x 1) (value_regs_get y 1))) | ||
;; add carry. | ||
(high XReg (rv_add high_tmp carry)) | ||
(of XReg (gen_select_xreg (int_compare (IntCC.UnsignedLessThan) high carry) one (zero_reg)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here for sltu
.
(res XReg (rv_mul tmp_x tmp_y)) | ||
(hi XReg (rv_srai res (imm12_const (ty_bits ty)))) | ||
(one XReg (imm $I8 1)) | ||
(of XReg (gen_select_xreg (cmp_eqz hi) (zero_reg) one))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be a snez
;; madd dst_hi, x_hi, y_lo, dst_hi | ||
;; madd dst_lo, x_lo, y_lo, zero | ||
(dst_hi1 XReg (rv_mulhu x_lo y_lo)) | ||
(one XReg (imm $I32 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one
doesn't seem to be used anywhere, similarly in the rules below there are a few one
unused instructions.
(one XReg (imm $I32 1)) | ||
(dst_hi2 ValueRegs (smadd_overflow64 x_lo y_hi (value_regs_get dst_hi1 0))) | ||
(dst_hi ValueRegs (smadd_overflow64 x_hi y_lo (value_regs_get dst_hi2 0))) | ||
(dst_lo XReg (madd x_lo y_lo (zero_reg))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing madd here, we can just multiply x_lo
and y_lo
and save one instruction.
(one XReg (imm $I32 1)) | ||
(dst_hi2 ValueRegs (umadd_overflow64 x_lo y_hi (value_regs_get dst_hi1 0))) | ||
(dst_hi ValueRegs (umadd_overflow64 x_hi y_lo (value_regs_get dst_hi2 0))) | ||
(dst_lo XReg (madd x_lo y_lo (zero_reg))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we could use mul
instead of madd
and save one instruction.
(let ((one XReg (imm $I8 1)) | ||
(hi XReg (rv_mulhu x y)) | ||
(m XReg (rv_mul x y)) | ||
(of_mul XReg (gen_select_xreg (cmp_eqz hi) (zero_reg) one)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could replace this with a snez
instruction
(m XReg (rv_mul x y)) | ||
(of_mul XReg (gen_select_xreg (cmp_eqz hi) (zero_reg) one)) | ||
(sum XReg (rv_add m z)) | ||
(of_add XReg (gen_select_xreg (int_compare (IntCC.UnsignedLessThan) sum m) one (zero_reg))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be replaced with a sltu
instruction which is a shorter sequence than a full select.
currently a draft and only riscv64