@MichielDerhaeg
Contributor

No description provided.

artemiy-volkov and others added 6 commits November 28, 2025 21:03
The Synopsys RMX-100 has a short, three-stage, in-order execution pipeline.

The option -mmpy-option was added to control which version of the MPY
unit the core has and what the latency of multiply instructions should
be, similar to the corresponding option for ARCv2 cores (see
gcc/config/arc/arc.opt:60).

Authored-by: Artemiy Volkov <[email protected]>
Signed-off-by: Michiel Derhaeg <[email protected]>
The Synopsys RHX-100 has a 10-stage, dual-issue, in-order execution
pipeline.

It has support for instruction fusion, which will be addressed by
subsequent patches.
Contributor Author

@MichielDerhaeg MichielDerhaeg left a comment

I tried to split the commits up into something sensible. I didn't check whether each one can be built individually, though.

return store_data_bypass_p (out_insn, in_insn);
}

/* Implement one boolean function for each of the values of the
Contributor Author

Should the declarations in the header be moved to arcv.h?

Member

@luismgsilva luismgsilva Dec 2, 2025

I know I originally placed the declarations in arcv.h, but looking at the existing code structure, it may be a better fit to keep them in riscv-protos.h.

EDIT: See d479345


struct riscv_tune_param;

extern int riscv_get_tune_param_issue_rate (void);
Contributor Author

This causes a circular dependency between riscv.cc and arcv.cc. Can we instead just pass in the issue_rate itself as a parameter where this is called?

Member

@luismgsilva luismgsilva Dec 2, 2025

Indeed. Yes.

EDIT: See 73219a4
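
The suggestion in this thread can be sketched as follows. This is a minimal, standalone illustration and not the code from the PR: `tune_param`, `arcv_remaining_slots`, `riscv_remaining_slots`, and the slot-counting logic are hypothetical stand-ins. The point is only that the caller on the riscv.cc side reads the issue rate from its own tuning data and passes the value down, so the arcv.cc side no longer has to call a getter defined in riscv.cc.

```cpp
#include <cassert>

// Hypothetical stand-in for the tuning data owned by riscv.cc.
struct tune_param {
  int issue_rate;
};

// Hypothetical arcv.cc-side helper. It receives the issue rate as a plain
// argument instead of calling back into riscv.cc to fetch it.
int arcv_remaining_slots(int issue_rate, int insns_issued_this_cycle) {
  int remaining = issue_rate - insns_issued_this_cycle;
  return remaining > 0 ? remaining : 0;
}

// Hypothetical riscv.cc-side caller. It reads its own tuning data and
// passes the value down, so the dependency only points one way.
int riscv_remaining_slots(const tune_param &tune, int insns_issued_this_cycle) {
  return arcv_remaining_slots(tune.issue_rate, insns_issued_this_cycle);
}
```

With this shape, the helper can be exercised without any of the tuning machinery, for example `arcv_remaining_slots(2, 1)` for a dual-issue core that has already issued one instruction.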

 case SIGN_EXTRACT:
-  if (TARGET_XTHEADBB && outer_code == SET
+  if ((TARGET_ARCV_RHX100 || TARGET_XTHEADBB)
+      && outer_code == SET
Contributor Author

FYI, this was added for the bit-extract fusion.

Comment on lines +4643 to +4658
(define_insn "*zero_extract_fused"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (zero_extract:SI (match_operand:SI 1 "register_operand" "r")
                         (match_operand 2 "const_int_operand")
                         (match_operand 3 "const_int_operand")))]
  "TARGET_ARCV_RHX100 && !TARGET_64BIT
   && (INTVAL (operands[2]) > 1 || !TARGET_ZBS)"
{
  int amount = INTVAL (operands[2]);
  int end = INTVAL (operands[3]) + amount;
  operands[2] = GEN_INT (BITS_PER_WORD - end);
  operands[3] = GEN_INT (BITS_PER_WORD - amount);
  return "slli\t%0,%1,%2\n\tsrli\t%0,%0,%3";
}
  [(set_attr "type" "alu_fused")]
)
Contributor Author

As far as I can tell, this fusion was never implemented as a define_insn_and_split. It might not be trivial to force these exact instructions after a split.
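
For readers unfamiliar with the shift pair, here is a small standalone sketch of the arithmetic that the `*zero_extract_fused` pattern above relies on. It assumes a 32-bit word (the pattern is guarded by `!TARGET_64BIT`) and bit positions counted from the least significant bit. The function names are hypothetical and this is plain C++, not GCC code. In the pattern, the variable `amount` holds the field width (operand 2) and operand 3 is the start position.

```cpp
#include <cassert>
#include <cstdint>

// Reference version: extract `size` bits starting at bit `pos` (bit 0 = LSB)
// with a shift and a mask. Assumes 1 <= size and pos + size <= 32.
uint32_t zero_extract_mask(uint32_t x, unsigned size, unsigned pos) {
  uint32_t mask = size == 32 ? 0xFFFFFFFFu : ((1u << size) - 1u);
  return (x >> pos) & mask;
}

// Shift-pair version mirroring the emitted sequence:
//   slli by 32 - (pos + size)   drops the bits above the field
//   srli by 32 - size           drops the bits below it and zero-fills
uint32_t zero_extract_shifts(uint32_t x, unsigned size, unsigned pos) {
  unsigned left = 32 - (pos + size);
  unsigned right = 32 - size;
  return (x << left) >> right;
}
```

For example, extracting 8 bits at position 4 from `0xABCD1234` shifts left by 20 and then right by 24, which yields `0x23`, the same value the mask-based version produces.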

Comment on lines +4598 to +4658
(define_insn "madd_split_fused"
  [(set (match_operand:SI 0 "register_operand" "=&r,r")
        (plus:SI
          (mult:SI (match_operand:SI 1 "register_operand" "r,r")
                   (match_operand:SI 2 "register_operand" "r,r"))
          (match_operand:SI 3 "register_operand" "r,?0")))
   (clobber (match_scratch:SI 4 "=&r,&r"))]
  "TARGET_ARCV_RHX100
   && !TARGET_64BIT && (TARGET_ZMMUL || TARGET_MUL)"
{
  if (REGNO (operands[0]) == REGNO (operands[3]))
    {
      return "mul\t%4,%1,%2\n\tadd\t%4,%3,%4\n\tmv\t%0,%4";
    }
  else
    {
      return "mul\t%0,%1,%2\n\tadd\t%0,%0,%3";
    }
}
  [(set_attr "type" "imul_fused")]
)

(define_insn "madd_split_fused_extended"
  [(set (match_operand:DI 0 "register_operand" "=&r,r")
        (sign_extend:DI
          (plus:SI
            (mult:SI (match_operand:SI 1 "register_operand" "r,r")
                     (match_operand:SI 2 "register_operand" "r,r"))
            (match_operand:SI 3 "register_operand" "r,?0"))))
   (clobber (match_scratch:SI 4 "=&r,&r"))]
  "TARGET_ARCV_RHX100
   && (TARGET_ZMMUL || TARGET_MUL)"
{
  if (REGNO (operands[0]) == REGNO (operands[3]))
    {
      return "mulw\t%4,%1,%2\n\taddw\t%4,%3,%4\n\tmv\t%0,%4";
    }
  else
    {
      return "mulw\t%0,%1,%2\n\taddw\t%0,%0,%3";
    }
}
  [(set_attr "type" "imul_fused")]
)

(define_insn "*zero_extract_fused"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (zero_extract:SI (match_operand:SI 1 "register_operand" "r")
                         (match_operand 2 "const_int_operand")
                         (match_operand 3 "const_int_operand")))]
  "TARGET_ARCV_RHX100 && !TARGET_64BIT
   && (INTVAL (operands[2]) > 1 || !TARGET_ZBS)"
{
  int amount = INTVAL (operands[2]);
  int end = INTVAL (operands[3]) + amount;
  operands[2] = GEN_INT (BITS_PER_WORD - end);
  operands[3] = GEN_INT (BITS_PER_WORD - amount);
  return "slli\t%0,%1,%2\n\tsrli\t%0,%0,%3";
}
  [(set_attr "type" "alu_fused")]
)
Contributor Author

All of these patterns used to be define_insn_and_split patterns. If we can't get rid of them, let's try going back to define_insn_and_split (check the downstream commit history) and see what the performance looks like.

Member

If we revert madd_split_fused1 back to a define_insn_and_split, we get a 0.898% improvement.
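
As background for the discussion of the madd patterns, here is one plausible reading of why `madd_split_fused` has two output templates and a scratch register. When the destination register is also the addend, writing the product straight into the destination would overwrite the addend before the add reads it. The simulation below is a hedged sketch in plain C++ with a hypothetical eight-entry register file; the function names are illustrative and none of this is GCC code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical toy register file, used only for this illustration.
using regfile = std::array<uint32_t, 8>;

// mul rd,rs1,rs2 ; add rd,rd,rs3
// Mirrors the template used when rd and rs3 are different registers.
void madd_direct(regfile &r, int rd, int rs1, int rs2, int rs3) {
  r[rd] = r[rs1] * r[rs2];
  r[rd] = r[rd] + r[rs3];
}

// mul tmp,rs1,rs2 ; add tmp,rs3,tmp ; mv rd,tmp
// Mirrors the template used when rd and rs3 are the same register.
void madd_via_scratch(regfile &r, int rd, int rs1, int rs2, int rs3, int tmp) {
  r[tmp] = r[rs1] * r[rs2];
  r[tmp] = r[rs3] + r[tmp];
  r[rd] = r[tmp];
}
```

With registers 1, 2, and 3 holding 3, 4, and 5, the scratch variant with `rd == rs3 == 3` leaves 17 in register 3, while the direct variant with the same register assignment would leave 24, because the addend has already been replaced by the product.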

Signed-off-by: Luis Silva <[email protected]>
Before: 10,543,513 cycles
After: 10,543,496 cycles
Difference: 17 cycles
Improvement: 0.000%

Signed-off-by: Luis Silva <[email protected]>
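
The figures in the commit message above are consistent with a simple relative-improvement calculation. The sketch below assumes the percentage is computed relative to the "Before" cycle count, which the commit message does not state explicitly; `improvement_percent` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Relative improvement in percent, assuming the "Before" count is the
// baseline: (before - after) / before * 100.
double improvement_percent(int64_t before_cycles, int64_t after_cycles) {
  return 100.0 * static_cast<double>(before_cycles - after_cycles)
         / static_cast<double>(before_cycles);
}
```

For 10,543,513 cycles before and 10,543,496 after, the difference is 17 cycles and the result is roughly 0.00016%, which rounds to the 0.000% reported at three decimal places.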