Open
Description
For example, with VLEN = 16K, small-vl cases (generally vl < 1K) always leave much of the VRF unused.
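To put a number on that waste, here is a rough sketch. It assumes vl and VLEN are counted in the same unit (bits here) and that LMUL = 1, so one register holds at most VLEN; the function name `vrf_utilization` is made up for illustration and is not part of T1.

```python
K = 1024  # "1K" as used in this issue


def vrf_utilization(active_bits: int, vlen_bits: int) -> float:
    """Fraction of a single vector register occupied by active elements.

    Assumption: active_bits and vlen_bits use the same unit, and LMUL = 1.
    """
    if vlen_bits <= 0 or not 0 <= active_bits <= vlen_bits:
        raise ValueError("expected 0 <= active_bits <= vlen_bits and vlen_bits > 0")
    return active_bits / vlen_bits


# vl at the 1K boundary on a VLEN = 16K machine
print(vrf_utilization(1 * K, 16 * K))  # 0.0625
```

So even at the upper end of the "small vl" range, under these assumptions only about 6% of each register is in use.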
MSP is a feature in Cray machines; likewise, we need to support multiple threads.
We need to assign chicken bits in the first scalar core to control how much VLEN each thread gets, so that VLEN can be split into 1K, 2K, 4K, 8K, or 16K per thread.
This gives T1 thread-level parallelism (TLP) at the cost of extra scalar core resources (RF) and conflict-handling logic in the LSU. In return we get:
- fine-grained vl across multiple threads;
- much higher memory utilization, hiding memory latency across different threads, similar to SIMT.
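One possible way to read the chicken-bit proposal is sketched below. The encoding is purely hypothetical (a small code `0..4` selecting 1K..16K), and it assumes the full 16K VLEN is divided equally among threads; neither detail is specified in this issue.

```python
from typing import Tuple

K = 1024
MAX_VLEN = 16 * K  # full VLEN from the example above


def decode_split(code: int) -> Tuple[int, int]:
    """Map a hypothetical chicken-bit code to (per-thread VLEN, thread count).

    Assumed encoding: 0 -> 1K, 1 -> 2K, 2 -> 4K, 3 -> 8K, 4 -> 16K,
    with the VRF split evenly across threads.
    """
    if not 0 <= code <= 4:
        raise ValueError("code must be in 0..4")
    per_thread_vlen = K << code
    threads = MAX_VLEN // per_thread_vlen
    return per_thread_vlen, threads


for code in range(5):
    vlen, threads = decode_split(code)
    print(f"code={code}: {threads} thread(s) x VLEN {vlen // K}K")
```

Under this reading, code 0 would give 16 threads of 1K each and code 4 would fall back to a single 16K thread, so the existing single-thread behavior stays available as one setting.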