WIP: Optimize counter bitwidth in Foreach control #293
kumasento wants to merge 3 commits into stanford-ppl:master
    val bitwidth = math.max(getBitwidth(begin), getBitwidth(end))

    // TODO: Find a better way that can map bitwidth to the exact Fix type
    if (bitwidth <= 7) {
@mattfel1 Hi there, I feel it would be tedious to implement every bitwidth-to-type cast by hand, and there may be a better way that I'm not aware of. Maybe you've run into this scenario before and know a good way to deal with it? Thanks!
Unfortunately, I don't think there is a non-tedious way to do this.
Since each bitwidth is its own trait (argon/lang/types/CustomBitWidths.scala), it's painful to work with. Some people have used quasiquotes for this problem before, but there isn't a nice way that I know of.
Thanks @mattfel1! In my latest update I manually added the mappings for all the different bitwidth values. Hope it looks fine.
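For readers following the thread, here is a minimal value-level sketch of keeping such a manual mapping in one table instead of a long if-chain. Everything in it is hypothetical: the `Width` objects are stand-ins rather than the traits in CustomBitWidths.scala, and only the threshold 7 appears in the diff above (15 and 31 are assumed for illustration). It does not remove the need to name each width type by hand.

```scala
object WidthMappingSketch {
  // Hypothetical stand-ins for per-width marker types (not Spatial's traits).
  sealed trait Width { def bits: Int }
  case object W7  extends Width { val bits = 7 }
  case object W15 extends Width { val bits = 15 }
  case object W31 extends Width { val bits = 31 }

  // Supported widths in ascending order; extend this list to add a mapping.
  private val ascending: List[Width] = List(W7, W15, W31)

  /** Smallest listed width that can hold `bitwidth` bits, if any. */
  def smallestFitting(bitwidth: Int): Option[Width] =
    ascending.find(w => bitwidth <= w.bits)
}
```

With this table, a required bitwidth of 5 selects `W7`, a bitwidth of 8 selects `W15`, and anything above 31 yields `None` so the caller can fall back to the default type.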
Introduction
This PR aims to improve the bitwidth synthesized for counters in Foreach. Counters are all initialized to I32 at the moment, which is not optimal: in many cases a much tighter bitwidth bound exists, so resources can be saved and timing can be improved. This idea was mentioned by @mattfel1 in #288.

A motivating example where this type of optimization can be applied:

Here, suppose N is a constant or its bound is statically known. We can then calculate the bitwidth required for the counter that counts from 0 to N-1 simply as floor(log2(N)) + 1, which is the number of bits needed to represent N as an unsigned value.

Implementation
To implement this optimization, I'm thinking of adding a new Transformer pass during compilation, CounterBitwidthTransformer, which traverses the program, finds every OpForeach, and replaces its CounterNew with a new instance that has a reduced bitwidth.

There are some questions though, mostly due to my limited knowledge of Spatial internals:

- […] CounterNew without inheriting its data type, i.e., replace our updated CounterNew[T] by CounterNew[I32], where T is the optimized data type?
- […] CounterNew the only counter we should take care of?
- […] CounterNew[T] with specified T?

Test plan
There is a new app (will be deleted later), TestCounterBitwidth, that simply iterates over an SRAM and updates its content using Foreach. If we observe that CounterNew is replaced by one with a shorter bitwidth, and the generated hardware uses fewer resources, we may assume that this optimization pass is helpful.

I can also implement unit tests later once I figure out how to do that.
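As a rough illustration of what such a check could assert, the sketch below models the narrowing on a toy IR. The `Counter` and `Foreach` case classes are hypothetical stand-ins, not Spatial's CounterNew or OpForeach nodes, and the sign bit that a signed type such as I32 would need is deliberately ignored. The width is taken as the larger of the widths of the two bounds, mirroring the `math.max(getBitwidth(begin), getBitwidth(end))` line in the diff.

```scala
object CounterBitwidthSketch {
  // Toy model only: hypothetical stand-ins for the real IR nodes.
  final case class Counter(start: Int, end: Int, step: Int, bits: Int)
  final case class Foreach(counters: List[Counter])

  /** Bits needed to represent a non-negative n as an unsigned value:
    * floor(log2(n)) + 1 for n >= 1, and 1 for n == 0. */
  def unsignedBits(n: Int): Int = {
    require(n >= 0, "only non-negative bounds are handled in this sketch")
    if (n == 0) 1 else 32 - Integer.numberOfLeadingZeros(n)
  }

  /** Narrow a counter to the width implied by its statically known bounds.
    * The width is never increased beyond what the counter already has. */
  def narrow(c: Counter): Counter = {
    val needed = math.max(unsignedBits(c.start), unsignedBits(c.end))
    c.copy(bits = math.min(c.bits, needed))
  }

  /** Apply the narrowing to every counter of every loop. */
  def narrowAll(loops: List[Foreach]): List[Foreach] =
    loops.map(l => l.copy(counters = l.counters.map(narrow)))
}
```

Under these assumptions, a 32-bit counter running from 0 to 16 is narrowed to 5 bits, and one running from 0 to 1024 to 11 bits. A unit test for the real pass could make the same kind of assertion on the transformed counters.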
Schedule
The initial version of this work will come out in the next two weeks, and we can finalize other details and make improvements in the following weeks.