Currently, all components are connected via shifters. As the lane size increases, this interconnect becomes costly, so we need a mesh-like interconnect to reduce the overhead.
We need to create an in-T1 protocol for decoupling components, using flits to send and receive messages. The protocol also maintains the lifecycle of each instruction.
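
To make the flit idea concrete, here is a minimal behavioral sketch in Python. It is not T1's design: the flit kinds (`ISSUE`, `DATA`, `RETIRE`), the field names, the bit widths, and the `LifecycleTracker` class are all hypothetical, chosen only to illustrate how a fixed-width flit could carry an instruction tag and how a receiver could follow the instruction lifecycle from the flits it observes.

```python
from dataclasses import dataclass
from enum import Enum


class FlitKind(Enum):
    # Hypothetical message kinds, not taken from T1.
    ISSUE = 0   # an instruction enters the interconnect
    DATA = 1    # payload belonging to an in-flight instruction
    RETIRE = 2  # the instruction is finished


@dataclass(frozen=True)
class Flit:
    kind: FlitKind
    src: int       # assumed node id of the sender
    dst: int       # assumed node id of the receiver
    instr_id: int  # tag tying the flit to one instruction
    payload: int = 0


# Assumed field widths in bits; a real design would derive these
# from the number of nodes and in-flight instructions.
KIND_BITS, NODE_BITS, ID_BITS, PAYLOAD_BITS = 2, 4, 3, 32
WIDTHS = (KIND_BITS, NODE_BITS, NODE_BITS, ID_BITS, PAYLOAD_BITS)


def pack(flit: Flit) -> int:
    """Pack a flit into one integer word, lowest field first."""
    values = (flit.kind.value, flit.src, flit.dst, flit.instr_id, flit.payload)
    word, shift = 0, 0
    for value, width in zip(values, WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack(word: int) -> Flit:
    """Inverse of pack: slice the word back into fields."""
    values = []
    for width in WIDTHS:
        values.append(word & ((1 << width) - 1))
        word >>= width
    kind, src, dst, instr_id, payload = values
    return Flit(FlitKind(kind), src, dst, instr_id, payload)


class LifecycleTracker:
    """Follows which instructions are in flight by observing flits."""

    def __init__(self) -> None:
        self.in_flight: set[int] = set()

    def observe(self, flit: Flit) -> None:
        if flit.kind is FlitKind.ISSUE:
            if flit.instr_id in self.in_flight:
                raise ValueError("instruction tag already in flight")
            self.in_flight.add(flit.instr_id)
        elif flit.instr_id not in self.in_flight:
            raise ValueError("flit for an instruction that was never issued")
        elif flit.kind is FlitKind.RETIRE:
            self.in_flight.remove(flit.instr_id)
```

In this sketch every flit carries `instr_id`, so components only need to agree on the flit format rather than on each other's timing, which is the kind of decoupling the protocol is meant to provide.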