Question about how per-GPU token throughput is calculated on https://inferencex.semianalysis.com/

Hello, I have a question about how the y-axis metric "Token throughput per GPU" is defined when pipeline parallelism is used (PP > 1). Since each GPU then processes only a subset of DeepSeek's layers, does the per-GPU token throughput refer to the steady-state throughput (when the pipeline is full), or does it account for the overhead of pipeline ramp-up and ramp-down (pipeline bubbles)?
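
To make the two interpretations concrete, here is a minimal sketch of how I would compute each one. It assumes an idealized forward-only pipeline with equal per-stage times and no communication overhead, where `num_microbatches` micro-batches occupy `num_microbatches + pp - 1` stage slots in total. The function name, parameters, and example numbers are all hypothetical and are not taken from the site's methodology.

```python
def per_gpu_throughput(tokens_per_microbatch, stage_time_s, pp, num_microbatches):
    """Return (steady_state, with_bubbles) tokens/s per GPU.

    Assumes an idealized forward-only pipeline: every stage takes
    stage_time_s per micro-batch and communication is free.
    """
    # Steady state: once the pipeline is full, one micro-batch
    # completes every stage_time_s.
    steady_total = tokens_per_microbatch / stage_time_s

    # Including ramp-up and ramp-down: m micro-batches need
    # (m + pp - 1) stage slots from first input to last output.
    total_time_s = (num_microbatches + pp - 1) * stage_time_s
    bubble_total = tokens_per_microbatch * num_microbatches / total_time_s

    # Normalize by the number of GPUs in the pipeline (one GPU per stage).
    return steady_total / pp, bubble_total / pp


# Hypothetical numbers, chosen only for illustration.
steady, with_bubbles = per_gpu_throughput(
    tokens_per_microbatch=1024, stage_time_s=0.01, pp=4, num_microbatches=8
)
print(f"steady-state: {steady:.1f} tok/s/GPU")
print(f"with bubbles: {with_bubbles:.1f} tok/s/GPU")
print(f"ratio:        {with_bubbles / steady:.3f}")
```

Under these assumptions the two definitions differ by a factor of m / (m + PP - 1), so the gap is largest when few micro-batches are in flight relative to the pipeline depth. That is why I would like to know which one the chart reports.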