Is Prestissimo (Presto Native Worker) designed to be deployed with other Presto Java workers? #7207
-
Is Prestissimo (Presto Native Worker) designed to be deployed in a dedicated cluster, or can it be deployed alongside Presto Java workers? I had assumed that Prestissimo was designed to run together with Java workers, so that within one query one stage could be executed by a Java worker while another stage is executed by a native worker (e.g. for performance reasons). But I find that the intermediate type of a Velox aggregation function is not compatible with Presto's intermediate type. For example, the
while in Velox its intermediate type is
So let's assume the partial aggregation task is executed by Presto Java and the final aggregation task is executed by Velox. Presto Java will output bigint as the intermediate result, which is not what Velox's final aggregation expects. [2]. velox/velox/functions/prestosql/aggregates/MinMaxAggregates.cpp Lines 909 to 914 in 2c0b713
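To make the mismatch concrete, here is a minimal Python sketch of a partial/final `min` split in which the two sides disagree on the width of the intermediate value. It is only an illustration: the function names are hypothetical, the byte layouts are not the actual wire format of Presto Java or Velox, and it assumes (the text above does not confirm this) that the final step expects the input's own 4-byte integer width.

```python
import struct


def partial_min_as_bigint(values):
    """Hypothetical partial step that widens its intermediate to an 8-byte bigint."""
    return struct.pack("<q", min(values))


def partial_min_as_integer(values):
    """Hypothetical partial step that keeps the 4-byte integer input width."""
    return struct.pack("<i", min(values))


def final_min_expecting_integer(intermediates):
    """Hypothetical final step that only accepts 4-byte integer intermediates."""
    decoded = []
    for blob in intermediates:
        if len(blob) != 4:
            # The producer and consumer disagree on the intermediate type.
            raise TypeError(
                f"expected 4-byte integer intermediate, got {len(blob)} bytes"
            )
        decoded.append(struct.unpack("<i", blob)[0])
    return min(decoded)
```

Feeding `final_min_expecting_integer` the output of `partial_min_as_integer` works, while feeding it the output of `partial_min_as_bigint` raises `TypeError`. That is the situation described above when the partial and final aggregation stages run on different worker types.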
Replies: 2 comments 3 replies
-
@xumingming James, thank you for starting this discussion. I think it belongs in the PrestoDB project, though. Consider moving it there.

Very early on we were thinking that Prestissimo would be a drop-in replacement for a Presto worker and would support hybrid clusters with a mix of Java and native workers. However, we no longer believe this is a viable or desirable setup. It turns out that it is hard to match Presto's intermediate types. Often these types do not make sense and are not performant: why use 8 bytes to encode a 4-byte integer or a 2-byte smallint for min/max? In addition, Velox uses modern algorithms for the approx_xxx functions, which means its intermediate types are very different from Presto's. Furthermore, the hash functions used to partition data for shuffle in Java and native do not match. Also, we believe it is prohibitively expensive to operate hybrid clusters, and it is not clear what the point is of running some stages quickly only to be slowed down by inefficient workers downstream.
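The point about shuffle hash functions can be sketched in a few lines of Python. The two hash functions below are stand-ins chosen for readability (a 31-based polynomial hash and a plain byte sum); neither is the function actually used by Presto Java or by Velox, and all names are hypothetical. The sketch only shows why two producers that hash differently cannot feed the same set of downstream partitions.

```python
def hash_engine_a(key: str) -> int:
    """Stand-in hash for one worker type: 31-based polynomial over code points."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def hash_engine_b(key: str) -> int:
    """Stand-in hash for the other worker type: sum of the UTF-8 bytes."""
    return sum(key.encode("utf-8"))


def partition(key: str, hash_fn, num_partitions: int) -> int:
    """Pick the downstream partition for a key, as a shuffle would."""
    return hash_fn(key) % num_partitions


def mismatched_keys(keys, num_partitions: int):
    """Return the keys that the two engines would route to different partitions."""
    return [
        k
        for k in keys
        if partition(k, hash_engine_a, num_partitions)
        != partition(k, hash_engine_b, num_partitions)
    ]
```

With four partitions, `mismatched_keys(["a", "ab", "abc"], 4)` returns `["ab"]`: rows with key `"ab"` would land on two different downstream workers depending on which kind of worker produced them, so a join or final aggregation on that key would see only part of its input.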
-
CC: @aditi-pandit