Is it possible to extend Velox to support StringWithDictionary in Clickhouse? #7182
Replies: 3 comments 6 replies
-
The data is distinct in StringWithDictionary's dictionary, while it is not in DictionaryVector.
-
We usually achieve this by having multiple DictionaryVectors (one per batch), and having them wrap around the same internal Vector (the one that has the distinct values). It should achieve the same effect you're looking for, while at the same time being more flexible so it can be used in different scenarios.
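To make the idea above concrete, here is a minimal standalone sketch of several per-batch wrappers sharing one vector of distinct values. It is only an illustration: `ToyDictionaryBatch`, `SharedDictionary`, and `makeBatch` are hypothetical names invented for this example and are not Velox classes or APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the shared inner vector of distinct strings.
using SharedDictionary = std::shared_ptr<const std::vector<std::string>>;

// Hypothetical stand-in for one per-batch dictionary wrapper.
// This is a toy model, not Velox's DictionaryVector.
struct ToyDictionaryBatch {
  SharedDictionary dictionary;   // shared across batches, never copied
  std::vector<int32_t> indices;  // one dictionary index per row of this batch

  std::size_t size() const { return indices.size(); }

  // Resolve a row to its string by looking the index up in the shared dictionary.
  const std::string& valueAt(std::size_t row) const {
    assert(row < indices.size());
    const auto index = static_cast<std::size_t>(indices[row]);
    assert(index < dictionary->size());
    return (*dictionary)[index];
  }
};

// Build a batch that references (rather than copies) the shared dictionary.
ToyDictionaryBatch makeBatch(SharedDictionary dictionary,
                             std::vector<int32_t> indices) {
  return ToyDictionaryBatch{std::move(dictionary), std::move(indices)};
}
```

In this sketch each batch owns only its index array; the distinct strings live in a single allocation that every batch points to, so creating another batch costs one index vector and one reference-count increment.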
-
@xumingming Maybe we could look at some specific queries and compare the performance between Velox and ClickHouse. If Velox is significantly slower, we can then think about how to optimize. Would you like to try this route?
-
ClickHouse has a data type named `StringWithDictionary` (see [1] for more details). It is basically a dictionary-encoded format for string data, but it is not the same as the DictionaryVector in Velox, for two reasons: [...] `dictId` to `string value`, while it is not in DictionaryVector.

I am thinking that to support a feature like StringWithDictionary, we would need to support a new type of Vector. Is it possible to implement it as a plugin of Velox, or do we have to change Velox core?
[1]. https://presentations.clickhouse.com/meetup19/string_optimization.pdf