I'm failing to improve the elapsed time of my benchmark by tuning the caching params. The params of interest are as follows:
spark.gluten.sql.columnar.backend.velox.cacheEnabled // enable or disable velox cache, default false.

spark.gluten.sql.columnar.backend.velox.memCacheSize // the total size of in-mem cache, default is 128MB.

spark.gluten.sql.columnar.backend.velox.ssdCachePath // the folder to store the cache files, default is "/tmp".

spark.gluten.sql.columnar.backend.velox.ssdCacheSize // the total size of the SSD cache, default is 128MB. Velox will do in-mem cache only if this value is 0.

spark.gluten.sql.columnar.backend.velox.ssdCacheShards // the shards of the SSD cache, default is 1.

spark.gluten.sql.columnar.backend.velox.ssdCacheIOThreads // the IO threads for cache promoting, default is 1. Velox will try to do "read-ahead" if this value is bigger than 1.

spark.gluten.sql.columnar.backend.velox.ssdODirect // enable or disable O_DIRECT on cache write, default false.
These are mentioned in the docs at https://gluten.incubator.apache.org/docs/velox/s3#local-caching-support. The worker environment does appear to respond when I increase the SSD cache size, but elapsed time gets longer rather than shorter. The ssdCacheIOThreads param has no effect at all: the thread count within the Spark executor JVMs (which varies dynamically anyway) is not measurably different.
I have tried all plausible settings. Are there other params for tuning? The machine type is GCP Compute "n2-highmem-32", which has 32 vCPUs and 256 GB of memory.
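For concreteness, here is a minimal sketch of how the params listed above might be turned into `spark-submit` arguments. The values are illustrative assumptions only (they are not the settings from my benchmark runs), the size-string format such as "128MB" is assumed from the defaults quoted above, and the helper name `to_conf_args` is hypothetical.

```python
import shlex

# Common prefix of the Gluten/Velox cache params quoted above.
PREFIX = "spark.gluten.sql.columnar.backend.velox."

# Illustrative values only (assumptions, not measured or recommended settings).
cache_settings = {
    "cacheEnabled": "true",
    "memCacheSize": "128MB",   # size format assumed from the quoted default
    "ssdCachePath": "/tmp",    # default folder per the param description
    "ssdCacheSize": "128MB",   # 0 would mean in-memory cache only
    "ssdCacheShards": "1",
    "ssdCacheIOThreads": "1",  # read-ahead is attempted when this is > 1
    "ssdODirect": "false",
}


def to_conf_args(settings, prefix=PREFIX):
    """Expand a {short_name: value} dict into ['--conf', 'key=value', ...]."""
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{prefix}{key}={value}"]
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line fragment for inspection.
    print(shlex.join(["spark-submit", *to_conf_args(cache_settings)]))
```

Keeping the settings in one dict makes it easier to vary a single param per run and record exactly what was passed.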