Description
Redshift Cluster Spec
- Cluster CPU Utilisation: ~50%
- Cluster resources: ra3.xlplus, 2 nodes
Load Speed
maxSizePerBatch (GB) | Load time (minutes) | GB/hour |
---|---|---|
0.5 | 3 | 10 |
0.5 | 5 | 6 |
0.5 | 8 | 3.75 |
1 | 10 | 6 |
1 | 9 | 6.67 |
1 | 7 | 8.57 |
1 | 7 | 8.57 |
1 | 6 | 10 |
4 | 21 | 11.43 |
4 | 21 | 11.43 |
0.5 | 4 | 7.5 |
The load speed drops when multiple loads run at the same time. The maximum speed observed is about 11.4 GB/hour, with the 4 GB batches.
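The GB/hour column is just batch size divided by load time. A minimal sketch of that arithmetic, assuming maxSizePerBatch is in GB and that the batch was filled to its limit (`gbPerHour` is a hypothetical helper, not part of the loader):

```go
package main

import "fmt"

// gbPerHour converts a batch size in GB and a load time in minutes
// into throughput in GB/hour.
func gbPerHour(batchGB, loadMinutes float64) float64 {
	return batchGB / loadMinutes * 60
}

func main() {
	// (batch size in GB, load minutes) pairs taken from the table above.
	runs := [][2]float64{{0.5, 3}, {1, 7}, {4, 21}}
	for _, r := range runs {
		fmt.Printf("%.1f GB in %.0f min = %.2f GB/hour\n", r[0], r[1], gbPerHour(r[0], r[1]))
	}
}
```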
Breakdown of time taken in the load task
The example below is for a maxSizePerBatch of 8 GB:
I0401 08:05:12.574673 1 load_processor.go:739] ts.inventory.customers, batchId:1, size:16389: processing...
I0401 08:05:12.574702 1 load_processor.go:646] ts.inventory.customers, batchId:1, startOffset:57150
I0401 08:05:13.119588 1 load_processor.go:701] ts.inventory.customers, load staging
I0401 08:05:21.538138 1 redshift.go:868] Running: COPY from s3 to: customers_ts_adx_reload_staged
I0401 08:34:51.824170 1 load_processor.go:212] ts.inventory.customers, copied staging
I0401 08:36:45.631030 1 load_processor.go:235] ts.inventory.customers, deduped
I0401 08:40:02.744066 1 load_processor.go:254] ts.inventory.customers, deleted common
I0401 08:40:04.206752 1 load_processor.go:273] ts.inventory.customers, deleted delete-op
I0401 08:40:04.216792 1 redshift.go:817] Running: UNLOAD from customers_ts_adx_reload_staged to s3
I0401 08:43:33.241421 1 load_processor.go:323] ts.inventory.customers, unloaded
I0401 08:43:33.241453 1 redshift.go:868] Running: COPY from s3 to: customers_ts_adx_reload
I0401 08:49:11.932393 1 load_processor.go:339] ts.inventory.customers, copied
I0401 08:49:19.985916 1 load_processor.go:151] ts.inventory.customers, offset: 73539, marking
I0401 08:49:19.985935 1 load_processor.go:158] ts.inventory.customers, offset: 73539, marked
I0401 08:49:19.985939 1 load_processor.go:161] ts.inventory.customers, committing (autoCommit=false)
I0401 08:49:19.987312 1 load_processor.go:163] ts.inventory.customers, committed (autoCommit=false)
I0401 08:49:19.987344 1 load_processor.go:768] ts.inventory.customers, batchId:1, size:16389, end:73538:, processed in 44m
Task | Time taken | % of total |
---|---|---|
load staging | 29 mins | 65.9% |
merge/dedupe | 2 mins | 4.4% |
merge/deleteCommon | 4 mins | 8.8% |
merge/deleteOp | 2 seconds | 0.07% |
unload | 3 mins | 6.8% |
load target | 6 mins | 13.6% |
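The per-phase times can also be derived directly from the log timestamps instead of by hand. This is a rough sketch, assuming each phase runs from the previous log line to its own completion line and that the timestamps follow the glog `MMDD hh:mm:ss.uuuuuu` layout. `elapsed` is a hypothetical helper, and the phase boundaries are my reading of the log above:

```go
package main

import (
	"fmt"
	"time"
)

// glogLayout matches the "MMDD hh:mm:ss.uuuuuu" part of each log line,
// i.e. the text right after the leading severity letter "I".
const glogLayout = "0102 15:04:05.000000"

// elapsed returns the time between two log timestamps.
func elapsed(start, end string) (time.Duration, error) {
	s, err := time.Parse(glogLayout, start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(glogLayout, end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}

func main() {
	// Phase boundaries copied from the log excerpt; names follow the table.
	phases := []struct{ name, start, end string }{
		{"load staging", "0401 08:05:21.538138", "0401 08:34:51.824170"},
		{"merge/dedupe", "0401 08:34:51.824170", "0401 08:36:45.631030"},
		{"merge/deleteCommon", "0401 08:36:45.631030", "0401 08:40:02.744066"},
		{"merge/deleteOp", "0401 08:40:02.744066", "0401 08:40:04.206752"},
		{"unload", "0401 08:40:04.216792", "0401 08:43:33.241421"},
		{"load target", "0401 08:43:33.241453", "0401 08:49:11.932393"},
	}
	// Whole batch: first "processing..." line to the final "processed in" line.
	total, err := elapsed("0401 08:05:12.574673", "0401 08:49:19.987344")
	if err != nil {
		panic(err)
	}
	for _, p := range phases {
		d, err := elapsed(p.start, p.end)
		if err != nil {
			panic(err)
		}
		pct := 100 * d.Seconds() / total.Seconds()
		fmt.Printf("%-20s %-10s %5.1f%%\n", p.name, d.Round(time.Second), pct)
	}
}
```

For this batch the timestamps give about 3m17s for deleteCommon and 5m39s for load target, so the figures are close to, but not identical with, the rounded values in the table.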
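We need to identify which steps to optimize and then work on improving the load speed. Loading into the staging table takes about two thirds of the batch time, so it looks like the first place to investigate.
Can we load at 100 GB/hour?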
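To put the 100 GB/hour question in numbers, here is a back-of-the-envelope sketch. It assumes the phases run strictly one after another, that the 8 GB batch size stays the same, and it uses the rounded phase times from the table. Both functions and the speedup factor are hypothetical, for illustration only:

```go
package main

import "fmt"

// minutesForTarget returns how long one batch may take, in minutes,
// to sustain the target throughput.
func minutesForTarget(batchGB, targetGBPerHour float64) float64 {
	return batchGB / targetGBPerHour * 60
}

// gbPerHourIfPhaseSpedUp estimates throughput when one phase becomes
// `factor` times faster and every other phase is unchanged.
func gbPerHourIfPhaseSpedUp(batchGB float64, phaseMinutes []float64, phase int, factor float64) float64 {
	total := 0.0
	for i, m := range phaseMinutes {
		if i == phase {
			m /= factor // m is a copy, so the input slice is not modified
		}
		total += m
	}
	return batchGB / total * 60
}

func main() {
	// Rounded phase times from the table, in minutes: load staging, dedupe,
	// deleteCommon, deleteOp (2 seconds), unload, load target.
	phases := []float64{29, 2, 4, 2.0 / 60, 3, 6}

	fmt.Printf("batch time needed for 100 GB/hour: %.1f min\n", minutesForTarget(8, 100))
	// Hypothetical: the staging COPY (index 0) becomes 10x faster.
	fmt.Printf("staging COPY 10x faster: %.1f GB/hour\n", gbPerHourIfPhaseSpedUp(8, phases, 0, 10))
}
```

Under these assumptions an 8 GB batch would have to finish in 4.8 minutes, against the 44 minutes measured. Even if the staging load took no time at all, the remaining steps add up to about 15 minutes, which caps throughput at roughly 32 GB/hour, so reaching 100 GB/hour would likely need improvements in more than one step.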