Fix SortBuffer ensureOutputFits estimateOutputSize inaccurate #11534

jinchengchenghh · 2024-11-14T03:50:04Z

The size reserved for the output batch should be rowSize * numRows; the previous code omitted numRows.
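To illustrate the arithmetic, here is a minimal sketch of the intended estimate. It is not the Velox code: the function name `outputBytesToReserve` and its parameters are hypothetical, and the 1.2 default only mirrors the multiplier quoted later in this thread.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not the Velox implementation): bytes to reserve before
// producing an output batch of `numRows` rows.
inline uint64_t outputBytesToReserve(
    std::optional<uint64_t> estimatedRowSize, // estimated bytes per row, if known
    uint64_t numRows,                         // rows in the next output batch
    double headroom = 1.2) {                  // assumed safety multiplier
  if (!estimatedRowSize.has_value() || numRows == 0) {
    return 0; // no estimate or no rows: nothing to reserve
  }
  // Without the numRows factor, the result covers roughly a single row.
  return static_cast<uint64_t>(
      static_cast<double>(estimatedRowSize.value() * numRows) * headroom);
}
```

For a 100-byte row estimate and a 1000-row batch, this reserves about 120 KB, whereas dropping numRows would reserve about 120 bytes.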

netlify · 2024-11-14T03:50:17Z

✅ Deploy Preview for meta-velox canceled.

Name	Link
🔨 Latest commit	`6232d5a`
🔍 Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/6735736e140d8400082a8fb9

FelixYBW · 2024-11-14T18:28:41Z

Issue is not fixed:

Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::getOutput failed for [operator: OrderBy, plan node ID: 1]: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 4.0 MiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled). 
Current config settings: 
	spark.gluten.memory.offHeap.size.in.bytes=8.3 GiB
	spark.gluten.memory.task.offHeap.size.in.bytes=8.3 GiB
	spark.gluten.memory.conservative.task.offHeap.size.in.bytes=4.2 GiB
	spark.memory.offHeap.enabled=true
	spark.gluten.memory.dynamic.offHeap.sizing.enabled=false
Memory consumer stats: 
	Task.139909:                                              Current used bytes:  8.3 GiB, peak bytes:        N/A
	\- Gluten.Tree.61:                                        Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	   \- root.61:                                            Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      +- NativePlanEvaluator-61.0:                        Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      |  \- single:                                       Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      |     +- root:                                      Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      |     |  +- task.Gluten_Stage_2_TID_139909_VTID_61: Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      |     |  |  +- node.1:                              Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      |     |  |  |  \- op.1.0.0.OrderBy:                 Current used bytes:  8.3 GiB, peak bytes:    8.3 GiB
	      |     |  |  +- node.2:                              Current used bytes: 96.0 KiB, peak bytes: 1024.0 KiB
	      |     |  |  |  \- op.2.0.0.Window:                  Current used bytes: 96.0 KiB, peak bytes:   96.0 KiB
	      |     |  |  +- node.3:                              Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     |  |  |  \- op.3.0.0.FilterProject:           Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     |  |  \- node.0:                              Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     |  |     \- op.0.0.0.ValueStream:             Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     |  \- default_leaf:                           Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     \- gluten::MemoryAllocator:                   Current used bytes:    0.0 B, peak bytes:      0.0 B
	      +- ArrowContextInstance.6:                          Current used bytes:  8.0 MiB, peak bytes:    8.0 MiB
	      +- IteratorMetrics.61:                              Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |  \- single:                                       Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     +- root:                                      Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     |  \- default_leaf:                           Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     \- gluten::MemoryAllocator:                   Current used bytes:    0.0 B, peak bytes:      0.0 B
	      +- IndicatorVectorBase#init.61.OverAcquire.0:       Current used bytes:    0.0 B, peak bytes:      0.0 B
	      +- ShuffleReader.3.OverAcquire.0:                   Current used bytes:    0.0 B, peak bytes:      0.0 B
	      +- IndicatorVectorBase#init.61:                     Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |  \- single:                                       Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     +- root:                                      Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     |  \- default_leaf:                           Current used bytes:    0.0 B, peak bytes:      0.0 B
	      |     \- gluten::MemoryAllocator:                   Current used bytes:    0.0 B, peak bytes:      0.0 B
	      +- IteratorMetrics.61.OverAcquire.0:                Current used bytes:    0.0 B, peak bytes:      0.0 B
	      +- NativePlanEvaluator-61.0.OverAcquire.0:          Current used bytes:    0.0 B, peak bytes:      0.0 B
	      \- ShuffleReader.3:                                 Current used bytes:    0.0 B, peak bytes:   16.0 MiB
	         \- single:                                       Current used bytes:    0.0 B, peak bytes:   16.0 MiB
	            +- root:                                      Current used bytes:    0.0 B, peak bytes: 1024.0 KiB
	            |  \- default_leaf:                           Current used bytes:    0.0 B, peak bytes:  579.8 KiB
	            \- gluten::MemoryAllocator:                   Current used bytes:    0.0 B, peak bytes:  401.3 KiB

	at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:105)
	at org.apache.gluten.memory.listener.ManagedReservationListener.reserve(ManagedReservationListener.java:49)
	at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
	at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57)
	at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39)
	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:43)
	at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:159)
	at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:71)
	at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:37)
	at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.hasNext(IteratorsV1.scala:100)
	at scala.collection.Iterator.isEmpty(Iterator.scala:385)
	at scala.collection.Iterator.isEmpty$(Iterator.scala:385)
	at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.isEmpty(IteratorsV1.scala:90)
	at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
	at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:949)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:949)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:374)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:338)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:374)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:338)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:374)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:338)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:374)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:338)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1471)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

FelixYBW · 2024-11-14T18:41:08Z

@JkSelf @zhztheplayer

Does the fix make sense? Where does the 1.2 come from?

const uint64_t outputBufferSizeToReserve = estimatedOutputRowSize_.value() * 1.2;

facebook-github-bot added the CLA Signed label Nov 14, 2024

Fix SortBuffer ensureOutputFits estimateOutputSize inaccurate

6232d5a

Yuhta requested a review from xiaoxmeng November 14, 2024 15:43

FelixYBW mentioned this pull request Nov 14, 2024

OrderBy OOM in getOutput stage. #10940

Closed
