-
Notifications
You must be signed in to change notification settings - Fork 1.1k
perf: Optimize wide table write performance #16699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // Skip column name | ||
| ReadWriteIOUtils.readString(buffer); | ||
| // Skip data type | ||
| ReadWriteIOUtils.readDataType(buffer); | ||
| // Skip encoding and compression for FIELD columns | ||
| if (category == TsTableColumnCategory.FIELD) { | ||
| ReadWriteIOUtils.readEncoding(buffer); | ||
| ReadWriteIOUtils.readCompressionType(buffer); | ||
| } | ||
| // Skip column props | ||
| ReadWriteIOUtils.readMap(buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May add skipString and skipMap, which only change the buffer position instead of creating temporary objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes wide-table write performance by addressing two main bottlenecks: metadata analysis phase and TSFile table registration. Key optimizations include:
- Optimistic locking in TsTable - Introduces lock-free fast paths for read operations using version tracking and write flags
- Semantic check caching - Adds flags to skip redundant validation of InsertNode measurements
- Direct schema conversion - Eliminates intermediate schema conversions during TSFile registration
- Lower-case transformation optimization - Adds caching to prevent redundant toLowerCase operations
- Test utilities - Introduces TSDataTypeTestUtils for consistent handling of supported data types in tests
Reviewed Changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| pom.xml | Updates tsfile dependency version to 2.2.0-251111-SNAPSHOT |
| TsTable.java | Adds optimistic locking mechanism with version tracking for improved read performance |
| TsTableColumnSchema.java, FieldColumnSchema.java | Adds getMeasurementSchema() method for schema conversion |
| TsFileTableSchemaUtil.java | New utility class for optimized TsTable↔TableSchema conversion without intermediate serialization |
| InsertNodeMeasurementInfo.java | New class encapsulating insert node measurements with lazy evaluation support |
| InsertBaseStatement.java | Adds caching flags for semantic checks, toLowerCase, and attribute columns |
| InsertTabletStatement.java, InsertRowStatement.java | Implements rebuildArraysAfterExpansion for TAG column reordering |
| WrappedInsertStatement.java | Refactors validation to use new InsertNodeMeasurementInfo and optimized TAG column handling |
| TableHeaderSchemaValidator.java | Adds validateInsertNodeMeasurements with custom handlers for optimized validation |
| DataRegion.java | Replaces schema cache with version-aware TableSchema cache |
| LoadTsFileManager.java, UnsealedTsFileRecoverPerformer.java | Uses TsFileTableSchemaUtil instead of intermediate conversions |
| DataNodeTableCache.java | Renames version to instanceVersion for clarity |
| TSDataTypeTestUtils.java | New test utility for filtering unsupported TSDataType values |
| AlignedTVList.java | Optimizes bitmap initialization using markRange |
| TVList.java | Changes hasLimit() to hasSetLimit() for pagination controller |
| IoTDBConfig.java, IoTDBDescriptor.java | Removes deprecated loadTableSchemaCacheSizeInBytes configuration |
Comments suppressed due to low confidence (1)
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/ast/WrappedInsertStatement.java:1
- This duplicates the same incorrect logic from InsertBaseStatement.semanticCheck() (see previous comment). If a measurement has failed and is in failedMeasurementIndex2Info, it could be null but would skip the null check, then get added to deduplicatedMeasurements, causing incorrect behavior. Failed measurements should be completely skipped from duplicate detection.
/*
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/schema/table/TsTable.java
Show resolved
Hide resolved
iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/schema/table/TsTable.java
Show resolved
Hide resolved
...e/src/main/java/org/apache/iotdb/db/queryengine/plan/statement/crud/InsertBaseStatement.java
Show resolved
Hide resolved
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/DataRegion.java
Outdated
Show resolved
Hide resolved
...ain/java/org/apache/iotdb/db/queryengine/plan/relational/sql/ast/WrappedInsertStatement.java
Outdated
Show resolved
Hide resolved
| measurementValidator.validate( | ||
| i, | ||
| measurementInfo.getMeasurementName(i), | ||
| measurementInfo.getType(i), | ||
| columnCategories[i], | ||
| table.getColumnSchema(measurementInfo.getMeasurementName(i))); | ||
| } |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable measurementValidator may be null at this access as suggested by this null guard.
| measurementValidator.validate( | |
| i, | |
| measurementInfo.getMeasurementName(i), | |
| measurementInfo.getType(i), | |
| columnCategories[i], | |
| table.getColumnSchema(measurementInfo.getMeasurementName(i))); | |
| } | |
| if (measurementValidator != null) { | |
| measurementValidator.validate( | |
| i, | |
| measurementInfo.getMeasurementName(i), | |
| measurementInfo.getType(i), | |
| columnCategories[i], | |
| table.getColumnSchema(measurementInfo.getMeasurementName(i))); | |
| } |
| i, | ||
| measurementInfo.getMeasurementName(i), | ||
| measurementInfo.getType(i), | ||
| columnCategories[i], |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable columnCategories may be null at this access as suggested by this null guard.
| columnCategories[i], | |
| columnCategories != null ? columnCategories[i] : null, |
| .validateTableHeaderSchema( | ||
| database, tableSchema, context, allowCreateTable, isStrictTagColumn); | ||
| } | ||
|
|
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method overrides Metadata.validateInsertNodeMeasurements; it is advisable to add an Override annotation.
| @Override |
| assertEquals(tableSchema, schema); | ||
| return Optional.of(tableSchema); | ||
| } | ||
|
|
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method overrides TestMetadata.validateInsertNodeMeasurements; it is advisable to add an Override annotation.
| @Override |
| for (int oldIdx = 0; oldIdx < oldLength; oldIdx++) { | ||
| final int newIdx = oldToNewMapping[oldIdx]; | ||
| columns[newIdx] = oldColumns[oldIdx]; | ||
| if (nullBitMaps != null && oldNullBitMaps != null) { |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is useless. oldNullBitMaps cannot be null at this check, since it is guarded by ... != ....
| if (nullBitMaps != null && oldNullBitMaps != null) { | |
| if (nullBitMaps != null) { |
| public void testInsertMultiRowWithNull() throws SQLException { | ||
| try (Connection connection = EnvFactory.getEnv().getConnection(BaseEnv.TABLE_SQL_DIALECT); | ||
| Statement st1 = connection.createStatement()) { | ||
| st1.execute("SET CONFIGURATION enable_auto_create_schema='false'"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reset configurations when the test is done.
| newNullBitMaps[newIdx] = new BitMap(rowCount); | ||
| newNullBitMaps[newIdx].markAll(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not relevant to this PR, but we may add an extension of BitMap like AllMarkedBitMap, which cannot be marked/unmarked and always returns true when a position is tested.
This could somehow reduce the memory footprint, because it will not store any underlying array.
Description
This PR primarily addresses two performance bottlenecks in wide-table scenarios:
StatementAnalyzetakes too long (even longer than writing to MemTable time)I. Metadata Phase Optimization
Problem Background
Through flame graph analysis, it was found that in wide-table scenarios, the CPU time consumption of the
StatementAnalyzestage is too high, becoming the main bottleneck for write performance.Optimization Measures
1. Reduce Redundant
TsTable→TableSchemaConversionProblem Positioning: Frequent schema conversion operations occupy a large amount of CPU
2. Reduce Unnecessary
semanticCheckExecutionProblem Positioning: In wide-table scenarios, complete semantic checks are executed on every write, but most check items are repeated
Optimization Solution: Introduce a check item caching mechanism to skip repeated checks
3. Optimize
TsTable.getColumnSchemaRead Lock Time - Introduce Optimistic LockingProblem Positioning: Severe read lock contention under high concurrency
Optimization Solution:
II. TSFile Table Registration Optimization
Problem Background
The TSFile registration stage has unnecessary
TsTableSchemaconversion, causing additional overhead.Optimization Measures
1. Eliminate Redundant
TsTableSchemaConversionImplementation Points:
TsTableSchemaconversion inTsFileRegisterThis PR has:
for an unfamiliar reader.
for code coverage.
Key changed/added classes (or packages if there are too many classes) in this PR