Commit a83acf1

Remove feature flag to enable binary doc value compression (#138524)
Binary doc value compression was added behind a feature flag in #137139. This PR removes the feature flag so that the feature is enabled by default.
1 parent f7d5fc8 commit a83acf1

5 files changed: +20 -13 lines changed


docs/changelog/138524.yaml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+pr: 138524
+summary: Remove feature flag to enable binary doc value compression
+area: Mapping
+type: feature
+issues: []
+highlight:
+  title: Remove feature flag to enable binary doc value compression
+  body: |-
+    Add compression for binary doc values using Zstd and blocks with a variable number of values.
+
+    Block-wise LZ4 compression was previously added to Lucene in LUCENE-9211 and removed in LUCENE-9378 due to query performance issues. This approach stored a constant number of values per block (specifically 32 values). This made it easy to map a given value index (e.g., docId) to the block containing it by doing blockId = docId / 32.
+    Unfortunately, if values are very large, we must still have exactly 32 values per block, and (de)compressing a block could cause very high memory usage. As a result, we had to keep the number of values small, meaning that in the average case, a block was much smaller than ideal.
+    To overcome the issues of blocks with a constant number of values, this PR adds block-wise compression with a variable number of values per block. It stores a minimum of 1 document per block and stops adding values when the size of a block exceeds a threshold or the number of values exceeds a threshold.
+    Like the previous version, it stores an array of addresses for the start of each block. Additionally, it stores a parallel array with the docId at the start of each block. When looking up a given docId, if it is not in the current block, we binary search the array of docId starts to find the blockId containing the value. We then look up the address of the block. After this, decompression works very similarly to the code from LUCENE-9211; the main difference being that Zstd(1) is used instead of LZ4.
+
+  notable: true
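
The block layout described in the changelog body can be sketched roughly as follows. This is a minimal illustration, not the Elasticsearch implementation: the class name, method names, and both threshold values are hypothetical, compression and the block address array are left out, and the value-count check closes a block once it reaches the cap. It only shows how variable-sized blocks might be cut and how a docId could be mapped back to its block with a binary search over the per-block docId starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names and thresholds are hypothetical, not Elasticsearch code.
final class VariableBlockSketch {
    static final int MAX_BLOCK_BYTES = 64;      // assumed size threshold
    static final int MAX_VALUES_PER_BLOCK = 4;  // assumed value-count threshold

    // Fixed-size scheme described for the earlier approach: 32 values per block.
    static int fixedBlockId(int docId) {
        return docId / 32;
    }

    // Cuts values (indexed by docId) into blocks and returns the first docId of each block.
    // Every block holds at least one value; a block is closed once its byte size
    // exceeds MAX_BLOCK_BYTES or its value count reaches MAX_VALUES_PER_BLOCK.
    static int[] buildBlockDocStarts(byte[][] values) {
        List<Integer> starts = new ArrayList<>();
        int bytesInBlock = 0;
        int valuesInBlock = 0;
        for (int docId = 0; docId < values.length; docId++) {
            if (valuesInBlock == 0) {
                starts.add(docId); // this docId opens a new block
            }
            bytesInBlock += values[docId].length;
            valuesInBlock++;
            if (bytesInBlock > MAX_BLOCK_BYTES || valuesInBlock >= MAX_VALUES_PER_BLOCK) {
                bytesInBlock = 0;
                valuesInBlock = 0;
            }
        }
        return starts.stream().mapToInt(Integer::intValue).toArray();
    }

    // Finds the block containing docId: the last block whose start is <= docId.
    static int findBlock(int[] blockDocStarts, int docId) {
        int idx = Arrays.binarySearch(blockDocStarts, docId);
        // When not found, binarySearch returns -(insertionPoint) - 1;
        // the containing block is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    public static void main(String[] args) {
        byte[][] values = {
            new byte[10], new byte[10], new byte[100],
            new byte[5], new byte[5], new byte[5], new byte[5],
            new byte[5]
        };
        int[] starts = buildBlockDocStarts(values);
        System.out.println(Arrays.toString(starts)); // [0, 3, 7]
        System.out.println(findBlock(starts, 5));    // 1
    }
}
```

In this sample, the 100-byte value at docId 2 pushes the first block over the size threshold, so that block ends after three values, while the second block ends on the value-count cap. A fixed 32-value scheme would have placed all eight values in block 0 regardless of their size.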

server/src/main/java/org/elasticsearch/index/codec/tsdb/es819/ES819TSDBDocValuesFormat.java

Lines changed: 1 addition & 4 deletions
@@ -13,7 +13,6 @@
 import org.apache.lucene.codecs.DocValuesProducer;
 import org.apache.lucene.index.SegmentReadState;
 import org.apache.lucene.index.SegmentWriteState;
-import org.elasticsearch.common.util.FeatureFlag;
 import org.elasticsearch.core.SuppressForbidden;
 import org.elasticsearch.index.codec.tsdb.BinaryDVCompressionMode;
 
@@ -37,8 +36,6 @@
  */
 public class ES819TSDBDocValuesFormat extends org.apache.lucene.codecs.DocValuesFormat {
 
-    public static final boolean BINARY_DV_COMPRESSION_FEATURE_FLAG = new FeatureFlag("binary_dv_compression").isEnabled();
-
     static final int NUMERIC_BLOCK_SHIFT = 7;
     public static final int NUMERIC_BLOCK_SIZE = 1 << NUMERIC_BLOCK_SHIFT;
     static final int NUMERIC_BLOCK_MASK = NUMERIC_BLOCK_SIZE - 1;
@@ -145,7 +142,7 @@ public ES819TSDBDocValuesFormat() {
             DEFAULT_SKIP_INDEX_INTERVAL_SIZE,
             ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL,
             OPTIMIZED_MERGE_ENABLE_DEFAULT,
-            BINARY_DV_COMPRESSION_FEATURE_FLAG ? BinaryDVCompressionMode.COMPRESSED_ZSTD_LEVEL_1 : BinaryDVCompressionMode.NO_COMPRESS,
+            BinaryDVCompressionMode.COMPRESSED_ZSTD_LEVEL_1,
             true
         );
     }
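
The behavioural effect of this hunk can be illustrated with a small stand-alone sketch. It is a simplification under stated assumptions: `flagEnabled` is a hypothetical helper that only reads the system property named in the test-cluster `FeatureFlag` enum (`es.binary_dv_compression_feature_flag_enabled`), whereas the real `org.elasticsearch.common.util.FeatureFlag` class may apply additional rules, and the production code read the flag once into a static constant. The `Mode` enum here is a local stand-in for `BinaryDVCompressionMode`.

```java
// Illustrative sketch only; flagEnabled and Mode are hypothetical stand-ins.
final class FlagGateSketch {
    enum Mode { NO_COMPRESS, COMPRESSED_ZSTD_LEVEL_1 }

    // Assumption: a flag named X is driven by the system property es.X_feature_flag_enabled.
    static boolean flagEnabled(String name) {
        return Boolean.parseBoolean(System.getProperty("es." + name + "_feature_flag_enabled", "false"));
    }

    // Before this commit: the default mode depended on the feature flag.
    static Mode defaultModeBefore() {
        return flagEnabled("binary_dv_compression") ? Mode.COMPRESSED_ZSTD_LEVEL_1 : Mode.NO_COMPRESS;
    }

    // After this commit: compression is always the default.
    static Mode defaultModeAfter() {
        return Mode.COMPRESSED_ZSTD_LEVEL_1;
    }

    public static void main(String[] args) {
        System.out.println("before: " + defaultModeBefore());
        System.out.println("after:  " + defaultModeAfter());
    }
}
```

Under these assumptions, running the sketch without the system property set shows the two defaults diverging, which is the case the removed `else` branch in the test below used to cover.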

server/src/test/java/org/elasticsearch/index/codec/tsdb/es819/ES819TSDBDocValuesFormatTests.java

Lines changed: 2 additions & 6 deletions
@@ -121,13 +121,9 @@ protected Codec getCodec() {
         return codec;
     }
 
-    public void testBinaryCompressionFeatureFlag() {
+    public void testBinaryCompressionEnabled() {
         ES819TSDBDocValuesFormat docValueFormat = new ES819TSDBDocValuesFormat();
-        if (ES819TSDBDocValuesFormat.BINARY_DV_COMPRESSION_FEATURE_FLAG) {
-            assertThat(docValueFormat.binaryDVCompressionMode, equalTo(BinaryDVCompressionMode.COMPRESSED_ZSTD_LEVEL_1));
-        } else {
-            assertThat(docValueFormat.binaryDVCompressionMode, equalTo(BinaryDVCompressionMode.NO_COMPRESS));
-        }
+        assertThat(docValueFormat.binaryDVCompressionMode, equalTo(BinaryDVCompressionMode.COMPRESSED_ZSTD_LEVEL_1));
     }
 
     public void testBlockWiseBinary() throws Exception {

test/test-clusters/src/main/java/org/elasticsearch/test/cluster/FeatureFlag.java

Lines changed: 1 addition & 2 deletions
@@ -27,8 +27,7 @@ public enum FeatureFlag {
     ),
     RANDOM_SAMPLING("es.random_sampling_feature_flag_enabled=true", Version.fromString("9.2.0"), null),
     INFERENCE_API_CCM("es.inference_api_ccm_feature_flag_enabled=true", Version.fromString("9.3.0"), null),
-    GENERIC_VECTOR_FORMAT("es.generic_vector_format_feature_flag_enabled=true", Version.fromString("9.3.0"), null),
-    BINARY_DOC_VALUE_COMPRESSION("es.binary_dv_compression_feature_flag_enabled=true", Version.fromString("9.3.0"), null);
+    GENERIC_VECTOR_FORMAT("es.generic_vector_format_feature_flag_enabled=true", Version.fromString("9.3.0"), null);
 
     public final String systemProperty;
     public final Version from;

x-pack/plugin/logsdb/src/yamlRestTest/java/org/elasticsearch/xpack/logsdb/LogsdbTestSuiteIT.java

Lines changed: 0 additions & 1 deletion
@@ -37,7 +37,6 @@ public class LogsdbTestSuiteIT extends ESClientYamlSuiteTestCase {
         .setting("xpack.security.autoconfiguration.enabled", "false")
         .setting("xpack.license.self_generated.type", "trial")
         .feature(FeatureFlag.DOC_VALUES_SKIPPER)
-        .feature(FeatureFlag.BINARY_DOC_VALUE_COMPRESSION)
         .build();
 
     public LogsdbTestSuiteIT(@Name("yaml") ClientYamlTestCandidate testCandidate) {
