-
Notifications
You must be signed in to change notification settings - Fork 0
feat(ste_vec): add jsonb parameter variants for containment functions #161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add overloaded variants of jsonb_contains and jsonb_contained_by that accept mixed types (eql_v2_encrypted, jsonb) to support queries where one operand is an encrypted column and the other is plain JSONB. New functions: - jsonb_contains(eql_v2_encrypted, jsonb) - jsonb_contains(jsonb, eql_v2_encrypted) - jsonb_contained_by(eql_v2_encrypted, jsonb) - jsonb_contained_by(jsonb, eql_v2_encrypted)
Update get_ste_vec_encrypted and add get_ste_vec_sv_element to return serde_json::Value instead of String. This: - Aligns Postgres jsonb type with Rust serde_json::Value - Avoids double-quoting issues with embedded apostrophes - Allows programmatic JSON inspection without extra parsing - Callers use .to_string() when a literal is needed Updated existing tests to use ::jsonb cast instead of ::eql_v2_encrypted since the helper now returns just the .data field (JSONB portion).
These tests expose a bug in jsonb_contains - partial containment (searching for a single sv element) doesn't work correctly. Tests verify: - Single sv element should be found in its parent row - GIN index should be used for partial containment queries - Element should match exactly one row
Tests verify that containment works with prepared statements using sqlx .bind() in addition to literal string substitution.
Tests verify jsonb_contains(encrypted, jsonb) and jsonb_contained_by(jsonb, encrypted) work with partial containment (single sv elements).
Replace with simpler literal-based tests that don't use subqueries. The new tests are clearer and test with both literal values and parameterized statements.
Consolidate from 17 tests to 4 base tests. Removes 13 redundant jsonb_contains(encrypted, jsonb) variations. Next commit will add focused tests covering all 6 operator/type combinations.
Enables encrypted-to-encrypted containment tests by providing a helper to fetch two distinct encrypted values from the same table.
Final cleanup: 10 focused tests covering all 6 operator/type combinations plus 2 sanity tests and 2 helper validation tests. Coverage matrix complete: - jsonb_contains(encrypted, jsonb_param) ✓ - jsonb_contains(encrypted, encrypted) ✓ - jsonb_contains(jsonb_param, encrypted) ✓ - jsonb_contained_by(encrypted, jsonb_param) ✓ - jsonb_contained_by(encrypted, encrypted) ✓ - jsonb_contained_by(jsonb_param, encrypted) ✓
- Remove unnecessary .to_string() calls in format args - Remove redundant serde_json import
Adds enum types and macro declaration for containment matrix tests. ContainmentTestCase::run method not yet implemented - test will not compile. This is the RED phase of TDD.
… GREEN) Adds complete implementation of ContainmentTestCase with: - build_query: SQL generation with proper parameter placeholders - get_param_value: Parameter value selection based on argument type - execute_with_bindings: Dynamic query execution with type-based bindings - verify_index_usage: EXPLAIN-based index verification - run: Orchestrates full test execution All methods include .with_context() for improved error diagnostics. First macro-generated test now passes.
Adds tests for: - macro_contains_encrypted_encrypted_param (column contains full encrypted param) - macro_contains_encrypted_jsonb_param (column contains sv element param) - macro_contains_encrypted_param_encrypted (encrypted param contains column)
Adds tests for: - macro_contained_by_encrypted_encrypted_param (column contained by full param) - macro_contained_by_jsonb_param_encrypted (sv element contained by column) - macro_contained_by_encrypted_param_encrypted (encrypted param contained by column)
…ment Adds run_negative method and two negative test cases: - macro_contains_jsonb_param_encrypted_no_match: sv_element cannot contain full value - macro_contained_by_encrypted_jsonb_param_no_match: full value not contained in sv_element These tests document the expected asymmetric behavior where partial values (sv_element) cannot contain full encrypted values and vice versa.
|
Unfortunately, I don't think this implementation is correct. The problem is that containment operations on SteVec must only consider the selectors and terms and explicitly ignore the ciphertext values. The JsonIndexer in CipherStash client differentiates between a full SteVec and an ExampleLet's say you have the below SteVec record in your table (as a cipher cell): {
"sv": [
{"s":"a","b3":"1234","c":"aaa"},
{"s":"a","b3":"1234","c":"bbb"}
]
}Now, when you generate a query, if the query is also an (SteVec and not an SteQueryVec) then the values will never match. This is because ciphertexts are fully randomized and the same message encrypted twice will result in different values. Say you have a query like (note that the {
"sv": [
{"s":"a","b3":"1234","c":"ccc"}
]
}Then this query will not match: select '[{"s":"a","b3":"1234","c":"aaa"}, {"s":"a","b3":"1234","c":"bbb"}]'::jsonb @> '[{"s":"a","b3":"1234","c":"ccc"}]'::jsonb;However, if the query term is an select '[{"s":"a","b3":"1234","c":"aaa"}, {"s":"a","b3":"1234","c":"bbb"}]'::jsonb @> '[{"s":"a","b3":"1234"}]'::jsonb;I think the reason the tests are passing in this PR is because the query terms are being taken from a value that was already inserted into the test table so the ciphertext will be the same. Its an artificial scenario because in practice the generated ciphertext will always be random. |
coderdan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a problem in the implementation that we need to resolve. See longer comment.
|
Related: it took me a moment to grok what was going on. It might be helpful to use a similar naming convention to the JsonIndexer? For example, instead of |
Rename jsonb_array to jsonb_array_from_array_elements for full element extraction. Create new jsonb_array function that extracts only deterministic fields (s, b3, hm, ocv, ocf) for correct containment comparison using PostgreSQL's native @> operator. The previous implementation included non-deterministic ciphertext which caused containment queries to fail due to differing encrypted values.
Extract macro-generated containment tests from containment_with_index_tests.rs into dedicated jsonb_containment_uses_index_tests.rs module. Original hand-written tests remain in containment_with_index_tests.rs, while macro infrastructure and generated tests move to new module.
|
@coderdan Was already working on the fix. |
|
|
||
| let id: i64 = 1; | ||
|
|
||
| let encrypted = get_ste_vec_encrypted(pool, table, id as i32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be better to generate a fresh ste_vec each time to increase confidence that even with randomized ciphertexts, the test still passes every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @tobyhede that this will happen in a follow-up PR once cipherstash-client is integrated into the tests.
Summary
jsonb_containsandjsonb_contained_bythat accept mixed types (eql_v2_encrypted,jsonb)New Functions
jsonb_contains(eql_v2_encrypted, jsonb)jsonb_contains(jsonb, eql_v2_encrypted)jsonb_contained_by(eql_v2_encrypted, jsonb)jsonb_contained_by(jsonb, eql_v2_encrypted)Test Coverage Matrix
14 focused tests covering all operator/type combinations:
jsonb_containscontains_encrypted_jsonb_paramjsonb_containscontains_encrypted_encryptedjsonb_containscontains_encrypted_encrypted_paramjsonb_containscontains_jsonb_param_encryptedjsonb_containscontains_encrypted_param_encryptedjsonb_contained_bycontained_by_encrypted_jsonb_paramjsonb_contained_bycontained_by_encrypted_encryptedjsonb_contained_bycontained_by_encrypted_encrypted_paramjsonb_contained_bycontained_by_jsonb_param_encryptedjsonb_contained_bycontained_by_encrypted_param_encryptedPlus 2 sanity tests and 2 helper validation tests.
Test plan