Skip to content

Conversation

@tobyhede
Copy link
Contributor

Summary

  • Add overloaded variants of jsonb_contains and jsonb_contained_by that accept mixed types (eql_v2_encrypted, jsonb)
  • Enables queries where one operand is an encrypted column and the other is plain JSONB
  • Comprehensive test coverage for all operator/type combinations

New Functions

Function Parameters
jsonb_contains (eql_v2_encrypted, jsonb)
jsonb_contains (jsonb, eql_v2_encrypted)
jsonb_contained_by (eql_v2_encrypted, jsonb)
jsonb_contained_by (jsonb, eql_v2_encrypted)

Test Coverage Matrix

14 focused tests covering all operator/type combinations:

Operator LHS RHS Test
jsonb_contains encrypted jsonb_param contains_encrypted_jsonb_param
jsonb_contains encrypted encrypted contains_encrypted_encrypted
jsonb_contains encrypted encrypted_param contains_encrypted_encrypted_param
jsonb_contains jsonb_param encrypted contains_jsonb_param_encrypted
jsonb_contains encrypted_param encrypted contains_encrypted_param_encrypted
jsonb_contained_by encrypted jsonb_param contained_by_encrypted_jsonb_param
jsonb_contained_by encrypted encrypted contained_by_encrypted_encrypted
jsonb_contained_by encrypted encrypted_param contained_by_encrypted_encrypted_param
jsonb_contained_by jsonb_param encrypted contained_by_jsonb_param_encrypted
jsonb_contained_by encrypted_param encrypted contained_by_encrypted_param_encrypted

Plus 2 sanity tests and 2 helper validation tests.

Test plan

  • Build succeeds
  • All SQLx tests pass (14 containment + full suite)
  • Lint passes (cargo fmt, clippy)

Add overloaded variants of jsonb_contains and jsonb_contained_by
that accept mixed types (eql_v2_encrypted, jsonb) to support queries
where one operand is an encrypted column and the other is plain JSONB.

New functions:
- jsonb_contains(eql_v2_encrypted, jsonb)
- jsonb_contains(jsonb, eql_v2_encrypted)
- jsonb_contained_by(eql_v2_encrypted, jsonb)
- jsonb_contained_by(jsonb, eql_v2_encrypted)
Update get_ste_vec_encrypted and add get_ste_vec_sv_element to return
serde_json::Value instead of String. This:
- Aligns Postgres jsonb type with Rust serde_json::Value
- Avoids double-quoting issues with embedded apostrophes
- Allows programmatic JSON inspection without extra parsing
- Callers use .to_string() when a literal is needed

Updated existing tests to use ::jsonb cast instead of ::eql_v2_encrypted
since the helper now returns just the .data field (JSONB portion).
These tests expose a bug in jsonb_contains - partial containment
(searching for a single sv element) doesn't work correctly.
Tests verify:
- Single sv element should be found in its parent row
- GIN index should be used for partial containment queries
- Element should match exactly one row
Tests verify that containment works with prepared statements using
sqlx .bind() in addition to literal string substitution.
Tests verify jsonb_contains(encrypted, jsonb) and
jsonb_contained_by(jsonb, encrypted) work with partial
containment (single sv elements).
Replace with simpler literal-based tests that don't use subqueries.
The new tests are clearer and test with both literal values and
parameterized statements.
Consolidate from 17 tests to 4 base tests. Removes 13 redundant
jsonb_contains(encrypted, jsonb) variations. Next commit will
add focused tests covering all 6 operator/type combinations.
Enables encrypted-to-encrypted containment tests by providing
a helper to fetch two distinct encrypted values from the same table.
Final cleanup: 10 focused tests covering all 6 operator/type
combinations plus 2 sanity tests and 2 helper validation tests.

Coverage matrix complete:
- jsonb_contains(encrypted, jsonb_param) ✓
- jsonb_contains(encrypted, encrypted) ✓
- jsonb_contains(jsonb_param, encrypted) ✓
- jsonb_contained_by(encrypted, jsonb_param) ✓
- jsonb_contained_by(encrypted, encrypted) ✓
- jsonb_contained_by(jsonb_param, encrypted) ✓
- Remove unnecessary .to_string() calls in format args
- Remove redundant serde_json import
Adds enum types and macro declaration for containment matrix tests.
ContainmentTestCase::run method not yet implemented - test will not compile.
This is the RED phase of TDD.
… GREEN)

Adds complete implementation of ContainmentTestCase with:
- build_query: SQL generation with proper parameter placeholders
- get_param_value: Parameter value selection based on argument type
- execute_with_bindings: Dynamic query execution with type-based bindings
- verify_index_usage: EXPLAIN-based index verification
- run: Orchestrates full test execution

All methods include .with_context() for improved error diagnostics.
First macro-generated test now passes.
Adds tests for:
- macro_contains_encrypted_encrypted_param (column contains full encrypted param)
- macro_contains_encrypted_jsonb_param (column contains sv element param)
- macro_contains_encrypted_param_encrypted (encrypted param contains column)
Adds tests for:
- macro_contained_by_encrypted_encrypted_param (column contained by full param)
- macro_contained_by_jsonb_param_encrypted (sv element contained by column)
- macro_contained_by_encrypted_param_encrypted (encrypted param contained by column)
…ment

Adds run_negative method and two negative test cases:
- macro_contains_jsonb_param_encrypted_no_match: sv_element cannot contain full value
- macro_contained_by_encrypted_jsonb_param_no_match: full value not contained in sv_element

These tests document the expected asymmetric behavior where partial values (sv_element)
cannot contain full encrypted values and vice versa.
@coderdan
Copy link
Contributor

Unfortunately, I don't think this implementation is correct. The problem is that containment operations on SteVec must only consider the selectors and terms and explicitly ignore the ciphertext values. The JsonIndexer in CipherStash client differentiates between a full SteVec and an SteQueryVec (just selectors and terms) for this reason.

Example

Let's say you have the below SteVec record in your table (as a cipher cell):

{
  "sv": [
    {"s":"a","b3":"1234","c":"aaa"},
    {"s":"a","b3":"1234","c":"bbb"}
  ]
}

Now, when you generate a query, if the query is also an (SteVec and not an SteQueryVec) then the values will never match. This is because ciphertexts are fully randomized and the same message encrypted twice will result in different values.

Say you have a query like (note that the c field is different):

{
  "sv": [
    {"s":"a","b3":"1234","c":"ccc"}
  ]
}

Then this query will not match:

select '[{"s":"a","b3":"1234","c":"aaa"}, {"s":"a","b3":"1234","c":"bbb"}]'::jsonb @> '[{"s":"a","b3":"1234","c":"ccc"}]'::jsonb;

However, if the query term is an SteQueryVec (i.e. no c field), then the following will work correctly:

select '[{"s":"a","b3":"1234","c":"aaa"}, {"s":"a","b3":"1234","c":"bbb"}]'::jsonb @> '[{"s":"a","b3":"1234"}]'::jsonb;

I think the reason the tests are passing in this PR is because the query terms are being taken from a value that was already inserted into the test table so the ciphertext will be the same. Its an artificial scenario because in practice the generated ciphertext will always be random.

Copy link
Contributor

@coderdan coderdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a problem in the implementation that we need to resolve. See longer comment.

@coderdan
Copy link
Contributor

Related: it took me a moment to grok what was going on. It might be helpful to use a similar naming convention to the JsonIndexer? For example, instead of jsonb_array, perhaps we could call it ste_vec_entries?

Rename jsonb_array to jsonb_array_from_array_elements for full element extraction.
Create new jsonb_array function that extracts only deterministic fields (s, b3, hm, ocv, ocf)
for correct containment comparison using PostgreSQL's native @> operator.

The previous implementation included non-deterministic ciphertext which caused
containment queries to fail due to differing encrypted values.
Extract macro-generated containment tests from containment_with_index_tests.rs
into dedicated jsonb_containment_uses_index_tests.rs module.

Original hand-written tests remain in containment_with_index_tests.rs,
while macro infrastructure and generated tests move to new module.
@tobyhede
Copy link
Contributor Author

@coderdan Was already working on the fix.
jsonb_array is the user facing name, ste_vec is an internal implementation detail.

@tobyhede tobyhede requested a review from coderdan December 12, 2025 22:11

let id: i64 = 1;

let encrypted = get_ste_vec_encrypted(pool, table, id as i32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be better to generate a fresh ste_vec each time to increase confidence that even with randomized ciphertexts, the test still passes every time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @tobyhede that this will happen in a follow-up PR once cipherstash-client is integrated into the tests.

@tobyhede tobyhede merged commit aad104f into main Dec 13, 2025
4 checks passed
@tobyhede tobyhede deleted the jsonb-contains-variants-fix branch December 13, 2025 03:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants