Skip to content

Conversation

manikBS
Copy link

@manikBS manikBS commented Sep 26, 2025

Summary
This PR adds a new scalar function jsonExtractScalar to enable flexible extraction of scalar values from JSON fields in Pinot queries.

Key Changes

  • Introduced jsonExtractScalar scalar function. Supported scalar data types: STRING, INT, LONG, DOUBLE, BOOLEAN, and newly added TIMESTAMP

Testing

  • Added test cases in JsonFunctionsTest

@manikBS
Copy link
Author

manikBS commented Sep 26, 2025

This would also solve #16495

@manikBS manikBS marked this pull request as draft September 26, 2025 16:54
@manikBS manikBS marked this pull request as ready for review September 26, 2025 16:54
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Let's make sure the behavior for the scalar function is the same as JsonExtractScalarTransformFunction

@ScalarFunction
public static Object jsonExtractScalar(Object jsonInput, String jsonPath, String dataType, Object defaultValue) {
try {
switch (dataType.toUpperCase()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several data types are not supported. Could you please make sure all types supported in JsonExtractScalarTransformFunction are also supported here?

Copy link
Author

@manikBS manikBS Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to make it exactly similar to JsonExtractScalarTransformFunction. Can you please verify?

@manikBS
Copy link
Author

manikBS commented Sep 27, 2025

Hi @Jackie-Jiang,
I noticed that in JsonFunctions, all the scalar functions ignore exceptions when the JSON object or JSONPath is invalid and default value is null, and instead return the provided default value. However, in JsonExtractScalarTransformFunction, an exception is thrown in such cases.

Should both follow a consistent behavior? If so, which approach would be preferable, returning the default value (like JsonFunctions) or throwing an exception?

@Jackie-Jiang
Copy link
Contributor

@manikBS Good finding! Yeah we should make the behavior consistent, and I'd suggest following the behavior of JsonExtractScalarTransformFunction given it is much more commonly used

@manikBS
Copy link
Author

manikBS commented Sep 28, 2025

@Jackie-Jiang
I have kept the existing scalar functions behavior intact as it could effect the existing injestion flows. For JsonExtractScalar, the behavior (same as JsonExtractScalarTransformFunction) is as follows:

Case: JSON object is malformed, JSON path is malformed, or JSON path does not exist

  • For single value, If default value is null -> throw error
  • For single value, If default value is not provided -> throw error
  • For single value, If default value is not null -> return default value
  • For multi-value -> return empty array

Case: JSON object and JSON path is correct

  • For single value -> return the result as is
  • For multi-value -> if there are nulls. for example {1, 2, 3, null, 4} return {1, 2, 3, defaultvalue, 4}

can you please review it?

@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 47.09302% with 91 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.27%. Comparing base (b2760be) to head (13350ed).
⚠️ Report is 45 commits behind head on master.

Files with missing lines Patch % Lines
...he/pinot/common/function/scalar/JsonFunctions.java 47.09% 79 Missing and 12 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16910      +/-   ##
============================================
- Coverage     63.55%   63.27%   -0.29%     
  Complexity     1411     1411              
============================================
  Files          3073     3071       -2     
  Lines        180687   180504     -183     
  Branches      27624    27615       -9     
============================================
- Hits         114837   114206     -631     
- Misses        57003    57546     +543     
+ Partials       8847     8752      -95     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.24% <47.09%> (-0.25%) ⬇️
java-21 63.23% <47.09%> (-0.28%) ⬇️
temurin 63.27% <47.09%> (-0.29%) ⬇️
unittests 63.26% <47.09%> (-0.29%) ⬇️
unittests1 56.13% <47.09%> (-0.34%) ⬇️
unittests2 33.64% <0.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants