[SPARK-56029] Optional sub error conditions#54861
[SPARK-56029] Optional sub error conditions#54861srielau wants to merge 7 commits intoapache:masterfrom
Conversation
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala
Outdated
Show resolved
Hide resolved
e94d5df to
e328f1f
Compare
| val errorInfo = errorInfoMap.getOrElse( | ||
| mainErrorClass, | ||
| throw SparkException.internalError(s"Cannot find main error class '$errorClass'")) | ||
| assert(errorInfo.subClass.isDefined == subErrorClass.isDefined) |
There was a problem hiding this comment.
while I understand the motivation, I think this might be too lenient if left just like this. it would be nice to have an "opt-in" mechanism, so we have some kind of an indication that this is a desired behavior for a specific error condition. otherwise, we are very silently converting all error conditions to a new behavior - over time this might lead to subclasses being effectively unused, since the path of least resistance is always the main condition.
I think it's not too complex to mitigate though. We can make an opt-in mechanism in error-conditions.json, something like:
"TABLE_OR_VIEW_NOT_FOUND": {
"message": ["..."],
"subClassOptional": true,
"subClass": {
"PATH": { "message": ["..."] }
}
}
And then use subClassOptional in ErrorClassesJSONReader to relax the condition when the value is set to true.
There was a problem hiding this comment.
I'm open to to that. Let's see what other think. Note that this is NOT what we do for SQL Scripting handlers though.
There was a problem hiding this comment.
I'm not sure how is this directly related to handlers? Here, we are changing how errors are raised, not handled. Regarding handler finding mechanism, you already explained that it would handle the new behavior, right?
If you are talking about not having SIGNAL in SQL Scripting, that's a gap that needs to be addressed. Once we add SIGNAL, I guess it should follow the same behavior as raise_error does.
But that's a separate matter - is there any special behavior that we should be aware of when supporting SIGNAL?
There was a problem hiding this comment.
"subClassOptional": true looks wrong to me. We should always allow the error raising side to skip the sub error condition when not needed (we should still check error parameters to be an exact match to avoid mistakes).
while for test, we should still do extract match by default (if error was thrown with sub error condition, checkError should use MAIN.SUB. if error was not thrown with sub error condition, checkError should use MAIN). I'm OK to special case some widely tested errors like TABLE_OR_VIEW_NOT_FOUND: when it starts to have error conditions, we can avoid touching many tests. So checkError should have a new "strict" flag, which by default is true (except for the special cases). Tests can override it if needed, and it's a test behavior, not a property of the error (allow optional sub error condition or not).
There was a problem hiding this comment.
I can't do that. If I need to change code to supend the proper check, I may as well change the code to do the proper check.
There was a problem hiding this comment.
I do agree that I should always be able to raise an error without a subcondition. n excpliict default should not be necessary.
There was a problem hiding this comment.

What changes were proposed in this pull request?
We make raising of error sub conditions and checking for them optional.
SPARK-56029-optional-error-subconditions.plan.md
Why are the changes needed?
This allows conversion of error conditions without subconditions to ones with possible without churning and causing incompatible changes
Does this PR introduce any user-facing change?
Yes, checkError() is more tolerant
How was this patch tested?
Added error condiiton unit tests
Was this patch authored or co-authored using generative AI tooling?
Claude Opus 4.6 high