Changes from all commits · 51 commits
7cc75b0
a
srielau Mar 12, 2026
18544ab
[SPARK-55964] SYSTEM catalog wins over user catalog for conflicting s…
srielau Mar 12, 2026
2b263c0
clean up
srielau Mar 12, 2026
b611990
Unify code
srielau Mar 12, 2026
5805a9c
More cleanup, + cache coherence for the win
srielau Mar 12, 2026
d1d2567
Split out cache coherence to SPARK-55964-cache-coherence branch
srielau Mar 13, 2026
ba7e115
Workaround cache coherence bug
srielau Mar 13, 2026
7e84fe7
All functions in registry are now 3-part names
srielau Mar 13, 2026
0e9c039
Add table function tests, fix SHOW
srielau Mar 13, 2026
1e96b8d
Simplify lookup
srielau Mar 13, 2026
d7631f4
Address review comments, part 2
srielau Mar 13, 2026
4d0ac04
Fix CI testcases
srielau Mar 13, 2026
7641812
Update sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/iden…
srielau Mar 13, 2026
a5f7bdf
Update sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/cata…
srielau Mar 13, 2026
0e27843
Invert default for config
srielau Mar 13, 2026
1b81248
Fix failing CI after three-part name changes
srielau Mar 15, 2026
70ba7e0
Fix bind-in policy
srielau Mar 15, 2026
8dbe31e
More fixes
srielau Mar 15, 2026
8bb6f9d
Merge master into SPARK-55964-system-first; resolve ConfigBuilder/Con…
srielau Mar 15, 2026
f620e47
Fix internal function lookup
srielau Mar 16, 2026
0f30706
Fix linting
srielau Mar 16, 2026
e39f8cf
Small refinements
srielau Mar 16, 2026
4705dc9
Review comments
srielau Mar 16, 2026
fb05abb
Fix scalastyle + more comments
srielau Mar 16, 2026
a685cfc
style
srielau Mar 16, 2026
8518695
Fix style
srielau Mar 16, 2026
c83322f
More comments by wenchen
srielau Mar 16, 2026
cc1358c
Final fix for test failures
srielau Mar 17, 2026
d3f3e36
Cleanup
srielau Mar 17, 2026
628da21
Simplify function registry: enforce 3-part keys, add BuiltinRegistryM…
cloud-fan Mar 17, 2026
4dfce7e
Merge pull request #3 from cloud-fan/SPARK-55964-system-first
srielau Mar 17, 2026
d9438c1
Merge master into SPARK-55964-system-first
srielau Mar 17, 2026
ae3e8be
Fix cache coherence
srielau Mar 17, 2026
057171c
Fix list functions
srielau Mar 17, 2026
2ba64e2
[SPARK-54810] PATH
srielau Mar 17, 2026
228bfa9
Add path test suite and SetPathCommand (fix compile)
srielau Mar 17, 2026
3f52964
Use 3-part function identifiers in SessionStateSuite for registry com…
srielau Mar 17, 2026
adfcc17
Update sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/anal…
srielau Mar 17, 2026
fb4e2d7
Fix bloom filter aggregate function
srielau Mar 17, 2026
2662533
Merge SPARK-55964-system-first to sync with refreshed base
srielau Mar 17, 2026
d7aac62
Fix scalastyle
srielau Mar 17, 2026
96d5fd9
Fixes to keywords
srielau Mar 17, 2026
e208477
Merge master into SPARK-54810-path
srielau Mar 17, 2026
fbe9a40
Merge origin/master into SPARK-55964-system-first
srielau Mar 17, 2026
be97728
Fix merge conflicts
srielau Mar 17, 2026
e5d9c1f
Merge SPARK-55964-system-first to pick up latest parent commits
srielau Mar 17, 2026
3370f7f
Fix merge issues
srielau Mar 18, 2026
467a0a7
olverguadsuite
srielau Mar 18, 2026
2ee9dcb
fix current_*
srielau Mar 18, 2026
9a5879d
Merge branch 'SPARK-55964-system-first' into SPARK-54810-path
srielau Mar 18, 2026
ff72206
Fix as-of regression
srielau Mar 18, 2026
11 changes: 11 additions & 0 deletions common/utils/src/main/resources/error/error-conditions.json
@@ -1929,6 +1929,12 @@
],
"sqlState" : "23505"
},
"DUPLICATE_PATH_ENTRY" : {
"message" : [
"Duplicate path entry <pathEntry>. The session path cannot contain the same catalog.schema more than once (including after expanding shortcuts like DEFAULT_PATH or SYSTEM_PATH)."
],
"sqlState" : "42P10"
},
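The new `DUPLICATE_PATH_ENTRY` condition implies a validation pass over the expanded session path: after shortcuts like `DEFAULT_PATH` or `SYSTEM_PATH` are expanded, each `catalog.schema` pair may appear at most once. A minimal sketch of that check, with hypothetical names that are not Spark's actual API:

```scala
// Hypothetical sketch of the duplicate-path check behind DUPLICATE_PATH_ENTRY:
// normalize each entry case-insensitively, then reject any catalog.schema pair
// that occurs more than once in the expanded path.
object PathValidation {
  def validatePath(entries: Seq[Seq[String]]): Either[String, Seq[Seq[String]]] = {
    // Case-insensitive comparison, mirroring identifier resolution.
    val normalized = entries.map(_.map(_.toLowerCase))
    // diff against distinct leaves exactly the duplicated entries.
    normalized.diff(normalized.distinct).headOption match {
      case Some(dup) => Left(s"DUPLICATE_PATH_ENTRY: ${dup.mkString(".")}")
      case None      => Right(normalized)
    }
  }
}
```

Two entries that differ only in case count as the same `catalog.schema` pair.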
"DUPLICATE_ROUTINE_PARAMETER_ASSIGNMENT" : {
"message" : [
"Call to routine <routineName> is invalid because it includes multiple argument assignments to the same parameter name <parameterName>."
@@ -7615,6 +7621,11 @@
"Cannot have VARIANT type columns in DataFrame which calls set operations (INTERSECT, EXCEPT, etc.), but the type of column <colName> is <dataType>."
]
},
"SET_PATH_VIA_SET" : {
"message" : [
"The session path cannot be set using the SET statement. Use SET PATH = ... instead."
]
},
"SET_PROPERTIES_AND_DBPROPERTIES" : {
"message" : [
"set PROPERTIES and DBPROPERTIES at the same time."
6 changes: 6 additions & 0 deletions docs/sql-ref-ansi-compliance.md
@@ -477,7 +477,10 @@ Below is a list of all the keywords in Spark SQL.
|CROSS|reserved|strict-non-reserved|reserved|
|CUBE|non-reserved|non-reserved|reserved|
|CURRENT|non-reserved|non-reserved|reserved|
|CURRENT_DATABASE|non-reserved|non-reserved|non-reserved|
|CURRENT_DATE|reserved|non-reserved|reserved|
|CURRENT_PATH|non-reserved|non-reserved|reserved|
|CURRENT_SCHEMA|non-reserved|non-reserved|non-reserved|
|CURRENT_TIME|reserved|non-reserved|reserved|
|CURRENT_TIMESTAMP|reserved|non-reserved|reserved|
|CURRENT_USER|reserved|non-reserved|reserved|
@@ -500,6 +503,7 @@ Below is a list of all the keywords in Spark SQL.
|DEFAULT|non-reserved|non-reserved|non-reserved|
|DEFINED|non-reserved|non-reserved|non-reserved|
|DEFINER|non-reserved|non-reserved|non-reserved|
|DEFAULT_PATH|non-reserved|non-reserved|not a keyword|
|DELAY|non-reserved|non-reserved|non-reserved|
|DELETE|non-reserved|non-reserved|reserved|
|DELIMITED|non-reserved|non-reserved|non-reserved|
@@ -667,6 +671,7 @@ Below is a list of all the keywords in Spark SQL.
|PARTITION|non-reserved|non-reserved|reserved|
|PARTITIONED|non-reserved|non-reserved|non-reserved|
|PARTITIONS|non-reserved|non-reserved|non-reserved|
|PATH|non-reserved|non-reserved|not a keyword|
|PERCENT|non-reserved|non-reserved|non-reserved|
|PIVOT|non-reserved|non-reserved|non-reserved|
|PLACING|non-reserved|non-reserved|non-reserved|
@@ -750,6 +755,7 @@ Below is a list of all the keywords in Spark SQL.
|SUBSTR|non-reserved|non-reserved|non-reserved|
|SUBSTRING|non-reserved|non-reserved|non-reserved|
|SYNC|non-reserved|non-reserved|non-reserved|
|SYSTEM_PATH|non-reserved|non-reserved|not a keyword|
|SYSTEM_TIME|non-reserved|non-reserved|non-reserved|
|SYSTEM_VERSION|non-reserved|non-reserved|non-reserved|
|TABLE|reserved|non-reserved|reserved|
@@ -196,7 +196,10 @@ CREATE: 'CREATE';
CROSS: 'CROSS';
CUBE: 'CUBE';
CURRENT: 'CURRENT';
CURRENT_DATABASE: 'CURRENT_DATABASE';
CURRENT_DATE: 'CURRENT_DATE';
CURRENT_PATH: 'CURRENT_PATH';
CURRENT_SCHEMA: 'CURRENT_SCHEMA';
CURRENT_TIME: 'CURRENT_TIME';
CURRENT_TIMESTAMP: 'CURRENT_TIMESTAMP';
CURRENT_USER: 'CURRENT_USER';
@@ -217,6 +220,7 @@ DEC: 'DEC';
DECIMAL: 'DECIMAL';
DECLARE: 'DECLARE';
DEFAULT: 'DEFAULT';
DEFAULT_PATH: 'DEFAULT_PATH';
DEFINED: 'DEFINED';
DEFINER: 'DEFINER';
DELAY: 'DELAY';
@@ -385,6 +389,7 @@ OVERWRITE: 'OVERWRITE';
PARTITION: 'PARTITION';
PARTITIONED: 'PARTITIONED';
PARTITIONS: 'PARTITIONS';
PATH: 'PATH';
PERCENTLIT: 'PERCENT';
PIVOT: 'PIVOT';
PLACING: 'PLACING';
@@ -470,6 +475,7 @@ SUBSTRING: 'SUBSTRING';
SYNC: 'SYNC';
SYSTEM_TIME: 'SYSTEM_TIME';
SYSTEM_VERSION: 'SYSTEM_VERSION';
SYSTEM_PATH: 'SYSTEM_PATH';
TABLE: 'TABLE';
TABLES: 'TABLES';
TABLESAMPLE: 'TABLESAMPLE';
@@ -446,6 +446,7 @@ setResetStatement
| SET TIME ZONE interval #setTimeZone
| SET TIME ZONE timezone #setTimeZone
| SET TIME ZONE .*? #setTimeZone
| SET PATH EQ pathElement (COMMA pathElement)* #setPath
| SET variable assignmentList #setVariable
| SET variable LEFT_PAREN multipartIdentifierList RIGHT_PAREN EQ
LEFT_PAREN query RIGHT_PAREN #setVariable
@@ -457,6 +458,15 @@
| RESET .*? #resetConfiguration
;

pathElement
: DEFAULT_PATH
| SYSTEM_PATH
| PATH
| CURRENT_DATABASE (LEFT_PAREN RIGHT_PAREN)?
| CURRENT_SCHEMA (LEFT_PAREN RIGHT_PAREN)?
| multipartIdentifier
;
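The `pathElement` rule accepts shortcuts (`SYSTEM_PATH`, `CURRENT_SCHEMA`/`CURRENT_DATABASE`, with optional empty parens) alongside explicit multipart identifiers. A sketch of how such elements could expand into concrete `catalog.schema` entries; the helper and its names are illustrative, not the PR's actual implementation:

```scala
// Illustrative expansion of pathElement shortcuts into concrete entries.
// "system.builtin" for SYSTEM_PATH mirrors the builtin namespace used
// elsewhere in this PR; the Expander itself is hypothetical.
object PathElements {
  sealed trait PathElement
  case object SystemPath extends PathElement                 // SYSTEM_PATH
  case object CurrentSchema extends PathElement              // CURRENT_SCHEMA / CURRENT_DATABASE
  final case class Explicit(parts: Seq[String]) extends PathElement

  def expand(
      elems: Seq[PathElement],
      currentCatalog: String,
      currentNamespace: Seq[String]): Seq[Seq[String]] =
    elems.map {
      case SystemPath    => Seq("system", "builtin")
      case CurrentSchema => currentCatalog +: currentNamespace
      case Explicit(p)   => p
    }
}
```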

executeImmediate
: EXECUTE IMMEDIATE queryParam=expression (INTO targetVariable=multipartIdentifierList)? executeImmediateUsing?
;
@@ -1276,7 +1286,7 @@ datetimeUnit
;

primaryExpression
: name=(CURRENT_DATE | CURRENT_TIMESTAMP | CURRENT_USER | USER | SESSION_USER | CURRENT_TIME) #currentLike
: name=(CURRENT_DATABASE | CURRENT_DATE | CURRENT_PATH | CURRENT_SCHEMA | CURRENT_TIME | CURRENT_TIMESTAMP | CURRENT_USER | USER | SESSION_USER) (LEFT_PAREN RIGHT_PAREN)? #currentLike
| name=(TIMESTAMPADD | DATEADD | DATE_ADD) LEFT_PAREN (unit=datetimeUnit | invalidUnit=stringLit) COMMA unitsAmount=valueExpression COMMA timestamp=valueExpression RIGHT_PAREN #timestampadd
| name=(TIMESTAMPDIFF | DATEDIFF | DATE_DIFF | TIMEDIFF) LEFT_PAREN (unit=datetimeUnit | invalidUnit=stringLit) COMMA startTimestamp=valueExpression COMMA endTimestamp=valueExpression RIGHT_PAREN #timestampdiff
| CASE whenClause+ (ELSE elseExpression=expression)? END #searchedCase
@@ -1939,6 +1949,9 @@ ansiNonReserved
| CURSOR
| CUBE
| CURRENT
| CURRENT_DATABASE
| CURRENT_PATH
| CURRENT_SCHEMA
| DATA
| DATABASE
| DATABASES
@@ -1957,6 +1970,7 @@ ansiNonReserved
| DEFAULT
| DEFINED
| DEFINER
| DEFAULT_PATH
| DELAY
| DELETE
| DELIMITED
@@ -2088,6 +2102,7 @@ ansiNonReserved
| PARTITION
| PARTITIONED
| PARTITIONS
| PATH
| PERCENTLIT
| PIVOT
| PLACING
@@ -2163,6 +2178,7 @@ ansiNonReserved
| SUBSTR
| SUBSTRING
| SYNC
| SYSTEM_PATH
| SYSTEM_TIME
| SYSTEM_VERSION
| TABLES
@@ -2316,7 +2332,10 @@ nonReserved
| CUBE
| CURRENT
| CURSOR
| CURRENT_DATABASE
| CURRENT_DATE
| CURRENT_PATH
| CURRENT_SCHEMA
| CURRENT_TIME
| CURRENT_TIMESTAMP
| CURRENT_USER
@@ -2336,6 +2355,7 @@ nonReserved
| DECIMAL
| DECLARE
| DEFAULT
| DEFAULT_PATH
| DEFINED
| DEFINER
| DELAY
@@ -2507,6 +2527,7 @@ nonReserved
| PROCEDURES
| PROPERTIES
| PURGE
| PATH
| QUARTER
| QUERY
| RANGE
@@ -2576,6 +2597,7 @@ nonReserved
| SUBSTR
| SUBSTRING
| SYNC
| SYSTEM_PATH
| SYSTEM_TIME
| SYSTEM_VERSION
| TABLE
@@ -290,13 +290,20 @@ object Analyzer {
*/
class Analyzer(
override val catalogManager: CatalogManager,
private[sql] val sharedRelationCache: RelationCache = RelationCache.empty)
private[sql] val sharedRelationCache: RelationCache = RelationCache.empty,
private[sql] val sessionConf: Option[SQLConf] = None)
extends RuleExecutor[LogicalPlan]
with CheckAnalysis with AliasHelper with SQLConfHelper with ColumnResolutionHelper {

/** Conf to use for path-based resolution and error messages; uses session conf when available. */
private[sql] def resolutionConf: SQLConf = sessionConf.getOrElse(SQLConf.get)

override protected def confForRoutineResolution: SQLConf = resolutionConf

private val v1SessionCatalog: SessionCatalog = catalogManager.v1SessionCatalog
private val relationResolution = new RelationResolution(catalogManager, sharedRelationCache)
private val functionResolution = new FunctionResolution(catalogManager, relationResolution)
private val functionResolution = new FunctionResolution(catalogManager, relationResolution,
resolutionConf)

override protected def validatePlanChanges(
previousPlan: LogicalPlan,
@@ -317,20 +324,22 @@
if (plan.analyzed) {
plan
} else {
def runAnalysis(): LogicalPlan = HybridAnalyzer.fromLegacyAnalyzer(
legacyAnalyzer = this, tracker = tracker).apply(plan)
def runWithConf(): LogicalPlan = sessionConf match {
case Some(c) => SQLConf.withExistingConf(c) { runAnalysis() }
case None => runAnalysis()
}
if (AnalysisContext.get.isDefault) {
AnalysisContext.reset()
try {
AnalysisHelper.markInAnalyzer {
HybridAnalyzer.fromLegacyAnalyzer(legacyAnalyzer = this, tracker = tracker).apply(plan)
}
AnalysisHelper.markInAnalyzer { runWithConf() }
} finally {
AnalysisContext.reset()
}
} else {
AnalysisContext.withNewAnalysisContext {
AnalysisHelper.markInAnalyzer {
HybridAnalyzer.fromLegacyAnalyzer(legacyAnalyzer = this, tracker = tracker).apply(plan)
}
AnalysisHelper.markInAnalyzer { runWithConf() }
}
}
}
@@ -2063,7 +2072,9 @@ class Analyzer(
case FunctionType.NotFound =>
val catalogPath =
catalogManager.currentCatalog.name +: catalogManager.currentNamespace
val searchPath = SQLConf.get.resolutionSearchPath(catalogPath.toSeq)
val pathEntries = resolutionConf.effectivePathEntries
.getOrElse(Seq(catalogPath.toSeq))
val searchPath = resolutionConf.resolutionSearchPath(pathEntries)
.map(_.quoted)
throw QueryCompilationErrors.unresolvedRoutineError(
nameParts,
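The hunk above shows the fallback when no session path is configured: `effectivePathEntries` returning `None` degrades to the current catalog plus namespace. The behavior can be sketched in isolation (method and object names here are illustrative stand-ins for the PR's `SQLConf` members):

```scala
// Sketch of the search-path fallback: an explicitly set session PATH wins;
// otherwise resolution searches only the current catalog.namespace.
object SearchPath {
  def resolutionSearchPath(
      effectivePathEntries: Option[Seq[Seq[String]]],
      currentCatalog: String,
      currentNamespace: Seq[String]): Seq[Seq[String]] = {
    // Default: a single entry built from the session's current context.
    val default = Seq(currentCatalog +: currentNamespace)
    effectivePathEntries.getOrElse(default)
  }
}
```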
@@ -43,6 +43,11 @@ trait CheckAnalysis extends LookupCatalog with QueryErrorsBase with PlanToString

protected def isView(nameParts: Seq[String]): Boolean

protected def conf: org.apache.spark.sql.internal.SQLConf

/** Conf for routine resolution/errors; override in Analyzer to use session conf. */
protected def confForRoutineResolution: SQLConf = conf

import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._

/**
@@ -304,7 +309,9 @@

case u: UnresolvedFunctionName =>
val catalogPath = currentCatalog.name +: catalogManager.currentNamespace
val searchPath = SQLConf.get.resolutionSearchPath(catalogPath.toSeq)
val pathEntries = confForRoutineResolution.effectivePathEntries
.getOrElse(Seq(catalogPath.toSeq))
val searchPath = confForRoutineResolution.resolutionSearchPath(pathEntries)
.map(_.quoted)
throw QueryCompilationErrors.unresolvedRoutineError(
u.multipartIdentifier,
@@ -218,10 +218,15 @@ trait SimpleFunctionRegistryBase[T] extends FunctionRegistryBase[T] with Logging
protected val functionBuilders =
new mutable.HashMap[FunctionIdentifier, (ExpressionInfo, FunctionBuilder)]

// Resolution of the function name is always case insensitive, but the database name
// depends on the caller
private def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)
// All function identifiers must be fully qualified (3-part: catalog.database.funcName).
// Normalization lowercases all parts for case-insensitive lookup.
protected def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
assert(name.database.isDefined && name.catalog.isDefined,
s"Function identifier must be fully qualified (3-part): $name")
new FunctionIdentifier(
name.funcName.toLowerCase(Locale.ROOT),
name.database.map(_.toLowerCase(Locale.ROOT)),
name.catalog.map(_.toLowerCase(Locale.ROOT)))
}

override def registerFunction(
@@ -336,6 +341,16 @@
}
}

/**
* A mixin for builtin-only registries that accepts any identifier and normalizes
* it to the builtin 3-part key (system.builtin.funcName). This allows callers to
* look up builtins by simple name without constructing a fully qualified identifier.
*/
trait BuiltinRegistryMixin[T] extends SimpleFunctionRegistryBase[T] {
override protected def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier =
super.normalizeFuncName(FunctionRegistry.builtinFunctionIdentifier(name.funcName))
}
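The contract established here has two parts: every registry key is a lowercased 3-part identifier, and a builtin-only registry forces any incoming name onto the `system.builtin` namespace before normalizing. A minimal, self-contained model of that pattern (simplified stand-ins, not the real `FunctionRegistry` classes):

```scala
// Simplified model of the 3-part normalization contract and the
// BuiltinRegistryMixin override shown above.
final case class FuncId(catalog: String, database: String, name: String)

class Registry {
  private val funcs = scala.collection.mutable.HashMap.empty[FuncId, String]
  // All three parts lowercased, so lookups are case-insensitive.
  protected def normalize(id: FuncId): FuncId =
    FuncId(id.catalog.toLowerCase, id.database.toLowerCase, id.name.toLowerCase)
  def register(id: FuncId, value: String): Unit = funcs(normalize(id)) = value
  def lookup(id: FuncId): Option[String] = funcs.get(normalize(id))
}

// Mirrors BuiltinRegistryMixin: rewrite any identifier to the builtin
// 3-part key, so callers may use a simple name.
trait BuiltinOnly extends Registry {
  override protected def normalize(id: FuncId): FuncId =
    super.normalize(FuncId("system", "builtin", id.name))
}
```

With the mixin, `register(FuncId("anything", "at_all", "ABS"), …)` and a later lookup of `system.builtin.abs` hit the same key, which is why the PR can replace `internalRegisterFunction` with plain `registerFunction` for builtins.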

object EmptyFunctionRegistry
extends EmptyFunctionRegistryBase[Expression]
with FunctionRegistry {
@@ -347,6 +362,13 @@ object FunctionRegistry {

type FunctionBuilder = Seq[Expression] => Expression

/** Returns the 3-part identifier for a builtin function: system.builtin.funcName. */
private[sql] def builtinFunctionIdentifier(name: String): FunctionIdentifier =
new FunctionIdentifier(
name.toLowerCase(Locale.ROOT),
Some(CatalogManager.BUILTIN_NAMESPACE),
Some(CatalogManager.SYSTEM_CATALOG_NAME))

val FUNC_ALIAS = TreeNodeTag[String]("functionAliasName")

// ==============================================================================================
@@ -832,6 +854,7 @@ object FunctionRegistry {
expression[CurrentDatabase]("current_database"),
expression[CurrentDatabase]("current_schema", true, Some("3.4.0")),
expression[CurrentCatalog]("current_catalog"),
expression[CurrentPath]("current_path", true, Some("4.2.0")),
expression[CurrentUser]("current_user"),
expression[CurrentUser]("user", true, Some("3.4.0")),
expression[CurrentUser]("session_user", true, Some("4.0.0")),
@@ -1003,19 +1026,21 @@ object FunctionRegistry {
expression[ToProtobuf]("to_protobuf")
)

// BuiltinRegistryMixin normalizes any name to the builtin 3-part key (system.builtin.name).
val builtin: SimpleFunctionRegistry = {
val fr = new SimpleFunctionRegistry
val fr = new SimpleFunctionRegistry with BuiltinRegistryMixin[Expression]
expressions.foreach {
case (name, (info, builder)) =>
fr.internalRegisterFunction(FunctionIdentifier(name), info, builder)
fr.registerFunction(FunctionIdentifier(name), info, builder)
}
fr
}

val functionSet: Set[FunctionIdentifier] = builtin.listFunction().toSet

/** Registry for internal functions used by Connect and the Column API. */
private[sql] val internal: SimpleFunctionRegistry = new SimpleFunctionRegistry
private[sql] val internal: SimpleFunctionRegistry =
new SimpleFunctionRegistry with BuiltinRegistryMixin[Expression]

private[spark] def registerInternalExpression[T <: Expression : ClassTag](
name: String,
@@ -1030,7 +1055,8 @@
} else {
builder
}
internal.internalRegisterFunction(FunctionIdentifier(name), info, newBuilder)
// BuiltinRegistryMixin normalizes to the builtin 3-part key (system.builtin.name).
internal.registerFunction(FunctionIdentifier(name), info, newBuilder)
}

registerInternalExpression[Product]("product")
@@ -1301,11 +1327,12 @@ object TableFunctionRegistry {
PythonWorkerLogs.functionBuilder
)

// BuiltinRegistryMixin normalizes any name to the builtin 3-part key (system.builtin.name).
val builtin: SimpleTableFunctionRegistry = {
val fr = new SimpleTableFunctionRegistry
val fr = new SimpleTableFunctionRegistry with BuiltinRegistryMixin[LogicalPlan]
logicalPlans.foreach {
case (name, (info, builder)) =>
fr.internalRegisterFunction(FunctionIdentifier(name), info, builder)
fr.registerFunction(FunctionIdentifier(name), info, builder)
}
fr
}