Fix PHP find_symbol: route by file extension, handle flat SymbolInformation#1097
Fix PHP find_symbol: route by file extension, handle flat SymbolInformation#1097EarthmanWeb wants to merge 6 commits intooraios:mainfrom
Conversation
…mation Three related fixes for PHP symbol resolution: 1. ls_manager.get_language_server(): route to language server by file extension (via get_source_fn_matcher) instead of is_ignored_path, so PHP files are correctly dispatched to Intelephense. 2. ls.get_symbols_overview(): when a file is explicitly requested but matches an ignore pattern, warn and proceed instead of returning an empty list. 3. symbol.iter_name_path_components_reversed(): add containerName fallback for flat SymbolInformation responses (Intelephense returns these for some PHP files), enabling name-path patterns like "ClassName/method" to match.
src/serena/symbol.py
Outdated
| has_yielded_ancestor = False | ||
| for ancestor in self.iter_ancestors(up_to_symbol_kind=SymbolKind.File): | ||
| yield NamePathComponent(ancestor.name, ancestor.overload_idx) | ||
| has_yielded_ancestor = True | ||
| if not has_yielded_ancestor: | ||
| # Fallback for flat SymbolInformation (e.g. Intelephense returns SymbolInformation[] | ||
| # instead of DocumentSymbol[] for some PHP files). In that case symbols have no | ||
| # parent-child links, but may carry a containerName that identifies the enclosing | ||
| # class or function. Yield it so that name-path patterns like "ClassName/method" | ||
| # still match correctly. | ||
| container_name = self.symbol_root.get("containerName") | ||
| if container_name: | ||
| yield NamePathComponent(container_name, None) |
There was a problem hiding this comment.
This has to be solved in the language server, not here.
There was a problem hiding this comment.
By that I mean the SolidLanguageServer implementation, not Intelephense itself.
There was a problem hiding this comment.
Agreed — moved the fix to Intelephense._request_document_symbols in ce6ddf3. The override detects flat SymbolInformation[] (by top-level "location" key) and calls a new _reconstruct_document_symbols static method that rebuilds the proper DocumentSymbol[] tree using containerName before returning. symbol.py is reverted to its original form.
There was a problem hiding this comment.
Great!
This means that all cached raw document symbol data needs to be invalidated, however, and we must increment cache_version_raw_document_symbols (which currently uses the default value 1) in the super constructor call to achieve this.
src/serena/ls_manager.py
Outdated
| if not candidate.is_ignored_path(relative_path, ignore_unsupported_files=True): | ||
| fn_matcher = candidate.language.get_source_fn_matcher() | ||
| if fn_matcher.is_relevant_filename(relative_path): |
There was a problem hiding this comment.
Why did you change this? is_ignored_path with ignore_unsupported_files=True has the same semantics for files.
Lines 580 to 599 in 3d4610f
There was a problem hiding this comment.
Changes to is ignored path are leaked over from other changes to allow an option in config to allow dotfiles and to allow gitignored files. That was negating the flags that were set in the config
There was a problem hiding this comment.
The LS implementations ignore only directories starting with ".", not files. For files, the logic I cited applies.
There was a problem hiding this comment.
Both addressed:
symbol.py — moved to the correct layer in ce6ddf3a; Intelephense now reconstructs the DocumentSymbol[] tree from flat SymbolInformation[] before it reaches shared code.
ls_manager.py — you're right, is_ignored_path already does the extension check at ls.py:596–599. Reverted in 50037339. The bypass came from confusion with changes in another branch that don't actually affect file routing here.
Intelephense can return flat SymbolInformation[] (no parent-child links) instead of DocumentSymbol[] for some PHP files. The previous fix placed a containerName fallback inside LanguageServerSymbol.iter_name_path_components_reversed, polluting the generic symbol layer with a language-server-specific quirk. Moved the fix to the correct layer per reviewer feedback: - intelephense.py: Override _request_document_symbols to detect flat SymbolInformation[] (identified by a top-level "location" key) and reconstruct a proper DocumentSymbol[] tree using containerName to wire up parent-child relationships before the result reaches shared code. - symbol.py: Revert iter_name_path_components_reversed to its original clean form; no Intelephense-specific knowledge belongs here. - test_symbol.py: Remove TestLanguageServerSymbolFlatContainerName, which tested the now-removed containerName fallback in the generic class. - test_php_basic.py: Add TestIntelephenseReconstructDocumentSymbols with four unit tests covering _reconstruct_document_symbols directly (root symbols, method-to-class attachment, orphan handling, two-class separation). The existing integration test test_find_symbol_class_method continues to cover the end-to-end path through Intelephense.
The previous implementation bypassed is_ignored_path in favour of calling fn_matcher.is_relevant_filename directly when routing a file path to the correct language server in a multi-LS configuration. This was unnecessary: is_ignored_path(relative_path, ignore_unsupported_files=True) already performs the exact same file-extension check via fn_matcher.is_relevant_filename internally (ls.py:596-599) before it evaluates directory ignore rules or gitignore patterns. The bypass was introduced due to confusion with a separate in-progress branch where unrelated changes to is_ignored_path were inadvertently negating the extension check for files. As noted in PR review, the dot-directory guard (is_ignored_dirname) applies only to directory path components, not to files themselves, so those changes do not affect file-extension routing here. Reverts get_language_server to its original is_ignored_path call and removes the expanded docstring added with the bypass.
Problem
Two related issues cause
find_symbolto fail or return wrong results in PHP projects:1. Incorrect language server routing
In multi-language projects,
get_language_server()routed files usingis_ignored_path()— a method that checks gitignore and unsupportedextension filters. A PHP file could pass or fail this check for the
wrong language server depending on configuration order, causing PHP
files to be routed to the Python LS (or vice versa).
2. Flat
SymbolInformation[]from IntelephenseFor some PHP files, Intelephense returns
SymbolInformation[](theolder LSP format) instead of
DocumentSymbol[]. These symbols have noparent-child hierarchy — methods appear as root-level symbols with only
a
containerNamestring field linking them to their class.find_symbol("ClassName/method")failed silently becauseiter_name_path_components_reversed()never yielded the container,so the name-path match never succeeded.
3. Hard error on explicitly-requested ignored file
When
get_symbols_overview()was called on a file that happened tomatch an ignore pattern, it raised a hard error and returned
[]instead of proceeding. This broke workflows where the caller explicitly
requested a specific file.
Fix
ls_manager.py: Route byfn_matcher.is_relevant_filename()—checks the file extension against the language's known extensions —
instead of
is_ignored_path().symbol.py: AddedcontainerNamefallback initer_name_path_components_reversed(). When a symbol has noancestors (flat
SymbolInformation), yield thecontainerNamevalueso name-path patterns like
"ClassName/method"still match.ls.py: Downgraded hard error to a warning when an explicitlyrequested file is ignored; proceed with the symbol request anyway.
Example
Before:
After: