C#: Reduce location tuple extraction for named types.#20581
Merged
michaelnebel merged 10 commits intogithub:mainfrom Oct 7, 2025
Merged
C#: Reduce location tuple extraction for named types.#20581michaelnebel merged 10 commits intogithub:mainfrom
michaelnebel merged 10 commits intogithub:mainfrom
Conversation
…ting location (if any).
…plicit constructor.
6cb2561 to
6149608
Compare
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR optimizes location extraction for named types by reducing redundant location tuple extraction. Previously, when a partial class was declared across multiple files, the extractor would extract all locations for each encounter, leading to bloated TRAP data. Now, location extraction respects the current context, only extracting locations relevant to the file being processed.
- Improved location extraction efficiency by filtering source locations to the current context
- Added context-aware location filtering for type parameters and constructors
- Enhanced test coverage for multi-file partial classes and location handling
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| csharp/extractor/Semmle.Extraction.CSharp/Extractor/Context.cs | Added IsLocationInContext method to check if location belongs to current source tree |
| csharp/extractor/Semmle.Extraction.CSharp/Entities/Types/NamedType.cs | Modified GetLocations to filter source locations by context and made method non-static |
| csharp/extractor/Semmle.Extraction.CSharp/Entities/Types/TypeParameter.cs | Added context check before extracting type parameter locations |
| csharp/extractor/Semmle.Extraction.CSharp/Entities/Constructor.cs | Improved implicit constructor location reporting with source location filtering |
| csharp/ql/lib/semmle/code/csharp/Type.qll | Added location override for nested types to use unbound declaration locations |
| csharp/ql/test/library-tests/partial/*.cs | Added test files for partial classes across multiple files |
| csharp/ql/test/library-tests/locations/*.cs | Added test files for location handling with inheritance and partial classes |
| Various .expected files | Updated test expectations to reflect improved location extraction |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
It turns out that the C# extractor does some symmetric location extraction, which leads to bloated TRAP and subsequently also impacts TRAP import performance, as the importer needs to do tuple de-duplication.
The issue can best be explained by a partial class example. That is, assume we have a partial class declaration
Cscattered in two filesfile1.csandfile2.cs.The extractor will encounter the type
Ctwice during extraction and extract its location(s). However, prior to this change the extractor, extracts all the locations ofCon each encounter (and then uses TRAP import de-duplication to "solve" the problem). Instead we could rely on the symmetry and only extract the location from the current context (the context is given by the file currently being extracted). As a consequence we only want to extract the location infile1.cswhen processingfile1.csand the location infile2.cswhen extractingfile2.cs.At first glance, this might look like a corner-case - and for traced extraction this is also the case. However, for buildless (BMN) extraction we cram all code from a repository into a single compilation, which can cause significant overlaps in the locations reported by Roslyn. The larger the repository, the higher the chance for overlaps and the higher the impact we get from this change.
Furthermore, this is a pre-requisite, if we want to switch to using
*IDs for location extraction.The last DCA executions shows that
For traced extraction
For buildless extraction.
mono/monowe get a reduction of around 30%.mono/monois reduced by 30% (and 12% fordotnet/roslyn) and stable (within the statistical uncertainty) for the remaining DCA projects.