-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Don't trigger ICU loading in string.ToUpperInvariant #120685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes string case conversion to avoid unnecessary ICU loading during simple operations like string.ToUpperInvariant()
and string.ToLowerInvariant()
. The change addresses performance concerns where even basic console output could trigger ICU initialization unnecessarily.
Key changes:
- Introduced new static methods
ToUpperInvariant
andToLowerInvariant
in TextInfo that bypass TextInfo.Invariant property access - Modified the ChangeCaseCommon methods to accept nullable TextInfo instances, using null to represent invariant culture
- Updated all call sites to use the new pattern where null TextInfo indicates invariant culture processing
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs | Updated ToUpperInvariant and ToLowerInvariant to call new static TextInfo methods |
src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs | Added static ToUpperInvariant/ToLowerInvariant methods and refactored ChangeCaseCommon to accept nullable TextInfo |
cc @tarekgh |
Ah, it's actually a bit more complicated because It returns |
I think you would have to frontload the speculative ascii-only path before any |
c6898c0
to
32d8761
Compare
9887f4d
to
d2a96e8
Compare
@EgorBot -amd -arm --envvars DOTNET_SYSTEM_GLOBALIZATION_INVARIANT:1 using BenchmarkDotNet.Attributes;
public class Benchmarks
{
[Benchmark]
[Arguments(".NET Runtime uses third-party libraries or other " +
"resources that may be distributed under licenses " +
"different than the .NET Runtime software.")]
[Arguments("runtime uses third-party libraries or other " +
"resources that may be distributed under licenses")]
[Arguments("runtime")]
public string ToLower(string data) => data.ToLowerInvariant();
} |
Right, I thought I could just avoid loading ICU at all triggered by I've slightly changed the code so the ASCII path is always executed first (for Previously we were unconditionally using the special path for invariant mode and that path was quite slow for full ASCII as it was a scalar impl. So presumably my PR improves ASCII inputs for the Invariant mode but regresses non-ASCII ones. I assume the inputs are more likely to be ASCII in the Invariant mode so it sounds like a reasonable trade-off. |
Flamegraphs for before-after for a hello world: Perf improvements for ASCII only inputs in the InvariantGlobalization mode: EgorBot/runtime-utils#521 |
Would this affect how trimming works? I recall that trimming used to depend on |
src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Contributes to #120407
It looks like even a simple
may trigger ICU load unnecessarily via
string.ToUpperInvariant
calls:Another source of these
string.ToUpperInvariant
s are EventSource (over EventSources' names to generate guids) - we plan to remove those, though.For most app the ICU most likely will be loaded anyway, but IMO if it doesn't take too much of efforts, we should delay/avoid it. In this PR it is achieved by introducing a few extra null check.
Perf-traces for a hello world:
Also, perf improvements for the InvariantGlobalization mode: EgorBot/runtime-utils#521