[Strings] Add a string-builtins feature, and lift/lower automatically when enabled #7601
base: main
@@ -398,7 +398,19 @@ struct OptimizationOptions : public ToolOptions {
       passRunner.clear();
     };

-    for (auto& pass : passes) {
+    // Find the first and last default opt passes, so we can tell them they are
+    // first/last.
+    Index firstDefault = passes.size();
+    Index lastDefault = passes.size();
+    for (Index i = 0; i < passes.size(); i++) {
+      if (passes[i].name == DEFAULT_OPT_PASSES) {
+        firstDefault = std::min(firstDefault, i);
+        lastDefault = i;
+      }
+    }
+
+    for (Index i = 0; i < passes.size(); i++) {
+      auto& pass = passes[i];
       if (pass.name == DEFAULT_OPT_PASSES) {
         // This is something like -O3 or -Oz. We must run this now, in order to
         // set the proper opt and shrink levels. To do that, first reset the

@@ -416,8 +428,13 @@ struct OptimizationOptions : public ToolOptions {
         passRunner.options.optimizeLevel = *pass.optimizeLevel;
         passRunner.options.shrinkLevel = *pass.shrinkLevel;

+        // Note the ordering of these default passes.
+        PassRunner::Ordering ordering;
+        ordering.first = (i == firstDefault);
+        ordering.last = (i == lastDefault);

Comment on lines +432 to +434
Alternatively:
Suggested change
I kind of like writing out which is first and which is last? I guess the natural order is (first, last) but it is still easier to read I think.

+
         // Run our optimizations now with the custom levels.
-        passRunner.addDefaultOptimizationPasses();
+        passRunner.addDefaultOptimizationPasses(ordering);
         flush();

         // Restore the default optimize/shrinkLevels.
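As a rough illustration of the first/last bookkeeping in the diff above, the following standalone sketch runs the same two-loop logic over a plain list of pass names. The `Ordering` struct, the `kDefaultOptPasses` marker, and `computeOrderings` are hypothetical stand-ins invented for this example; they are not Binaryen's actual `PassRunner::Ordering` or `DEFAULT_OPT_PASSES` definitions.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the PassRunner::Ordering used in the diff.
struct Ordering {
  bool first = false;
  bool last = false;
};

// Hypothetical marker; the real code compares against DEFAULT_OPT_PASSES.
static const std::string kDefaultOptPasses = "O";

// Returns, for each pass, the ordering flags it would be given. Passes that
// are not default-opt entries keep {false, false}.
std::vector<Ordering> computeOrderings(const std::vector<std::string>& passes) {
  // passes.size() acts as a "not found" sentinel, as in the diff.
  std::size_t firstDefault = passes.size();
  std::size_t lastDefault = passes.size();
  for (std::size_t i = 0; i < passes.size(); i++) {
    if (passes[i] == kDefaultOptPasses) {
      firstDefault = std::min(firstDefault, i);
      lastDefault = i;
    }
  }

  std::vector<Ordering> result(passes.size());
  for (std::size_t i = 0; i < passes.size(); i++) {
    if (passes[i] == kDefaultOptPasses) {
      result[i].first = (i == firstDefault);
      result[i].last = (i == lastDefault);
    }
  }
  return result;
}
```

With a single default-opt entry, that entry is both first and last; with several, only the outermost two are flagged, which is presumably what lets lifting run once at the start and lowering once at the end.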
|
Should we use string-lowering-magic-imports-assert to make sure we aren't accidentally doing any optimizations that would result in us emitting a non-standard custom section for non-UTF-8 string constants?
Wait, isn't the section standardized?
No. The standard solution is the magic imports, which can only handle valid UTF-8 strings. The custom section is a random thing we experimented with on the way to developing magic imports.
What is the spec solution for non-UTF8 strings, then?
Separately, if we assert here, then any module with a non-utf8 stringref will assert if you just do -all -O2 ... that seems wrong to me.
There is no spec solution for non-UTF-8 strings. We could synthesize them in a start function if we had to, but I don't think that should be necessary. No input module should have non-UTF-8 strings because they are not expressible with string builtins, and our optimizations should not produce new invalid non-UTF-8 strings, so no output module should have non-UTF-8 strings. Input modules that use stringref to represent non-UTF-8 strings probably don't want this lowering to occur in the first place.
Ok, we can do the lifting when either stringref or string-builtins are enabled, then lower only if string-builtins and not stringref are enabled.
string.const should still be valid when string-builtins are enabled. We might want to error in binary writing if they haven't been lowered and stringref is not enabled, though.
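The rule proposed in this comment can be written down as three small predicates. This is only a sketch of the proposal as stated here: `StringFeatures` and the function names are hypothetical and do not correspond to Binaryen's real feature-set API.

```cpp
// Hypothetical feature flags for illustration only.
struct StringFeatures {
  bool stringref;
  bool stringBuiltins;
};

// Lift imported string builtins into string IR when either feature is enabled.
bool shouldLift(const StringFeatures& f) {
  return f.stringref || f.stringBuiltins;
}

// Lower back to imports only when builtins are enabled and stringref is not,
// since with stringref enabled the string IR can be written directly.
bool shouldLower(const StringFeatures& f) {
  return f.stringBuiltins && !f.stringref;
}

// A string.const that was never lowered can only be written to a binary if
// stringref is enabled; otherwise the writer would report an error.
bool canWriteUnloweredStringConst(const StringFeatures& f) {
  return f.stringref;
}
```

Under this reading, enabling both features lifts but never lowers, and enabling only string-builtins does both.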
Hmm, these are valid options, but they all seem significantly more complex to explain and to use. The readme text would need to be substantially longer.
I don't have a better suggestion for this PR yet, but let's keep thinking here.
On the contrary, I don't think there's much to explain. As always, we optimize to the extent we can given the enabled features, and we validate that the IR will be written as valid modules given the enabled features.
We should document that string.const is valid IR if either of the string features is enabled, but only on the condition that it will be lowered before writing if stringref is disabled. We should also document that string.const containing unpaired surrogates is only valid if stringref is enabled. I don't think we need to document how automatic lifting and lowering works, just as we don't need to document the specifics of what other optimizations we run.
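The two validation rules described in this comment could look roughly like the sketch below, which treats a string constant as a sequence of UTF-16 code units. `StringFeatures`, `hasUnpairedSurrogate`, and `isValidStringConst` are hypothetical names chosen for the example, and the check is an assumption about how such a validator might be written rather than a description of Binaryen's validator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical feature flags for illustration only.
struct StringFeatures {
  bool stringref;
  bool stringBuiltins;
};

// Returns true if the code units contain a surrogate that is not part of a
// well-formed high/low pair. Such strings have no valid UTF-8 encoding.
bool hasUnpairedSurrogate(const std::u16string& s) {
  for (std::size_t i = 0; i < s.size(); i++) {
    char16_t c = s[i];
    if (c >= 0xD800 && c <= 0xDBFF) {
      // High surrogate: it must be immediately followed by a low surrogate.
      if (i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
        i++; // Skip the low half of the pair.
        continue;
      }
      return true;
    }
    if (c >= 0xDC00 && c <= 0xDFFF) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

// string.const is valid IR if either string feature is enabled, but a value
// with unpaired surrogates additionally requires stringref.
bool isValidStringConst(const std::u16string& value, const StringFeatures& f) {
  if (!f.stringref && !f.stringBuiltins) {
    return false;
  }
  if (hasUnpairedSurrogate(value)) {
    return f.stringref;
  }
  return true;
}
```

The sketch omits the separate condition that string.const must be lowered before writing when stringref is disabled, since that depends on pass sequencing rather than on the constant's contents.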
I'd say all of these quotes are non-trivial and potentially confusing for our users:
Just adding that text by itself would double the current PR's readme entry.
I really feel we should find something simpler here.
This is just us doing optimizations based on the target features. It doesn't need to be documented or understood by users.
Yeah, this one is weird. If lowering were automatic in the binary writer, this wouldn't need to be documented or understood, either. This complexity is the price we pay to make lowering a separately sequenced pass.
This one just matches the expressive capabilities of the underlying features. I don't think it needs to be documented beyond error messages.
This is the validation behavior we would want independent of how we lift and lower strings. The only reason this complexity comes up in this PR and not before is because we didn't have separate string features before this PR. But it's strictly more useful to users to have separate string features with detailed validation to ensure they actually produce valid binaries for the intended engine feature set.