-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle directives in trailing trivia #75595
base: main
Are you sure you want to change the base?
Conversation
Done with review pass (commit 1) #Closed |
src/EditorFeatures/CSharpTest/QuickInfo/DiagnosticAnalyzerQuickInfoSourceTests.cs
Outdated
Show resolved
Hide resolved
Done with review pass (commit 5) #Closed |
It looks like we're still ending up with a directive in trailing trivia. Is that correct? |
Yes, I chose to parse these trailing directives as BadDirectiveTrivia. Is that a problem? I first tried to attach a skipped trivia onto a hash token and put that into the trailing trivia. But that failed some checks because hash token is not a trivia. Which makes sense. BadDirectiveTrivia is a trivia, so I used that instead. But if you have other syntax tree shape in mind for these scenarios, I'm open to changing this. |
With this approach the hash token should be a skipped trivia as well, i.e. not a token #Closed |
My preference (but I could be convinced otherwise) is that we never have any sort of directive trivia as trailing trivia. That is then very easy to remember and have invariants around. If you guys feel otherwise, I'm happy to discuss and compromise on that. However, given that it's never legal for the language, and adds complexity in all the upper layers to consider this case, my preference instead is to make these to skipped tokens. If you do think allowing BadDirective as trailing is preferable to skipped tokens, I'm ok with that. Lmk what you all think! :-) |
// (1,9): error CS1002: ; expected | ||
// if (#if) | ||
TestBase.Diagnostic(ErrorCode.ERR_SemicolonExpected, "").WithLocation(1, 9), | ||
// (1,9): error CS1027: #endif directive expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same set of diagnostics that would be reported without this PR.
Before commit 8, we missed some diagnostics because we parsed the directive as BadDirective. In commit 8, we parse it again as normal #if
directive, just put it into skipped tokens afterwards, but the diagnostics are preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit 8, we parse it again as normal
#if
directive, just put it into skipped tokens afterwards
Calling the parse helper is an implementation detail. The tree reflects how we parsed the directive and we don't have the directive in the tree. Effectively, that means that the tokens were simply skipped (the mechanism used to skip them is not important), not treated as a directive. Therefore, it is unexpected and undesired (in my opinion) to produce any diagnostics that might imply otherwise.
but the diagnostics are the same.
We change parsing result, diagnostics should change accordingly, I think.
Done with review pass (commit 8) |
// Directives cannot be in trailing trivia - parse as skipped tokens trivia instead. | ||
|
||
var directiveBuilder = new SyntaxListBuilder(10); | ||
this.LexDirectiveAndExcludedTrivia(afterFirstToken, afterNonWhitespaceOnLine: true, ref directiveBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It accumulates diagnostics attached to the tokens, but those seem useful, I think we want to accumulate those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It accumulates diagnostics attached to the tokens
It looks like we somehow get diagnostics about missing end constructs for directives that we ignore here. Is that diagnostic reported here? Otherwise, this method has side effects beyond just tokens that it returns. I do no think we want those side effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be fine to report a regular unexpected token error, skip just the #
token, and continue parsing as if it wasn't there. What we are trying to do right now feels unnecessary complicated to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be fine to report a regular unexpected token error, skip just the
#
token, and continue parsing as if it wasn't there. What we are trying to do right now feels unnecessary complicated to me.
Sounds good, I will do that, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be fine to report a regular unexpected token error, skip just the
#
token, and continue parsing as if it wasn't there. What we are trying to do right now feels unnecessary complicated to me.
@AlekseyTs I have implemented this. The disadvantage of this approach is that it makes error recovery considerably worse.
To summarize the options:
- Skip
#
token, continue parsing as if it wasn't there. Worse error recovery than before. - Parse the trailing directives into skipped tokens. Should we parse the directive as normal, including disabled text, error reporting, then just convert the tokens to skipped tokens? If not, how should we parse?
- Parse as BadDirective. A bit worse error recovery because we parse as a generic directive. Technically, it's still a directive in trailing trivia (so doesn't hold the invariant).
- Allow directives in trailing trivia in error scenarios. Very simple implementation, just remove the assert and everything continues working (error recovery, IDE refactoring features). We still have the invariant that there are no directives in trailing trivia in success scenarios. (Currently, this seems as the best option to me.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can still parse it like before, but represent in the tree however you want. This is very common in the parser.
My main preference is simply that we do not have this structured trivia as trailing trivia.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CyrusNajmabadi would the current state of the PR work for you? We skip the invalid #
, report an error for it, and continue parsing as if it wasn't there.
You can still parse it like before, but represent in the tree however you want. This is very common in the parser.
If it's so common, is there some helper for it? I found SyntaxParser.AddSkippedSyntax. I don't know how I would use it here directly because I don't have any SyntaxToken to attach the SkippedTokensTrivia onto.
Also Aleksey previously didn't like the fact we are forwarding the diagnostics from parsing the directive onto the skipped tokens. Although it seems to me that's how it's supposed to work because the SyntaxParser.AddSkippedSyntax explicitly does that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how I would use it here directly because I don't have any SyntaxToken to attach the SkippedTokensTrivia onto.
Aren't you producing the token? So you would attach to the token that you're just about to make :)
If it's so common
By common, i mean the concept of: have the parser take steps to determine what is going on, but then determine how best to fit that into the syntactic model.
In other words, just because you parsed out a directive, doesn't mean you need to place it in the tree as a directive.
Also Aleksey #75595 (comment) the fact we are forwarding the diagnostics from parsing the directive onto the skipped tokens
Yeah. I don't have strong feelings/thoughts about that. My general point is simply that i would prefer teh final syntactic model not have directives within trailing trivia.
--
Note: it might be easier for us to chat. But i'll try to share my thoughts clearly here. If you need more clarity on my views, please feel free to ask :)
--
The lexer/parser is effectively a complex program that can execute arbitrary logic to try to decide what the user code means and how it should represent it. So, to me, it's fine for it to do things like actually parse out directives wherever it finds them, to have good error recovery. However, that execution logic is orthogonal to the decisions it then makes in terms of final syntactic tree representation (including diagnostics).
So, personally, i think it would be fine to:
- parse out as a directive, for the best recovery experience.
- take the results of that parsing, and re-represent them as a skipped token when placing in the tree. Note: for this i would make this extremely simple. Literally take the entire contents of the parsed directive, and just make a single '.BadToken' whose contents are teh entirety of the text of the entire directive.
The end representation then never has a directive in trailing trivia. Nor does it expose that we produced special directive diagnostics. All teh user sees is a message saying #
is not expected there. And all the tree has is a skipped token representing that entire chunk that we don't feel is valid there.
Fixes #75583.