Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heredoc rework #193

Merged
merged 9 commits into from
Aug 23, 2023
Merged

Heredoc rework #193

merged 9 commits into from
Aug 23, 2023

Conversation

amaanq
Copy link
Member

@amaanq amaanq commented Aug 23, 2023

So, I wanted to separate heredoc bodies and their endings. It wasn't the most pleasant to do it w/o reworking the entire scan_heredoc_content function, but I managed to get it work. Since this doesn't really fix a bug but rather separates two nodes, I'd like to get your thoughts @verhovsky @ObserverOfTime

After this I'll cut a new release and will add the CI action to publish binaries @verhovsky

Closes #37
Closes #118
Closes #134 (if the action doesn't mess up ;) )
Closes #188
Closes #194
Closes #195

@ObserverOfTime
Copy link
Member

LGTM

@amaanq amaanq force-pushed the heredoc-rework branch 8 times, most recently from 2a161c0 to 6c9ad5d Compare August 23, 2023 07:10
@verhovsky
Copy link
Collaborator

As I said in #118, they should all be children of the heredoc_redirect node. It would be easier to work with the resulting tree, because otherwise when you find that your redirected_statement node has a heredoc_redirect, having to check its next node to find the body is weird.

So for this test

node <<JS
console.log("hi")
JS

It's currently

  (redirected_statement
    (command (command_name (word)))
    (heredoc_redirect (heredoc_start)))
  (heredoc_body)

where and this PR makes it

  (redirected_statement
    (command
      (command_name
        (word)))
    (heredoc_redirect
      (heredoc_start)))
  (simple_heredoc_body)
  (heredoc_end)

but I expected this instead

  (redirected_statement
    (command
      (command_name
        (word)))
    (heredoc_redirect
      (heredoc_start simple_heredoc_body heredoc_end)))

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

I did try to implement that, the issue is commands can be in between a heredoc start and body, it might be possible to partially integrate your goal

The main issue is something like this:

one <<EOF | grep two
three
EOF

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

Also, can you take a look at the action I added? It should run on a release, my question is how does it upload the binaries to the release?

@amaanq amaanq force-pushed the heredoc-rework branch 2 times, most recently from 278127a to 9f774d7 Compare August 23, 2023 07:24
@verhovsky
Copy link
Collaborator

how does it upload the binaries to the release

In the command prebuild -u <GITHUB API KEY>, -u is short for --upload <GITHUB API KEY>. That argument is telling it to take all the binary files it compiled for all the -t/--targets and upload them to github and the passed in API key is what allows it to do that.

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

I managed to get heredocs to work like you wanted, with a hack for pipes. fixing a couple scanner bugs was very annoying...but it ended up being a tiny fix.

@verhovsky
Copy link
Collaborator

verhovsky commented Aug 23, 2023

Remove the -u ${{ secrets.GH_TOKEN }} (in both places) while you're testing the CI command because as soon as it stops failing and gets to that part it'll upload binary files.

@verhovsky
Copy link
Collaborator

the issue is commands can be in between a heredoc start and body

I didn't know/think about that when I made that suggestion, then I think you're right it should be separate, but I still think it should be part of the redirect_statement node, so like

  (redirected_statement
    (command
      (command_name
        (word)))
    (heredoc_redirect
      (heredoc_start))
    (simple_heredoc_body heredoc_end))

I think the end should be with the body so that you don't have to search for it in the tree, unless there can be commands between the body and the end too somehow...

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

Okay, just pushed it. Try out heredocs now @verhovsky

Reviewing the action now

@amaanq amaanq force-pushed the heredoc-rework branch 3 times, most recently from 1e90f57 to d4c7b37 Compare August 23, 2023 08:37
@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

@ObserverOfTime Can you try out my latest changes? I did my best to fix up parameter expansions

Damn nvm script fails

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

Fixed script failures, try now @ObserverOfTime w/ whatever wacky patterns you can come up w/

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

This was the hardest set of fixes I've worked on in a while, good luck downstream

@amaanq amaanq merged commit 42ab5ca into tree-sitter:master Aug 23, 2023
5 checks passed
@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

@verhovsky I released 0.20.0 on GH to see how the pack action would work, but nothing ran

@ObserverOfTime
Copy link
Member

I didn't get the chance to review this before it was merged.

These are still not parsed as expected:

cat <<EOF
${parameter-default}
${!varprefix*}
${!varprefix@}
${parameter@U}
EOF
Parse tree
0:0  - 6:0     program
0:0  - 5:3       redirected_statement
0:0  - 0:3         body: command
0:0  - 0:3           name: command_name
0:0  - 0:3             word `cat`
0:4  - 5:3         redirect: heredoc_redirect
0:4  - 0:6           "<<"
0:6  - 0:9           heredoc_start `EOF`
0:9  - 1:0           "\n"
1:0  - 5:0           heredoc_body
1:0  - 1:20            expansion
1:0  - 1:2               "${"
1:2  - 1:11              variable_name `parameter`
1:11 - 1:19              word `-default`
1:19 - 1:20              "}"
2:0  - 2:14            expansion
2:0  - 2:2               "${"
2:2  - 2:3               "!"
2:3  - 2:12              variable_name `varprefix`
2:12 - 2:13              word `*`
2:13 - 2:14              "}"
3:0  - 3:14            expansion
3:0  - 3:2               "${"
3:2  - 3:3               "!"
3:3  - 3:12              variable_name `varprefix`
3:12 - 3:13              word `@`
3:13 - 3:14              "}"
4:0  - 4:14            expansion
4:0  - 4:2               "${"
4:2  - 4:11              variable_name `parameter`
4:11 - 4:13              word `@U`
4:13 - 4:14              "}"
5:0  - 5:3           heredoc_end `EOF`
5:3  - 6:0       "\n"

Also, they're not all operator fields (which is not necessary but since some of them are, they might as well all be).

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

I was debating whether or not to make them all operators, I might as well, totally forgot about the expansions in heredocs!

@ObserverOfTime
Copy link
Member

It's not a heredoc problem. The expansions are parsed the same way anywhere.

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

my fault, so I had parameter-default parsed correctly, inadvertently messed it up, and the test didn't pick up on that:

0:0  - 1:0     program
0:0  - 0:20      command
0:0  - 0:20        name: command_name
0:0  - 0:20          expansion
0:0  - 0:2             "${"
0:2  - 0:11            variable_name `parameter`
0:11 - 0:12            operator: "-"
0:12 - 0:19            word `default`
0:19 - 0:20            "}"
0:20 - 1:0       "\n"

Since operator is hidden

(command
    (command_name
      (expansion
        (variable_name)
        (word))))

works either way 😛

@ObserverOfTime
Copy link
Member

ObserverOfTime commented Aug 23, 2023

Note that ${parameter- default} currently works but incorrectly.
The whitespace should be part of the word.

The other expansions also have the same issue with whitespace.

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

The 2nd case seems to work fine for me? Unless you want * to not be parsed as a word but a special character

@ObserverOfTime
Copy link
Member

Unless you want * to not be parsed as a word but a special character

Yes, since it can't be a word, just * or @.

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

For echo ${abc:- }, do you expect a word with a single space? I have that working - just need to make sure:

image

@ObserverOfTime
Copy link
Member

ObserverOfTime commented Aug 23, 2023

Yes. The only time whitespace should be ignored is in ${foo:pos} and ${foo:pos:len}.

@ObserverOfTime
Copy link
Member

ObserverOfTime commented Aug 23, 2023

Speaking of ${foo:pos}

# $parameter or '1' if unset
echo "${parameter:-1}"
# the last character of $parameter
echo "${parameter: -1}"
# also the last character of $parameter
echo "${parameter:(-1)}"
Parse tree
0:0  - 6:0     •program
0:0  - 0:28       comment `# $parameter or '1' if unset`
1:0  - 1:22       command
1:0  - 1:4          name: command_name
1:0  - 1:4            word `echo`
1:5  - 1:22         argument: string
1:5  - 1:6            "\""
1:6  - 1:21           expansion
1:6  - 1:8              "${"
1:8  - 1:17             variable_name `parameter`
1:17 - 1:19             operator: ":-"
1:19 - 1:20             number `1`
1:20 - 1:21             "}"
1:21 - 1:22           "\""
1:22 - 2:0        "\n"
2:0  - 2:34       comment `# the last character of $parameter`
3:0  - 3:23       command
3:0  - 3:4          name: command_name
3:0  - 3:4            word `echo`
3:5  - 3:23         argument: string
3:5  - 3:6            "\""
3:6  - 3:22           expansion
3:6  - 3:8              "${"
3:8  - 3:17             variable_name `parameter`
3:17 - 3:18             ":"
3:19 - 3:21             number `-1`
3:21 - 3:22             "}"
3:22 - 3:23           "\""
3:23 - 4:0        "\n"
4:0  - 4:39       comment `# also the last character of $parameter`
5:0  - 5:24      •command
5:0  - 5:4          name: command_name
5:0  - 5:4            word `echo`
5:5  - 5:24        •argument: string
5:5  - 5:6            "\""
5:6  - 5:23          •ERROR
5:6  - 5:8              "${"
5:8  - 5:17             variable_name `parameter`
5:17 - 5:18             ":"
5:18 - 5:19             "("
5:21 - 5:22             ")"
5:22 - 5:23             "}"
5:23 - 5:24           "\""
5:24 - 6:0        "\n"

In the first case, the operator is correct, but 1 should be a word.
The second case is parsed correctly.
The third case results in an error.

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

what is the third one? an array, subshell, or parenthesized expression?

@ObserverOfTime
Copy link
Member

Parenthesized expression.

@amaanq
Copy link
Member Author

amaanq commented Aug 23, 2023

#196

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants