Use colons before blocks #34

Open
SnirkImmington opened this issue May 11, 2018 · 2 comments
Labels
area: AST (Issues which affect AST code); area: parse (The parsing of source code within the compiler); area: syntax (Syntactical changes to the language); points: 1 (Simple or straightforward changes to the code); priority: high (Important requirements, goals, and features)

Comments

@SnirkImmington
Collaborator

Do things the Python way:

Instead of

fn foo(a: int, b: int) -> int
    // do some stuff

Do

fn foo(a: int, b: int) -> int:
    // do some stuff

This applies to block level constructs:

if cond:
    do:
        // some stuff
else:
    // other stuff

I think that the : ultimately provides more readability and an appearance of structure. The symmetry is now that : is used before a block, while => is used to write that code inline.

This does mean we have to be careful of : in expressions. However, as with Python, I don't think it will come up much.
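
To make the "careful of : in expressions" point concrete, here is a minimal sketch (in Python, not Snirk's actual parser) of one way to tell a block-opening colon from the colons in a signature like fn foo(a: int, b: int) -> int: by only accepting a colon that is the last token on the line and sits outside any brackets. The function name ends_with_block_colon and the naive // comment stripping are assumptions made for illustration; a real tokenizer would also have to handle string literals.

```python
def ends_with_block_colon(line: str) -> bool:
    """Return True if the last non-space character is a ':' outside brackets.

    Hypothetical helper: comments are stripped naively at the first '//',
    and string literals are not handled.
    """
    code = line.split("//", 1)[0]
    depth = 0            # current bracket nesting depth
    last_char = ""       # last non-whitespace character seen
    last_depth = 0       # nesting depth at that character
    for ch in code:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if not ch.isspace():
            last_char = ch
            last_depth = depth
    return last_char == ":" and last_depth == 0
```

With this rule the colons inside the parameter list are ignored because they occur at nesting depth 1, while the trailing colon after the return type opens the block.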

@SnirkImmington SnirkImmington added this to the Static Typing milestone May 11, 2018
@SnirkImmington SnirkImmington added area: AST Issues which affect AST code area: syntax Syntactical changes to the language area: parse The parsing of source code within the compiler priority: high Important requirements, goals, and features and removed high priority labels May 24, 2018
@Timidger

Quoting from the Python FAQ:

The colon is required primarily to enhance readability

Which seems to be the main use case in Python. However, another more important point is noted after that:

Another minor reason is that the colon makes it easier for editors with syntax highlighting; they can look for colons to decide when indentation needs to be increased instead of having to do a more elaborate parsing of the program text.

(emphasis mine)
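
As a rough illustration of the editor argument in that quote, the sketch below (Python, with a hypothetical function name next_indent and an assumed four-space indent unit) decides the indentation of a new line purely from whether the previous line ends in a colon, with no further parsing of the program text.

```python
def next_indent(previous_line: str, indent_unit: str = "    ") -> str:
    """Suggest the indentation for the line that follows previous_line.

    Hypothetical editor helper: it only looks at the trailing colon and
    ignores comments and strings.
    """
    stripped = previous_line.lstrip()
    current = previous_line[: len(previous_line) - len(stripped)]
    if previous_line.rstrip().endswith(":"):
        return current + indent_unit  # a block was opened: indent one level
    return current                    # otherwise keep the same level
```

Without the colon, the same editor would have to recognize every block-introducing construct (fn, if, else, and so on) to make the same decision.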

If you are going primarily for readability (in the Python sense) then the colon makes sense.

However, since one of the goals of Snirk seems to be "correctness", I'd like to see arguments in favor of that. Readability is always going to come down to what programmers are familiar with, and by that metric you might as well use braces, since most programmers are more familiar with C-style syntax (C, C++, C#/Java, Go, JavaScript).

I realize I'm expanding this argument away from colons to denote blocks to the use of whitespace itself, but I would like to see a justification for it since I can't find it anywhere else.

@SnirkImmington
Collaborator Author

Why whitespace

My main inspiration for whitespace was CoffeeScript. In particular, when I was first getting into writing a parser, I was looking at other languages and wondering "why do I need to write all these symbols?"

This is a valid concern for, e.g., parentheses around if conditionals, which were a C thing (specifically, prioritizing if (condition) statement instead of if condition { statement }) that Java/C#/JavaScript didn't change, and so languages like Rust and Go (with their reliance on C syntax) can brag about how they don't need them. We think of them as having "removed" the parentheses, but they've actually required the braces. The alternative, if condition then statement, is used in Lua and Ruby, but is also the case in Python - the : does the work of the then. (This would make Snirk even more "verbose" than Python - a single-line if would require not just the omission of : but the addition of =>.)

I have seen Python's reasons directly from their FAQ, and I've also heard that a survey was done while Python was in development (the FAQ mentions "one of the results of the experimental ABC language") to see if people preferred the colons. (The answer was yes, they found the code more readable.) I also agree that programming languages have to simplify their syntax to make writing syntax highlighters and such easier. I think that requiring the : or => is a good-enough compromise between readability and lack of ambiguity.

CoffeeScript tries very hard to have the fewest characters, which leads to a lot of tight, nearly ambiguous constructs. Not only do they tokenize indentation like Python does, they also have a rewriter to handle all of the "optional syntax, implicit syntax, and shorthand syntax." To be fair, Snirk does have indentation rules in its parser to allow language constructs to ignore indentation (Python does this around parens; Snirk additionally does it for function signatures and to allow midfix operators to parse between lines if indented), but some of that can go away once : is used. There's all sorts of syntax that CoffeeScript tries desperately to thin out, which I'm not trying to do with Snirk.

(To be fair to CoffeeScript, they do work around the standard JavaScript problems like === and var, motivated the inclusion of =>, template strings, spread syntax, and destructuring, and allow many "block form" things to be expressions, although they do not allow let.)

My motivations for indentation build on top of Python's motivations:

Since there are no begin/end brackets there cannot be a disagreement between grouping perceived by the parser and the human reader.

Furthermore, every reasonable style guide which seeks to increase readability in brace programming languages requires standardized spacing (or in Go's case, tabbing). This is also true for programming languages that use do/end. For readable code, both are needed in such languages. I think that requiring tabbing and braces is redundant, so instead of needing both but offering braces (to allow what, single-line statements?), I would rather just require indentation.

I think that most programming languages that have { and } care more about looking like C than they do about their code being "correct." Rust, being a systems language targeting C++ programmers, wouldn't feel "metal" enough without them, and Go has used up its quota in removing semicolons and commas from fields in structs. (It has a semicolon inserter, so it ends up caring about newlines, so it's technically whitespace-significant.) Languages targeting the JVM like Scala and Kotlin need to feel Java-y to convince programmers to swap over, and Java itself needed to feel like C.

(In Snirk, indentation is currently handled super loosely in the Tokenizer, with no indentation size checks - I've opened #46 to address this.)
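
For reference, stricter indentation handling in the Python style is usually done with a stack of indentation widths. The following is a minimal sketch in Python, not Snirk's Tokenizer and not necessarily what #46 proposes: the name indentation_tokens and the INDENT/DEDENT/LINE markers are assumptions for illustration, and only leading spaces (no tabs) are counted.

```python
from typing import List


def indentation_tokens(source: str) -> List[str]:
    """Emit INDENT/DEDENT/LINE markers and reject inconsistent dedents.

    Hypothetical sketch: only leading spaces are measured, tabs are not
    handled, and each non-blank line becomes a single LINE marker.
    """
    stack = [0]                # indentation widths of the open blocks
    tokens: List[str] = []
    for number, raw in enumerate(source.splitlines(), start=1):
        if not raw.strip():
            continue           # blank lines do not affect block structure
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        else:
            while width < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            if width != stack[-1]:
                raise SyntaxError(
                    f"line {number}: dedent does not match any outer level"
                )
        tokens.append("LINE")
    # Close every block that is still open at end of input.
    tokens.extend("DEDENT" for _ in stack[1:])
    return tokens
```

The width check in the else branch is the "indentation size check": a line that dedents to a width that was never opened is rejected instead of being silently attached to some enclosing block.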

Correctness

If you want a programming language whose syntax is engineered to be hard to make mistakes with (or just be an antithesis to CoffeeScript), look no further than Ada:

with Ada.Text_IO; use Ada.Text_IO;
procedure Hello is
begin
  Put_Line ("Hello, world!");
end Hello;

Two clauses to import a library and make its names directly visible (with and use), procedure and is around the name, begin after the is, semicolons after the statements, end Hello with the name of the procedure, semicolon at the end of the procedure. All of it is designed to make sure programmers don't write the wrong program because of typos.

Although I respect Ada greatly and feel that it is, to this day, an innovative programming language, I don't think we need this number of tokens to write correct code. I don't think we need the begin and end, or { and }. Ada has many other strengths that make it good at what it's for: secure systems programming.

In the end, perhaps braces do allow code to be less ambiguous. I'm already introducing one token (:) to reduce ambiguity, I could instead use { and }. However, I think they're redundant to the point of affecting readability and would rather use the Python style. If I feel in the future that "correctness" is affected, I will consider adding an end keyword.

@SnirkImmington SnirkImmington changed the title Give up and use colons before blocks Use colons before blocks May 30, 2018
@SnirkImmington SnirkImmington removed this from the Static Typing milestone Jun 16, 2018
@SnirkImmington SnirkImmington added the points: 1 Simple or straightforward changes to the code label Jan 9, 2019

2 participants