Skip to content
This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

SyntaxError if 'case NAME:' is not the last case #148

Open
gvanrossum opened this issue Aug 26, 2020 · 15 comments
Open

SyntaxError if 'case NAME:' is not the last case #148

gvanrossum opened this issue Aug 26, 2020 · 15 comments
Labels
accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP sc-feedback Issues raised in the steering committee feedback

Comments

@gvanrossum
Copy link
Owner

There were some concerns over the typical footgun of writing

case FOO:
    <code>
case BAR:
    <code>

(with the intention that FOO and BAR are global constants, not capture patterns).

The CPython implementation issues a SyntaxWarning for this. Maybe we should just make that a SyntaxError. (Even though it's hard to express even in the PEG grammar.) Mark Shannon also wants this.

I am fine with this.

@gvanrossum gvanrossum added the sc-feedback Issues raised in the steering committee feedback label Aug 26, 2020
@Tobias-Kohn
Copy link
Collaborator

I am also fine with this, and think it actually has merit to point out obvious mistakes.

And to be fair: some syntax errors cannot be sensibly expressed by a context-free language, anyway. Take the rule that no two parameters in a function can share the same name, for instance. So, we do not necessarily have to put that into the grammar.

@brandtbucher
Copy link
Collaborator

I'm okay with this as well. People ignore warnings, and I cannot imagine this ever being "correct" (except perhaps in poorly-generated code).

@brandtbucher brandtbucher changed the title SytaxError if 'case NAME:' is not the last case SyntaxError if 'case NAME:' is not the last case Aug 28, 2020
@gvanrossum
Copy link
Owner Author

gvanrossum commented Aug 28, 2020

If we ever add an else clause, an irrefutable case must not be followed by any else clause. #147

@Tobias-Kohn
Copy link
Collaborator

an irrefutable case must not be followed by any else clause

Generally, I agree with this. But I would be careful with the wording. Irrefutable cases can actually take quite complicated forms, making it impossible for the compiler to decide whether they really are irrefutable or not. So, in short: the rule as it stands cannot be checked.

Perhaps it would be enough to talk about the irrefutable case and define somewhere that it refers to the case <NAME>...?

@gvanrossum
Copy link
Owner Author

Hm, in my naïveté I think irrefutability is easily checked recursively.

  • a single capture pattern is irrefutable
  • a wildcard is irrefutable
  • an OR is irrefutable if any of its sub-patterns is
  • a walrus is irrefutable if the RHS is

That’s it. Presence of a guard makes a case non-irrefutable.

What am I missing?

@Tobias-Kohn
Copy link
Collaborator

Tobias-Kohn commented Aug 28, 2020

I was primarily thinking of the last case you mention:

Presence of a guard makes a case non-irrefutable.

This is not necessarily the case semantically as in, e.g., case x if 2 < 3:. It has a guard, however, it will always succeed, hence it is semantically irrefutable, although it might be impossible for the compiler to discover this for all possible conditions in a guard.

The second objection to the list would be class patterns. case object(): is also irrefutable, and if we allow the isinstance to be overwritten, any class C might end up being irrefutable.

In the context with OR, I would think of a silly example like case (mutable() | immutable()):, which I would claim is also irrefutable.

Long story short: what we need is syntactic irrefutability (irrefutability that can be decided purely based on the syntax), which is naturally a subset of semantic irrefutability (everything that is 'logically' irrefutable). Obviously the compiler can only really check the former.

Personally, I would not have a problem with defining what we mean by syntactic irrefutability and then state that any such case must be the last branch in a match statement (including the potential else:). But come to think of it, I am a bit afraid that it would add another set of rules to the PEP, which is already perceived as having too many rules... So, the question is: is it really worth the trouble? Or do we just go with the irrefutable case (where the <NAME> in my post was meant to include the wildcard, of course)?


EDIT: We could also just say that the compiler will reject any case branch (again, including else) that follows a case for which it can prove that it is irrefutable. In other words: the compiler will complain about anything following syntactic irrefutability, but leave it open to the exact extent of syntactic irrefutability. The only condition is that a simple name (or wildcard) is in this set.

What speaks against this, of course, is that if the compiler gets better in detecting such cases we loose backward-compatibility. A program that runs with Python 3.10.0 might not run with 3.10.1 because the latter rejects an unreachable case branch. If pattern matching is used in libraries, this might lead to quite a mess. Hence not a good idea!

@gvanrossum
Copy link
Owner Author

I think it is worth it defining which patterns are syntactically irrefutable (basically my definition) and stating that any guard-less case whose pattern is syntactically irrefutable must not be followed by another case or else.

Compilers that are smarter than that can issue SyntaxWarnings.

Note that case object(): may not be irrefutable if a global variable object exists or if builtins.object has been replaced.

I am not worried that adding this rule will be considered as a problem (and if it is, we can offer only to declare case NAME: to have this requirement).

@dmoisset
Copy link
Collaborator

Having an or-pattern where one in the left is irrefutable is a good candidate for errors too. For example in:

match value:
    case None: ...
    case 0: ...
    case MAX_VALUE|MIN_VALUE: ... # I intended to use these names as constants here

@gvanrossum
Copy link
Owner Author

Agreed.

@brandtbucher
Copy link
Collaborator

Hm, in my naïveté I think irrefutability is easily checked recursively.

  • a single capture pattern is irrefutable
  • a wildcard is irrefutable
  • an OR is irrefutable if any of its sub-patterns is
  • a walrus is irrefutable if the RHS is

That’s it. Presence of a guard makes a case non-irrefutable.

Does this sound good, then? If so, I'll update the spec and implementation.

@gvanrossum
Copy link
Owner Author

Yup, still sounds good to me. Let's do it.

@gvanrossum gvanrossum added accepted Discussion leading to a final decision to include in the PEP needs more pep An issue which needs to be documented in the PEP labels Sep 16, 2020
@gvanrossum
Copy link
Owner Author

See language in python/peps#1649 -- please review.

@gvanrossum
Copy link
Owner Author

(Landed.)

@brandtbucher
Copy link
Collaborator

Just got back from a trip. Sorry for the delayed review... but this looks good to me anyways! My one lingering question about patterns like case _ | _ was addressed in the linked PR: as expected, this too is a syntax error.

@gvanrossum gvanrossum added fully pepped Issues that have been fully documented in the PEP and removed needs more pep An issue which needs to be documented in the PEP labels Oct 20, 2020
@gvanrossum
Copy link
Owner Author

Labeling as fully pepped -- this is now incorporated in PEP 634. (And also implemented, thanks Brandt!)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP sc-feedback Issues raised in the steering committee feedback
Projects
None yet
Development

No branches or pull requests

4 participants