Skip to content

do not reference LLVM in our definition of UB #1750

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 28, 2025
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 7 additions & 11 deletions src/behavior-considered-undefined.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,10 +43,11 @@ r[undefined.place-projection]
[array/slice index expression][project-slice].

r[undefined.alias]
* Breaking the [pointer aliasing rules]. `Box<T>`, `&mut T` and `&T` follow
LLVM’s scoped [noalias] model, except if the `&T` contains an
[`UnsafeCell<U>`]. References and boxes must not be [dangling] while they are
live. The exact liveness duration is not specified, but some bounds exist:
* Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is a rough sketch of what the requirements look like:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were somewhat uncomfortable with the "rough sketch" nature of this statement. Are the statements here accurate on their own? "rough sketch" implies to me that this isn't just incomplete, but possibly wrong in some way, so I wanted to clarify this point.

If this is just incomplete, can we delete this whole sentence? Like this:

Suggested change
* Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is a rough sketch of what the requirements look like:
* Breaking the pointer aliasing rules:

There is a giant warning just above this that this is not complete, so I was wondering if that covers your concern about the status here?

Or if you wanted to really point to this, maybe a footnote like [^alias-incomplete]: The exact aliasing rules are not determined yet.

Or maybe another way to phrase that if the rules written here are accurate, but just incomplete: The exact aliasing rules are incomplete.

Copy link
Member Author

@RalfJung RalfJung Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The statements given here are not on their own sufficient to confidently write sound code. That is already the case before this PR, so nothing changes, we're just being more honest. The aliasing model is the least stable part of our unsafe code semantics so I feel they deserve a dedicated warning.

I don't know what you mean by "accurate and incomplete". I assume you do not mean "complete" in the formal sense of a logic (soundness/completeness), but I also see no way in which an "incomplete" set of rules could be described as "accurate".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what I was getting at is viewing it as a bound. The current text says X. There might be additional points that relax X in the future, but we haven't determined them, yet. I would say X is still correct, just incomplete. But perhaps you are saying there are too many unknowns to even say that? Or that X is too permissive, and we will be adding constraints in the future, in which case X is incorrect in the face of those constraints?

Regardless, having a specific warning here seems worthwhile, though we (on the team call) felt a little uncomfortable with the current wording. How about doing something like pulling the sentence out into a dedicated warning block with something like:

Suggested change
* Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is a rough sketch of what the requirements look like:
* Breaking the pointer aliasing rules.
> [!WARNING]
> These aliasing rules are incomplete and are still under development.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But perhaps you are saying there are too many unknowns to even say that?

Yeah, I am not very comfortable making any hard promises of the form "if you follow these rules, you're definitely good" -- not for the case where references and raw pointers are interleaved in any way. We could conceivably carve out a fragment that must work, where you only ever use raw pointers or references but never both "at the same time", but it'd be a bunch of work to spell this out in sufficient detail, and doing so was not my intent with this PR.

How about doing something like pulling the sentence out into a dedicated warning block with something like:

That still basically say that what follows are "rules", that might just be altered later. I wouldn't call them rules. They are a sketch or a guiding principle that rules will be developed from, but they are nowhere near precise enough to be described as a set of "rules".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced "rough sketch" by "outline", is that better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can go with that. Thanks @RalfJung!

`&T` must point to memory that is not mutated while they are live (except for data inside an [`UnsafeCell<U>`]),
and `&mut T` must point to memory that is not read or written by any pointer not derived from the reference and that no other reference points to while they are live.
`Box<T>` is treated similar to `&'static mut T` for the purpose of these rules.
The exact liveness duration is not specified, but some bounds exist:
* For references, the liveness duration is upper-bounded by the syntactic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a lower-bound too, right? e.g., if you have a use of some reference x at time T1 and then a subsequent use of x (or some reference derived from x) at time T2, then it is live from T1 ..= T2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that I didn't change this part in this PR.)

Yes, of course each time a reference is dereferenced, it must be live. That is a lower bound.

lifetime assigned by the borrow checker; it cannot be live any *longer* than
that lifetime.
Expand All @@ -56,9 +57,7 @@ r[undefined.alias]
least as long as that function call, again except if the `&T` contains an
[`UnsafeCell<U>`].

All this also applies when values of these
types are passed in a (nested) field of a compound type, but not behind
pointer indirections.
All this also applies when values of these types are passed in a (nested) field of a compound type, but not behind pointer indirections.

r[undefined.immutable]
* Mutating immutable bytes.
Expand Down Expand Up @@ -201,7 +200,7 @@ r[undefined.validity.never]

r[undefined.validity.scalar]
* An integer (`i*`/`u*`), floating point value (`f*`), or raw pointer must be
initialized, i.e., must not be obtained from [uninitialized memory][undef].
initialized, i.e., must not be obtained from uninitialized memory.

r[undefined.validity.str]
* A `str` value is treated like `[u8]`, i.e. it must be initialized.
Expand Down Expand Up @@ -248,10 +247,7 @@ reading uninitialized memory is permitted are inside `union`s and in "padding"

[`bool`]: types/boolean.md
[`const`]: items/constant-items.md
[noalias]: http://llvm.org/docs/LangRef.html#noalias
[pointer aliasing rules]: http://llvm.org/docs/LangRef.html#pointer-aliasing-rules
[abi]: items/external-blocks.md#abi
[undef]: http://llvm.org/docs/LangRef.html#undefined-values
[`target_feature`]: attributes/codegen.md#the-target_feature-attribute
[`UnsafeCell<U>`]: std::cell::UnsafeCell
[Rustonomicon]: ../nomicon/index.html
Expand Down