-
Notifications
You must be signed in to change notification settings - Fork 527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guarantee soundness of pointer-to-int transmutes #1752
Open
joshlf
wants to merge
1
commit into
rust-lang:master
Choose a base branch
from
joshlf:patch-7
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The phrasing is odd here -- it should say that a ptr or reference requires all its bytes to be initialized, shouldn't it? Bit validity primarily defines "which sequences of bytes can I transmute to this type T" (and which values do they then represent). This here is like saying "a
bool
value is guaranteed to be represented as0x00
or0x01
", which is true but not what we usually say when we discuss validity ofbool
.Furthermore, it is the representation relation of integers that decides whether bytes with provenance are permitted there, so it is odd to discuss that here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here is to document what guarantees I can rely on if I currently have a pointer and want to look at its bytes. This prose amounts to a guarantee that we won't loosen the validity of pointers to permit uninit bytes. I would be concerned that the opposite wording would read like a sufficient-but-not-necessary condition that we could theoretically loosen in the future.
This is not just hypothetical: there's an argument that the current state of affairs is that int-to-ptr transmutes are well-defined, but that is not currently taken to imply that ptr-to-int transmutes are well-defined.
I just wanted to give the looser warning that ptr-to-int-to-ptr might not result in an identical ptr from the AM's perspective. Do you have a suggestion of how to word this better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am objecting mostly to the placement, not the wording. Conceptually, what we eventually need is for every type a description of which byte sequences are valid for this type, and which value is represented by each valid byte sequence. This is, however, complicated by the fact that we're trying to give partial guarantees while some details are still being worked out... I don't have a coherent plan for how to deal with this transitional phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 and we also need a similar description for the library invariant of such language types
Just mentioning a possible trap here. The programmer is not allowed to assume the language invariant by name (at least until that invariant is advertised as stable). This is an implicit rule. For example1 you cannot write the following robust function:
Instead, one has to explicitly list all properties from the library invariant that are not relied on (of course, you can't go beyond the language invariant):
Said otherwise, as a programmer you can't assume that a value is valid (sounds weird yes), only the language implementation can do this. As a programmer you assume the property given to you (which is usually stronger than the language invariant) described as a diff from the library invariant.
Another possible problem with the word "guarantee", is who guarantees what to whom? There's at least 2 options:
Footnotes
For illustration purposes, those examples assume that the library invariant of pointers requires the bytes to be initialized and the language invariant doesn't. This doesn't reflect any consensus and is not meant to. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FTR, by the stability promise, you can actually assume a given value is valid (assuming it is ever documented as such or otherwise permitted, such as by allowing its creation in safe code). What you cannot do is assume a given value is not valid.
Also, you can rely on the validity invariant by name (or any other invariant) if your reliance is for the fact of the matter alone. IE. if your only requirement to the function parameter is "Call is valid under language rules/language invariant", then you can name the validity invariant by name. Of course, there are few useful functions with such an invariant. (
mem::forget()
however, is a trivial example of a function where the definition can be such that the parameter only need be valid, and not necessarily safe - note however that there is no library guaranteemem::forget
is, in fact, such a function).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, if your function is polymorphic over the invariant it supports, then you can overspecify by naming the validity invariant. But I believe this is almost always useless and inaccurate (you better say you are polymorphic over the invariant).