This splits "valid" into "valid for reads" and "valid for writes", and
also adds the concept of operation size to validity. Now functions which
operate on sequences state that e.g. pointer args must be "valid for reads of
size x".
The enumerated list of conditions is replaced by an explanation that
rust doesn't have a formal memory model. It does say that pointers
created directly from references are guaranteed to be valid, and links
to both the "Unsafe Code" section of the book and the "Undefined
Behavior" section of the reference.
- Add links to the GNU libc docs for `memmove`, `memcpy`, and
`memset`, as well as internally linking to other functions in `std::ptr`
- List invariants which, when violated, cause UB for all functions
- Add example to `ptr::drop_in_place` and compares it to `ptr::read`.
- Add examples which more closely mirror real world uses for the
functions in `std::ptr`. Also, move the reimplementation of `mem::swap`
to the examples of `ptr::read` and use a more interesting example for
`copy_nonoverlapping`.
- Change module level description
- Define what constitutes a "valid" pointer.
- Centralize discussion of ownership of bitwise copies in `ptr::read` and
provide an example.
Suggest direct raw-pointer dereference
People often come looking for some kind of `as_ref_unchecked` method on
raw pointers that would give them `&T` and not `Option<&T>` when they
are sure the pointer is not NULL.
There's no such method, but taking a reference of the dereferenced
pointer accomplishes the same thing. Therefore, suggest using that, at
the `as_ref` site ‒ it's a place people are likely going to look into.
People often come looking for some kind of `as_ref_unchecked` method on
raw pointers that would give them `&T` and not `Option<&T>` when they
are sure the pointer is not NULL.
There's no such method, but taking a reference of the dereferenced
pointer accomplishes the same thing. Therefore, suggest using that, at
the `as_ref` site ‒ it's a place people are likely going to look into.
LLVM isn't able to remove the alloca for the unaligned block in the SIMD tail in some cases, so doing this helps SRoA work in cases where it currently doesn't. Found in the `replace_with` RFC discussion.
The documentation of Unique::empty() and NonNull::dangling() could
potentially suggest that they work as sentinel values indicating a
not-yet-initialized pointer. However, they both declare a non-null
pointer equal to the alignment of the type, which could potentially
reference a valid value of that type (specifically, the first such valid
value in memory). Explicitly document that the return value of these
functions does not work as a sentinel value.
Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.
Keep only the language item. This removes some indirection and makes
codegen worse for debug builds, but simplifies code significantly, which
is a good tradeoff to make, in my opinion.
Besides, the codegen can be improved even further with some constant
evaluation improvements that we expect to happen in the future.
This is necessary if we want to implement `[T]::align_to` and is more
useful in general.
This implementation effort has begun during the All Hands and represents
a month of my futile efforts to do any sort of maths. Luckily, I
found the very very nice Chris McDonald (cjm) on IRC who figured out the
core formulas for me! All the thanks for existence of this PR go to
them!
Anyway… Those formulas were mangled by yours truly into the arcane forms
you see here to squeeze out the best assembly possible on most of the
modern architectures (x86 and ARM were evaluated in practice). I mean,
just look at it: *one actual* modulo operation and everything else is
just the cheap single cycle ops! Admitedly, the naive solution might be
faster in some common scenarios, but this code absolutely butchers the
naive solution on the worst case scenario.
Alas, the result of this arcane magic also means that the code pretty
heavily relies on the preconditions holding true and breaking those
preconditions will unleash the UB-est of all UBs! So don’t.
Rewrite docs for `std::ptr`
This PR attempts to resolve#29371.
This is a fairly major rewrite of the `std::ptr` docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions in `std::ptr`.
All functions in `std::ptr` (with the exception of `ptr::eq`) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of type `T`".
Additionally, each function imposes some subset of the following conditions on its arguments.
* The pointer points to valid memory.
* The pointer points to initialized memory.
* The pointer is properly aligned.
These requirements are discussed in the "Undefined Behavior" section along with the consequences of using functions that perform bitwise copies without requiring `T: Copy`. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough?
Some issues which still need to be addressed before this is merged:
- [ ] The new docs assert that `drop_in_place` is equivalent to calling `read` and discarding the value. Is this correct?
- [ ] Do `write_bytes` and `swap_nonoverlapping` require properly aligned pointers?
- [ ] The new example for `drop_in_place` is a lackluster.
- [ ] Should these docs rigorously define what `valid` memory is? Or should is that the job of the reference? Should we link to the reference?
- [ ] Is it correct to require that pointers that will be read from refer to "valid values of type `T`"?
- [x] I can't imagine ever using `{read,write}_volatile` with non-`Copy` types. Should I just link to {read,write} and say that the same issues with non-`Copy` types apply?
- [x] `write_volatile` doesn't link back to `read_volatile`.
- [ ] Update docs for the unstable [`swap_nonoverlapping`](https://github.com/rust-lang/rust/issues/42818)
- [ ] Update docs for the unstable [unsafe pointer methods RFC](https://github.com/rust-lang/rfcs/pull/1966)
Looking forward to your feedback.
r? @steveklabnik
Make `Vec::new` a `const fn`
`RawVec::empty/_in` are a hack. They're there because `if size_of::<T> == 0 { !0 } else { 0 }` is not allowed in `const` yet. However, because `RawVec` is unstable, the `empty/empty_in` constructors can be removed when #49146 is done...