Fix deduplication mismatches in vtables leading to upcasting unsoundness
We currently have two cases where subtleties in supertraits can trigger disagreements in the vtable layout, e.g. leading to a different vtable layout being accessed at a callsite compared to what was prepared during unsizing. Namely:
### #135315
In this example, we were not normalizing supertraits when preparing vtables. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Identity {
type Selff;
}
impl<Selff> Identity for Selff {
type Selff = Selff;
}
trait Middle<T>: Supertrait<()> + Supertrait<T> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T> Middle<T> for () {}
trait Trait: Middle<<() as Identity>::Selff> {}
impl Trait for () {}
fn main() {
(&() as &dyn Trait as &dyn Middle<()>).say_hello(&0);
}
```
When we prepare `dyn Trait`, we see a supertrait of `Middle<<() as Identity>::Selff>`, which itself has two supertraits `Supertrait<()>` and `Supertrait<<() as Identity>::Selff>`. These two supertraits are identical, but they are not duplicated because we were using structural equality and *not* considering normalization. This leads to a vtable layout with two trait pointers.
When we upcast to `dyn Middle<()>`, those two supertraits are now the same, leading to a vtable layout with only one trait pointer. This leads to an offset error, and we call the wrong method.
### #135316
This one is a bit more interesting, and is the bulk of the changes in this PR. It's a bit similar, except it uses binder equality instead of normalization to make the compiler get confused about two vtable layouts. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Trait<T, U>: Supertrait<T> + Supertrait<U> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T, U> Trait<T, U> for () {}
fn main() {
(&() as &'static dyn for<'a> Trait<&'static (), &'a ()>
as &'static dyn Trait<&'static (), &'static ()>)
.say_hello(&0);
}
```
When we prepare the vtable for `dyn for<'a> Trait<&'static (), &'a ()>`, we currently consider the PolyTraitRef of the vtable as the key for a supertrait. This leads two two supertraits -- `Supertrait<&'static ()>` and `for<'a> Supertrait<&'a ()>`.
However, we can upcast[^up] without offsetting the vtable from `dyn for<'a> Trait<&'static (), &'a ()>` to `dyn Trait<&'static (), &'static ()>`. This is just instantiating the principal trait ref for a specific `'a = 'static`. However, when considering those supertraits, we now have only one distinct supertrait -- `Supertrait<&'static ()>` (which is deduplicated since there are two supertraits with the same substitutions). This leads to similar offsetting issues, leading to the wrong method being called.
[^up]: I say upcast but this is a cast that is allowed on stable, since it's not changing the vtable at all, just instantiating the binder of the principal trait ref for some lifetime.
The solution here is to recognize that a vtable isn't really meaningfully higher ranked, and to just treat a vtable as corresponding to a `TraitRef` so we can do this deduplication more faithfully. That is to say, the vtable for `dyn for<'a> Tr<'a>` and `dyn Tr<'x>` are always identical, since they both would correspond to a set of free regions on an impl... Do note that `Tr<for<'a> fn(&'a ())>` and `Tr<fn(&'static ())>` are still distinct.
----
There's a bit more that can be cleaned up. In codegen, we can stop using `PolyExistentialTraitRef` basically everywhere. We can also fix SMIR to stop storing `PolyExistentialTraitRef` in its vtable allocations.
As for testing, it's difficult to actually turn this into something that can be tested with `rustc_dump_vtable`, since having multiple supertraits that are identical is a recipe for ambiguity errors. Maybe someone else is more creative with getting that attr to work, since the tests I added being run-pass tests is a bit unsatisfying. Miri also doesn't help here, since it doesn't really generate vtables that are offset by an index in the same way as codegen.
r? `@lcnr` for the vibe check? Or reassign, idk. Maybe let's talk about whether this makes sense.
<sup>(I guess an alternative would also be to not do any deduplication of vtable supertraits (or only a really conservative subset) rather than trying to normalize and deduplicate more faithfully here. Not sure if that works and is sufficient tho.)</sup>
cc `@steffahn` -- ty for the minimizations
cc `@WaffleLapkin` -- since you're overseeing the feature stabilization :3
Fixes#135315Fixes#135316
Target option to require explicit cpu
Some targets have many different CPUs and no generic CPU that can be used as a default. For these targets, the user needs to explicitly specify a CPU through `-C target-cpu=`.
Add an option for targets and an error message if no CPU is set.
This affects the proposed amdgpu and avr targets.
amdgpu tracking issue: #135024
AVR MCP: https://github.com/rust-lang/compiler-team/issues/800
Improve documentation for file locking
Add notes to each method stating that locks get dropped on close.
Clarify the return values of the try methods: they're only defined if
the lock is held via a *different* file handle/descriptor. That goes
along with the documentation that calling them while holding a lock via
the *same* file handle/descriptor may deadlock.
Document the behavior of unlock if no lock is held.
r? `@m-ou-se`
(Documentation changes requested in https://github.com/rust-lang/rust/issues/130994 .)
Remove minor future footgun in `impl Debug for MaybeUninit`
No longer breaks if `MaybeUninit` moves modules (technically it could break if `MaybeUninit` were renamed but realistically that will never happen)
Debug impl originally added in #133282
Introduce a wrapper for "typed valtrees" and properly check the type before extracting the value
This PR adds a new wrapper type `ty::Value` to replace the tuple `(Ty, ty::ValTree)` and become the new canonical representation of type-level constant values.
The value extraction methods `try_to_bits`/`try_to_bool`/`try_to_target_usize` are moved to this new type. For `try_to_bits` in particular, this avoids some redundant matches on `ty::ConstKind::Value`. Furthermore, these methods and will now properly check the type before extracting the value, which fixes some ICEs.
The name `ty::Value` was chosen to be consistent with `ty::Expr`.
Commit 1 should be non-functional and commit 2 adds the type check.
---
fixes https://github.com/rust-lang/rust/issues/131102
supercedes https://github.com/rust-lang/rust/pull/136130
r? `@oli-obk`
cc `@FedericoBruzzone` `@BoxyUwU`
override build profile for bootstrap tests
Using the release profile for bootstrap self tests puts too much load on the CPU and makes it quite hot on `x test bootstrap` invocation for no good reason. It also makes the compilation take longer than usual (see https://github.com/rust-lang/rust/pull/136048#issuecomment-2616908484). This change turns off the release flag for bootstrap self tests.
tests: Skip const OOM tests on aarch64-unknown-linux-gnu
Skip const OOM tests on AArch64 Linux through explicit annotations instead of inside opt-dist.
Intended to avoid confusion in cases like #135952.
Prerequisite for https://github.com/rust-lang/rust/pull/135960.
r? `@Kobzol`
cc `@workingjubilee`
try-job: dist-aarch64-linux
Add `AsyncFn*` to `core` prelude
In https://github.com/rust-lang/rust/pull/132611 these got added to the `std` prelude only, which looks like an oversight.
r? libs-api
cc `@compiler-errors`
uefi: Implement path
This PR is split off from https://github.com/rust-lang/rust/pull/135368 to reduce noise.
UEFI paths can be of 4 types:
1. Absolute Shell Path: Uses shell mappings
2. Absolute Device Path: this is what we want
3. Relative root: path relative to the current root.
4. Relative
Absolute shell path can be identified with `:` and Absolute Device path can be identified with `/`. Relative root path will start with `\`.
The algorithm is mostly taken from edk2 UEFI shell implementation and is somewhat simple. Check for the path type in order.
For Absolute Shell path, use `EFI_SHELL->GetDevicePathFromMap` to get a BorrowedDevicePath for the volume.
For Relative paths, we use the current working directory to construct the new path.
BorrowedDevicePath abstraction is needed to interact with `EFI_SHELL->GetDevicePathFromMap` which returns a Device Path Protocol with the lifetime of UEFI shell.
Absolute Shell paths cannot exist if UEFI shell is missing.
cc `@nicholasbishop`
Cast global variables to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is casted to the default address space before its value is used.
This is necessary for the amdgpu target and others where the default address space for global variables is not 0.
For example `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:
```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) `@alloc_0` }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:
```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) `@alloc_0` to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.
Tracking issue: #135024
Implement `int_from_ascii` (#134821)
Provides unstable `T::from_ascii()` and `T::from_ascii_radix()` for integer types `T`, as drafted in tracking issue #134821.
To deduplicate documentation without additional macros, implementations of `isize` and `usize` no longer delegate to equivalent integer types. After #132870 they are inlined anyway.
Fix a couple Emscripten tests
This fixes a couple Emscripten tests where the correct fix is more or less obvious. A couple UI tests are still broken with this PR:
- `tests/ui/abi/numbers-arithmetic/return-float.rs` (#136197)
- `tests/ui/no_std/no-std-unwind-binary.rs` (haven't debugged yet)
- `tests/ui/test-attrs/test-passed.rs` (haven't debugged this either)
`````@rustbot````` label +T-compiler +O-emscripten
Allow transmuting generic pattern types to and from their base
Pattern types always have the same size as their base type, so we can just ignore the pattern and look at the base type for figuring out whether transmuting is possible.
Clean up uses of the unstable `dwarf_version` option
- Consolidate calculation of the effective value.
- Check the target `DebuginfoKind` instead of using `is_like_msvc`.
- Add the tracking issue to the unstable book page for this feature.
cc #103057
Simplify and consolidate the way we handle construct `OutlivesEnvironment` for lexical region resolution
This is best reviewed commit-by-commit. I tried to consolidate the API for lexical region resolution *first*, then change the API when it was finally behind a single surface.
r? lcnr or reassign
Add notes to each method stating that locks get dropped on close.
Clarify the return values of the try methods: they're only defined if
the lock is held via a *different* file handle/descriptor. That goes
along with the documentation that calling them while holding a lock via
the *same* file handle/descriptor may deadlock.
Document the behavior of unlock if no lock is held.
Cleanup docs for Allocator
This is an attempt to remove ungrammatical constructions and clean up the prose. I've sometimes had to try hard to understand what was being stated, so it is possible that I've misunderstood the original meaning. In particular, I did not see a difference between:
- the borrow-checker lifetime of the allocator type itself.
- as long as at least one of the allocator instance and all of its clones has not been dropped.
optimize slice::ptr_rotate for small rotates
r? `@scottmcm`
This swaps the positions and numberings of algorithms 1 and 2 in `slice::ptr_rotate`, and pulls the entire outer loop into algorithm 3 since it was redundant for the first two. Effectively, `ptr_rotate` now always does the `memcpy`+`memmove`+`memcpy` sequence if the shifts fit into the stack buffer.
With this change, an `IndexMap`-style `move_index` function is optimized correctly.
Assembly comparisons:
- `move_index`, before: https://godbolt.org/z/Kr616KnYM
- `move_index`, after: https://godbolt.org/z/1aoov6j8h
- the code from `#89714`, before: https://godbolt.org/z/Y4zaPxEG6
- the code from `#89714`, after: https://godbolt.org/z/1dPx83axc
related to #89714
some relevant discussion in https://internals.rust-lang.org/t/idea-shift-move-to-efficiently-move-elements-in-a-vec/22184
Behavior tests pass locally. I can't get any consistent microbenchmark results on my machine, but the assembly diffs look promising.
miri: optimize zeroed alloc
When allocating zero-initialized memory in MIR interpretation, rustc allocates zeroed memory, marks it as initialized and then re-zeroes it. Remove the last step.
I don't expect this to have much of an effect on performance normally, but in my case in which I'm creating a large allocation via mmap it gets in the way.
tests: Port `translation` to rmake.rs
Part of #121876.
This PR partially supersedes #129011 and is co-authored with `@Oneirical.`
## Summary
This PR ports `tests/run-make/translation` to rmake.rs. Notable changes from the Makefile version include:
- We now actually fail if the rustc invocations fail... The Makefile did not have `SHELL=/bin/bash -o pipefail`, so all the piped rustc invocations to grep vacuously succeeded, even if the broken ftl test case actually regressed over time and ICEs on current master.
- That test case is converted to assert it fails with a FIXME backlinking to #135817.
- The test coverage is expanded to not ignore windows. Instead, the test now uses symlink capability detection to gate test execution.
- Added some backlinks to relevant tracking issues and the initial translation infra implementation PR.
## Review advice
Best reviewed commit-by-commit.
r? compiler
try-job: aarch64-apple
try-job: i686-mingw
Merge `PatKind::Path` into `PatKind::Expr`
Follow-up to #134228
We always had a duplication where `Path`s could be represented as `PatKind::Path` or `PatKind::Lit(ExprKind::Path)`. We had to handle both everywhere, and still do after #134228, so I'm removing it now.