Clean up various LLVM FFI things in codegen_llvm
cc ```@ZuseZ4``` I touched some autodiff parts
The major change of this PR is [bfd88ce](bfd88cead0) which makes `CodegenCx` generic just like `GenericBuilder`
The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks.
best reviewed commit-by-commit
Rollup of 12 pull requests
Successful merges:
- #135767 (Future incompatibility warning `unsupported_fn_ptr_calling_conventions`: Also warn in dependencies)
- #137852 (Remove layouting dead code for non-array SIMD types.)
- #137863 (Fix pretty printing of unsafe binders)
- #137882 (do not build additional stage on compiler paths)
- #137894 (Revert "store ScalarPair via memset when one side is undef and the other side can be memset")
- #137902 (Make `ast::TokenKind` more like `lexer::TokenKind`)
- #137921 (Subtree update of `rust-analyzer`)
- #137922 (A few cleanups after the removal of `cfg(not(parallel))`)
- #137939 (fix order on shl impl)
- #137946 (Fix docker run-local docs)
- #137955 (Always allow rustdoc-json tests to contain long lines)
- #137958 (triagebot.toml: Don't label `test/rustdoc-json` as A-rustdoc-search)
r? `@ghost`
`@rustbot` modify labels: rollup
That unstable feature completed fcp-close, so the compiler needs to be
migrated away to allow its removal. In this case, `cg_llvm` and `cg_gcc`
were using raw entries to optimize their `const_str_cache` lookup and
insertion. We can change that to separate `get` and (on miss) `insert`
calls, so we still have the fast path avoiding string allocation when
the cache hits.
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was
or contained an extern type - in my experimental implementation of
rust-lang/rfcs#3729, this isn't possible as the `Target` associated
type's `?Sized` bound cannot be relaxed backwards compatibly (unless we
come up with some way of doing this).
In later pull requests with the rust-lang/rfcs#3729 implementation,
breakage like this could only occur for nightly users relying on the
`extern_types` feature.
Upstreaming this to avoid needing to keep carrying this patch locally,
and I think it'll necessarily need to change eventually.
Fix deduplication mismatches in vtables leading to upcasting unsoundness
We currently have two cases where subtleties in supertraits can trigger disagreements in the vtable layout, e.g. leading to a different vtable layout being accessed at a callsite compared to what was prepared during unsizing. Namely:
### #135315
In this example, we were not normalizing supertraits when preparing vtables. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Identity {
type Selff;
}
impl<Selff> Identity for Selff {
type Selff = Selff;
}
trait Middle<T>: Supertrait<()> + Supertrait<T> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T> Middle<T> for () {}
trait Trait: Middle<<() as Identity>::Selff> {}
impl Trait for () {}
fn main() {
(&() as &dyn Trait as &dyn Middle<()>).say_hello(&0);
}
```
When we prepare `dyn Trait`, we see a supertrait of `Middle<<() as Identity>::Selff>`, which itself has two supertraits `Supertrait<()>` and `Supertrait<<() as Identity>::Selff>`. These two supertraits are identical, but they are not duplicated because we were using structural equality and *not* considering normalization. This leads to a vtable layout with two trait pointers.
When we upcast to `dyn Middle<()>`, those two supertraits are now the same, leading to a vtable layout with only one trait pointer. This leads to an offset error, and we call the wrong method.
### #135316
This one is a bit more interesting, and is the bulk of the changes in this PR. It's a bit similar, except it uses binder equality instead of normalization to make the compiler get confused about two vtable layouts. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Trait<T, U>: Supertrait<T> + Supertrait<U> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T, U> Trait<T, U> for () {}
fn main() {
(&() as &'static dyn for<'a> Trait<&'static (), &'a ()>
as &'static dyn Trait<&'static (), &'static ()>)
.say_hello(&0);
}
```
When we prepare the vtable for `dyn for<'a> Trait<&'static (), &'a ()>`, we currently consider the PolyTraitRef of the vtable as the key for a supertrait. This leads two two supertraits -- `Supertrait<&'static ()>` and `for<'a> Supertrait<&'a ()>`.
However, we can upcast[^up] without offsetting the vtable from `dyn for<'a> Trait<&'static (), &'a ()>` to `dyn Trait<&'static (), &'static ()>`. This is just instantiating the principal trait ref for a specific `'a = 'static`. However, when considering those supertraits, we now have only one distinct supertrait -- `Supertrait<&'static ()>` (which is deduplicated since there are two supertraits with the same substitutions). This leads to similar offsetting issues, leading to the wrong method being called.
[^up]: I say upcast but this is a cast that is allowed on stable, since it's not changing the vtable at all, just instantiating the binder of the principal trait ref for some lifetime.
The solution here is to recognize that a vtable isn't really meaningfully higher ranked, and to just treat a vtable as corresponding to a `TraitRef` so we can do this deduplication more faithfully. That is to say, the vtable for `dyn for<'a> Tr<'a>` and `dyn Tr<'x>` are always identical, since they both would correspond to a set of free regions on an impl... Do note that `Tr<for<'a> fn(&'a ())>` and `Tr<fn(&'static ())>` are still distinct.
----
There's a bit more that can be cleaned up. In codegen, we can stop using `PolyExistentialTraitRef` basically everywhere. We can also fix SMIR to stop storing `PolyExistentialTraitRef` in its vtable allocations.
As for testing, it's difficult to actually turn this into something that can be tested with `rustc_dump_vtable`, since having multiple supertraits that are identical is a recipe for ambiguity errors. Maybe someone else is more creative with getting that attr to work, since the tests I added being run-pass tests is a bit unsatisfying. Miri also doesn't help here, since it doesn't really generate vtables that are offset by an index in the same way as codegen.
r? `@lcnr` for the vibe check? Or reassign, idk. Maybe let's talk about whether this makes sense.
<sup>(I guess an alternative would also be to not do any deduplication of vtable supertraits (or only a really conservative subset) rather than trying to normalize and deduplicate more faithfully here. Not sure if that works and is sufficient tho.)</sup>
cc `@steffahn` -- ty for the minimizations
cc `@WaffleLapkin` -- since you're overseeing the feature stabilization :3
Fixes#135315Fixes#135316
Cast global variables to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is casted to the default address space before its value is used.
This is necessary for the amdgpu target and others where the default address space for global variables is not 0.
For example `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:
```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) `@alloc_0` }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:
```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) `@alloc_0` to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.
Tracking issue: #135024
Pointers for variables all need to be in the same address space for
correct compilation. Therefore ensure that even if a global variable is
created in a different address space, it is casted to the default
address space before its value is used.
This is necessary for the amdgpu target and others where the default
address space for global variables is not 0.
For example `core` does not compile in debug mode when not casting the
address space to the default one because it tries to emit the following
(simplified) LLVM IR, containing a type mismatch:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before
we use a global variable:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an
`addrspacecast`. In the SSA backend, after a vtable global is created,
metadata is added to it. To add metadata, we need the non-casted global
variable. Therefore we strip away an addrspacecast if there is one, to
get the underlying global.
llvm: replace some deprecated functions
`LLVMMDStringInContext` and `LLVMMDNodeInContext` are deprecated, replace them with `LLVMMDStringInContext2` and `LLVMMDNodeInContext2`.
Also replace `Value` with `Metadata` in some function signatures for better consistency.
It's crazy to have the integer methods in something close to random
order.
The reordering makes the gaps clear: `const_i64`, `const_i128`,
`const_isize`, and `const_u16`. I guess they just aren't needed.
Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`.
Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit
changes the latter to `XyzCodegenMethods`, for consistency.
const vector passed through to codegen
This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic
#118209
r? RalfJung
This makes sure that &[] is just as efficient as indirecting through
unsafe code (from_raw_parts). No new stable guarantee is intended about
whether or not we do this, this is just an optimization.
Co-authored-by: Ralf Jung <post@ralfj.de>
cleanup: remove pointee types
This can't be merged until the oldest LLVM version we support uses opaque pointers, which will be the case after #114148. (Also note `-Cllvm-args="-opaque-pointers=0"` can technically be used in LLVM 15, though I don't think we should support that configuration.)
I initially hoped this would provide some minor perf win, but in https://github.com/rust-lang/rust/pull/105412#issuecomment-1341224450 it had very little impact, so this is only valuable as a cleanup.
As a followup, this will enable #96242 to be resolved.
r? `@ghost`
`@rustbot` label S-blocked
In cases where it is legal, we should prefer poison values over
undef values.
This replaces undef with poison for aggregate construction and
for uninhabited types. There are more places where we can likely
use poison, but I wanted to stay conservative to start with.
In particular the aggregate case is important for newer LLVM
versions, which are not able to handle an undef base value during
early optimization due to poison-propagation concerns.