Commit graph

2607 commits

Author SHA1 Message Date
bjorn3
382e4031c2 Remove Linkage::Private
This is the same as Linkage::Internal except that it doesn't emit any
symbol. Some backends may not support it and it isn't all that useful
anyway.
2025-02-07 16:02:19 +00:00
Daniel Paoliello
2a6b27444a Remove dead code from rustc_codegen_llvm and the LLVM wrapper 2025-02-06 16:53:52 -08:00
Zalathar
4385a9e063 Debuginfo for function ZSTs should have alignment of 8 bits, not 1 bit 2025-02-06 23:01:29 +11:00
bors
2f92f050e8 Auto merge of #136471 - safinaskar:parallel, r=SparrowLii
tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc`

tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc`

This is continuation of https://github.com/rust-lang/rust/pull/132282 .

I'm pretty sure I did everything right. In particular, I searched all occurrences of `Lrc` in submodules and made sure that they don't need replacement.

There are other possibilities, through.

We can define `enum Lrc<T> { Rc(Rc<T>), Arc(Arc<T>) }`. Or we can make `Lrc` a union and on every clone we can read from special thread-local variable. Or we can add a generic parameter to `Lrc` and, yes, this parameter will be everywhere across all codebase.

So, if you think we should take some alternative approach, then don't merge this PR. But if it is decided to stick with `Arc`, then, please, merge.

cc "Parallel Rustc Front-end" ( https://github.com/rust-lang/rust/issues/113349 )

r? SparrowLii

`@rustbot` label WG-compiler-parallel
2025-02-06 10:50:05 +00:00
Zalathar
bd855b6c9e coverage: Remove the old code for simplifying counters after MIR opts 2025-02-06 21:44:31 +11:00
Zalathar
20d051ec87 coverage: Defer part of counter-creation until codegen 2025-02-06 21:44:31 +11:00
Zalathar
ee7dc06cf1 coverage: Store BCB node IDs in mappings, and resolve them in codegen
Even though the coverage graph itself is no longer available during codegen,
its nodes can still be used as opaque IDs.
2025-02-06 21:44:29 +11:00
Zalathar
042fd8c24a Remove some unused glob re-exports
These were detected by temporarily making `mod llvm` non-public.
2025-02-06 12:10:45 +11:00
Zalathar
65d7e6937b Remove the mod llvm_ hack, which should no longer be necessary 2025-02-06 12:10:42 +11:00
Manuel Drehwald
70b9ba3d6e fix fwd-mode autodiff case 2025-02-05 18:47:23 -05:00
León Orell Valerian Liehr
75989e98d8
Rollup merge of #136375 - Zalathar:llvm-di-builder, r=workingjubilee
cg_llvm: Replace some DIBuilder wrappers with LLVM-C API bindings (part 1)

Part of #134001, follow-up to #136326, extracted from #134009.

This PR performs an arbitrary subset of the LLVM-C binding migrations from #134009, which should make it less tedious to review. The remaining migrations can occur in one or more subsequent PRs.
2025-02-05 05:03:03 +01:00
bors
3f33b30e19 Auto merge of #135760 - scottmcm:disjoint-bitor, r=WaffleLapkin
Add `unchecked_disjoint_bitor` per ACP373

Following the names from libs-api in https://github.com/rust-lang/libs-team/issues/373#issuecomment-2085686057

Includes a fallback implementation so this doesn't have to update cg_clif or cg_gcc, and overrides it in cg_llvm to use `or disjoint`, which [is available in LLVM 18](https://releases.llvm.org/18.1.0/docs/LangRef.html#or-instruction) so hopefully we don't need any version checks.
2025-02-04 17:46:06 +00:00
Augie Fackler
e9cb36bd0f nvptx64: update default alignment to match LLVM 21
This changed in llvm/llvm-project@91cb8f5d32.
The commit itself is mostly about some intrinsic instructions, but as an
aside it also mentions something about addrspace for tensor memory,
which I believe is what this string is telling us.

@rustbot label: +llvm-main
2025-02-04 10:37:07 -05:00
Ralf Jung
04e7a10af6 intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic 2025-02-04 16:27:29 +01:00
Askar Safin
0a21f1d0a2 tree-wide: parallel: Fully removed all Lrc, replaced with Arc 2025-02-03 13:25:57 +03:00
Scott McMurray
f46e6be190 Handle the case where the or disjoint folds immediately to a constant 2025-02-02 21:04:10 -08:00
Matthias Krüger
f5ae630f10
Rollup merge of #136426 - oli-obk:push-nkpuulwurykn, r=compiler-errors
Explain why we retroactively change a static initializer to have a different type

I keep getting confused about it and in turn confused `@GuillaumeGomez` while trying to explain it badly
2025-02-02 23:06:57 +01:00
Oli Scherer
b89263605a Explain why we retroactively change a static initializer to have a different type 2025-02-01 22:39:38 +00:00
Scott McMurray
4ee1602eab Override disjoint_or in the LLVM backend 2025-01-31 22:29:08 -08:00
Zalathar
c3f2930edc Explain why (some) pointer/length strings are *const c_uchar 2025-02-01 14:14:40 +11:00
Zalathar
5413d2bd6f Add FIXME for auditing optional parameters passed to DIBuilder 2025-02-01 14:14:40 +11:00
Zalathar
8ddd9c38f6 Use LLVMDIBuilderCreateDebugLocation
The LLVM-C binding takes an explicit context, whereas our binding obtained the
context from the scope argument.
2025-02-01 14:14:40 +11:00
Zalathar
949b4673ce Use LLVMDIBuilderCreateLexicalBlockFile 2025-02-01 14:14:40 +11:00
Zalathar
70d41bc711 Use LLVMDIBuilderCreateLexicalBlock 2025-02-01 14:14:40 +11:00
Zalathar
878ab125a1 Use LLVMDIBuilderCreateNameSpace 2025-02-01 14:14:39 +11:00
Zalathar
cd2af2dd9a Use LLVMDIBuilderFinalize 2025-02-01 13:38:12 +11:00
Zalathar
832fcfb64f Introduce DIBuilderBox, an owning pointer to DIBuilder 2025-02-01 13:34:14 +11:00
Ben Kimock
ce7cb312fa Add link attribute for Enzyme's FFI 2025-01-31 21:11:23 -05:00
bors
7f36543a48 Auto merge of #136332 - jhpratt:rollup-aa69d0e, r=jhpratt
Rollup of 9 pull requests

Successful merges:

 - #132156 (When encountering unexpected closure return type, point at return type/expression)
 - #133429 (Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle)
 - #136281 (`rustc_hir_analysis` cleanups)
 - #136297 (Fix a typo in profile-guided-optimization.md)
 - #136300 (atomic: extend compare_and_swap migration docs)
 - #136310 (normalize `*.long-type.txt` paths for compare-mode tests)
 - #136312 (Disable `overflow_delimited_expr` in edition 2024)
 - #136313 (Filter out RPITITs when suggesting unconstrained assoc type on too many generics)
 - #136323 (Fix a typo in conventions.md)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-01-31 09:42:28 +00:00
Jacob Pratt
c19c4b91f5
Rollup merge of #133429 - EnzymeAD:autodiff-middle, r=oli-obk
Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle

This PR should not be merged until the rustc_codegen_llvm part is merged.
I will also alter it a little based on what get's shaved off from the cg_llvm PR,
and address some of the feedback I received in the other PR (including cleanups).

I am putting it already up to
1) Discuss with `@jieyouxu` if there is more work needed to add tests to this and
2) Pray that there is someone reviewing who can tell me why some of my autodiff invocations get lost.

Re 1: My test require fat-lto. I also modify the compilation pipeline. So if there are any other llvm-ir tests in the same compilation unit then I will likely break them. Luckily there are two groups who currently have the same fat-lto requirement for their GPU code which I have for my autodiff code and both groups have some plans to enable support for thin-lto. Once either that work pans out, I'll copy it over for this feature. I will also work on not changing the optimization pipeline for functions not differentiated, but that will require some thoughts and engineering, so I think it would be good to be able to run the autodiff tests isolated from the rest for now. Can you guide me here please?
For context, here are some of my tests in the samples folder: https://github.com/EnzymeAD/rustbook

Re 2: This is a pretty serious issue, since it effectively prevents publishing libraries making use of autodiff: https://github.com/EnzymeAD/rust/issues/173. For some reason my dummy code persists till the end, so the code which calls autodiff, deletes the dummy, and inserts the code to compute the derivative never gets executed. To me it looks like the rustc_autodiff attribute just get's dropped, but I don't know WHY? Any help would be super appreciated, as rustc queries look a bit voodoo to me.

Tracking:

- https://github.com/rust-lang/rust/issues/124509

r? `@jieyouxu`
2025-01-31 00:26:30 -05:00
bors
c37fbd873a Auto merge of #135318 - compiler-errors:vtable-fixes, r=lcnr
Fix deduplication mismatches in vtables leading to upcasting unsoundness

We currently have two cases where subtleties in supertraits can trigger disagreements in the vtable layout, e.g. leading to a different vtable layout being accessed at a callsite compared to what was prepared during unsizing. Namely:

### #135315

In this example, we were not normalizing supertraits when preparing vtables. In the example,

```
trait Supertrait<T> {
    fn _print_numbers(&self, mem: &[usize; 100]) {
        println!("{mem:?}");
    }
}
impl<T> Supertrait<T> for () {}

trait Identity {
    type Selff;
}
impl<Selff> Identity for Selff {
    type Selff = Selff;
}

trait Middle<T>: Supertrait<()> + Supertrait<T> {
    fn say_hello(&self, _: &usize) {
        println!("Hello!");
    }
}
impl<T> Middle<T> for () {}

trait Trait: Middle<<() as Identity>::Selff> {}
impl Trait for () {}

fn main() {
    (&() as &dyn Trait as &dyn Middle<()>).say_hello(&0);
}
```

When we prepare `dyn Trait`, we see a supertrait of `Middle<<() as Identity>::Selff>`, which itself has two supertraits `Supertrait<()>` and `Supertrait<<() as Identity>::Selff>`. These two supertraits are identical, but they are not duplicated because we were using structural equality and *not* considering normalization. This leads to a vtable layout with two trait pointers.

When we upcast to `dyn Middle<()>`, those two supertraits are now the same, leading to a vtable layout with only one trait pointer. This leads to an offset error, and we call the wrong method.

### #135316

This one is a bit more interesting, and is the bulk of the changes in this PR. It's a bit similar, except it uses binder equality instead of normalization to make the compiler get confused about two vtable layouts. In the example,

```
trait Supertrait<T> {
    fn _print_numbers(&self, mem: &[usize; 100]) {
        println!("{mem:?}");
    }
}
impl<T> Supertrait<T> for () {}

trait Trait<T, U>: Supertrait<T> + Supertrait<U> {
    fn say_hello(&self, _: &usize) {
        println!("Hello!");
    }
}
impl<T, U> Trait<T, U> for () {}

fn main() {
    (&() as &'static dyn for<'a> Trait<&'static (), &'a ()>
        as &'static dyn Trait<&'static (), &'static ()>)
        .say_hello(&0);
}
```

When we prepare the vtable for `dyn for<'a> Trait<&'static (), &'a ()>`, we currently consider the PolyTraitRef of the vtable as the key for a supertrait. This leads two two supertraits -- `Supertrait<&'static ()>` and `for<'a> Supertrait<&'a ()>`.

However, we can upcast[^up] without offsetting the vtable from `dyn for<'a> Trait<&'static (), &'a ()>` to `dyn Trait<&'static (), &'static ()>`. This is just instantiating the principal trait ref for a specific `'a = 'static`. However, when considering those supertraits, we now have only one distinct supertrait -- `Supertrait<&'static ()>` (which is deduplicated since there are two supertraits with the same substitutions). This leads to similar offsetting issues, leading to the wrong method being called.

[^up]: I say upcast but this is a cast that is allowed on stable, since it's not changing the vtable at all, just instantiating the binder of the principal trait ref for some lifetime.

The solution here is to recognize that a vtable isn't really meaningfully higher ranked, and to just treat a vtable as corresponding to a `TraitRef` so we can do this deduplication more faithfully. That is to say, the vtable for `dyn for<'a> Tr<'a>` and `dyn Tr<'x>` are always identical, since they both would correspond to a set of free regions on an impl... Do note that `Tr<for<'a> fn(&'a ())>` and `Tr<fn(&'static ())>` are still distinct.

----

There's a bit more that can be cleaned up. In codegen, we can stop using `PolyExistentialTraitRef` basically everywhere. We can also fix SMIR to stop storing `PolyExistentialTraitRef` in its vtable allocations.

As for testing, it's difficult to actually turn this into something that can be tested with `rustc_dump_vtable`, since having multiple supertraits that are identical is a recipe for ambiguity errors. Maybe someone else is more creative with getting that attr to work, since the tests I added being run-pass tests is a bit unsatisfying. Miri also doesn't help here, since it doesn't really generate vtables that are offset by an index in the same way as codegen.

r? `@lcnr` for the vibe check? Or reassign, idk. Maybe let's talk about whether this makes sense.

<sup>(I guess an alternative would also be to not do any deduplication of vtable supertraits (or only a really conservative subset) rather than trying to normalize and deduplicate more faithfully here. Not sure if that works and is sufficient tho.)</sup>

cc `@steffahn` -- ty for the minimizations
cc `@WaffleLapkin` -- since you're overseeing the feature stabilization :3

Fixes #135315
Fixes #135316
2025-01-31 04:09:11 +00:00
bors
6c1d960d88 Auto merge of #136318 - matthiaskrgr:rollup-a159mzo, r=matthiaskrgr
Rollup of 9 pull requests

Successful merges:

 - #135026 (Cast global variables to default address space)
 - #135475 (uefi: Implement path)
 - #135852 (Add `AsyncFn*` to `core` prelude)
 - #136004 (tests: Skip const OOM tests on aarch64-unknown-linux-gnu)
 - #136157 (override build profile for bootstrap tests)
 - #136180 (Introduce a wrapper for "typed valtrees" and properly check the type before extracting the value)
 - #136256 (Add release notes for 1.84.1)
 - #136271 (Remove minor future footgun in `impl Debug for MaybeUninit`)
 - #136288 (Improve documentation for file locking)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-01-30 23:11:38 +00:00
Matthias Krüger
6a66a270b0
Rollup merge of #136180 - lukas-code:typed-valtree, r=oli-obk
Introduce a wrapper for "typed valtrees" and properly check the type before extracting the value

This PR adds a new wrapper type `ty::Value` to replace the tuple `(Ty, ty::ValTree)` and become the new canonical representation of type-level constant values.

The value extraction methods `try_to_bits`/`try_to_bool`/`try_to_target_usize` are moved to this new type. For `try_to_bits` in particular, this avoids some redundant matches on `ty::ConstKind::Value`. Furthermore, these methods and will now properly check the type before extracting the value, which fixes some ICEs.

The name `ty::Value` was chosen to be consistent with `ty::Expr`.

Commit 1 should be non-functional and commit 2 adds the type check.

---

fixes https://github.com/rust-lang/rust/issues/131102
supercedes https://github.com/rust-lang/rust/pull/136130

r? `@oli-obk`
cc `@FedericoBruzzone` `@BoxyUwU`
2025-01-30 20:47:07 +01:00
Matthias Krüger
89f8abe8b4
Rollup merge of #135026 - Flakebi:global-addrspace, r=saethlin
Cast global variables to default address space

Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is casted to the default address space before its value is used.

This is necessary for the amdgpu target and others where the default address space for global variables is not 0.

For example `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:

```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) `@alloc_0` }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```

For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:

```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) `@alloc_0` to ptr) }>, align 8
```

As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.

Tracking issue: #135024
2025-01-30 20:47:02 +01:00
Lukas Markeffsky
10fc0b159e introduce ty::Value
Co-authored-by: FedericoBruzzone <federico.bruzzone.i@gmail.com>
2025-01-30 17:47:44 +01:00
Michael Goulet
9dc41a048d Use ExistentialTraitRef throughout codegen 2025-01-30 15:34:00 +00:00
Michael Goulet
fdc4bd22b7 Do not treat vtable supertraits as distinct when bound with different bound vars 2025-01-30 15:33:58 +00:00
Wesley Wiser
51eaa0d56a Clean up uses of the unstable dwarf_version option
- Consolidate calculation of the effective value.
- Check the target `DebuginfoKind` instead of using `is_like_msvc`.
2025-01-29 21:44:21 -06:00
Manuel Drehwald
1f30517d40 upstream rustc_codegen_ssa/rustc_middle changes for enzyme/autodiff 2025-01-29 21:31:13 -05:00
León Orell Valerian Liehr
7e123e4940
Rollup merge of #136147 - RalfJung:required-target-features-check-not-add, r=workingjubilee
ABI-required target features: warn when they are missing in base CPU

Part of https://github.com/rust-lang/rust/pull/135408:
instead of adding ABI-required features to the target we build for LLVM, check that they are already there. Crucially we check this after applying `-Ctarget-cpu` and `-Ctarget-feature`, by reading `sess.unstable_target_features`. This means we can tweak the ABI target feature check without changing the behavior for any existing user; they will get warnings but the target features behave as before.

The test changes here show that we are un-doing the "add all required target features" part. Without the full #135408, there is no way to take a way an ABI-required target feature with `-Ctarget-cpu`, so we cannot yet test that part.

Cc ``@workingjubilee``
2025-01-29 03:12:21 +01:00
Taiki Endo
93465e6c31 Mark condition/carry bit as clobbered in C-SKY inline assembly 2025-01-29 06:46:05 +09:00
Ralf Jung
93ee180cfa ABI-required target features: warn when they are missing in base CPU (rather than silently enabling them) 2025-01-28 04:40:42 +01:00
Oli Scherer
b24f674520 Change collect_and_partition_mono_items tuple return type to a struct 2025-01-27 09:38:12 +00:00
Jörn Horstmann
3779b8e32e Consistently use the most significant bit of vector masks
This improves the codegen for vector `select`, `gather`, `scatter` and
boolean reduction intrinsics and fixes rust-lang/portable-simd#316.

The current behavior of most mask operations during llvm codegen is to
truncate the mask vector to <N x i1>, telling llvm to use the least
significat bit. The exception is the `simd_bitmask` intrinsics, which
already used the most signifiant bit.

Since sse/avx instructions are defined to use the most significant bit,
truncating means that llvm has to insert a left shift to move the bit
into the most significant position, before the mask can actually be
used.

Similarly on aarch64, mask operations like blend work bit by bit,
repeating the least significant bit across the whole lane involves
shifting it into the sign position and then comparing against zero.

By shifting before truncating to <N x i1>, we tell llvm that we only
consider the most significant bit, removing the need for additional
shift instructions in the assembly.
2025-01-26 16:44:23 +01:00
bors
6365178a6b Auto merge of #128657 - clubby789:optimize-none, r=fee1-dead,WaffleLapkin
Add `#[optimize(none)]`

cc #54882

This extends the `optimize` attribute to add `none`, which corresponds to the LLVM `OptimizeNone` attribute.

Not sure if an MCP is required for this, happy to file one if so.
2025-01-25 05:50:36 +00:00
Matthias Krüger
1e454fe725
Rollup merge of #135581 - EnzymeAD:refactor-codgencx, r=oli-obk
Separate Builder methods from tcx

As part of the autodiff upstreaming we noticed, that it would be nice to have various builder methods available without the TypeContext, which prevents the normal CodegenCx to be passed around between threads.
We introduce a SimpleCx which just owns the llvm module and llvm context, to encapsulate them.
The previous CodegenCx now implements deref and forwards access to the llvm module or context to it's SimpleCx sub-struct. This gives us a bit more flexibility, because now we can pass (or construct) the SimpleCx in locations where we don't have enough information to construct a CodegenCx, or are not able to pass it around due to the tcx lifetimes (and it not implementing send/sync).

This also introduces an SBuilder, similar to the SimpleCx. The SBuilder uses a SimpleCx, whereas the existing Builder uses the larger CodegenCx. I will push updates to make  implementations generic (where possible) to be implemented once and work for either of the two. I'll also clean up the leftover code.

`call` is a bit tricky, because it requires a tcx, I probably need to duplicate it after all.

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-01-24 23:25:42 +01:00
Manuel Drehwald
386c233858 Make CodegenCx and Builder generic
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-24 16:05:26 -05:00
clubby789
5ac95a5c47 Rename OptimizeAttr::None to Default 2025-01-24 19:34:01 +00:00
Zalathar
ff48331588 coverage: Make query coverage_ids_info return an Option
This reflects the fact that we can't compute meaningful info for a function
that wasn't instrumented and therefore doesn't have `function_coverage_info`.
2025-01-24 16:13:11 +11:00
Flakebi
b06e840d9e
Add comments about address spaces 2025-01-24 00:37:05 +01:00