2424 commits

Author SHA1 Message Date
Jacob Pratt
c19c4b91f5
Rollup merge of #133429 - EnzymeAD:autodiff-middle, r=oli-obk
Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle

This PR should not be merged until the rustc_codegen_llvm part is merged.
I will also alter it a little based on what gets shaved off from the cg_llvm PR,
and address some of the feedback I received in the other PR (including cleanups).

I am already putting it up to
1) Discuss with `@jieyouxu` if there is more work needed to add tests to this and
2) Pray that there is someone reviewing who can tell me why some of my autodiff invocations get lost.

Re 1: My tests require fat-lto. I also modify the compilation pipeline, so if there are any other llvm-ir tests in the same compilation unit, I will likely break them. Luckily there are two groups who currently have the same fat-lto requirement for their GPU code which I have for my autodiff code, and both groups have some plans to enable support for thin-lto. Once either of those efforts pans out, I'll copy it over for this feature. I will also work on not changing the optimization pipeline for functions that are not differentiated, but that will require some thought and engineering, so I think it would be good to be able to run the autodiff tests isolated from the rest for now. Can you guide me here please?
For context, here are some of my tests in the samples folder: https://github.com/EnzymeAD/rustbook

Re 2: This is a pretty serious issue, since it effectively prevents publishing libraries making use of autodiff: https://github.com/EnzymeAD/rust/issues/173. For some reason my dummy code persists till the end, so the code which calls autodiff, deletes the dummy, and inserts the code to compute the derivative never gets executed. To me it looks like the rustc_autodiff attribute just gets dropped, but I don't know WHY? Any help would be super appreciated, as rustc queries look a bit voodoo to me.

Tracking:

- https://github.com/rust-lang/rust/issues/124509

r? `@jieyouxu`
2025-01-31 00:26:30 -05:00
bors
6c1d960d88 Auto merge of #136318 - matthiaskrgr:rollup-a159mzo, r=matthiaskrgr
Rollup of 9 pull requests

Successful merges:

 - #135026 (Cast global variables to default address space)
 - #135475 (uefi: Implement path)
 - #135852 (Add `AsyncFn*` to `core` prelude)
 - #136004 (tests: Skip const OOM tests on aarch64-unknown-linux-gnu)
 - #136157 (override build profile for bootstrap tests)
 - #136180 (Introduce a wrapper for "typed valtrees" and properly check the type before extracting the value)
 - #136256 (Add release notes for 1.84.1)
 - #136271 (Remove minor future footgun in `impl Debug for MaybeUninit`)
 - #136288 (Improve documentation for file locking)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-01-30 23:11:38 +00:00
Matthias Krüger
6a66a270b0
Rollup merge of #136180 - lukas-code:typed-valtree, r=oli-obk
Introduce a wrapper for "typed valtrees" and properly check the type before extracting the value

This PR adds a new wrapper type `ty::Value` to replace the tuple `(Ty, ty::ValTree)` and become the new canonical representation of type-level constant values.

The value extraction methods `try_to_bits`/`try_to_bool`/`try_to_target_usize` are moved to this new type. For `try_to_bits` in particular, this avoids some redundant matches on `ty::ConstKind::Value`. Furthermore, these methods will now properly check the type before extracting the value, which fixes some ICEs.

The name `ty::Value` was chosen to be consistent with `ty::Expr`.

Commit 1 should be non-functional and commit 2 adds the type check.
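
To illustrate the idea of a typed wrapper that checks the type before extracting a value, here is a minimal sketch. `Ty`, `ValTree` and `Value` below are heavily simplified, hypothetical stand-ins and not the actual rustc definitions; only the method names `try_to_bool` and `try_to_target_usize` come from the description above.

```rust
/// Hypothetical, simplified stand-in for rustc's type representation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    Bool,
    U8,
    Usize,
}

/// Hypothetical, simplified stand-in for a valtree.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ValTree {
    /// A scalar leaf, stored as raw bits.
    Leaf(u128),
    /// Placeholder for aggregate values (not modeled in this sketch).
    Branch,
}

/// Pairs a valtree with its type, instead of passing a loose `(Ty, ValTree)` tuple.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Value {
    ty: Ty,
    valtree: ValTree,
}

impl Value {
    /// Returns the boolean only if the type really is `Bool`.
    fn try_to_bool(self) -> Option<bool> {
        if self.ty != Ty::Bool {
            return None;
        }
        match self.valtree {
            ValTree::Leaf(0) => Some(false),
            ValTree::Leaf(1) => Some(true),
            _ => None,
        }
    }

    /// Returns the value only if the type really is `Usize` and the bits fit in 64 bits.
    fn try_to_target_usize(self) -> Option<u64> {
        if self.ty != Ty::Usize {
            return None;
        }
        match self.valtree {
            ValTree::Leaf(bits) => u64::try_from(bits).ok(),
            ValTree::Branch => None,
        }
    }
}
```

In this sketch, a `Value` of type `U8` holding the bits `1` yields `None` from `try_to_bool`, whereas an untyped extraction would happily report `true`.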

---

fixes https://github.com/rust-lang/rust/issues/131102
supersedes https://github.com/rust-lang/rust/pull/136130

r? `@oli-obk`
cc `@FedericoBruzzone` `@BoxyUwU`
2025-01-30 20:47:07 +01:00
Matthias Krüger
89f8abe8b4
Rollup merge of #135026 - Flakebi:global-addrspace, r=saethlin
Cast global variables to default address space

Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is cast to the default address space before its value is used.

This is necessary for the amdgpu target and others where the default address space for global variables is not 0.

For example `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```

For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```

As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.

Tracking issue: #135024
2025-01-30 20:47:02 +01:00
Lukas Markeffsky
10fc0b159e introduce ty::Value
Co-authored-by: FedericoBruzzone <federico.bruzzone.i@gmail.com>
2025-01-30 17:47:44 +01:00
Wesley Wiser
51eaa0d56a Clean up uses of the unstable dwarf_version option
- Consolidate calculation of the effective value.
- Check the target `DebuginfoKind` instead of using `is_like_msvc`.
2025-01-29 21:44:21 -06:00
Manuel Drehwald
1f30517d40 upstream rustc_codegen_ssa/rustc_middle changes for enzyme/autodiff 2025-01-29 21:31:13 -05:00
León Orell Valerian Liehr
7e123e4940
Rollup merge of #136147 - RalfJung:required-target-features-check-not-add, r=workingjubilee
ABI-required target features: warn when they are missing in base CPU

Part of https://github.com/rust-lang/rust/pull/135408:
instead of adding ABI-required features to the target we build for LLVM, check that they are already there. Crucially we check this after applying `-Ctarget-cpu` and `-Ctarget-feature`, by reading `sess.unstable_target_features`. This means we can tweak the ABI target feature check without changing the behavior for any existing user; they will get warnings but the target features behave as before.

The test changes here show that we are un-doing the "add all required target features" part. Without the full #135408, there is no way to take away an ABI-required target feature with `-Ctarget-cpu`, so we cannot yet test that part.

Cc `@workingjubilee`
2025-01-29 03:12:21 +01:00
Ralf Jung
93ee180cfa ABI-required target features: warn when they are missing in base CPU (rather than silently enabling them) 2025-01-28 04:40:42 +01:00
Oli Scherer
b24f674520 Change collect_and_partition_mono_items tuple return type to a struct 2025-01-27 09:38:12 +00:00
Jörn Horstmann
3779b8e32e Consistently use the most significant bit of vector masks
This improves the codegen for vector `select`, `gather`, `scatter` and
boolean reduction intrinsics and fixes rust-lang/portable-simd#316.

The current behavior of most mask operations during llvm codegen is to
truncate the mask vector to <N x i1>, telling llvm to use the least
significant bit. The exception is the `simd_bitmask` intrinsics, which
already used the most significant bit.

Since sse/avx instructions are defined to use the most significant bit,
truncating means that llvm has to insert a left shift to move the bit
into the most significant position, before the mask can actually be
used.

Similarly on aarch64, mask operations like blend work bit by bit,
repeating the least significant bit across the whole lane involves
shifting it into the sign position and then comparing against zero.

By shifting before truncating to <N x i1>, we tell llvm that we only
consider the most significant bit, removing the need for additional
shift instructions in the assembly.
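
The difference between the two conventions can be modeled on scalars. The following is only an illustrative sketch with hypothetical helper names; it models a 4-lane `i32` mask in plain Rust and is not the code emitted by the compiler.

```rust
/// Old convention: truncating a lane to one bit keeps the least significant bit.
fn lane_selected_lsb(lane: i32) -> bool {
    (lane & 1) != 0
}

/// New convention: shift the sign bit down first, so only the most significant bit matters.
fn lane_selected_msb(lane: i32) -> bool {
    ((lane as u32) >> 31) != 0
}

/// Scalar model of a 4-lane vector `select` that consults the most significant bit.
fn select_msb(mask: [i32; 4], a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0; 4];
    for i in 0..4 {
        out[i] = if lane_selected_msb(mask[i]) { a[i] } else { b[i] };
    }
    out
}
```

For well-formed masks (all bits set or all bits clear) both helpers agree; they differ only for values such as `i32::MIN` or `1`, where just one of the two bits is set.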
2025-01-26 16:44:23 +01:00
bors
6365178a6b Auto merge of #128657 - clubby789:optimize-none, r=fee1-dead,WaffleLapkin
Add `#[optimize(none)]`

cc #54882

This extends the `optimize` attribute to add `none`, which corresponds to the LLVM `OptimizeNone` attribute.

Not sure if an MCP is required for this, happy to file one if so.
2025-01-25 05:50:36 +00:00
Matthias Krüger
1e454fe725
Rollup merge of #135581 - EnzymeAD:refactor-codgencx, r=oli-obk
Separate Builder methods from tcx

As part of the autodiff upstreaming we noticed that it would be nice to have various builder methods available without the TypeContext, which prevents the normal CodegenCx from being passed around between threads.
We introduce a SimpleCx which just owns the llvm module and llvm context, to encapsulate them.
The previous CodegenCx now implements deref and forwards access to the llvm module or context to its SimpleCx sub-struct. This gives us a bit more flexibility, because now we can pass (or construct) the SimpleCx in locations where we don't have enough information to construct a CodegenCx, or are not able to pass it around due to the tcx lifetimes (and it not implementing send/sync).

This also introduces an SBuilder, similar to the SimpleCx. The SBuilder uses a SimpleCx, whereas the existing Builder uses the larger CodegenCx. I will push updates to make  implementations generic (where possible) to be implemented once and work for either of the two. I'll also clean up the leftover code.
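
The deref-forwarding pattern described above can be sketched roughly as follows. The struct fields and the `describe` and `emit_header` helpers are hypothetical and exist only for illustration; the real `SimpleCx` and `CodegenCx` hold LLVM handles and a `tcx`, which are modeled here by a `String` and a borrowed `str`.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for the small context that owns the LLVM module and context.
struct SimpleCx {
    module_name: String,
}

impl SimpleCx {
    /// Hypothetical helper that needs nothing beyond the small context.
    fn describe(&self) -> String {
        format!("module `{}`", self.module_name)
    }
}

/// Hypothetical stand-in for the full context; the borrowed field models the `tcx` lifetime.
struct CodegenCx<'tcx> {
    scx: SimpleCx,
    tcx: &'tcx str,
}

impl<'tcx> Deref for CodegenCx<'tcx> {
    type Target = SimpleCx;

    /// Forwards access to the embedded small context.
    fn deref(&self) -> &SimpleCx {
        &self.scx
    }
}

/// Accepts the small context, so it works with either kind of context via deref coercion.
fn emit_header(cx: &SimpleCx) -> String {
    cx.describe()
}
```

With this shape, code that only needs the module can take `&SimpleCx`, and callers holding a `&CodegenCx` can still pass it unchanged.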

`call` is a bit tricky, because it requires a tcx, I probably need to duplicate it after all.

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-01-24 23:25:42 +01:00
Manuel Drehwald
386c233858 Make CodegenCx and Builder generic
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-24 16:05:26 -05:00
clubby789
5ac95a5c47 Rename OptimizeAttr::None to Default 2025-01-24 19:34:01 +00:00
Zalathar
ff48331588 coverage: Make query coverage_ids_info return an Option
This reflects the fact that we can't compute meaningful info for a function
that wasn't instrumented and therefore doesn't have `function_coverage_info`.
2025-01-24 16:13:11 +11:00
Flakebi
b06e840d9e
Add comments about address spaces 2025-01-24 00:37:05 +01:00
clubby789
cd848c9f3e Implement optimize(none) attribute 2025-01-23 17:19:53 +00:00
Ken Matsui
44e8c43976
rustc_codegen_llvm: remove outdated asm-to-obj codegen note
Remove comment about missing integrated assembler handling, which was
removed in commit 02840ca.
2025-01-22 17:58:50 -05:00
Matthias Krüger
e0d74c0667
Rollup merge of #135156 - Zalathar:debuginfo-flags, r=cuviper
Make our `DIFlags` match `LLVMDIFlags` in the LLVM-C API

In order to be able to use a mixture of LLVM-C and C++ bindings for debuginfo, our Rust-side `DIFlags` needs to have the same layout as LLVM-C's `LLVMDIFlags`, and we also need to be able to convert it to the `DIFlags` accepted by LLVM's C++ API.

Internally, LLVM converts between the two types with a simple cast. We can't necessarily rely on that always being true, and LLVM doesn't expose a conversion function, so we have two potential options:
- Convert each bit/subvalue individually
- Statically assert that doing a cast is actually fine

As long as both types do remain the same under the hood (which seems likely), the static-assert-and-cast approach is easier and faster. If the static assertions ever start failing against some future version of LLVM, we'll have to switch over to the convert-each-subvalue approach, which is a bit more error-prone.
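
As a rough illustration of the static-assert-and-cast approach, the sketch below checks at compile time that two sets of flag values agree and then converts with a plain cast-like accessor. It is written in Rust for illustration only and does not claim to show where the PR's assertions actually live; the constant names `OTHER_API_FLAG_*` and the function `to_other_api` are hypothetical, and the two flag values are illustrative.

```rust
/// Hypothetical Rust-side flags type (only two flags are modeled).
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DIFlags(u32);

impl DIFlags {
    const PRIVATE: DIFlags = DIFlags(1);
    const ARTIFICIAL: DIFlags = DIFlags(1 << 6);
}

/// Hypothetical values as seen by the other API; in practice these would come from its headers.
const OTHER_API_FLAG_PRIVATE: u32 = 1;
const OTHER_API_FLAG_ARTIFICIAL: u32 = 1 << 6;

// Compile-time checks: the build fails if the two sides ever disagree.
const _: () = assert!(DIFlags::PRIVATE.0 == OTHER_API_FLAG_PRIVATE);
const _: () = assert!(DIFlags::ARTIFICIAL.0 == OTHER_API_FLAG_ARTIFICIAL);

/// Because the assertions hold, the conversion is just a reinterpretation of the bits.
fn to_other_api(flags: DIFlags) -> u32 {
    flags.0
}
```

The alternative, converting each bit individually, would replace `to_other_api` with one branch per flag, which is where the extra error-proneness comes from.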

---

Extracted from #134009, though this PR ended up choosing the static-assert-and-cast approach over the convert-each-subvalue approach.
2025-01-22 19:29:39 +01:00
bors
ed43cbcb88 Auto merge of #134299 - RalfJung:remove-start, r=compiler-errors
remove support for the (unstable) #[start] attribute

As explained by `@Noratrieb`:
`#[start]` should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.

I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:
- `std`-using cross-platform programs should use `fn main()`. the compiler, together with `std`, will then ensure that code ends up at `main` (by having a platform-specific entrypoint that gets directed through `lang_start` in `std` to `main` - but that's just an implementation detail)
- `no_std` platform-specific programs should use `#![no_main]` and define their own platform-specific entrypoint symbol with `#[no_mangle]`, like `main`, `_start`, `WinMain` or `my_embedded_platform_wants_to_start_here`. most of them only support a single platform anyways, and need cfg for the different platform's ways of passing arguments or other things *anyways*

`#[start]` is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is  a total lie. Those arguments are just stubbed out to zero on ~~Windows~~ wasm, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by `std`, like Windows or Unix-likes. `my_embedded_platform_wants_to_start_here` can't use it, and neither could a libc-less Linux program.
So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving *any* stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.

Note that this feature has **not** been RFCed in the first place.

*This comment was posted [in May](https://github.com/rust-lang/rust/issues/29633#issuecomment-2088596042) and so far nobody spoke up in that issue with a usecase that would require keeping the attribute.*

Closes https://github.com/rust-lang/rust/issues/29633

try-job: x86_64-gnu-nopt
try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
try-job: test-various
2025-01-21 19:46:20 +00:00
Ralf Jung
56c90dc31e remove support for the #[start] attribute 2025-01-21 06:59:15 -07:00
Oli Scherer
dfa4c01b2e Treat undef bytes as equal to any other byte 2025-01-21 08:27:21 +00:00
Zalathar
d10bdafa26 Note that cg_llvm's gimli should match the version used elsewhere 2025-01-21 14:41:44 +11:00
Zalathar
32f1c1d85e Make our DIFlags match LLVMDIFlags in the LLVM-C API 2025-01-21 14:41:44 +11:00
bors
6a64e3b897 Auto merge of #135643 - khuey:135332, r=jieyouxu
When LLVM's location discriminator value limit is exceeded, emit locations with dummy spans instead of dropping them entirely

Dropping them fails `-Zverify-llvm-ir`.

Fixes #135332.

r? `@jieyouxu`
2025-01-20 14:16:22 +00:00
Yotam Ofek
264fa0fc54 Run clippy --fix for unnecessary_map_or lint 2025-01-19 19:15:00 +00:00
Kyle Huey
45ef92731b When LLVM's location discriminator value limit is exceeded, emit locations with dummy spans instead of dropping them entirely
Revert most of #133194 (except the test and the comment fixes). Then refix
not emitting locations at all when the correct location discriminator value
exceeds LLVM's capacity.
2025-01-19 07:17:33 -08:00
bors
0c2c096e1a Auto merge of #135047 - Flakebi:amdgpu-kernel-cc, r=workingjubilee
Add gpu-kernel calling convention

The amdgpu-kernel calling convention was reverted in commit f6b21e90d1 (#120495 and https://github.com/rust-lang/rust-analyzer/pull/16463) due to inactivity in the amdgpu target.

Introduce a `gpu-kernel` calling convention that translates to `ptx_kernel` or `amdgpu_kernel`, depending on the target that rust compiles for.
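
The target-dependent translation can be pictured as a simple lookup. This is a sketch under assumptions: the enum, the function name, and the architecture strings `"nvptx64"` and `"amdgpu"` are hypothetical choices for illustration and are not taken from the PR.

```rust
/// Hypothetical model of the two LLVM kernel calling conventions named above.
#[derive(Debug, PartialEq, Eq)]
enum LlvmCallConv {
    PtxKernel,
    AmdgpuKernel,
}

/// Maps the generic `gpu-kernel` ABI to a concrete convention for the given architecture.
/// Returns `None` for architectures that have no kernel calling convention in this sketch.
fn gpu_kernel_call_conv(arch: &str) -> Option<LlvmCallConv> {
    match arch {
        "nvptx64" => Some(LlvmCallConv::PtxKernel),
        "amdgpu" => Some(LlvmCallConv::AmdgpuKernel),
        _ => None,
    }
}
```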

Tracking issue: #135467
amdgpu target tracking issue: #135024
2025-01-17 04:36:09 +00:00
Flakebi
e7e5202978
Add gpu-kernel calling convention
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1 due to inactivity in the amdgpu
target.

Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
2025-01-16 00:26:55 +01:00
Matthias Krüger
448bad9eba
Rollup merge of #133752 - klensy:cp, r=davidtwco
replace copypasted ModuleLlvm::parse

replaced code same as in bd36e69d25/compiler/rustc_codegen_llvm/src/lib.rs (L426-L445)

except that previously the error message was emitted via `write::llvm_err`, which returned a different error kind; is that still OK?
2025-01-13 15:56:55 +01:00
Matthias Krüger
0bb0f0412f
Rollup merge of #135205 - lqd:bitsets, r=Mark-Simulacrum
Rename `BitSet` to `DenseBitSet`

r? `@Mark-Simulacrum` as you requested this in https://github.com/rust-lang/rust/pull/134438#discussion_r1890659739 after such a confusion.

This PR renames `BitSet` to `DenseBitSet` to make it less obvious as the go-to solution for bitmap needs, as well as make its representation (and positives/negatives) clearer. It also expands the comments there to hopefully make it clearer when it's not a good fit, with some alternative bitsets types.

(This migrates the subtrees cg_gcc and clippy to use the new name in separate commits, for easier review by their respective owners, but they can obvs be squashed)
2025-01-11 18:13:47 +01:00
Matthias Krüger
b8e230a824
Rollup merge of #134030 - folkertdev:min-fn-align, r=workingjubilee
add `-Zmin-function-alignment`

tracking issue: https://github.com/rust-lang/rust/issues/82232

This PR adds the `-Zmin-function-alignment=<align>` flag, that specifies a minimum alignment for all* functions.

### Motivation

This feature is requested by RfL [here](https://github.com/rust-lang/rust/issues/128830):

> i.e. the equivalents of `-fmin-function-alignment` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fmin-function-alignment_003dn), Clang does not support it) / `-falign-functions` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-falign-functions), [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang1-falign-functions)).
>
> For the Linux kernel, the behavior wanted is that of GCC's `-fmin-function-alignment` and Clang's `-falign-functions`, i.e. align all functions, including cold functions.
>
> There is [`feature(fn_align)`](https://github.com/rust-lang/rust/issues/82232), but we need to do it globally.

### Behavior

The `fn_align` feature does not have an RFC. It was decided at the time that it would not be necessary, but maybe we feel differently about that now? In any case, here are the semantics of this flag:

- `-Zmin-function-alignment=<align>` specifies the minimum alignment of all* functions
- the `#[repr(align(<align>))]` attribute can be used to override the function alignment on a per-function basis: when `-Zmin-function-alignment` is specified, the attribute's value is only used when it is higher than the value passed to `-Zmin-function-alignment`.
- the target may decide to use a higher value (e.g. on x86_64 the minimum that LLVM generates is 16)
- The highest supported alignment in rust is `2^29`: I checked a bunch of targets, and they all emit the `.p2align        29` directive for targets that align functions at all (some GPU stuff does not have function alignment).

*: Only with `build-std` would the minimum alignment also be applied to `std` functions.
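
The interaction between the flag and the attribute described in the list above can be sketched as a small function. The names `effective_fn_align`, `is_valid_fn_align` and `MAX_FN_ALIGN` are hypothetical, and the sketch deliberately ignores the case where the target raises the alignment further.

```rust
/// Highest alignment mentioned above: 2^29.
const MAX_FN_ALIGN: u64 = 1 << 29;

/// Checks that an alignment is a power of two no larger than 2^29.
fn is_valid_fn_align(align: u64) -> bool {
    align.is_power_of_two() && align <= MAX_FN_ALIGN
}

/// Combines the global minimum (from the flag) with a per-function attribute value.
/// The attribute only takes effect when it is higher than the global minimum.
fn effective_fn_align(min_flag: Option<u64>, attr: Option<u64>) -> Option<u64> {
    match (min_flag, attr) {
        (Some(flag), Some(attr)) => Some(flag.max(attr)),
        (flag, attr) => flag.or(attr),
    }
}
```

For example, with a global minimum of 16, an attribute value of 8 is ignored while a value of 64 wins.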

---

cc `@ojeda`

r? `@workingjubilee` you were active on the tracking issue
2025-01-11 18:13:45 +01:00
Rémy Rakic
a13354bea0 rename BitSet to DenseBitSet
This should make it clearer that this bitset is dense, with the
advantages and disadvantages that it entails.
2025-01-11 11:34:01 +00:00
Folkert de Vries
47573bf61e
add -Zmin-function-alignment 2025-01-10 22:53:54 +01:00
David Wood
f86169a58f
mir_transform: implement forced inlining
Adds `#[rustc_force_inline]` which is similar to always inlining but
reports an error if the inlining was not possible, and which always
attempts to inline annotated items, regardless of optimisation levels.
It can only be applied to free functions to guarantee that the MIR
inliner will be able to resolve calls.
2025-01-10 18:37:54 +00:00
Guillaume Gomez
020d8758f4
Rollup merge of #135177 - maurer:rename-module, r=nikic
llvm: Ignore error value that is always false

See llvm/llvm-project#121851

For LLVM 20+, this function (`renameModuleForThinLTO`) has no return value. For prior versions of LLVM, this never failed, but had a signature which allowed an error value people were handling.

`@rustbot` label: +llvm-main
r? `@nikic`

Wait a moment before approving while the llvm-main infrastructure picks it up.
2025-01-07 15:30:25 +01:00
Jacob Pratt
4e4a93c2dd
Rollup merge of #131830 - hoodmane:emscripten-wasm-eh, r=workingjubilee
Add support for wasm exception handling to Emscripten target

This is a draft because we need some additional setting for the Emscripten target to select between the old exception handling and the new exception handling. I don't know how to add a setting like that, would appreciate advice from Rust folks. We could maybe choose to use the new exception handling if `-Ctarget-feature=+exception-handling` is passed? I tried this but I get errors from LLVM so I'm not doing it right.
2025-01-06 22:04:13 -05:00
Matthew Maurer
fc32dd49cb llvm: Ignore error value that is always false
See llvm/llvm-project#121851

For LLVM 20+, this function (`renameModuleForThinLTO`) has no return
value. For prior versions of LLVM, this never failed, but had a
signature which allowed an error value people were handling.
2025-01-07 01:02:22 +00:00
Hood Chatham
49c74234a7 Add support for wasm exception handling to Emscripten target
Gated behind an unstable `-Z emscripten-wasm-eh` flag
2025-01-06 10:29:54 +01:00
bors
56f9e6f935 Auto merge of #135140 - jhpratt:rollup-pn2gi84, r=jhpratt
Rollup of 3 pull requests

Successful merges:

 - #135115 (cg_llvm: Use constants for DWARF opcodes, instead of FFI calls)
 - #135118 (Clarified the documentation on `core::iter::from_fn` and `core::iter::successors`)
 - #135121 (Mark `slice::reverse` unstably const)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-01-06 02:30:55 +00:00
bors
feb32c6546 Auto merge of #134794 - RalfJung:abi-required-target-features, r=workingjubilee
Add a notion of "some ABIs require certain target features"

I think I finally found the right shape for the data and checks that I recently added in https://github.com/rust-lang/rust/pull/133099, https://github.com/rust-lang/rust/pull/133417, https://github.com/rust-lang/rust/pull/134337: we have a notion of "this ABI requires the following list of target features, and it is incompatible with the following list of target features". Both `-Ctarget-feature` and `#[target_feature]` are updated to ensure we follow the rules of the ABI.  This removes all the "toggleability" stuff introduced before, though we do keep the notion of a fully "forbidden" target feature -- this is needed to deal with target features that are actual ABI switches, and hence are needed to even compute the list of required target features.

We always explicitly (un)set all required and in-conflict features, just to avoid potential trouble caused by the default features of whatever the base CPU is. We do this *before* applying `-Ctarget-feature` to maintain backward compatibility; this poses a slight risk of missing some implicit feature dependencies in LLVM but has the advantage of not breaking users that deliberately toggle ABI-relevant target features. They get a warning but the feature does get toggled the way they requested.
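
The "required list plus incompatible list" shape can be sketched as follows. The struct and function names are hypothetical, and `soft-float` is used only as a placeholder for an incompatible feature; the one detail taken from this description is that SSE2 is treated as required on x86-64.

```rust
/// Hypothetical model of the constraints an ABI places on target features.
struct FeatureConstraints {
    /// Features that must be enabled for the ABI to work.
    required: &'static [&'static str],
    /// Features that must not be enabled with this ABI.
    incompatible: &'static [&'static str],
}

/// Returns one warning per violated constraint instead of silently changing the feature set.
fn check_features(constraints: &FeatureConstraints, enabled: &[&str]) -> Vec<String> {
    let mut warnings = Vec::new();
    for feature in constraints.required {
        if !enabled.contains(feature) {
            warnings.push(format!("ABI-required feature `{feature}` is missing"));
        }
    }
    for feature in constraints.incompatible {
        if enabled.contains(feature) {
            warnings.push(format!("feature `{feature}` is incompatible with the ABI"));
        }
    }
    warnings
}
```

Running the check after the user's feature toggles are applied means a user who deliberately disables a required feature gets a warning but keeps the behavior they asked for.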

For now, our logic supports x86, ARM, and RISC-V (just like the previous logic did). Unsurprisingly, RISC-V is the nicest. ;)

As a side-effect this also (unstably) allows *enabling* `x87` when that is harmless. I used the opportunity to mark SSE2 as required on x86-64, to better match the actual logic in LLVM and because all x86-64 chips do have SSE2. This infrastructure also prepares us for requiring SSE on x86-32 when we want to use that for our ABI (and for float semantics sanity), see https://github.com/rust-lang/rust/issues/133611, but no such change is happening in this PR.

r? `@workingjubilee`
2025-01-05 23:21:06 +00:00
Zalathar
f50721ebad Explain why the DW_TAG_* constants remain as-is for now 2025-01-05 22:16:49 +11:00
Zalathar
1b62645418 Use constants for DWARF opcodes, instead of FFI calls 2025-01-05 22:16:25 +11:00
Zalathar
e267106104 Use gimli to get the values of DWARF constants needed by codegen
The `gimli` crate is already a dependency of `thorin-dwp`, which is already a
dependency of `rustc_codegen_ssa`.
2025-01-05 22:07:48 +11:00
Ralf Jung
2e64b5352b add dedicated type for ABI target feature constraints 2025-01-05 10:46:30 +01:00
bors
3dc3c524f7 Auto merge of #133990 - Walnut356:static_const, r=workingjubilee
[Debuginfo] Force enum `DISCR_*` to `static const u64` to allow for inspection via LLDB

see [here](486614878) for more info.

This change mainly helps `*-msvc` debugged with LLDB. Currently, LLDB cannot inspect `static` struct fields, so the intended visualization for enums is only borderline functional, and niche enums with ranges of discriminant cannot be determined at all.

LLDB *can* inspect `static const` values (though for whatever reason, non-enum/non-u64 consts don't work).

This change adds `LLVMRustDIBuilderCreateQualifiedType` to the Rust FFI layer to wrap the discr type with a `const` modifier, and forces all generated integer enum `DISCR_*` values to be `u64`s. Those values will only ever be used by debugger visualizers anyway, so it shouldn't be a huge deal, but I left a fixme comment for it just in case. The `tag` also still properly reflects the discriminant type, so no information is lost.
2025-01-04 23:56:29 +00:00
Flakebi
56bf673f0a
Remove range-metadata amdgpu workaround
Range metadata was disabled for amdgpu due to a backend bug. I did not
encounter any problems when removing the workaround to enable range
metadata (tried compiling `core` and `alloc`), so I assume this has
been fixed in LLVM in recent years.

Remove the workaround to re-enable range metadata.
2025-01-02 15:45:04 +01:00
Flakebi
436e4fb647
Cast global variables to default address space
Pointers for variables all need to be in the same address space for
correct compilation. Therefore ensure that even if a global variable is
created in a different address space, it is cast to the default
address space before its value is used.

This is necessary for the amdgpu target and others where the default
address space for global variables is not 0.

For example `core` does not compile in debug mode when not casting the
address space to the default one because it tries to emit the following
(simplified) LLVM IR, containing a type mismatch:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```

For this to compile, we need to insert a constant `addrspacecast` before
we use a global variable:

```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```

As vtables are global variables as well, they are also created with an
`addrspacecast`. In the SSA backend, after a vtable global is created,
metadata is added to it. To add metadata, we need the non-casted global
variable. Therefore we strip away an addrspacecast if there is one, to
get the underlying global.
2025-01-02 15:42:00 +01:00
Manuel Drehwald
d753cbf779 upstream rustc_codegen_llvm changes for enzyme/autodiff 2025-01-01 21:42:45 +01:00