Avoid wrapping constant allocations in packed structs when not necessary
This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes.
try-job: armhf-gnu
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.
These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.
I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.
r? `@scottmcm`
Some tests expect to be compiled for a specific CPU or require certain
target features to be present (or absent). These tests work fine with
default CPUs but fail in downstream builds for RHEL and Fedora, where
we use non-default CPUs such as z13 on s390x, pwr9 on ppc64le, or
x86-64-v2/x86-64-v3 on x86_64.
attempt to support `BinaryFormat::Xcoff` in `naked_asm!`
Fixes https://github.com/rust-lang/rust/issues/137219
So, the inline assembly support for xcoff is extremely limited. The LLVM [XCOFFAsmParser](1b25c0c4da/llvm/lib/MC/MCParser/XCOFFAsmParser.cpp) does not support many of the attributes that LLVM itself emits, and that should exist based on [the assembler docs](https://www.ibm.com/docs/en/ssw_aix_71/assembler/assembler_pdf.pdf). It also does accept some that should not exist based on those docs.
So, I've tried to do the best I can given those limitations. At least it's better than emitting the directives for elf and having that fail somewhere deep in LLVM. Given that inline assembly for this target is incomplete (under `asm_experimental_arch`), I think that's OK (and again I don't see how we can do better given the limitations in LLVM).
r? ```@Noratrieb``` (given that you reviewed https://github.com/rust-lang/rust/pull/136637)
It seems reasonable to ping the [`powerpc64-ibm-aix` target maintainers](https://doc.rust-lang.org/rustc/platform-support/aix.html), hopefully they have thoughts too: ```@daltenty``` ```@gilamn5tr```
Remove i586-pc-windows-msvc
See [MCP 840](https://github.com/rust-lang/compiler-team/issues/840).
I left a specialized error message that should help users that hit this in the wild (for example, because they use it in their CI).
```
error: Error loading target specification: the `i586-pc-windows-msvc` target has been removed. Use the `i686-pc-windows-msvc` target instead.
Windows 10 (the minimum required OS version) requires a CPU baseline of at least i686 so you can safely switch. Run `rustc --print target-list` for a list of built-in targets
```
``@workingjubilee`` ``@calebzulawski`` fyi portable-simd uses this target in CI, if you wanna remove it already before this happens
Remove `:` from `stack-protector-heuristics-effect.rs` Filecheck Pattern
With function sections, the assembly label does not necessarily end in `:`.
Remove trailing `:` to be more consistent with the rest of the existing Filecheck patterns.
```
// CHECK-LABEL: local_string_addr_taken
#[no_mangle]
pub fn local_string_addr_taken(f: fn(&String)) {
let x = String::new();
f(&x);
```
Make x86 QNX target name consistent with other Rust targets
Rename target to be consistent with other Rust targets: Use `i686` instead of `i586`
See also
- #136495
- #109173
CC: `@jonathanpallant` `@japaric` `@gh-tr` `@samkearney`
Create a generic AVR target: avr-none
This commit removes the `avr-unknown-gnu-atmega328` target and replaces it with a more generic `avr-none` variant that must be specialized using `-C target-cpu` (e.g. `-C target-cpu=atmega328p`).
Seizing the day, I'm adding myself as the maintainer of this target - I've been already fixing the bugs anyway, might as well make it official 🙂
Related discussions:
- https://github.com/rust-lang/rust/pull/131171
- https://github.com/rust-lang/compiler-team/issues/800
try-job: x86_64-gnu-debug
This commit removes the `avr-unknown-gnu-atmega328` target and replaces
it with a more generic `avr-none` variant that must be specialized with
the `-C target-cpu` flag (e.g. `-C target-cpu=atmega328p`).
Stabilize target_feature_11
# Stabilization report
This is an updated version of https://github.com/rust-lang/rust/pull/116114, which is itself a redo of https://github.com/rust-lang/rust/pull/99767. Most of this commit and report were copied from those PRs. Thanks ```@LeSeulArtichaut``` and ```@calebzulawski!```
## Summary
Allows for safe functions to be marked with `#[target_feature]` attributes.
Functions marked with `#[target_feature]` are generally considered as unsafe functions: they are unsafe to call, cannot *generally* be assigned to safe function pointers, and don't implement the `Fn*` traits.
However, calling them from other `#[target_feature]` functions with a superset of features is safe.
```rust
// Demonstration function
#[target_feature(enable = "avx2")]
fn avx2() {}
fn foo() {
// Calling `avx2` here is unsafe, as we must ensure
// that AVX is available first.
unsafe {
avx2();
}
}
#[target_feature(enable = "avx2")]
fn bar() {
// Calling `avx2` here is safe.
avx2();
}
```
Moreover, once https://github.com/rust-lang/rust/pull/135504 is merged, they can be converted to safe function pointers in a context in which calling them is safe:
```rust
// Demonstration function
#[target_feature(enable = "avx2")]
fn avx2() {}
fn foo() -> fn() {
// Converting `avx2` to fn() is a compilation error here.
avx2
}
#[target_feature(enable = "avx2")]
fn bar() -> fn() {
// `avx2` coerces to fn() here
avx2
}
```
See the section "Closures" below for justification of this behaviour.
## Test cases
Tests for this feature can be found in [`tests/ui/target_feature/`](f6cb952dc1/tests/ui/target-feature).
## Edge cases
### Closures
* [target-feature 1.1: should closures inherit target-feature annotations? #73631](https://github.com/rust-lang/rust/issues/73631)
Closures defined inside functions marked with #[target_feature] inherit the target features of their parent function. They can still be assigned to safe function pointers and implement the appropriate `Fn*` traits.
```rust
#[target_feature(enable = "avx2")]
fn qux() {
let my_closure = || avx2(); // this call to `avx2` is safe
let f: fn() = my_closure;
}
```
This means that in order to call a function with #[target_feature], you must guarantee that the target-feature is available while the function, any closures defined inside it, as well as any safe function pointers obtained from target-feature functions inside it, execute.
This is usually ensured because target features are assumed to never disappear, and:
- on any unsafe call to a `#[target_feature]` function, presence of the target feature is guaranteed by the programmer through the safety requirements of the unsafe call.
- on any safe call, this is guaranteed recursively by the caller.
If you work in an environment where target features can be disabled, it is your responsibility to ensure that no code inside a target feature function (including inside a closure) runs after this (until the feature is enabled again).
**Note:** this has an effect on existing code, as nowadays closures do not inherit features from the enclosing function, and thus this strengthens a safety requirement. It was originally proposed in #73631 to solve this by adding a new type of UB: “taking a target feature away from your process after having run code that uses that target feature is UB” .
This was motivated by userspace code already assuming in a few places that CPU features never disappear from a program during execution (see i.e. 2e29bdf908/crates/std_detect/src/detect/arch/x86.rs); however, concerns were raised in the context of the Linux kernel; thus, we propose to relax that requirement to "causing the set of usable features to be reduced is unsafe; when doing so, the programmer is required to ensure that no closures or safe fn pointers that use removed features are still in scope".
* [Fix #[inline(always)] on closures with target feature 1.1 #111836](https://github.com/rust-lang/rust/pull/111836)
Closures accept `#[inline(always)]`, even within functions marked with `#[target_feature]`. Since these attributes conflict, `#[inline(always)]` wins out to maintain compatibility.
### ABI concerns
* [The extern "C" ABI of SIMD vector types depends on target features #116558](https://github.com/rust-lang/rust/issues/116558)
The ABI of some types can change when compiling a function with different target features. This could have introduced unsoundness with target_feature_11, but recent fixes (#133102, #132173) either make those situations invalid or make the ABI no longer dependent on features. Thus, those issues should no longer occur.
### Special functions
The `#[target_feature]` attribute is forbidden from a variety of special functions, such as main, current and future lang items (e.g. `#[start]`, `#[panic_handler]`), safe default trait implementations and safe trait methods.
This was not disallowed at the time of the first stabilization PR for target_features_11, and resulted in the following issues/PRs:
* [`#[target_feature]` is allowed on `main` #108645](https://github.com/rust-lang/rust/issues/108645)
* [`#[target_feature]` is allowed on default implementations #108646](https://github.com/rust-lang/rust/issues/108646)
* [#[target_feature] is allowed on #[panic_handler] with target_feature 1.1 #109411](https://github.com/rust-lang/rust/issues/109411)
* [Prevent using `#[target_feature]` on lang item functions #115910](https://github.com/rust-lang/rust/pull/115910)
## Documentation
* Reference: [Document the `target_feature_11` feature reference#1181](https://github.com/rust-lang/reference/pull/1181)
---
cc tracking issue https://github.com/rust-lang/rust/issues/69098
cc ```@workingjubilee```
cc ```@RalfJung```
r? ```@rust-lang/lang```
tests: `-Copt-level=3` instead of `-O` in assembly tests
An effective blocker for redefining the meaning of `-O` is to stop reusing this somewhat ambiguous alias in our own assembly test suite. The choice between `-Copt-level=2` and `-Copt-level=3` is arbitrary for most of our tests. In most cases it makes no difference, so I set most of them to `-Copt-level=3`, as it will lead to slightly more "normalized" assembly.
Add amdgpu target
Add amdgpu target to rustc and enable the LLVM target.
Fix compiling `core` with the amdgpu:
The amdgpu backend makes heavy use of different address spaces. This
leads to situations, where a pointer in one addrspace needs to be casted
to a pointer in a different addrspace. `bitcast` is invalid for this
case, `addrspacecast` needs to be used.
Fix compilation failures that created bitcasts for such cases by
creating pointer casts (which creates an `addrspacecast` under the hood)
instead.
MCP: https://github.com/rust-lang/compiler-team/issues/823
Tracking issue: #135024
Kinda related to the original amdgpu tracking issue #51575 (though that one has been closed for a while).
Pick the max DWARF version when LTO'ing modules with different versions
Currently, when rustc compiles code with `-Clto` enabled that was built
with different choices for `-Zdwarf-version`, a warning will be
reported. It's very easy to observe this by compiling most anything (eg,
"hello world") and specifying `-Clto -Zdwarf-version=5` since the
standard library is distributed with `-Zdwarf-version=4`.
This behavior isn't actually useful for a few reasons:
- From observation, LLVM chooses to pick the highest DWARF version
anyway after issuing the warning.
- Clang specifies that in this case, the max version should be picked
without a warning and as a general principle, we want to support
x-lang LTO with Clang which implies using the same module flag merge
behaviors.
- Debuggers need to be able to handle a variety of versions within the
same debugging session as you can easily have some parts of a binary
(or some dynamic libraries within an application) all compiled with
different DWARF versions.
This commit changes the module flag merge behavior to match Clang and
use the highest version of DWARF. It also adds a test to ensure this
behavior is respected in the case of two crates being LTO'd together and
adds a test to ensure no warning is printed.
Fixes#130041 which fails due to these warnings being printed
cc #103057
Avoid using make_direct_deprecated() in extern "ptx-kernel"
This method will be removed in the future as it produces a broken ABI that depends on cg_llvm implementation details. After this PR wasm32-unknown-unknown is the only remaining user of make_direct_deprecated().
Fixes https://github.com/rust-lang/rust/issues/117271
Blocks https://github.com/rust-lang/rust/issues/38788
Consistently use the highest bit of vector masks when converting to i1 vectors
This improves the codegen for vector `select`, `gather`, `scatter` and boolean reduction intrinsics and fixesrust-lang/portable-simd#316.
The current behavior of most mask operations during llvm codegen is to truncate the mask vector to <N x i1>, telling llvm to use the least significat bit. The exception is the `simd_bitmask` intrinsics, which already used the most signifiant bit.
Since sse/avx instructions are defined to use the most significant bit, truncating means that llvm has to insert a left shift to move the bit into the most significant position, before the mask can actually be used.
Similarly on aarch64, mask operations like blend work bit by bit, repeating the least significant bit across the whole lane involves shifting it into the sign position and then comparing against zero.
By shifting before truncating to <N x i1>, we tell llvm that we only consider the most significant bit, removing the need for additional shift instructions in the assembly.
This improves the codegen for vector `select`, `gather`, `scatter` and
boolean reduction intrinsics and fixesrust-lang/portable-simd#316.
The current behavior of most mask operations during llvm codegen is to
truncate the mask vector to <N x i1>, telling llvm to use the least
significat bit. The exception is the `simd_bitmask` intrinsics, which
already used the most signifiant bit.
Since sse/avx instructions are defined to use the most significant bit,
truncating means that llvm has to insert a left shift to move the bit
into the most significant position, before the mask can actually be
used.
Similarly on aarch64, mask operations like blend work bit by bit,
repeating the least significant bit across the whole lane involves
shifting it into the sign position and then comparing against zero.
By shifting before truncating to <N x i1>, we tell llvm that we only
consider the most significant bit, removing the need for additional
shift instructions in the assembly.
Fix tests on LLVM 20
For sparcv8plus.rs, duplicate the test for LLVM 19 and LLVM 20. LLVM 20 resolves one of the FIXME in the test.
For x86_64-bigint-add.rs split the check lines for LLVM 19 and LLVM 20. The difference in codegen here is due to a difference in unroll factor, which I believe is not what the test is interested in.
Fixes https://github.com/rust-lang/rust/issues/132957.
Fixes https://github.com/rust-lang/rust/issues/133754.