This commit fixes an issue with #74695 where the fptosi and fptoui
specializations on wasm were accidentally used on vector types by the
`simd_cast` intrinsic. This issue showed up as broken CI for the stdsimd
crate. Here this commit simply skips the specialization on vector kinds
flowing into `fpto{s,u}i`.
This commit improves code generation for WebAssembly targets when
translating floating-point to integer casts. This improvement is only relevant
when the `nontrapping-fptoint` feature is not enabled, but the feature
is not enabled by default right now. Additionally this improvement only
affects safe casts since unchecked casts were improved in #74659.
Some more background for this issue is present on #73591, but the
general gist of the issue is that in LLVM the `fptosi` and `fptoui`
instructions are defined to return an `undef` value if they execute on
out-of-bounds values; they notably do not trap. To implement these
instructions for WebAssembly the LLVM backend must therefore generate
quite a few instructions before executing `i32.trunc_f32_s` (for
example) because this WebAssembly instruction traps on out-of-bounds
values. This codegen into wasm instructions happens very late in the
code generator, so what ends up happening is that rustc inserts its own
codegen to implement Rust's saturating semantics, and then LLVM also
inserts its own codegen to make sure that the `fptosi` instruction
doesn't trap. Overall this means that a function like this:
    #[no_mangle]
    pub unsafe extern "C" fn cast(x: f64) -> u32 {
        x as u32
    }
will generate this WebAssembly today:
    (func $cast (type 0) (param f64) (result i32)
      (local i32 i32)
      local.get 0
      f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
      f64.gt
      local.set 1
      block ;; label = @1
        block ;; label = @2
          local.get 0
          f64.const 0x0p+0 (;=0;)
          local.get 0
          f64.const 0x0p+0 (;=0;)
          f64.gt
          select
          local.tee 0
          f64.const 0x1p+32 (;=4.29497e+09;)
          f64.lt
          local.get 0
          f64.const 0x0p+0 (;=0;)
          f64.ge
          i32.and
          i32.eqz
          br_if 0 (;@2;)
          local.get 0
          i32.trunc_f64_u
          local.set 2
          br 1 (;@1;)
        end
        i32.const 0
        local.set 2
      end
      i32.const -1
      local.get 2
      local.get 1
      select)
This PR improves the situation by updating the code generation for
float-to-int conversions in rustc, specifically only for WebAssembly
targets and only for some situations (float-to-u8 casts still generate
poor code). The fix here is to use basic blocks and control flow to avoid
speculatively executing `fptosi`, and to use LLVM's raw intrinsic for
the WebAssembly instruction instead. This effectively extends
the support added in #74659 to checked casts. After this commit the
codegen for the above Rust function looks like:
    (func $cast (type 0) (param f64) (result i32)
      (local i32)
      block ;; label = @1
        local.get 0
        f64.const 0x0p+0 (;=0;)
        f64.ge
        local.tee 1
        i32.const 1
        i32.xor
        br_if 0 (;@1;)
        local.get 0
        f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
        f64.le
        i32.eqz
        br_if 0 (;@1;)
        local.get 0
        i32.trunc_f64_u
        return
      end
      i32.const -1
      i32.const 0
      local.get 1
      select)
For reference, in Rust 1.44, which did not have saturating
float-to-integer casts, the codegen LLVM would emit is:
    (func $cast (type 0) (param f64) (result i32)
      block ;; label = @1
        local.get 0
        f64.const 0x1p+32 (;=4.29497e+09;)
        f64.lt
        local.get 0
        f64.const 0x0p+0 (;=0;)
        f64.ge
        i32.and
        i32.eqz
        br_if 0 (;@1;)
        local.get 0
        i32.trunc_f64_u
        return
      end
      i32.const 0)
So we're relatively close to the original codegen, although it's
slightly different because the semantics of the cast changed: we're now
emulating a saturating instruction such as `i32.trunc_sat_f32_s` rather
than always replacing out-of-bounds values with zero.
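As a rough Rust-level sketch of the control flow in the improved wasm output above: the range checks guard the truncation, and the out-of-range result is selected afterwards. The helper name `saturating_f64_to_u32` is hypothetical and is not part of rustc; the in-range `as` cast merely stands in for `i32.trunc_f64_u`.

```rust
/// Hypothetical helper mirroring the branch structure of the improved wasm.
fn saturating_f64_to_u32(x: f64) -> u32 {
    // False for negative values and for NaN (any comparison with NaN is false).
    let ge_zero = x >= 0.0;
    if ge_zero && x <= u32::MAX as f64 {
        // In range, so a plain truncation cannot trap here.
        // The `as` cast stands in for `i32.trunc_f64_u`.
        return x as u32;
    }
    // Out of range: too-large values saturate to MAX; negatives and NaN give 0.
    if ge_zero { u32::MAX } else { 0 }
}
```

The final `if` corresponds to the trailing `select` between `-1` (all bits set, that is `u32::MAX`) and `0`.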
There is still work that could be done to improve casts such as `f32` to
`u8`. That form of cast still uses the `fptosi` instruction which
generates lots of branch-y code. This seems less important to tackle now
though. In the meantime this should take care of most use cases of
floating-point conversion and as a result I'm going to speculate that
this...
Closes #73591
WebAssembly supports saturating floating point to integer casts behind a
target feature. The feature is already available on many browsers.
Beginning with 1.45 Rust will start defining the behavior of floating
point to integer casts to be saturating as well. To do this, Rust
constructs additional checks on top of the `fptoui` / `fptosi`
instructions it emits. Here we introduce the possibility for the codegen
backend to construct saturating casts itself, and only fall back to
constructing the checks ourselves if that is not possible.
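The "backend first, checks as fallback" shape can be sketched at the value level as follows. This is only an illustrative model: the `Backend` trait, the two unit structs, and `cast_f64_to_u32` are hypothetical names, and the real change operates on IR values rather than on `f64` numbers.

```rust
/// Hypothetical, value-level stand-in for a codegen backend.
trait Backend {
    /// `Some(result)` if the backend can perform a native saturating cast,
    /// `None` if the caller has to build the checks itself.
    fn fptoui_sat(&self, x: f64) -> Option<u32>;
}

/// Models a target with the saturating-cast feature enabled.
struct WithSatFeature;
/// Models a target without it.
struct WithoutSatFeature;

impl Backend for WithSatFeature {
    fn fptoui_sat(&self, x: f64) -> Option<u32> {
        // Rust's `as` is saturating (1.45+), so it stands in for the
        // native saturating instruction in this model.
        Some(x as u32)
    }
}

impl Backend for WithoutSatFeature {
    fn fptoui_sat(&self, _x: f64) -> Option<u32> {
        None
    }
}

/// Prefer the backend's saturating cast; otherwise fall back to explicit checks.
fn cast_f64_to_u32<B: Backend>(backend: &B, x: f64) -> u32 {
    backend.fptoui_sat(x).unwrap_or_else(|| {
        if x.is_nan() || x <= 0.0 {
            0
        } else if x >= u32::MAX as f64 {
            u32::MAX
        } else {
            // SAFETY: `x` is finite, positive, and below `u32::MAX` here,
            // so the truncated value fits in a `u32`.
            unsafe { x.to_int_unchecked::<u32>() }
        }
    })
}
```

Both model backends are expected to agree on every input; only the path taken differs.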
Implement the va args in codegen for AAPCS. This will be used as the
default va_args implementation for AArch64 rather than the LLVM IR
va_args one that is currently used.
Copyright (c) 2020, Arm Limited.
This initial version only injects counters at the top of each function.
Rust Coverage will require injecting additional counters at each
conditional code branch.
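To illustrate what a function-entry counter can and cannot observe, here is a hand-written sketch. The static `CALLS_TO_ABS` and the function `abs` are hypothetical; the real counters are injected by the compiler, not written by hand.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical per-function counter standing in for an injected one.
static CALLS_TO_ABS: AtomicU64 = AtomicU64::new(0);

fn abs(x: i32) -> i32 {
    // The "injected" counter: bumped once at the top of the function.
    CALLS_TO_ABS.fetch_add(1, Ordering::Relaxed);
    // Neither branch below has its own counter, so the count alone
    // cannot tell which side of the `if` actually ran.
    if x < 0 { -x } else { x }
}
```

Calling `abs(-3)` and then `abs(4)` leaves the counter at 2 without recording that each branch ran once, which is the gap per-branch counters would close.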
Ensure that the inliner inserts lifetime markers if they have been emitted during
codegen. Otherwise, if allocas from inlined functions are merged together,
lifetime markers from one function might invalidate loads and stores performed
by the other one.
Enable use-after-scope checks by default when using AddressSanitizer.
They make it possible to detect incorrect use of stack objects after their
scope has already ended. The detection is based on LLVM lifetime intrinsics.
To facilitate the use of this functionality, the lifetime intrinsics are
now emitted regardless of optimization level if an enabled sanitizer makes
use of them.
This commit builds on #65501 to continue simplifying the build system and
compiler now that we no longer have multiple LLVM backends to ship by
default. Here this switches the compiler back to what it once was long
long ago, which is linking LLVM directly to the compiler rather than
dynamically loading it at runtime. The `codegen-backends` directory of
the sysroot no longer exists and all relevant support in the build
system is removed. Note that `rustc` still supports a dynamically loaded
codegen backend as it did previously, it just no longer supports
dynamically loaded codegen backends in its own sysroot.
Additionally as part of this the `librustc_codegen_llvm` crate now once
again explicitly depends on all of its crates instead of implicitly
loading them through the sysroot. This involved filling out its
`Cargo.toml` and deleting all the now-unnecessary `extern crate`
annotations in the header of the crate. (This in turn required adding a
number of imports for names of macros too.)
The end results of this change are:
* Rustbuild's build process for the compiler is simpler, as all the "oh
don't forget the codegen backend" checks can be easily removed.
* Building `rustc_codegen_llvm` is much simpler since it's simply
another compiler crate.
* Managing the dependencies of `rustc_codegen_llvm` is much simpler since
it's "just another `Cargo.toml` to edit"
* The build process should be a smidge faster because there's more
parallelism in the main rustc build step rather than splitting
`librustc_codegen_llvm` out to its own step.
* The compiler is expected to be slightly faster by default because the
codegen backend does not need to be dynamically loaded.
* Disabling LLVM as part of rustbuild is still supported, supporting
multiple codegen backends is still supported, and dynamic loading of a
codegen backend is still supported.
This allows us to remove `static_panic_msg` from the SSA<->LLVM
boundary, along with its fat pointer representation for &str.
Also changes the signature of PanicInfo::internal_constructor to
avoid copying.
Closes #65856.
place: Passing `align` = `layout.align.abi`, when also passing `layout`
Of the calls changed:
7/12 use `align` = `layout.align.abi`.
`from_const_alloc` uses `alloc.align`, but that is `assert_eq!` to `layout.align.abi`.
only 4/11 use something interesting for `align`.
so rename it `new_sized_aligned`.
6/11 use `align` = `layout.align.abi`.
`from_const_alloc` uses `alloc.align`, but that is `assert_eq!` to `layout.align.abi`.
only 4/11 use something interesting for `align`.
The errors are either:
- The meta-variable used in the right-hand side is not bound (or defined) in the
left-hand side.
- The meta-variable used in the right-hand side does not repeat with the same
Kleene operator as its binder in the left-hand side. Either it does not repeat
enough, or it uses a different operator somewhere.
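A small sketch of the two error classes, with a well-formed macro for contrast. The macro names are made up for illustration, and the faulty variants are left commented out so the snippet compiles.

```rust
// Well-formed: `$x` is bound on the left-hand side and repeats with `*`
// on both sides.
macro_rules! sum_all {
    ($($x:expr),*) => { 0 $(+ $x)* };
}

// Not bound: `$y` never appears on the left-hand side.
//     macro_rules! unbound { ($x:expr) => { $y }; }
//
// Different operator: `$x` is bound under `*` but used under `+`.
//     macro_rules! wrong_op { ($($x:expr),*) => { 0 $(+ $x)+ }; }
//
// Does not repeat enough: `$x` is bound under `*` but used with no repetition.
//     macro_rules! too_shallow { ($($x:expr),*) => { $x }; }
```

With the well-formed definition, `sum_all!(1, 2, 3)` expands to `0 + 1 + 2 + 3`.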
This change should have no semantic impact.