Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics
Lowers `mir::BinOp::Cmp` (the `three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.
These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.
I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.
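For illustration (my example, not from the PR), the kind of code this affects and the lowering I'd expect:
```rust
use std::cmp::Ordering;

// Sketch: integer `Ord::cmp` goes through the `three_way_compare` intrinsic,
// which should now lower to something like
//   %2 = call i8 @llvm.scmp.i8.i32(i32 %a, i32 %b)
// on LLVM 19+, instead of an open-coded compare/select sequence.
pub fn three_way(a: i32, b: i32) -> Ordering {
    a.cmp(&b)
}
```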
r? `@scottmcm`
Don't re-`assume` in `transmute`s that don't change niches
I noticed in nightly 2025-02-21 that `transmute` is emitting way more `assume`s than necessary for newtypes.
For example, the three transmutes in <https://rust.godbolt.org/z/fW1KaTc4o> emit
```llvm
define noundef range(i32 1, 0) i32 @repeatedly_transparent_transmute(i32 noundef range(i32 1, 0) %_1) unnamed_addr {
start:
  %0 = sub i32 %_1, 1
  %1 = icmp ule i32 %0, -2
  call void @llvm.assume(i1 %1)
  %2 = sub i32 %_1, 1
  %3 = icmp ule i32 %2, -2
  call void @llvm.assume(i1 %3)
  %4 = sub i32 %_1, 1
  %5 = icmp ule i32 %4, -2
  call void @llvm.assume(i1 %5)
  %6 = sub i32 %_1, 1
  %7 = icmp ule i32 %6, -2
  call void @llvm.assume(i1 %7)
  %8 = sub i32 %_1, 1
  %9 = icmp ule i32 %8, -2
  call void @llvm.assume(i1 %9)
  %10 = sub i32 %_1, 1
  %11 = icmp ule i32 %10, -2
  call void @llvm.assume(i1 %11)
  ret i32 %_1
}
```
But those are all just newtypes that don't change size or niches, so none of it's needed.
After this PR it's down to just
```llvm
define noundef range(i32 1, 0) i32 @repeatedly_transparent_transmute(i32 noundef range(i32 1, 0) %_1) unnamed_addr {
start:
  ret i32 %_1
}
```
because none of those `assume`s in the original actually did anything.
(Transmuting to something with a different niche, though, still gets the `assume`s -- the other tests checking that continue to pass.)
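For reference, a sketch (hypothetical names, mine) of the shape of code behind the godbolt link above: a chain of transmutes between `#[repr(transparent)]` newtypes sharing the same niche:
```rust
use std::mem::transmute;
use std::num::NonZero;

#[repr(transparent)]
pub struct A(NonZero<i32>);
#[repr(transparent)]
pub struct B(A);
#[repr(transparent)]
pub struct C(B);

// Three transmutes, none of which changes size or niche, so none of the
// per-transmute `assume`s conveyed any new information.
pub fn repeatedly_transparent_transmute(x: NonZero<i32>) -> C {
    unsafe { transmute::<B, C>(transmute::<A, B>(transmute::<NonZero<i32>, A>(x))) }
}
```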
Emit getelementptr inbounds nuw for pointer::add()
Lower `pointer::add` (via `intrinsic::offset` with an unsigned offset) to `getelementptr inbounds nuw` on LLVM versions that support it. This lets LLVM make use of the precondition that the offset addition does not wrap in an unsigned sense. Together with `inbounds`, this also implies that the offset is non-negative.
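A usage sketch (mine, not from the PR) of the affected operation and the IR I'd expect on a new enough LLVM:
```rust
// `p.add(n)` takes the unsigned-offset path through the offset intrinsic,
// so it should now emit
//   getelementptr inbounds nuw i32, ptr %p, i64 %n
// rather than plain `inbounds`.
pub unsafe fn offset_by(p: *const i32, n: usize) -> *const i32 {
    p.add(n)
}
```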
Fixes https://github.com/rust-lang/rust/issues/137217.
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic
LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.
Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`
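Usage sketch: the user-facing behavior is unchanged; ties round to the even neighbor no matter which of the three LLVM intrinsics the backend ultimately picks:
```rust
fn main() {
    assert_eq!(2.5f64.round_ties_even(), 2.0);
    assert_eq!(3.5f64.round_ties_even(), 4.0);
    assert_eq!((-0.5f32).round_ties_even(), -0.0);
}
```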
try-job: test-various
- For shifts this shrinks the IR by no longer needing an `assume` while still conveying the UB information.
- Having this on the `i8`→`i1` truncations will hopefully help in places that have to load `i8`s or pass them in LLVM structs without range information (see the sketch below).
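A minimal sketch of the truncation case (my example; the exact IR depends on context): a `bool` loaded from memory is an `i8` that must be 0 or 1, and the `i8`→`i1` truncation is where that range fact can now be carried:
```rust
// Loads an `i8` with a 0/1 invariant and truncates it to an `i1` immediate;
// previously, conveying that range needed separate metadata or an `assume`.
pub fn load_bool(p: &bool) -> bool {
    *p
}
```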
x86: use SSE2 to pass float and SIMD types
This builds on the new X86Sse2 ABI landed in https://github.com/rust-lang/rust/pull/137037 to actually make it a separate ABI from the default x86 ABI, and to use SSE2 registers. Specifically, we use it in two ways: to return `f64` values in a register rather than by-ptr, and to pass vectors of up to 128 bits in a register (or, well, whatever LLVM does when passing `<4 x float>` by-val; I don't actually know if this ends up in a register).
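A sketch of the `f64` case (hypothetical example, not from the PR): on a 32-bit x86 target with the X86Sse2 ABI, a return like this should come back in an SSE register rather than by-ptr, as described above:
```rust
pub fn halve(x: f64) -> f64 {
    x / 2.0
}
```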
Cc `@workingjubilee`
Fixes #133611
try-job: aarch64-apple
try-job: aarch64-gnu
try-job: aarch64-gnu-debug
try-job: test-various
try-job: x86_64-gnu-nopt
try-job: dist-i586-gnu-i586-i686-musl
try-job: x86_64-msvc-1
improve cold_path()
#120370 added a new intrinsic, `cold_path()`, and used it to fix `likely` and `unlikely`.
However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits usefulness of `cold_path` for idiomatic rust. For example, code like this:
```rust
if let Some(x) = y { ... }
```
may generate a 3-target switch:
```
switch y.discriminant:
    0 => true branch
    1 => false branch
    _ => unreachable
```
and therefore marking a branch as cold will have no effect.
This PR improves `cold_path()` to work with arbitrary switch instructions.
Note that for 2-target switches we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang, and it also emits weights in this situation. Clang's weight calculation is more complex than this PR's, which I believe is mainly because a `switch` in C/C++ can have multiple cases going to the same target.
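Usage sketch (nightly-only, `core_intrinsics`): with this PR the hint takes effect even when the `if let` lowers to a multi-target switch like the one above:
```rust
#![feature(core_intrinsics)]
#![allow(internal_features)]
use std::intrinsics::cold_path;

fn first_or_default(y: Option<u32>) -> u32 {
    if let Some(x) = y {
        x
    } else {
        // Mark this arm as cold so the backend emits branch weights for it.
        cold_path();
        0
    }
}
```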
Previously this only applied integer-ABI attributes, but now it covers data pointers too. That gives the backend more information in general, and allows slightly simplifying one of the helpers in slice iterators.
stabilize const_swap
libs-api FCP passed in https://github.com/rust-lang/rust/issues/83163.
However, I only just realized that this actually involves an intrinsic. The intrinsic could be implemented entirely with existing stable const functionality, but we choose to make it a primitive to be able to detect more UB. So nominating for `@rust-lang/lang` to make sure they are aware; I leave it up to them whether they want to FCP this.
While at it I also renamed the intrinsic to make the "nonoverlapping" constraint more clear.
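Usage sketch of what this stabilizes: `mem::swap` (and the raw-pointer swaps) in const context:
```rust
const PAIR: (u8, u8) = {
    let mut a = 1;
    let mut b = 2;
    std::mem::swap(&mut a, &mut b);
    (a, b)
};

fn main() {
    assert_eq!(PAIR, (2, 1));
}
```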
Fixes #83163
Add `select_unpredictable` to force LLVM to use CMOV
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if it comes from a `select` marked with an `unpredictable` metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits such a `select` and uses it in the implementation of `binary_search_by`.
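Usage sketch (nightly `core_intrinsics` at the time of this PR):
```rust
#![feature(core_intrinsics)]
#![allow(internal_features)]
use std::intrinsics::select_unpredictable;

// The condition is data-dependent (as in binary-search probes), so a
// conditional move beats a branch the predictor keeps mispredicting.
fn pick(flag: bool, a: usize, b: usize) -> usize {
    select_unpredictable(flag, a, b)
}
```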
Except for `simd-intrinsic/`, which has a lot of files containing multiple types like `u8x64` that read better when hand-formatted.
There is a surprising amount of two-space indenting in this directory.
Non-trivial changes:
- `rustfmt::skip` needed in `debug-column.rs` to preserve meaning of the
test.
- `rustfmt::skip` used in a few places where hand-formatting read more
nicely: `enum/enum-match.rs`
- Line number adjustments needed for the expected output of
`debug-column.rs` and `coroutine-debug.rs`.
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning; only the size (and alignment, but we specify that explicitly) matters. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
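A sketch (my example) of the representational change for a local with mixed field types:
```rust
// For a local of this type, the alloca used to be struct-typed, e.g.
//   %x = alloca { i32, i64 }, align 8
// and now encodes only the byte size and alignment:
//   %x = alloca [16 x i8], align 8
pub fn make_pair() -> (i32, i64) {
    let x = (1i32, 2i64);
    x
}
```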
Split out from #121577.
r? `@ghost`
Dellvmize some intrinsics (use `u32` instead of `Self` in some integer intrinsics)
This implements https://github.com/rust-lang/compiler-team/issues/693 minus what was implemented in #123226.
Note: I decided to _not_ change `shl`/... builder methods, as it just doesn't seem worth it.
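A sketch of the shape change, assuming the post-MCP signatures (illustrative, not the full list of affected intrinsics):
```rust
#![feature(core_intrinsics)]
#![allow(internal_features)]

// Assumed post-change signature: the population count comes back as a
// plain `u32` regardless of the width of `T`, so callers no longer cast
// a `Self`-typed count back down.
fn popcount64(x: u64) -> u32 {
    std::intrinsics::ctpop(x)
}
```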
r? `@scottmcm`
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the Rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on Broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR changes it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated-code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
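For concreteness, the kind of type whose derived comparison shrinks (the new pre-codegen test mentioned above used a simple 3-field struct along these lines):
```rust
// Each field comparison can now be a single `BinOp::Cmp` in MIR instead of
// an open-coded `<`/`==` branch cascade per field.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
pub struct MultiField(i16, u16, i8);
```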
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~