Commit graph

285822 commits

Author SHA1 Message Date
Ayush Singh
af25995d11
std: sys: stdio: uefi: Treat UNSUPPORTED Status as read(0)
Allows implementing Stdio::Null for Command in a deterministic manner.

Signed-off-by: Ayush Singh <ayush@beagleboard.org>
2025-04-13 23:22:59 +05:30
Ayush Singh
a404015775
std: sys: process: uefi: Use NULL stdin by default
According to the docs in `Command::output`:

> By default, stdout and stderr are captured (and used to provide the
resulting output). Stdin is not inherited from the parent and any attempt
by the child process to read from the stdin stream will result in the
stream immediately closing.

This was being violated by UEFI which was inheriting stdin by default.

While the docs don't explicitly state that the default should be NULL,
the behaviour seems like reading from NULL.

UEFI, however, has a bit of a problem. The `EFI_SIMPLE_TEXT_INPUT_PROTOCOL`
only provides support for reading 1 key press. This means that you
either get an error, or it is assumed that the keypress was read
successfully. So there is no way to have a successful read of length 0.
Currently, I am returning an UNSUPPORTED error when trying to read from
NULL stdin. On Linux, however, you will get a read of length 0 for Null
stdin.

One possible way to get around this is to translate one of the UEFI
errors to a read of length 0 (maybe UNSUPPORTED?). It is also possible to
use a non-standard error code, but I am not sure we should go that route.

Alternatively, if the meaning of Stdio::Null is platform dependent, it
should be fine to keep the current behaviour of returning an error.
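
The idea of translating an error into a zero-length read can be sketched in portable Rust. This is only an illustration, not the UEFI implementation: `unsupported_as_eof` and `read_null_like` are hypothetical names, and `io::empty()` stands in for a null stdin.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper: turn an `Unsupported` read error into a
/// zero-length read, which callers interpret as end of stream.
fn unsupported_as_eof(result: io::Result<usize>) -> io::Result<usize> {
    match result {
        Err(e) if e.kind() == ErrorKind::Unsupported => Ok(0),
        // Successful reads and all other errors pass through unchanged.
        other => other,
    }
}

/// Reads from `io::empty()`, used here as a stand-in for a null stdin.
/// It always reports a read of length 0.
fn read_null_like() -> io::Result<usize> {
    let mut buf = [0u8; 16];
    io::empty().read(&mut buf)
}
```

With this mapping, a reader that reports `Unsupported` looks the same to the caller as one that is already at end of stream, while every other error is still surfaced.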

Signed-off-by: Ayush Singh <ayush@beagleboard.org>
2025-04-13 23:22:58 +05:30
bors
175dcc7773 Auto merge of #139439 - weihanglo:update-cargo, r=weihanglo
Update cargo

17 commits in a6c604d1b8a2f2a8ff1f3ba6092f9fda42f4b7e9..0e93c5bf6a1d5ee7bc2af63d1afb16cd28793601
2025-03-26 18:11:00 +0000 to 2025-04-05 00:00:24 +0000
- chore(deps): bump openssl from 0.10.71 to 0.10.72 (rust-lang/cargo#15394)
- chore(ci): restore cargo-util semver check (rust-lang/cargo#15389)
- docs(changelog): polish changelog items (rust-lang/cargo#15379)
- chore(deps): update msrv (1 version) to v1.86 (rust-lang/cargo#15381)
- chore: add aarch64 linux runner (rust-lang/cargo#15077)
- Added `build_directory` field to cargo metadata output (rust-lang/cargo#15377)
- chore(deps): update rust crate rusqlite to 0.34.0 (rust-lang/cargo#15373)
- Prevent undeclared public network access (rust-lang/cargo#15368)
- rename the `author` field to be `authors` in book.toml (rust-lang/cargo#15362)
- move modules from kebab-case to snake_case (rust-lang/cargo#14439)
- chore: bump to 0.89.0; update changelog (rust-lang/cargo#15372)
- docs(unstable): update `-Zrustdoc-depinfo` tracking issue link (rust-lang/cargo#15371)
- fix(tree): Make output more deterministic (rust-lang/cargo#15369)
- feat: rustdoc depinfo rebuild detection via -Zrustdoc-depinfo (rust-lang/cargo#15359)
- Rename the gc config table (rust-lang/cargo#15367)
- Revert "Temporarily ignore cargo_test_doctest_xcompile_ignores" (rust-lang/cargo#15357)
- Don't canonicalize executable path in `cargo_exe` (rust-lang/cargo#15355)

r? ghost
2025-04-06 13:23:48 +00:00
bors
f5c510260b Auto merge of #138947 - madsmtm:refactor-apple-versions, r=Noratrieb
Refactor Apple version handling in the compiler

Move various Apple version handling code in the compiler out of `rustc_codegen_ssa` and into a place where it can be accessed by `rustc_attr_parsing`, which I found to be necessary when doing https://github.com/rust-lang/rust/pull/136867. Thought I'd split it out to make it easier to land, and to make further changes like https://github.com/rust-lang/rust/pull/131477 have fewer conflicts / PR dependencies.

There should be no functional changes in this PR.

`@rustbot` label O-apple
r? rust-lang/compiler
2025-04-06 10:16:28 +00:00
bors
e7df5b055d Auto merge of #139443 - Zalathar:rollup-c54pncs, r=Zalathar
Rollup of 3 pull requests

Successful merges:

 - #139123 (tidy: Fix paths to `coretests` and `alloctests`)
 - #139347 (Only build `rust_test_helpers` for `{incremental,ui}` test suites)
 - #139438 (Prevent a test from seeing forbidden numbers in the rustc version)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-04-06 07:10:05 +00:00
Stuart Cook
f4aa209e20
Rollup merge of #139438 - Zalathar:fix-test-122600, r=scottmcm
Prevent a test from seeing forbidden numbers in the rustc version

The final CHECK-NOT directive in this test was able to see past the end of the enclosing function, and find the substring `753` or `754` in the git hash in the rustc version number, causing false failures in CI whenever the git hash happens to contain those digits in sequence.

Adding an explicit check for `ret` prevents the CHECK-NOT directive from seeing past the end of the function.

---

Manually tested by adding `// CHECK-NOT: rustc` after the existing CHECK-NOT directives, and demonstrating that the new check prevents it from seeing the rustc version string.
2025-04-06 16:21:03 +10:00
Stuart Cook
f55034b2eb
Rollup merge of #139347 - jieyouxu:rust_test_helpers, r=onur-ozkan
Only build `rust_test_helpers` for `{incremental,ui}` test suites

Only build `rust_test_helpers` for `{incremental,ui}` test suites.

Context: Trying to see what test suites actually need `rust_test_helpers`, because this was causing unnecessary local failures when trying to run `./x test tests/run-make --target=wasm32-unknown-unknown` when `run-make` tests don't need `rust_test_helpers` at all.

r? `@ghost`

try-job: armhf-gnu
try-job: test-various
try-job: x86_64-apple-1
try-job: aarch64-apple
try-job: x86_64-msvc-1
try-job: i686-msvc-1
try-job: x86_64-mingw-1
try-job: i686-mingw-1
2025-04-06 16:21:02 +10:00
Stuart Cook
fededb9906
Rollup merge of #139123 - thaliaarchi:core-alloc-test-paths, r=bjorn3
tidy: Fix paths to `coretests` and `alloctests`

Following `#135937` and `#136642`, tests for core and alloc are in coretests and alloctests. Fix tidy to lint for the new paths. Also, update comments referring to the old locations.

Some context for changes which don't match that pattern:
- `library/std/src/thread/local/dynamic_tests.rs` and `library/std/src/sync/mpsc/sync_tests.rs` were moved under `library/std/tests/` in 332fb7e6f1 (Move std::thread_local unit tests to integration tests, 2025-01-17) and b8ae372e48 (Move std::sync unit tests to integration tests, 2025-01-17), respectively, so are no longer special cases.
- There never was a `library/core/tests/fmt.rs` file. That comment previously referred to `src/test/ui/ifmt.rs`, which was folded into `library/alloc/tests/fmt.rs` in 949c96660c (move format! interface tests, 2020-09-08).

Now, the only matches for `(alloc|core)/tests` are in `compiler/rustc_codegen_{cranelift,gcc}/patches`. I don't know why CI hasn't broken because those patches can't apply. Or maybe they somehow still can apply?

r? `@bjorn3`
2025-04-06 16:21:02 +10:00
Weihang Lo
3034e571a4
Update cargo 2025-04-06 00:04:41 -04:00
Zalathar
f6afb35c61 Prevent a test from seeing forbidden numbers in the rustc version
The final CHECK-NOT directive in this test was able to see past the end of the
enclosing function, and find the substring 753 or 754 in the git hash in the
rustc version number, causing false failures in CI.

Adding an explicit check for `ret` prevents the CHECK-NOT directive from seeing
past the end of the function.
2025-04-06 12:38:20 +10:00
bors
1de931283d Auto merge of #139411 - yotamofek:pr/mir_transform/instsimplify, r=compiler-errors
In `simplify_repeated_aggregate`, don't test first element against itself

r? `@saethlin`
Noticed that in `InstSimplifyContext::simplify_repeated_aggregate`, we're accidentally evaluating the first element's value twice, and then comparing it with itself, instead of just checking whether the rest of the elements are equal to the first one.
This will probably save very few cycles, but since `InstSimplify` is always enabled, this might improve perf by a bit.
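
The pattern described above (take the first element once, then compare only the remaining elements against it) can be sketched on a plain slice. This is a generic illustration, not the code in `InstSimplifyContext`; `all_equal_to_first` is a hypothetical name.

```rust
/// Returns true if every element of `items` equals the first one.
/// The first element is taken out of the iterator once, so it is
/// never compared with itself.
fn all_equal_to_first<T: PartialEq>(items: &[T]) -> bool {
    let mut iter = items.iter();
    match iter.next() {
        // An empty slice has no element that differs.
        None => true,
        // `iter` now yields only the elements after the first.
        Some(first) => iter.all(|item| item == first),
    }
}
```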
2025-04-06 01:45:33 +00:00
bors
c2110769cd Auto merge of #139275 - cuviper:min-llvm-19, r=nikic
Update the minimum external LLVM to 19

With this change, we'll have stable support for LLVM 19 and 20.
For reference, the previous increase to LLVM 18 was #130487.

cc `@rust-lang/wg-llvm`
r? nikic
2025-04-05 22:00:33 +00:00
Thalia Archibald
3af666ea91 tidy: Fix paths to coretests and alloctests
Following `#135937` and `#136642`, tests for core and alloc are in
coretests and alloctests. Fix tidy to lint for the new paths. Also,
update comments referring to the old locations.

Some context for changes which don't match that pattern:
* library/std/src/thread/local/dynamic_tests.rs and
  library/std/src/sync/mpsc/sync_tests.rs were moved under
  library/std/tests/ in 332fb7e6f1 (Move std::thread_local unit tests
  to integration tests, 2025-01-17) and b8ae372e48 (Move std::sync unit
  tests to integration tests, 2025-01-17), respectively, so are no
  longer special cases.
* There never was a library/core/tests/fmt.rs file. That comment
  previously referred to src/test/ui/ifmt.rs, which was folded into
  library/alloc/tests/fmt.rs in 949c96660c (move format! interface
  tests, 2020-09-08).
2025-04-05 12:15:49 -07:00
bors
5e17a2a91d Auto merge of #139417 - matthiaskrgr:rollup-ktf1d6s, r=matthiaskrgr
Rollup of 5 pull requests

Successful merges:

 - #136877 (Fix missing const for inherent pointer `replace` methods)
 - #138797 (Fix `ProvenVia` for global where clauses)
 - #139121 (Rename internal module from `statik` to `no_threads`)
 - #139319 (StableMIR: Prepare for refactoring)
 - #139404 (Small smir cleanup)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-04-05 18:54:05 +00:00
Josh Stone
12167d7064 Update the minimum external LLVM to 19 2025-04-05 11:44:38 -07:00
Matthias Krüger
91377bd4ca
Rollup merge of #139404 - yotamofek:pr/smir/cleanup, r=compiler-errors
Small smir cleanup

The first commit might have a small positive perf effect; the second one just makes the code a bit shorter.
2025-04-05 19:40:26 +02:00
Matthias Krüger
2769522e6e
Rollup merge of #139319 - makai410:refactor, r=celinval
StableMIR: Prepare for refactoring

Temporarily make `stable_mir` "parasitic" on the `rustc_smir` crate.

It aims to resolve the circular dependency that would arise if we directly invert the dependency order between `rustc_smir` and `stable_mir`.

Once the refactoring is complete (`rustc_smir` does not depend on `stable_mir`), we will migrate it back to the `stable_mir` crate. See more details: [here](https://hackmd.io/jBRkZLqAQL2EVgwIIeNMHg).
2025-04-05 19:40:25 +02:00
Matthias Krüger
527725b025
Rollup merge of #139121 - thaliaarchi:rename-thread_local-statik, r=Noratrieb
Rename internal module from `statik` to `no_threads`

This module is named in reference to the keyword, but the term is somewhat overloaded. Rename it to more clearly describe it and avoid the misspelling.
2025-04-05 19:40:24 +02:00
Matthias Krüger
6d88291d3c
Rollup merge of #138797 - compiler-errors:global-proven-via, r=lcnr
Fix `ProvenVia` for global where clauses

When we're merging one (or more) global where clauses in the presence of no other candidates, ensure that we return `TraitGoalProvenVia::ParamEnv` so that rigid projections work correctly. This fixes some tests with `feature(trivial_bounds)`.

Fixes #139408
2025-04-05 19:40:24 +02:00
Matthias Krüger
0b342873e3
Rollup merge of #136877 - Sky9x:const-inherent-ptr-replace, r=jhpratt
Fix missing const for inherent pointer `replace` methods

`ptr::replace` (the free fn) is already const stable. However, there are inherent convenience methods on `*mut T` and `NonNull<T>`, allowing you to write eg. `unsafe { foo.replace(bar) }` where `foo` is `*mut T` or `NonNull<T>`.

It seems const was never added to the inherent method (likely oversight), so this PR adds it.
I don't believe this needs another[^1] FCP as the inherent methods are already stable and `ptr::replace` is already const stable, so this adds no new API.
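
For reference, a minimal sketch of the inherent methods in ordinary (non-const) code. The helper names `replace_via_raw` and `replace_via_non_null` are hypothetical; whether the same calls are accepted in a `const` context depends on the toolchain including this change.

```rust
use std::ptr::NonNull;

/// Writes `new` through a `*mut i32` and returns the previous value.
fn replace_via_raw(slot: &mut i32, new: i32) -> i32 {
    let p: *mut i32 = slot;
    // SAFETY: `p` was derived from a live `&mut i32`, so it is valid,
    // aligned, and points to an initialized value.
    unsafe { p.replace(new) }
}

/// Same operation through `NonNull<i32>`.
fn replace_via_non_null(slot: &mut i32, new: i32) -> i32 {
    let p = NonNull::from(slot);
    // SAFETY: as above, `p` was derived from a live `&mut i32`.
    unsafe { p.replace(new) }
}
```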

Original tracking issue: #83164
`ptr::replace` constified in #83091
`ptr::replace` const stabilized in #130954

[^1]: `const_replace` FCP completed: https://github.com/rust-lang/rust/issues/83164#issuecomment-2385670050
2025-04-05 19:40:23 +02:00
Michael Goulet
89d0e7c033 Fix ProvenVia for global where clauses 2025-04-05 16:23:25 +00:00
Yotam Ofek
5b596cd28b In simplify_repeated_aggregate, don't test first element against itself 2025-04-05 14:01:41 +00:00
bors
0c478fdfe1 Auto merge of #139292 - compiler-errors:folder-experiment-7, r=lqd
Folder experiment: Micro-optimize RegionEraserVisitor

**NOTE:** This is one of a series of perf experiments that I've come up with while sick in bed. I'm assigning them to lqd b/c you're a good reviewer and you'll hopefully be awake when these experiments finish, lol.

r? lqd

The region eraser is very hot, so let's see if we can avoid erasing types (and visiting consts and preds that don't have region-ful types) unnecessarily.
2025-04-05 12:33:47 +00:00
Makai
707d356d00 let rustc_smir host stable_mir for refactoring 2025-04-05 18:23:07 +08:00
bors
0e9c3e52e4 Auto merge of #139401 - matthiaskrgr:rollup-uqdfj6u, r=matthiaskrgr
Rollup of 4 pull requests

Successful merges:

 - #138368 (KCFI: Add KCFI arity indicator support)
 - #138381 (Implement `SliceIndex` for `ByteStr`)
 - #139092 (Move `fd` into `std::sys`)
 - #139398 (Change notifications for Exploit Mitigations PG)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-04-05 09:25:41 +00:00
Yotam Ofek
33d62dd208 Dedup call to layout query 2025-04-05 08:59:49 +00:00
Yotam Ofek
89d9dd6e15 Only format! error message on failure 2025-04-05 08:59:13 +00:00
Matthias Krüger
0823b34bfb
Rollup merge of #139398 - rcvalle:rust-exploit-mitigations-pg-notifications, r=cuviper
Change notifications for Exploit Mitigations PG

Reduce the number of notifications sent to all of the Exploit Mitigations PG by removing it from some of the paths.
2025-04-05 10:18:05 +02:00
Matthias Krüger
a64ccf4a46
Rollup merge of #139092 - thaliaarchi:move-fd-pal, r=joboet
Move `fd` into `std::sys`

Move platform definitions of `fd` into `std::sys`, as part of https://github.com/rust-lang/rust/issues/117276.

Unlike other modules directly under `std::sys`, this is only available on some platforms and I have not provided a fallback abstraction for unsupported platforms. That is similar to how `std::os::fd` is gated to only supported platforms.

Also, fix the `unsafe_op_in_unsafe_fn` lint, which was allowed for the Unix fd impl. Since macro expansions from `std::sys::pal::unix::weak` trigger this lint, fix it there too.

cc `@joboet`, `@ChrisDenton`

try-job: x86_64-gnu-aux
2025-04-05 10:18:04 +02:00
Matthias Krüger
56ffb43629
Rollup merge of #138381 - thaliaarchi:bstr-sliceindex, r=joshtriplett
Implement `SliceIndex` for `ByteStr`

Implement `Index` and `IndexMut` for `ByteStr` in terms of `SliceIndex`. Implement it for the same types that `&[u8]` supports (a superset of those supported for `&str`, which does not have `usize` and `ops::IndexRange`).

At the same time, move compare and index traits to a separate file in the `bstr` module, to give it more space to grow as more functionality is added (e.g., iterators and string-like ops). Order the items in `bstr/traits.rs` similarly to `str/traits.rs`.

cc `@joshtriplett`

`ByteStr`/`ByteString` tracking issue: https://github.com/rust-lang/rust/issues/134915
2025-04-05 10:18:03 +02:00
Matthias Krüger
543160dd62
Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco
KCFI: Add KCFI arity indicator support

Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 10:18:03 +02:00
bors
da8321773a Auto merge of #139281 - petrochenkov:ctxtdecod6, r=wesleywiser
hygiene: Avoid recursion in syntax context decoding

#139241 has two components
- Avoiding recursion during syntax context decoding
- Encoding/decoding only the non-redundant data, and recalculating the redundant data again during decoding

Both of these parts may influence compilation times, possibly in opposite directions.
So this PR contains only the first part to evaluate its effect in isolation.
2025-04-05 06:18:04 +00:00
Ramon de C Valle
a98546b961 KCFI: Add KCFI arity indicator support
Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311,
https://github.com/llvm/llvm-project/pull/121070, and
https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 04:05:04 +00:00
Thalia Archibald
9b889e9198 Rename internal module from statik to no_threads
This module is named in reference to the keyword, but the term is
somewhat overloaded. Rename it to more clearly describe it and avoid the
misspelling.
2025-04-04 20:31:15 -07:00
Thalia Archibald
3ab22fabf1 Fix unsafe_op_in_unsafe_fn for Unix fd and weak 2025-04-04 20:11:08 -07:00
Thalia Archibald
4085af0183 Move fd into sys 2025-04-04 20:11:08 -07:00
bors
1e008dd5d8 Auto merge of #139396 - Zalathar:rollup-lmoqvru, r=Zalathar
Rollup of 11 pull requests

Successful merges:

 - #136457 (Expose algebraic floating point intrinsics)
 - #137880 (Autodiff batching)
 - #137897 (fix pthread-based tls on apple targets)
 - #138024 (Allow optimizing out `panic_bounds_check` in Unicode checks.)
 - #138546 (Add integer to string formatting tests)
 - #138826 (StableMIR: Add `associated_items`.)
 - #138950 (replace extra_filename with strict version hash in metrics file names)
 - #139274 (Rustdoc: typecheck settings.js)
 - #139285 (use lower case to match other error messages)
 - #139341 (Apply `Recovery::Forbidden` when reparsing pasted macro fragments.)
 - #139389 (make `Arguments::as_statically_known_str` doc(hidden))

r? `@ghost`
`@rustbot` modify labels: rollup
2025-04-05 03:09:33 +00:00
Ramon de C Valle
8c891ba222 Change notifications for Exploit Mitigations PG
Reduce the number of notifications sent to all of the Exploit Mitigations
PG by removing it from some of the paths.
2025-04-05 03:08:49 +00:00
Stuart Cook
e31ce50698
Rollup merge of #139389 - mejrs:hidden, r=workingjubilee
make `Arguments::as_statically_known_str` doc(hidden)

Fixes `as_statically_known_str` being [visible](https://doc.rust-lang.org/nightly/std/fmt/struct.Arguments.html#method.as_statically_known_str) ([Rendered](https://github.com/user-attachments/assets/45482d9f-2ec5-4610-be9c-b231bd2850c6))

This snuck in with https://github.com/rust-lang/rust/pull/138650, cc `@thaliaarchi`

This is also visible in the beta docs.

`@rustbot` label +beta-nominated
2025-04-05 13:18:17 +11:00
Stuart Cook
66ccc4fe28
Rollup merge of #139341 - nnethercote:fix-137874, r=petrochenkov
Apply `Recovery::Forbidden` when reparsing pasted macro fragments.

Fixes #137874.

The changes to the output of `tests/ui/associated-consts/issue-93835.rs`
partly undo the changes seen when `NtTy` was removed in #133436, which
is good.

r? `@petrochenkov`
2025-04-05 13:18:17 +11:00
Stuart Cook
6907e011e4
Rollup merge of #139285 - tshepang:uniform-case, r=jieyouxu
use lower case to match other error messages
2025-04-05 13:18:16 +11:00
Stuart Cook
f04c935cf1
Rollup merge of #139274 - lolbinarycat:rustdoc-js-less-expect-error-part5, r=notriddle
Rustdoc: typecheck settings.js

This makes the file fully typechecked with no instances of `@ts-expect-error` and no type casts.

r? `@notriddle`
2025-04-05 13:18:16 +11:00
Stuart Cook
ae745a06fa
Rollup merge of #138950 - yaahc:svh-metrics-name, r=bjorn3
replace extra_filename with strict version hash in metrics file names

Should resolve the potential issue of overwriting metrics from the same crate when compiled with different features or flags.

r? `@estebank`

try-job: test-various
2025-04-05 13:18:15 +11:00
Stuart Cook
93f7583491
Rollup merge of #138826 - makai410:assoc-items, r=celinval
StableMIR: Add `associated_items`.

Resolves: https://github.com/rust-lang/project-stable-mir/issues/87
2025-04-05 13:18:15 +11:00
Stuart Cook
338b8787b9
Rollup merge of #138546 - GuillaumeGomez:integer-to-string-tests, r=Amanieu
Add integer to string formatting tests

As discussed in https://github.com/rust-lang/rust/pull/136264, there don't seem to be tests that ensure int to string conversion is performed correctly, only sporadic tests here and there. Now we have some basic tests. :)

r? `@Mark-Simulacrum`
2025-04-05 13:18:14 +11:00
Stuart Cook
a038028eca
Rollup merge of #138024 - reitermarkus:unicode-panic-optimization, r=ibraheemdev
Allow optimizing out `panic_bounds_check` in Unicode checks.

Allow optimizing out `panic_bounds_check` in Unicode checks.

For context, see https://github.com/japaric/ufmt/issues/52#issuecomment-2699207241.
2025-04-05 13:18:14 +11:00
Stuart Cook
92bb7261c4
Rollup merge of #137897 - xTachyon:tls-fix, r=thomcc,jieyouxu
fix pthread-based tls on apple targets

Tries to fix #127773.
2025-04-05 13:18:13 +11:00
Stuart Cook
c6bf3a01ef
Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is best known from the ML side, where it is used when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image) and a target vector. Based on how close our prediction is, we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is pass in a batch of 8, 16, or more images and targets, and compute the gradients for all of them at once, allowing better optimizations.

Enzyme supports batching in two ways. The first one (which I implemented here) just accepts a batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in (0..100).step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and, instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next few days.

I will also add more tests for both modes.

For anyone preferring a more interactive explanation, here's a video of Tim's LLVM dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Stuart Cook
2e4e196a5b
Rollup merge of #136457 - calder:master, r=tgross35
Expose algebraic floating point intrinsics

# Problem

A stable Rust implementation of a simple dot product is 8x slower than C++ on modern x86-64 CPUs. The root cause is an inability to let the compiler reorder floating point operations for better vectorization.

See https://github.com/calder/dot-bench for benchmarks. Measurements below were performed on an i7-10875H.

### C++: 10us 

With Clang 18.1.3 and `-O2 -march=haswell`:
<table>
<tr>
    <th>C++</th>
    <th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="cc">
float dot(float *a, float *b, size_t len) {
    #pragma clang fp reassociate(on)
    float sum = 0.0;
    for (size_t i = 0; i < len; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/739573c0-380a-4d84-9fd9-141343ce7e68" />
</td>
</tr>
</table>

### Nightly Rust: 10us 

With rustc 1.86.0-nightly (8239a37f9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
    <th>Rust</th>
    <th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum = fadd_algebraic(sum, fmul_algebraic(a[i], b[i]));
    }
    sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/9dcf953a-2cd7-42f3-bc34-7117de4c5fb9" />
</td>
</tr>
</table>

### Stable Rust: 84us 

With rustc 1.84.1 (e71f9a9a9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
    <th>Rust</th>
    <th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum += a[i] * b[i];
    }
    sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/936a1f7e-33e4-4ff8-a732-c3cdfe068dca" />
</td>
</tr>
</table>

# Proposed Change

Add `core::intrinsics::f*_algebraic` wrappers to `f16`, `f32`, `f64`, and `f128` gated on a new `float_algebraic` feature.
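
As a rough illustration of the kind of reordering that reassociation permits, the sketch below performs it by hand in stable Rust: the single running sum is split into four independent partial sums that are combined at the end. This is not taken from the PR or the benchmark repository. The names `dot_sequential` and `dot_four_lanes` are hypothetical, the lane count of four is an arbitrary assumption, and for general inputs the two functions can differ in the last bits because floating point addition is not associative.

```rust
/// Straightforward dot product: one accumulator, strict left-to-right order.
fn dot_sequential(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Manually reassociated dot product: four independent partial sums,
/// which removes the dependency chain between consecutive additions.
fn dot_four_lanes(a: &[f32], b: &[f32]) -> f32 {
    let n = a.len().min(b.len());
    let mut lanes = [0.0f32; 4];
    let mut i = 0;
    // Process full groups of four elements, one element per lane.
    while i + 4 <= n {
        for k in 0..4 {
            lanes[k] += a[i + k] * b[i + k];
        }
        i += 4;
    }
    // Handle the remaining 0 to 3 elements.
    let mut tail = 0.0f32;
    while i < n {
        tail += a[i] * b[i];
        i += 1;
    }
    (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]) + tail
}
```

Writing the lanes out by hand changes the result's rounding on purpose. The algebraic operations let the programmer grant the compiler permission to make that kind of change for individual operations instead.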

# Alternatives Considered

https://github.com/rust-lang/rust/issues/21690 has a lot of good discussion of various options for supporting fast math in Rust, but is still open a decade later because any choice that opts in more than individual operations is ultimately contrary to Rust's design principles.

In the meantime, processors have evolved and we're leaving major performance on the table by not supporting vectorization. We shouldn't make users choose between an unstable compiler and an 8x performance hit.

# References

* https://github.com/rust-lang/rust/issues/21690
* https://github.com/rust-lang/libs-team/issues/532
* https://github.com/rust-lang/rust/issues/136469
* https://github.com/calder/dot-bench
* https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231ps

try-job: x86_64-gnu-nopt
try-job: x86_64-gnu-aux
2025-04-05 13:18:12 +11:00
Calder Coalson
8ff70529f2 Expose algebraic floating point intrinsics 2025-04-04 16:13:57 -07:00