Prepend temp files with per-invocation random string to avoid temp filename conflicts

https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management.

Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name:

```rust
#[inline(never)]
#[cfg(any(rpass1, rpass3))]
fn a() -> i32 {
    0
}

#[cfg(any(cfail2))]
fn a() -> i32 {
    1
}

fn main() {
    evil::evil();
    assert_eq!(a(), 0);
}

mod evil {
    #[cfg(any(rpass1, rpass3))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("/* */");
        }
    }

    #[cfg(any(cfail2))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("missing");
        }
    }
}
```

Session 1 (`rpass1`):

* Type-check, borrow-check, etc.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd.
* Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`.
* Save-temps option means we don't delete `XXX.rcgu.o`.
* Link the binary and stuff.
* Finalize[^2] the working incremental session by renaming `.../s-...-working` to `s-...-asjkdhsjakd` (some other finalized incr comp session dir name).

Session 2 (`cfail2`):

* Load artifacts from the previous *finalized* incremental session, namely the dep graph.
* Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory.
* Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session.

Session 3 (`rpass3`):

* Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1.
* All the dep graph nodes are green since we are basically replaying session 1.
* Codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2.
* Link the binary and stuff.

This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1.

At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**.

This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example).

---

This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem.

That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc.

Fixes https://github.com/rust-lang/rust/issues/139407
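The hard-link aliasing described above can be reproduced outside the compiler. The following is a minimal sketch using only the standard library, not rustc's actual code: the helpers `temp_object_name`, `codegen_session`, and `cached_after_failed_session`, the directory names, and the fixed strings `aaaa`/`bbbb` (standing in for the random per-invocation string) are all hypothetical and chosen for illustration. It assumes a filesystem that supports hard links.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: builds a temp object file name. `None` mimics the old
/// shared name; `Some(..)` follows the
/// `{crate-name}.{cgu}.{invocation_temp}.rcgu.o` shape from the text.
fn temp_object_name(crate_name: &str, cgu: &str, invocation_temp: Option<&str>) -> String {
    match invocation_temp {
        Some(id) => format!("{crate_name}.{cgu}.{id}.rcgu.o"),
        None => format!("{crate_name}.{cgu}.rcgu.o"),
    }
}

/// One heavily simplified "session": write the temp object into `cwd`, then
/// hard-link it into `incr_dir` as `XXX.o`. As with `-Csave-temps`, the temp
/// file is never deleted.
fn codegen_session(cwd: &Path, incr_dir: &Path, temp_name: &str, contents: &str) -> io::Result<()> {
    fs::create_dir_all(incr_dir)?;
    let temp = cwd.join(temp_name);
    // `fs::write` truncates an existing file in place, so an existing inode
    // (and every hard link to it) sees the new contents.
    fs::write(&temp, contents)?;
    let cached = incr_dir.join("XXX.o");
    if cached.exists() {
        fs::remove_file(&cached)?;
    }
    fs::hard_link(&temp, &cached)
}

/// Runs session 1, then a session 2 that never gets finalized, and returns
/// what session 1's cached `XXX.o` contains afterwards.
fn cached_after_failed_session(root: &Path, unique_names: bool) -> io::Result<String> {
    let session1 = root.join("s-finalized");
    let session2 = root.join("s-working");
    let id = |s: &'static str| if unique_names { Some(s) } else { None };
    codegen_session(root, &session1, &temp_object_name("demo", "cgu0", id("aaaa")), "session 1 object")?;
    codegen_session(root, &session2, &temp_object_name("demo", "cgu0", id("bbbb")), "session 2 object")?;
    fs::read_to_string(session1.join("XXX.o"))
}

fn main() -> io::Result<()> {
    let root = std::env::temp_dir().join(format!("hardlink-demo-{}", std::process::id()));
    for unique in [false, true] {
        let dir = root.join(if unique { "unique" } else { "shared" });
        fs::create_dir_all(&dir)?;
        let cached = cached_after_failed_session(&dir, unique)?;
        println!("unique_names={unique}: session 1 cache holds {cached:?}");
    }
    fs::remove_dir_all(&root)
}
```

With a shared temp name, session 1's cached file ends up holding session 2's contents, because both paths refer to the same inode. With a distinct name per session, session 2 writes to a fresh file and session 1's cache is left alone.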
Cranelift codegen backend for rust
The goal of this project is to create an alternative codegen backend for the Rust compiler based on Cranelift. This has the potential to improve compilation times in debug mode. If your project doesn't use any of the things listed under "Not yet supported", it should work fine. If not, please open an issue.
Download using Rustup
The Cranelift codegen backend is distributed in nightly builds on Linux and x86_64 macOS. If you want to install it using Rustup, you can do that by running:
$ rustup component add rustc-codegen-cranelift-preview --toolchain nightly
Once it is installed, you can enable it with one of the following approaches:
- `CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift cargo +nightly build -Zcodegen-backend`
- Add the following to `.cargo/config.toml`:

      [unstable]
      codegen-backend = true

      [profile.dev]
      codegen-backend = "cranelift"

- Add the following to `Cargo.toml`:

      # This line needs to come before anything else in Cargo.toml
      cargo-features = ["codegen-backend"]

      [profile.dev]
      codegen-backend = "cranelift"
Precompiled builds
You can also download a pre-built version from the releases page.
Extract the `dist` directory in the archive anywhere you want.
If you want to use `cargo clif build` instead of having to specify the full path to the `cargo-clif` executable, you can add the `bin` subdirectory of the extracted `dist` directory to your `PATH`.
(tutorial for Windows, and for Linux/MacOS).
Building and testing
If you want to build the backend manually, you can download it from GitHub and build it yourself:
$ git clone https://github.com/rust-lang/rustc_codegen_cranelift
$ cd rustc_codegen_cranelift
$ ./y.sh build
To run the test suite replace the last command with:
$ ./y.sh prepare # only needs to be run the first time
$ ./test.sh
For more docs on how to build and test see build_system/usage.txt or the help message of `./y.sh`.
Platform support
OS \ architecture | x86_64 | AArch64 | Riscv64 | s390x (System-Z) |
---|---|---|---|---|
Linux | ✅ | ✅ | ✅¹ | ✅¹ |
FreeBSD | ✅¹ | ❓ | ❓ | ❓ |
AIX | ❌² | N/A | N/A | ❌² |
Other unixes | ❓ | ❓ | ❓ | ❓ |
macOS | ✅ | ✅ | N/A | N/A |
Windows | ✅ | ❌ | N/A | N/A |
- ✅: Fully supported and tested
- ❓: Maybe supported, not tested
- ❌: Not supported at all
Not all targets are available as rustup component for nightly. See notes in the platform support matrix.
Usage
rustc_codegen_cranelift can be used as a near-drop-in replacement for `cargo build` or `cargo run` for existing projects.
The following assumes that `$cg_clif_dir` is the directory you cloned this repo into and that you followed the instructions (`y.sh prepare` and `y.sh build` or `test.sh`).
In the directory with your project (where you can do the usual `cargo build`), run:
$ $cg_clif_dir/dist/cargo-clif build
This will build your project with rustc_codegen_cranelift instead of the usual LLVM backend.
For additional ways to use rustc_codegen_cranelift like the JIT mode see usage.md.
Building and testing with changes in rustc code
See rustc_testing.md.
Not yet supported
- SIMD (tracked here, `std::simd` fully works, `std::arch` is partially supported)
- Unwinding on panics (no cranelift support, `-Cpanic=abort` is enabled by default)
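One practical consequence of `-Cpanic=abort` is that code relying on `std::panic::catch_unwind` cannot recover from a panic. The sketch below is an illustration only and not part of this project: `run_guarded` and `parse_or_panic` are hypothetical helpers, and the `cfg!(panic = "unwind")` check is just one way a program might detect which strategy it was built with.

```rust
use std::panic;

/// Hypothetical helper: runs `f` and turns a panic into `None`.
/// This only works when the build unwinds; under `-Cpanic=abort` a panic
/// terminates the process before `catch_unwind` can return.
fn run_guarded<F>(f: F) -> Option<i32>
where
    F: FnOnce() -> i32 + panic::UnwindSafe,
{
    panic::catch_unwind(f).ok()
}

/// Hypothetical helper: parses an integer and panics on invalid input.
fn parse_or_panic(input: &str) -> i32 {
    input.parse().expect("not a number")
}

fn main() {
    // The non-panicking path behaves the same under both strategies.
    println!("valid input: {:?}", run_guarded(|| parse_or_panic("42")));

    if cfg!(panic = "unwind") {
        // The panic is caught and reported as `None`.
        println!("invalid input: {:?}", run_guarded(|| parse_or_panic("oops")));
    } else {
        // Calling the panicking path here would abort the whole process.
        println!("built with panic=abort; skipping the panicking call");
    }
}
```

Projects that depend on catching panics (for example to isolate a failing task) may therefore behave differently when built with this backend under its default settings.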
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be dual licensed as above, without any additional terms or conditions.