Change WASI's `RawFd` from `u32` to `c_int` (`i32`).
WASI previously used `u32` as its `RawFd` type, since its "file descriptors"
are unsigned table indices, and there's no fundamental reason why WASI can't
have more than 2^31 handles.
However, this creates myriad little incompability problems with code
that also supports Unix platforms, where `RawFd` is `c_int`. While WASI
isn't a Unix, it often shares code with Unix, and this difference made
such shared code inconvenient. #87329 is the most recent example of such
code.
So, switch WASI to use `c_int`, which is `i32`. This will mean that code
intending to support WASI should ideally avoid assuming that negative file
descriptors are invalid, even though POSIX itself says that file descriptors
are never negative.
This is a breaking change, but `RawFd` is considerd an experimental
feature in [the documentation].
[the documentation]: https://doc.rust-lang.org/stable/std/os/wasi/io/type.RawFd.html
r? `@alexcrichton`
WASI previously used `u32` as its `RawFd` type, since its "file descriptors"
are unsigned table indices, and there's no fundamental reason why WASI can't
have more than 2^31 handles.
However, this creates myriad little incompability problems with code
that also supports Unix platforms, where `RawFd` is `c_int`. While WASI
isn't a Unix, it often shares code with Unix, and this difference made
such shared code inconvenient. #87329 is the most recent example of such
code.
So, switch WASI to use `c_int`, which is `i32`. This will mean that code
intending to support WASI should ideally avoid assuming that negative file
descriptors are invalid, even though POSIX itself says that file descriptors
are never negative.
This is a breaking change, but `RawFd` is considerd an experimental
feature in [the documentation].
[the documentation]: https://doc.rust-lang.org/stable/std/os/wasi/io/type.RawFd.html
Add Linux-specific pidfd process extensions (take 2)
Continuation of #77168.
I addressed the following concerns from the original PR:
- make `CommandExt` and `ChildExt` sealed traits
- wrap file descriptors in `PidFd` struct representing ownership over the fd
- add `take_pidfd` to take the fd out of `Child`
- close fd when dropped
Tracking Issue: #82971
Background:
Over the last year, pidfd support was added to the Linux kernel. This
allows interacting with other processes. In particular, this allows
waiting on a child process with a timeout in a race-free way, bypassing
all of the awful signal-handler tricks that are usually required.
Pidfds can be obtained for a child process (as well as any other
process) via the `pidfd_open` syscall. Unfortunately, this requires
several conditions to hold in order to be race-free (i.e. the pid is not
reused).
Per `man pidfd_open`:
```
· the disposition of SIGCHLD has not been explicitly set to SIG_IGN
(see sigaction(2));
· the SA_NOCLDWAIT flag was not specified while establishing a han‐
dler for SIGCHLD or while setting the disposition of that signal to
SIG_DFL (see sigaction(2)); and
· the zombie process was not reaped elsewhere in the program (e.g.,
either by an asynchronously executed signal handler or by wait(2)
or similar in another thread).
If any of these conditions does not hold, then the child process
(along with a PID file descriptor that refers to it) should instead
be created using clone(2) with the CLONE_PIDFD flag.
```
Sadly, these conditions are impossible to guarantee once any libraries
are used. For example, C code runnng in a different thread could call
`wait()`, which is impossible to detect from Rust code trying to open a
pidfd.
While pid reuse issues should (hopefully) be rare in practice, we can do
better. By passing the `CLONE_PIDFD` flag to `clone()` or `clone3()`, we
can obtain a pidfd for the child process in a guaranteed race-free
manner.
This PR:
This PR adds Linux-specific process extension methods to allow obtaining
pidfds for processes spawned via the standard `Command` API. Other than
being made available to user code, the standard library does not make
use of these pidfds in any way. In particular, the implementation of
`Child::wait` is completely unchanged.
Two Linux-specific helper methods are added: `CommandExt::create_pidfd`
and `ChildExt::pidfd`. These methods are intended to serve as a building
block for libraries to build higher-level abstractions - in particular,
waiting on a process with a timeout.
I've included a basic test, which verifies that pidfds are created iff
the `create_pidfd` method is used. This test is somewhat special - it
should always succeed on systems with the `clone3` system call
available, and always fail on systems without `clone3` available. I'm
not sure how to best ensure this programatically.
This PR relies on the newer `clone3` system call to pass the `CLONE_FD`,
rather than the older `clone` system call. `clone3` was added to Linux
in the same release as pidfds, so this shouldn't unnecessarily limit the
kernel versions that this code supports.
Unresolved questions:
* What should the name of the feature gate be for these newly added
methods?
* Should the `pidfd` method distinguish between an error occurring
and `create_pidfd` not being called?
Add std::os::unix::fs::DirEntryExt2::file_name_ref(&self) -> &OsStr
Greetings!
This is my first PR here, so please forgive me if I've missed an important step or otherwise done something wrong. I'm very open to suggestions/fixes/corrections.
This PR adds a function that allows `std::fs::DirEntry` to vend a borrow of its filename on Unix platforms, which is especially useful for sorting. (Windows has (as I understand it) encoding differences that require an allocation.) This new function sits alongside the cross-platform [`file_name(&self) -> OsString`](https://doc.rust-lang.org/std/fs/struct.DirEntry.html#method.file_name) function.
I pitched this idea in an [internals thread](https://internals.rust-lang.org/t/allow-std-direntry-to-vend-borrows-of-its-filename/14328/4), and no one objected vehemently, so here we are.
I understand features in general, I believe, but I'm not at all confident that my whole-cloth invention of a new feature string (as required by the compiler) was correct (or that the name is appropriate). Further, there doesn't appear to be a test for the sibling `ino` function, so I didn't add one for this similarly trivial function either. If it's desirable that I should do so, I'd be happy to [figure out how to] do that.
The following is a trivial sample of a use-case for this function, in which directory entries are sorted without any additional allocations:
```rust
use std::os::unix::fs::DirEntryExt;
use std::{fs, io};
fn main() -> io::Result<()> {
let mut entries = fs::read_dir(".")?.collect::<Result<Vec<_>, io::Error>>()?;
entries.sort_unstable_by(|a, b| a.file_name_ref().cmp(b.file_name_ref()));
for p in entries {
println!("{:?}", p);
}
Ok(())
}
```
Redefine `ErrorKind::Other` and stop using it in std.
This implements the idea I shared yesterday in the libs meeting when we were discussing how to handle adding new `ErrorKind`s to the standard library: This redefines `Other` to be for *user defined errors only*, and changes all uses of `Other` in the standard library to a `#[doc(hidden)]` and permanently `#[unstable]` `ErrorKind` that users can not match on. This ensures that adding `ErrorKind`s at a later point in time is not a breaking change, since the user couldn't match on these errors anyway. This way, we use the `#[non_exhaustive]` property of the enum in a more effective way.
Open questions:
- How do we check this change doesn't cause too much breakage? Will a crate run help and be enough?
- How do we ensure we don't accidentally start using `Other` again in the standard library? We don't have a `pub(not crate)` or `#[deprecated(in this crate only)]`.
cc https://github.com/rust-lang/rust/pull/79965
cc `@rust-lang/libs` `@ijackson`
r? `@dtolnay`
Provide ExitStatusError
Closes#73125
In MR #81452 "Add #[must_use] to [...] process::ExitStatus" we concluded that the existing arrangements in are too awkward so adding that `#[must_use]` is blocked on improving the ergonomics.
I wrote a mini-RFC-style discusion of the approach in https://github.com/rust-lang/rust/issues/73125#issuecomment-771092741
Expand WASI abbreviation in docs
I was pretty sure this was related to something for WebAssembly but wasn't 100% sure so I checked but even on these top-level docs I couldn't find the abbreviation expanded. I'm normally used to Rust docs being detailed and explanatory and writing abbreviations like this out in full at least once so I thought it was worth the change. Feel free to close this if it's too much.
Do not allocate or unwind after fork
### Objective scenarios
* Make (simple) panics safe in `Command::pre_exec_hook`, including most `panic!` calls, `Option::unwrap`, and array bounds check failures.
* Make it possible to `libc::fork` and then safely panic in the child (needed for the above, but this requirement means exposing the new raw hook API which the `Command` implementation needs).
* In singlethreaded programs, where panic in `pre_exec_hook` is already memory-safe, prevent the double-unwinding malfunction #79740.
I think we want to make panic after fork safe even though the post-fork child environment is only experienced by users of `unsafe`, beause the subset of Rust in which any panic is UB is really far too hazardous and unnatural.
#### Approach
* Provide a way for a program to, at runtime, switch to having panics abort. This makes it possible to panic without making *any* heap allocations, which is needed because on some platforms malloc is UB in a child forked from a multithreaded program (see https://github.com/rust-lang/rust/pull/80263#issuecomment-774272370, and maybe also the SuS [spec](https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html)).
* Make that change in the child spawned by `Command`.
* Document the rules comprehensively enough that a programmer has a fighting chance of writing correct code.
* Test that this all works as expected (and in particular, that there aren't any heap allocations we missed)
Fixes#79740
#### Rejected (or previously attempted) approaches
* Change the panic machinery to be able to unwind without allocating, at least when the payload and message are both `'static`. This seems like it would be even more subtle. Also that is a potentially-hot path which I don't want to mess with.
* Change the existing panic hook mechanism to not convert the message to a `String` before calling the hook. This would be a surprising change for existing code and would not be detected by the type system.
* Provide a `raw_panic_hook` function to intercept panics in a way that doesn't allocate. (That was an earlier version of this MR.)
### History
This MR could be considered a v2 of #80263. Thanks to everyone who commented there. In particular, thanks to `@m-ou-se,` `@Mark-Simulacrum` and `@hyd-dev.` (Tagging you since I think you might be interested in this new MR.) Compared to #80263, this MR has very substantial changes and additions.
Additionally, I have recently (2021-04-20) completely revised this series following very helpful comments from `@m-ou-se.`
r? `@m-ou-se`
It is unergnomic to have to say things like
bad.into_status().signal()
Implementing `ExitStatusExt` for `ExitStatusError` fixes this.
Unfortunately it does mean making a previously-infallible method
capable of panicing, although of course the existing impl remains
infallible.
The alternative would be a whole new `ExitStatusErrorExt` trait.
`<ExitStatus as ExitStatusExt>::into_raw()` is not particularly
ergonomic to call because of the often-required type annotation.
See for example the code in the test case in
library/std/src/sys/unix/process/process_unix/tests.rs
Perhaps we should provide equivalent free functions for `ExitStatus`
and `ExitStatusExt` in std::os::unix::process and maybe deprecate this
trait method. But I think that is for the future.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>