compiler: `{TyAnd,}Layout` comes home
The `Layout` and `TyAndLayout` types are heavily abstract and have no particular target-specific qualities, though we do use them to answer questions particular to targets. We can keep it that way if we simply move them out of `rustc_target` and into `rustc_abi`. They bring a small entourage of connected types with them, but that's fine.
This will allow us to strengthen a few abstraction barriers over time and thus make the notoriously gnarly layout code easier to refactor. For now, we don't need to worry about that and deliberately use reexports to minimize this particular diff.
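The re-export approach can be sketched roughly as follows. The `abi` and `target` modules and the `size_bytes` field are hypothetical stand-ins, not the real `rustc_abi` / `rustc_target` contents:

```rust
// Stand-in for the crate that now owns the type (hypothetical).
mod abi {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Layout {
        pub size_bytes: u64,
    }
}

// Stand-in for the crate the type used to live in (hypothetical).
mod target {
    // The re-export keeps every existing `target::Layout` path compiling.
    pub use crate::abi::Layout;
}

// Code written against the old path needs no changes.
fn size_via_old_path(layout: target::Layout) -> u64 {
    layout.size_bytes
}

fn main() {
    let layout = abi::Layout { size_bytes: 8 };
    // Both paths name the same type.
    assert_eq!(size_via_old_path(layout), 8);
}
```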
core/net: add Ipv[46]Addr::from_octets, Ipv6Addr::from_segments.
Adds:
- `Ipv4Addr::from_octets([u8;4])`
- `Ipv6Addr::from_octets([u8;16])`
- `Ipv6Addr::from_segments([u16;8])`
equivalent to the existing `From` impls.
Advantages:
- Consistent with `to_bits, from_bits`.
- More discoverable than the `From` impls.
- Helps with type inference: it's common to want to convert byte slices to IP addrs. If you try this
```rust
fn foo(x: &[u8]) -> Ipv4Addr {
    Ipv4Addr::from(x.try_into().unwrap())
}
```
it [doesn't work](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0e2873312de275a58fa6e33d1b213bec). You have to write `Ipv4Addr::from(<[u8;4]>::try_from(x).unwrap())` instead, which is not great. With `from_octets` it is able to infer the right types.
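To illustrate why a dedicated constructor helps inference, here is a minimal sketch. The free function `from_octets` below is a hypothetical stand-in for the proposed method, so the example does not depend on the new API being available. Because `Ipv4Addr` has more than one `From` impl, `Ipv4Addr::from(x.try_into().unwrap())` leaves the target type of `try_into()` ambiguous, whereas a non-generic parameter pins it to `[u8; 4]`:

```rust
use std::convert::TryInto;
use std::net::Ipv4Addr;

// Hypothetical stand-in for `Ipv4Addr::from_octets`: the concrete parameter
// type tells the compiler what `try_into()` must produce.
fn from_octets(octets: [u8; 4]) -> Ipv4Addr {
    Ipv4Addr::from(octets)
}

fn parse(x: &[u8]) -> Ipv4Addr {
    // Panics if `x` is not exactly 4 bytes long.
    from_octets(x.try_into().unwrap())
}

fn main() {
    assert_eq!(parse(&[127, 0, 0, 1]), Ipv4Addr::new(127, 0, 0, 1));
}
```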
Found this while porting [smoltcp](https://github.com/smoltcp-rs/smoltcp/) from its own IP address types to the `core::net` types.
~~Tracking issues #27709 #76205~~
Tracking issue: https://github.com/rust-lang/rust/issues/131360
std::fs::get_path freebsd update.
What matters is that we are doing the right thing by using `sizeof` rather than passing `KINFO_FILE_SIZE` (which is only defined on Intel architectures);
the kernel makes sure the size matches what it expects on its side.
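A rough sketch of the idea, assuming a heavily trimmed, hypothetical `KinfoFile` struct in place of the real FreeBSD `kinfo_file` (which has many more fields):

```rust
use std::mem::size_of;

// Hypothetical, trimmed stand-in for FreeBSD's `kinfo_file`.
#[repr(C)]
struct KinfoFile {
    kf_structsize: i32,
    kf_path: [u8; 1024],
}

fn new_kinfo_file() -> KinfoFile {
    KinfoFile {
        // Derive the size from the type instead of an arch-specific constant.
        kf_structsize: size_of::<KinfoFile>() as i32,
        kf_path: [0; 1024],
    }
}

fn main() {
    let kif = new_kinfo_file();
    assert_eq!(kif.kf_structsize as usize, size_of::<KinfoFile>());
    assert_eq!(kif.kf_path[0], 0);
}
```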
Rollup of 8 pull requests
Successful merges:
- #130356 (don't warn about a missing change-id in CI)
- #130900 (Do not output () on empty description)
- #131066 (Add the Chinese translation entry to the RustByExample build process)
- #131067 (Fix std_detect links)
- #131644 (Clean up some Miri things in `sys/windows`)
- #131646 (sys/unix: add comments for some Miri fallbacks)
- #131653 (Remove const trait bound modifier hack)
- #131659 (enable `download_ci_llvm` test)
r? `@ghost`
`@rustbot` modify labels: rollup
Clean up some Miri things in `sys/windows`
- remove miri hack that is only needed for win7 (we don't support win7 as a target in Miri)
- remove outdated comment now that Miri is on CI
Do not output () on empty description
When passing an explicitly empty description string, as explained here https://github.com/rust-lang/rust/blob/master/config.example.toml#L611-L613, my expectation is that the resulting rustc will be compatible with upstream.
However, it seems that instead, a `()` is added to the end of the version string, causing the version compatibility check to fail. My proposed fix here would be to instead only print `({description})` if `description` is a non-empty string.
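The proposed behaviour amounts to something like the sketch below. `version_string` is a hypothetical helper for illustration, not the actual bootstrap code:

```rust
// Hypothetical helper; the name and signature are illustrative only.
fn version_string(version: &str, description: Option<&str>) -> String {
    match description {
        // Only append the parenthesised suffix for a non-empty description.
        Some(desc) if !desc.is_empty() => format!("{version} ({desc})"),
        _ => version.to_string(),
    }
}

fn main() {
    assert_eq!(version_string("1.83.0", Some("")), "1.83.0");
    assert_eq!(version_string("1.83.0", Some("my distro")), "1.83.0 (my distro)");
}
```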
The allocator on Xous is now throwing warnings because the allocator
needs to be mutable, and allocators hand out mutable pointers, which
the `static_mut_refs` lint now catches.
Give the same treatment to Xous as wasm, at least until a solution is
devised for fixing the warning on wasm.
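For illustration, a minimal sketch of the pattern the lint flags and of silencing it locally. `ALLOC_STATE` and `bump` are hypothetical names, and the sketch assumes the state is only ever touched from one thread:

```rust
// Hypothetical global state standing in for an allocator's bookkeeping.
static mut ALLOC_STATE: usize = 0;

// Taking `&mut` to a `static mut` is what `static_mut_refs` warns about.
#[allow(static_mut_refs)]
fn bump(bytes: usize) -> usize {
    // SAFETY: this sketch assumes single-threaded access only.
    unsafe {
        let state: &mut usize = &mut ALLOC_STATE;
        *state += bytes;
        *state
    }
}

fn main() {
    let before = bump(0);
    assert_eq!(bump(16), before + 16);
}
```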
Signed-off-by: Sean Cross <sean@xobs.io>
Process arguments and environment variables are both passed by way of
Application Parameters. These are a TLV format that gets passed in as
the second process argument.
This patch combines both as they are very similar in their decode.
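A shared TLV decoder could look roughly like this. The record layout (4-byte tag, 4-byte little-endian length, then the value) and the `ARGS` tag are assumptions made for illustration, not a description of the actual Xous wire format:

```rust
use std::convert::TryInto;

/// One decoded record: a 4-byte tag and a borrowed value.
#[derive(Debug, PartialEq)]
struct Tlv<'a> {
    tag: [u8; 4],
    value: &'a [u8],
}

/// Decodes records laid out as tag (4 bytes), length (u32, little-endian),
/// value. The layout is assumed for this sketch. Decoding stops at the
/// first truncated record.
fn decode_tlv(mut data: &[u8]) -> Vec<Tlv<'_>> {
    let mut out = Vec::new();
    while data.len() >= 8 {
        let tag: [u8; 4] = data[0..4].try_into().unwrap();
        let len = u32::from_le_bytes(data[4..8].try_into().unwrap()) as usize;
        let rest = &data[8..];
        if rest.len() < len {
            break;
        }
        out.push(Tlv { tag, value: &rest[..len] });
        data = &rest[len..];
    }
    out
}

fn main() {
    let mut buf = Vec::new();
    buf.extend_from_slice(b"ARGS"); // hypothetical tag
    buf.extend_from_slice(&2u32.to_le_bytes());
    buf.extend_from_slice(b"hi");
    let records = decode_tlv(&buf);
    assert_eq!(records, vec![Tlv { tag: *b"ARGS", value: b"hi" }]);
}
```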
Signed-off-by: Sean Cross <sean@osdyne.com>
Optimize `escape_ascii` using a lookup table
Based upon my suggestion here: https://github.com/rust-lang/rust/pull/125340#issuecomment-2130441817
Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.
The generated assembly isn't clearly better (although it has fewer branches), so I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine (an old i7-8700).
But, if you want to try my benchmarking code for yourself:
<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>
```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};
const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();
#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 2) };
    let mut output = [ascii::Char::Null; N];
    output[0] = ascii::Char::ReverseSolidus;
    output[1] = a;
    (output, 0..2)
}
#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };
    let mut output = [ascii::Char::Null; N];
    let hi = HEX_DIGITS[(byte >> 4) as usize];
    let lo = HEX_DIGITS[(byte & 0xf) as usize];
    output[0] = ascii::Char::ReverseSolidus;
    output[1] = ascii::Char::SmallX;
    output[2] = hi;
    output[3] = lo;
    (output, 0..4)
}
#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 1) };
    let mut output = [ascii::Char::Null; N];
    output[0] = a;
    (output, 0..1)
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };
    match byte {
        b'\t' => backslash(ascii::Char::SmallT),
        b'\r' => backslash(ascii::Char::SmallR),
        b'\n' => backslash(ascii::Char::SmallN),
        b'\\' => backslash(ascii::Char::ReverseSolidus),
        b'\'' => backslash(ascii::Char::Apostrophe),
        b'\"' => backslash(ascii::Char::QuotationMark),
        0x00..=0x1F => hex_escape(byte),
        _ => match ascii::Char::from_u8(byte) {
            Some(a) => verbatim(a),
            None => hex_escape(byte),
        },
    }
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    /// Lookup table helps us determine how to display character.
    ///
    /// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
    /// indicate whether the result is escaped or unescaped.
    ///
    /// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
    /// escaped NUL will not occur.
    const LOOKUP: [u8; 256] = {
        let mut arr = [0; 256];
        let mut idx = 0;
        loop {
            arr[idx as usize] = match idx {
                // use 8th bit to indicate escaped
                b'\t' => 0x80 | b't',
                b'\r' => 0x80 | b'r',
                b'\n' => 0x80 | b'n',
                b'\\' => 0x80 | b'\\',
                b'\'' => 0x80 | b'\'',
                b'"' => 0x80 | b'"',
                // use NUL to indicate hex-escaped
                0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',
                _ => idx,
            };
            if idx == 255 {
                break;
            }
            idx += 1;
        }
        arr
    };
    let lookup = LOOKUP[byte as usize];
    // 8th bit indicates escape
    let lookup_escaped = lookup & 0x80 != 0;
    // SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
    let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };
    if lookup_escaped {
        // NUL indicates hex-escaped
        if matches!(lookup_ascii, ascii::Char::Null) {
            hex_escape(byte)
        } else {
            backslash(lookup_ascii)
        }
    } else {
        verbatim(lookup_ascii)
    }
}
fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
    let mut vec = Vec::new();
    for b in bytes {
        let (buf, range) = f(*b);
        vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
    }
    vec
}
pub fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("escape_ascii");
    group.sample_size(1000);
    let rand_200k = &mut [0; 200 * 1024];
    thread_rng().fill(&mut rand_200k[..]);
    let cat = include_bytes!("/bin/cat");
    let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");
    group.bench_function("old_rand", |b| {
        b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
    });
    group.bench_function("new_rand", |b| {
        b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
    });
    group.bench_function("old_bin", |b| {
        b.iter(|| escape_bytes(cat, escape_ascii_old));
    });
    group.bench_function("new_bin", |b| {
        b.iter(|| escape_bytes(cat, escape_ascii_new));
    });
    group.bench_function("old_cargo_toml", |b| {
        b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
    });
    group.bench_function("new_cargo_toml", |b| {
        b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
    });
    group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```
</details>
My benchmark results:
```
escape_ascii/old_rand time: [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
4 (0.40%) high mild
18 (1.80%) high severe
escape_ascii/new_rand time: [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
38 (3.80%) high mild
escape_ascii/old_bin time: [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
17 (1.70%) high mild
22 (2.20%) high severe
escape_ascii/new_bin time: [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
43 (4.30%) high mild
64 (6.40%) high severe
escape_ascii/old_cargo_toml
time: [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
21 (2.10%) high mild
183 (18.30%) high severe
escape_ascii/new_cargo_toml
time: [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
56 (5.60%) high mild
32 (3.20%) high severe
```
Random: 1.7006ms => 1.6953ms (<1% speedup)
Binary: 225.40µs => 165.63µs (26% speedup)
Text: 23.699µs => 16.438µs (30% speedup)
The recent changes to naked `asm!()` macros made this unbuildable
on Xous. The upstream package maintainer released 0.2.3 to fix support
on newer nightly toolchains.
Update the dependency to 0.2.3, which is the oldest version that works
with the current nightly compiler.
This closes #131602 and fixes the build on Xous.
Signed-off-by: Sean Cross <sean@xobs.io>
Rollup of 6 pull requests
Successful merges:
- #131086 (Update unicode-width to 0.2.0)
- #131585 (compiletest: Remove the one thing that was checking a directive's `original_line`)
- #131614 (Error on trying to use revisions in `run-make` tests)
- #131638 (compiletest: Move debugger setup code out of `lib.rs`)
- #131641 (switch unicode-data bitsets back to 'static')
- #131642 (Special case error message for a `build-fail` test that failed check build)
r? `@ghost`
`@rustbot` modify labels: rollup
In issue #118053, the `loongarch64-unknown-linux-gnu` target needs indirection
to access external data, and so do the `loongarch64-unknown-linux-musl` and
`loongarch64-unknown-linux-ohos` targets.
Special case error message for a `build-fail` test that failed check build
A `build-fail` test requires that a check build (roughly `--emit=metadata`, no codegen) succeeds but fails later. Previously, if its check build failed, the user would see the error message
```
error: test compilation failed although it shouldn't!
```
which is confusing. Because the test is `build-fail`, we want the test compilation to fail! This error message doesn't account for the difference between a check build and a complete build, so let's special case the error message for a `build-fail` test whose check build failed to instead say
```
error: `build-fail` test is required to pass check build, but check build failed
```
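The special case boils down to a branch like the one sketched here. `TestKind` and `check_build_failure_message` are hypothetical names for illustration, not compiletest's actual internals:

```rust
// Hypothetical type for illustration; not compiletest's real data model.
#[derive(Clone, Copy, PartialEq)]
enum TestKind {
    BuildFail,
    Other,
}

/// Picks the message to show when the check build failed unexpectedly.
fn check_build_failure_message(kind: TestKind) -> &'static str {
    match kind {
        TestKind::BuildFail => {
            "`build-fail` test is required to pass check build, but check build failed"
        }
        TestKind::Other => "test compilation failed although it shouldn't!",
    }
}

fn main() {
    println!("error: {}", check_build_failure_message(TestKind::BuildFail));
}
```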
Fixes #130894.
compiletest: Move debugger setup code out of `lib.rs`
These functions contain a few hundred lines of code for dealing with debuggers (for `debuginfo` tests), and don't really belong in the crate root.
Moving them out to their own module makes `lib.rs` easier to follow.
compiletest: Remove the one thing that was checking a directive's `original_line`
This special handling of `ignore-tidy*` was introduced during the migration to `//@` directives (#120881), and has become unnecessary after the subsequent removal of the legacy directive check (#131392).
Fixed get/set thread name implementations for macOS and FreeBSD
So, the story of fixing `pthread_getname_np` and `pthread_setname_np` continues, but this time I fixed the macOS implementation.
### [`pthread_getname_np`](c032e0b076/src/pthread.c (L1160-L1175))
The function never fails except for an invalid thread. Miri never verifies thread identifiers and uses them as indices when accessing a vector of threads. Therefore, the only possible result is `0` and a possibly trimmed output.
```c
int
pthread_getname_np(pthread_t thread, char *threadname, size_t len)
{
    if (thread == pthread_self()) {
        strlcpy(threadname, thread->pthread_name, len);
        return 0;
    }
    if (!_pthread_validate_thread_and_list_lock(thread)) {
        return ESRCH;
    }
    strlcpy(threadname, thread->pthread_name, len);
    _pthread_lock_unlock(&_pthread_list_lock);
    return 0;
}
```
#### [`strlcpy`](https://www.man7.org/linux/man-pages/man7/strlcpy.7.html)
```
strlcpy(3bsd)
strlcat(3bsd)
Copy and catenate the input string into a destination
string. If the destination buffer, limited by its size,
isn't large enough to hold the copy, the resulting string
is truncated (but it is guaranteed to be null-terminated).
They return the length of the total string they tried to
create.
```
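The truncation semantics described above can be sketched in Rust as follows. This is an illustrative model of `strlcpy` operating on byte slices (with `src` given without its NUL terminator), not the code Miri uses:

```rust
/// Copies `src` into `dst`, truncating if necessary and always
/// NUL-terminating when `dst` is non-empty. Returns `src.len()`,
/// i.e. the length of the string it tried to create.
fn strlcpy(dst: &mut [u8], src: &[u8]) -> usize {
    // Reserve one byte for the terminator; an empty `dst` is left untouched.
    if let Some(max) = dst.len().checked_sub(1) {
        let n = src.len().min(max);
        dst[..n].copy_from_slice(&src[..n]);
        dst[n] = 0;
    }
    src.len()
}

fn main() {
    let mut buf = [0xffu8; 4];
    let wanted = strlcpy(&mut buf, b"worker");
    // The name is trimmed to fit, but the call itself does not fail.
    assert_eq!(wanted, 6);
    assert_eq!(&buf, b"wor\0");
}
```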
### [`pthread_setname_np`](c032e0b076/src/pthread.c (L1178-L1200))
```c
int
pthread_setname_np(const char *name)
{
    int res;
    pthread_t self = pthread_self();
    size_t len = 0;
    if (name != NULL) {
        len = strlen(name);
    }
    _pthread_validate_signature(self);
    res = __proc_info(5, getpid(), 2, (uint64_t)0, (void*)name, (int)len);
    if (res == 0) {
        if (len > 0) {
            strlcpy(self->pthread_name, name, MAXTHREADNAMESIZE);
        } else {
            bzero(self->pthread_name, MAXTHREADNAMESIZE);
        }
    }
    return res;
}
```
Where `5` is [`PROC_INFO_CALL_SETCONTROL`](8d741a5de7/bsd/sys/proc_info_private.h (L274)), and `2` is [`PROC_SELFSET_THREADNAME`](8d741a5de7/bsd/sys/proc_info.h (L821)). And `__proc_info` is a syscall handled by the XNU kernel by [`proc_info_internal`](8d741a5de7/bsd/kern/proc_info.c (L300-L314)):
```c
int
proc_info_internal(int callnum, int pid, uint32_t flags, uint64_t ext_id, int flavor, uint64_t arg, user_addr_t buffer, uint32_t buffersize, int32_t * retval)
{
    switch (callnum) {
    // ...
    case PROC_INFO_CALL_SETCONTROL:
        return proc_setcontrol(pid, flavor, arg, buffer, buffersize, retval);
```
And the actual logic from [`proc_setcontrol`](8d741a5de7/bsd/kern/proc_info.c (L3218-L3227)):
```c
case PROC_SELFSET_THREADNAME: {
    /*
     * This is a bit ugly, as it copies the name into the kernel, and then
     * invokes bsd_setthreadname again to copy it into the uthread name
     * buffer. Hopefully this isn't such a hot codepath that an additional
     * MAXTHREADNAMESIZE copy is a big issue.
     */
    if (buffersize > (MAXTHREADNAMESIZE - 1)) {
        return ENAMETOOLONG;
    }
```
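In other words, an over-long name is rejected rather than truncated. A minimal sketch of that check, where the values of `MAXTHREADNAMESIZE` and `ENAMETOOLONG` are assumptions for illustration and `check_thread_name` is a hypothetical helper:

```rust
// Assumed values for illustration; consult the platform headers for the real ones.
const MAXTHREADNAMESIZE: usize = 64;
const ENAMETOOLONG: i32 = 63;

/// Mirrors the kernel-side length check: names longer than
/// `MAXTHREADNAMESIZE - 1` bytes (excluding the NUL) are rejected.
fn check_thread_name(name: &[u8]) -> Result<(), i32> {
    if name.len() > MAXTHREADNAMESIZE - 1 {
        Err(ENAMETOOLONG)
    } else {
        Ok(())
    }
}

fn main() {
    assert_eq!(check_thread_name(b"worker"), Ok(()));
    assert_eq!(check_thread_name(&[b'a'; 64]), Err(ENAMETOOLONG));
}
```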
Unrelated to the current pull request, but there may be something quite ugly in the kernel/libc here: the last thing `PROC_SELFSET_THREADNAME` does is call `bsd_setthreadname`, which sets the name in the user space. But we just saw that `pthread_setname_np` sets the name in the user space too. I guess I need to open a ticket in one of Apple's repositories, at least to clarify that :D