
Add is_ascii function optimized for x86-64 for [u8]

The new `is_ascii` function is optimized to use the
`pmovmskb` vector instruction, which collects the high bit of every byte
lane into an integer mask. A byte is ASCII exactly when its high bit is
clear, so a zero mask means every byte in the vector is ASCII and
ASCII validity checking can be vectorized. Other platforms lack this
instruction, so the same code is likely to regress performance there;
the new implementation is therefore gated to
`all(target_arch = "x86_64", target_feature = "sse2")`.
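
To illustrate the idea described above, here is a minimal sketch that calls the SSE2 intrinsics directly. It is not the code from this commit: the function name `is_ascii_sse2_sketch`, the 16-byte chunk size, and the scalar fallback are assumptions made for illustration only.

```rust
// Hypothetical illustration, not the standard library implementation.
#[cfg(all(target_arch = "x86_64", target_feature = "sse2"))]
fn is_ascii_sse2_sketch(bytes: &[u8]) -> bool {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8};

    let mut chunks = bytes.chunks_exact(16);
    for chunk in &mut chunks {
        // SAFETY: `chunk` is exactly 16 bytes long, `_mm_loadu_si128`
        // accepts unaligned pointers, and SSE2 is statically enabled by
        // the cfg gate on this function.
        let mask = unsafe {
            let v = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
            // `pmovmskb`: bit i of the result is the high bit of byte i.
            _mm_movemask_epi8(v)
        };
        if mask != 0 {
            // At least one byte has its high bit set, so it is not ASCII.
            return false;
        }
    }
    // Fewer than 16 bytes are left; check them one at a time.
    chunks.remainder().iter().all(|&b| b < 0x80)
}

// Assumed fallback so the sketch also compiles on other targets.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse2")))]
fn is_ascii_sse2_sketch(bytes: &[u8]) -> bool {
    bytes.iter().all(|&b| b < 0x80)
}
```

For any input, `is_ascii_sse2_sketch(bytes)` should agree with `bytes.is_ascii()`; the cfg gate mirrors the one named in this message.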

Add codegen test
Remove crate::mem import for functions included in the prelude
okaneco 2024-09-26 19:39:14 -04:00
parent d7d67ad14b
commit 1b5c02b757
3 changed files with 85 additions and 21 deletions

@@ -0,0 +1,16 @@
//@ only-x86_64
//@ compile-flags: -C opt-level=3
#![crate_type = "lib"]
/// Check that the fast-path of `is_ascii` uses a `pmovmskb` instruction.
/// Platforms lacking an equivalent instruction use other techniques for
/// optimizing `is_ascii`.
// CHECK-LABEL: @is_ascii_autovectorized
#[no_mangle]
pub fn is_ascii_autovectorized(s: &[u8]) -> bool {
    // CHECK: load <32 x i8>
    // CHECK-NEXT: icmp slt <32 x i8>
    // CHECK-NEXT: bitcast <32 x i1>
    // CHECK-NEXT: icmp eq i32
    s.is_ascii()
}
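
The four `CHECK` lines describe a load of 32 bytes, a signed "less than zero" comparison per lane, a bitcast of the 32 lane results to an `i32` mask, and a comparison of that mask with zero. A scalar Rust sketch of that same sequence follows. The name `is_ascii_chunked_sketch` and the constant `LANES` are hypothetical, and no claim is made that the compiler vectorizes this particular code or that it matches the implementation added in this commit.

```rust
// Hypothetical illustration of the pattern the CHECK lines look for.
fn is_ascii_chunked_sketch(bytes: &[u8]) -> bool {
    // Assumed lane count, matching the `<32 x i8>` vector in the test.
    const LANES: usize = 32;

    let mut chunks = bytes.chunks_exact(LANES);
    for chunk in &mut chunks {
        let mut mask: u32 = 0;
        for (lane, &byte) in chunk.iter().enumerate() {
            // Scalar form of `icmp slt <32 x i8>`: a byte reinterpreted as
            // `i8` is negative exactly when its high bit is set.
            let high_bit_set = (byte as i8) < 0;
            // Scalar form of `bitcast <32 x i1>`: pack one bit per lane.
            mask |= (high_bit_set as u32) << lane;
        }
        // Scalar form of `icmp eq i32`: any set bit means a non-ASCII byte.
        if mask != 0 {
            return false;
        }
    }
    // Handle the tail that does not fill a whole chunk.
    chunks.remainder().iter().all(|&b| b < 0x80)
}
```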