Instead of towards negative infinity as is currently the case.
This done so that the obvious expectations of
`midpoint(a, b) == midpoint(b, a)` and
`midpoint(-a, -b) == -midpoint(a, b)` are true, which makes the even
more obvious implementation `(a + b) / 2` true.
https://github.com/rust-lang/rust/issues/110840#issuecomment-2336753931
* Use a lookup table for 8-bit integers and the Karatsuba square root
algorithm for larger integers.
* Include optimization hints that give the compiler the exact numeric
range of results.
Optimize integer `pow` by removing the exit branch
The branch at the end of the `pow` implementations is redundant with multiplication code already present in the loop. By rotating the exit check, this branch can be largely removed, improving code size and reducing instruction cache misses.
Testing on my machine (`x86_64`, 11th Gen Intel Core i5-1135G7 @ 2.40GHz), the `num::int_pow` benchmarks improve by some 40% for the unchecked operations and show some slight improvement for the checked operations as well.
Fix doc nits
Many tiny changes to stdlib doc comments to make them consistent (for example "Returns foo", rather than "Return foo"), adding missing periods, paragraph breaks, backticks for monospace style, and other minor nits.
In the dynamic exponent case, it's preferred to not increase code size,
so use solely the loop-based implementation there.
This shows about 4% penalty in the variable exponent benchmarks
on x86_64.
The newly optimized loop has introduced a regression in the case
when pow is called with a small constant exponent. LLVM is no longer
able to unroll the loop and the generated code is larger and slower
than what's expected in tests.
Match and handle small exponent values separately by branching out
to an explicit multiplication sequence for that exponent.
Powers larger than 6 need more than three multiplications, so these
cases are less likely to benefit from this optimization, also such
constant exponents are less likely to be used in practice.
For uses with a non-constant exponent, this might also provide
a performance benefit if the exponent is small and does not vary
between successive calls, so the same match arm tends to be taken as
a predicted branch.
The branch at the end of the `pow` implementations is redundant
with multiplication code already present in the loop. By rotating
the exit check, this branch can be largely removed, improving code size
and instruction cache coherence.
It's unnecessary when that arm leads to a `#[cold]` panic anyway, since controlling branch likihood is what `#[cold]` is all about.
(And, well, it's unclear whether `unlikely!` even works these days anyway.)
checked_ilog: improve performance
Addresses #115874.
(This PR replicates the original #115875, which I accidentally closed by deleting my forked repository...)
For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write then checked methods in such a way that we stop emitting wrapping versions of them.
For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`. After this PR they do.
De-LLVM the unchecked shifts [MCP#693]
This is just one part of the MCP (https://github.com/rust-lang/compiler-team/issues/693), but it's the one that IMHO removes the most noise from the standard library code.
Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
r? WaffleLapkin
This is just one part of the MCP, but it's the one that IMHO removes the most noise from the standard library code.
Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
Replacement of #114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two
This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See #114390 for more discussion.
While I have extended the scope of the power of two optimization from #114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.
Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).
Fixes#47234Resolves#114390
`@Centri3`