rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
Some of the marker statements used by coverage are added during MIR building
for use by the InstrumentCoverage pass (during analysis), and are not needed
afterwards.
fixes#117448
For example unnecessary imports in std::prelude that can be eliminated:
```rust
use std::option::Option::Some;//~ WARNING the item `Some` is imported redundantly
use std::option::Option::None; //~ WARNING the item `None` is imported redundantly
```
This new description reflects the changes made in this PR, and should hopefully
be more useful to non-coverage developers who need to care about coverage
statements.
Previously, mappings were attached to individual coverage statements in MIR.
That necessitated special handling in MIR optimizations to avoid deleting those
statements, since otherwise codegen would be unable to reassemble the original
list of mappings.
With this change, a function's list of mappings is now attached to its MIR
body, and survives intact even if individual statements are deleted by
optimizations.
coverage: Allow each coverage statement to have multiple code regions
The original implementation of coverage instrumentation was built around the assumption that a coverage counter/expression would be associated with *up to one* code region. When it was discovered that *multiple* regions would sometimes need to share a counter, a workaround was found: for the remaining regions, the instrumentor would create a fresh expression that adds zero to the existing counter/expression.
That got the job done, but resulted in some awkward code, and produces unnecessarily complicated coverage maps in the final binary.
---
This PR removes that tension by changing `StatementKind::Coverage`'s code region field from `Option<CodeRegion>` to `Vec<CodeRegion>`.
The changes on the codegen side are fairly straightforward. As long as each `CoverageKind::Counter` only injects one `llvm.instrprof.increment`, the rest of coverage codegen is happy to handle multiple regions mapped to the same counter/expression, with only minor option-to-vec adjustments.
On the instrumentor/mir-transform side, we can get rid of the code that creates extra (x + 0) expressions. Instead we gather all of the code regions associated with a single BCB, and inject them all into one coverage statement.
---
There are several patches here but they can be divided in to three phases:
- Preparatory work
- Actually switching over to multiple regions per coverage statement
- Cleaning up
So viewing the patches individually may be easier.
It makes it sound like the `ExprKind` and `Rvalue` are supposed to represent all pointer related
casts, when in reality their just used to share a some enum variants. Make it clear there these
are only coercion to make it clear why only some pointer related "casts" are in the enum.