Rollup of 9 pull requests
Successful merges:
- #112806 (Small code improvements in `collect_intra_doc_links.rs`)
- #119766 (Split tait and impl trait in assoc items logic)
- #120139 (Do not normalize closure signature when building `FnOnce` shim)
- #120160 (Manually implement derived `NonZero` traits.)
- #120171 (Fix assume and assert in jump threading)
- #120183 (Add `#[coverage(off)]` to closures introduced by `#[test]` and `#[bench]`)
- #120195 (add several resolution test cases)
- #120259 (Split Diagnostics for Uncommon Codepoints: Add List to Display Characters Involved)
- #120261 (Provide structured suggestion to use trait objects in some cases of `if` arm type divergence)
r? `@ghost`
`@rustbot` modify labels: rollup
Only use dense bitsets in dataflow analyses
When a dataflow state has the size close to the number of locals, we should prefer a dense bitset, like we already store locals in a dense vector.
Other occurrences of `ChunkedBitSet` need to be justified by the size of the dataflow state.
Save liveness results for DestinationPropagation
`DestinationPropagation` needs to verify that merge candidates do not conflict with each other. This is done by verifying that a local is not live when its counterpart is written to.
To get the liveness information, the pass runs `MaybeLiveLocals` dataflow analysis repeatedly, once for each propagation round. This is quite costly, and the main driver for the perf impact on `ucd` and `diesel`. (See https://github.com/rust-lang/rust/pull/115105#issuecomment-1689205908)
In order to mitigate this cost, this PR proposes to save the result of the analysis into a `SparseIntervalMatrix`, and mirror merges of locals into that matrix: `liveness(destination) := liveness(destination) union liveness(source)`.
<details>
<summary>Proof</summary>
We denote by `'` all the quantities of the transformed program. Let $\varphi$ be a mapping of locals, which maps `source` to `destination`, and is identity otherwise. The exact liveness set after a statement is $out'(statement)$, and the proposed liveness set is $\varphi(out(statement))$.
Consider a statement. Suppose that the output state verifies $out' \subset phi(out)$. We want to prove that $in' \subset \varphi(in)$ where $in = (out - kill) \cup gen$, and conclude by induction.
We have 2 cases: either that statement is kept with locals renumbered by $\varphi$, or it is a tautological assignment and it removed.
1. If the statement is kept: the gen-set and the kill-set of $statement' = \varphi(statement)$ are $gen' = \varphi(gen)$ and $kill' = \varphi(kill)$ exactly.
From soundness requirement 3, $\varphi(in)$ is disjoint from $\varphi(kill)$.
This implies that $\varphi(out - kill)$ is disjoint from $\varphi(kill)$, and so $\varphi(out - kill) = \varphi(out) - \varphi(kill)$. Then $\varphi(in) = (\varphi(out) - \varphi(kill)) \cup \varphi(gen) = (\varphi(out) - kill') \cup gen'$.
We can conclude that $out' \subset \varphi(out) \implies in' \subset \varphi(in)$.
2. If the statement is removed. As $\varphi(statement)$ is a tautological assignment, we know that $\varphi(gen) = \varphi(kill) = \\{ destination \\}$, while $gen' = kill' = \emptyset$. So $\varphi(in) = \varphi(out) \cup \\{ destination \\}$. Then $in' = out' \subset out \subset \varphi(in)$.
By recursion, we can conclude by that $in' \subset \varphi(in)$ everywhere.
</details>
This approximate liveness results is only suboptimal if there are locals that fully disappear from the CFG due to an assignment cycle. These cases are quite unlikely, so we do not bother with them.
This change allows to reduce the perf impact of DestinationPropagation by half on diesel and ucd (https://github.com/rust-lang/rust/pull/115105#issuecomment-1694701904).
cc ````@JakobDegen````
Rework how diagnostic lints are stored.
`Diagnostic::code` has the type `DiagnosticId`, which has `Error` and
`Lint` variants. Plus `Diagnostic::is_lint` is a bool, which should be
redundant w.r.t. `Diagnostic::code`.
Seems simple. Except it's possible for a lint to have an error code, in
which case its `code` field is recorded as `Error`, and `is_lint` is
required to indicate that it's a lint. This is what happens with
`derive(LintDiagnostic)` lints. Which means those lints don't have a
lint name or a `has_future_breakage` field because those are stored in
the `DiagnosticId::Lint`.
It's all a bit messy and confused and seems unintentional.
This commit:
- removes `DiagnosticId`;
- changes `Diagnostic::code` to `Option<String>`, which means both
errors and lints can straightforwardly have an error code;
- changes `Diagnostic::is_lint` to `Option<IsLint>`, where `IsLint` is a
new type containing a lint name and a `has_future_breakage` bool, so
all lints can have those, error code or not.
r? `@oli-obk`
Sandwich MIR optimizations between DSE.
This PR reorders MIR optimization passes in an attempt to increase their efficiency.
- Stop running CopyProp before GVN, it's useless as GVN will do the same thing anyway. Instead, we perform CopyProp at the end of the pipeline, to ensure we do not emit copy/move chains.
- Run DSE before GVN, as it increases the probability to have single-assignment locals.
- Run DSE after the final CopyProp to turn copies into moves.
r? `@ghost`
Avoid some redundant work in GVN
The first 2 commits are about reducing the perf effect.
Third commit avoids doing redundant work: is a local is SSA, it already has been simplified, and the resulting value is in `self.locals`. No need to call any code on it.
The last commit avoids removing some storage statements.
r? wg-mir-opt
`Diagnostic::code` has the type `DiagnosticId`, which has `Error` and
`Lint` variants. Plus `Diagnostic::is_lint` is a bool, which should be
redundant w.r.t. `Diagnostic::code`.
Seems simple. Except it's possible for a lint to have an error code, in
which case its `code` field is recorded as `Error`, and `is_lint` is
required to indicate that it's a lint. This is what happens with
`derive(LintDiagnostic)` lints. Which means those lints don't have a
lint name or a `has_future_breakage` field because those are stored in
the `DiagnosticId::Lint`.
It's all a bit messy and confused and seems unintentional.
This commit:
- removes `DiagnosticId`;
- changes `Diagnostic::code` to `Option<String>`, which means both
errors and lints can straightforwardly have an error code;
- changes `Diagnostic::is_lint` to `Option<IsLint>`, where `IsLint` is a
new type containing a lint name and a `has_future_breakage` bool, so
all lints can have those, error code or not.
This also switches from `split_off(0)` to `std::mem::take` when emptying the
accumulated list of blocks, because `split_off(0)` handles capacity in a way
that is unintuitive when used in a loop.
The old loop had two separate places where it would flush the acumulated list
of straight-line blocks into a new BCB. One occurred at the start of the loop
body when the current block couldn't be chained into, and the other occurred at
the end of the loop body when the current block couldn't be chained from.
The latter check can be hoisted to the start of the loop body by making it
examine the previous block (which has added itself to the list) instead of the
current block. With that done, we can combine the two separate flushes into one
flush with two possible trigger conditions.
Filtering out unreachable successors is only needed by the main graph traversal
loop, so we can move the filtering step into that loop instead, eliminating the
need to pass the MIR body into `bcb_filtered_successors`.
coverage: Add enums to accommodate other kinds of coverage mappings
Extracted from #118305.
LLVM supports several different kinds of coverage mapping regions, but currently we only ever emit ordinary “code” regions. This PR performs the plumbing required to add other kinds of regions as enum variants, but does not add any specific variants other than `Code`.
The main motivation for this change is branch coverage, but it will also allow separate experimentation with gap regions and skipped regions, which might help in producing more accurate and useful coverage reports.
---
``@rustbot`` label +A-code-coverage