Auto merge of #110083 - saethlin:encode-hashes-as-bytes, r=cjgillot

Encode hashes as bytes, not varint

In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than try to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash.

Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true`

Before:
```
(  1) 373418203 (53.7%, 53.7%): 1
(  2) 196240113 (28.2%, 81.9%): 3
(  3) 108157958 (15.6%, 97.5%): 2
(  4)  17213120 ( 2.5%, 99.9%): 4
(  5)    223614 ( 0.0%,100.0%): 9
(  6)    216262 ( 0.0%,100.0%): 10
(  7)     15447 ( 0.0%,100.0%): 5
(  8)      3633 ( 0.0%,100.0%): 19
(  9)      3030 ( 0.0%,100.0%): 8
( 10)      1167 ( 0.0%,100.0%): 18
( 11)      1032 ( 0.0%,100.0%): 7
( 12)      1003 ( 0.0%,100.0%): 6
( 13)        10 ( 0.0%,100.0%): 16
( 14)        10 ( 0.0%,100.0%): 17
( 15)         5 ( 0.0%,100.0%): 12
( 16)         4 ( 0.0%,100.0%): 14
```

After:
```
(  1) 372939136 (53.7%, 53.7%): 1
(  2) 196240140 (28.3%, 82.0%): 3
(  3) 108014969 (15.6%, 97.5%): 2
(  4)  17192375 ( 2.5%,100.0%): 4
(  5)       435 ( 0.0%,100.0%): 5
(  6)        83 ( 0.0%,100.0%): 18
(  7)        79 ( 0.0%,100.0%): 10
(  8)        50 ( 0.0%,100.0%): 9
(  9)         6 ( 0.0%,100.0%): 19
```

The remaining 9 or 10 and 18 or 19 are `u64` and `u128` respectively that have the high bits set. As far as I can tell these are coming primarily from `SwitchTargets`.
This commit is contained in:
bors 2023-04-18 22:27:15 +00:00
commit b3f1379509
38 changed files with 289 additions and 138 deletions

View file

@ -16,6 +16,7 @@ pub use self::config::{HashResult, QueryConfig, TryLoadFromDisk};
use crate::dep_graph::DepKind;
use crate::dep_graph::{DepNodeIndex, HasDepContext, SerializedDepNodeIndex};
use rustc_data_structures::stable_hasher::Hash64;
use rustc_data_structures::sync::Lock;
use rustc_errors::Diagnostic;
use rustc_hir::def::DefKind;
@ -37,7 +38,7 @@ pub struct QueryStackFrame<D: DepKind> {
/// This hash is used to deterministically pick
/// a query to remove cycles in the parallel compiler.
#[cfg(parallel_compiler)]
hash: u64,
hash: Hash64,
}
impl<D: DepKind> QueryStackFrame<D> {
@ -49,7 +50,7 @@ impl<D: DepKind> QueryStackFrame<D> {
def_kind: Option<DefKind>,
dep_kind: D,
ty_adt_id: Option<DefId>,
_hash: impl FnOnce() -> u64,
_hash: impl FnOnce() -> Hash64,
) -> Self {
Self {
description,

View file

@ -573,7 +573,7 @@ where
// from disk. Re-hashing results is fairly expensive, so we can't
// currently afford to verify every hash. This subset should still
// give us some coverage of potential bugs though.
let try_verify = prev_fingerprint.as_value().1 % 32 == 0;
let try_verify = prev_fingerprint.split().1.as_u64() % 32 == 0;
if std::intrinsics::unlikely(
try_verify || qcx.dep_context().sess().opts.unstable_opts.incremental_verify_ich,
) {