Auto merge of #81635 - michaelwoerister:structured_def_path_hash, r=pnkfelix
Let a portion of DefPathHash uniquely identify the DefPath's crate.

This makes it possible to map a `DefPathHash` directly to the crate it originates from, without constructing side tables for that mapping. That is useful for incremental compilation, where we deal with `DefPathHash` instead of `DefId` a lot. It also makes it possible to check for `DefPathHash` collisions reliably and cheaply, so the compiler can abort compilation gracefully instead of running into an ICE later at some random place in the code.

The following new piece of documentation describes the most interesting aspects of the changes:

```rust
/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Proceeding with
/// compilation after such a hash collision would very probably lead to an ICE
/// and, in the worst case, to a silent mis-compilation. The compiler therefore
/// actively and exhaustively checks for such hash collisions and aborts
/// compilation if it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said than done if the crates in question are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.
```

Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?

r? `@pnkfelix` (feel free to re-assign)
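The two probabilities quoted in the doc comment appear to be consistent with the standard birthday approximation, p ≈ n² / 2^(b+1) for n uniformly distributed b-bit hashes. The sketch below is not part of this PR; the function name `collision_probability` is made up for illustration, and the approximation is only meaningful while p is very small.

```rust
/// Approximate probability of at least one collision among `n` uniformly
/// distributed `bits`-bit hashes, using the birthday approximation
/// p ~= n^2 / 2^(bits + 1).
///
/// Hypothetical helper for illustration; it does not exist in rustc.
fn collision_probability(n: u64, bits: u32) -> f64 {
    let n = n as f64;
    // 2^(bits + 1) is exactly representable as an f64 for bits = 64.
    (n * n) / 2f64.powi(bits as i32 + 1)
}

/// Convenience wrapper: expresses the probability as "1 in X".
fn one_in(n: u64, bits: u32) -> f64 {
    1.0 / collision_probability(n, bits)
}
```

With these assumptions, `one_in(50_000, 64)` comes out at about 1.48e10 and `one_in(1_000, 64)` at about 3.69e13, which matches the figures of roughly 1 in 14,750,000,000 and 1 in 36,890,000,000,000 given above.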
commit
76c500ec6c
29 changed files with 348 additions and 127 deletions
@@ -5,13 +5,16 @@
 //! expressions) that are mostly just leftovers.
 
 pub use crate::def_id::DefPathHash;
-use crate::def_id::{CrateNum, DefId, DefIndex, LocalDefId, CRATE_DEF_INDEX, LOCAL_CRATE};
+use crate::def_id::{
+    CrateNum, DefId, DefIndex, LocalDefId, StableCrateId, CRATE_DEF_INDEX, LOCAL_CRATE,
+};
 use crate::hir;
 
-use rustc_ast::crate_disambiguator::CrateDisambiguator;
 use rustc_data_structures::fx::FxHashMap;
 use rustc_data_structures::stable_hasher::StableHasher;
+use rustc_data_structures::unhash::UnhashMap;
 use rustc_index::vec::IndexVec;
+use rustc_span::crate_disambiguator::CrateDisambiguator;
 use rustc_span::hygiene::ExpnId;
 use rustc_span::symbol::{kw, sym, Symbol};
 
@@ -27,6 +30,7 @@ use tracing::debug;
 pub struct DefPathTable {
     index_to_key: IndexVec<DefIndex, DefKey>,
     def_path_hashes: IndexVec<DefIndex, DefPathHash>,
+    def_path_hash_to_index: UnhashMap<DefPathHash, DefIndex>,
 }
 
 impl DefPathTable {
@@ -39,6 +43,35 @@ impl DefPathTable {
         };
         self.def_path_hashes.push(def_path_hash);
         debug_assert!(self.def_path_hashes.len() == self.index_to_key.len());
+
+        // Check for hash collisions of DefPathHashes. These should be
+        // exceedingly rare.
+        if let Some(existing) = self.def_path_hash_to_index.insert(def_path_hash, index) {
+            let def_path1 = DefPath::make(LOCAL_CRATE, existing, |idx| self.def_key(idx));
+            let def_path2 = DefPath::make(LOCAL_CRATE, index, |idx| self.def_key(idx));
+
+            // Continuing with colliding DefPathHashes can lead to correctness
+            // issues. We must abort compilation.
+            //
+            // The likelihood of such a collision is very small, so actually
+            // running into one could be indicative of a poor hash function
+            // being used.
+            //
+            // See the documentation for DefPathHash for more information.
+            panic!(
+                "found DefPathHash collision between {:?} and {:?}. \
+                    Compilation cannot continue.",
+                def_path1, def_path2
+            );
+        }
+
+        // Assert that all DefPathHashes correctly contain the local crate's
+        // StableCrateId
+        #[cfg(debug_assertions)]
+        if let Some(root) = self.def_path_hashes.get(CRATE_DEF_INDEX) {
+            assert!(def_path_hash.stable_crate_id() == root.stable_crate_id());
+        }
+
         index
     }
 
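The collision check above relies on a map's `insert` returning the previous value when the key is already present. A minimal standalone sketch of that pattern follows, using `std::collections::HashMap` rather than rustc's `UnhashMap`. The names `PathHash`, `Table`, and `allocate` are hypothetical stand-ins, and unlike the real code this sketch returns an error instead of panicking.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `DefPathHash` (crate half, local half).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct PathHash(u64, u64);

/// Hypothetical, simplified analogue of a table that maps indices to
/// hashes and hashes back to indices.
#[derive(Default)]
struct Table {
    hashes: Vec<PathHash>,
    hash_to_index: HashMap<PathHash, usize>,
}

impl Table {
    /// Records `hash` and returns its new index. If the same hash was
    /// already recorded, returns `Err` with the index of the existing entry.
    fn allocate(&mut self, hash: PathHash) -> Result<usize, usize> {
        let index = self.hashes.len();
        // `HashMap::insert` returns `Some(old_value)` if the key existed.
        if let Some(existing) = self.hash_to_index.insert(hash, index) {
            // Undo the overwrite so the reverse map stays consistent.
            self.hash_to_index.insert(hash, existing);
            return Err(existing);
        }
        self.hashes.push(hash);
        Ok(index)
    }
}
```

Because every allocation goes through the reverse map, the check is exhaustive within one table and costs one extra hash-map insertion per entry.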
@@ -108,13 +141,10 @@ pub struct DefKey {
 }
 
 impl DefKey {
-    fn compute_stable_hash(&self, parent_hash: DefPathHash) -> DefPathHash {
+    pub(crate) fn compute_stable_hash(&self, parent: DefPathHash) -> DefPathHash {
         let mut hasher = StableHasher::new();
 
-        // We hash a `0u8` here to disambiguate between regular `DefPath` hashes,
-        // and the special "root_parent" below.
-        0u8.hash(&mut hasher);
-        parent_hash.hash(&mut hasher);
+        parent.hash(&mut hasher);
 
         let DisambiguatedDefPathData { ref data, disambiguator } = self.disambiguated_data;
 
@@ -127,19 +157,13 @@ impl DefKey {
 
         disambiguator.hash(&mut hasher);
 
-        DefPathHash(hasher.finish())
-    }
+        let local_hash: u64 = hasher.finish();
 
-    fn root_parent_stable_hash(
-        crate_name: &str,
-        crate_disambiguator: CrateDisambiguator,
-    ) -> DefPathHash {
-        let mut hasher = StableHasher::new();
-        // Disambiguate this from a regular `DefPath` hash; see `compute_stable_hash()` above.
-        1u8.hash(&mut hasher);
-        crate_name.hash(&mut hasher);
-        crate_disambiguator.hash(&mut hasher);
-        DefPathHash(hasher.finish())
+        // Construct the new DefPathHash, making sure that the `crate_id`
+        // portion of the hash is properly copied from the parent. This way the
+        // `crate_id` part will be recursively propagated from the root to all
+        // DefPathHashes in this DefPathTable.
+        DefPathHash::new(parent.stable_crate_id(), local_hash)
     }
 }
 
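The propagation described in the comment above can be illustrated in isolation: each child hash copies the crate half from its parent and feeds the whole parent hash into the crate-local half, so both halves end up depending on the crate id. The sketch below is an assumption-laden toy, not rustc code. `CrateId`, `PathHash`, `root`, and `child` are hypothetical names, and `DefaultHasher` stands in for `StableHasher` even though its output is not guaranteed to be stable across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `StableCrateId`.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct CrateId(u64);

/// Hypothetical stand-in for `DefPathHash`: a crate half and a local half.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct PathHash {
    crate_id: CrateId,
    local_hash: u64,
}

impl PathHash {
    fn new(crate_id: CrateId, local_hash: u64) -> Self {
        PathHash { crate_id, local_hash }
    }

    /// Parent hash used for the crate root: crate id plus a zero local half.
    fn root(crate_id: CrateId) -> Self {
        PathHash::new(crate_id, 0)
    }

    /// Hash of a child path element. The crate half is copied from the
    /// parent; the local half hashes the full parent (both halves).
    fn child(self, name: &str, disambiguator: u32) -> Self {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        name.hash(&mut hasher);
        disambiguator.hash(&mut hasher);
        PathHash::new(self.crate_id, hasher.finish())
    }
}
```

Under these assumptions, building the same path `foo::bar` under two different `CrateId`s yields hashes that differ in both halves, which is the property the new unit test in this commit checks for the real types.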
@@ -295,6 +319,12 @@ impl Definitions {
         self.table.def_path_hash(id.local_def_index)
     }
 
+    #[inline]
+    pub fn def_path_hash_to_def_id(&self, def_path_hash: DefPathHash) -> LocalDefId {
+        let local_def_index = self.table.def_path_hash_to_index[&def_path_hash];
+        LocalDefId { local_def_index }
+    }
+
     /// Returns the path from the crate root to `index`. The root
     /// nodes are not included in the path (i.e., this will be an
     /// empty vector for the crate root). For an inlined item, this
@@ -332,7 +362,8 @@ impl Definitions {
             },
         };
 
-        let parent_hash = DefKey::root_parent_stable_hash(crate_name, crate_disambiguator);
+        let stable_crate_id = StableCrateId::new(crate_name, crate_disambiguator);
+        let parent_hash = DefPathHash::new(stable_crate_id, 0);
         let def_path_hash = key.compute_stable_hash(parent_hash);
 
         // Create the root definition.
@@ -30,6 +30,9 @@ mod stable_hash_impls;
 mod target;
 pub mod weak_lang_items;
 
+#[cfg(test)]
+mod tests;
+
 pub use hir::*;
 pub use hir_id::*;
 pub use lang_items::{LangItem, LanguageItems};
39
compiler/rustc_hir/src/tests.rs
Normal file
@@ -0,0 +1,39 @@
+use crate::definitions::{DefKey, DefPathData, DisambiguatedDefPathData};
+use rustc_data_structures::fingerprint::Fingerprint;
+use rustc_span::crate_disambiguator::CrateDisambiguator;
+use rustc_span::def_id::{DefPathHash, StableCrateId};
+
+#[test]
+fn def_path_hash_depends_on_crate_id() {
+    // This test makes sure that *both* halves of a DefPathHash depend on
+    // the crate-id of the defining crate. This is a desirable property
+    // because the crate-id can be more easily changed than the DefPath
+    // of an item, so, in the case of a crate-local DefPathHash collision,
+    // the user can simply "roll the dice again" for all DefPathHashes in
+    // the crate by changing the crate disambiguator (e.g. via bumping the
+    // crate's version number).
+
+    let d0 = CrateDisambiguator::from(Fingerprint::new(12, 34));
+    let d1 = CrateDisambiguator::from(Fingerprint::new(56, 78));
+
+    let h0 = mk_test_hash("foo", d0);
+    let h1 = mk_test_hash("foo", d1);
+
+    assert_ne!(h0.stable_crate_id(), h1.stable_crate_id());
+    assert_ne!(h0.local_hash(), h1.local_hash());
+
+    fn mk_test_hash(crate_name: &str, crate_disambiguator: CrateDisambiguator) -> DefPathHash {
+        let stable_crate_id = StableCrateId::new(crate_name, crate_disambiguator);
+        let parent_hash = DefPathHash::new(stable_crate_id, 0);
+
+        let key = DefKey {
+            parent: None,
+            disambiguated_data: DisambiguatedDefPathData {
+                data: DefPathData::CrateRoot,
+                disambiguator: 0,
+            },
+        };
+
+        key.compute_stable_hash(parent_hash)
+    }
+}