Rollup merge of #134920 - lqd:polonius-next-episode-6, r=jackh726

Convert typeck constraints in location-sensitive polonius

In this PR, we do a big chunk of the work of localizing regular outlives constraints.

The slightly annoying thing is handling effectful statements: usually the subset graph propagates loans at a single point between regions, and liveness propagates loans between points within a single region, but some statements have effects applied on exit.

This was also a problem before, in datalog polonius terms and Niko's solution at the time, this is about: the mid-point. The idea was to duplicate all MIR locations into two physical points, and orchestrate the effects with that. Somewhat easier to do, but double the CFG.

We've always believed we didn't _need_ midpoints in principle, as we can represent changes on exit as on happening entry to the successor, but there's some difficulty in tracking the position information at sufficient granularity through outlives relation (especially since we also have bidirectional edges and time-traveling now).

Now, that is surely what we should be doing in the future. In the mean time, I infer this from the kind of statement/terminator where an outlives constraint arose. It's not particularly complicated but some explanation will help clarify the code.

Assignments (in their various forms) are the quintessential example of these crossover cases: loans that would flow into the LHS would not be visible on entry to the point but on exit -- so we'll localize these edges to the successor. Let's look at a real-world example, involving invariance for bidirectional edges:

```rust
let mut _1: HashMap<i32, &'7 i32>;
let mut _3: &'9 mut HashMap<i32, &'10 i32>;
...
/* at bb1[3]: */ _3 = &'3 mut _1;
```

Here, typeck expectedly produces 3 outlives constraints today:
1. `'3 -> '9`
2. `'7 -> '10`
3. `'10 -> '7`

And we localize them like so,

1. `'3 -> '9` flows into the LHS and becomes: `3_bb1_3 -> 9_bb1_4`
2. `'7 -> '10` flows into the LHS and becomes: `7_bb1_3 -> 10_bb1_4`
3. `'10 -> '7` flows from the LHS and becomes: `10_bb1_4 -> 7_bb1_3` (time traveling 👌)

---

r? ``@jackh726``

To keep you entertained during the holidays I also threw in a couple of small changes removing cruft in the borrow checker.

We're actually getting there. The next PR will be the last one needed to get end-to-end tests working.
This commit is contained in:
Jacob Pratt 2025-01-08 00:52:46 -05:00 committed by GitHub
commit 808c8f84c3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
11 changed files with 309 additions and 117 deletions

View file

@ -11,7 +11,6 @@ use rustc_mir_dataflow::move_paths::MoveData;
use tracing::debug;
use crate::BorrowIndex;
use crate::path_utils::allow_two_phase_borrow;
use crate::place_ext::PlaceExt;
pub struct BorrowSet<'tcx> {
@ -350,7 +349,7 @@ impl<'a, 'tcx> GatherBorrows<'a, 'tcx> {
start_location, assigned_place, borrow_index,
);
if !allow_two_phase_borrow(kind) {
if !kind.allows_two_phase_borrow() {
debug!(" -> {:?}", start_location);
return;
}

View file

@ -10,6 +10,7 @@ use rustc_hir::intravisit::Visitor;
use rustc_hir::{self as hir, BindingMode, ByRef, Node};
use rustc_middle::bug;
use rustc_middle::hir::place::PlaceBase;
use rustc_middle::mir::visit::PlaceContext;
use rustc_middle::mir::{
self, BindingForm, Local, LocalDecl, LocalInfo, LocalKind, Location, Mutability, Place,
PlaceRef, ProjectionElem,
@ -22,7 +23,6 @@ use rustc_trait_selection::traits;
use tracing::debug;
use crate::diagnostics::BorrowedContentSource;
use crate::util::FindAssignments;
use crate::{MirBorrowckCtxt, session_diagnostics};
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
@ -1088,6 +1088,38 @@ impl<'infcx, 'tcx> MirBorrowckCtxt<'_, 'infcx, 'tcx> {
}
}
/// Finds all statements that assign directly to local (i.e., X = ...) and returns their
/// locations.
fn find_assignments(&self, local: Local) -> Vec<Location> {
use rustc_middle::mir::visit::Visitor;
struct FindLocalAssignmentVisitor {
needle: Local,
locations: Vec<Location>,
}
impl<'tcx> Visitor<'tcx> for FindLocalAssignmentVisitor {
fn visit_local(
&mut self,
local: Local,
place_context: PlaceContext,
location: Location,
) {
if self.needle != local {
return;
}
if place_context.is_place_assignment() {
self.locations.push(location);
}
}
}
let mut visitor = FindLocalAssignmentVisitor { needle: local, locations: vec![] };
visitor.visit_body(self.body);
visitor.locations
}
fn suggest_make_local_mut(&self, err: &mut Diag<'_>, local: Local, name: Symbol) {
let local_decl = &self.body.local_decls[local];
@ -1121,7 +1153,7 @@ impl<'infcx, 'tcx> MirBorrowckCtxt<'_, 'infcx, 'tcx> {
})) => {
// check if the RHS is from desugaring
let opt_assignment_rhs_span =
self.body.find_assignments(local).first().map(|&location| {
self.find_assignments(local).first().map(|&location| {
if let Some(mir::Statement {
source_info: _,
kind:

View file

@ -17,7 +17,7 @@
use std::cell::RefCell;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ops::{ControlFlow, Deref};
use rustc_abi::FieldIdx;
use rustc_data_structures::fx::{FxIndexMap, FxIndexSet};
@ -81,7 +81,6 @@ mod session_diagnostics;
mod type_check;
mod universal_regions;
mod used_muts;
mod util;
/// A public API provided for the Rust compiler consumers.
pub mod consumers;
@ -1054,31 +1053,31 @@ impl<'a, 'tcx> MirBorrowckCtxt<'a, '_, 'tcx> {
rw,
(borrow_index, borrow),
);
Control::Continue
ControlFlow::Continue(())
}
(Read(_), BorrowKind::Shared | BorrowKind::Fake(_))
| (
Read(ReadKind::Borrow(BorrowKind::Fake(FakeBorrowKind::Shallow))),
BorrowKind::Mut { .. },
) => Control::Continue,
) => ControlFlow::Continue(()),
(Reservation(_), BorrowKind::Fake(_) | BorrowKind::Shared) => {
// This used to be a future compatibility warning (to be
// disallowed on NLL). See rust-lang/rust#56254
Control::Continue
ControlFlow::Continue(())
}
(Write(WriteKind::Move), BorrowKind::Fake(FakeBorrowKind::Shallow)) => {
// Handled by initialization checks.
Control::Continue
ControlFlow::Continue(())
}
(Read(kind), BorrowKind::Mut { .. }) => {
// Reading from mere reservations of mutable-borrows is OK.
if !is_active(this.dominators(), borrow, location) {
assert!(allow_two_phase_borrow(borrow.kind));
return Control::Continue;
assert!(borrow.kind.allows_two_phase_borrow());
return ControlFlow::Continue(());
}
error_reported = true;
@ -1094,7 +1093,7 @@ impl<'a, 'tcx> MirBorrowckCtxt<'a, '_, 'tcx> {
this.buffer_error(err);
}
}
Control::Break
ControlFlow::Break(())
}
(Reservation(kind) | Activation(kind, _) | Write(kind), _) => {
@ -1141,7 +1140,7 @@ impl<'a, 'tcx> MirBorrowckCtxt<'a, '_, 'tcx> {
this.report_illegal_mutation_of_borrowed(location, place_span, borrow)
}
}
Control::Break
ControlFlow::Break(())
}
},
);
@ -1185,7 +1184,7 @@ impl<'a, 'tcx> MirBorrowckCtxt<'a, '_, 'tcx> {
}
BorrowKind::Mut { .. } => {
let wk = WriteKind::MutableBorrow(bk);
if allow_two_phase_borrow(bk) {
if bk.allows_two_phase_borrow() {
(Deep, Reservation(wk))
} else {
(Deep, Write(wk))

View file

@ -142,9 +142,9 @@ pub(crate) fn compute_regions<'a, 'tcx>(
// If requested for `-Zpolonius=next`, convert NLL constraints to localized outlives
// constraints.
let localized_outlives_constraints = polonius_context
.as_mut()
.map(|polonius_context| polonius_context.create_localized_constraints(&mut regioncx, body));
let localized_outlives_constraints = polonius_context.as_mut().map(|polonius_context| {
polonius_context.create_localized_constraints(infcx.tcx, &regioncx, body)
});
// If requested: dump NLL facts, and run legacy polonius analysis.
let polonius_output = all_facts.as_ref().and_then(|all_facts| {

View file

@ -1,26 +1,14 @@
use std::ops::ControlFlow;
use rustc_abi::FieldIdx;
use rustc_data_structures::graph::dominators::Dominators;
use rustc_middle::mir::{BasicBlock, Body, BorrowKind, Location, Place, PlaceRef, ProjectionElem};
use rustc_middle::mir::{BasicBlock, Body, Location, Place, PlaceRef, ProjectionElem};
use rustc_middle::ty::TyCtxt;
use tracing::debug;
use crate::borrow_set::{BorrowData, BorrowSet, TwoPhaseActivation};
use crate::{AccessDepth, BorrowIndex, places_conflict};
/// Returns `true` if the borrow represented by `kind` is
/// allowed to be split into separate Reservation and
/// Activation phases.
pub(super) fn allow_two_phase_borrow(kind: BorrowKind) -> bool {
kind.allows_two_phase_borrow()
}
/// Control for the path borrow checking code
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
pub(super) enum Control {
Continue,
Break,
}
/// Encapsulates the idea of iterating over every borrow that involves a particular path
pub(super) fn each_borrow_involving_path<'tcx, F, I, S>(
s: &mut S,
@ -31,7 +19,7 @@ pub(super) fn each_borrow_involving_path<'tcx, F, I, S>(
is_candidate: I,
mut op: F,
) where
F: FnMut(&mut S, BorrowIndex, &BorrowData<'tcx>) -> Control,
F: FnMut(&mut S, BorrowIndex, &BorrowData<'tcx>) -> ControlFlow<()>,
I: Fn(BorrowIndex) -> bool,
{
let (access, place) = access_place;
@ -62,7 +50,7 @@ pub(super) fn each_borrow_involving_path<'tcx, F, I, S>(
i, borrowed, place, access
);
let ctrl = op(s, i, borrowed);
if ctrl == Control::Break {
if matches!(ctrl, ControlFlow::Break(_)) {
return;
}
}

View file

@ -1,3 +1,5 @@
use std::ops::ControlFlow;
use rustc_data_structures::graph::dominators::Dominators;
use rustc_middle::bug;
use rustc_middle::mir::visit::Visitor;
@ -260,7 +262,7 @@ impl<'a, 'tcx> LoanInvalidationsGenerator<'a, 'tcx> {
}
BorrowKind::Mut { .. } => {
let wk = WriteKind::MutableBorrow(bk);
if allow_two_phase_borrow(bk) {
if bk.allows_two_phase_borrow() {
(Deep, Reservation(wk))
} else {
(Deep, Write(wk))
@ -378,8 +380,8 @@ impl<'a, 'tcx> LoanInvalidationsGenerator<'a, 'tcx> {
// Reading from mere reservations of mutable-borrows is OK.
if !is_active(this.dominators, borrow, location) {
// If the borrow isn't active yet, reads don't invalidate it
assert!(allow_two_phase_borrow(borrow.kind));
return Control::Continue;
assert!(borrow.kind.allows_two_phase_borrow());
return ControlFlow::Continue(());
}
// Unique and mutable borrows are invalidated by reads from any
@ -395,7 +397,7 @@ impl<'a, 'tcx> LoanInvalidationsGenerator<'a, 'tcx> {
this.emit_loan_invalidated_at(borrow_index, location);
}
}
Control::Continue
ControlFlow::Continue(())
},
);
}

View file

@ -37,21 +37,20 @@ mod constraints;
mod dump;
pub(crate) mod legacy;
mod liveness_constraints;
mod typeck_constraints;
use std::collections::BTreeMap;
use rustc_index::bit_set::SparseBitMatrix;
use rustc_middle::mir::{Body, Location};
use rustc_middle::ty::RegionVid;
use rustc_middle::mir::Body;
use rustc_middle::ty::{RegionVid, TyCtxt};
use rustc_mir_dataflow::points::PointIndex;
pub(crate) use self::constraints::*;
pub(crate) use self::dump::dump_polonius_mir;
use self::liveness_constraints::create_liveness_constraints;
use self::typeck_constraints::convert_typeck_constraints;
use crate::RegionInferenceContext;
use crate::constraints::OutlivesConstraint;
use crate::region_infer::values::LivenessValues;
use crate::type_check::Locations;
/// This struct holds the data needed to create the Polonius localized constraints.
pub(crate) struct PoloniusContext {
@ -88,14 +87,17 @@ impl PoloniusContext {
/// - encoding liveness constraints
pub(crate) fn create_localized_constraints<'tcx>(
&self,
tcx: TyCtxt<'tcx>,
regioncx: &RegionInferenceContext<'tcx>,
body: &Body<'tcx>,
) -> LocalizedOutlivesConstraintSet {
let mut localized_outlives_constraints = LocalizedOutlivesConstraintSet::default();
convert_typeck_constraints(
tcx,
body,
regioncx.liveness_constraints(),
regioncx.outlives_constraints(),
regioncx.universal_regions(),
&mut localized_outlives_constraints,
);
@ -117,38 +119,3 @@ impl PoloniusContext {
localized_outlives_constraints
}
}
/// Propagate loans throughout the subset graph at a given point (with some subtleties around the
/// location where effects start to be visible).
fn convert_typeck_constraints<'tcx>(
body: &Body<'tcx>,
liveness: &LivenessValues,
outlives_constraints: impl Iterator<Item = OutlivesConstraint<'tcx>>,
localized_outlives_constraints: &mut LocalizedOutlivesConstraintSet,
) {
for outlives_constraint in outlives_constraints {
match outlives_constraint.locations {
Locations::All(_) => {
// For now, turn logical constraints holding at all points into physical edges at
// every point in the graph.
// FIXME: encode this into *traversal* instead.
for (block, bb) in body.basic_blocks.iter_enumerated() {
let statement_count = bb.statements.len();
for statement_index in 0..=statement_count {
let current_location = Location { block, statement_index };
let current_point = liveness.point_from_location(current_location);
localized_outlives_constraints.push(LocalizedOutlivesConstraint {
source: outlives_constraint.sup,
from: current_point,
target: outlives_constraint.sub,
to: current_point,
});
}
}
}
_ => {}
}
}
}

View file

@ -0,0 +1,241 @@
use rustc_data_structures::fx::FxHashSet;
use rustc_middle::mir::{Body, Location, Statement, StatementKind, Terminator, TerminatorKind};
use rustc_middle::ty::{TyCtxt, TypeVisitable};
use rustc_mir_dataflow::points::PointIndex;
use super::{LocalizedOutlivesConstraint, LocalizedOutlivesConstraintSet};
use crate::constraints::OutlivesConstraint;
use crate::region_infer::values::LivenessValues;
use crate::type_check::Locations;
use crate::universal_regions::UniversalRegions;
/// Propagate loans throughout the subset graph at a given point (with some subtleties around the
/// location where effects start to be visible).
pub(super) fn convert_typeck_constraints<'tcx>(
tcx: TyCtxt<'tcx>,
body: &Body<'tcx>,
liveness: &LivenessValues,
outlives_constraints: impl Iterator<Item = OutlivesConstraint<'tcx>>,
universal_regions: &UniversalRegions<'tcx>,
localized_outlives_constraints: &mut LocalizedOutlivesConstraintSet,
) {
for outlives_constraint in outlives_constraints {
match outlives_constraint.locations {
Locations::All(_) => {
// For now, turn logical constraints holding at all points into physical edges at
// every point in the graph.
// FIXME: encode this into *traversal* instead.
for (block, bb) in body.basic_blocks.iter_enumerated() {
let statement_count = bb.statements.len();
for statement_index in 0..=statement_count {
let current_location = Location { block, statement_index };
let current_point = liveness.point_from_location(current_location);
localized_outlives_constraints.push(LocalizedOutlivesConstraint {
source: outlives_constraint.sup,
from: current_point,
target: outlives_constraint.sub,
to: current_point,
});
}
}
}
Locations::Single(location) => {
// This constraint is marked as holding at one location, we localize it to that
// location or its successor, depending on the corresponding MIR
// statement/terminator. Unfortunately, they all show up from typeck as coming "on
// entry", so for now we modify them to take effects that should apply "on exit"
// into account.
//
// FIXME: this approach is subtle, complicated, and hard to test, so we should track
// this information better in MIR typeck instead, for example with a new `Locations`
// variant that contains which node is crossing over between entry and exit.
let point = liveness.point_from_location(location);
let localized_constraint = if let Some(stmt) =
body[location.block].statements.get(location.statement_index)
{
localize_statement_constraint(
tcx,
body,
stmt,
liveness,
&outlives_constraint,
location,
point,
universal_regions,
)
} else {
assert_eq!(location.statement_index, body[location.block].statements.len());
let terminator = body[location.block].terminator();
localize_terminator_constraint(
tcx,
body,
terminator,
liveness,
&outlives_constraint,
point,
universal_regions,
)
};
localized_outlives_constraints.push(localized_constraint);
}
}
}
}
/// For a given outlives constraint arising from a MIR statement, localize the constraint with the
/// needed CFG `from`-`to` intra-block nodes.
fn localize_statement_constraint<'tcx>(
tcx: TyCtxt<'tcx>,
body: &Body<'tcx>,
stmt: &Statement<'tcx>,
liveness: &LivenessValues,
outlives_constraint: &OutlivesConstraint<'tcx>,
current_location: Location,
current_point: PointIndex,
universal_regions: &UniversalRegions<'tcx>,
) -> LocalizedOutlivesConstraint {
match &stmt.kind {
StatementKind::Assign(box (lhs, rhs)) => {
// To create localized outlives constraints without midpoints, we rely on the property
// that no input regions from the RHS of the assignment will flow into themselves: they
// should not appear in the output regions in the LHS. We believe this to be true by
// construction of the MIR, via temporaries, and assert it here.
//
// We think we don't need midpoints because:
// - every LHS Place has a unique set of regions that don't appear elsewhere
// - this implies that for them to be part of the RHS, the same Place must be read and
// written
// - and that should be impossible in MIR
//
// When we have a more complete implementation in the future, tested with crater, etc,
// we can relax this to a debug assert instead, or remove it.
assert!(
{
let mut lhs_regions = FxHashSet::default();
tcx.for_each_free_region(lhs, |region| {
let region = universal_regions.to_region_vid(region);
lhs_regions.insert(region);
});
let mut rhs_regions = FxHashSet::default();
tcx.for_each_free_region(rhs, |region| {
let region = universal_regions.to_region_vid(region);
rhs_regions.insert(region);
});
// The intersection between LHS and RHS regions should be empty.
lhs_regions.is_disjoint(&rhs_regions)
},
"there should be no common regions between the LHS and RHS of an assignment"
);
// As mentioned earlier, we should be tracking these better upstream but: we want to
// relate the types on entry to the type of the place on exit. That is, outlives
// constraints on the RHS are on entry, and outlives constraints to/from the LHS are on
// exit (i.e. on entry to the successor location).
let lhs_ty = body.local_decls[lhs.local].ty;
let successor_location = Location {
block: current_location.block,
statement_index: current_location.statement_index + 1,
};
let successor_point = liveness.point_from_location(successor_location);
compute_constraint_direction(
tcx,
outlives_constraint,
&lhs_ty,
current_point,
successor_point,
universal_regions,
)
}
_ => {
// For the other cases, we localize an outlives constraint to where it arises.
LocalizedOutlivesConstraint {
source: outlives_constraint.sup,
from: current_point,
target: outlives_constraint.sub,
to: current_point,
}
}
}
}
/// For a given outlives constraint arising from a MIR terminator, localize the constraint with the
/// needed CFG `from`-`to` inter-block nodes.
fn localize_terminator_constraint<'tcx>(
tcx: TyCtxt<'tcx>,
body: &Body<'tcx>,
terminator: &Terminator<'tcx>,
liveness: &LivenessValues,
outlives_constraint: &OutlivesConstraint<'tcx>,
current_point: PointIndex,
universal_regions: &UniversalRegions<'tcx>,
) -> LocalizedOutlivesConstraint {
// FIXME: check if other terminators need the same handling as `Call`s, in particular
// Assert/Yield/Drop. A handful of tests are failing with Drop related issues, as well as some
// coroutine tests, and that may be why.
match &terminator.kind {
// FIXME: also handle diverging calls.
TerminatorKind::Call { destination, target: Some(target), .. } => {
// Calls are similar to assignments, and thus follow the same pattern. If there is a
// target for the call we also relate what flows into the destination here to entry to
// that successor.
let destination_ty = destination.ty(&body.local_decls, tcx);
let successor_location = Location { block: *target, statement_index: 0 };
let successor_point = liveness.point_from_location(successor_location);
compute_constraint_direction(
tcx,
outlives_constraint,
&destination_ty,
current_point,
successor_point,
universal_regions,
)
}
_ => {
// Typeck constraints guide loans between regions at the current point, so we do that in
// the general case, and liveness will take care of making them flow to the terminator's
// successors.
LocalizedOutlivesConstraint {
source: outlives_constraint.sup,
from: current_point,
target: outlives_constraint.sub,
to: current_point,
}
}
}
}
/// For a given outlives constraint and CFG edge, returns the localized constraint with the
/// appropriate `from`-`to` direction. This is computed according to whether the constraint flows to
/// or from a free region in the given `value`, some kind of result for an effectful operation, like
/// the LHS of an assignment.
fn compute_constraint_direction<'tcx>(
tcx: TyCtxt<'tcx>,
outlives_constraint: &OutlivesConstraint<'tcx>,
value: &impl TypeVisitable<TyCtxt<'tcx>>,
current_point: PointIndex,
successor_point: PointIndex,
universal_regions: &UniversalRegions<'tcx>,
) -> LocalizedOutlivesConstraint {
let mut to = current_point;
let mut from = current_point;
tcx.for_each_free_region(value, |region| {
let region = universal_regions.to_region_vid(region);
if region == outlives_constraint.sub {
// This constraint flows into the result, its effects start becoming visible on exit.
to = successor_point;
} else if region == outlives_constraint.sup {
// This constraint flows from the result, its effects start becoming visible on exit.
from = successor_point;
}
});
LocalizedOutlivesConstraint {
source: outlives_constraint.sup,
from,
target: outlives_constraint.sub,
to,
}
}

View file

@ -1,35 +0,0 @@
use rustc_middle::mir::visit::{PlaceContext, Visitor};
use rustc_middle::mir::{Body, Local, Location};
pub(crate) trait FindAssignments {
// Finds all statements that assign directly to local (i.e., X = ...)
// and returns their locations.
fn find_assignments(&self, local: Local) -> Vec<Location>;
}
impl<'tcx> FindAssignments for Body<'tcx> {
fn find_assignments(&self, local: Local) -> Vec<Location> {
let mut visitor = FindLocalAssignmentVisitor { needle: local, locations: vec![] };
visitor.visit_body(self);
visitor.locations
}
}
// The Visitor walks the MIR to return the assignment statements corresponding
// to a Local.
struct FindLocalAssignmentVisitor {
needle: Local,
locations: Vec<Location>,
}
impl<'tcx> Visitor<'tcx> for FindLocalAssignmentVisitor {
fn visit_local(&mut self, local: Local, place_context: PlaceContext, location: Location) {
if self.needle != local {
return;
}
if place_context.is_place_assignment() {
self.locations.push(location);
}
}
}

View file

@ -1,3 +0,0 @@
mod collect_writes;
pub(crate) use collect_writes::FindAssignments;