This attempts to bring better error messages to invalid method calls, by applying some heuristics to identify common mistakes.
The algorithm is inspired by Levenshtein distance and longest common sub-sequence. In essence, we treat the types of the function, and the types of the arguments you provided as two "words" and compute the edits to get from one to the other.
We then modify that algorithm to detect 4 cases:
- A function input is missing
- An extra argument was provided
- The type of an argument is straight up invalid
- Two arguments have been swapped
- A subset of the arguments have been shuffled
(We detect the last two as separate cases so that we can detect two swaps, instead of 4 parameters permuted.)
It helps to understand this argument by paying special attention to terminology: "inputs" refers to the inputs being *expected* by the function, and "arguments" refers to what has been provided at the call site.
The basic sketch of the algorithm is as follows:
- Construct a boolean grid, with a row for each argument, and a column for each input. The cell [i, j] is true if the i'th argument could satisfy the j'th input.
- If we find an argument that could satisfy no inputs, provided for an input that can't be satisfied by any other argument, we consider this an "invalid type".
- Extra arguments are those that can't satisfy any input, provided for an input that *could* be satisfied by another argument.
- Missing inputs are inputs that can't be satisfied by any argument, where the provided argument could satisfy another input
- Swapped / Permuted arguments are identified with a cycle detection algorithm.
As each issue is found, we remove the relevant inputs / arguments and check for more issues. If we find no issues, we match up any "valid" arguments, and start again.
Note that there's a lot of extra complexity:
- We try to stay efficient on the happy path, only computing the diagonal until we find a problem, and then filling in the rest of the matrix.
- Closure arguments are wrapped in a tuple and need to be unwrapped
- We need to resolve closure types after the rest, to allow the most specific type constraints
- We need to handle imported C functions that might be variadic in their inputs.
I tried to document a lot of this in comments in the code and keep the naming clear.
Specifically, rename the `Const` struct as `ConstS` and re-introduce `Const` as
this:
```
pub struct Const<'tcx>(&'tcx Interned<ConstS>);
```
This now matches `Ty` and `Predicate` more closely, including using
pointer-based `eq` and `hash`.
Notable changes:
- `mk_const` now takes a `ConstS`.
- `Const` was copy, despite being 48 bytes. Now `ConstS` is not, so need a
we need separate arena for it, because we can't use the `Dropless` one any
more.
- Many `&'tcx Const<'tcx>`/`&Const<'tcx>` to `Const<'tcx>` changes
- Many `ct.ty` to `ct.ty()` and `ct.val` to `ct.val()` changes.
- Lots of tedious sigil fiddling.
Specifically, change `Ty` from this:
```
pub type Ty<'tcx> = &'tcx TyS<'tcx>;
```
to this
```
pub struct Ty<'tcx>(Interned<'tcx, TyS<'tcx>>);
```
There are two benefits to this.
- It's now a first class type, so we can define methods on it. This
means we can move a lot of methods away from `TyS`, leaving `TyS` as a
barely-used type, which is appropriate given that it's not meant to
be used directly.
- The uniqueness requirement is now explicit, via the `Interned` type.
E.g. the pointer-based `Eq` and `Hash` comes from `Interned`, rather
than via `TyS`, which wasn't obvious at all.
Much of this commit is boring churn. The interesting changes are in
these files:
- compiler/rustc_middle/src/arena.rs
- compiler/rustc_middle/src/mir/visit.rs
- compiler/rustc_middle/src/ty/context.rs
- compiler/rustc_middle/src/ty/mod.rs
Specifically:
- Most mentions of `TyS` are removed. It's very much a dumb struct now;
`Ty` has all the smarts.
- `TyS` now has `crate` visibility instead of `pub`.
- `TyS::make_for_test` is removed in favour of the static `BOOL_TY`,
which just works better with the new structure.
- The `Eq`/`Ord`/`Hash` impls are removed from `TyS`. `Interned`s impls
of `Eq`/`Hash` now suffice. `Ord` is now partly on `Interned`
(pointer-based, for the `Equal` case) and partly on `TyS`
(contents-based, for the other cases).
- There are many tedious sigil adjustments, i.e. adding or removing `*`
or `&`. They seem to be unavoidable.
Add in ValuePair::Term
This adds in an enum when matching on positions which can either be types or consts.
It will default to emitting old special cased error messages for types.
r? `@oli-obk`
cc `@matthiaskrgr`
Fixes#93578
This adds in an enum when matching on positions which can either be types or consts.
It will default to emitting old special cased error messages for types.
by using an opaque type obligation to bubble up comparisons between opaque types and other types
Also uses proper obligation causes so that the body id works, because out of some reason nll uses body ids for logic instead of just diagnostics.
This makes `Obligation` two words bigger, but avoids allocating a lot of
the time.
I previously tried this in #73983 and it didn't help much, but local
timings look more promising now.
This crate actually had a typo `'ctx` in one of its functions:
```diff
-pub fn same_type_modulo_infer(a: Ty<'tcx>, b: Ty<'ctx>) -> bool {
+pub fn same_type_modulo_infer<'tcx>(a: Ty<'tcx>, b: Ty<'tcx>) -> bool {
```
Implementation of GATs outlives lint
See #87479 for background. Closes#87479
The basic premise of this lint/error is to require the user to write where clauses on a GAT when those bounds can be implied or proven from any function on the trait returning that GAT.
## Intuitive Explanation (Attempt) ##
Let's take this trait definition as an example:
```rust
trait Iterable {
type Item<'x>;
fn iter<'a>(&'a self) -> Self::Item<'a>;
}
```
Let's focus on the `iter` function. The first thing to realize is that we know that `Self: 'a` because of `&'a self`. If an impl wants `Self::Item` to contain any data with references, then those references must be derived from `&'a self`. Thus, they must live only as long as `'a`. Furthermore, because of the `Self: 'a` implied bound, they must live only as long as `Self`. Since it's `'a` is used in place of `'x`, it is reasonable to assume that any value of `Self::Item<'x>`, and thus `'x`, will only be able to live as long as `Self`. Therefore, we require this bound on `Item` in the trait.
As another example:
```rust
trait Deserializer<T> {
type Out<'x>;
fn deserialize<'a>(&self, input: &'a T) -> Self::Out<'a>;
}
```
The intuition is similar here, except rather than a `Self: 'a` implied bound, we have a `T: 'a` implied bound. Thus, the data on `Self::Out<'a>` is derived from `&'a T`, and thus it is reasonable to expect that the lifetime `'x` will always be less than `T`.
## Implementation Algorithm ##
* Given a GAT `<P0 as Trait<P1..Pi>>::G<Pi...Pn>` declared as `trait T<A1..Ai> for A0 { type G<Ai...An>; }` used in return type of one associated function `F`
* Given env `E` (including implied bounds) for `F`
* For each lifetime parameter `'a` in `P0...Pn`:
* For each other type parameter `Pi != 'a` in `P0...Pn`: // FIXME: this include of lifetime parameters too
* If `E => (P: 'a)`:
* Require where clause `Ai: 'a`
## Follow-up questions ##
* What should we do when we don't pass params exactly?
For this example:
```rust
trait Des {
type Out<'x, D>;
fn des<'z, T>(&self, data: &'z Wrap<T>) -> Self::Out<'z, Wrap<T>>;
}
```
Should we be requiring a `D: 'x` clause? We pass `Wrap<T>` as `D` and `'z` as `'x`, and should be able to prove that `Wrap<T>: 'z`.
r? `@nikomatsakis`