Fix parenthesization of chained comparisons by pretty-printer
Example:
```rust
macro_rules! repro {
() => {
1 < 2
};
}
fn main() {
let _ = repro!() == false;
}
```
Previously `-Zunpretty=expanded` would pretty-print this syntactically invalid output: `fn main() { let _ = 1 < 2 == false; }`
```console
error: comparison operators cannot be chained
--> <anon>:8:23
|
8 | fn main() { let _ = 1 < 2 == false; }
| ^ ^^
|
help: parenthesize the comparison
|
8 | fn main() { let _ = (1 < 2) == false; }
| + +
```
With the fix, it will print `fn main() { let _ = (1 < 2) == false; }`.
Making `-Zunpretty=expanded` consistently produce syntactically valid Rust output is important because that is what makes it possible for `cargo expand` to format and perform filtering on the expanded code.
## Review notes
According to `rg '\.fixity\(\)' compiler/` the `fixity` function is called only 3 places:
- 13170cd787/compiler/rustc_ast_pretty/src/pprust/state/expr.rs (L283-L287)
- 13170cd787/compiler/rustc_hir_pretty/src/lib.rs (L1295-L1299)
- 13170cd787/compiler/rustc_parse/src/parser/expr.rs (L282-L289)
The 2 pretty printers definitely want to treat comparisons using `Fixity::None`. That's the whole bug being fixed. Meanwhile, the parser's `Fixity::None` codepath is previously unreachable as indicated by the comment, so as long as `Fixity::None` here behaves exactly the way that `Fixity::Left` used to behave, you can tell that this PR definitely does not constitute any behavior change for the parser.
My guess for why comparison operators were set to `Fixity::Left` instead of `Fixity::None` is that it's a very old workaround for giving a good chained comparisons diagnostic (like what I pasted above). Nowadays that is handled by a different dedicated codepath.
Detect missing `.` in method chain in `let` bindings and statements
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map`
--> $DIR/missing-dot-on-statement-expression.rs:7:29
|
LL | let _ = [1, 2, 3].iter()map(|x| x);
| ^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: you might have meant to write a method call
|
LL | let _ = [1, 2, 3].iter().map(|x| x);
| +
```
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map`
--> $DIR/missing-dot-on-statement-expression.rs:7:29
|
LL | let _ = [1, 2, 3].iter()map(|x| x);
| ^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: you might have meant to write a method call
|
LL | let _ = [1, 2, 3].iter().map(|x| x);
| +
```
Instead use dcx.abort_if_error() or guar.raise_fatal() instead. These
guarantee that an error actually happened previously and thus we don't
silently abort.
Currently it relies on having the right integer for every variant, and
if you add a variant you need to adjust the integers for all subsequent
variants, which is a pain.
This commit introduces a match guard formulation that takes advantage of
the enum-to-integer conversion to avoid specifying the integer for each
variant. And it does this via a macro to avoid lots of boilerplate.
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.
This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)
The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.
Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.
The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.
The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
This is a naming convention used in a handful of spots in the parser for
delimiters. It confused me when I first saw it a long time ago, and I've
never liked it. A web search says "Bra-ket notation" exists in linear
algebra but the terminology has zero prior use in a programming context,
as far as I can tell.
This commit changes it to `open`/`close`, which is consistent with the
rest of the compiler.
The most significant is `check_keyword`: it now only pushes to
`expected_token_types` if the keyword check fails, which matches how all
the other `check` methods work.
The remainder are just tweaks to make these methods more consistent with
each other.
Use field init shorthand where possible
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
EDIT: this PR also updates `rustfmt.toml` to set
`use_field_init_shorthand = true`.
Overhaul keyword handling
The compiler's list of keywords has some problems.
- It contains several items that aren't keywords.
- The order isn't quite right in a couple of places.
- Some of the names of predicates relating to keywords are confusing.
- rustdoc and rustfmt have their own (incorrect) versions of the keyword list.
- `AllKeywords` is unnecessarily complex.
r? ```@jieyouxu```
`rustc_symbol` is the source of truth for keywords.
rustdoc has its own implicit definition of keywords, via the
`is_doc_keyword`. It (presumably) intends to include all keywords, but
it omits `yeet`.
rustfmt has its own explicit list of Rust keywords. It also (presumably)
intends to include all keywords, but it omits `await`, `builtin`, `gen`,
`macro_rules`, `raw`, `reuse`, `safe`, and `yeet`. Also, it does linear
searches through this list, which is inefficient.
This commit fixes all of the above problems by introducing a new
predicate `is_any_keyword` in rustc and using it in rustdoc and rustfmt.
It documents that it's not the right predicate in most cases.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
- Move it to `rustc_parse`, which is the only crate that uses it. This
lets us remove all the `pub` markers from it.
- Change `next_ref` and `look_ahead` to `get` and `bump`, which work
better for the `rustc_parse` uses.
- This requires adding a `TokenStream::get` method, which is simple.
- In `TokenCursor`, we currently duplicate the
`DelimSpan`/`DelimSpacing`/`Delimiter` from the surrounding
`TokenTree::Delimited` in the stack. This isn't necessary so long as
we don't prematurely move past the `Delimited`, and is a small perf
win on a very hot code path.
- In `parse_token_tree`, we clone the relevant `TokenTree::Delimited`
instead of constructing an identical one from pieces.
Because `TokenStreamIter` is a much better name for a `TokenStream`
iterator. Also rename the `TokenStream::trees` method as
`TokenStream::iter`, and some local variables.
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
Keep track of patterns that could have introduced a binding, but didn't
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:18:30
|
LL | if let Website { url, .. } = website {
| ------------------- this pattern doesn't include `title`, which is available in `Website`
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
```
Fix#74863.
Remove `Lexer`'s dependency on `Parser`.
Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.
But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.
This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.
The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).
In my opinion this cost is outweighed by the magnitude of the code cleanup.
r? ```````@chenyukang```````
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:19:30
|
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
|
note: `Website` has a field `title` which could have been included in this pattern, but it wasn't
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:17:12
|
LL | / struct Website {
LL | | url: String,
LL | | title: Option<String> ,
| | ----- defined here
LL | | }
| |_-
...
LL | if let Website { url, .. } = website {
| ^^^^^^^^^^^^^^^^^^^ this pattern doesn't include `title`, which is available in `Website`
```
Fix#74863.
Add AST support for unsafe binders
I'm splitting up #130514 into pieces. It's impossible for me to keep up with a huge PR like that. I'll land type system support for this next, probably w/o MIR lowering, which will come later.
r? `@oli-obk`
cc `@BoxyUwU` and `@lcnr` who also may want to look at this, though this PR doesn't do too much yet
Keep track of parse errors in `mod`s and don't emit resolve errors for paths involving them
When we expand a `mod foo;` and parse `foo.rs`, we now track whether that file had an unrecovered parse error that reached the end of the file. If so, we keep that information around in the HIR and mark its `DefId` in the `Resolver`. When resolving a path like `foo::bar`, we do not emit any errors for "`bar` not found in `foo`", as we know that the parse error might have caused `bar` to not be parsed and accounted for.
When this happens in an existing project, every path referencing `foo` would be an irrelevant compile error. Instead, we now skip emitting anything until `foo.rs` is fixed. Tellingly enough, we didn't have any test for errors caused by expansion of `mod`s with parse errors.
Fix https://github.com/rust-lang/rust/issues/97734.
Lexing precedes parsing, as you'd expect: `Lexer` creates a
`TokenStream` and `Parser` then parses that `TokenStream`.
But, in a horrendous violation of layering abstractions and common
sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err`
method does some error recovery that relies on creating a `Parser` to do
some post-processing of the `TokenStream` that the `Lexer` just created.
This commit just removes `unclosed_delim_err`. This change removes
`Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s
return value can have a more typical form.
The cost is slightly worse error messages in two obscure cases, as shown
in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less
explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff
marker detection is no longer supported (because that detection is
implemented in the parser).
In my opinion this cost is outweighed by the magnitude of the code
cleanup.
allow `symbol_intern_string_literal` lint in test modules
Since #133545, `x check compiler --stage 1` no longer works because compiler test modules trigger `symbol_intern_string_literal` lint errors. Bootstrap shouldn't control when to ignore or enable this lint in the compiler tree (using `Kind != Test` was ineffective for obvious reasons).
Also, conditionally adding this rustflag invalidates the build cache between `x test` and other commands.
This PR removes the `Kind` check from bootstrap and handles it directly in the compiler tree in a more natural way.
When we expand a `mod foo;` and parse `foo.rs`, we now track whether that file had an unrecovered parse error that reached the end of the file. If so, we keep that information around. When resolving a path like `foo::bar`, we do not emit any errors for "`bar` not found in `foo`", as we know that the parse error might have caused `bar` to not be parsed and accounted for.
When this happens in an existing project, every path referencing `foo` would be an irrelevant compile error. Instead, we now skip emitting anything until `foo.rs` is fixed. Tellingly enough, we didn't have any test for errors caused by `mod` expansion.
Fix#97734.
Add test to check unicode identifier version
This adds a test to verify which version of Unicode is used for identifiers. This is part of the language, documented at https://doc.rust-lang.org/nightly/reference/identifiers.html#r-ident.unicode. The version here often changes implicitly due to dependency updates pulling in new versions, and thus we often don't notice it has changed leaving the documentation out of date. The intent here is to have a canary to give us a notification when it changes so that we can update the documentation.
Initial implementation of `#[feature(default_field_values]`, proposed in https://github.com/rust-lang/rfcs/pull/3681.
Support default fields in enum struct variant
Allow default values in an enum struct variant definition:
```rust
pub enum Bar {
Foo {
bar: S = S,
baz: i32 = 42 + 3,
}
}
```
Allow using `..` without a base on an enum struct variant
```rust
Bar::Foo { .. }
```
`#[derive(Default)]` doesn't account for these as it is still gating `#[default]` only being allowed on unit variants.
Support `#[derive(Default)]` on enum struct variants with all defaulted fields
```rust
pub enum Bar {
#[default]
Foo {
bar: S = S,
baz: i32 = 42 + 3,
}
}
```
Check for missing fields in typeck instead of mir_build.
Expand test with `const` param case (needs `generic_const_exprs` enabled).
Properly instantiate MIR const
The following works:
```rust
struct S<A> {
a: Vec<A> = Vec::new(),
}
S::<i32> { .. }
```
Add lint for default fields that will always fail const-eval
We *allow* this to happen for API writers that might want to rely on users'
getting a compile error when using the default field, different to the error
that they would get when the field isn't default. We could change this to
*always* error instead of being a lint, if we wanted.
This will *not* catch errors for partially evaluated consts, like when the
expression relies on a const parameter.
Suggestions when encountering `Foo { .. }` without `#[feature(default_field_values)]`:
- Suggest adding a base expression if there are missing fields.
- Suggest enabling the feature if all the missing fields have optional values.
- Suggest removing `..` if there are no missing fields.