The size of this struct depends on the alignment of `u128`, for example
powerpc64le and s390x have align-8 and end up with only 280 bytes. Our
64-bit tier-1 arches are the same though, so let's just assert on those.
Fix parenthesization of chained comparisons by pretty-printer
Example:
```rust
macro_rules! repro {
() => {
1 < 2
};
}
fn main() {
let _ = repro!() == false;
}
```
Previously `-Zunpretty=expanded` would pretty-print this syntactically invalid output: `fn main() { let _ = 1 < 2 == false; }`
```console
error: comparison operators cannot be chained
--> <anon>:8:23
|
8 | fn main() { let _ = 1 < 2 == false; }
| ^ ^^
|
help: parenthesize the comparison
|
8 | fn main() { let _ = (1 < 2) == false; }
| + +
```
With the fix, it will print `fn main() { let _ = (1 < 2) == false; }`.
Making `-Zunpretty=expanded` consistently produce syntactically valid Rust output is important because that is what makes it possible for `cargo expand` to format and perform filtering on the expanded code.
## Review notes
According to `rg '\.fixity\(\)' compiler/` the `fixity` function is called only 3 places:
- 13170cd787/compiler/rustc_ast_pretty/src/pprust/state/expr.rs (L283-L287)
- 13170cd787/compiler/rustc_hir_pretty/src/lib.rs (L1295-L1299)
- 13170cd787/compiler/rustc_parse/src/parser/expr.rs (L282-L289)
The 2 pretty printers definitely want to treat comparisons using `Fixity::None`. That's the whole bug being fixed. Meanwhile, the parser's `Fixity::None` codepath is previously unreachable as indicated by the comment, so as long as `Fixity::None` here behaves exactly the way that `Fixity::Left` used to behave, you can tell that this PR definitely does not constitute any behavior change for the parser.
My guess for why comparison operators were set to `Fixity::Left` instead of `Fixity::None` is that it's a very old workaround for giving a good chained comparisons diagnostic (like what I pasted above). Nowadays that is handled by a different dedicated codepath.
Detect missing `.` in method chain in `let` bindings and statements
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map`
--> $DIR/missing-dot-on-statement-expression.rs:7:29
|
LL | let _ = [1, 2, 3].iter()map(|x| x);
| ^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: you might have meant to write a method call
|
LL | let _ = [1, 2, 3].iter().map(|x| x);
| +
```
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map`
--> $DIR/missing-dot-on-statement-expression.rs:7:29
|
LL | let _ = [1, 2, 3].iter()map(|x| x);
| ^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: you might have meant to write a method call
|
LL | let _ = [1, 2, 3].iter().map(|x| x);
| +
```
Instead use dcx.abort_if_error() or guar.raise_fatal() instead. These
guarantee that an error actually happened previously and thus we don't
silently abort.
Currently it relies on having the right integer for every variant, and
if you add a variant you need to adjust the integers for all subsequent
variants, which is a pain.
This commit introduces a match guard formulation that takes advantage of
the enum-to-integer conversion to avoid specifying the integer for each
variant. And it does this via a macro to avoid lots of boilerplate.
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.
This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)
The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.
Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.
The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.
The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
This is a naming convention used in a handful of spots in the parser for
delimiters. It confused me when I first saw it a long time ago, and I've
never liked it. A web search says "Bra-ket notation" exists in linear
algebra but the terminology has zero prior use in a programming context,
as far as I can tell.
This commit changes it to `open`/`close`, which is consistent with the
rest of the compiler.
The most significant is `check_keyword`: it now only pushes to
`expected_token_types` if the keyword check fails, which matches how all
the other `check` methods work.
The remainder are just tweaks to make these methods more consistent with
each other.
Use field init shorthand where possible
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
EDIT: this PR also updates `rustfmt.toml` to set
`use_field_init_shorthand = true`.
Overhaul keyword handling
The compiler's list of keywords has some problems.
- It contains several items that aren't keywords.
- The order isn't quite right in a couple of places.
- Some of the names of predicates relating to keywords are confusing.
- rustdoc and rustfmt have their own (incorrect) versions of the keyword list.
- `AllKeywords` is unnecessarily complex.
r? ```@jieyouxu```
`rustc_symbol` is the source of truth for keywords.
rustdoc has its own implicit definition of keywords, via the
`is_doc_keyword`. It (presumably) intends to include all keywords, but
it omits `yeet`.
rustfmt has its own explicit list of Rust keywords. It also (presumably)
intends to include all keywords, but it omits `await`, `builtin`, `gen`,
`macro_rules`, `raw`, `reuse`, `safe`, and `yeet`. Also, it does linear
searches through this list, which is inefficient.
This commit fixes all of the above problems by introducing a new
predicate `is_any_keyword` in rustc and using it in rustdoc and rustfmt.
It documents that it's not the right predicate in most cases.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
- Move it to `rustc_parse`, which is the only crate that uses it. This
lets us remove all the `pub` markers from it.
- Change `next_ref` and `look_ahead` to `get` and `bump`, which work
better for the `rustc_parse` uses.
- This requires adding a `TokenStream::get` method, which is simple.
- In `TokenCursor`, we currently duplicate the
`DelimSpan`/`DelimSpacing`/`Delimiter` from the surrounding
`TokenTree::Delimited` in the stack. This isn't necessary so long as
we don't prematurely move past the `Delimited`, and is a small perf
win on a very hot code path.
- In `parse_token_tree`, we clone the relevant `TokenTree::Delimited`
instead of constructing an identical one from pieces.
Because `TokenStreamIter` is a much better name for a `TokenStream`
iterator. Also rename the `TokenStream::trees` method as
`TokenStream::iter`, and some local variables.
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
Keep track of patterns that could have introduced a binding, but didn't
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:18:30
|
LL | if let Website { url, .. } = website {
| ------------------- this pattern doesn't include `title`, which is available in `Website`
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
```
Fix#74863.
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:19:30
|
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
|
note: `Website` has a field `title` which could have been included in this pattern, but it wasn't
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:17:12
|
LL | / struct Website {
LL | | url: String,
LL | | title: Option<String> ,
| | ----- defined here
LL | | }
| |_-
...
LL | if let Website { url, .. } = website {
| ^^^^^^^^^^^^^^^^^^^ this pattern doesn't include `title`, which is available in `Website`
```
Fix#74863.
Add AST support for unsafe binders
I'm splitting up #130514 into pieces. It's impossible for me to keep up with a huge PR like that. I'll land type system support for this next, probably w/o MIR lowering, which will come later.
r? `@oli-obk`
cc `@BoxyUwU` and `@lcnr` who also may want to look at this, though this PR doesn't do too much yet
Keep track of parse errors in `mod`s and don't emit resolve errors for paths involving them
When we expand a `mod foo;` and parse `foo.rs`, we now track whether that file had an unrecovered parse error that reached the end of the file. If so, we keep that information around in the HIR and mark its `DefId` in the `Resolver`. When resolving a path like `foo::bar`, we do not emit any errors for "`bar` not found in `foo`", as we know that the parse error might have caused `bar` to not be parsed and accounted for.
When this happens in an existing project, every path referencing `foo` would be an irrelevant compile error. Instead, we now skip emitting anything until `foo.rs` is fixed. Tellingly enough, we didn't have any test for errors caused by expansion of `mod`s with parse errors.
Fix https://github.com/rust-lang/rust/issues/97734.
allow `symbol_intern_string_literal` lint in test modules
Since #133545, `x check compiler --stage 1` no longer works because compiler test modules trigger `symbol_intern_string_literal` lint errors. Bootstrap shouldn't control when to ignore or enable this lint in the compiler tree (using `Kind != Test` was ineffective for obvious reasons).
Also, conditionally adding this rustflag invalidates the build cache between `x test` and other commands.
This PR removes the `Kind` check from bootstrap and handles it directly in the compiler tree in a more natural way.