Commit graph

395 commits

Author SHA1 Message Date
Nicholas Nethercote
b9bf0b4b10 Speed up Parser::expected_token_types.
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.

This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)

The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.

Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.

The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.

The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
2024-12-19 16:05:41 +11:00
Nicholas Nethercote
d5370d981f Remove bra/ket naming.
This is a naming convention used in a handful of spots in the parser for
delimiters. It confused me when I first saw it a long time ago, and I've
never liked it. A web search says "Bra-ket notation" exists in linear
algebra but the terminology has zero prior use in a programming context,
as far as I can tell.

This commit changes it to `open`/`close`, which is consistent with the
rest of the compiler.
2024-12-19 16:05:41 +11:00
Nicholas Nethercote
48f7714819 Rename Parser::expected_tokens as Parser::expected_token_types.
Because the `Token` type is similar to but different to the `TokenType`
type, and the difference is important, so we want to avoid confusion.
2024-12-19 16:05:41 +11:00
Nicholas Nethercote
1564318482 Only have one source of truth for keywords.
`rustc_symbol` is the source of truth for keywords.

rustdoc has its own implicit definition of keywords, via the
`is_doc_keyword`. It (presumably) intends to include all keywords, but
it omits `yeet`.

rustfmt has its own explicit list of Rust keywords. It also (presumably)
intends to include all keywords, but it omits `await`, `builtin`, `gen`,
`macro_rules`, `raw`, `reuse`, `safe`, and `yeet`. Also, it does linear
searches through this list, which is inefficient.

This commit fixes all of the above problems by introducing a new
predicate `is_any_keyword` in rustc and using it in rustdoc and rustfmt.
It documents that it's not the right predicate in most cases.
2024-12-18 20:21:03 +11:00
Nicholas Nethercote
64abe8be33 Simplify AllKeywords.
It's a verbose reinvention of a range type, and can be cut down a lot.
2024-12-18 20:21:01 +11:00
Nicholas Nethercote
2620eb42d7 Re-export more rustc_span::symbol things from rustc_span.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.

This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
2024-12-18 13:38:53 +11:00
Nicholas Nethercote
40c964510c Remove PErr.
It's just a synonym for `Diag` that adds no value and is only used in a
few places.
2024-12-12 11:31:55 +11:00
Nicholas Nethercote
90ad2adfea Improve span handling in parse_expr_bottom.
`parse_expr_bottom` stores `this.token.span` in `lo`, but then fails to
use it in many places where it could. This commit fixes that, and
likewise (to a smaller extent) in `parse_ty_common`.
2024-11-28 17:01:50 +11:00
klensy
746b675c5a fix clippy::clone_on_ref_ptr for compiler 2024-10-28 18:05:08 +03:00
Michael Goulet
95dba280b9 Move trait bound modifiers into ast::PolyTraitRef 2024-10-14 09:20:38 -04:00
Michael Goulet
c682aa162b Reformat using the new identifier sorting from rustfmt 2024-09-22 19:11:29 -04:00
Veera
741005792e Implement a Method to Seal DiagInner's Suggestions 2024-09-12 21:27:44 -04:00
Eduardo Sánchez Muñoz
0b20ffcb63 Remove needless returns detected by clippy in the compiler 2024-09-09 13:32:22 +02:00
Veera
14e86eb7d9 Add Suggestions for Misspelled Keywords
This PR detects misspelled keywords using two heuristics:

1. Lowercasing the unexpected identifier.
2. Using edit distance to find a keyword similar to the unexpected identifier.

However, it does not detect each and every misspelled keyword to
minimize false positives and ambiguities. More details about the
implementation can be found in the comments.
2024-09-06 23:07:45 -04:00
Michael Goulet
25ff9b6bcb Use bool in favor of Option<()> for diagnostics 2024-08-21 01:31:11 -04:00
Nicholas Nethercote
9d31f86f0d Overhaul token collection.
This commit does the following.

- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
  (a) it's annoying long, and (b) the `_trailing_token` bit is less
  accurate now that its types have changed.

- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
  `UsePreAttrPos` in the return type of `f`. These are used in
  `parse_expr_force_collect` (for vanilla expressions) and in
  `parse_stmt_without_recovery` (for two different cases of expression
  statements). Together these ensure are enough to fix all the problems
  with token collection and assoc expressions. The changes to the
  `stringify.rs` test demonstrate some of these.

- Adds a new test. The code in this test was causing an assertion
  failure prior to this commit, due to an invalid `NodeRange`.

The extra complexity is annoying, but necessary to fix the existing
problems.
2024-08-16 09:07:55 +10:00
Nicholas Nethercote
7923b20dd9 Use impl PartialEq<TokenKind> for Token more.
This lets us compare a `Token` with a `TokenKind`. It's used a lot, but
can be used even more, avoiding the need for some `.kind` uses.
2024-08-14 16:37:09 +10:00
Nicholas Nethercote
84ac80f192 Reformat use declarations.
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
2024-07-29 08:26:52 +10:00
Matthias Krüger
c86e13f330
Rollup merge of #127350 - veera-sivarajan:bugfix-126311, r=lcnr
Parser: Suggest Placing the Return Type After Function Parameters

Fixes #126311

This PR suggests placing the return type after the function parameters when it's misplaced after a `where` clause.

This also tangentially improves diagnostics for cases like [this](86d6f1312a/tests/ui/parser/issues/misplaced-return-type-without-where-issue-126311.rs (L1C1-L1C28)) and adds doc comments for `parser::AllowPlus`.
2024-07-19 10:48:03 +02:00
Veera
4cad705017 Parser: Suggest Placing the Return Type After Function Parameters 2024-07-18 17:56:34 -04:00
Matthias Krüger
50a90e394e
Rollup merge of #127835 - estebank:issue-127823, r=compiler-errors
Fix ICE in suggestion caused by `⩵` being recovered as `==`

The second suggestion shown here would previously incorrectly assume that the span corresponding to `⩵` was 2 bytes wide composed by 2 1 byte wide chars, so a span pointing at `==` could point only at one of the `=` to remove it. Instead, we now replace the whole thing (as we should have the whole time):

```
error: unknown start of token: \u{2a75}
  --> $DIR/unicode-double-equals-recovery.rs:1:16
   |
LL | const A: usize ⩵ 2;
   |                ^
   |
help: Unicode character '⩵' (Two Consecutive Equals Signs) looks like '==' (Double Equals Sign), but it is not
   |
LL | const A: usize == 2;
   |                ~~

error: unexpected `==`
  --> $DIR/unicode-double-equals-recovery.rs:1:16
   |
LL | const A: usize ⩵ 2;
   |                ^
   |
help: try using `=` instead
   |
LL | const A: usize = 2;
   |                ~
```

Fix #127823.
2024-07-18 23:05:21 +02:00
Esteban Küber
67ec1326ee Fix ICE in suggestion caused by being recovered as ==
The second suggestion shown here would previously incorrectly assume that the span corresponding to `⩵` was 2 bytes wide composed by 2 1 byte wide chars, so a span pointing at `==` could point only at one of the `=` to remove it. Instead, we now replace the whole thing (as we should have the whole time):

```
error: unknown start of token: \u{2a75}
  --> $DIR/unicode-double-equals-recovery.rs:1:16
   |
LL | const A: usize ⩵ 2;
   |                ^
   |
help: Unicode character '⩵' (Two Consecutive Equals Signs) looks like '==' (Double Equals Sign), but it is not
   |
LL | const A: usize == 2;
   |                ~~

error: unexpected `==`
  --> $DIR/unicode-double-equals-recovery.rs:1:16
   |
LL | const A: usize ⩵ 2;
   |                ^
   |
help: try using `=` instead
   |
LL | const A: usize = 2;
   |                ~
```
2024-07-18 17:47:31 +00:00
Esteban Küber
f6c4679547 More accurate span for anonymous argument suggestion
Use smaller span for suggesting adding `_:` ahead of a type:

```
error: expected one of `(`, `...`, `..=`, `..`, `::`, `:`, `{`, or `|`, found `)`
  --> $DIR/anon-params-denied-2018.rs:12:47
   |
LL |     fn foo_with_qualified_path(<Bar as T>::Baz);
   |                                               ^ expected one of 8 possible tokens
   |
   = note: anonymous parameters are removed in the 2018 edition (see RFC 1685)
help: explicitly ignore the parameter name
   |
LL |     fn foo_with_qualified_path(_: <Bar as T>::Baz);
   |                                ++
```
2024-07-18 00:19:27 +00:00
Esteban Küber
b6f518877f fix unused binding 2024-07-12 03:04:06 +00:00
Esteban Küber
377d14be88 More accurate incorrect use of await suggestion 2024-07-12 03:02:58 +00:00
Esteban Küber
692bc344d5 Make parse error suggestions verbose and fix spans
Go over all structured parser suggestions and make them verbose style.

When suggesting to add or remove delimiters, turn them into multiple suggestion parts.
2024-07-12 03:02:57 +00:00
Matthias Krüger
73cc4eca56
Rollup merge of #126125 - dev-ardi:conflict-markers, r=estebank
Improve conflict marker recovery

<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.

This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using

    r​? <reviewer name>
-->
closes #113826
r? ```@estebank``` since you reviewed #115413
cc: ```@rben01``` since you opened up the issue in the first place
2024-06-21 09:12:34 +02:00
bors
894f7a4ba6 Auto merge of #126678 - nnethercote:fix-duplicated-attrs-on-nt-expr, r=petrochenkov
Fix duplicated attributes on nonterminal expressions

This PR fixes a long-standing bug (#86055) whereby expression attributes can be duplicated when expanded through declarative macros.

First, consider how items are parsed in declarative macros:
```
Items:
- parse_nonterminal
  - parse_item(ForceCollect::Yes)
    - parse_item_
      - attrs = parse_outer_attributes
      - parse_item_common(attrs)
        - maybe_whole!
        - collect_tokens_trailing_token
```
The important thing is that the parsing of outer attributes is outside token collection, so the item's tokens don't include the attributes. This is how it's supposed to be.

Now consider how expression are parsed in declarative macros:
```
Exprs:
- parse_nonterminal
  - parse_expr_force_collect
    - collect_tokens_no_attrs
      - collect_tokens_trailing_token
        - parse_expr
          - parse_expr_res(None)
            - parse_expr_assoc_with
              - parse_expr_prefix
                - parse_or_use_outer_attributes
                - parse_expr_dot_or_call
```
The important thing is that the parsing of outer attributes is inside token collection, so the the expr's tokens do include the attributes, i.e. in `AttributesData::tokens`.

This PR fixes the bug by rearranging expression parsing to that outer attribute parsing happens outside of token collection. This requires a number of small refactorings because expression parsing is somewhat complicated. While doing so the PR makes the code a bit cleaner and simpler, by eliminating `parse_or_use_outer_attributes` and `Option<AttrWrapper>` arguments (in favour of the simpler `parse_outer_attributes` and `AttrWrapper` arguments), and simplifying `LhsExpr`.

r? `@petrochenkov`
2024-06-19 13:58:21 +00:00
Nicholas Nethercote
8170acb197 Refactor parse_expr_res.
This removes the final `Option<AttrWrapper>` argument.
2024-06-19 19:12:02 +10:00
ardi
9f6371236f make this comment correct 2024-06-19 00:27:41 +02:00
ardi
d51b4462ec Improve conflict marker recovery 2024-06-19 00:27:41 +02:00
Oli Scherer
3f34196839 Remove redundant argument from subdiagnostic method 2024-06-18 15:42:11 +00:00
Oli Scherer
7ba82d61eb Use a dedicated type instead of a reference for the diagnostic context
This paves the way for tracking more state (e.g. error tainting) in the diagnostic context handle
2024-06-18 15:42:11 +00:00
Oli Scherer
c91edc3888 Prefer dcx methods over fields or fields' methods 2024-06-18 13:45:08 +00:00
Michael Goulet
68bd001c00 Make parse_seq_to_before_tokens take expected/nonexpected tokens, use in parse_precise_capturing_syntax 2024-06-17 22:35:25 -04:00
Nicholas Nethercote
95b4c07ef8 Reduce pub exposure. 2024-06-06 08:26:54 +10:00
Nicholas Nethercote
bb364fe950 Remove #[macro_use] extern crate tracing from rustc_parse. 2024-05-23 18:02:40 +10:00
ardi
1f6d271527 Clarify that the diff_marker is talking about version control system
conflicts specifically and a few more improvements.
2024-05-17 15:45:50 +02:00
ardi
8dc6a5d145 improve maybe_consume_incorrect_semicolon 2024-05-14 23:07:40 +02:00
Nicholas Nethercote
9a63a42cb7 Remove a Span from TokenKind::Interpolated.
This span records the declaration of the metavariable in the LHS of the macro.
It's used in a couple of error messages. Unfortunately, it gets in the way of
the long-term goal of removing `TokenKind::Interpolated`. So this commit
removes it, which degrades a couple of (obscure) error messages but makes
things simpler and enables the next commit.
2024-05-13 10:30:30 +10:00
许杰友 Jieyou Xu (Joe)
e7bb090d0a
Rollup merge of #124930 - compiler-errors:consume-arg, r=petrochenkov
Make sure we consume a generic arg when checking mistyped turbofish

When recovering un-turbofish-ed args in expr position (e.g. `let x = a<T, U>();` in `check_mistyped_turbofish_with_multiple_type_params`, we used `parse_seq_to_before_end` to parse the fake generic args; however, it used `parse_generic_arg` which *optionally* parses a generic arg. If it doesn't end up parsing an arg, it returns `Ok(None)` and consumes no tokens. If we don't find a delimiter after this (`,`), we try parsing *another* element. In this case, we just infinitely loop looking for a subsequent element.

We can fix this by making sure that we either parse a generic arg or error in `parse_seq_to_before_end`'s callback.

Fixes #124897
2024-05-11 01:15:10 +01:00
Michael Goulet
dbdef68ddf Make sure we consume a generic arg when checking mistyped turbofish 2024-05-09 10:47:14 -04:00
Nicholas Nethercote
fd91925bce Add ErrorGuaranteed to Recovered::Yes and use it more.
The starting point for this was identical comments on two different
fields, in `ast::VariantData::Struct` and `hir::VariantData::Struct`:
```
    // FIXME: investigate making this a `Option<ErrorGuaranteed>`
    recovered: bool
```
I tried that, and then found that I needed to add an `ErrorGuaranteed`
to `Recovered::Yes`. Then I ended up using `Recovered` instead of
`Option<ErrorGuaranteed>` for these two places and elsewhere, which
required moving `ErrorGuaranteed` from `rustc_parse` to `rustc_ast`.

This makes things more consistent, because `Recovered` is used in more
places, and there are fewer uses of `bool` and
`Option<ErrorGuaranteed>`. And safer, because it's difficult/impossible
to set `recovered` to `Recovered::Yes` without having emitted an error.
2024-05-09 20:12:07 +10:00
Jules Bertholet
2a4624ddd1
Rename BindingAnnotation to BindingMode 2024-04-17 09:34:39 -04:00
León Orell Valerian Liehr
3cbc9e9560
Rename ModSep to PathSep 2024-04-04 19:44:04 +02:00
Nicholas Nethercote
541d7cc65c Rename AddToDiagnostic as Subdiagnostic.
To match `derive(Subdiagnostic)`.

Also rename `add_to_diagnostic{,_with}` as `add_to_diag{,_with}`.
2024-03-11 10:04:49 +11:00
Nicholas Nethercote
a09b1d33a7 Rename IntoDiagnosticArg as IntoDiagArg.
Also rename `into_diagnostic_arg` as `into_diag_arg`, and
`NotIntoDiagnosticArg` as `NotInotDiagArg`.
2024-03-11 09:12:19 +11:00
许杰友 Jieyou Xu (Joe)
4663fbb2cb
Eagerly translate HelpUseLatestEdition in parser diagnostics 2024-03-07 23:03:42 +00:00
Nicholas Nethercote
80d2bdb619 Rename all ParseSess variables/fields/lifetimes as psess.
Existing names for values of this type are `sess`, `parse_sess`,
`parse_session`, and `ps`. `sess` is particularly annoying because
that's also used for `Session` values, which are often co-located, and
it can be difficult to know which type a value named `sess` refers to.
(That annoyance is the main motivation for this change.) `psess` is nice
and short, which is good for a name used this much.

The commit also renames some `parse_sess_created` values as
`psess_created`.
2024-03-05 08:11:45 +11:00
Nicholas Nethercote
899cb40809 Rename DiagnosticBuilder as Diag.
Much better!

Note that this involves renaming (and updating the value of)
`DIAGNOSTIC_BUILDER` in clippy.
2024-02-28 08:55:35 +11:00