Commit graph

78 commits

Author SHA1 Message Date
Nilstrieb
8d13b2a046
Store ErrorGuaranteed in ErrorReported 2022-11-02 21:05:09 +01:00
Dezhi Wu
b1430fb7ca Fix a bunch of typo
This PR will fix some typos detected by [typos].

I only picked the ones I was sure were spelling errors to fix, mostly in
the comments.

[typos]: https://github.com/crate-ci/typos
2022-08-31 18:24:55 +08:00
Chayim Refael Friedman
f4ba14d290
Fix typo: fo->for 2022-06-08 16:40:02 +03:00
Jacob Pratt
49c82f31a8
Remove crate visibility usage in compiler 2022-05-20 20:04:54 -04:00
est31
e6ccf9b5d8 Use pluralize in one instance 2022-05-13 08:48:35 +02:00
Elliot Roberts
7907385999 fix most compiler/ doctests 2022-05-02 17:40:30 -07:00
Dylan DPC
91847c43cc
Rollup merge of #96023 - matthiaskrgr:clippyper1304, r=lcnr
couple of clippy::perf fixes
2022-04-16 14:25:56 +02:00
Matthias Krüger
75287dd73d remove function param that is only used in recursive of fn inner() 2022-04-14 11:54:28 +02:00
Nicholas Nethercote
75fd391aaa Introduce TtHandle and use it in TokenSet.
This removes the last use of `<mbe::TokenTree as Clone>`. It also
removes two trivial methods on `Delimited`.
2022-04-14 09:01:23 +10:00
Matthias Krüger
bbd7ce6904 couple of clippy::perf fixes 2022-04-13 22:18:28 +02:00
Nicholas Nethercote
edd7f2cdab Add a useful comment. 2022-04-11 09:38:40 +10:00
Nicholas Nethercote
4ba609601f Tweak NamedMatch representation.
The `Lrc` isn't necessary, neither is the `SmallVec`. Performance is
changed negligibly, but the new code is simpler.
2022-04-11 09:38:40 +10:00
Vadim Petrochenkov
379ae12a1d expand: Remove ParseSess::missing_fragment_specifiers
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.
2022-04-09 15:44:19 +03:00
Dylan DPC
747bd16214
Rollup merge of #95797 - nnethercote:rm-Delimited-all_tts, r=petrochenkov
Remove explicit delimiter token trees from `Delimited`.

They were introduced by the final commit in #95159 and gave a
performance win. But since the introduction of `MatcherLoc` they are no
longer needed. This commit reverts that change, making the code a bit
simpler.

r? `@petrochenkov`
2022-04-09 05:58:45 +02:00
Nicholas Nethercote
7450c4e3e8 Remove explicit delimiter token trees from Delimited.
They were introduced by the final commit in #95159 and gave a
performance win. But since the introduction of `MatcherLoc` they are no
longer needed. This commit reverts that change, making the code a bit
simpler.
2022-04-09 10:11:40 +10:00
James 'zofrex' Sanderson
ef59ab738e Use gender neutral terms 2022-04-07 08:51:59 +01:00
Nicholas Nethercote
238d9076fc Call compute_locs once per rule.
Currently it's called in `parse_tt` every time a match rule is invoked.
This commit moves it so it's called instead once per match rule, in
`compile_declarative_macro. This is a performance win.

The commit also moves `compute_locs` out of `TtParser`, because there's
no longer any reason for it to be in there.
2022-04-06 10:23:06 +10:00
Nicholas Nethercote
7300bd6a38 Move the missing fragment identifier checking.
In #95555 this was moved out of `parse_tt_inner` and `nameize` into
`compute_locs`. But the next commit will be moving `compute_locs`
outwards to a place that isn't suitable for the missing fragment
identifier checking. So this reinstates the old checking.
2022-04-05 17:23:30 +10:00
Nicholas Nethercote
896d8f5905 Remove the lifetime from TtParser and MatcherLoc.
It's a slight performance loss for now, but that will be recouped by the
next commit.
2022-04-05 17:19:38 +10:00
Nicholas Nethercote
0bd47e8a39 Reorder match arms in parse_tt_inner.
To match the order the variants are declared in.
2022-04-04 17:03:36 +10:00
Nicholas Nethercote
88f8fbcce0 A new matcher representation for use in parse_tt.
`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this
is a bad representation for the traversal.
- `TokenTree` is nested, and there's a bunch of expensive and fiddly
  state required to handle entering and exiting nested submatchers.
- There are three positions (sequence separators, sequence Kleene ops,
  and end of the matcher) that are represented by an index that exceeds
  the end of the `&[TokenTree]`, which is clumsy and error-prone.

This commit introduces a new representation called `MatcherLoc` that is
designed specifically for matching. It fixes all the above problems,
making the code much easier to read. A `&[TokenTree]` is converted to a
`&[MatcherLoc]` before matching begins. Despite the cost of the
conversion, it's still a net performance win, because various pieces of
traversal state are computed once up-front, rather than having to be
recomputed repeatedly during the macro matching.

Some improvements worth noting.
- `parse_tt_inner` is *much* easier to read. No more having to compare
  `idx` against `len` and read comments to understand what the result
  means.
- The handling of `Delimited` in `parse_tt_inner` is now trivial.
- The three end-of-sequence cases in `parse_tt_inner` are now handled in
  three separate match arms, and the control flow is much simpler.
- `nameize` is no longer recursive.
- There were two places that issued "missing fragment specifier" errors:
  one in `parse_tt_inner()`, and one in `nameize()`. Presumably the
  latter was never executed. There's now a single place issuing these
  errors, in `compute_locs()`.
- The number of heap allocations done for a `check full` build of
  `async-std-1.10.0` (an extreme example of heavy macro use) drops from
  11.8M to 2.6M, and most of these occur outside of macro matching.
- The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough
  that it no longer needs boxing, which partly accounts for the
  reduction in allocations.
- The rest of the drop in allocations is due to the removal of
  `MatcherKind`, because we no longer need to record anything for the
  parent matcher when entering a submatcher.
- Overall it reduces code size by 45 lines.
2022-04-04 17:01:28 +10:00
bors
95f68702ff Auto merge of #95509 - nnethercote:simplify-MatcherPos-some-more, r=petrochenkov
Simplify `MatcherPos` some more

A few more improvements.

r? `@petrochenkov`
2022-04-02 04:59:16 +00:00
Vadim Petrochenkov
9ab4f732cb expand: Do not count metavar declarations on RHS of macro_rules
They are 0 by definition there.
2022-03-31 19:09:40 +03:00
Nicholas Nethercote
c6fedd4f10 Make MatcherPos not derive Clone.
It's only used in one place, and there we clone and then make a bunch of
modifications. It's clearer if we duplicate more explicitly, and there's
a symmetry now between `sequence()` and `empty_sequence()`.
2022-03-31 14:40:43 +11:00
Nicholas Nethercote
f68a0449ed Remove MatcherPos::stack.
`parse_tt` needs a way to get from within submatchers make to the
enclosing submatchers. Currently it has two distinct mechanisms for
this:
- `Delimited` submatchers use `MatcherPos::stack` to record stuff about
  the parent (and further back ancestors).
- `Sequence` submatchers use `MatcherPosSequence::parent` to point to
  the parent matcher position.

Having two mechanisms is really confusing, and it took me a long time to
understand all this.

This commit eliminates `MatcherPos::stack`, and changes `Delimited`
submatchers to use the same mechanism as sequence submatchers. That
mechanism is also changed a bit: instead of storing the entire parent
`MatcherPos`, we now only store the necessary parts from the parent
`MatcherPos`.

Overall this is a small performance win, with the positives outweighing
the negatives, but it's mostly for clarity.
2022-03-31 14:39:00 +11:00
Nicholas Nethercote
048bd67d51 Clarify idx handling in sequences.
By adding comments, and improving an assertion. I finally fully
understand this part!
2022-03-31 11:48:36 +11:00
Nicholas Nethercote
2e423c7fd0 Remove MatcherPos::match_lo.
It's redundant w.r.t. other fields.
2022-03-31 11:48:35 +11:00
Nicholas Nethercote
21699c41af Simplify exit of Delimited submatchers.
Currently, we detect an exit from a `Delimited` submatcher when `idx`
exceeds the bounds of the current submatcher *and* there is a `stack`
entry.

This commit changes it to something simpler: just look for a
`CloseDelim` token.
2022-03-31 11:48:34 +11:00
Nicholas Nethercote
6b0a16ab1a Pre-allocate an empty Lrc<NamedMatchVec>.
This avoids some allocations.
2022-03-30 10:54:57 +11:00
Nicholas Nethercote
524d21bd54 Overhaul how matches are recorded.
Currently, matches within a sequence are recorded in a new empty
`matches` vector. Then when the sequence finishes the matches are merged
into the `matches` vector of the parent.

This commit changes things so that a sequence mp inherits the matches
made so far. This means that additional matches from the sequence don't
need to be merged into the parent. `push_match` becomes more
complicated, and the current sequence depth needs to be tracked. But
it's a sizeable performance win because it avoids one or more
`push_match` calls on every iteration of a sequence.

The commit also removes `match_hi`, which is no longer necessary.
2022-03-30 10:54:37 +11:00
Nicholas Nethercote
a1b140cdb7 Improve comments and rename many things for consistency.
In particular:
- Replace use of "item" with "matcher position/"mp".
- Replace use of "repetition" with "sequence".
- Replace `ms` with `matcher`.
2022-03-30 10:50:17 +11:00
Nicholas Nethercote
ac3d8ce1c6 Clarify comments about doc comments in macros. 2022-03-30 10:42:47 +11:00
Nicholas Nethercote
2b60cc081b Simplify and rename count_names. 2022-03-30 10:42:34 +11:00
Nicholas Nethercote
df6ead557d Add a useful assertion. 2022-03-29 08:00:26 +11:00
Dylan DPC
1c8b7412d4
Rollup merge of #95390 - nnethercote:allow-doc-comments-in-macros, r=petrochenkov
Ignore doc comments in a declarative macro matcher.

Fixes #95267. Reverts to the old behaviour before #95159 introduced a
regression.

r? `@petrochenkov`
2022-03-28 16:08:11 +02:00
Nicholas Nethercote
9967594346 Ignore doc comments in a declarative macro matcher.
Fixes #95267. Reverts to the old behaviour before #95159 introduced a
regression.
2022-03-28 10:45:24 +11:00
Nicholas Nethercote
364b908d57 Remove Nonterminal::NtTT.
It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.
2022-03-28 10:03:02 +11:00
Nicholas Nethercote
fdec26ddad Shrink MatcherPosRepetition.
Currently it copies a `KleeneOp` and a `Token` out of a
`SequenceRepetition`. It's better to store a reference to the
`SequenceRepetition`, which is now possible due to #95159 having changed
the lifetimes.
2022-03-25 12:35:56 +11:00
Nicholas Nethercote
cad5f1e774 Shrink NamedMatchVec to one inline element.
This counters the `NamedMatchVec` size increase from the previous
commit, leaving `NamedMatchVec` smaller than before.
2022-03-25 12:35:56 +11:00
Nicholas Nethercote
6817442ec7 Split NamedMatch::MatchNonterminal in two.
The `Lrc` is only relevant within `transcribe()`. There, the `Lrc` is
helpful for the non-`NtTT` cases, because the entire nonterminal is
cloned. But for the `NtTT` cases the inner token tree is cloned (a full
clone) and so the `Lrc` is of no help.

This commit splits the `NtTT` and non-`NtTT` cases, avoiding the useless
`Lrc` in the former case, for the following effect on macro-heavy
crates.
- It reduces the total number of allocations a lot.
- It increases the size of some of the remaining allocations.
- It doesn't affect *peak* memory usage, because the larger allocations
  are short-lived.

This overall gives a speed win.
2022-03-25 12:35:52 +11:00
Nicholas Nethercote
904e70a7b0 Add a size assertion for NamedMatchVec. 2022-03-23 13:54:34 +11:00
Nicholas Nethercote
31df680789 Eliminate TokenTreeOrTokenTreeSlice.
As its name suggests, `TokenTreeOrTokenTreeSlice` is either a single
`TokenTree` or a slice of them. It has methods `len` and `get_tt` that
let it be treated much like an ordinary slice. The reason it isn't an
ordinary slice is that for `TokenTree::Delimited` the open and close
delimiters are represented implicitly, and when they are needed they are
constructed on the fly with `Delimited::{open,close}_tt`, rather than
being present in memory.

This commit changes `Delimited` so the open and close delimiters are
represented explicitly. As a result, `TokenTreeOrTokenTreeSlice` is no
longer needed and `MatcherPos` and `MatcherTtFrame` can just use an
ordinary slice. `TokenTree::{len,get_tt}` are also removed, because they
were only needed to support `TokenTreeOrTokenTreeSlice`.

The change makes the code shorter and a little bit faster on benchmarks
that use macro expansion heavily, partly because `MatcherPos` is a lot
smaller (less data to `memcpy`) and partly because ordinary slice
operations are faster than `TokenTreeOrTokenTreeSlice::{len,get_tt}`.
2022-03-23 07:13:31 +11:00
Nicholas Nethercote
754dc8e66f Move items into TtParser as Vecs.
By putting them in `TtParser`, we can reuse them for every rule in a
macro. With that done, they can be `SmallVec` instead of `Vec`, and this
is a performance win because these vectors are hot and `SmallVec`
operations are a bit slower due to always needing an "inline or heap?"
check.
2022-03-21 10:09:24 +11:00
Nicholas Nethercote
cedb787f6e Remove MatcherPosHandle.
This type was a small performance win for `html5ever`, which uses a
macro with hundreds of very simple rules that don't contain any
metavariables. But this type is complicated (extra lifetimes) and
perf-neutral for macros that do have metavariables.

This commit removes `MatcherPosHandle`, simplifying things a lot. This
increases the allocation rate for `html5ever` and similar cases a bit,
but makes things easier for follow-up changes that will improve
performance more than what we lost here.
2022-03-21 10:08:29 +11:00
Nicholas Nethercote
10644e0789 Remove an impossible code path.
Doc comments cannot appear in a matcher.
2022-03-19 09:44:47 +11:00
Nicholas Nethercote
39810a85da Add TtParser::macro_name.
Instead of passing it into `parse_tt`.
2022-03-19 09:44:44 +11:00
Nicholas Nethercote
354bd1071c Rename bb_items_ambiguity_error as ambiguity_error.
Because it involves `next_items` as well as `bb_items`.
2022-03-19 08:04:06 +11:00
Nicholas Nethercote
d21b4f30c1 Introduce TtParser.
It currently has no state, just the three methods `parse_tt`,
`parse_tt_inner`, and `bb_items_ambiguity_error`.

This commit is large but trivial, and mostly consists of changes to the
indentation of those methods. Subsequent commits will do more.
2022-03-19 07:47:22 +11:00
bors
a8adf7685a Auto merge of #95067 - nnethercote:parse_tt-more-refactoring, r=petrochenkov
Still more refactoring of `parse_tt`

r? `@petrochenkov`
2022-03-18 12:34:05 +00:00
Nicholas Nethercote
440a685575 Rename TtSeq as TtSlice.
It's a better name because (a) it holds a slice, and (b) "sequence" has
other meanings in this file.
2022-03-18 17:47:08 +11:00