Rewrite collect_tokens implementations to use a flattened buffer
Instead of trying to collect tokens at each depth, we 'flatten' the stream as we go along, pushing open/close delimiters to our buffer just like regular tokens. Once capturing is complete, we reconstruct a nested `TokenTree::Delimited` structure, producing a normal `TokenStream`.

The reconstructed `TokenStream` is not created immediately. Instead, it is produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This closure stores a clone of the original `TokenCursor`, plus a record of the number of calls to `next()`/`next_desugared()`. This is sufficient to reconstruct the token stream seen by the callback without storing any additional state. If the token stream is never used (e.g. when a captured `macro_rules!` argument is never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the start/end of a delimited group.
* It can be easily extended to allow replacing tokens at an arbitrary 'depth' by just using `Vec::splice` at the proper position. This is important for PR #76130, which requires us to track information about attributes along with tokens.
* The lazy approach to `TokenStream` construction allows us to easily parse an AST struct, and then decide after the fact whether we need a `TokenStream`. This will be useful when we start collecting tokens for `Attribute` - we can discard the `LazyTokenStream` if the parsed attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be negligible (see https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a small slowdown on a few benchmarks, but it only rises above 1% for incremental builds, where it represents a larger fraction of the much smaller instruction count. There is a ~1% speedup on a few other incremental benchmarks - my guess is that the speedups and slowdowns will usually cancel out in practice.
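To make the flattening idea concrete, here is a small, self-contained sketch. It is not the rustc code itself: `FlatToken`, `Delim`, and `Tree` are illustrative stand-ins for `Token`, `DelimToken`, and `TokenTree`, and the reconstruction function is simplified. The point is that delimiters are pushed into a flat buffer like ordinary tokens, and the nested structure is only rebuilt, on demand, behind a deferred closure.

```rust
// Sketch of the "flattened buffer" approach (illustrative types, not rustc's).

#[derive(Clone, Copy, Debug, PartialEq)]
enum Delim { Paren, Brace }

#[derive(Clone, Debug)]
enum FlatToken {
    Token(char),   // an ordinary token (simplified to a char here)
    Open(Delim),   // an open delimiter, pushed like any other token
    Close(Delim),  // a close delimiter, pushed like any other token
}

#[derive(Debug)]
enum Tree {
    Token(char),
    Delimited(Delim, Vec<Tree>),
}

// Rebuild the nested structure from the flat buffer once capturing is done.
fn rebuild_stream(flat: &[FlatToken]) -> Vec<Tree> {
    // Stack of partially-built groups: (open delimiter, children collected so far).
    let mut stack: Vec<(Option<Delim>, Vec<Tree>)> = vec![(None, Vec::new())];
    for tok in flat {
        match tok {
            FlatToken::Token(c) => stack.last_mut().unwrap().1.push(Tree::Token(*c)),
            FlatToken::Open(d) => stack.push((Some(*d), Vec::new())),
            FlatToken::Close(d) => {
                let (open, children) = stack.pop().unwrap();
                assert_eq!(open, Some(*d), "mismatched delimiters");
                stack.last_mut().unwrap().1.push(Tree::Delimited(*d, children));
            }
        }
    }
    assert_eq!(stack.len(), 1, "unclosed delimiter");
    stack.pop().unwrap().1
}

fn main() {
    // Tokens for `a ( b c )`, captured flat as we go along.
    let flat = vec![
        FlatToken::Token('a'),
        FlatToken::Open(Delim::Paren),
        FlatToken::Token('b'),
        FlatToken::Token('c'),
        FlatToken::Close(Delim::Paren),
    ];
    // Defer the (possibly unneeded) reconstruction behind a closure,
    // mirroring the `LazyTokenStream` idea: nothing is built until called.
    let lazy = move || rebuild_stream(&flat);
    println!("{:?}", lazy());
}
```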
parent cb2462c53f · commit 593fdd3d45
7 changed files with 252 additions and 165 deletions
@@ -16,8 +16,9 @@
 use crate::token::{self, DelimToken, Token, TokenKind};
 
+use rustc_data_structures::stable_hasher::{HashStable, StableHasher};
-use rustc_data_structures::sync::Lrc;
+use rustc_data_structures::sync::{self, Lrc};
 use rustc_macros::HashStable_Generic;
 use rustc_serialize::{Decodable, Decoder, Encodable, Encoder};
 use rustc_span::{Span, DUMMY_SP};
 use smallvec::{smallvec, SmallVec};
 
@@ -119,6 +120,77 @@ where
     }
 }
 
+// A cloneable callback which produces a `TokenStream`. Each clone
+// of this should produce the same `TokenStream`
+pub trait CreateTokenStream: sync::Send + sync::Sync + FnOnce() -> TokenStream {
+    // Workaround for the fact that `Clone` is not object-safe
+    fn clone_it(&self) -> Box<dyn CreateTokenStream>;
+}
+
+impl<F: 'static + Clone + sync::Send + sync::Sync + FnOnce() -> TokenStream> CreateTokenStream
+    for F
+{
+    fn clone_it(&self) -> Box<dyn CreateTokenStream> {
+        Box::new(self.clone())
+    }
+}
+
+impl Clone for Box<dyn CreateTokenStream> {
+    fn clone(&self) -> Self {
+        let val: &(dyn CreateTokenStream) = &**self;
+        val.clone_it()
+    }
+}
+
+/// A lazy version of `TokenStream`, which may defer creation
+/// of an actual `TokenStream` until it is needed.
+pub type LazyTokenStream = Lrc<LazyTokenStreamInner>;
+
+#[derive(Clone)]
+pub enum LazyTokenStreamInner {
+    Lazy(Box<dyn CreateTokenStream>),
+    Ready(TokenStream),
+}
+
+impl std::fmt::Debug for LazyTokenStreamInner {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        match self {
+            LazyTokenStreamInner::Lazy(..) => f.debug_struct("LazyTokenStream::Lazy").finish(),
+            LazyTokenStreamInner::Ready(..) => f.debug_struct("LazyTokenStream::Ready").finish(),
+        }
+    }
+}
+
+impl LazyTokenStreamInner {
+    pub fn into_token_stream(&self) -> TokenStream {
+        match self {
+            // Note that we do not cache this. If this ever becomes a performance
+            // problem, we should investigate wrapping `LazyTokenStreamInner`
+            // in a lock
+            LazyTokenStreamInner::Lazy(cb) => (cb.clone())(),
+            LazyTokenStreamInner::Ready(stream) => stream.clone(),
+        }
+    }
+}
+
+impl<S: Encoder> Encodable<S> for LazyTokenStreamInner {
+    fn encode(&self, _s: &mut S) -> Result<(), S::Error> {
+        panic!("Attempted to encode LazyTokenStream");
+    }
+}
+
+impl<D: Decoder> Decodable<D> for LazyTokenStreamInner {
+    fn decode(_d: &mut D) -> Result<Self, D::Error> {
+        panic!("Attempted to decode LazyTokenStream");
+    }
+}
+
+impl<CTX> HashStable<CTX> for LazyTokenStreamInner {
+    fn hash_stable(&self, _hcx: &mut CTX, _hasher: &mut StableHasher) {
+        panic!("Attempted to compute stable hash for LazyTokenStream");
+    }
+}
+
 /// A `TokenStream` is an abstract sequence of tokens, organized into `TokenTree`s.
 ///
 /// The goal is for procedural macros to work with `TokenStream`s and `TokenTree`s
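The `clone_it` method in the hunk above exists because `Clone` is not object-safe, so `Box<dyn CreateTokenStream>` cannot derive `Clone` directly. Below is a minimal standalone sketch of the same pattern; the `MakeString` trait and `clone_boxed` method are illustrative names, not rustc APIs, and `Fn` is used instead of `FnOnce` so the boxed closure can be called directly in the example.

```rust
// Minimal sketch of the object-safe clone workaround used by `CreateTokenStream`.
trait MakeString: Fn() -> String {
    // Stand-in for `clone_it`: returns a boxed clone of the concrete closure.
    fn clone_boxed(&self) -> Box<dyn MakeString>;
}

// Blanket impl: any cloneable, 'static closure producing a String qualifies.
impl<F: 'static + Clone + Fn() -> String> MakeString for F {
    fn clone_boxed(&self) -> Box<dyn MakeString> {
        Box::new(self.clone())
    }
}

// Now the boxed trait object itself is `Clone`, even though `Clone` is not
// object-safe and so cannot be a supertrait of `MakeString`.
impl Clone for Box<dyn MakeString> {
    fn clone(&self) -> Self {
        (**self).clone_boxed()
    }
}

fn main() {
    let greeting = String::from("hello");
    let original: Box<dyn MakeString> = Box::new(move || greeting.clone());
    let copy = original.clone(); // clones through the trait object
    assert_eq!(original(), copy());
}
```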