Improve print_tts by changing tokenstream::Spacing.

`tokenstream::Spacing` appears on all `TokenTree::Token` instances,
both punct and non-punct. Its current usage:
- `Joint` means "can join with the next token *and* that token is a
  punct".
- `Alone` means "cannot join with the next token *or* can join with the
  next token but that token is not a punct".

The fact that `Alone` is used for two different cases is awkward.
This commit augments `tokenstream::Spacing` with a new variant
`JointHidden`, resulting in:
- `Joint` means "can join with the next token *and* that token is a
  punct".
- `JointHidden` means "can join with the next token *and* that token is
  not a punct".
- `Alone` means "cannot join with the next token".
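
The three definitions above can be sketched as a small, self-contained model. This is not the rustc implementation: `classify`, `print_tokens`, and the `(text, spacing)` tuple representation are hypothetical names chosen for illustration, and the sketch assumes a flat token list with no delimiters.

```rust
/// Hypothetical stand-in for `tokenstream::Spacing` after this commit.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Spacing {
    /// Can join with the next token, and that token is a punct.
    Joint,
    /// Can join with the next token, and that token is not a punct.
    JointHidden,
    /// Cannot join with the next token.
    Alone,
}

/// Assigns a spacing to a token from two facts about what follows it.
/// (Assumption: "can join" is modelled as "no whitespace before the next token".)
fn classify(followed_by_whitespace: bool, next_is_punct: bool) -> Spacing {
    if followed_by_whitespace {
        Spacing::Alone
    } else if next_is_punct {
        Spacing::Joint
    } else {
        Spacing::JointHidden
    }
}

/// Prints a flat token list, emitting a space only after `Alone` tokens
/// (and never after the final token).
fn print_tokens(tokens: &[(&str, Spacing)]) -> String {
    let mut out = String::new();
    for (i, (text, spacing)) in tokens.iter().enumerate() {
        out.push_str(text);
        let is_last = i + 1 == tokens.len();
        if !is_last && *spacing == Spacing::Alone {
            out.push(' ');
        }
    }
    out
}
```

Under this model, classifying the tokens of `let a: Vec<u32> = x;` and passing them to `print_tokens` reproduces the original spacing, whereas a two-variant scheme would have to mark `<` (followed by the non-punct `u32`) as `Alone` and print a space after it.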

This *drastically* improves the output of `print_tts`. For example,
this:
```
stringify!(let a: Vec<u32> = vec![];)
```
currently produces this string:
```
let a : Vec < u32 > = vec! [] ;
```
With this PR, it now produces this string:
```
let a: Vec<u32> = vec![] ;
```
(The space after the `]` is because `TokenTree::Delimited` currently
doesn't have spacing information. The subsequent commit fixes this.)

The new `print_tts` doesn't replicate the original code perfectly. E.g.
multiple space characters will be condensed into a single space
character. But it's much improved.

`print_tts` still produces the old, uglier output for code produced by
proc macros. This is because we have to translate the generated code
from `proc_macro::Spacing` to the more expressive
`tokenstream::Spacing`, which results in too much `Alone` usage and no
`JointHidden` usage. So `space_between` still exists and is used by
`print_tts` in conjunction with the `Spacing` field.

This change will also help with the removal of `Token::Interpolated`.
Currently interpolated tokens are pretty-printed nicely via AST pretty
printing. `Token::Interpolated` removal will mean they get printed with
`print_tts`. Without this change, that would result in much uglier
output for code produced by decl macro expansions. With this change, AST
pretty printing and `print_tts` produce similar results.

The commit also tweaks the comments on `proc_macro::Spacing`. In
particular, it refers to "compound tokens" rather than "multi-char
operators" because lifetimes aren't operators.
Nicholas Nethercote 2023-08-08 11:43:44 +10:00
parent 7e452c123c
commit 925f7fad57
56 changed files with 567 additions and 356 deletions

@@ -925,13 +925,12 @@ impl !Sync for Punct {}
 pub enum Spacing {
     /// A `Punct` token can join with the following token to form a multi-character operator.
     ///
-    /// In token streams constructed using proc macro interfaces `Joint` punctuation tokens can be
-    /// followed by any other tokens. \
-    /// However, in token streams parsed from source code compiler will only set spacing to `Joint`
-    /// in the following cases:
-    /// - A `Punct` is immediately followed by another `Punct` without a whitespace. \
-    /// E.g. `+` is `Joint` in `+=` and `++`.
-    /// - A single quote `'` is immediately followed by an identifier without a whitespace. \
+    /// In token streams constructed using proc macro interfaces, `Joint` punctuation tokens can be
+    /// followed by any other tokens. However, in token streams parsed from source code, the
+    /// compiler will only set spacing to `Joint` in the following cases.
+    /// - When a `Punct` is immediately followed by another `Punct` without a whitespace. E.g. `+`
+    /// is `Joint` in `+=` and `++`.
+    /// - When a single quote `'` is immediately followed by an identifier without a whitespace.
     /// E.g. `'` is `Joint` in `'lifetime`.
     ///
     /// This list may be extended in the future to enable more token combinations.
@@ -939,11 +938,10 @@ pub enum Spacing {
     Joint,
     /// A `Punct` token cannot join with the following token to form a multi-character operator.
     ///
-    /// `Alone` punctuation tokens can be followed by any other tokens. \
-    /// In token streams parsed from source code compiler will set spacing to `Alone` in all cases
-    /// not covered by the conditions for `Joint` above. \
-    /// E.g. `+` is `Alone` in `+ =`, `+ident` and `+()`.
-    /// In particular, token not followed by anything will also be marked as `Alone`.
+    /// `Alone` punctuation tokens can be followed by any other tokens. In token streams parsed
+    /// from source code, the compiler will set spacing to `Alone` in all cases not covered by the
+    /// conditions for `Joint` above. E.g. `+` is `Alone` in `+ =`, `+ident` and `+()`. In
+    /// particular, tokens not followed by anything will be marked as `Alone`.
     #[stable(feature = "proc_macro_lib2", since = "1.29.0")]
     Alone,
 }
@@ -978,8 +976,8 @@ impl Punct {
     }

     /// Returns the spacing of this punctuation character, indicating whether it can be potentially
-    /// combined into a multi-character operator with the following token (`Joint`), or the operator
-    /// has certainly ended (`Alone`).
+    /// combined into a multi-character operator with the following token (`Joint`), or whether the
+    /// operator has definitely ended (`Alone`).
     #[stable(feature = "proc_macro_lib2", since = "1.29.0")]
     pub fn spacing(&self) -> Spacing {
         if self.0.joint { Spacing::Joint } else { Spacing::Alone }