// This file contains type definitions that are processed by the Closure Compiler but are
// not put into the JavaScript we include as part of the documentation. It is used for
// type checking. See README.md in this directory for more info.

/* eslint-disable */
let searchState;
function initSearch(searchIndex){}

/**
 * @typedef {{
 *     name: string,
 *     id: number|null,
 *     fullPath: Array<string>,
 *     pathWithoutLast: Array<string>,
 *     pathLast: string,
 *     generics: Array<QueryElement>,
 *     bindings: Map<number, Array<QueryElement>>,
 * }}
 */
let QueryElement;

/**
 * @typedef {{
 *     pos: number,
 *     totalElems: number,
 *     typeFilter: (null|string),
 *     userQuery: string,
 *     isInBinding: (null|string),
 * }}
 */
let ParserState;

/**
 * @typedef {{
 *     original: string,
 *     userQuery: string,
 *     typeFilter: number,
 *     elems: Array<QueryElement>,
 *     args: Array<QueryElement>,
 *     returned: Array<QueryElement>,
 *     foundElems: number,
 *     totalElems: number,
 *     literalSearch: boolean,
 *     hasReturnArrow: boolean,
 *     corrections: Array<{from: string, to: integer}> | null,
 *     typeFingerprint: Uint32Array,
 *     error: Array<string> | null,
 * }}
 */
let ParsedQuery;

/**
 * @typedef {{
 *     crate: string,
 *     desc: string,
 *     id: number,
 *     name: string,
 *     normalizedName: string,
 *     parent: (Object|null|undefined),
 *     path: string,
 *     ty: (Number|null|number),
 *     type: FunctionSearchType?
 * }}
 */
let Row;

/**
 * @typedef {{
 *     in_args: Array<Object>,
 *     returned: Array<Object>,
 *     others: Array<Object>,
 *     query: ParsedQuery,
 * }}
 */
let ResultsTable;

/**
 * @typedef {Map<String, ResultObject>}
 */
let Results;

/**
 * @typedef {{
 *     desc: string,
 *     displayPath: string,
 *     fullPath: string,
 *     href: string,
 *     id: number,
 *     lev: number,
 *     name: string,
 *     normalizedName: string,
 *     parent: (Object|undefined),
 *     path: string,
 *     ty: number,
 *     type: FunctionSearchType?,
 *     displayType: Promise<Array<Array<string>>>|null,
 *     displayTypeMappedNames: Promise<Array<[string, Array<string>]>>|null,
 * }}
 */
let ResultObject;

/**
 * A pair of [inputs, outputs], or 0 for null. This is stored in the search index.
 * The JavaScript deserializes this into FunctionSearchType.
 *
 * Numeric IDs are *ONE-indexed* into the paths array (`p`). Zero is used as a sentinel for `null`
 * because `null` is four bytes while `0` is one byte.
 *
 * An input or output can be encoded as just a number if there is only one of them, AND
 * it has no generics. The no generics rule exists to avoid ambiguity: imagine if you had
 * a function with a single output, and that output had a single generic:
 *
 *     fn something() -> Result<usize, usize>
 *
 * If the output were allowed to be any RawFunctionType, it would look like this:
 *
 *     [[], [50, [3, 3]]]
 *
 * The problem is that the above output could be interpreted as either a type with ID 50 and two
 * generics, or it could be interpreted as a pair of types, the first one with ID 50 and the second
 * with ID 3 and a single generic parameter that is also ID 3. We avoid this ambiguity by choosing
 * in favor of the pair of types interpretation. This is why the `(number|Array<RawFunctionType>)`
 * is used instead of `(RawFunctionType|Array<RawFunctionType>)`.
 *
 * The output can be skipped if it's actually unit and there are no type constraints. If this
 * function accepts constrained generics, then the output will be unconditionally emitted, and
 * after it will come a list of trait constraints. The position of the item in the list will
 * determine which type parameter it is. For example:
 *
 *     [1, 2, 3, 4, 5]
 *      ^  ^  ^  ^  ^
 *      |  |  |  |  - generic parameter (-3) of trait 5
 *      |  |  |  - generic parameter (-2) of trait 4
 *      |  |  - generic parameter (-1) of trait 3
 *      |  - this function returns a single value (type 2)
 *      - this function takes a single input parameter (type 1)
 *
 * Or, for a less contrived version:
 *
 *     [[[4, -1], 3], [[5, -1]], 11]
 *      -^^^^^^^----   ^^^^^^^   ^^
 *       |        |    |          - generic parameter, roughly `where -1: 11`
 *       |        |    |            since -1 is the type parameter and 11 the trait
 *       |        |    - function output 5<-1>
 *       |        - the overall function signature is something like
 *       |          `fn(4<-1>, 3) -> 5<-1> where -1: 11`
 *       - function input, corresponds roughly to 4<-1>
 *         4 is an index into the `p` array for a type
 *         -1 is the generic parameter, given by 11
 *
 * If a generic parameter has multiple trait constraints, it gets wrapped in an array, just like
 * function inputs and outputs:
 *
 *     [-1, -1, [4, 3]]
 *              ^^^^^^ where -1: 4 + 3
 *
 * If a generic parameter's trait constraint has generic parameters, it gets wrapped in the array
 * even if only one exists. In other words, the ambiguity of `4<3>` and `4 + 3` is resolved in
 * favor of `4 + 3`:
 *
 *     [-1, -1, [[4, 3]]]
 *              ^^^^^^^^ where -1: 4 + 3
 *
 *     [-1, -1, [5, [4, 3]]]
 *              ^^^^^^^^^^^ where -1: 5, -2: 4 + 3
 *
 * If a generic parameter has no trait constraints (like in Rust, the `Sized` constraint is
 * implied and a fake `?Sized` constraint used to note its absence), it will be filled in with 0.
 *
 * @typedef {(
 *     0 |
 *     [(number|Array<RawFunctionType>)] |
 *     [(number|Array<RawFunctionType>), (number|Array<RawFunctionType>)] |
 *     Array<(number|Array<RawFunctionType>)>
 * )}
 */
let RawFunctionSearchType;
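
// Illustrative sketch only, not the decoder that rustdoc ships: one way the encoding
// described in the comment above could be expanded into a FunctionSearchType-like object.
// The helper names (`decodeType`, `decodeSlot`, `decodeFunctionSearchType`) are hypothetical
// and path IDs are left unresolved. Two behaviours are assumptions made for this sketch:
// a bare number in the generics position (as in the `[4, -1]` example) is read as a single
// generic, and a `0` slot is read as an empty list.

```javascript
// Hypothetical helper: decode one RawFunctionType into {id, generics}.
function decodeType(raw) {
    if (typeof raw === "number") {
        return { id: raw, generics: [] };
    }
    const [id, generics] = raw;
    if (generics === undefined) {
        return { id, generics: [] };
    }
    // Assumption: tolerate the shorthand `[4, -1]` used in the comment's examples.
    const list = Array.isArray(generics) ? generics : [generics];
    return { id, generics: list.map(decodeType) };
}

// Hypothetical helper: decode an input, output, or where-clause slot.
// A bare number is a single type; an array is always a list of types,
// which is the "pair of types" interpretation the comment settles on.
function decodeSlot(slot) {
    if (slot === 0) {
        return []; // assumption: 0 means "nothing here"
    }
    if (typeof slot === "number") {
        return [decodeType(slot)];
    }
    return slot.map(decodeType);
}

// Decode a whole RawFunctionSearchType; `0` stands for `null`.
function decodeFunctionSearchType(raw) {
    if (raw === 0) {
        return null;
    }
    return {
        inputs: decodeSlot(raw[0]),
        // The output may be omitted when the function returns unit.
        output: raw.length > 1 ? decodeSlot(raw[1]) : [],
        // Everything after the output is one slot per generic parameter.
        where_clause: raw.slice(2).map(decodeSlot),
    };
}
```

// Under these assumptions, `decodeFunctionSearchType([[[4, -1], 3], [[5, -1]], 11])` yields
// two inputs (4<-1> and 3), one output (5<-1>), and a single where-clause slot holding 11.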

/**
 * A single function input or output type. This is either a single path ID, or a pair of
 * [path ID, generics].
 *
 * Numeric IDs are *ONE-indexed* into the paths array (`p`). Zero is used as a sentinel for `null`
 * because `null` is four bytes while `0` is one byte.
 *
 * @typedef {number | [number, Array<RawFunctionType>]}
 */
let RawFunctionType;
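
// A minimal sketch of the one-indexing rule described above, assuming `p` is the paths array
// of a RawSearchIndexCrate. `classifyId` and the shape of its return value are hypothetical,
// as is the choice to report generic parameters by zero-based position.

```javascript
// Hypothetical helper: classify a numeric ID taken from a RawFunctionType.
function classifyId(id, p) {
    if (id === 0) {
        // Zero is the sentinel that stands in for `null`.
        return { kind: "none" };
    }
    if (id < 0) {
        // Negative IDs name generic parameters: -1 is the first, -2 the second, ...
        return { kind: "generic", position: -id - 1 };
    }
    // Positive IDs are one-indexed, so subtract one before indexing `p`.
    return { kind: "path", entry: p[id - 1] };
}
```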

/**
 * @typedef {{
 *     inputs: Array<FunctionType>,
 *     output: Array<FunctionType>,
 *     where_clause: Array<Array<FunctionType>>,
 * }}
 */
let FunctionSearchType;

/**
 * @typedef {{
 *     id: (null|number),
 *     ty: number,
 *     generics: Array<FunctionType>,
 *     bindings: Map<integer, Array<FunctionType>>,
 * }}
 */
let FunctionType;

/**
 * The raw search data for a given crate. `n`, `t`, `d`, `i`, and `f`
 * are arrays with the same length. `q`, `a`, and `c` use a sparse
 * representation for compactness.
 *
 * `n[i]` contains the name of an item.
 *
 * `t[i]` contains the type of that item
 * (as a string of characters that represent an offset in `itemTypes`).
 *
 * `d[i]` contains the description of that item.
 *
 * `q` contains the full paths of the items. For compactness, it is a set of
 * (index, path) pairs used to create a map. If a given index `i` is
 * not present, this indicates "same as the last index present".
 *
 * `i[i]` contains an item's parent, usually a module. For compactness,
 * it is a set of indexes into the `p` array.
 *
 * `f` contains function signatures, or `0` if the item isn't a function.
 * More information on how they're encoded can be found in the rustc-dev-guide.
 *
 * Functions are themselves encoded as arrays. The first item is a list of
 * types representing the function's inputs, and the second list item is a list
 * of types representing the function's output. Tuples are flattened.
 * Types are also represented as arrays; the first item is an index into the `p`
 * array, while the second is a list of types representing any generic parameters.
 *
 * `b[i]` contains an item's impl disambiguator. This is only present if an item
 * is defined in an impl block and the impl block's type has more than one associated
 * item with the same name.
 *
 * `a` defines aliases with an Array of pairs: [name, offset], where `offset`
 * points into the n/t/d/q/i/f arrays.
 *
 * `doc` contains the description of the crate.
 *
 * `p` is a list of path/type pairs. It is used for parents and function parameters.
 * The first item is the type, the second is the name, the third is the visible path (if any) and
 * the fourth is the canonical path used for deduplication (if any).
 *
 * `r` is the canonical path used for deduplication of re-exported items.
 * It is not used for associated items like methods (that's the fourth element
 * of `p`) but is used for module items like free functions.
 *
 * `c` is an array of item indices that are deprecated.
 * @typedef {{
 *     doc: string,
 *     a: Object,
 *     n: Array<string>,
 *     t: string,
 *     d: Array<string>,
 *     q: Array<[number, string]>,
 *     i: Array<number>,
 *     f: string,
 *     p: Array<[number, string] | [number, string, number] | [number, string, number, number]>,
 *     b: Array<[number, String]>,
 *     c: Array<number>,
 *     r: Array<[number, number]>,
 * }}
 */
let RawSearchIndexCrate;
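
// A minimal sketch of how the sparse `q` field described above could be expanded into one
// path per item. `expandPaths` and its `itemCount` parameter are hypothetical names, and
// starting from an empty path before the first recorded index is an assumption.

```javascript
// Hypothetical helper: expand sparse (index, path) pairs into a dense array.
function expandPaths(q, itemCount) {
    const sparse = new Map(q);
    const paths = [];
    let last = ""; // assumption: items before the first recorded index have no path
    for (let i = 0; i < itemCount; ++i) {
        if (sparse.has(i)) {
            last = sparse.get(i);
        }
        // A missing index means "same as the last index present".
        paths.push(last);
    }
    return paths;
}
```

// With `q = [[0, "std"], [2, "std::vec"]]` and four items, items 0 and 1 share "std" and
// items 2 and 3 share "std::vec".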