wade

v0.3.3

Blazing fast, 1kb search library

npm install wade

Wade

Blazing fast, 1kb search

Installation

NPM

npm install wade

CDN

<script src="https://unpkg.com/wade"></script>

Usage

Initialize Wade with an array of strings:

const search = Wade(["Apple", "Orange", "Lemon", "Tomato"]);

Now you can search the array with a query. Wade returns an array of matches, each holding the index of a matching item and its score.

search("App");
/*
[{
  index: 0,
  score: 1
}]
*/
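
Because results carry only an index and a score, a common next step is to map them back to the original strings. The following is a minimal sketch of that step; `resolveResults` is a hypothetical helper, not part of Wade, and the sample results simply reuse the shape shown above.

```javascript
// Hypothetical helper (not part of Wade): turn Wade-style results into
// the matching items, highest score first.
function resolveResults(data, results) {
  return results
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .map((result) => ({ item: data[result.index], score: result.score }));
}

const fruits = ["Apple", "Orange", "Lemon", "Tomato"];
// Result shape copied from the example above; in practice this would be
// the return value of `search("App")`.
const sampleResults = [{ index: 0, score: 1 }];

console.log(resolveResults(fruits, sampleResults));
// [ { item: 'Apple', score: 1 } ]
```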

Combined with a library like Moon, Wade can power a simple real-time search.

Loading/Saving Data

To save the search data as an object, call Wade.save on your search function, then pass that object to Wade when initializing later.

For example:

// Create the initial search function
const search = Wade(["Apple", "Orange", "Lemon", "Tomato"]);
const instance = Wade.save(search);

// Save `instance` somewhere...

Later, you can restore the same search function without having Wade rebuild the index:

// Retrieve `instance`, then
const search = Wade(instance);
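
One way to "save `instance` somewhere" is to serialize it as JSON. The sketch below assumes that the object returned by Wade.save contains only plain, JSON-serializable data, which this README does not state; check that against your Wade version. The helper names and the stand-in object are hypothetical.

```javascript
// Assumption: the saved instance is plain data that survives a JSON round
// trip. These helpers are illustrative and not part of Wade.
function serializeInstance(instance) {
  return JSON.stringify(instance);
}

function deserializeInstance(json) {
  return JSON.parse(json);
}

// Stand-in object used only to demonstrate the round trip; its shape is
// made up and does not reflect Wade's internal format.
const standIn = { data: ["Apple", "Orange"], index: { a: { p: {} } } };
const restored = deserializeInstance(serializeInstance(standIn));

console.log(restored.data.length); // 2
```

In a browser, the serialized string could be stored with `localStorage.setItem` and read back with `localStorage.getItem` before being passed to Wade.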

Pipeline

Wade uses a pipeline to preprocess data and search queries. By default, this pipeline will:

  • Make everything lowercase
  • Remove punctuation
  • Remove stop words

A pipeline is a list of functions, each of which takes a string, modifies it in some way, and returns the result.

The pipeline is exposed as Wade.pipeline, so you can modify it directly. For example:

// Don't preprocess at all
Wade.pipeline = [];

// Add custom processor to remove periods
Wade.pipeline.push(function(str) {
  return str.replace(/\./g, "");
});

Functions run in pipeline order (index 0 to n), and the pipeline is applied to each document in the data as well as to each search query.
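
To make the execution order concrete, here is a minimal sketch of how a list of string processors can be applied in sequence. `runPipeline` and the three processors are hypothetical names for illustration; this is not Wade's internal code.

```javascript
// Apply each processor in order, feeding the output of one into the next.
// Illustrative only; not taken from Wade's source.
function runPipeline(pipeline, str) {
  return pipeline.reduce((current, processor) => processor(current), str);
}

const lowercase = (str) => str.toLowerCase();
const removePeriods = (str) => str.replace(/\./g, "");
const collapseSpaces = (str) => str.replace(/\s+/g, " ").trim();

const examplePipeline = [lowercase, removePeriods, collapseSpaces];

console.log(runPipeline(examplePipeline, "  St.  Louis  Blues. "));
// st louis blues
```

With an empty pipeline the input comes back unchanged, which mirrors the "don't preprocess at all" case.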

The stop words can be configured to include any words you like. Set the array of stop words with:

Wade.config.stopWords = [/* array of stop words */];

The regular expression used to remove punctuation can be configured with:

Wade.config.punctuationRE = /[.!]/g; // should contain punctuation to remove
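
The effect of these two options can be sketched with standalone functions. The config object, the stop word list, and both function names below are hypothetical stand-ins, not Wade's defaults or internals. Note that the regular expression needs the `g` flag so that every occurrence is removed, not just the first.

```javascript
// Hypothetical stand-in for Wade.config; the values are examples only.
const exampleConfig = {
  stopWords: ["a", "an", "the", "of"],
  punctuationRE: /[.!]/g,
};

// Strip every character matched by the punctuation expression.
function removePunctuation(str, config) {
  return str.replace(config.punctuationRE, "");
}

// Drop words that appear in the stop word list.
function removeStopWords(str, config) {
  return str
    .split(" ")
    .filter((word) => word !== "" && !config.stopWords.includes(word))
    .join(" ");
}

const cleaned = removeStopWords(
  removePunctuation("the taste of a lemon!", exampleConfig),
  exampleConfig
);
console.log(cleaned); // taste lemon
```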

Algorithm

The algorithm behind the search is fairly simple. First, a trie data structure is built from the data. When performing a search, the following happens:

  • The search query is processed through the pipeline
  • The search query is then tokenized into keywords
  • Each keyword except the last is looked up, and the score of each item in the data is updated according to the number of keywords that appear in that item.
  • The last keyword is treated as a prefix: Wade performs a depth-first search and updates the score of every item containing a word that starts with this keyword. The amount added depends on how much of the word the prefix covers. This allows for searching as a user types.
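
The steps above can be sketched in a simplified form. This is an illustration of the technique, not Wade's source: all function names are hypothetical, the pipeline is reduced to lowercasing, and the scoring weights (an equal share per keyword, scaled by prefix length over word length for the last keyword) are assumptions, so the numbers need not match Wade's actual scores.

```javascript
// Simplified sketch of trie search with a prefix match on the last keyword.
// Not Wade's implementation; names and scoring are illustrative.

// Lowercase and split on whitespace (a stand-in for the real pipeline).
function tokenize(str) {
  return str.toLowerCase().split(/\s+/).filter(Boolean);
}

// Each node maps a character to a child; `ids` lists the documents that
// contain the word ending at that node.
function buildTrie(docs) {
  const root = { children: {}, ids: [] };
  docs.forEach((doc, id) => {
    for (const word of new Set(tokenize(doc))) {
      let node = root;
      for (const ch of word) {
        node = node.children[ch] || (node.children[ch] = { children: {}, ids: [] });
      }
      node.ids.push(id);
    }
  });
  return root;
}

// Follow `word` from the root; return the node reached, or null.
function walk(root, word) {
  let node = root;
  for (const ch of word) {
    node = node.children[ch];
    if (!node) return null;
  }
  return node;
}

function searchTrie(root, query) {
  const keywords = tokenize(query);
  if (keywords.length === 0) return [];
  const scores = new Map();
  const add = (id, amount) => scores.set(id, (scores.get(id) || 0) + amount);
  const share = 1 / keywords.length; // assumed: equal weight per keyword

  // Every keyword except the last must match a whole word.
  for (const keyword of keywords.slice(0, -1)) {
    const node = walk(root, keyword);
    if (node) node.ids.forEach((id) => add(id, share));
  }

  // The last keyword is a prefix: depth-first search below its node.
  const prefix = keywords[keywords.length - 1];
  const prefixLength = [...prefix].length;
  const start = walk(root, prefix);
  if (start) {
    const best = new Map(); // best prefix coverage per document
    const stack = [[start, prefixLength]];
    while (stack.length > 0) {
      const [node, depth] = stack.pop();
      for (const id of node.ids) {
        best.set(id, Math.max(best.get(id) || 0, prefixLength / depth));
      }
      for (const ch in node.children) stack.push([node.children[ch], depth + 1]);
    }
    best.forEach((coverage, id) => add(id, share * coverage));
  }

  return [...scores]
    .map(([index, score]) => ({ index, score }))
    .sort((a, b) => b.score - a.score || a.index - b.index);
}

const exampleTrie = buildTrie(["red apple", "green apple pie", "red pear"]);
console.log(searchTrie(exampleTrie, "red app").map((r) => r.index));
// [ 0, 2, 1 ]
```

In this sketch, document 0 ranks first because it contains the full keyword "red" and a word starting with "app"; document 2 matches only "red", and document 1 matches only the prefix.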

License

Licensed under the MIT License by Kabir Shah

Release Notes

0.3.3
By Kabir Shah • Published on July 29, 2017

Patches

  • Add note about punctuation configuration option: 35c2cbb84a31cb5671e24eb54da2fb848d1ce555
  • Fix: skip over empty processed items (closes #10, thanks @dumin): 6c529272dcd456dc30ce56400371e8e400a319f3

0.3.2
By Kabir Shah • Published on July 19, 2017

Patches

  • Feat: allow stop words to be configured: 4b446375f4b09f4f6de52f01efc290c60bea98ae
  • Add docs for stop words (fixes #2): 8507e5f306f9e45914b91c90c42092df56545fce
  • Fix: remove all stop words by iterating through array backwards (fixes #3): 388ccaa674c95ed0bd66e6fbe49eb7a1e8971a18
  • Update punctuation symbols, fix config usage: 518ad44742950fb4eaf9f16d3031b195a04de180
  • Perf: cache config for punctuation removal: 0f08f8519d6b995814c861b35a804f6dae11f35c

0.3.1
By Kabir Shah • Published on July 9, 2017

Patches

  • Fix variable casing: 0794aef5b64649ac63b061b8dc5d969f3f14c689
  • Don't mutate data and check for trimmed query (fixes #1): f2afe63e62e3a5345de873f61329d9f3d01685b6

0.3.0
By Kabir Shah • Published on July 7, 2017

Minor Changes

  • Init multiple patterns: 87813fd280cb7ab96ac1fdcf3d9d52f5d88830a6
  • Naive trie lookup for multiple patterns: 47bdf62a15ce9585826e614f8f84e8af527ca4b0
  • :zap: blazing fast trie based search with prefix support (w/ stack): 0ff145960ab4c2ea92e03d5c815d740772a14472
  • Change order of algorithm/pipeline: d0516f6d737408d1725655e408bb7071e8767957
  • Add saving and loading feature: 8d92f69f16edbafd4ec1335f3f282184d4b6024f

Patches

  • Use two different algorithms for now: 80ae221f762ca80bb460a6db7d97f2cf0c2223b0
  • Add suffix trie generator: 74b9bc4dacee0a9d7768487dd1174474abb14271
  • Remove unused case, add more optimization for multiple patterns: 60feaf9d319c9c1ae97cc9f070dcc789346b8903
  • Add index builder: 92c830835ec21ba7102370e701614e55a0582577
  • Fix iterator: cf5831c3c70b7b606486b1f5d51eacf36e814de6
  • Build index: 35743d85791690a68afa7bf0fc176664c6357354
  • Correctly tokenize strings: 58d908785f31ad99cd27604ba8a8831c26aee6e3

0.2.0
By Kabir Shah • Published on May 21, 2017

Minor Changes

  • Use quick search: 8124226e063a332cf62b8c98ee801b60352a39ab
  • Add preprocessing pipeline: a55060a32749b56e2388bd122166d82745c7553a

Patches

  • Reuse length when creating tables: 8b1ed02e0919c865cc425768aa943aa9d05416b3
  • Improve pipeline and add docs: e59a3a397acf31f2d7a7c08a26cb665eb31b6147

General

License: MIT
Typescript Types: None found
Tree-shakeable: No

Popularity

GitHub Stargazers: 2,963
Community Interest: 3,110
Number of Forks: 101

Maintenance

Last Commit: Nov 20, 2019
Open Issues: 4
Closed Issues: 13
Open Pull Requests: 2
Closed Pull Requests: 4

Versions

Latest Version Released: Jul 29, 2017
Current Tags: latest (0.3.3)

Contributors

  • kbrsh: 93 commits
  • MMeent: 1 commit
  • leovarmak: 1 commit