stringdist version 0.9.6 arrived on CRAN on 16 July 2020.
This release brings a few new features.
Fuzzy text search
Search text for approximate matches of a search string using any stringdist distance. There are several functions that allow you to
- detect whether there is a match within a certain maximum distance
- return the position of the first best match
- return the best match.
There are several interfaces for this. The functions grab and grabl work like base grep and grepl. The function extract has output similar to stringr::str_extract. The workhorse function is called afind (approximate find), which returns all results for multiple search patterns.
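As a minimal sketch of how these interfaces fit together (the example texts and the misspelled patterns are made up for illustration, and the default 'osa' distance is assumed), one could try something like:

```r
library(stringdist)

# Two made-up texts; only the first contains something close to "wrold".
texts <- c("hello world", "goodbye")

# Detect: does each text contain a window within distance 1 of the pattern?
hits <- grabl(texts, "wrold", maxDist = 1)

# Locate: indices of the texts with such a match, in the spirit of base grep.
idx <- grab(texts, "wrold", maxDist = 1)

# Extract: the best-matching substring, or NA when nothing is close enough.
found <- extract(texts, "wrold", maxDist = 1)

# afind accepts several patterns at once and returns a list with the
# elements location, distance and match: one row per text, one column per pattern.
res <- afind(texts, c("wrold", "godbye"))
str(res)
```

The maxDist argument decides what counts as a match, so the useful range depends on the distance method you choose.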
There is also a new implementation of the popular ‘cosine’ distance that I developed especially for this purpose. It is called ‘running_cosine’ and it avoids the double work otherwise done by the standard ‘cosine’ method. The result is a much faster implementation (up to about 100 times faster).
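To get a feel for the difference, the two methods can be swapped in afind via the method argument. The sketch below uses made-up texts and an arbitrary choice of q = 2; it only checks that both methods agree on the distances and does not attempt a benchmark.

```r
library(stringdist)

# Made-up example data; the pattern contains a deliberate typo.
texts <- c("a short text about the cosine distance",
           "another text that never mentions it")
pattern <- "cosine distanse"

# Standard q-gram cosine distance, evaluated for every window.
slow <- afind(texts, pattern, method = "cosine", q = 2)

# Running variant, intended to avoid redoing work as the window slides.
fast <- afind(texts, pattern, method = "running_cosine", q = 2)

# The two should agree on the distances up to floating-point error.
same <- all.equal(slow$distance, fast$distance, tolerance = 1e-6)
same
```

Wrapping each call in system.time on longer texts is one way to see how much the speed-up matters for your own data.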
String similarity matrices
Thanks to a PR by Johannes Gruber, stringdist now has a function to compute string similarity matrices: