A quick note – while our main literal matcher and overall approach are described in our earlier paper (link and discussion https://branchfree.org/2019/02/28/paper-hyperscan-a-fast-multi-pattern-regex-matcher-for-modern-cpus/), we didn’t have space to discuss in much detail other string matching approaches.
Our “FDR” matcher is used for large-scale literal string matching – but what if the number of literals is small? We have a rather dull SIMD-based matcher for single strings (“Noodle”), but the territory between 2 and about 300 strings is interesting. Accessing matching tables in SIMD is much faster than going out to memory – even L1 cache (both because it’s parallel, and because it’s in registers). We have two matchers, both named after historical presidents of the United States of America (we have an informal cutoff around Truman or Eisenhower; I don’t think the conversations that result from calling a string matcher “Reagan” or “Carter” would be very productive).
Fortunately, there are papers about both.
Teddy was our initial SIMD matcher: https://dl.acm.org/doi/10.1145/3472456.3473512
Harry is a new SIMD matcher: https://ieeexplore.ieee.org/abstract/document/10229022
The engineer who built Harry (the very talented Harry Chang) assured me that he’s sticking to our presidential naming conventions.
Both of these systems can run much faster than FDR, which works byte-by-byte (or at most, can skip 1 or 3 bytes if doing strided matching). They’re a fine example of why SIMD is so powerful – the idea of using instructions such as PSHUFB or VPERMB for table lookup (rather than data rearrangement) is venerable, but we’re still learning new ways of applying this idea.