Crate grep_searcher
This crate provides an implementation of line oriented search, with optional support for multi-line search.
Brief overview
The principal type in this crate is a Searcher, which can be configured and built by a SearcherBuilder. A Searcher is responsible for reading bytes from a source (e.g., a file), executing a search of those bytes using a Matcher (e.g., a regex) and then reporting the results of that search to a Sink (e.g., stdout). The Searcher itself is principally responsible for managing the consumption of bytes from a source and applying a Matcher over those bytes in an efficient way. The Searcher is also responsible for inverting a search, counting lines, reporting contextual lines, detecting binary data and even deciding whether or not to use memory maps.
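To make those responsibilities concrete, here is a minimal, standard-library-only sketch of a line-oriented search loop with inversion, line counting and a simple binary-data check. It is a toy model, not this crate's implementation: `ToyConfig`, `toy_search` and the "quit on a given byte" heuristic are hypothetical names and simplifications chosen for illustration.

```rust
/// Hypothetical, simplified stand-in for a searcher's configuration.
/// These names are illustrative and are not this crate's API.
struct ToyConfig {
    /// Report lines that do NOT match instead of lines that do.
    invert_match: bool,
    /// Stop searching as soon as a line contains this byte (a simple
    /// binary-data heuristic; a NUL byte is a typical choice).
    quit_on_byte: Option<u8>,
}

/// Walks `haystack` line by line, applies `is_match` to each line (without
/// its terminator) and collects `(1-based line number, line)` pairs.
fn toy_search<F>(config: &ToyConfig, haystack: &[u8], is_match: F) -> Vec<(u64, Vec<u8>)>
where
    F: Fn(&[u8]) -> bool,
{
    let mut results = Vec::new();
    // `split_inclusive` keeps the `\n` on each line, so a trailing newline
    // does not produce an extra empty line at the end.
    for (index, line) in haystack.split_inclusive(|&b| b == b'\n').enumerate() {
        if let Some(quit) = config.quit_on_byte {
            if line.contains(&quit) {
                break;
            }
        }
        let content = line.strip_suffix(b"\n").unwrap_or(line);
        // With inversion enabled, a line is reported when it does not match.
        if is_match(content) != config.invert_match {
            results.push((index as u64 + 1, content.to_vec()));
        }
    }
    results
}
```

A real searcher additionally deals with buffering, contextual lines, encodings and memory maps, which this sketch deliberately leaves out.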
A Matcher (which is defined in the grep-matcher crate) is a trait for describing the lowest levels of pattern search in a generic way. The interface itself is very similar to the interface of a regular expression. For example, the grep-regex crate provides an implementation of the Matcher trait using Rust’s regex crate.
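The idea of a generic matching interface can be sketched with a much smaller trait. The following is an assumption-laden illustration only: `ToyMatcher`, `Literal` and `first_match` are hypothetical names, and the real Matcher trait in grep-matcher exposes considerably more than a single `find` method.

```rust
/// A hypothetical, stripped-down matcher interface. It only models
/// "find the first match" and is not the trait from grep-matcher.
trait ToyMatcher {
    /// Returns the byte offsets `(start, end)` of the first match, if any.
    fn find(&self, haystack: &[u8]) -> Option<(usize, usize)>;
}

/// A matcher for a fixed byte string.
struct Literal(Vec<u8>);

impl ToyMatcher for Literal {
    fn find(&self, haystack: &[u8]) -> Option<(usize, usize)> {
        let needle = &self.0;
        // `windows(0)` would panic, so treat the empty needle separately:
        // it matches at the start of any haystack.
        if needle.is_empty() {
            return Some((0, 0));
        }
        haystack
            .windows(needle.len())
            .position(|window| window == needle.as_slice())
            .map(|start| (start, start + needle.len()))
    }
}

/// Code written against the trait works with any matcher implementation,
/// whether it is backed by a literal, a regex engine or something else.
fn first_match<'h, M: ToyMatcher>(matcher: &M, haystack: &'h [u8]) -> Option<&'h [u8]> {
    matcher.find(haystack).map(|(start, end)| &haystack[start..end])
}
```

Keeping the search loop generic over such a trait is what lets the pattern engine be swapped without touching the code that reads and reports lines.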
Finally, a Sink describes how callers receive search results produced by a Searcher. This includes routines that are called at the beginning and end of a search, in addition to routines that are called when matching or contextual lines are found by the Searcher. Implementations of Sink can be trivially simple, or extraordinarily complex, such as the Standard printer found in the grep-printer crate, which effectively implements grep-like output. This crate also provides convenience Sink implementations in the sinks sub-module for easy searching with closures.
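The callback shape described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the crate's Sink trait: `ToySink`, `Utf8Lines` and `drive` are hypothetical names, contextual lines are omitted, and the driver uses plain substring matching.

```rust
use std::io;

/// Hypothetical, simplified sink interface; not this crate's `Sink` trait.
trait ToySink {
    /// Called once before any line is reported.
    fn begin(&mut self) -> io::Result<()> {
        Ok(())
    }
    /// Called for each matching line. Returning `Ok(false)` stops the search.
    fn matched(&mut self, line_number: u64, line: &[u8]) -> io::Result<bool>;
    /// Called once after the search has ended.
    fn finish(&mut self, _lines_reported: u64) -> io::Result<()> {
        Ok(())
    }
}

/// Closure adapter: hands each matching line to the closure as `&str`.
struct Utf8Lines<F>(F)
where
    F: FnMut(u64, &str) -> io::Result<bool>;

impl<F> ToySink for Utf8Lines<F>
where
    F: FnMut(u64, &str) -> io::Result<bool>,
{
    fn matched(&mut self, line_number: u64, line: &[u8]) -> io::Result<bool> {
        // Invalid UTF-8 is surfaced as an error rather than silently dropped.
        let text = std::str::from_utf8(line)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        (self.0)(line_number, text)
    }
}

/// Toy driver: reports every line of `haystack` that contains `needle`.
fn drive<S: ToySink>(haystack: &str, needle: &str, mut sink: S) -> io::Result<()> {
    sink.begin()?;
    let mut reported = 0;
    for (index, line) in haystack.lines().enumerate() {
        if line.contains(needle) {
            reported += 1;
            if !sink.matched(index as u64 + 1, line.as_bytes())? {
                break;
            }
        }
    }
    sink.finish(reported)
}
```

Letting the per-match callback return a boolean gives the caller a cheap way to stop early, for example after the first hit.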
Example
This example shows how to execute the searcher and read the search results using the UTF8 implementation of Sink.
use std::error::Error;

use {
    grep_matcher::Matcher,
    grep_regex::RegexMatcher,
    grep_searcher::Searcher,
    grep_searcher::sinks::UTF8,
};

const SHERLOCK: &'static [u8] = b"\
For the Doctor Watsons of this world, as opposed to the Sherlock
Holmeses, success in the province of detective work must always
be, to a very large extent, the result of luck. Sherlock Holmes
can extract a clew from a wisp of straw or a flake of cigar ash;
but Doctor Watson has to have it taken out for him and dusted,
and exhibited clearly, with a label attached.
";

// The `?` operator needs a function that returns a `Result`.
fn main() -> Result<(), Box<dyn Error>> {
    let matcher = RegexMatcher::new(r"Doctor \w+")?;
    let mut matches: Vec<(u64, String)> = vec![];
    Searcher::new().search_slice(&matcher, SHERLOCK, UTF8(|lnum, line| {
        // We are guaranteed to find a match, so the unwrap is OK.
        let mymatch = matcher.find(line.as_bytes())?.unwrap();
        matches.push((lnum, line[mymatch].to_string()));
        // Returning `Ok(true)` tells the searcher to keep going.
        Ok(true)
    }))?;

    assert_eq!(matches.len(), 2);
    assert_eq!(
        matches[0],
        (1, "Doctor Watsons".to_string())
    );
    assert_eq!(
        matches[1],
        (5, "Doctor Watson".to_string())
    );
    Ok(())
}
See also examples/search-stdin.rs, in the root of this crate’s directory, for a similar example that accepts a pattern on the command line and searches stdin.
Modules
- sinks: A collection of convenience implementations of Sink.
Structs
- BinaryDetection: The behavior of binary detection while searching.
- Encoding: An encoding to use when searching.
- LineIter: An iterator over lines in a particular slice of bytes.
- LineStep: An explicit iterator over lines in a particular slice of bytes.
- MmapChoice: Controls the strategy used for determining when to use memory maps.
- Searcher: A searcher executes searches over a haystack and writes results to a caller provided sink.
- SearcherBuilder: A builder for configuring a searcher.
- SinkContext: A type that describes a contextual line reported by a searcher.
- SinkFinish: Summary data reported at the end of a search.
- SinkMatch: A type that describes a match reported by a searcher.
Enums
- ConfigError: An error that can occur when building a searcher.
- SinkContextKind: The type of context reported by a searcher.
Traits
- Sink: A trait that defines how results from searchers are handled.
- SinkError: A trait that describes errors that can be reported by searchers and implementations of Sink.