This crate provides common routines used in command line applications, with a focus on routines useful for search oriented applications. As a utility library, there is no central type or function. However, a key focus of this crate is to improve failure modes and provide user friendly error messages when things go wrong.
To the best extent possible, everything in this crate works on Windows, macOS and Linux.
Standard I/O
is_readable_stdin determines whether stdin can be usefully read from. It is useful when writing an application that changes behavior based on whether the application was invoked with data on stdin. For example, rg foo might recursively search the current working directory for occurrences of foo, but rg foo < file might only search the contents of file.
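The decision described above can be sketched with the standard library alone. This is only an approximation: it assumes that "stdin is not a terminal" is a good enough signal, whereas the crate's own is_readable_stdin may apply stricter checks. The names Input, choose_input and stdin_probably_readable are hypothetical and exist only for this illustration.

```rust
use std::io::{self, IsTerminal};

/// Where a hypothetical search tool should look for its input.
#[derive(Debug, PartialEq, Eq)]
pub enum Input {
    /// Search the data piped or redirected into stdin.
    Stdin,
    /// Recursively search the current working directory.
    CurrentDir,
}

/// Picks an input source from a "stdin looks readable" flag.
pub fn choose_input(stdin_readable: bool) -> Input {
    if stdin_readable {
        Input::Stdin
    } else {
        Input::CurrentDir
    }
}

/// A rough stand-in for is_readable_stdin (an assumption, not the crate's
/// logic): treat stdin as readable whenever it is not attached to a terminal.
pub fn stdin_probably_readable() -> bool {
    !io::stdin().is_terminal()
}
```

A tool would call choose_input(stdin_probably_readable()) once at startup and branch on the result.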
Coloring and buffering
The stdout, stdout_buffered_block and stdout_buffered_line routines are alternative constructors for StandardStream. A StandardStream implements termcolor::WriteColor, which provides a way to emit colors to terminals. Its key use is the encapsulation of buffering style. Namely, stdout will return a line buffered StandardStream if and only if stdout is connected to a tty, and will otherwise return a block buffered StandardStream. Line buffering is important for use with a tty because it typically decreases the latency at which the end user sees output. Block buffering is used otherwise because it is faster, and redirecting stdout to a file typically doesn’t benefit from the decreased latency that line buffering provides.
The stdout_buffered_block and stdout_buffered_line routines can be used to explicitly set the buffering strategy regardless of whether stdout is connected to a tty or not.
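The buffering rule can be illustrated without termcolor. The sketch below uses only std::io types and leaves out coloring entirely; Buffering, choose_buffering, buffered_stdout and auto_stdout are hypothetical names, not part of this crate.

```rust
use std::io::{self, BufWriter, IsTerminal, LineWriter, Write};

/// The two buffering strategies discussed above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Buffering {
    Line,
    Block,
}

/// Line buffering for a tty (lower latency), block buffering otherwise
/// (higher throughput).
pub fn choose_buffering(is_tty: bool) -> Buffering {
    if is_tty {
        Buffering::Line
    } else {
        Buffering::Block
    }
}

/// Wraps stdout with an explicitly chosen buffering strategy.
pub fn buffered_stdout(strategy: Buffering) -> Box<dyn Write> {
    let out = io::stdout();
    match strategy {
        Buffering::Line => Box::new(LineWriter::new(out)),
        Buffering::Block => Box::new(BufWriter::new(out)),
    }
}

/// Chooses the strategy automatically from whether stdout is a terminal.
pub fn auto_stdout() -> Box<dyn Write> {
    buffered_stdout(choose_buffering(io::stdout().is_terminal()))
}
```

Keeping the decision in a pure function (choose_buffering) makes it easy to test and to override from a command line flag.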
Escaping
The escape, escape_os, unescape and unescape_os routines provide a user friendly way of dealing with UTF-8 encoded strings that can express arbitrary bytes. For example, you might want to accept a string containing arbitrary bytes as a command line argument, but most interactive shells make such strings difficult to type. Instead, we can ask users to use escape sequences.
For example, a\xFFz is itself a valid UTF-8 string corresponding to the following bytes:
[b'a', b'\\', b'x', b'F', b'F', b'z']
However, we can interpret \xFF as an escape sequence with the unescape/unescape_os routines, which will yield
[b'a', b'\xFF', b'z']
instead. For example:
use grep_cli::unescape;
// Note the use of a raw string!
assert_eq!(vec![b'a', b'\xFF', b'z'], unescape(r"a\xFFz"));
The escape/escape_os routines provide the reverse transformation, which makes it easy to show user friendly error messages involving arbitrary bytes.
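To make the round trip concrete, here is a deliberately simplified sketch of both directions using only the standard library. It is not the crate's implementation: it assumes only two escape forms (\xHH and \\) and escapes every byte outside printable ASCII, which may differ from what escape and unescape actually do. The names unescape_hex and escape_hex are hypothetical.

```rust
/// Decodes `\xHH` and `\\` escapes into raw bytes. Anything else, including
/// a malformed escape, is copied through unchanged.
pub fn unescape_hex(s: &str) -> Vec<u8> {
    let bytes = s.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let rest = &bytes[i..];
        if rest.starts_with(br"\\") {
            out.push(b'\\');
            i += 2;
            continue;
        }
        if rest.len() >= 4 && rest.starts_with(br"\x") {
            let hi = (rest[2] as char).to_digit(16);
            let lo = (rest[3] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 4;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    out
}

/// Renders arbitrary bytes as printable ASCII, escaping everything else.
pub fn escape_hex(bytes: &[u8]) -> String {
    let mut out = String::new();
    for &b in bytes {
        match b {
            b'\\' => out.push_str(r"\\"),
            0x20..=0x7E => out.push(b as char),
            _ => out.push_str(&format!("\\x{:02X}", b)),
        }
    }
    out
}
```

Because a literal backslash is itself escaped, unescape_hex(&escape_hex(bytes)) returns the original bytes in this sketch.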
Building patterns
Typically, regular expression patterns must be valid UTF-8. However, command line arguments aren’t guaranteed to be valid UTF-8. Unfortunately, the standard library’s UTF-8 conversion functions from OsStrs do not provide good error messages. However, the pattern_from_bytes and pattern_from_os routines do, including reporting exactly where the first invalid UTF-8 byte is seen.
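The offset reporting can be approximated with std::str::from_utf8, whose error exposes valid_up_to. The sketch below is an assumption about the general technique, not the crate's code; InvalidUtf8 and pattern_from_bytes_sketch are hypothetical names.

```rust
use std::fmt;
use std::str;

/// A hypothetical error type recording where the first invalid byte occurs.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidUtf8 {
    /// Number of leading bytes that were valid UTF-8.
    pub valid_up_to: usize,
}

impl fmt::Display for InvalidUtf8 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "found invalid UTF-8 in pattern at byte offset {}",
            self.valid_up_to
        )
    }
}

/// Converts bytes to a pattern string, reporting the offset of the first
/// invalid UTF-8 byte on failure.
pub fn pattern_from_bytes_sketch(bytes: &[u8]) -> Result<&str, InvalidUtf8> {
    str::from_utf8(bytes).map_err(|err| InvalidUtf8 {
        valid_up_to: err.valid_up_to(),
    })
}
```

An error message built this way can also suggest a fix, such as writing the offending byte as a hex escape sequence.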
Additionally, it can be useful to read patterns from a file while reporting good error messages that include line numbers. The patterns_from_path, patterns_from_reader and patterns_from_stdin routines do just that. If any pattern is found that is invalid UTF-8, then the error includes the file path (if available) along with the line number and the byte offset at which the first invalid UTF-8 byte was observed.
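A minimal sketch of line-oriented pattern reading with line numbers in the error follows. It uses only the standard library, omits the file path, and its error wording is made up for illustration; patterns_from_reader_sketch is a hypothetical name and the crate's own routines may behave differently in detail.

```rust
use std::io::{self, BufRead};

/// Reads one pattern per line. On invalid UTF-8, the error names the
/// 1-based line number and the byte offset within that line.
pub fn patterns_from_reader_sketch<R: BufRead>(rdr: R) -> io::Result<Vec<String>> {
    let mut patterns = Vec::new();
    for (idx, line) in rdr.split(b'\n').enumerate() {
        let mut bytes = line?;
        // Tolerate Windows style line endings.
        if bytes.last() == Some(&b'\r') {
            bytes.pop();
        }
        match String::from_utf8(bytes) {
            Ok(pattern) => patterns.push(pattern),
            Err(err) => {
                let offset = err.utf8_error().valid_up_to();
                let msg = format!(
                    "line {}: invalid UTF-8 at byte offset {}",
                    idx + 1,
                    offset
                );
                return Err(io::Error::new(io::ErrorKind::InvalidData, msg));
            }
        }
    }
    Ok(patterns)
}
```

Any BufRead works as input, for example a byte slice in tests or a BufReader wrapped around a file.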
Read process output
Sometimes a command line application needs to execute other processes and read their stdout in a streaming fashion. The CommandReader provides this functionality with an explicit goal of improving failure modes. In particular, if the process exits with an error code, then stderr is read and converted into a normal Rust error to show to end users. This makes the underlying failure modes explicit and gives more information to end users for debugging the problem.
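The failure handling described above can be sketched with std::process. Unlike CommandReader, this version collects all of stdout before returning instead of streaming it, so it only illustrates how stderr becomes an error. The names check_exit and read_command_output are hypothetical.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread;

/// Turns a failed exit status plus captured stderr into a normal Rust error.
pub fn check_exit(success: bool, stderr: &[u8]) -> io::Result<()> {
    if success {
        return Ok(());
    }
    let text = String::from_utf8_lossy(stderr);
    let trimmed = text.trim();
    let msg = if trimmed.is_empty() {
        "command failed with no stderr output"
    } else {
        trimmed
    };
    Err(io::Error::new(io::ErrorKind::Other, msg.to_string()))
}

/// Runs a command and returns its stdout, or an error built from its stderr.
pub fn read_command_output(cmd: &mut Command) -> io::Result<Vec<u8>> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Drain stderr on its own thread so that a full stderr pipe cannot
    // block the child while this thread is busy reading stdout.
    let mut stderr = child.stderr.take().expect("stderr was piped");
    let stderr_thread = thread::spawn(move || {
        let mut buf = Vec::new();
        stderr.read_to_end(&mut buf).map(|_| buf)
    });

    let mut out = Vec::new();
    child
        .stdout
        .take()
        .expect("stdout was piped")
        .read_to_end(&mut out)?;

    let status = child.wait()?;
    let err_bytes = stderr_thread.join().expect("stderr thread panicked")?;
    check_exit(status.success(), &err_bytes)?;
    Ok(out)
}
```

Reading stderr concurrently is what avoids the deadlock that can occur when a child fills one pipe while the parent waits on the other.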
As a special case, DecompressionReader provides a way to decompress arbitrary files by matching their file extensions up with corresponding decompression programs (such as gzip and xz). This is useful as a means of performing simplistic decompression in a portable manner without binding to specific compression libraries. This does come with some overhead though, so if you need to decompress lots of small files, this may not be an appropriate convenience to use.
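The extension matching idea can be sketched as a small lookup table. The table contents here are assumptions made for illustration (gzip -d -c and xz -d -c write decompressed data to stdout); the crate's actual matching rules and default commands may differ. Decompressor, TABLE, decompressor_for and decompression_command are hypothetical names.

```rust
use std::path::Path;
use std::process::Command;

/// A decompression program and the arguments that make it write to stdout.
#[derive(Debug, PartialEq, Eq)]
pub struct Decompressor {
    pub program: &'static str,
    pub args: &'static [&'static str],
}

/// Hypothetical mapping from file extension to decompression program.
const TABLE: &[(&str, Decompressor)] = &[
    ("gz", Decompressor { program: "gzip", args: &["-d", "-c"] }),
    ("xz", Decompressor { program: "xz", args: &["-d", "-c"] }),
];

/// Looks up a decompressor by the path's extension, ignoring ASCII case.
pub fn decompressor_for(path: &Path) -> Option<&'static Decompressor> {
    let ext = path.extension()?.to_str()?;
    TABLE
        .iter()
        .find(|(known, _)| known.eq_ignore_ascii_case(ext))
        .map(|(_, decompressor)| decompressor)
}

/// Builds, but does not run, the command that would decompress `path`.
pub fn decompression_command(path: &Path) -> Option<Command> {
    let decompressor = decompressor_for(path)?;
    let mut cmd = Command::new(decompressor.program);
    cmd.args(decompressor.args).arg(path);
    Some(cmd)
}
```

Spawning one process per file is where the overhead mentioned above comes from, which is why this approach suits a few large files better than many small ones.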
Each reader has a corresponding builder for additional configuration, such as whether to read stderr asynchronously in order to avoid deadlock (which is enabled by default).
Miscellaneous parsing
The parse_human_readable_size routine parses strings like 2M and converts them to the corresponding number of bytes (2 * (1<<20) in this case). If an invalid size is found, then a good error message is crafted that typically tells the user how to fix the problem.
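As a rough illustration of this kind of parsing, the sketch below accepts decimal digits followed by an optional K, M or G suffix, interpreted as powers of 1024. The accepted suffixes and the error wording are assumptions for this example, and parse_size_sketch is a hypothetical name rather than the crate's routine.

```rust
/// Parses sizes such as "512", "64K", "2M" or "1G" into a number of bytes.
pub fn parse_size_sketch(input: &str) -> Result<u64, String> {
    let s = input.trim();
    // Split off an optional one-letter suffix and map it to a bit shift.
    let (digits, shift) = match s.chars().last() {
        Some('K') => (&s[..s.len() - 1], 10),
        Some('M') => (&s[..s.len() - 1], 20),
        Some('G') => (&s[..s.len() - 1], 30),
        _ => (s, 0),
    };
    let n: u64 = digits.parse().map_err(|_| {
        format!(
            "invalid size '{}': expected digits followed by an optional K, M or G suffix",
            s
        )
    })?;
    // Guard against overflow instead of silently wrapping.
    n.checked_mul(1u64 << shift)
        .ok_or_else(|| format!("size '{}' is too big", s))
}
```

For instance, parse_size_sketch("2M") yields 2 * (1 << 20) bytes, while "2X" produces an error that states the expected format.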
Structs
- An error that can occur while running a command and reading its output.
- A streaming reader for a command’s output.
- Configures and builds a streaming reader for process output.
- A matcher for determining how to decompress files.
- A builder for a matcher that determines which files get decompressed.
- A streaming reader for decompressing the contents of a file.
- Configures and builds a streaming reader for decompressing data.
- An error that occurs when a pattern could not be converted to valid UTF-8.
- An error that occurs when parsing a human readable size description.
- A writer that supports coloring with either line or block buffering.
Functions
- Escapes arbitrary bytes into a human readable string.
- Escapes an OS string into a human readable string.
- Returns the hostname of the current system.
- Returns true if and only if stdin is believed to be readable.
- is_tty_stderr (Deprecated): Returns true if and only if stderr is believed to be connected to a tty or a console.
- is_tty_stdin (Deprecated): Returns true if and only if stdin is believed to be connected to a tty or a console.
- is_tty_stdout (Deprecated): Returns true if and only if stdout is believed to be connected to a tty or a console.
- Parse a human readable size like 2M into a corresponding number of bytes.
- Convert arbitrary bytes into a regular expression pattern.
- Convert an OS string into a regular expression pattern.
- Read patterns from a file path, one per line.
- Read patterns from any reader, one per line.
- Read patterns from stdin, one per line.
- Resolves a path to a program to a path by searching for the program in PATH.
- Returns a possibly buffered writer to stdout for the given color choice.
- Returns a block buffered writer to stdout for the given color choice.
- Returns a line buffered writer to stdout for the given color choice.
- Unescapes a string.
- Unescapes an OS string.