Collecting All the Errors - Rust
A common way to handle iterating over results, Result<T, E>
, in Rust is to
use collect()
to collect an iterator of results into a result. To borrow the
example from Rust by Example:
let strings = vec!["7", "42", "one"];
let numbers: Result<Vec<_>,_> = string
.into_iter()
.map(|s| s.parse::<i32>())
.collect::<Result<Vec<_>,_>>();
You only need either the type hint on numbers
or the turbofish on collect
to coerce the iterator of results into a result. I prefer the turbofish on
collect
as I feel it makes the code flow better when reading, and is what you
would do when chaining further computations. In fact, if you’re going to use
?
on the collected result, you wouldn’t need1 either.
I use this pattern a lot, and it’s great. It makes mapping over collections with operations that can fail as ergonomic as applying that operation to a single value.
- You can use the
?
operator to make the error handling almost automatic. - You can use any of the other result combinators like
and_then
, andmap
. - You can chain further computations on the end of the expression. In the above example we could have done:
.collect()?
.into_iter()
.map(|n| n + 1)
.collect()
This pattern basically says:
Go try these things, if one of them fails we’ll fail ourselves, reporting the first error that we encounter.
Which, most of the time, is all you need.
But what about when it’s not? What about when you want to collect all of the errors when there are some?
The very next section of Rust by Example shows how you can use
partition(Result::is_ok)
to separate the Ok
& Err
variants in your
iterator of results. You could then go onto write some logic so your function
tests that the collection of errors is empty, and if not returns a result with
a collection of errors in its Err
variant: Result<Vec<T>, Vec<E>>
.
But then your entire function stack up to the point where you terminate the
need for expressing a collection of errors has to deal with returning a
collection of errors. If you’re writing a CLI tool, that termination point
would be the point where you write all the error messages to stderr and exit,
so your whole program is likely to need this error handling. This ruins a lot
of the ergonomics of ?
for dealing with calls to foreign methods that return
a normal Result<T, E>
(where E: std::err::Error
). Ok, you could replace all
?
with .map_err(|e| vec![e])?
, but that’s still clunky to deal with.
An alternative could be to use an error type such as
valid::ValidationError
that can model a collection of errors,
or build your own. This is probably a better approach, but is still going to
be a bit of work to define your own error type, or integrate another crates’
error type into your application.
If you’re building a library, this alternative is probably the right way to go as your library will be able to return a well-typed expression of “many errors”.
With both of these, you’re still going to be missing the ergonomics of
collect()
.
In much the same way as when building a library you might use
thiserror
to define your own error types, whereas in an
application you’d be happy raising add-hoc errors with
anyhow::anyhow
: if you’re writing an application you might find
defining your own error type(s) to handle “many errors” a bit heavy weight.
I’ve recently found myself in this exact situation while writing a CLI tool. The tool needs to tell the user all the errors they’ve made in the file they’ve provided, rather than just one at a time… as that would be super annoying. It doesn’t need to do any real processing of the collected errors, it only needs to print them all to the terminal in a way that’s reasonably readable to the user, ideally with writing as little special code as possible.
To keep things as simple as possible: I wanted the ergonomics of collect()
;
and for my representation of “many errors” to be “nothing special”, so I can
use all the existing error handling goodies out of the box.
Enter BeauCollector
. Trait BeauCollector
provides method
bcollect
which will collect an iterator of errors into an
anyhow::Result
with an Err
variant containing an ad-hoc
anyhow::Error
with the messages from each error in the
collection of results on a new line in the error message.
Looking at our example above:
use beau_collector::BeauCollector as _;
let strings = vec!["7", "42", "one"];
let numbers = string
.into_iter()
.map(|s| s.parse::<i32>())
.bcollect::<Vec<_>>()?;
Errors with chains of causes, have their causal chains retained by being
formatted with the inline representation where causes are separated
by :
.
One of the results you’re iterating over might have been the result of running
bcollect
. As string concatenation is basically what we’re doing here,
bcollect
will represent them alongside any new errors from this layer in the
error message. You can collect errors with bcollect
at various layers within
you application and they’ll all be represented in one large error at the top
level.
Using bcollect
at multiple levels and having a main
that returns a result:
fn main() -> anyhow::Result<()> {
use anyhow::Context as _;
input_file = ...;
some_data = process_input(input_file)
.with_context(|| format!("Errors found while processing {}", input_file))?;
run_application(some_data)
}
Can have an error output that looks like:
$ ./my-cli-tool input_file.yaml
Errors found while processing input_file.yaml
Caused by:
"foo_bar" is not a valid name for X: underscores are not permitted
field "baz" is missing from input data
errors processing scale data
:
54 is too high for field_y
values must be positive: value -3 given for field_z is negative.
some other error because of: another error because of: specific problem.
The error in the Err
variant of the result returned by main
is an
anyhow::Error
with a message being the block seen below “Caused
by:” and a context of “Errors found while…".
Using BeauCollector
in this way has the advantages that:
- Functions can return a normal
Result
, ?
can be used normally on all otherResults
in functions,- If
main
returnsResult
, no special error handling is needed and the full collection of results will appear in stderr.
There are some limitations too:
- Beyond placing each collected error on a newline,
bcollect
doesn’t do any further formatting, or reformatting when different layers’ errors are combined. It’s left up to the error messages raised to add in some whitespace to appear more readable than a solid block of text. - While the messages from the causal chain of an error are retained you loose some other error handling like getting the backtrace or downcasting to get the original error back.
These are currently the price you pay for the ergonomics of bcollect
.
For writing applications, you’re likely to be happy with this trade off and
find the limitations of bcollect
a small price to pay.
For a library you might want the caller to have more options on what to do with
the many errors than building up an error message, maybe not. Perhaps
bcollect
could be enhanced to return an error type that can represent a tree
of errors that can be added to as errors propagate back up the stack. By all
means send in a pull request or raise an issue.
When I say you don’t need the type hints, that is to say that the compiler won’t need them to infer the type. HUman readers of the code might prefer it is there so they can more easily infer the type. Type hinting is also helpful when refactoring as it can catch errors. ↩︎