A Little Story About the `yes` Unix Command
What's the simplest Unix command you know?
There's `echo`, which prints a string to stdout, and `true`, which always terminates with an exit code of 0.
Among the series of simple Unix commands, there's also `yes`. If you execute it without arguments, you get an infinite stream of y's, separated by a newline:
y
y
y
y
(...you get the idea)
What seems pointless at first turns out to be pretty helpful:
yes | sh boring_installation.sh
Ever installed a program that required you to type "y" and hit enter to keep going?
yes to the rescue! It will carefully fulfill its duty, so you can keep watching Pootie Tang.
Here's a basic version in... uhm... BASIC.
10 PRINT "y"
20 GOTO 10
And here's the same thing in Python:
while True:
    print("y")
Simple, eh? Not so quick!
Turns out, that program is quite slow.
python yes.py | pv -r > /dev/null
[4.17MiB/s]
Compare that with the built-in version on my Mac:
yes | pv -r > /dev/null
[34.2MiB/s]
So I tried to write a quicker version in Rust. Here's my first attempt:
- The string we want to print in a loop is the first command line parameter and is named expletive. I learned this word from the
- I use `unwrap_or` to get the expletive from the parameters. In case the parameter is not set, we use "y" as a default.
- The default parameter gets converted from a string slice (`&str`) into an owned string on the heap (`String`).
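The notes above can be sketched roughly like this. It is a reconstruction under assumptions rather than the original listing: the function names `expletive_from` and `run` are hypothetical, and the logic is split into two functions only so that the argument handling can be checked without starting an infinite loop.

```rust
use std::env;

/// Picks the string to repeat: the first command-line parameter if there is
/// one, otherwise "y". `args` is expected to start with the program name.
fn expletive_from<I: Iterator<Item = String>>(mut args: I) -> String {
    // `nth(1)` skips the program name. `unwrap_or` supplies the default, and
    // `into()` turns the `&str` literal into an owned `String`.
    args.nth(1).unwrap_or("y".into())
}

/// Reads the real command line and prints the expletive forever.
/// Intended to be the whole body of `main`.
fn run() -> ! {
    let expletive = expletive_from(env::args());
    loop {
        // `println!` takes the stdout lock and writes on every iteration.
        println!("{}", expletive);
    }
}
```

Calling `run()` from `fn main()` gives a complete program with the behavior described in the list.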
Let's test it.
cargo run --release | pv -r > /dev/null
Compiling yes v0.1.0
Finished release [optimized] target(s) in 1.0 secs
Running `target/release/yes`
[2.35MiB/s]
Whoops, that doesn't look any better. It's even slower than the Python version! That caught my attention, so I looked around for the source code of a C implementation.
Here's the very first version of the program, released with Version 7 Unix and famously authored by Ken Thompson on
No magic here.
Compare that to the 128-line version from GNU coreutils, which is mirrored on GitHub. After 25 years, it is still under active development! The last code change happened around a year ago. It is also quite fast:
# brew install coreutils
gyes | pv -r > /dev/null
[854MiB/s]
The important part is at the end:
/* Repeatedly output the buffer until there is a write error; then fail. */
while continue;
Aha! So they simply use a buffer to make write operations faster. The buffer size is defined by a constant named `BUFSIZ`, which gets chosen on each system so as to make I/O efficient (see here). On my system, that was defined as 1024 bytes. I actually had better performance with 8192 bytes.
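To see why a buffer helps, here is a small sketch that counts how often the underlying writer is actually called. Everything in it is illustrative: `CountingSink`, `unbuffered_writes` and `buffered_writes` are hypothetical names, and the in-memory sink only stands in for the system calls a real stdout would make.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that discards data but counts how often `write` is
/// called, as a stand-in for the number of system calls.
struct CountingSink {
    writes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes "y\n" `lines` times straight to the sink: one `write` per line.
fn unbuffered_writes(lines: usize) -> usize {
    let mut sink = CountingSink { writes: 0 };
    for _ in 0..lines {
        sink.write_all(b"y\n").unwrap();
    }
    sink.writes
}

/// Produces the same output, but collects it in a `capacity`-byte buffer
/// that is only handed to the sink when it is full or flushed.
fn buffered_writes(lines: usize, capacity: usize) -> usize {
    let mut out = BufWriter::with_capacity(capacity, CountingSink { writes: 0 });
    for _ in 0..lines {
        out.write_all(b"y\n").unwrap();
    }
    out.flush().unwrap();
    out.get_ref().writes
}
```

With 10,000 lines, the unbuffered variant calls the sink once per line, while the variant with an 8192-byte buffer only needs a handful of calls.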
I've extended my Rust program:
use std::env;
use ;
const BUFSIZE: usize = 8192;
The important part is that the buffer size is a multiple of four, to ensure memory alignment.
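A buffered version along these lines could look like the following sketch. It assumes `std::io::BufWriter` as the buffering mechanism, which is a guess on top of the `BUFSIZE` constant shown above. The names `write_buffered` and `run` are hypothetical, and the writer is passed in as a parameter only so the loop can be exercised against something other than stdout.

```rust
use std::env;
use std::io::{self, BufWriter, Write};

const BUFSIZE: usize = 8192;

/// Writes `expletive` plus a newline through a `BUFSIZE`-byte buffer until
/// the underlying writer reports an error, and returns that error.
fn write_buffered<W: Write>(out: W, expletive: &str) -> io::Error {
    let mut writer = BufWriter::with_capacity(BUFSIZE, out);
    loop {
        // Most iterations only copy into the buffer. The underlying writer
        // is touched when the buffer has filled up.
        if let Err(e) = writeln!(writer, "{}", expletive) {
            return e;
        }
    }
}

/// Intended body of `main`: repeat the first argument (default "y") on stdout.
fn run() {
    let expletive = env::args().nth(1).unwrap_or("y".into());
    let err = write_buffered(io::stdout(), &expletive);
    eprintln!("yes: {}", err);
}
```

Calling `run()` from `fn main()` turns this into a complete program that stops once stdout can no longer be written to.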
Running that gave me 51.3MiB/s. That's faster than the version that comes with my system, but still way slower than the results from this Reddit post that I found, where the author talks about 10.2GiB/s.
Once again, the Rust community did not disappoint.
As soon as this post hit the Rust subreddit, user nwydo pointed out a previous discussion on the same topic. Here's their optimized code, which breaks the 3GB/s mark on my machine:
use std::env;
use ;
use std::process;
use std::borrow::Cow;
use std::ffi::OsString;
pub const BUFFER_CAPACITY: usize = 64 * 1024;
Now that's a whole different ballgame!
- We prepare a filled string buffer, which will be reused for each loop.
- Stdout is protected by a lock. So, instead of constantly acquiring and releasing it, we keep it all the time.
- We use the platform-native `OsString` and `std::borrow::Cow` to avoid unnecessary allocations.
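The three points above can be sketched as follows. This is not nwydo's code, only a minimal illustration of the same ideas. The helper names `line_bytes`, `fill_buffer`, `write_until_error` and `run` are hypothetical, and the lossy string conversion is an assumption made to keep the sketch portable.

```rust
use std::borrow::Cow;
use std::env;
use std::ffi::OsString;
use std::io::{self, Write};

pub const BUFFER_CAPACITY: usize = 64 * 1024;

/// Returns the bytes to repeat: the argument plus a newline, or a borrowed
/// static "y\n" when no argument was given, so the default needs no allocation.
fn line_bytes(arg: Option<OsString>) -> Cow<'static, [u8]> {
    match arg {
        Some(s) => {
            // Assumption: a lossy UTF-8 conversion is good enough here.
            let mut bytes = s.to_string_lossy().into_owned().into_bytes();
            bytes.push(b'\n');
            Cow::Owned(bytes)
        }
        None => Cow::Borrowed(&b"y\n"[..]),
    }
}

/// Builds a buffer of at most `capacity` bytes from whole copies of `line`.
/// A line longer than `capacity` is kept once, unchanged.
fn fill_buffer(line: &[u8], capacity: usize) -> Vec<u8> {
    let copies = (capacity / line.len().max(1)).max(1);
    line.repeat(copies)
}

/// Writes the prefilled buffer over and over until a write fails, for
/// example because the reader closed the pipe, and returns that error.
fn write_until_error<W: Write>(mut out: W, buffer: &[u8]) -> io::Error {
    loop {
        if let Err(e) = out.write_all(buffer) {
            return e;
        }
    }
}

/// Intended body of `main`.
fn run() {
    let line = line_bytes(env::args_os().nth(1));
    let buffer = fill_buffer(&line, BUFFER_CAPACITY);
    let stdout = io::stdout();
    // The lock is taken once and held for the whole loop.
    let err = write_until_error(stdout.lock(), &buffer);
    eprintln!("yes: {}", err);
    std::process::exit(1);
}
```

Because the buffer is filled once up front, each loop iteration is a single large write on an already locked stdout.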
The only thing that I could contribute was removing an unnecessary
The trivial program `yes` turns out not to be so trivial after all. It uses output buffering and memory alignment to improve performance. Re-implementing Unix tools is fun and makes me appreciate the nifty tricks that make our computers fast.