How Does The Unix `history` Command Work?

Cozy attic created by [vectorpouch](https://www.freepik.com/vectors/poster) and tux created by [catalyststuff](https://www.freepik.com/vectors/baby) (freepik.com)

As the day is winding down, I have a good hour just to myself. Perfect time to listen to some Billy Joel (it's either Billy Joel or Billie Eilish for me these days) and learn how the Unix history command works. Life is good.

Learning what makes Unix tick is a bit of a hobby of mine.
I covered yes, ls, and cat before. Don't judge.

How does history even work?

Every command is tracked, so I see the last few commands on my machine when I run history.

❯❯❯ history
8680  cd endler.dev
8682  cd content/2021
8683  mkdir history
8684  cd history
8685  vim index.md

Yeah, but how does it do that?

The manpage on my Mac is not really helpful, and I couldn't find much elsewhere at first either.

I found this article (it's good etiquette nowadays to warn you that this is a Medium link) and it describes a bit of what's going on.

Every command is stored in $HISTFILE, which points to ~/.zsh_history for me.

❯❯❯ tail $HISTFILE
: 1586007759:0;cd endler.dev
: 1586007763:0;cd content/2021
: 1586007771:0;mkdir history
: 1586007772:0;cd history
: 1586007777:0;vim index.md
...

So let's see. We got a : followed by a timestamp followed by :0, then a separator (;) and finally the command itself. Each new command gets appended to the end of the file. Not too hard to recreate.

Hold on, what's that 0 about!?

It turns out it's the command duration, and the entire thing is called the extended history format:

: <beginning time>:<elapsed seconds>;<command>

(Depending on your settings, your file might look different.)
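To convince myself that the format really is that simple, here is a minimal sketch of a parser for one such line. The names HistoryEntry and parse_line are made up for this example, and it assumes single-line entries only; it is not how zsh itself reads the file.

```rust
/// One entry in the extended history format:
/// `: <beginning time>:<elapsed seconds>;<command>`
/// (hypothetical type, only for illustration)
#[derive(Debug, PartialEq)]
struct HistoryEntry {
    started_at: u64,
    elapsed_secs: u64,
    command: String,
}

/// Parses a single line and returns `None` if it doesn't match the format.
fn parse_line(line: &str) -> Option<HistoryEntry> {
    // Every extended-format line starts with ": ".
    let rest = line.strip_prefix(": ")?;
    // Split at the FIRST ';' so that commands containing ';' stay intact.
    let (meta, command) = rest.split_once(';')?;
    let (start, elapsed) = meta.split_once(':')?;
    Some(HistoryEntry {
        started_at: start.trim().parse::<u64>().ok()?,
        elapsed_secs: elapsed.trim().parse::<u64>().ok()?,
        command: command.to_string(),
    })
}

fn main() {
    let entry = parse_line(": 1586007759:0;cd endler.dev");
    println!("{:?}", entry);
}
```

Lines in the plain (non-extended) format make parse_line return None, so a real tool would need a fallback for those.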

Hooking into history

But still, how does history really work?

It must run some code whenever I execute a command — a hook of some sort!

💥 Swoooooosh 💥

Matthias from the future steps out of a blinding ball of light: Waaait! That's not really how it works!

It turns out that shells like bash and zsh don't actually call a hook for history. Why should they? When history is a shell builtin, they can just track the commands internally.

Thankfully my editor-in-chief and resident Unix neckbeard Simon Brüggen explained that to me — but only after I sent him the first draft for this article. 😓

As such, the next section is a bit like Lord of the Rings: a sympathetic but naive fellow on a questionable mission with no clue of what he's getting himself into.

In my defense, Lord of the Rings is also enjoyed primarily for its entertainment value, not its historical accuracy... and just like in this epic story, I promise we'll get to the bottom of things in the end.

I found add-zsh-hook and a usage example in atuin's source code.

I might not fully comprehend all that is written there, but I'm a man of action, and I can take a solid piece of work and tear it apart.

It's not much, but here's what I got:

# Source this in your ~/.zshrc
autoload -U add-zsh-hook

_past_preexec(){
    echo "preexec"
}

_past_precmd(){
    echo "precmd"
}

add-zsh-hook preexec _past_preexec
add-zsh-hook precmd _past_precmd

This sets up two hooks: preexec gets called right before a command is executed, and precmd gets called right before the next prompt is drawn, which means directly after the command has finished. (I decided to call my little history replacement past. I like short names.)

Okay, let's source this file so that zsh registers the hooks and runs them whenever we execute a command:

source src/shell/past.zsh

...aaaaaand

❯❯❯ date
preexec
Fri May 28 18:53:55 CEST 2021
precmd

It works! ✨ How exciting!

Actually, I just remembered that I did the same thing for my little environment settings manager envy over two years ago, but hey!

So what to do with our newly acquired power?

Let's Run Some Rust Code

Here's the thing: only preexec gets the "real" command. precmd gets nothing:

_past_preexec(){
    echo "preexec $@"
}

_past_precmd(){
    echo "precmd $@"
}

$@ means "show me what you got" and here's what it got:

❯❯❯ date
preexec date date date
Fri May 28 19:02:11 CEST 2021
precmd

Shouldn't one "date" be enough?
Hum... let's look at the zsh documentation for preexec:

If the history mechanism is active [...], the string that the user typed is passed as the first argument, otherwise it is an empty string. The actual command that will be executed (including expanded aliases) is passed in two different forms: the second argument is a single-line, size-limited version of the command (with things like function bodies elided); the third argument contains the full text that is being executed.

I don't know about you, but the third argument should be all we ever need? 🤨

Checking...

❯❯❯ ls -l
preexec ls -l lsd -l lsd -l

(Shout out to lsd, the next-gen ls command.)

Alright, good enough. Let's parse $3 with some Rust code and write it to our own history file.

use std::env;
use std::error::Error;
use std::fs::OpenOptions;
use std::io::Write;

const HISTORY_FILE: &str = "lol";

fn main() -> Result<(), Box<dyn Error>> {
    let mut history = OpenOptions::new()
        .create(true)
        .append(true)
        .open(HISTORY_FILE)?;

    if let Some(command) = env::args().nth(3) {
        writeln!(history, "{}", command)?;
    }
    Ok(())
}

❯❯❯ cargo run -- dummy dummy hello
❯❯❯ cargo run -- dummy dummy world
❯❯❯ cat lol
hello
world

We're almost done — at least if we're willing to cheat a bit. 😏 Let's hardcode that format string:

use std::env;
use std::error::Error;
use std::fs::OpenOptions;
use std::io::Write;
use std::time::SystemTime;

const HISTORY_FILE: &str = "lol";

fn timestamp() -> Result<u64, Box<dyn Error>> {
    let n = SystemTime::now().duration_since(SystemTime::UNIX_EPOCH)?;
    Ok(n.as_secs())
}

fn main() -> Result<(), Box<dyn Error>> {
    let mut history = OpenOptions::new()
        .create(true)
        .append(true)
        .open(HISTORY_FILE)?;

    if let Some(command) = env::args().nth(3) {
        writeln!(history, ": {}:0;{}", timestamp()?, command)?;
    }
    Ok(())
}

Now, if we squint a little, it sorta kinda writes our command in my history format. (That part about the Unix timestamp was taken straight from the docs. Zero regrets.)

Remember when I said that precmd gets nothing?

I lied.

In reality, you can read the exit code of the executed command (from $?). That's very helpful, but let's just agree to ignore that and never talk about it again.

With this out of the way, our final past.zsh hooks file looks like this:

autoload -U add-zsh-hook

_past_preexec(){
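    # Quote "$@" so that an empty first argument is not dropped,
    # which would shift the command out of position 3.
    past "$@"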
}

add-zsh-hook preexec _past_preexec

Now here comes the dangerous part! Step back while I replace the original history command with my own. Never try this at home. (Actually I'm exaggerating a bit. Feel free to try it. Worst thing that will happen is that you'll lose a bit of history, but don't sue me.)

First, let's change the path to the history file to my real one:

// You should read the ${HISTFILE} env var instead ;)
const HISTORY_FILE: &str = "/Users/mendler/.zhistory";
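For the record, here is a rough sketch of what reading the environment variable could look like. The helper history_path and the fallback file name .zsh_history are my own assumptions for illustration. As far as I know, zsh does not export HISTFILE to child processes by default, so this only works if you add export HISTFILE to your ~/.zshrc.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: prefer the value of `HISTFILE`, otherwise fall back
/// to `<home>/.zsh_history` (the fallback name is an assumption).
fn history_path(histfile: Option<String>, home: Option<String>) -> Option<PathBuf> {
    match histfile {
        // Use HISTFILE if it is set and not empty.
        Some(path) if !path.is_empty() => Some(PathBuf::from(path)),
        // Otherwise build a default path below the home directory, if known.
        _ => home.map(|h| PathBuf::from(h).join(".zsh_history")),
    }
}

fn main() {
    let path = history_path(env::var("HISTFILE").ok(), env::var("HOME").ok());
    match path {
        Some(p) => println!("history file: {}", p.display()),
        None => eprintln!("neither HISTFILE nor HOME is set"),
    }
}
```

Passing the two values in as arguments instead of reading them inside the function keeps the lookup logic easy to test.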

Then let's install past:

❯❯❯ cargo install --path .
# bleep bloop...

After that, it's ready to use. Let's add that bad boy to my ~/.zshrc:

source "/Users/mendler/Code/private/past/src/shell/past.zsh"

And FINALLY we can test it.

We open a new shell and run a few commands followed by history:

❯❯❯ date
...
❯❯❯ ls
...
❯❯❯ it works
...
❯❯❯ history
 1011  date
 1012  ls
 1013  it works

Yay. The source code for past is on GitHub.

How it really really works

Our experiment was a great success, but I have since learned that reality is a bit different.

"In early versions of Unix the history command was a separate program", but most modern shells have history built in.

zsh tracks the history in its main run loop. Here are the important bits. (Assume all types are in scope.)

Eprog prog;

/* Main zsh run loop */
for (;;)
{
    /* Init history */
    hbegin(1);
    if (!(prog = parse_event(ENDINPUT)))
    {
        /* Couldn't parse command. Stop history */
        hend(NULL);
        continue;
    }
    /* Store command in history */
    if (hend(prog))
    {
        LinkList args;
        args = newlinklist();
        addlinknode(args, hist_ring->node.nam);
        addlinknode(args, dupstring(getjobtext(prog, NULL)));
        addlinknode(args, cmdstr = getpermtext(prog, NULL, 0));

        /* Here's the preexec hook that we used.
        * It gets passed all the args we saw earlier.
        */
        callhookfunc("preexec", args, 1, NULL);

        /* Main routine for executing a command */
        execode(prog);
    }
}

The history lines are kept in a hash, and also in a ring-buffer to prevent the history from getting too big. (See here.)

That's smart! Without the ring-buffer, a malicious user could just flood the history with random commands until the shell runs out of memory. I never thought of that.
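To get a feeling for the idea, here is a small sketch of a bounded history in Rust. This is not zsh's actual implementation (that one is written in C and pairs the ring with a hash table); the History type and its capacity parameter are invented for this example, and it only shows the "drop the oldest entry once full" behaviour.

```rust
use std::collections::VecDeque;

/// A bounded history (hypothetical type): once `capacity` entries are
/// stored, adding a new one drops the oldest.
struct History {
    capacity: usize,
    entries: VecDeque<String>,
}

impl History {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: VecDeque::with_capacity(capacity),
        }
    }

    /// Appends a command, evicting the oldest entry if the buffer is full.
    fn push(&mut self, command: &str) {
        if self.capacity == 0 {
            return; // nothing can ever be stored
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(command.to_string());
    }

    /// Returns the stored commands, oldest first.
    fn commands(&self) -> Vec<&str> {
        self.entries.iter().map(|s| s.as_str()).collect()
    }
}

fn main() {
    let mut history = History::new(3);
    for cmd in ["cd endler.dev", "cd content/2021", "mkdir history", "cd history"] {
        history.push(cmd);
    }
    // Only the three most recent commands are left.
    for cmd in history.commands() {
        println!("{}", cmd);
    }
}
```

No matter how many commands get pushed, memory use stays proportional to the capacity, which is the whole point of the ring.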

History time (see what I did there?)

The original history command was added to the Unix C shell (csh) in 1978. Here's a link to the paper by Bill Joy (hey, another Bill!). He took inspiration from the REDO command in Interlisp. You can find its specification in the original Interlisp manual in section 8.7.

Lessons learned

  • Rebuild what you don't understand.
  • The history file is human-readable and pretty straightforward.
  • The history command is a shell builtin, but we can use hooks to write our own.
  • Fun fact: Did you know that in zsh, history is actually just an alias for fc -l? More info here or check out the source code.

“What I cannot create, I do not understand” — Richard Feynman


    Thanks to Simon Brüggen for reviewing drafts of this article.