Tips for Faster Rust Compile Times

Tagged with rust

When it comes to runtime performance, Rust is one of the fastest guns in the west. ๐Ÿ”ซ It is on par with the likes of C and C++ and sometimes even surpasses those. Compile times, however? That's another story.

Below is a list of tips and tricks on how to make your Rust project compile faster today. They are roughly ordered by practicality, so start at the top and work your way down until you're happy and your compiler goes brrrrrrr.

Table of Contents

Why Is Rust Compilation Slow?

Wait a sec, slow in comparison to what? That is, if you compare Rust with Go, the Go compiler is doing a lot less work in general. For example, it lacks support for generics and macros. On top of that, the Go compiler was built from scratch as a monolithic toolchain consisting of both, the frontend and the backend (rather than relying on, say, LLVM to take over the backend part, which is the case for Rust or Swift). This has advantages (more flexibility when tweaking the entire compilation process, yay) and disadvantages (higher overall maintenance cost and fewer supported architectures).

In general, comparing across different programming languages makes little sense and overall, the Rust compiler is legitimately doing a great job. That said, above a certain project size, the compile times are... let's just say they could be better.

Why Bother?

According to the Rust 2019 survey, improving compile times is #4 on the Rust wishlist:

Rust Survey results 2019. (<a href='https://xkcd.com/303/'>Obligatory xkcd</a>.)
Rust Survey results 2019. (Obligatory xkcd.)

Compile-Time vs Runtime Performance

As is often cautioned in debates among their designers, programming language design is full of tradeoffs. One of those fundamental tradeoffs is runtime performance vs. compile-time performance, and the Rust team nearly always (if not always) chose runtime over compile-time.
โ€” Brian Anderson

Overall, there are a few features and design decisions that limit Rust compilation speed:

  • Macros: Code generation with macros can be quite expensive.
  • Type checking
  • Monomorphization: this is the process of generating specialized versions of generic functions. E.g., a function that takes an Into<String> gets converted into one that takes a String and one that takes a &str.
  • LLVM: that's the default compiler backend for Rust, where a lot of the heavy-lifting (like code-optimizations) takes place. LLVM is notorious for being slow.
  • Linking: Strictly speaking, this is not part of compiling but happens right after. It "connects" your Rust binary with the system libraries. cargo does not explicitly mark the linking step, so many people add it to the overall compilation time.

If you're interested in all the gory details, check out this blog post by Brian Anderson.

Update The Rust Compiler And Toolchain

Making the Rust compiler faster is an ongoing process, and many fearless people are working on it. Thanks to their hard work, compiler speed has improved 30-40% across the board year-to-date, with some projects seeing up to 45%+ improvements.

So make sure you use the latest Rust version:

rustup update

On top of that, Rust tracks compile regressions on a website dedicated to performance. Work is also put into optimizing the LLVM backend. Rumor has it that there's still a lot of low-hanging fruit. ๐Ÿ‡

Use cargo check Instead Of cargo build

Most of the time, you don't even have to compile your project at all; you just want to know if you messed up somewhere. Whenever you can, skip compilation altogether. What you need instead is laser-fast code linting, type- and borrow-checking.

For that, cargo has a special treat for you: โœจ cargo check โœจ. Consider the differences in the number of instructions between cargo check on the left and cargo debug in the middle. (Pay attention to the different scales.)

Speedup factors: check 1, debug 5, opt 20
Speedup factors: check 1, debug 5, opt 20

A sweet trick I use is to run it in the background with cargo watch. This way, it will cargo check whenever you change a file.

โญ Pro-tip: Use cargo watch -c to clear the screen before every run.

Use Rust Analyzer Instead Of Rust Language Server (RLS)

Another quick way to check if you set the codebase on fire is to use a "language server". That's basically a "linter as a service", that runs next to your editor.

For a long time, the default choice here was RLS, but lately, folks moved over to rust-analyzer, because it's more feature-complete and way more snappy. It supports all major IDEs. Switching to that alone might save your day.

Remove Unused Dependencies

So let's say you tried all of the above and find that compilation is still slow. What now?

Dependencies sometimes become obsolete thanks to refactoring. From time to time it helps to check if all of them are still needed to save compile time.

If this is your own project (or a project you like to contribute to), do a quick check if you can toss anything with cargo-udeps:

cargo install cargo-udeps && cargo +nightly udeps

Update Remaining Dependencies

Next, update your dependencies, because they themselves could have tidied up their dependency tree lately.

Take a deep dive with cargo-outdated or cargo tree (built right into cargo itself) to find any outdated dependencies. On top of that, use cargo audit to get notified about any vulnerabilities which need to be addressed, or deprecated crates which need a replacement.

Here's a nice workflow that I learned from /u/oherrala on Reddit:

  1. Run cargo update to update to the latest semver compatible version.
  2. Run cargo outdated -wR to find newer, possibly incompatible dependencies. Update those and fix code as needed.
  3. Find duplicate versions of a dependency and figure out where they come from: cargo tree --duplicate shows dependencies which come in multiple versions.
    (Thanks to /u/dbdr for pointing this out.)

โญ Pro-tip: Step 3 is a great way to contribute back to the community! Clone the repository and execute steps 1 and 2. Finally, send a pull request to the maintainers.

Replace Heavy Dependencies

From time to time, it helps to shop around for more lightweight alternatives to popular crates.

Again, cargo tree is your friend here to help you understand which of your dependencies are quite heavy: they require many other crates, cause excessive network I/O and slow down your build. Then search for lighter alternatives.

Also, cargo-bloat has a --time flag that shows you the per-crate build time. Very handy!

Here are a few examples:

Here's an example where switching crates reduced compile times from 2:22min to 26 seconds.

Use Cargo Workspaces

Cargo has that neat feature called workspaces, which allow you to split one big crate into multiple smaller ones. This code-splitting is great for avoiding repetitive compilation because only crates with changes have to be recompiled. Bigger projects like servo and vector are using workspaces heavily to slim down compile times. Learn more about workspaces here.

Combine All Integration Tests In A Single Binary

Have any integration tests? (These are the ones in your tests folder.) Did you know that the Rust compiler will create a binary for every single one of them? And every binary will have to be linked individually. This can take most of your build time because linking is slooow. ๐Ÿข The reason is that many system linkers (like ld) are single threaded.

๐Ÿ‘จโ€๐Ÿณ๏ธ๐Ÿ’กโ€๏ธ A linker is a tool that combines the output of a compiler and mashes that into one executable you can run.

To make the linker's job a little easier, you can put all your tests in one crate. (Basically create a main.rs in your test folder and add your test files as mod in there.)

Then the linker will go ahead and build a single binary only. Sounds nice, but careful: it's still a trade-off as you'll need to expose your internal types and functions (i.e. make them pub).

Might be worth a try, though because a recent benchmark revealed a 1.9x speedup for one project.

This tip was brought to you by Luca Palmieri, Lucio Franco, and Azriel Hoh. Thanks!

Disable Unused Features Of Crate Dependencies

Check the feature flags of your dependencies. A lot of library maintainers take the effort to split their crate into separate features that can be toggled off on demand. Maybe you don't need all the default functionality from every crate?

For example, tokio has a ton of features that you can disable if not needed.

Another example is bindgen, which enables clap support by default for its binary usage. This isn't needed for library usage, which is the common use-case. Disabling that feature improved compile time of rust-rocksdb by ~13s and ~9s for debug and release builds respectively. Thanks to reader Lilian Anatolie Moraru for mentioning this.

โš ๏ธ Fair warning: it seems that switching off features doesn't always improve compile time. (See tikv's experiences here.) It may still be a good idea for improving security be reducing the code's attack surface.

A quick way to list all features of a crate is cargo-feature-set.

Admittedly, features are not very discoverable at the moment because there is no standard way to document them, but we'll get there eventually.

Use A Ramdisk For Compilation

When starting to compile heavy projects, I noticed that I was throttled on I/O. The reason was that I kept my projects on a measly HDD. A more performant alternative would be SSDs, but if that's not an option, don't throw in the sponge just yet.

Ramdisks to the rescue! These are like "virtual harddisks" that live in system memory.

User moschroe_de shared the following snippet over on Reddit, which creates a ramdisk for your current Rust project (on Linux):

mkdir -p target && \
sudo mount -t tmpfs none ./target && \
cat /proc/mounts | grep "$(pwd)" | sudo tee -a /etc/fstab

On macOS, you could probably do something similar with this script. I haven't tried that myself, though.

Cache Dependencies With sccache

Another neat project is sccache by Mozilla, which caches compiled crates to avoid repeated compilation.

I had this running on my laptop for a while, but the benefit was rather negligible, to be honest. It works best if you work on a lot of independent projects that share dependencies (in the same version). A common use-case is shared build servers.

Cranelift โ€“ The Alternative Rust Compiler

Lately, I was excited to hear that the Rust project is using an alternative compiler that runs in parallel with rustc for every CI build: Cranelift, also called CG_CLIF.

Here is a comparison between rustc and Cranelift for some popular crates (blue means better):

LLVM compile time comparison between rustc and cranelift in favor of cranelift
LLVM compile time comparison between rustc and cranelift in favor of cranelift

Somewhat unbelieving, I tried to compile vector with both compilers.

The results were astonishing:

  • Rustc: 5m 45s
  • Cranelift: 3m 13s

I could really notice the difference! What's cool about this is that it creates fully working executable binaries. They won't be optimized as much, but they are great for local development.

A more detailed write-up is on Jason Williams' page, and the project code is on Github.

Switch To A Faster Linker

The thing that nobody seems to target is linking time. For me, when using something with a big dependency tree like Amethyst, for example linking time on my fairly recent Ryzen 7 1700 is ~10s each time, even if I change only some minute detail only in my code. โ€” /u/Almindor on Reddit

You can check how long your linker takes by running the following commands:

cargo clean
cargo +nightly rustc --bin <your_binary_name> -- -Z time-passes

It will output the timings of each step, including link time:

...
time:   0.000   llvm_dump_timing_file
time:   0.001   serialize_work_products
time:   0.002   incr_comp_finalize_session_directory
time:   0.004   link_binary_check_files_are_writeable
time:   0.614   run_linker
time:   0.000   link_binary_remove_temps
time:   0.620   link_binary
time:   0.622   link_crate
time:   0.757   link
time:   3.836   total
    Finished dev [unoptimized + debuginfo] target(s) in 42.75s

If the link steps account for a big percentage of the build time, consider switching over to a different linker. There are quite a few options.

According to the official documentation, "LLD is a linker from the LLVM project that is a drop-in replacement for system linkers and runs much faster than them. [..] When you link a large program on a multicore machine, you can expect that LLD runs more than twice as fast as the GNU gold linker. Your mileage may vary, though."

If you're on Linux you can switch to lld like so:

[target.x86_64-unknown-linux-gnu]
rustflags = [
    "-C", "link-arg=-fuse-ld=lld",
]

A word of caution: lld might not be working on all platforms yet. At least on macOS, Rust support seems to be broken at the moment, and the work on fixing it has stalled (see rust-lang/rust#39915).

Update: I recently learned about another linker called mold, which claims a massive 12x performance bump over lld. Compared to GNU gold, it's said to be more than 50x. Would be great if anyone could verify and send me a message.

Update II: Aaand another one called zld, which is a drop-in replacement for Apple's ld linker and is targeting debug builds. [Source]

The zld benchmarks are quite impressive.
The zld benchmarks are quite impressive.

Which one you want to choose depends on your requirements. Which platforms do you need to support? Is it just for local testing or for production usage?

mold is optimized for Linux, zld only works on macOS. For production use, lld might be the most mature option.

Faster Incremental Debug Builds On Macos

Rust 1.51 added an interesting flag for faster incremental debug builds on macOS. It can make debug builds up to seconds faster (depending on your use-case). Just add this to your Cargo.toml:

[profile.dev]
split-debuginfo = "unpacked"

Some engineers report that this flag alone reduces compilation times on macOS by 70%.

The flag might become the standard for macOS soon. It is already the default on nightly.

Tweak More Codegen Options / Compiler Flags

Rust comes with a huge set of settings for code generation. It can help to look through the list and tweak the parameters for your project.

There are many gems in the full list of codegen options. For inspiration, here's bevy's config for faster compilation.

Profile Compile Times

If you like to dig deeper, Rust compilation can be profiled with cargo rustc -- -Zself-profile. The resulting trace file can be visualized with a flamegraph or the Chromium profiler:

Image of Chrome profiler with all crates
Image of Chrome profiler with all crates
Source: Rust Lang Blog

There's also a cargo -Z timings feature that gives some information about how long each compilation step takes, and tracks concurrency information over time.

You might have to run it using the nightly compiler:

cargo +nightly build -Z timings

Another golden one is cargo-llvm-lines, which shows the number of lines generated and objects copied in the LLVM backend:

$ cargo llvm-lines | head -20

  Lines        Copies         Function name
  -----        ------         -------------
  30737 (100%)   1107 (100%)  (TOTAL)
   1395 (4.5%)     83 (7.5%)  core::ptr::drop_in_place
    760 (2.5%)      2 (0.2%)  alloc::slice::merge_sort
    734 (2.4%)      2 (0.2%)  alloc::raw_vec::RawVec<T,A>::reserve_internal
    666 (2.2%)      1 (0.1%)  cargo_llvm_lines::count_lines
    490 (1.6%)      1 (0.1%)  <std::process::Command as cargo_llvm_lines::PipeTo>::pipe_to
    476 (1.5%)      6 (0.5%)  core::result::Result<T,E>::map
    440 (1.4%)      1 (0.1%)  cargo_llvm_lines::read_llvm_ir
    422 (1.4%)      2 (0.2%)  alloc::slice::merge
    399 (1.3%)      4 (0.4%)  alloc::vec::Vec<T>::extend_desugared
    388 (1.3%)      2 (0.2%)  alloc::slice::insert_head
    366 (1.2%)      5 (0.5%)  core::option::Option<T>::map
    304 (1.0%)      6 (0.5%)  alloc::alloc::box_free
    296 (1.0%)      4 (0.4%)  core::result::Result<T,E>::map_err
    295 (1.0%)      1 (0.1%)  cargo_llvm_lines::wrap_args
    291 (0.9%)      1 (0.1%)  core::char::methods::<impl char>::encode_utf8
    286 (0.9%)      1 (0.1%)  cargo_llvm_lines::run_cargo_rustc
    284 (0.9%)      4 (0.4%)  core::option::Option<T>::ok_or_else

Avoid Procedural Macro Crates

Procedural macros are the hot sauce of Rust development: they burn through CPU cycles so use with care (keyword: monomorphization).

Update: Over on Twitter Manish pointed out that "the reason proc macros are slow is that the (excellent) proc macro infrastructure โ€“ syn and friends โ€“ are slow to compile. Using proc macros themselves does not have a huge impact on compile times." (This might change in the future.)

Manish goes on to say

This basically means that if you use one proc macro, the marginal compile time cost of adding additional proc macros is insignificant. A lot of people end up needing serde in their deptree anyway, so if you are forced to use serde, you should not care about proc macros.

If you are not forced to use serde, one thing a lot of folks do is have serde be an optional dependency so that their types are still serializable if necessary.

If you heavily use procedural macros in your project (e.g., if you use serde), you can try to sidestep their impact on compile times with watt, a tool that offloads macro compilation to Webassembly.

From the docs:

By compiling macros ahead-of-time to Wasm, we save all downstream users of the macro from having to compile the macro logic or its dependencies themselves.

Instead, what they compile is a small self-contained Wasm runtime (~3 seconds, shared by all macros) and a tiny proc macro shim for each macro crate to hand off Wasm bytecode into the Watt runtime (~0.3 seconds per proc-macro crate you depend on). This is much less than the 20+ seconds it can take to compile complex procedural macros and their dependencies.

Note that this crate is still experimental.

(Oh, and did I mention that both, watt and cargo-llvm-lines were built by David Tolnay, who is a frickin' steamroller of an engineer?)

Get Dedicated Hardware

If you reached this point, the easiest way to improve compile times even more is probably to spend money on top-of-the-line hardware.

Perhaps a bit surprisingly, the fastest machines for Rust compiles seem to be Apple machines with an M1 chip:

Rik Arends on Twitter
Rik Arends on Twitter

The benchmarks for the new Macbook Pro with M1 Max are absolutely ridiculous โ€” even in comparison to the already fast M1:

ProjectM1 MaxM1 Air
Deno6m11s11m15s
MeiliSearch1m28s3m36s
bat43s1m23s
hyperfine23s42s
ripgrep16s37s

That's a solid 2x performance improvement compared to an already fast M1.

But if you rather like to stick to Linux, people also had great success with a multicore CPU like an AMD Ryzen Threadripper and 32 GB of RAM.

On portable devices, compiling can drain your battery and be slow. To avoid that, I'm using my machine at home, a 6-core AMD FX 6300 with 12GB RAM, as a build machine. I can use it in combination with Visual Studio Code Remote Development.

Compile in the Cloud

If you don't have a dedicated machine yourself, you can offload the compilation process to the cloud instead.
Gitpod.io is superb for testing a cloud build as they provide you with a beefy machine (currently 16 core Intel Xeon 2.80GHz, 60GB RAM) for free during a limited period. Simply add https://gitpod.io/# in front of any Github URL. Here is an example for one of my Hello Rust episodes.

Gitpod has a neat feature called prebuilds. From their docs:

Whenever your code changes (e.g. when new commits are pushed to your repository), Gitpod can prebuild workspaces. Then, when you do create a new workspace on a branch, or Pull/Merge Request, for which a prebuild exists, this workspace will load much faster, because all dependencies will have been already downloaded ahead of time, and your code will be already compiled.

Especially when reviewing pull requests, this could give you a nice speedup. Prebuilds are quite customizable; take a look at the .gitpod.yml config of nushell to get an idea.

Download ALL The Crates

If you have a slow internet connection, a big part of the initial build process is fetching all those shiny crates from crates.io. To mitigate that, you can download all crates in advance to have them cached locally. criner does just that:

git clone https://github.com/the-lean-crate/criner
cd criner
cargo run --release -- mine

The archive size is surprisingly reasonable, with roughly 50GB of required disk space (as of today).

Bonus: Speed Up Rust Docker Builds ๐Ÿณ

Building Docker images from your Rust code? These can be notoriously slow, because cargo doesn't support building only a project's dependencies yet, invalidating the Docker cache with every build if you don't pay attention. cargo-chef to the rescue! โšก

cargo-chef can be used to fully leverage Docker layer caching, therefore massively speeding up Docker builds for Rust projects. On our commercial codebase (~14k lines of code, ~500 dependencies) we measured a 5x speed-up: we cut Docker build times from ~10 minutes to ~2 minutes.

Here is an example Dockerfile if you're interested:

# Step 1: Compute a recipe file
FROM rust as planner
WORKDIR app
RUN cargo install cargo-chef
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

# Step 2: Cache project dependencies
FROM rust as cacher
WORKDIR app
RUN cargo install cargo-chef
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json

# Step 3: Build the binary
FROM rust as builder
WORKDIR app
COPY . .
# Copy over the cached dependencies from above
COPY --from=cacher /app/target target
COPY --from=cacher /usr/local/cargo /usr/local/cargo
RUN cargo build --release --bin app

# Step 4:
# Create a tiny output image.
# It only contains our final binary.
FROM rust as runtime
WORKDIR app
COPY --from=builder /app/target/release/app /usr/local/bin
ENTRYPOINT ["./usr/local/bin/app"]

cargo-chef can help speed up your continuous integration with Github Actions or your deployment process to Google Cloud.

Drastic Measures: Overclock Your CPU? ๐Ÿ”ฅ

โš ๏ธ Warning: You can damage your hardware if you don't know what you are doing. Proceed at your own risk.

Here's an idea for the desperate. Now I don't recommend that to everyone, but if you have a standalone desktop computer with a decent CPU, this might be a way to squeeze out the last bits of performance.

Even though the Rust compiler executes a lot of steps in parallel, single-threaded performance is still quite relevant.

As a somewhat drastic measure, you can try to overclock your CPU. Here's a tutorial for my processor. (I owe you some benchmarks from my machine.)

Speeding Up Your CI Builds

If you collaborate with others on a Rust project, chances are you use some sort of continuous integration like Github Actions. Optimizing a CI build processes is a whole subject on its own. Thankfully Aleksey Kladov (matklad) collected a few tips on his blog. He touches on bors, caching, splitting build steps, disabling compiler features like incremental compilation or debug output, and more. It's a great read and you can find it here.

Upstream Work

Making the Rust compiler faster is an ongoing process, and many fearless people are working on it. Thanks to their hard work, compiler speed has improved 30-40% across the board year-to-date, with some projects seeing up to 45%+ improvements. On top of that, Rust tracks compile regressions on a website dedicated to performance

Work is also put into optimizing the LLVM backend. Rumor has it that there's still a lot of low-hanging fruit. ๐Ÿ‡

The Rust team is also continuously working to make the compiler faster. Here's an extract of the 2020 survey:

One continuing topic of importance to the Rust community and the Rust team is improving compile times. Progress has already been made with 50.5% of respondents saying they felt compile times have improved. This improvement was particularly pronounced with respondents with large codebases (10,000 lines of code or more) where 62.6% citing improvement and only 2.9% saying they have gotten worse. Improving compile times is likely to be the source of significant effort in 2021, so stay tuned!

Help Others: Upload Leaner Crates For Faster Build Times

cargo-diet helps you build lean crates that significantly reduce download size (sometimes by 98%). It might not directly affect your own build time, but your users will surely be thankful. ๐Ÿ˜Š

More Resources

What's Next?

Phew! That was a long list. ๐Ÿ˜… If you have any additional tips, please let me know.

If compiler performance is something you're interested in, why not collaborate on a tool to see what user code is causing rustc to use lots of time?

Also, once you're done optimizing your build times, how about optimizing runtimes next? My friend Pascal Hertleif has a nice article on that.

Thanks for reading! I mostly write about Rust and my (open-source) projects. If you would like to receive future posts automatically, you can subscribe via RSS or email:

Submit to HN Sponsor me on Github My Amazon wish list

Thanks to DocWilco, Hendrik Maus, Luca Pizzamiglio for reviewing drafts of this article.