Launching a URL Shortener in Rust using Rocket
One common systems design task in interviews is to sketch the software architecture of a URL shortener (a bit.ly clone, if you will). Since I was playing around with Rocket – a web framework for Rust – why not give it a try?
Requirements
A URL shortener has two main responsibilities:
- Create a short URL for a longer one (d’oh!).
- Redirect to the longer link when the short link is requested.
Let’s call our service rust.ly (hint, hint: the domain is still available at the time of writing…).
First, let’s create a new Rust project:
cargo new --bin rustly
Next, we add Rocket to our Cargo.toml:
rocket = "0.2.4"
rocket_codegen = "0.2.4"
Warning: Most likely you need to get the very newest Rocket version. Otherwise, you might get some… entertaining error messages. Find the newest version on crates.io.
Since Rocket requires cutting-edge Rust features, we need to use a recent nightly build. Rustup provides a simple way to switch between stable and nightly.
🤔 Nightly Rust might no longer be required. Has anyone tried without and can report back?
rustup update && rustup override set nightly
A first prototype
Now we can start coding our little service. First, let’s write a simple “hello world” skeleton to get started. Put this into src/main.rs:
extern crate rocket;
Under the hood, Rocket is doing some magic to enable this nice syntax. More specifically, we use the rocket_codegen crate for that.
To bring the rocket library into scope, we write extern crate rocket; at the top of the file.
We define two routes for our service. Both respond to a GET request. This is done by adding an attribute named get to a function. The attribute can take additional arguments. In our case, we define an id variable for the lookup endpoint and a url variable for the shorten endpoint. Both variables are Unicode string slices. Since Rust has awesome Unicode support, we respond with a nice emoji just to show off. 🕶
Lastly, we need a main function, which launches Rocket and mounts our two routes. This way, they become publicly available. If you want to know more of the in-depth details, have a look at the official Rocket documentation.
Let’s check if we’re on the right track by running the application.
cargo run
After some compiling, you should get some lovely startup output from Rocket:
🔧 Configured for development.
=> address: localhost
=> port: 8000
=> log: normal
=> workers: 8
🛰 Mounting '/':
=> GET /<hash>
🛰 Mounting '/shorten':
=> GET /shorten/<url>
🚀 Rocket has launched from http://localhost:8000...
Sweet! Let’s call our service.
> curl localhost:8000/shorten/www.endler.dev
💾 You shortened www.endler.dev. Magnificent!
> curl localhost:8000/www.endler.dev
⏩ You requested www.endler.dev. Wonderful!
So far so good.
Data storage and lookup
We need to keep the shortened URLs over many requests… but how? In a production scenario, we could use some NoSQL data store like Redis for that. Since the goal is to play with Rocket and learn some Rust, we will simply use an in-memory store.
Rocket has a feature for that called managed state. In our case, we want to manage a repository of URLs.
First, let’s create a file named src/repository.rs:
use std::collections::HashMap;
use shortener::Shortener;
Within this module we first import the HashMap implementation from the standard library. We also include shortener::Shortener, which helps us shorten the URLs in the next step. Don’t worry too much about that for now. By convention, we implement a new() method to create a Repository struct with an empty HashMap and a new Shortener. Additionally, we have two methods, store and lookup.
store takes a URL and writes it to our in-memory HashMap storage. It uses our yet-to-be-defined shortener to create a unique ID and returns that ID for the entry. lookup fetches the URL for a given ID from the storage and returns it as an Option. If the ID is found, the return value will be Some(url); if there is no match, it will return None.
Note that we convert the string slices (&str) to String using the to_string() method. This way we don’t need to deal with lifetimes. As a beginner, don’t think too hard about them.
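To make the description above concrete, here is a minimal sketch of such a repository using only the standard library. It is not the original listing: the Shortener is replaced by a plain counter (an assumption, so the snippet stands on its own), and the field names urls and next_id are made up for illustration.

```rust
use std::collections::HashMap;

/// In-memory store mapping short ids to full URLs.
pub struct Repository {
    urls: HashMap<String, String>,
    // Stand-in for the article's `Shortener` (assumption): a plain counter.
    next_id: u64,
}

impl Repository {
    /// Creates an empty repository.
    pub fn new() -> Repository {
        Repository {
            urls: HashMap::new(),
            next_id: 0,
        }
    }

    /// Stores `url` under a fresh id and returns that id.
    pub fn store(&mut self, url: &str) -> String {
        let id = self.next_id.to_string();
        self.next_id += 1;
        // `to_string()` copies the slice into an owned `String`,
        // so the map never borrows from the caller (no lifetimes needed).
        self.urls.insert(id.clone(), url.to_string());
        id
    }

    /// Returns the URL stored under `id`, or `None` if the id is unknown.
    pub fn lookup(&self, id: &str) -> Option<&String> {
        self.urls.get(id)
    }
}
```

Calling store hands back an id, and passing that id to lookup yields Some(url); any other id yields None.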
Additional remarks (can safely be skipped)
A seasoned (Rust) developer™ might do a few things differently here. Did you notice the tight coupling between the repository and the shortener? In a production system, Repository and Shortener might simply be concrete implementations of traits (which are a bit like interfaces in other languages, but more powerful). For example, Repository could implement a Cache trait.
This way we get a clear separation of concerns, and we can easily switch to a different implementation (e.g. a RedisCache). Also, we could have a MockRepository to simplify testing. The same goes for Shortener.
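A rough sketch of what such a trait could look like follows. The method signatures, the helper shorten_and_resolve, and the example.com URL are assumptions for illustration; ids are simply the number of stored entries to keep the snippet short.

```rust
use std::collections::HashMap;

/// Hypothetical trait describing what the routes need from a URL store.
pub trait Cache {
    fn store(&mut self, url: &str) -> String;
    fn lookup(&self, id: &str) -> Option<String>;
}

/// In-memory implementation, standing in for the article's `Repository`.
pub struct Repository {
    urls: HashMap<String, String>,
}

impl Repository {
    pub fn new() -> Repository {
        Repository { urls: HashMap::new() }
    }
}

impl Cache for Repository {
    fn store(&mut self, url: &str) -> String {
        // Simplification: the id is the current number of entries.
        let id = self.urls.len().to_string();
        self.urls.insert(id.clone(), url.to_string());
        id
    }

    fn lookup(&self, id: &str) -> Option<String> {
        self.urls.get(id).cloned()
    }
}

/// Test double that ignores its input and always answers the same way.
pub struct MockRepository;

impl Cache for MockRepository {
    fn store(&mut self, _url: &str) -> String {
        "mock".to_string()
    }

    fn lookup(&self, _id: &str) -> Option<String> {
        Some("https://example.com".to_string())
    }
}

/// Code written against the trait works with any implementation.
pub fn shorten_and_resolve<C: Cache>(cache: &mut C, url: &str) -> Option<String> {
    let id = cache.store(url);
    cache.lookup(&id)
}
```

A RedisCache would be one more impl Cache block, and the calling code would not need to change.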
On top of that, you might want to use the Into trait to support both &str and String as parameters of store.
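As a small sketch of that idea, assume a stripped-down store called Urls (a made-up name) where ids are just positions in a Vec. The generic parameter is the interesting part:

```rust
/// Hypothetical simplified store: ids are positions in a `Vec`.
pub struct Urls(pub Vec<String>);

impl Urls {
    /// Accepts anything convertible into a `String`, e.g. `&str` or `String`.
    pub fn store<T: Into<String>>(&mut self, url: T) -> usize {
        // An owned `String` is moved in as-is; a `&str` is copied once here.
        self.0.push(url.into());
        self.0.len() - 1
    }
}
```

With this signature, both urls.store("https://www.endler.dev") and urls.store(String::from("https://www.endler.dev")) compile, and the caller decides whether an allocation is needed.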
If you’re curious about this, read this article from Herman J. Radtke III. For now, let’s keep it simple.
Actually shortening URLs
Let’s implement the URL shortener itself. You might be surprised how much was written about URL shortening all over the web. One common way is to create short URLs using base 62 conversion.
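For the curious, base 62 conversion can be sketched in a few lines. This is a generic illustration, not what the service ends up using; the digit order (0-9, a-z, A-Z) is an arbitrary choice, and the function names encode and decode are my own.

```rust
/// Digits for base 62: 0-9, a-z, A-Z (the ordering is an arbitrary choice).
const ALPHABET: &[u8] = b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/// Encodes a numeric id as a base 62 string.
pub fn encode(mut n: u64) -> String {
    let base = ALPHABET.len() as u64;
    if n == 0 {
        return "0".to_string();
    }
    let mut digits = Vec::new();
    while n > 0 {
        digits.push(ALPHABET[(n % base) as usize]);
        n /= base;
    }
    // Remainders come out least-significant first, so reverse them.
    digits.reverse();
    String::from_utf8(digits).expect("alphabet is ASCII")
}

/// Decodes a base 62 string. Returns `None` for characters outside the
/// alphabet or on overflow; the empty string decodes to 0.
pub fn decode(s: &str) -> Option<u64> {
    let base = ALPHABET.len() as u64;
    let mut n: u64 = 0;
    for b in s.bytes() {
        let digit = ALPHABET.iter().position(|&c| c == b)? as u64;
        n = n.checked_mul(base)?.checked_add(digit)?;
    }
    Some(n)
}
```

Because every database id maps to exactly one string and back, a row number such as 125 becomes the two-character id 21, and ids only grow by one character every factor of 62.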
After looking around some more, I found this sweet little crate called harsh, which perfectly fits the bill. It is a Rust implementation of Hashids and generates short, hash-like ids from numbers.
To use harsh, we add it to the dependency section of our Cargo.toml:
harsh = "0.1.2"
Next, we add the crate to the top of our main.rs:
extern crate harsh;
Let’s create a new file named src/shortener.rs and write the following:
use harsh::{Harsh, HarshBuilder};
With use harsh::{Harsh, HarshBuilder}; we bring the required structs into scope. Then we define our own Shortener struct, which wraps Harsh. It has two fields: id stores the next id for shortening. (Since there won’t be any negative ids, we use an unsigned integer for that.) The other field is the generator itself, for which we use Harsh. Using the HarshBuilder you can do a lot of fancy stuff, like setting a custom alphabet for the ids. We’re good for now, but for more info, check out the official docs. With next_id we retrieve a new String id for our URLs.
As you can see, we don’t pass the URL to next_id. That means we actually don’t shorten anything. We merely create a short, unique ID. That’s because most hashing algorithms produce fairly long output, and having short URLs is kind of the whole idea.
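The counter-based approach can be sketched without any external crate. In this simplified version the Harsh generator is swapped for lowercase hex formatting (an assumption, purely so the snippet is self-contained); the point is that next_id only looks at the counter, never at a URL.

```rust
/// Sketch of a counter-backed id generator (not the article's exact code).
pub struct Shortener {
    // Next numeric id to hand out; unsigned because ids are never negative.
    id: u64,
}

impl Shortener {
    pub fn new() -> Shortener {
        Shortener { id: 0 }
    }

    /// Returns a fresh, unique id. Note that no URL is passed in.
    pub fn next_id(&mut self) -> String {
        // Stand-in encoding (assumption): lowercase hex instead of harsh.
        let encoded = format!("{:x}", self.id);
        self.id += 1;
        encoded
    }
}
```

Since the counter only ever increases, two calls can never return the same id, and the ids stay short for a long time.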
Wiring it up
So we are done with our shortener and the repository. We need to adjust our src/main.rs again to make use of the two.
This is the point where it gets a little hairy.
I have to admit that I struggled a bit here. Mainly because I was not used to multi-threaded request handling. In Python or PHP you don’t need to think about shared-mutable access.
Initially I had the following code in my main.rs:
State is the built-in way to save data across requests in Rocket. Just tell it what belongs to your application state with manage() and Rocket will automatically inject it into the routes.
But the compiler said no:
error: cannot borrow immutable borrowed content as mutable
-/main.rs
|
| repo.store;
In hindsight it all makes sense: What would happen if two requests wanted to modify our repository at the same time? Rust prevented a race condition here! Yikes. Admittedly, the error message could have been a bit more user-friendly, though.
Fortunately, Sergio Benitez (the creator of Rocket) helped me out on the Rocket IRC channel (thanks again!). The solution was to put the repository behind a Mutex.
Here is our src/main.rs in its full glory:
extern crate rocket;
extern crate harsh;
use std::sync::RwLock;
use rocket::State;
use rocket::request::Form;
use rocket::response::Redirect;
use repository::Repository;
As you can see we’re using a std::sync::RwLock here, to protect our repository from shared mutable access. This type of lock allows any number of readers or at most one writer at the same time. It makes our code a bit harder to read because whenever we want to access our repository, we need to call the read and write methods first.
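To see the locking pattern in isolation, here is a sketch without Rocket. A plain HashMap stands in for the repository, threads stand in for concurrent requests, and the example.com URLs and the function names fill_concurrently and lookup are made up for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Spawns `writers` threads that each insert one entry, then returns the shared map.
pub fn fill_concurrently(writers: usize) -> Arc<RwLock<HashMap<String, String>>> {
    let repo: Arc<RwLock<HashMap<String, String>>> = Arc::new(RwLock::new(HashMap::new()));
    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let repo = Arc::clone(&repo);
            thread::spawn(move || {
                // `write()` blocks until this thread has exclusive access.
                let mut guard = repo.write().unwrap();
                guard.insert(i.to_string(), format!("https://example.com/{}", i));
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    repo
}

/// Looks up an id under a shared read lock.
pub fn lookup(repo: &RwLock<HashMap<String, String>>, id: &str) -> Option<String> {
    // Any number of threads may hold `read()` guards at the same time.
    repo.read().unwrap().get(id).cloned()
}
```

The guards returned by read() and write() release the lock when they go out of scope, which is why there is no explicit unlock call anywhere.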
In our lookup method, you can see that we are returning a Result type now. It has two cases: if we find an id in our repository, we return Ok(Redirect::permanent(url)), which will take care of the redirect. If we can’t find the id, we return an Error.
In our shorten method, we switched from a get to a post request. The advantage is that we don’t need to deal with URL encoding. We just create a struct Url and derive FromForm for it, which will handle the deserialization for us. Fancy!
We’re done. Let’s fire up the service again and try it out!
cargo run
In a new window, we can now store our first URL:
curl --data "url=https://www.endler.dev" http://localhost:8000/
We get some ID back that we can use to retrieve the URL again. In my case, this was gY. Point your browser to http://localhost:8000/gY and you should be redirected to my homepage.
Summary
Rocket provides fantastic documentation and a great community. It really feels like an idiomatic Rustlang web framework.
I hope you had some fun while playing with Rocket.
You can find the full example code on GitHub.