Refactoring Go Code to Avoid File I/O in Unit Tests
— Without Mocking or Temporary Files
At work today, I refactored some simple Go code to make it more testable. The idea was to avoid file handling in unit tests without mocking or using temporary files by separating data input/output and data manipulation.
I was surprised that I couldn’t find a simple explanation on sites like StackOverflow, which is why I wrote down some notes myself so that others can refer to it in the future.
Our example code
The initial version looked like this:
package main
import (
"bufio"
"io/ioutil"
"os"
)
func main()
func analyze(file string) error
As you can see, we take a filename as input, and we open that file inside the analyze
function to do something with its contents.
Writing our first test for the code
A typical test harness for that code might look like this:
package main
import "testing"
func Test_analyze(t *testing.T)
All fine and good?
Problems
This will work, but file I/O while running tests is not always the best idea. For one, you could be running in a constrained environment, where you don’t have access to the file. We could use temporary files to avoid this.
But there might be problems with disk I/O, which makes for flaky tests and frustration.
Another process could also modify the file during the test. All these issues have nothing to do with your code.
Furthermore, it’s not enough to just look at the test and see exactly what’s going on. You also have to read the text file first.
A lot of people suggest mocking instead. There are quite a few powerful libraries like spf13/afero for this purpose. These packages will create temporary files in the background and clean up afterward.
In my opinion, mocking should be the last resort when it comes to testing. Before you mock, check that you use the right abstractions in your code. Maybe implementing against an interface or using Dependency Injection helps decouple components? More often than not, a clear separation of concerns is all you need.
Refactoring to make testing easier
In my case above, we can easily avoid using mocks and temporary files by decoupling file I/O from the analysis. We do so by refactoring our analyze
function to call doSomething
, which takes an io.Reader
. (You could also use an array of strings for now.)
Our main.go
now looks like this:
package main
import (
"bufio"
"io"
"os"
)
func main()
func analyze(file string) error
func doSomething(handle io.Reader) error
Now we can test the actual analysis in isolation:
package main
import (
"strings"
"testing"
)
func Test_analyze(t *testing.T)
We changed analyze("test.txt")
to doSomething(strings.NewReader("This is a test string"))
. (Of course, we should also write a separate test for analyze()
, but the focus is on decoupling the datasource-agnostic part here.)
Result
By slightly refactoring our code, we gained the following advantages:
- Simple testability: No mocks or temporary files.
- Separation of concerns: Each function does exactly one thing.
- Easier code re-use: The
doSomething()
function will work with anyio.Reader
and can be called from other places. We can even move it to its own library if we want.
On Reddit, user soapysops made an important remark:
In general, I prefer to not accept a file name in an API. A file name doesn’t give users enough control. It doesn’t let you use an unusual encoding, special file permissions, or a bytes.Buffer instead of an actual file, for example. Accepting a file name adds a huge dependency to the code: the file system, along with all of its associated OS specific stuff.
So I probably would have eliminated the file name based API and only exposed one based on io.Reader. That way, you have complete code coverage, fast tests, and far fewer edge cases to worry about.
I totally agree with that sentiment.
But often times you can’t simply change the user-facing API easily, because the API might be public and might already have users. The refactoring above is just the first step towards better architecture. There is definitely a lot more you can do to start writing robust, well-tested systems in Go.
More Resources
If that got you interested, also check out justforfunc #29: dependency injection in a code review, which covers the same topic:
A great resource that I can recommend is Learn Go with Tests. It teaches you test-driven development with Go and helps you get a grounding with TDD.
Another one is The Go Programming Language book, co-authored by Brian W. Kernighan (of Unix fame), which shows how to write clear and idiomatic Go to solve real-world problems. It contains a dedicated chapter on interfaces and testing. It also covers io.Reader
in more detail.
- 💬 Comments on Reddit.
Thanks for reading! I mostly write about Rust and my (open-source) projects. If you would like to receive future posts automatically, you can subscribe via RSS.