Refactoring Go Code to Avoid File I/O in Unit Tests
— Without Mocking or Temporary Files
At work today, I refactored some simple Go code to make it more testable. The idea was to avoid file handling in unit tests without mocking or using temporary files by separating data input/output from data manipulation.
I was surprised that I couldn't find a simple explanation on sites like Stack Overflow, so I wrote down some notes myself for others to refer to in the future.
The initial version looked like this:
    package main

    import (
        "bufio"
        "io/ioutil"
        "os"
    )

    func main()

    func analyze(file string) error

As you can see, we take a filename as input, and we open that file inside the analyze function to do something with its contents.
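The function bodies are not shown in the listing above, so here is a minimal sketch of what such a file-bound version could look like. The command-line handling and the line-by-line loop are assumptions made for illustration, not the original code. The sketch also leaves out io/ioutil, since the code that used it is not shown and the package has been deprecated since Go 1.16.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// Assumed CLI: the file to analyze is the first argument.
	if len(os.Args) < 2 {
		fmt.Println("usage: analyze <file>")
		return
	}
	if err := analyze(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

// analyze opens the named file and walks through it line by line.
// File handling and the actual work are tangled together in one function.
func analyze(file string) error {
	handle, err := os.Open(file)
	if err != nil {
		return err
	}
	defer handle.Close()

	scanner := bufio.NewScanner(handle)
	for scanner.Scan() {
		// Placeholder for the real work, which the post does not specify.
		_ = scanner.Text()
	}
	return scanner.Err()
}
```

The only way to exercise the loop in this shape is to hand analyze the path of a file that really exists on disk.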
A typical test harness for that code might look like this:
    package main

    import "testing"

    func Test_analyze(t *testing.T)
All fine and good?
This will work, but file I/O while running tests is not always the best idea. For one, you could be running in a constrained environment, where you don't have access to the file. We could use temporary files to avoid this.
But there might be problems with disk I/O, which makes for flaky tests and frustration.
Another process could also modify the file during the test. All these issues have nothing to do with your code.
Furthermore, looking at the test alone doesn't tell you exactly what's going on. You also have to read the text file first.
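To make the temporary-file option mentioned above concrete, here is a rough sketch of the ceremony it needs before analyze can even be called. The helper name analyzeViaTempFile and the placeholder body of analyze are assumptions for illustration only.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// analyze is the file-bound function under test (placeholder body).
func analyze(file string) error {
	handle, err := os.Open(file)
	if err != nil {
		return err
	}
	defer handle.Close()

	scanner := bufio.NewScanner(handle)
	for scanner.Scan() {
		_ = scanner.Text() // stand-in for the real work
	}
	return scanner.Err()
}

// analyzeViaTempFile is a hypothetical helper: it writes the test input
// to a temporary file, runs analyze on it, and removes the file again.
func analyzeViaTempFile(content string) error {
	tmp, err := os.CreateTemp("", "analyze-*.txt")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // clean up even if the analysis fails

	if _, err := tmp.WriteString(content); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return analyze(tmp.Name())
}

func main() {
	if err := analyzeViaTempFile("first line\nsecond line\n"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("ok")
}
```

Every step before the final call can fail for reasons unrelated to the analysis. In a real test, t.TempDir() from the testing package could take over the cleanup, but the test would still touch the disk.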
A lot of people suggest mocking instead. There are quite a few powerful libraries like spf13/afero for this purpose. These packages swap the real file system for a stand-in, such as an in-memory one, or create temporary files in the background and clean up afterward.
In my opinion, mocking should be the last resort when it comes to testing. Before you mock, check that you use the right abstractions in your code. Maybe implementing against an interface or using Dependency Injection helps decouple components? More often than not, a clear separation of concerns is all you need.
In my case above, we can easily avoid using mocks and temporary files by decoupling file I/O from the analysis. We do so by refactoring our analyze function to call doSomething, which takes an io.Reader. (You could also use a slice of strings for now.)
main.go now looks like this:
    package main

    import (
        "bufio"
        "io"
        "os"
    )

    func main()

    func analyze(file string) error

    func doSomething(handle io.Reader) error
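Since the listing only shows the signatures, here is a hedged sketch of how the refactored file might be filled in. The command-line handling and the rule that empty input is an error are assumptions, added so that the analysis has an observable outcome.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"os"
)

func main() {
	// Assumed CLI: the file to analyze is the first argument.
	if len(os.Args) < 2 {
		fmt.Println("usage: analyze <file>")
		return
	}
	if err := analyze(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

// analyze now only deals with the file system: open, delegate, close.
func analyze(file string) error {
	handle, err := os.Open(file)
	if err != nil {
		return err
	}
	defer handle.Close()
	return doSomething(handle)
}

// doSomething holds the actual logic and accepts any io.Reader.
// Rejecting empty input is an assumed rule, not part of the original post.
func doSomething(handle io.Reader) error {
	scanner := bufio.NewScanner(handle)
	lines := 0
	for scanner.Scan() {
		lines++
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	if lines == 0 {
		return errors.New("empty input")
	}
	return nil
}
```

The file-specific part shrinks to a few lines that rarely change, while everything worth testing sits behind the io.Reader parameter.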
Now we can test the actual analysis in isolation:
    package main

    import (
        "strings"
        "testing"
    )

    func Test_analyze(t *testing.T)
The test body now boils down to a call like doSomething(strings.NewReader("This is a test string")). (Of course, we should also write a separate test for analyze(), but the focus here is on decoupling the datasource-agnostic part.)
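As a rough illustration of what in-memory testing makes possible, the sketch below runs doSomething against a small table of readers, including one that fails on every read. The body of doSomething, its empty-input rule, and the helper names testCase, cases, and failures are assumptions. The sketch is written as a plain program so it runs on its own; in a real _test.go file the loop would live in Test_analyze and use t.Run and t.Errorf.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// doSomething is a stand-in for the analysis; the empty-input rule is assumed.
func doSomething(handle io.Reader) error {
	scanner := bufio.NewScanner(handle)
	lines := 0
	for scanner.Scan() {
		lines++
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	if lines == 0 {
		return errors.New("empty input")
	}
	return nil
}

// testCase pairs an input reader with whether an error is expected.
type testCase struct {
	name    string
	input   io.Reader
	wantErr bool
}

// cases builds fresh readers on every call, since readers are consumed by use.
func cases() []testCase {
	return []testCase{
		{"single line", strings.NewReader("This is a test string"), false},
		{"empty input", strings.NewReader(""), true},
		// iotest.ErrReader returns the given error from every Read call.
		{"read failure", iotest.ErrReader(errors.New("simulated I/O error")), true},
	}
}

// failures returns the names of all cases whose outcome differs from wantErr.
func failures() []string {
	var failed []string
	for _, tc := range cases() {
		err := doSomething(tc.input)
		if (err != nil) != tc.wantErr {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", failures())
}
```

The read-failure case is the kind of scenario that is awkward to reproduce with a real file, yet here it takes a single line.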
By slightly refactoring our code, we gained the following advantages:
- Simple testability: No mocks or temporary files.
- Separation of concerns: Each function does exactly one thing.
- Easier code re-use: The doSomething() function will work with any io.Reader and can be called from other places. We can even move it to its own library if we want.
In general, I prefer to not accept a file name in an API. A file name doesn't give users enough control. It doesn't let you use an unusual encoding, special file permissions, or a bytes.Buffer instead of an actual file, for example. Accepting a file name adds a huge dependency to the code: the file system, along with all of its associated OS-specific stuff.
So I probably would have eliminated the file name based API and only exposed one based on io.Reader. That way, you have complete code coverage, fast tests, and far fewer edge cases to worry about.
I totally agree with that sentiment.
But often you can't simply change the user-facing API, because the API might be public and might already have users. The refactoring above is just the first step towards better architecture. There is definitely a lot more you can do to start writing robust, well-tested systems in Go.
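One way to reconcile both points is to keep the file-name function for existing callers and let it delegate to a reader-based core. The sketch below shows that layering; the names AnalyzeReader, AnalyzeFile, and AnalyzeBytes and the empty-input rule are hypothetical and only serve the illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
)

// AnalyzeReader is the hypothetical core API: it works on any io.Reader
// and never touches the file system itself.
func AnalyzeReader(r io.Reader) error {
	scanner := bufio.NewScanner(r)
	lines := 0
	for scanner.Scan() {
		lines++
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	if lines == 0 {
		return errors.New("empty input") // assumed rule for illustration
	}
	return nil
}

// AnalyzeFile keeps a file-name based signature for existing callers
// and only adds the file handling around AnalyzeReader.
func AnalyzeFile(name string) error {
	handle, err := os.Open(name)
	if err != nil {
		return err
	}
	defer handle.Close()
	return AnalyzeReader(handle)
}

// AnalyzeBytes is another thin wrapper, for data that is already in memory.
func AnalyzeBytes(data []byte) error {
	return AnalyzeReader(bytes.NewReader(data))
}

func main() {
	fmt.Println(AnalyzeBytes([]byte("held in memory\n"))) // prints "<nil>"
	fmt.Println(AnalyzeBytes(nil))                        // prints "empty input"
}
```

Existing users keep calling the file-based function, while new callers and tests can go straight to the reader-based one.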
If that got you interested, also check out justforfunc #29: dependency injection in a code review, which covers the same topic.
A great resource that I can recommend is Learn Go with Tests. It teaches you test-driven development with Go and helps you get a grounding with TDD.
Another one is The Go Programming Language book, co-authored by Brian W. Kernighan (of Unix fame), which shows how to write clear and idiomatic Go to solve real-world problems. It contains a dedicated chapter on interfaces and testing. It also covers io.Reader in more detail.