In the world of software engineering, testing your code is critical. All testing is important. However, some types of tests are more important than others for a variety of reasons. In this post, I’ll discuss why System Tests are the most important type of software test.
Here are a few types of tests that are common in software development:
Unit Tests test a class or function without testing its dependencies. No connecting to databases, the file system, or making real network calls. If your function does those things, you need to find a way to mock those dependencies. Otherwise, a unit test isn’t possible.
Integration Tests test a class or function and its dependencies (though not necessarily the dependencies’ dependencies). These tests do not connect over the network to any dependencies (though this rule can be bent).
System Tests test the running software connected to all dependencies including databases, other APIs, etc. System tests test the interface that the end users or clients interact with. So for a Web site, a proper System test would use a Web browser to click buttons and make sure proper things happen. To system test an API, you would invoke it and then use other APIs to make sure it did what it was supposed to do.
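To make the unit test definition a bit more concrete, here is a minimal sketch of swapping a real dependency for a fake one. Everything in it (the ScoreStore interface, the Rank function, and the fakeStore type) is hypothetical and invented for illustration; none of it is part of the tic-tac-toe program discussed later.

```go
package main

import (
	"errors"
	"fmt"
)

// ScoreStore is a hypothetical dependency (think: a database of win counts).
type ScoreStore interface {
	Wins(player string) (int, error)
}

// Rank is the unit under test. It depends only on the ScoreStore interface,
// so a test can hand it a fake instead of a real database.
func Rank(store ScoreStore, player string) (string, error) {
	wins, err := store.Wins(player)
	if err != nil {
		return "", fmt.Errorf("looking up %s: %w", player, err)
	}
	if wins >= 10 {
		return "expert", nil
	}
	return "beginner", nil
}

// fakeStore is an in-memory stand-in used only by tests.
type fakeStore map[string]int

func (f fakeStore) Wins(player string) (int, error) {
	wins, ok := f[player]
	if !ok {
		return 0, errors.New("unknown player")
	}
	return wins, nil
}

func main() {
	store := fakeStore{"alice": 12, "bob": 3}
	for _, p := range []string{"alice", "bob", "carol"} {
		rank, err := Rank(store, p)
		fmt.Println(p, rank, err)
	}
}
```

A unit test would call Rank with the fake. An integration test would call it with the real store, and a system test would not call Rank at all: it would drive the running program through whatever interface its users see.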
Pretty much all programmers are familiar with unit tests, but system and integration tests are often less familiar to newer programmers. We’ll focus on system tests in this post.
System Testing Command Line Tic-tac-toe
Below we have a code block written in Go that is a simple game of tic-tac-toe based on the command line. The game assumes that the two players are sharing a terminal. The users input a number between 1 and 9 where 1 is the upper left hand square, 5 is the middle square, and 9 is the bottom right hand square. It’s pretty simple.
package main

import (
	"fmt"
	"os"
)

// board holds the 3x3 grid; "-" marks an empty square.
var board [3][3]string

func init() {
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			board[i][j] = "-"
		}
	}
}

// displayBoard prints the grid with "|" between columns.
func displayBoard() {
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			fmt.Print(board[i][j])
			if j != 2 {
				fmt.Print("|")
			}
		}
		fmt.Println()
	}
}

// didPlayerWin reports whether player holds a full row, column, or diagonal.
func didPlayerWin(player string) bool {
	for i := 0; i < 3; i++ {
		if (board[i][0] == player && board[i][1] == player && board[i][2] == player) ||
			(board[0][i] == player && board[1][i] == player && board[2][i] == player) {
			return true
		}
	}
	if board[0][0] == player && board[1][1] == player && board[2][2] == player {
		return true
	}
	if board[0][2] == player && board[1][1] == player && board[2][0] == player {
		return true
	}
	return false
}

func main() {
	var player = "X"
	var move int
	var row, col int
	var gameOver = false
	var winner string
	numMoves := 0
	for !gameOver && numMoves < 9 {
		displayBoard()
		fmt.Print("Player ", player, ", enter move (1-9): ")
		// Stop instead of looping forever if input ends or is not a number.
		if _, err := fmt.Scan(&move); err != nil {
			fmt.Println("\nCould not read a move:", err)
			os.Exit(1)
		}
		// Reject anything outside 1-9 before it is used as an index.
		if move < 1 || move > 9 {
			fmt.Println("Invalid move, try again.")
			continue
		}
		row = (move - 1) / 3
		col = (move - 1) % 3
		if board[row][col] == "-" {
			board[row][col] = player
		} else {
			fmt.Println("Invalid move, try again.")
			continue
		}
		if didPlayerWin(player) {
			winner = player
			gameOver = true
		}
		if player == "X" {
			player = "O"
		} else {
			player = "X"
		}
		numMoves = numMoves + 1
	}
	displayBoard()
	if winner != "" {
		fmt.Println("Player", winner, "wins!")
	} else {
		fmt.Println("It's a tie")
	}
}
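The arithmetic that turns a move number into a board position is easy to tabulate on its own. The sketch below assumes two hypothetical helpers, toCoordinates and onBoard, that are not in the program above; they only restate the (move - 1) / 3 and (move - 1) % 3 expressions from main and the 1 to 9 range from the prompt.

```go
package main

import "fmt"

// toCoordinates is a hypothetical helper holding the same arithmetic as main.
func toCoordinates(move int) (row, col int) {
	return (move - 1) / 3, (move - 1) % 3
}

// onBoard reports whether a move number refers to a real square.
func onBoard(move int) bool { return move >= 1 && move <= 9 }

func main() {
	// 1, 5 and 9 are the corners and centre from the description;
	// 0 and 10 show what happens just outside the valid range.
	for _, move := range []int{1, 5, 9, 0, 10} {
		row, col := toCoordinates(move)
		fmt.Printf("move %2d -> row %d, col %2d, on board: %v\n", move, row, col, onBoard(move))
	}
}
```

Moves 0 and 10 map to a negative column and to row 3 respectively, so anything outside 1 to 9 has to be rejected before the result is used to index the board.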
Here is a simple (and very incomplete) unit test for the didPlayerWin function:
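package main

import "testing"

func TestDidPlayerWin(t *testing.T) {
	// X holds the whole top row; O has two squares in the middle row.
	board[0][0] = "X"
	board[0][1] = "X"
	board[0][2] = "X"
	board[1][0] = "O"
	board[1][1] = "O"
	if !didPlayerWin("X") {
		t.Error("expected X to win with the top row")
	}
}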
This test simply checks whether a player won. In the tested case, player X won by filling the top row. It’s pretty straightforward.
The rest of the code is not easily unit testable because it all lives in one monolithic main function (though for a code base this small, it could be done).
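For a sense of what a less incomplete check of the win logic might look like, here is a table-driven sketch. It assumes a hypothetical hasWon function that takes the board as an argument instead of reading a global, plus a small parse helper, so that every case starts from a clean board. It is written as a plain program so it runs on its own; in a real project the same table would more naturally live in a _test.go file.

```go
package main

import "fmt"

type board [3][3]string

// hasWon holds the same checks as didPlayerWin but takes the board as an
// argument (an assumption made here so cases cannot leak into each other).
func hasWon(b board, player string) bool {
	for i := 0; i < 3; i++ {
		if b[i][0] == player && b[i][1] == player && b[i][2] == player {
			return true
		}
		if b[0][i] == player && b[1][i] == player && b[2][i] == player {
			return true
		}
	}
	if b[0][0] == player && b[1][1] == player && b[2][2] == player {
		return true
	}
	return b[0][2] == player && b[1][1] == player && b[2][0] == player
}

// parse builds a board from three 3-character rows such as "XXX", "OO-", "---".
func parse(rows [3]string) board {
	var b board
	for i, row := range rows {
		for j, c := range row {
			b[i][j] = string(c)
		}
	}
	return b
}

func main() {
	cases := []struct {
		name   string
		rows   [3]string
		player string
		want   bool
	}{
		{"top row", [3]string{"XXX", "OO-", "---"}, "X", true},
		{"middle column", [3]string{"XOX", "-O-", "XO-"}, "O", true},
		{"main diagonal", [3]string{"XO-", "OX-", "--X"}, "X", true},
		{"anti-diagonal", [3]string{"XXO", "-O-", "OX-"}, "O", true},
		{"no winner yet", [3]string{"XO-", "-X-", "---"}, "X", false},
		{"other player", [3]string{"XXX", "OO-", "---"}, "O", false},
	}
	for _, c := range cases {
		status := "ok"
		if hasWon(parse(c.rows), c.player) != c.want {
			status = "FAIL"
		}
		fmt.Printf("%-15s %s\n", c.name, status)
	}
}
```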
Writing system tests for this game is easy and can be done with a simple bash script. Here is an example system test for the game:
echo "1 2 3 5 2 9 8 4" | go run main.go | grep -q 'Player O wins!' && echo SUCCESS || echo FAILED | grep "SUCCESS"
If you run the above command, “SUCCESS” is printed to the screen. If you change the grep pattern to check whether Player X wins, nothing is printed (the trailing grep filters out “FAILED”) and the command exits with a non-zero status.
How does this snippet work? Let’s break it down:
echo "1 2 3 5 2 9 8 4"
  Prints the string "1 2 3 5 2 9 8 4" to stdout.
  This gets piped to the next command.
go run main.go
  Runs the tic-tac-toe program, which prompts for input.
  Receives the "1 2 3 5 2 9 8 4" string from the previous command. The second "2" is rejected as an invalid move, and the game ends before the final "4" is read.
  Writes “Player O wins!” (and other stuff) to stdout (or it should if the code works!).
  Pipes the output to the next command.
grep -q 'Player O wins!' && echo SUCCESS || echo FAILED
  Checks whether the output from the previous command contains “Player O wins!”. Writes “SUCCESS” to stdout if so; otherwise runs echo FAILED.
  The -q flag stops grep from printing the matching line to stdout. This isn’t necessary, but it makes the test output cleaner.
  The pattern is in single quotes so that an interactive shell does not treat the ! as history expansion.
grep "SUCCESS"
  Because | binds more tightly than && and ||, this grep only ever receives the output of echo FAILED. It finds no match, prints nothing, and exits with status 1.
  I simply added this so that the proper exit code would show up in $?: 0 when the test passes and non-zero when it fails. This allows the command to be run in a larger script that knows when the test is failing.
We can add many more tests like the following:
# check if O wins
echo "1 2 3 5 2 9 8" | go run main.go | grep -q 'Player O wins!' && echo SUCCESS || echo FAILED
# check if X wins
echo "1 4 2 5 3" | go run main.go | grep -q 'Player X wins!' && echo SUCCESS || echo FAILED
# check for a tie
echo "1 2 4 7 9 5 8 6 3" | go run main.go | grep -q "tie" && echo SUCCESS || echo FAILED
You can just keep adding tests like this. Unlike our unit test, they exercise all of the code.
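If you want to convince yourself that those three move sequences really lead to those three outcomes, the sketch below replays them in Go. It assumes a hypothetical play function, an in-process copy of the game loop that reads moves from an io.Reader and writes only the result line to an io.Writer. That function is not part of the program above, and this is not a substitute for running the real binary; it only mirrors the rules so the sequences can be traced.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// play is a hypothetical in-process copy of the game loop. It reads moves
// from in and writes only the final result line to out.
func play(in io.Reader, out io.Writer) {
	var board [3][3]string
	for i := range board {
		for j := range board[i] {
			board[i][j] = "-"
		}
	}
	won := func(p string) bool {
		for i := 0; i < 3; i++ {
			if (board[i][0] == p && board[i][1] == p && board[i][2] == p) ||
				(board[0][i] == p && board[1][i] == p && board[2][i] == p) {
				return true
			}
		}
		return (board[0][0] == p && board[1][1] == p && board[2][2] == p) ||
			(board[0][2] == p && board[1][1] == p && board[2][0] == p)
	}
	player, winner := "X", ""
	for numMoves := 0; winner == "" && numMoves < 9; {
		var move int
		if _, err := fmt.Fscan(in, &move); err != nil {
			fmt.Fprintln(out, "Ran out of input")
			return
		}
		if move < 1 || move > 9 || board[(move-1)/3][(move-1)%3] != "-" {
			continue // invalid move: the same player tries again
		}
		board[(move-1)/3][(move-1)%3] = player
		numMoves++
		if won(player) {
			winner = player
		} else if player == "X" {
			player = "O"
		} else {
			player = "X"
		}
	}
	if winner != "" {
		fmt.Fprintln(out, "Player", winner, "wins!")
	} else {
		fmt.Fprintln(out, "It's a tie")
	}
}

// check replays moves and reports whether the output contains want,
// which is the same question grep -q answers in the shell version.
func check(moves, want string) bool {
	var out bytes.Buffer
	play(strings.NewReader(moves), &out)
	return strings.Contains(out.String(), want)
}

func main() {
	fmt.Println("O wins:", check("1 2 3 5 2 9 8", "Player O wins!"))
	fmt.Println("X wins:", check("1 4 2 5 3", "Player X wins!"))
	fmt.Println("tie:   ", check("1 2 4 7 9 5 8 6 3", "tie"))
}
```

Each line should report true. Note that in the first sequence the second "2" lands on an occupied square, so X gets another turn and O completes the middle column with 8.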
Now you might be thinking, “Well, can’t we just write more unit tests?” Yes, of course, and we should! However, much of the code is not very easily unit testable.
Let’s refactor the code to be more unit testable. We’ll also put everything into a struct because why not.
package main

import (
	"fmt"
	"os"
)

// game is the single shared game instance used by main.
var game *TicTacToe

func init() {
	game = &TicTacToe{
		board:         [3][3]string{},
		currentPlayer: "X",
		numMoves:      0,
		over:          false,
		winner:        "",
	}
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			game.board[i][j] = "-"
		}
	}
}

// TicTacToe holds all of the state for one game.
type TicTacToe struct {
	numMoves      int
	board         [3][3]string
	currentPlayer string
	over          bool
	winner        string
}

// overMessage prints the final result.
func (game *TicTacToe) overMessage() {
	if game.winner != "" {
		fmt.Println("Player", game.winner, "wins!")
	} else {
		fmt.Println("It's a tie")
	}
}

func (game *TicTacToe) setCurrentPlayerAsWinner() {
	game.winner = game.currentPlayer
}

// promptInput asks the current player for a move and reads it from stdin.
func (game *TicTacToe) promptInput() int {
	var move int
	fmt.Print("Player ", game.currentPlayer, ", enter move (1-9): ")
	// Stop instead of looping forever if input ends or is not a number.
	if _, err := fmt.Scan(&move); err != nil {
		fmt.Println("\nCould not read a move:", err)
		os.Exit(1)
	}
	return move
}

// convertInputToCoordinates maps 1-9 to a (row, col) pair.
func (game *TicTacToe) convertInputToCoordinates(move int) (int, int) {
	row := (move - 1) / 3
	col := (move - 1) % 3
	return row, col
}

// isOver reports whether the game has ended, marking it over once the board is full.
func (game *TicTacToe) isOver() bool {
	if game.numMoves > 8 {
		game.setIsOver()
	}
	return game.over
}

func (game *TicTacToe) displayBoard() {
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			fmt.Print(game.board[i][j])
			if j != 2 {
				fmt.Print("|")
			}
		}
		fmt.Println()
	}
}

// didCurrentPlayerWin reports whether the current player holds a full row, column, or diagonal.
func (game *TicTacToe) didCurrentPlayerWin() bool {
	for i := 0; i < 3; i++ {
		if (game.board[i][0] == game.currentPlayer && game.board[i][1] == game.currentPlayer && game.board[i][2] == game.currentPlayer) ||
			(game.board[0][i] == game.currentPlayer && game.board[1][i] == game.currentPlayer && game.board[2][i] == game.currentPlayer) {
			return true
		}
	}
	if game.board[0][0] == game.currentPlayer && game.board[1][1] == game.currentPlayer && game.board[2][2] == game.currentPlayer {
		return true
	}
	if game.board[0][2] == game.currentPlayer && game.board[1][1] == game.currentPlayer && game.board[2][0] == game.currentPlayer {
		return true
	}
	return false
}

// move places the current player's mark; the caller must validate the square first.
func (game *TicTacToe) move(row, col int) {
	game.board[row][col] = game.currentPlayer
	game.numMoves++
}

func (game *TicTacToe) switchPlayer() {
	if game.currentPlayer == "X" {
		game.currentPlayer = "O"
	} else {
		game.currentPlayer = "X"
	}
}

// isValidMove reports whether (row, col) is on the board and still empty.
func (game *TicTacToe) isValidMove(row, col int) bool {
	if row < 0 || row > 2 {
		return false
	}
	if col < 0 || col > 2 {
		return false
	}
	return game.board[row][col] == "-"
}

func (game *TicTacToe) setIsOver() {
	game.over = true
}

func main() {
	for !game.isOver() {
		game.displayBoard()
		move := game.promptInput()
		row, col := game.convertInputToCoordinates(move)
		if !game.isValidMove(row, col) {
			fmt.Println("Invalid move, try again.")
			continue
		}
		game.move(row, col)
		if game.didCurrentPlayerWin() {
			game.setIsOver()
			game.setCurrentPlayerAsWinner()
		}
		game.switchPlayer()
	}
	game.displayBoard()
	game.overMessage()
}
This code is much easier to unit test. Here are some unit tests:
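package main

import "testing"

// newTestGame returns a fresh game so that tests do not share state.
func newTestGame() *TicTacToe {
	g := &TicTacToe{currentPlayer: "X"}
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			g.board[i][j] = "-"
		}
	}
	return g
}

func TestDidCurrentPlayerWin(t *testing.T) {
	g := newTestGame()
	// X takes the top row while O takes two squares in the middle row.
	moves := [][2]int{{0, 0}, {1, 0}, {0, 1}, {1, 1}, {0, 2}}
	for i, m := range moves {
		if i > 0 {
			g.switchPlayer()
		}
		g.move(m[0], m[1])
	}
	if !g.didCurrentPlayerWin() {
		t.Error("expected X to win with the top row")
	}
}

func TestIsOver(t *testing.T) {
	g := newTestGame()
	if g.isOver() {
		t.Error("a new game should not be over")
	}
	g.numMoves = 9
	if !g.isOver() {
		t.Error("a game with 9 moves should be over")
	}
}

func TestConvertInputToCoordinates(t *testing.T) {
	g := newTestGame()
	row, col := g.convertInputToCoordinates(1)
	if row != 0 || col != 0 {
		t.Errorf("move 1: got (%d, %d), want (0, 0)", row, col)
	}
	row, col = g.convertInputToCoordinates(5)
	if row != 1 || col != 1 {
		t.Errorf("move 5: got (%d, %d), want (1, 1)", row, col)
	}
}

func TestSwitchPlayer(t *testing.T) {
	g := newTestGame()
	g.switchPlayer()
	if g.currentPlayer != "O" {
		t.Errorf("got %q after switching from X, want O", g.currentPlayer)
	}
}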
One thing to note is that I had to modify the original unit test because the previous function no longer existed. What modifications did I have to make to the system tests? Well…
System Tests Enable Large Refactors
So we just refactored the original code to be more testable. This required the original unit test to be rewritten (or at least heavily modified). Guess what changes I had to make to the system tests? NONE. They are all still valid. (In fact, I used the system tests to validate that the new code works). Why do the system tests still work? They work because the client interface did not change. In this case, the client interface is text streams via stdin and stdout.
The same holds true for software where the interface is an API. As long as the API interface does not change, you can perform a large refactor on the code base and still be confident in your system tests.
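Here is a rough sketch of that idea for an HTTP API. The /square endpoint and both handlers (squareV1 and its "refactored" sibling squareV2) are hypothetical and made up for this example. To stay self-contained it drives the handlers in-process with Go's net/http/httptest package; a real system test would send the same requests to a deployed server instead.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// squareV1 is a hypothetical endpoint: GET /square?n=<int> responds with
// n*n as plain text, or 400 if n is not an integer.
func squareV1(w http.ResponseWriter, r *http.Request) {
	n, err := strconv.Atoi(r.URL.Query().Get("n"))
	if err != nil {
		http.Error(w, "bad n", http.StatusBadRequest)
		return
	}
	fmt.Fprint(w, n*n)
}

// parseN and square are the pieces split out by the pretend refactor.
func parseN(r *http.Request) (int, error) { return strconv.Atoi(r.URL.Query().Get("n")) }
func square(n int) int                    { return n * n }

// squareV2 has different internals but the same external behaviour.
func squareV2(w http.ResponseWriter, r *http.Request) {
	n, err := parseN(r)
	if err != nil {
		http.Error(w, "bad n", http.StatusBadRequest)
		return
	}
	io.WriteString(w, strconv.Itoa(square(n)))
}

// blackBoxCheck knows only the interface: request in, status and body out.
func blackBoxCheck(h http.HandlerFunc) bool {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/square?n=7", nil))
	if rec.Code != http.StatusOK || rec.Body.String() != "49" {
		return false
	}
	rec = httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/square?n=oops", nil))
	return rec.Code == http.StatusBadRequest
}

func main() {
	fmt.Println("v1 passes:", blackBoxCheck(squareV1))
	fmt.Println("v2 passes:", blackBoxCheck(squareV2))
}
```

The check never mentions parseN or square, so it has nothing to update when those helpers appear, disappear, or get renamed.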
In fact, I would argue that without comprehensive system tests, a large refactor is likely not possible in a reasonable timeframe.
Unit Tests are still important
The fact that system tests are superior to unit tests is no excuse to skip writing unit tests. Unit tests will generally find bugs earlier in the software lifecycle process than system tests. They can also easily be written during development whereas that isn’t always possible for system tests (though it was for our tic tac toe example). Unit tests also force you to write code that is unit testable which is a plus. So you should write both types of tests.
So why point out that system tests are more important if both types of tests should always be written? Let me put it this way. Consider the scenario where I start a new job on a new product. I look at the codebase and find that it isn’t very well tested. Maybe it has 50% unit test code coverage and one system test that only covers the most common scenario. Where should my time be spent? Easy. I’m going to write system tests. A codebase that is in production in this situation probably mostly works. System tests will give more confidence at release time and allow faster iteration. Writing additional unit tests would give me more confidence that the code works as it is, but would give me little additional confidence as the code evolves. This is because evolving code often means evolving unit tests. The system tests will always be valid so long as the interface does not change.
Nice analysis, and very good points, notably your observation that "system tests enable large refactors." But I'm going to _blaspheme_ and disagree with your last point that "unit tests are still important". You're just being kind to the true believers and avoiding flak; I get it. But unit tests are only important for testing isolated, specialized sections of code, code that's just too expensive to test at a higher level, which is maybe up to 5% of code, if that. Otherwise, (1) unit tests are expensive to develop (using actual software developers), (2) they are brittle (as you just proved) (and for what benefit?), and (3) they are almost always redundant; they have to be superseded by a higher-level test anyway (or the higher-level tests are incomplete). Do you have developer time to duplicate work? So why spend the resources? If we all gave up on the unit test religion, we'd have a lot more time for creating and refining actual functionality. Anyway, thanks for your great analysis.