Advanced Go - Avoiding `if` for initialization using `sync.Once`, and foot guns to watch out for
The short version is this: if the structure makes sense as a zero value, or with some fields unset (in other words, if the structure is consistent pretty much however it is built), then you should let users initialize it directly, without a constructor.
If, on the other hand, the structure's fields need to be carefully crafted, or if the structure can be initialized in a way that is not safe, then we should prefer a constructor where we put all the necessary checks.
Let's keep in mind that if a structure is exposed by a package, it is always possible to initialize it directly. It may not make sense, but it is possible.
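As a rough illustration of the two options, here is a minimal sketch. The names `Counter`, `Limiter` and `NewLimiter` are hypothetical, invented only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Counter is a hypothetical type whose zero value is ready to use,
// so it does not need a constructor.
type Counter struct {
	n int
}

// Inc increments the counter and returns the new value.
func (c *Counter) Inc() int {
	c.n++
	return c.n
}

// Limiter is a hypothetical type that only makes sense with a
// positive limit, so it gets a constructor that validates its input.
type Limiter struct {
	limit int
	used  int
}

// NewLimiter rejects limits that would leave the Limiter in an
// inconsistent state.
func NewLimiter(limit int) (*Limiter, error) {
	if limit <= 0 {
		return nil, errors.New("limit must be positive")
	}
	return &Limiter{limit: limit}, nil
}

// Allow reports whether one more use fits under the limit.
func (l *Limiter) Allow() bool {
	if l.used >= l.limit {
		return false
	}
	l.used++
	return true
}

func main() {
	var c Counter // the zero value works out of the box
	fmt.Println(c.Inc())

	l, err := NewLimiter(1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(l.Allow(), l.Allow())

	// Bypassing the constructor is still possible: the limit is 0
	// and Allow never succeeds.
	var bypassed Limiter
	fmt.Println(bypassed.Allow())
}
```

The last lines of `main` show the point made above: the exported `Limiter` can still be created without its constructor, it just does not do anything useful.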
If you allow the structure to be initialized directly, you will likely find yourself checking whether a particular function is running for the very first time, or whether the structure is being used for the first time at all.
If it is the first time, you may want to do some checks or set some private fields.
In general, this is a code smell that should be avoided. It means that the code can easily end up in an unmanageable state, and the check is just a band-aid on top of a design that could be improved.
Nevertheless, pragmatism and business requests may make taking on some technical debt the best call. And so we are stuck with an ugly, but valuable, `if`.
The code usually goes like this:
type Foo struct {
	someField int
	// has the structure been initialized already?
	// defaults to false
	initialized bool
}

func (f *Foo) DoBaz() {
	if !f.initialized {
		f.someField = 3
		f.initialized = true
	}
	// do something
}

func (f *Foo) DoBar() {
	if !f.initialized {
		f.someField = 3
		f.initialized = true
	}
	// do something else
}
Please note how the `initialized` flag is designed so that its default value `false` (the zero value of booleans) makes sense.
This code has a lot of issues besides the one we already discussed: it is not DRY, it is easy to forget to set the `initialized` flag to `true`, and it is hard to maintain.
This code often improves organically, with the initialization moved into a private function that each method invokes.
type Foo struct {
	someField int
	// has the structure been initialized already?
	// defaults to false
	initialized bool
}

func (f *Foo) initialize() {
	f.someField = 3
	f.initialized = true
}

func (f *Foo) DoBaz() {
	if !f.initialized {
		f.initialize()
	}
	// do something
}

func (f *Foo) DoBar() {
	if !f.initialized {
		f.initialize()
	}
	// do something else
}
The new version is a bit better, but it can be improved.
The check for the `f.initialized` flag should happen inside the `f.initialize()` method, and all the public methods should just invoke `f.initialize()`. One less thing to forget.
type Foo struct {
	someField int
	// has the structure been initialized already?
	// defaults to false
	initialized bool
}

func (f *Foo) initialize() {
	if !f.initialized {
		f.someField = 3
		f.initialized = true
	}
}

func (f *Foo) DoBaz() {
	f.initialize()
	// do something
}

func (f *Foo) DoBar() {
	f.initialize()
	// do something else
}
This is much better, since we just need to remember one thing: always call `f.initialize()` before doing any work.
But it can still be improved.
This particular pattern is not concurrency friendly: if the public methods are invoked by multiple goroutines, they will race on the `initialized` flag.
There are multiple solutions, for instance setting up a mutex. But what I find even better is to just use `sync.Once`.
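For comparison, the mutex route could look roughly like the sketch below. The `mu` and `initCount` fields, the `int` return value of `DoBaz` and the `runConcurrently` helper are assumptions added for the illustration; they are not part of the code shown so far.

```go
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	mu          sync.Mutex
	someField   int
	initialized bool
	initCount   int // instrumentation for this sketch only
}

// initialize takes the lock before reading or writing the flag, so
// two goroutines cannot both observe initialized == false.
func (f *Foo) initialize() {
	f.mu.Lock()
	defer f.mu.Unlock()
	if !f.initialized {
		f.someField = 3
		f.initialized = true
		f.initCount++
	}
}

func (f *Foo) DoBaz() int {
	f.initialize()
	return f.someField
}

// runConcurrently is a hypothetical helper that runs fn in n
// goroutines and waits for all of them to finish.
func runConcurrently(n int, fn func()) {
	var wg sync.WaitGroup
	for k := 0; k < n; k++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fn()
		}()
	}
	wg.Wait()
}

func main() {
	f := &Foo{}
	runConcurrently(100, func() { f.DoBaz() })
	fmt.Println("someField =", f.someField, "initCount =", f.initCount)
}
```

It works, but we still carry the boolean flag around, and now a mutex as well.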
This kind of refactoring is not something that happens organically in a codebase, so it may be worth checking for it.
`sync.Once` is literally made for use cases like this: it is a simple and small structure, with a very clean and idiomatic interface. However, its usage is not obvious, and even seasoned Go developers may make mistakes with it.
The implementation I prefer is:
type Foo struct {
	someField int
	once      sync.Once
}

func (f *Foo) initialize() {
	f.once.Do(func() {
		f.someField = 3
	})
}

func (f *Foo) DoBaz() {
	f.initialize()
	// do something
}

func (f *Foo) DoBar() {
	f.initialize()
	// do something else
}
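A quick way to convince ourselves that this version behaves under concurrency is to call it from many goroutines at once. The sketch below does that; the `initCount` field, the `int` return value of `DoBaz` and the `runConcurrently` helper are additions made only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	someField int
	initCount int // instrumentation for this sketch only
	once      sync.Once
}

func (f *Foo) initialize() {
	f.once.Do(func() {
		f.someField = 3
		f.initCount++
	})
}

func (f *Foo) DoBaz() int {
	f.initialize()
	return f.someField
}

// runConcurrently is a hypothetical helper that runs fn in n
// goroutines and waits for all of them to finish.
func runConcurrently(n int, fn func()) {
	var wg sync.WaitGroup
	for k := 0; k < n; k++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fn()
		}()
	}
	wg.Wait()
}

func main() {
	f := &Foo{}
	runConcurrently(100, func() { f.DoBaz() })
	// prints "someField = 3 initCount = 1"
	fmt.Println("someField =", f.someField, "initCount =", f.initCount)
}
```

Running the sketch with `go run -race` is a reasonable extra check that no data race is reported.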
`sync.Once` lets us get rid of the boolean flag, and it takes care of concurrency.
It is not a perfect solution, though, since it hides a lot of foot guns.
Let's go through them together to make sure we avoid mistakes.
We are going to use the following example:
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	i    int
	once sync.Once
}

func (f *Foo) initialize() {
	f.once.Do(func() {
		f.i = 3
		fmt.Println("Done initialization")
	})
}

// Println already puts a space between its arguments,
// so the labels carry no trailing space.
func (f *Foo) Hoz() {
	fmt.Println("Hoz i =", f.i)
	f.initialize()
	fmt.Println("Hoz i =", f.i)
}

func (f *Foo) Bar() {
	fmt.Println("Bar i =", f.i)
	f.initialize()
	fmt.Println("Bar i =", f.i)
}

func main() {
	f := Foo{}
	f.Hoz()
	f.Bar()
}
This code prints a few lines so that we can follow what is happening.
You can check on the playground that the result is the expected one.
Hoz i = 0
Done initialization
Hoz i = 3
Bar i = 3
Bar i = 3
The first time, `Hoz` invokes the initialization function, which sets the `i` field. On the second run, the initialization body is not executed, and the field is left intact.
Try to hide the `sync.Once` by putting it on the `initialize` stack
The `sync.Once` is accessible from all the methods of the struct.
We could try to hide it inside the `f.initialize` method.
This does not work: each time we invoke `f.initialize`, a new `sync.Once` is created, and the new `sync.Once` will invoke the function passed to it as if for the first time.
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	i int
	// we remove the sync.Once
}

func (f *Foo) initialize() {
	// we create the sync.Once on the stack of
	// the initialize method
	var once sync.Once
	once.Do(func() {
		f.i = 3
		fmt.Println("Done initialization")
	})
}

func (f *Foo) Hoz() {
	fmt.Println("Hoz i =", f.i)
	f.initialize()
	fmt.Println("Hoz i =", f.i)
}

func (f *Foo) Bar() {
	fmt.Println("Bar i =", f.i)
	f.initialize()
	fmt.Println("Bar i =", f.i)
}

func main() {
	f := Foo{}
	f.Hoz()
	f.Bar()
}
This code won't work, and you can check it on the playground.
The body of the initialization function (the one that sets `f.i = 3`) is invoked each time we invoke the `f.initialize` method. The `sync.Once` is recreated each time, and each new `sync.Once` will fire once.
The result will be:
Hoz i = 0
Done initialization
Hoz i = 3
Bar i = 3
Done initialization
Bar i = 3
And we can see that the initialization runs twice, which is not what we want.
Try to hide the `sync.Once` by making it global
Another alternative would be to put the `sync.Once` in a global variable.
package main

import (
	"fmt"
	"sync"
)

// let's use a global sync.Once
var once sync.Once

type Foo struct {
	i int
	// we remove the sync.Once
}

func (f *Foo) initialize() {
	once.Do(func() {
		f.i = 3
		fmt.Println("Done initialization")
	})
}

func (f *Foo) Hoz() {
	fmt.Println("Hoz i =", f.i)
	f.initialize()
	fmt.Println("Hoz i =", f.i)
}

func (f *Foo) Bar() {
	fmt.Println("Bar i =", f.i)
	f.initialize()
	fmt.Println("Bar i =", f.i)
}

func main() {
	fmt.Println("f")
	f := Foo{}
	f.Hoz()
	f.Bar()
	fmt.Println("f1")
	f1 := Foo{}
	f1.Hoz()
}
This fails as soon as you have multiple instances of the same structure that need initialization.
Indeed, the result is wrong: the initialization runs only once in the whole process instead of once per structure.
f
Hoz i = 0
Done initialization
Hoz i = 3
Bar i = 3
Bar i = 3
f1
Hoz i = 0
Hoz i = 0
The only reasonable solution is to have one `sync.Once` per structure.
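To make the contrast with the global variable concrete, here is a small sketch with the `sync.Once` back inside the struct and two instances. The `Value` accessor is a hypothetical addition for this example.

```go
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	i    int
	once sync.Once // one sync.Once per instance
}

func (f *Foo) initialize() {
	f.once.Do(func() {
		f.i = 3
	})
}

// Value is a hypothetical accessor added for this sketch: it makes
// sure the instance is initialized and returns the field.
func (f *Foo) Value() int {
	f.initialize()
	return f.i
}

func main() {
	var f, f1 Foo
	// prints "3 3": each instance runs its own initialization
	fmt.Println(f.Value(), f1.Value())
}
```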
Do not use the `sync.Once` from multiple functions
We started the post by DRYing out the code and concentrating all the initialization logic in a single method that is invoked by all the other methods. With `sync.Once` this is even more important.
`sync.Once` will execute once and only once, even if the function passed to it is different on each call.
Spreading invocations of `sync.Once.Do` across methods makes it much more likely that the `sync.Once` is invoked with a different function, which introduces software defects.
Moreover, we might forget to wrap the initialization code in the `sync.Once`.
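The "different function" behaviour is easy to overlook, so here is a minimal sketch of it. The method bodies and the values 3 and 7 are made up for the illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	i    int
	once sync.Once
}

// Hoz initializes i to 3 through the shared sync.Once.
func (f *Foo) Hoz() {
	f.once.Do(func() { f.i = 3 })
}

// Bar passes a different function to the same sync.Once.
// If Hoz ran first, this function is silently skipped.
func (f *Foo) Bar() {
	f.once.Do(func() { f.i = 7 })
}

func main() {
	f := Foo{}
	f.Hoz()
	f.Bar()
	fmt.Println("i =", f.i) // prints "i = 3": Bar's function never ran
}
```

Which value ends up in `i` depends only on which method happens to be called first, which is exactly the kind of defect that is hard to track down.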
Do not do this:
package main

import (
	"fmt"
	"sync"
)

type Foo struct {
	i    int
	once sync.Once
}

func (f *Foo) initialize() {
	f.i = 3
	fmt.Println("Done initialization")
}

func (f *Foo) Hoz() {
	fmt.Println("Hoz i =", f.i)
	f.once.Do(f.initialize)
	fmt.Println("Hoz i =", f.i)
}

func (f *Foo) Bar() {
	fmt.Println("Bar i =", f.i)
	f.once.Do(f.initialize)
	fmt.Println("Bar i =", f.i)
}

func main() {
	fmt.Println("f")
	f := Foo{}
	f.Hoz()
	f.Bar()
}
While it works just as well, it makes it much more likely to introduce software defects in the future.
Conclusion
`sync.Once` is a very useful structure that can help simplify a lot of code and make it more idiomatic.
Unfortunately, it comes with some foot guns that effective Go developers need to know about in order to avoid problems.
I suggest that everybody check out the `sync.Once` source code: it is minimal, yet it tackles interesting concurrency issues in very few lines of code.