Advanced Go - Avoiding `if` for initialization using `sync.Once`, and foot guns to watch out for

We already discussed when to use constructors and when to let the users of your library initialize their structures themselves.

The short version: if the structure makes sense and can be used as its zero value, or with some fields unset (in other words, if the structure is consistent pretty much however it is built), then you should let the user initialize it directly, without a constructor.

If, on the other hand, the structure's fields need to be carefully crafted, or if the structure can be initialized in a way that is not safe, then we should prefer a constructor where we put all the necessary checks.
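As a minimal sketch of that second case, assume a hypothetical Pool structure whose size must be positive. The names Pool, NewPool and size are made up for illustration and do not come from any real library; the point is only that the constructor is the single place where the check lives.

```go
package main

import (
    "errors"
    "fmt"
)

// Pool is a hypothetical structure whose zero value is not usable:
// a pool of size 0 makes no sense.
type Pool struct {
    size int
}

// NewPool is the only supported way to build a Pool.
// It rejects sizes that would leave the structure inconsistent.
func NewPool(size int) (*Pool, error) {
    if size <= 0 {
        return nil, errors.New("pool size must be positive")
    }
    return &Pool{size: size}, nil
}

func main() {
    if _, err := NewPool(0); err != nil {
        fmt.Println("error:", err)
    }
    p, _ := NewPool(4)
    fmt.Println("size:", p.size)
}
```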

Let's keep in mind that if a structure is exposed by a package, it is always possible to initialize it. It may not make sense, but it is possible.

If you allow the structure to be initialized directly, you will likely find yourself checking whether this is the very first time a particular function runs, or the first time the structure is used at all.

If it is the first time, you may want to do some checks or set some private fields.

In general, this is a code smell that should be avoided. It means that the code can easily end up in an unmanageable state, and the check is just a band-aid on top of a design that could be improved.

Nevertheless, pragmatism and business requirements may make taking on some technical debt the best call. And so we are stuck with an ugly, but valuable, if.

The code usually goes like this:

type Foo struct {
    someField int

    // has the structure been initialized already?
    // defaults to false
    initialized bool
}

func (f *Foo) DoBaz() {
    if !f.initialized {
        f.someField = 3
        f.initialized = true
    }
    // do something
}

func (f *Foo) DoBar() {
    if !f.initialized {
        f.someField = 3
        f.initialized = true
    }
    // do something else
}

Please note that the initialized flag is designed so that its default value, false (the zero value of booleans), makes sense.

This code has a lot of issues besides the one we already discussed: it is not DRY, it is easy to forget to set the initialized flag to true, and it is hard to maintain.

This code often improves organically: the initialization moves into a private function that each method invokes.

type Foo struct {
    someField int

    // has the structure been initialized already?
    // defaults to false
    initialized bool
}

func (f *Foo) initialize() {
    f.someField = 3
    f.initialized = true
}

func (f *Foo) DoBaz() {
    if !f.initialized {
        f.initialize()
    }
    // do something
}

func (f *Foo) DoBar() {
    if !f.initialized {
        f.initialize()
    }
    // do something else
}

The new version is a bit better, but it can be improved. The check for the f.initialized flag should happen inside the f.initialize() method, and all the public methods should just invoke f.initialize(). One less thing to forget.

type Foo struct {
    someField int

    // has the structure been initialized already?
    // defaults to false
    initialized bool
}

func (f *Foo) initialize() {
    if !f.initialized {
        f.someField = 3
        f.initialized = true
    }
}

func (f *Foo) DoBaz() {
    f.initialize()
    // do something
}

func (f *Foo) DoBar() {
    f.initialize()
    // do something else
}

This is much better, since we just need to remember one thing: Always call f.initialize() before doing any work.

But it can still be improved.

This particular pattern is not concurrency-friendly: if the public methods are invoked by multiple goroutines, there is a data race on the initialized flag, and the initialization may run more than once.

There are multiple solutions, for instance guarding the check with a mutex. But what I find even better is to just use sync.Once.

This kind of refactoring is not something that happens organically in a codebase, so it may be worth actively looking for it.

sync.Once is literally made for use cases like this. It is a small, simple structure with a very clean and idiomatic interface. However, its usage is not obvious, and even seasoned Go developers may make mistakes with it.
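For comparison, a mutex-based version could look like the sketch below. It keeps the boolean flag from the previous listing and only wraps the check in a sync.Mutex. The field name mu, the int returned by DoBaz and the small main are assumptions added for illustration.

```go
package main

import (
    "fmt"
    "sync"
)

type Foo struct {
    someField int

    // mu guards initialized and the fields set during initialization
    mu          sync.Mutex
    initialized bool
}

func (f *Foo) initialize() {
    f.mu.Lock()
    defer f.mu.Unlock()
    if !f.initialized {
        f.someField = 3
        f.initialized = true
    }
}

// DoBaz returns someField only so the effect is easy to observe.
func (f *Foo) DoBaz() int {
    f.initialize()
    return f.someField
}

func main() {
    f := Foo{}
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            f.DoBaz()
        }()
    }
    wg.Wait()
    fmt.Println("someField =", f.someField) // prints "someField = 3"
}
```

This works, but every call takes the lock and we still carry the flag around by hand. sync.Once packages the same idea behind a single Do method.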

The implementation I prefer is:

type Foo struct {
    someField int

    once sync.Once
}

func (f *Foo) initialize() {
    f.once.Do(func() {
        f.someField = 3
    })
}

func (f *Foo) DoBaz() {
    f.initialize()
    // do something
}

func (f *Foo) DoBar() {
    f.initialize()
    // do something else
}

sync.Once lets us get rid of the boolean flag, and it takes care of concurrency.
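To see both properties at work, the sketch below calls DoBaz from many goroutines and counts how often the initialization body runs. The initCount field and the number of goroutines are additions made only for this demonstration.

```go
package main

import (
    "fmt"
    "sync"
)

type Foo struct {
    someField int
    initCount int // demonstration only: how many times the body ran

    once sync.Once
}

func (f *Foo) initialize() {
    f.once.Do(func() {
        f.someField = 3
        f.initCount++
    })
}

func (f *Foo) DoBaz() {
    f.initialize()
    // do something
}

func main() {
    f := Foo{}
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            f.DoBaz()
        }()
    }
    wg.Wait()
    // prints "someField = 3 initCount = 1"
    fmt.Println("someField =", f.someField, "initCount =", f.initCount)
}
```

No matter how the goroutines interleave, the body passed to Do runs exactly once, and every caller returns from Do only after it has completed.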

This is not a perfect solution, though, since it hides several foot guns.

Let's go through them together to make sure we avoid these mistakes.

We are going to use the following example:

package main

import (
    "fmt"
    "sync"
)

type Foo struct {
    i int

    once sync.Once
}

func (f *Foo) initialize() {
    f.once.Do(func() {
        f.i = 3
        fmt.Println("Done initialization")
    })
}

func (f *Foo) Hoz() {
    fmt.Println("Hoz i = ", f.i)
    f.initialize()
    fmt.Println("Hoz i = ", f.i)
}

func (f *Foo) Bar() {
    fmt.Println("Bar i = ", f.i)
    f.initialize()
    fmt.Println("Bar i = ", f.i)
}

func main() {
    f := Foo{}
    f.Hoz()
    f.Bar()
}

This code prints a few lines so that we can follow what is happening.

You can check on the playground that the result is the expected one.

Hoz i =  0
Done initialization
Hoz i =  3
Bar i =  3
Bar i =  3

The first time, Hoz invokes the initialization function, which sets the i field. On the second run, the initialization body is not executed, and the field is left intact.

Try to hide the sync.Once by putting it into the initialize stack

The sync.Once is accessible from all the methods of the struct.

We could try to hide the sync.Once inside the f.initialize method.

This does not work, since each time we invoke f.initialize a new sync.Once is created and the new sync.Once will invoke the internal function for the first time.

package main

import (
    "fmt"
    "sync"
)

type Foo struct {
    i int

    // we remove the sync.Once
}

func (f *Foo) initialize() {
    // we create the sync.Once as a local variable
    // of the initialize method
    once := sync.Once{}
    once.Do(func() {
        f.i = 3
        fmt.Println("Done initialization")
    })
}

func (f *Foo) Hoz() {
    fmt.Println("Hoz i = ", f.i)
    f.initialize()
    fmt.Println("Hoz i = ", f.i)
}

func (f *Foo) Bar() {
    fmt.Println("Bar i = ", f.i)
    f.initialize()
    fmt.Println("Bar i = ", f.i)
}

func main() {
    f := Foo{}
    f.Hoz()
    f.Bar()
}

This code won't work, as you can check in the playground.

The body of the initialize function (the one that sets f.i = 3) is invoked each time we invoke the f.initialize method. The sync.Once is recreated each time, and each new sync.Once will fire once.

The result will be:

Hoz i =  0
Done initialization
Hoz i =  3
Bar i =  3
Done initialization
Bar i =  3

And we can see that the initialization runs twice, which is not what we want.

Try to hide the sync.Once by making it global

Another alternative would be to put the sync.Once in a global variable.

package main

import (
    "fmt"
    "sync"
)

// let's use a global sync.Once
var once = sync.Once{}

type Foo struct {
    i int

    // we remove the sync.Once
}

func (f *Foo) initialize() {
    once.Do(func() {
        f.i = 3
        fmt.Println("Done initialization")
    })
}

func (f *Foo) Hoz() {
    fmt.Println("Hoz i = ", f.i)
    f.initialize()
    fmt.Println("Hoz i = ", f.i)
}

func (f *Foo) Bar() {
    fmt.Println("Bar i = ", f.i)
    f.initialize()
    fmt.Println("Bar i = ", f.i)
}

func main() {
    fmt.Println("f")
    f := Foo{}
    f.Hoz()
    f.Bar()

    fmt.Println("f1")
    f1 := Foo{}
    f1.Hoz()
}

This fails as soon as you have multiple instances of the same structure that need initialization.

Indeed, the result is wrong: the initialization runs only once in the whole process instead of once per structure.

f
Hoz i =  0
Done initialization
Hoz i =  3
Bar i =  3
Bar i =  3
f1
Hoz i =  0
Hoz i =  0

The only reasonable solution is to have one sync.Once per structure.
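Putting the sync.Once back into the struct fixes the two-instance case. The sketch below reuses the Foo from the first example with a shortened Hoz and main; each value carries its own once, so each one is initialized exactly once.

```go
package main

import (
    "fmt"
    "sync"
)

type Foo struct {
    i int

    // one sync.Once per Foo value
    once sync.Once
}

func (f *Foo) initialize() {
    f.once.Do(func() {
        f.i = 3
        fmt.Println("Done initialization")
    })
}

func (f *Foo) Hoz() {
    f.initialize()
    fmt.Println("Hoz i = ", f.i)
}

func main() {
    fmt.Println("f")
    f := Foo{}
    f.Hoz()
    f.Hoz()

    fmt.Println("f1")
    f1 := Foo{}
    f1.Hoz()
}
```

Here "Done initialization" is printed once under f and once under f1, and every Hoz line reports i = 3.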

Do not use the sync.Once from multiple functions

We started the post by DRYing out the code and concentrating all the initialization logic in a single method that is invoked by all the other methods. With sync.Once this is even more important.

sync.Once will execute once and only once, even if Do is later called with a different function.

Spreading the invocations of sync.Once.Do around makes it much more likely that the sync.Once gets invoked with a different function, which introduces software defects.
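A minimal sketch of that behaviour: the same sync.Once is handed two different functions, and only the first one ever runs. The helper runTwice and its return values are illustrative names, not part of any API.

```go
package main

import (
    "fmt"
    "sync"
)

// runTwice calls Do on the same sync.Once with two different functions
// and reports which of them actually ran.
func runTwice() (first, second bool) {
    var once sync.Once
    once.Do(func() { first = true })
    // once has already fired, so this function is silently skipped
    once.Do(func() { second = true })
    return first, second
}

func main() {
    first, second := runTwice()
    fmt.Println("first ran:", first)   // prints "first ran: true"
    fmt.Println("second ran:", second) // prints "second ran: false"
}
```

There is no error and no panic for the second call, which is exactly why this kind of defect is easy to miss.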

Moreover, we might forget to wrap the initialization code into the sync.Once.

Do not do this:

package main

import (
    "fmt"
    "sync"
)

type Foo struct {
    i int

    once sync.Once
}

func (f *Foo) initialize() {
    f.i = 3
    fmt.Println("Done initialization")
}

func (f *Foo) Hoz() {
    fmt.Println("Hoz i = ", f.i)
    f.once.Do(f.initialize)
    fmt.Println("Hoz i = ", f.i)
}

func (f *Foo) Bar() {
    fmt.Println("Bar i = ", f.i)
    f.once.Do(f.initialize)
    fmt.Println("Bar i = ", f.i)
}

func main() {
    fmt.Println("f")
    f := Foo{}
    f.Hoz()
    f.Bar()
}

While it works just as well, it makes it much more likely to introduce software defects in the future.

Conclusion

sync.Once is a very useful structure that can help simplify a lot of code and make it more idiomatic.

Unfortunately, it comes with some foot guns that effective Go developers need to know about in order to avoid problems.

I suggest that everybody check out the sync.Once source code: it is minimal, yet it tackles interesting concurrency issues in very few lines of code.
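To give a taste of what to look for, here is a simplified sketch written in the spirit of the standard library implementation. It is not the actual source: the type name myOnce is mine, and details such as panics inside f are ignored. The idea it tries to show is a cheap atomic check on the fast path and a mutex plus a second check on the slow path.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// myOnce is a simplified stand-in for sync.Once, for illustration only.
type myOnce struct {
    done uint32     // 1 once f has completed
    m    sync.Mutex // serializes the slow path
}

func (o *myOnce) Do(f func()) {
    // Fast path: once f has completed, a single atomic load is enough.
    if atomic.LoadUint32(&o.done) == 0 {
        o.doSlow(f)
    }
}

func (o *myOnce) doSlow(f func()) {
    o.m.Lock()
    defer o.m.Unlock()
    // Re-check under the lock: another goroutine may have won the race.
    if atomic.LoadUint32(&o.done) == 0 {
        // Mark as done only after f returns, so that concurrent
        // callers blocked on the mutex wait for f to finish.
        defer atomic.StoreUint32(&o.done, 1)
        f()
    }
}

func main() {
    var once myOnce
    count := 0
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            once.Do(func() { count++ })
        }()
    }
    wg.Wait()
    fmt.Println("count =", count) // prints "count = 1"
}
```

Setting the flag before calling f would be a bug: a second caller could skip the slow path and return while the initialization is still running.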