Paul Di Gian
Paul Di Gian's Blog

Advanced Go – Loops and goroutines

Paul Di Gian
Aug 18, 2021

In the previous post we took a deep look at how the Go runtime schedules and swaps the execution of goroutines.

In this post we are going to understand one of the main sources of errors with goroutines.

The problem

Invoking goroutines in a loop while misunderstanding the scope of the variables captured by the goroutines.

Once the underlying issue is understood, the problem will seem trivial, and it won't cause bugs anymore.

Let’s consider the following piece of code.

var input []string
// populate input
for _, s := range input {
    result := long_network_call(s)
    outputs <- result
}

There is a slice of inputs, strings in this case.

Each input is fed to a function that talks to the network and takes a long time to complete.

And finally the result is pushed to a channel.

If the objective is to make this code faster, the obvious solution would be to spawn a new goroutine for each long_network_call. However, it is important to do it correctly.

The following won’t work as expected.

var input []string
// populate input
for _, s := range input {
    go func() {
        result := long_network_call(s)
        outputs <- result
    }()
}

On the surface, everything is alright, but there is an important detail missing.

The inner closure will invoke the long_network_call function with s as argument. Each goroutine captures the variable s itself, not the value it holds at that moment, and that single variable is overwritten at each iteration.

Goroutines take some time to start executing; most likely, they will start working only once the loop has ended.

At the end of the loop, the argument of the function, s, refers to the last element of the input slice, so every goroutine ends up working on the same value. This will yield wrong results.

Consider this example:

// main.go
package main

import (
    "fmt"
    "time"
)

func main() {
    inputs := []string{"aaa", "bbb", "ccc"}
    for _, s := range inputs {
        go func() {
            fmt.Println(s)
        }()
    }
    time.Sleep(time.Second) // crude wait so the goroutines get a chance to run
}

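With the Go versions available at the time of writing (before Go 1.22, which gives each iteration its own copy of the loop variable), the result in this case will most likely be: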

$ go run main.go
ccc
ccc
ccc

This is clearly wrong and not what is expected.

The solution

Now that we have understood the problem, the solution is fairly simple.

We need a way to capture, inside the closure, the value the loop variable holds at each iteration.

In the example above, we need a way to invoke the fmt.Println function once with "aaa", another time with "bbb" and one last time with "ccc".

Exactly the same idea applies to the original example with long_network_call.

The standard solution for this problem is to pass the value of the variable as an argument when invoking the function, instead of relying on the closure capturing the correct variable, which fails in this case.

package main

import (
    "fmt"
    "time"
)

func main() {
    inputs := []string{"aaa", "bbb", "ccc"}
    for _, s := range inputs {
        // the function now needs a parameter
        go func(s string) {
            // this `s` now refers to the parameter
            // of the function, not the iterator
            // of the loop
            fmt.Println(s)
        }(s)
        // invoke the function with each variable
        // in the loop
    }
    time.Sleep(time.Second)
}

I personally like to keep the same variable name for the range iterator and for the parameter. This makes it clearer, to me at least, what we are doing and which variables are involved.
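The same fix can be applied to the original long_network_call example. The following is only a sketch under a few assumptions: longNetworkCall is a hypothetical stand-in that sleeps and upper-cases its input, fetchAll is a name made up for illustration, and a sync.WaitGroup replaces the time.Sleep so that the program waits exactly until every goroutine is done.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
	"time"
)

// longNetworkCall is a hypothetical stand-in for long_network_call:
// it only sleeps briefly and upper-cases its input.
func longNetworkCall(s string) string {
	time.Sleep(10 * time.Millisecond)
	return strings.ToUpper(s)
}

// fetchAll (a name assumed for this sketch) starts one goroutine per
// input and collects every result.
func fetchAll(input []string) []string {
	// Buffered, so that no goroutine blocks while sending its result.
	outputs := make(chan string, len(input))
	var wg sync.WaitGroup
	for _, s := range input {
		wg.Add(1)
		// s is passed as an argument: each goroutine gets its own copy.
		go func(s string) {
			defer wg.Done()
			outputs <- longNetworkCall(s)
		}(s)
	}
	wg.Wait()
	close(outputs)

	var results []string
	for r := range outputs {
		results = append(results, r)
	}
	// Goroutines finish in no particular order; sort for a stable output.
	sort.Strings(results)
	return results
}

func main() {
	fmt.Println(fetchAll([]string{"aaa", "bbb", "ccc"})) // prints [AAA BBB CCC]
}
```

The channel is buffered to the number of inputs only to keep the sketch short; a real program would more likely read from the channel while the goroutines are still running.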

A few Go details are at play in this particular example.

When spawning a goroutine, the arguments are evaluated immediately, in the calling goroutine, and copied to the stack of the new function call.
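A minimal sketch can make this copy visible. The function name spawnWithArgument and the variable names are made up for illustration: the caller changes x right after the go statement, yet the goroutine still sees the value x had when it was spawned.

```go
package main

import "fmt"

// spawnWithArgument (a name assumed for this sketch) returns the value
// the goroutine received (v) and the final value of the caller's
// variable (x).
func spawnWithArgument() (v, x int) {
	seen := make(chan int)
	x = 1
	go func(arg int) {
		seen <- arg
	}(x) // x is evaluated here, in the calling goroutine
	x = 2 // too late to affect arg, which is already a separate copy
	return <-seen, x
}

func main() {
	v, x := spawnWithArgument()
	fmt.Println(v, x) // prints "1 2"
}
```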

In the wrong example, this is what is happening in the for loop and for each function.

var s string // a single variable, shared by every iteration

// for loop - in one goroutine
for i := 0; i < len(input); i++ {
    s = input[i]
    go F() // `F` is a closure that reads the shared `s`
}

// clearly at the end of the for loop, s will hold
// the last element of the input slice

// each function `F` in its own goroutine
fmt.Println(s)

While in the correct example, something different is happening.

// for loop - in one goroutine
for i := 0; i < len(input); i++ {
    s = input[i]
    // the value of s is copied to the stack of `F`
    go F(s) // `F` uses the values in its own stack
}

// each invocation of `F` has the value of `s` copied
// in its own stack

// each function `F` in its own goroutine
fmt.Println(theValueCopiedInItsOwnStack)
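The post uses a function parameter, but another commonly seen way to get the same effect is to redeclare the variable inside the loop body, so that each iteration has its own variable for the closure to capture. The sketch below is an illustration of that alternative, not the method shown above; collect is a hypothetical helper, and a mutex and a WaitGroup are assumed in order to gather the values safely.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect (a name assumed for this sketch) runs one goroutine per input
// and returns the value that each goroutine saw.
func collect(inputs []string) []string {
	var (
		mu   sync.Mutex
		seen []string
		wg   sync.WaitGroup
	)
	for _, s := range inputs {
		s := s // a new variable per iteration, shadowing the loop variable
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			seen = append(seen, s) // captures the per-iteration `s`
			mu.Unlock()
		}()
	}
	wg.Wait()
	// Goroutines run in no particular order; sort for a stable output.
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(collect([]string{"aaa", "bbb", "ccc"})) // prints [aaa bbb ccc]
}
```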

Conclusion

Overall, this is one of the most common defects with Go and goroutines. And indeed, go vet has a check specifically for this case, the loopclosure analyzer.

Hence, if you are coding in a structured environment, for instance VS Code with the Go extension, the linter should point out the problem.

While the linter points out the problem, it is still important to have a deep understanding of what is happening, so as to be able to quickly debug issues like this and not waste time on them.
