Shared memory

Concurrency is a topic that I believe will take multiple blog posts to cover properly and thoroughly. The ultimate aim of this series is to illuminate the idiomatic Go patterns for writing concurrent and correct code.

One of the key and early lessons a programmer will learn when writing concurrent programs is that sharing memory or variables between concurrent processes is crucial. How to do this correctly is a very fundamental and important skill to understand. When a program has two concurrent processes updating and reading shared state there can be dragons lurking. At best, the programmer will encounter hard to understand race conditions, and at worse, she will be faced with corrupted memory.

What does this look like?

Here is a very simple example of using a shared map that gets updated by two concurrent go routines. After the go routines execute the program prints the value of a key that is shared between the two running routines.

package main

        import (
                "fmt"
                "time"
        )

        var sharedState = map[string]int{}

        func main() {
                go updateFoo(1)
                go updateFoo(2)

                time.Sleep(1000 * time.Millisecond)
                fmt.Printf("%d\n", sharedState["bar"])
        }

        func updateFoo(val int) {
                sharedState["bar"] = val
        }

What is the value of bar?

In the above code, we see the program uses the "go" routine to start a concurrent process. The output of the above program prints "2", at least it does on my computer. However, that is not necessarily guaranteed to be the case all the time, on different platforms the result could be "1". Orchestrating the order of concurrent processes, will be a topic for a later time, however, our chief concern is what happens if the two processes write to the map at the same time? Well according to the Go memory model both read and writing to shared memory must be serialized. The model does not actually define what happens when a write happens at exactly the same time, this is left undefined. What this really means is that nasal demons
could fly out of any programmers nose at any time at anytime!

What is wrong with this?

Besides nasal demons, if an update happens exactly at the same time, there could be memory corruption, it's left undefined.

This is where a programmer needs to control access to shared state via synchronization and this can be done with Go's core "sync" library. The sync library has a number of functions to assist with synchronization in a concurrent environment. So, if we wanted to rewrite the above
program with sync we could lock down the access to the key value pair with a mutex (Perhaps well deserving a separate post, however for more details see Wikipedia).

The code that works and is predictable

package main

        import (
                "fmt"
                "sync"
                "time"
        )

        var sharedState = map[string]int{}
        var mutex = &sync.Mutex{}

        func main() {
                go updateFoo(1)
                go updateFoo(2)
                time.Sleep(1000 * time.Millisecond)
                fmt.Printf("%d\n", sharedState["bar"])
        }

        func updateFoo(val int) {
                mutex.Lock()
                sharedState["bar"] = val
                mutex.Unlock()
        }

The above code will now safely update the sharedState without any fear
of updating the sharedState in unpredictable ways. The key to make this work is to initialize a "mutex" and call the associated "Lock" and "Unlock" methods when updating the shared memory.

But why do we sleep?

time.Sleep(1000 * time.Millisecond)

The sleep is there to allocate enough time for the go routines to
finish updating the shared map. However, this is not not very precise
methodology. What happens if the execution of updateFoo takes longer than the allotted time to finish? The answer is, the program exits before the running go routine can finish.

A more ideal solution would be to utilize some other useful functions
in the sync library that give the programmer the ability to wait for the execution of currently running routines to finish.

The sync library comes with the ability to keep track of executed go
routines and wait for a set of go routines to finish. This is done with
the WaitGroup type and the associated "Add", "Done", and "Wait" methods. For example, the above code can now be modified to look like this:

package main

        import (
                "fmt"
                "sync"
        )

        var sharedState = map[string]int{}
        var mutex = &sync.Mutex{}
        var wg sync.WaitGroup

        func main() {
                wg.Add(1)
                go updateFoo(1)
                wg.Add(1)
                go updateFoo(2)
                wg.Wait()
                fmt.Printf("%d\n", sharedState["bar"])
        }

        func updateFoo(val int) {
                mutex.Lock()
                sharedState["bar"] = val
                mutex.Unlock()
                wg.Done()
        }

The "Add" method increments an internal counter that indicates a "go" routine is currently in execution. The "Done" method call decrements this counter and should be called when the go routine is finished.
Finally, the "Wait" method will block the execution of the code until the counter reaches 0. So in our example, the code will wait until all the go routines are done before running the final print statement and exiting.

So what's on the agenda for next week?

I hope these examples will be helpful to those who are just beginning to explore the process of writing safe and concurrent programs in Go. Next week, I'd like to explore designing concurrent safe maps with just channels and go routines.