fix LibVirtPoolSync #159


Closed · monstermunchkin wants to merge 2 commits into dmacvicar:master from monstermunchkin:issues/121

Conversation

@monstermunchkin (Contributor) commented Aug 10, 2017

Declare the methods of LibVirtPoolSync with pointer receivers, since otherwise the entire construct is ineffective. Callers of AcquireLock are now responsible for making sure that the mutex is locked.

This will prevent the following errors:
* fatal error: concurrent map writes
* panic: sync: unlock of unlocked mutex

This fixes #121.

Signed-off-by: Thomas Hipp <[email protected]>
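
For orientation, a rough sketch of the shape this change describes: pointer receivers (so every caller shares the same internal mutex) and an AcquireLock that reports whether it obtained the per-pool lock, in line with the boolean-map discussion in the review below. This is a sketch, not the PR's exact diff; the poolLocked field name is guessed.

package libvirt

import "sync"

// LibVirtPoolSync sketch: pointer receivers ensure that every caller shares
// the same internalMutex and the same map, instead of per-call copies.
type LibVirtPoolSync struct {
	poolLocked    map[string]bool // illustrative field; the review below mentions a map of booleans
	internalMutex sync.Mutex
}

// AcquireLock reports whether the per-pool lock was obtained. Per the
// description above, callers are responsible for retrying until it succeeds.
func (ps *LibVirtPoolSync) AcquireLock(pool string) bool {
	ps.internalMutex.Lock()
	defer ps.internalMutex.Unlock()

	if ps.poolLocked[pool] {
		return false
	}
	ps.poolLocked[pool] = true
	return true
}

// ReleaseLock marks the pool as available again.
func (ps *LibVirtPoolSync) ReleaseLock(pool string) {
	ps.internalMutex.Lock()
	defer ps.internalMutex.Unlock()

	delete(ps.poolLocked, pool)
}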

@coveralls

Coverage Status

Coverage increased (+0.07%) to 56.763% when pulling 6bb8b27 on monstermunchkin:issues/121 into 92ff086 on dmacvicar:master.

@monstermunchkin requested a review from flavio August 11, 2017 10:21
@flavio (Collaborator) commented Aug 11, 2017

The code looks good; however, I think we should understand why we are trying to unlock the same lock twice. Why and how is this happening? Then we can create a proper fix; this one looks like a workaround.

Can you reproduce the issue? If so, do you know what might be the cause of it?

@monstermunchkin (Contributor, Author)

I was able to reproduce issue #121 by patching pool_sync.go with:

diff --git a/libvirt/pool_sync.go b/libvirt/pool_sync.go
index 8c53ac05..6a6fe9c2 100644
--- a/libvirt/pool_sync.go
+++ b/libvirt/pool_sync.go
@@ -1,6 +1,7 @@
 package libvirt
 
 import (
+       "log"
        "sync"
 )
 
@@ -29,9 +30,11 @@ func (ps LibVirtPoolSync) AcquireLock(pool string) {
        lock, exists := ps.PoolLocks[pool]
        if !exists {
                lock = new(sync.Mutex)
+               log.Printf("[DEBUG] acquire lock (1); pool: '%s', ps: %+v, lock: %+v\n", pool, ps, lock)
                ps.PoolLocks[pool] = lock
        }
 
+       log.Printf("[DEBUG] acquire lock (2); pool: '%s', ps: %+v, lock: %+v\n", pool, ps, lock)
        lock.Lock()
 }
 
@@ -45,5 +48,6 @@ func (ps LibVirtPoolSync) ReleaseLock(pool string) {
                return
        }
 
+       log.Printf("[DEBUG] release lock; pool: '%s', ps: %+v, lock: %+v\n", pool, ps, lock)
        lock.Unlock()
 }

(Oddly, it won't fail without the 1st log.Printf(...))

The Terraform file:

provider "libvirt" {
  uri = "qemu:///system"
}

resource "libvirt_cloudinit" "disk" {
  name = "commoninit.iso"
  local_hostname = "node"
  pool = "home"
}

resource "libvirt_volume" "disk" {
  name = "disk_${count.index}"
  size = "40960"
  pool = "home"
  count = 1
}

(I guess it's just important to have multiple volumes)

The log shows the following:

...
libvirt_volume.disk: Destroying... (ID: /home/thomas/work/var/lib/virt/images/disk_0)
libvirt_cloudinit.disk: Destroying... (ID: /home/thomas/work/var/lib/virt/images/c...o;5992b807-e00b-5692-9388-3997552f8ea1)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 2017/08/15 10:59:52 [DEBUG] acquire lock (1); pool: 'home', ps: {PoolLocks:map[] internalMutex:{state:1 sema:0}}, lock: &{state:0 sema:0}
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 2017/08/15 10:59:52 [DEBUG] acquire lock (1); pool: 'home', ps: {PoolLocks:map[] internalMutex:{state:1 sema:0}}, lock: &{state:0 sema:0}
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 2017/08/15 10:59:52 [DEBUG] acquire lock (2); pool: 'home', ps: {PoolLocks:map[home:0xc42035ac30] internalMutex:{state:1 sema:0}}, lock: &{state:0 sema:0}
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 2017/08/15 10:59:52 [DEBUG] acquire lock (2); pool: 'home', ps: {PoolLocks:map[home:0xc420330468] internalMutex:{state:1 sema:0}}, lock: &{state:0 sema:0}
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 2017/08/15 10:59:52 [DEBUG] release lock; pool: 'home', ps: {PoolLocks:map[home:0xc420330468] internalMutex:{state:1 sema:0}}, lock: &{state:1 sema:0}
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 2017/08/15 10:59:52 [DEBUG] release lock; pool: 'home', ps: {PoolLocks:map[home:0xc420330468] internalMutex:{state:1 sema:0}}, lock: &{state:0 sema:0}
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: fatal error: sync: unlock of unlocked mutex
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: 
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: goroutine 68 [running]:
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: runtime.throw(0xc8e987, 0x1e)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/runtime/panic.go:596 +0x95 fp=0xc4203716c8 sp=0xc4203716a8
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: sync.throw(0xc8e987, 0x1e)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/runtime/panic.go:585 +0x35 fp=0xc4203716e8 sp=0xc4203716c8
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: sync.(*Mutex).Unlock(0xc420330468)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/sync/mutex.go:113 +0xa4 fp=0xc420371710 sp=0xc4203716e8
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: github.com/dmacvicar/terraform-provider-libvirt/libvirt.LibVirtPoolSync.ReleaseLock(0xc4201893b0, 0x0, 0xc42035ac08, 0x4)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /home/thomas/go/src/github.com/dmacvicar/terraform-provider-libvirt/libvirt/pool_sync.go:52 +0x215 fp=0xc4203717b8 sp=0xc420371710
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: github.com/dmacvicar/terraform-provider-libvirt/libvirt.RemoveVolume(0xc420366190, 0xc4203762a0, 0x34, 0x0, 0x0)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /home/thomas/go/src/github.com/dmacvicar/terraform-provider-libvirt/libvirt/utils.go:101 +0x5da fp=0xc4203718e0 sp=0xc4203717b8
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: github.com/dmacvicar/terraform-provider-libvirt/libvirt.resourceCloudInitDelete(0xc420368c40, 0xb2abe0, 0xc420366198, 0x24, 0x133fe00)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /home/thomas/go/src/github.com/dmacvicar/terraform-provider-libvirt/libvirt/resource_cloud_init.go:118 +0xc8 fp=0xc420371930 sp=0xc4203718e0
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: github.com/dmacvicar/terraform-provider-libvirt/vendor/github.com/hashicorp/terraform/helper/schema.(*Resource).Apply(0xc4201fac00, 0xc420380280, 0xc42035d320, 0xb2abe0, 0xc420366198, 0x1, 0x28, 0xc42035f110)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /home/thomas/go/src/github.com/dmacvicar/terraform-provider-libvirt/vendor/github.com/hashicorp/terraform/helper/schema/resource.go:166 +0x539 fp=0xc420371a08 sp=0xc420371930
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: github.com/dmacvicar/terraform-provider-libvirt/vendor/github.com/hashicorp/terraform/helper/schema.(*Provider).Apply(0xc4201f4150, 0xc420380230, 0xc420380280, 0xc42035d320, 0x7f8a423ca960, 0x0, 0x0)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /home/thomas/go/src/github.com/dmacvicar/terraform-provider-libvirt/vendor/github.com/hashicorp/terraform/helper/schema/provider.go:259 +0x9b fp=0xc420371a68 sp=0xc420371a08
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: github.com/dmacvicar/terraform-provider-libvirt/vendor/github.com/hashicorp/terraform/plugin.(*ResourceProviderServer).Apply(0xc4201f22e0, 0xc42035d280, 0xc42035abf0, 0x0, 0x0)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /home/thomas/go/src/github.com/dmacvicar/terraform-provider-libvirt/vendor/github.com/hashicorp/terraform/plugin/resource_provider.go:488 +0x57 fp=0xc420371ac8 sp=0xc420371a68
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: runtime.call64(0xc42035f110, 0xc4201e8060, 0xc42035f140, 0x1800000028)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/runtime/asm_amd64.s:515 +0x48 fp=0xc420371b18 sp=0xc420371ac8
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: reflect.Value.call(0xc42028c3c0, 0xc4201e8060, 0x13, 0xc7689e, 0x4, 0xc420223f20, 0x3, 0x3, 0x0, 0x0, ...)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/reflect/value.go:434 +0x91f fp=0xc420371e60 sp=0xc420371b18
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: reflect.Value.Call(0xc42028c3c0, 0xc4201e8060, 0x13, 0xc420260f20, 0x3, 0x3, 0x0, 0x0, 0x0)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/reflect/value.go:302 +0xa4 fp=0xc420371ec8 sp=0xc420371e60
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: net/rpc.(*service).call(0xc4201ec500, 0xc4201ec4c0, 0xc4201ee460, 0xc4201f6380, 0xc4201f23e0, 0xb2bca0, 0xc42035d280, 0x16, 0xb2bce0, 0xc42035abf0, ...)
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/net/rpc/server.go:387 +0x144 fp=0xc420371f78 sp=0xc420371ec8
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: runtime.goexit()
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc420371f80 sp=0xc420371f78
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt: created by net/rpc.(*Server).ServeCodec
2017/08/15 10:59:52 [DEBUG] plugin: terraform-provider-libvirt:         /usr/lib64/go/1.8/src/net/rpc/server.go:481 +0x404
...

This raises the question of why two separate mutexes for 'home' are created in the first place. Isn't ps.internalMutex.Lock() there to prevent exactly that?
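
A minimal standalone demo of the suspected cause (type and field names are simplified, not the provider's actual code): with a value receiver, each call copies the whole struct, including internalMutex, so locking it synchronizes nothing, and two goroutines can both decide the pool has no mutex yet and each insert their own, which is exactly what the log shows for 'home'.

package main

import (
	"fmt"
	"sync"
)

// Simplified stand-in for LibVirtPoolSync; field names are illustrative.
type poolSync struct {
	internalMutex sync.Mutex
	poolLocks     map[string]*sync.Mutex
}

// Value receiver: every call operates on a fresh copy of ps, including a fresh
// copy of internalMutex, so concurrent callers never block each other and can
// both slip past the "does a lock for this pool already exist?" check.
// (go vet reports exactly this: "acquire passes lock by value".)
func (ps poolSync) acquire() {
	ps.internalMutex.Lock()
	fmt.Printf("mutex inside method: %p\n", &ps.internalMutex)
}

func main() {
	ps := poolSync{poolLocks: make(map[string]*sync.Mutex)}
	fmt.Printf("mutex at call site:  %p\n", &ps.internalMutex)
	ps.acquire() // locks a copy of the mutex and prints a different address
	ps.acquire() // does not block, even though acquire never unlocks
}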

@monstermunchkin (Contributor, Author)

After searching the web, I found this: golang/go#20060. I guess we're doing it wrong. I'll fix this.

@monstermunchkin changed the title from "prevent unlocking unlocked mutexes" to "fix LibVirtPoolSync" Aug 15, 2017
@coveralls

Coverage Status

Coverage increased (+0.1%) to 57.099% when pulling b6ca6fa on monstermunchkin:issues/121 into 1ec3dcd on dmacvicar:master.

@monstermunchkin (Contributor, Author)

FYI: go vet would have detected this:

$ go vet ./libvirt
...
libvirt/pool_sync.go:21: return copies lock value: libvirt.LibVirtPoolSync contains sync.Mutex     
libvirt/pool_sync.go:25: AcquireLock passes lock by value: libvirt.LibVirtPoolSync contains sync.Mutex
libvirt/pool_sync.go:39: ReleaseLock passes lock by value: libvirt.LibVirtPoolSync contains sync.Mutex
...

Perhaps we should run go vet inside of travis/run-tests-inside-guest once we've cleaned up our code.

@inercia (Contributor) left a comment

I don't think this is the right approach to the problem. I would create a singleton instance (a global, static instance, probably initialized in an init() function) and use it as a gatekeeper to the pools. I find the idea of keeping another map with booleans slightly hackish...
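
A minimal sketch of that suggestion, assuming a package-level singleton initialized in init(); all names here are illustrative, not existing code in the provider:

package libvirt

import "sync"

// Package-level singleton acting as the gatekeeper for storage pools.
var (
	poolMutexes      map[string]*sync.Mutex
	poolMutexesGuard sync.Mutex
)

func init() {
	poolMutexes = make(map[string]*sync.Mutex)
}

// lockPool blocks until the per-pool mutex is held.
func lockPool(pool string) {
	poolMutexesGuard.Lock()
	m, ok := poolMutexes[pool]
	if !ok {
		m = new(sync.Mutex)
		poolMutexes[pool] = m
	}
	poolMutexesGuard.Unlock()
	m.Lock()
}

// unlockPool releases the per-pool mutex taken by lockPool.
func unlockPool(pool string) {
	poolMutexesGuard.Lock()
	m, ok := poolMutexes[pool]
	poolMutexesGuard.Unlock()
	if ok {
		m.Unlock()
	}
}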

@@ -62,7 +62,11 @@ func (ci *defCloudInit) CreateAndUpload(virConn *libvirt.Connect) (string, error
 	}
 	defer pool.Free()
 
-	PoolSync.AcquireLock(ci.PoolName)
+	for {
+		if PoolSync.AcquireLock(ci.PoolName) {
Contributor

Looping until we get the lock doesn't seem a very elegant solution...

Contributor Author

Don't know how to avoid looping in this case. I added a time.Sleep(..) to make it more CPU-friendly ;)
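
Roughly, the pattern under discussion looks like this (a sketch assuming AcquireLock returns false when the pool is already locked; the 100ms interval is an arbitrary illustration, not necessarily what the commit uses):

// requires "time" in the imports
for {
	if PoolSync.AcquireLock(ci.PoolName) {
		break
	}
	// Back off briefly so the retry loop does not spin at full CPU.
	time.Sleep(100 * time.Millisecond)
}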


-	return pool
+	return &pool
Contributor

I would use a more go-ish syntax:

func NewLibVirtPoolSync() *LibVirtPoolSync {
    return &LibVirtPoolSync{
        PoolLocks: make(map[string]*sync.Mutex),
        //...
    }
}

@coveralls

Coverage Status

Coverage increased (+0.05%) to 57.027% when pulling cd38a8d on monstermunchkin:issues/121 into 1ec3dcd on dmacvicar:master.

@monstermunchkin added this to the 0.1 milestone Sep 18, 2017
@MalloZup mentioned this pull request Nov 16, 2017
Thomas Hipp added 2 commits November 17, 2017 10:09
Declare the methods of LibVirtPoolSync with pointer receivers, since otherwise the entire construct is ineffective. Callers of `AcquireLock` are now responsible for making sure that the mutex is locked.

This will prevent the following errors:
	* fatal error: concurrent map writes
	* panic: sync: unlock of unlocked mutex

This fixes #121.

Signed-off-by: Thomas Hipp <[email protected]>
Change code structure, and sleep after trying (and failing) to acquire a
lock.

Signed-off-by: Thomas Hipp <[email protected]>

Successfully merging this pull request may close these issues.

Crash on destroy
4 participants