I've configured the vSphere module in Metricbeat 6.0.0-alpha2 (on a Windows 2008 R2 machine) and let it run for a while. Please note that my vSphere configuration requires authentication (with insecure:true).
Metricbeat gathered data for 12+ hours, then Elasticsearch started filling up with events whose error.message field was set to NotAuthenticated (error.message:NotAuthenticated).
It happened when vSphere went "offline" for a scheduled backup activity; when it came back "online", the vsphere module apparently did not authenticate again, so ES started being populated with events carrying the NotAuthenticated error. I'm attaching a screenshot from Kibana which details the described flow.
Please note also that no error messages can be found in the metricbeat log itself.
Is there a way to make the vsphere module authenticate again?
Thank you for reporting, @alextxm. I have some questions:
I'll have a look at our client library (govmomi); perhaps the session expired and it doesn't authenticate on reconnect.
Hi @exekias,
The username/password didn't change, but the vSphere server services were restarted by the nightly scheduled vSphere DB backup activity... this should be the cause of the session expiration.
Please note that this scenario could be pretty common IMHO, since AFAIK a vSphere backup requires a stop/restart of the related services; as such, a session expiration/reconnect handling mechanism in the client lib (or the module itself) would be really useful.
So I think the client library is not issuing a new Login after reconnect. We will need to confirm that and fix it.
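To make the suspected gap concrete, here is a minimal sketch of "log in again and retry once" when a call comes back as NotAuthenticated. It deliberately avoids govmomi: the `session` interface, `fetchWithRelogin`, `fakeSession` and `errNotAuthenticated` are all hypothetical names invented for illustration, and the real fix may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotAuthenticated stands in for the NotAuthenticated fault seen in this issue.
var errNotAuthenticated = errors.New("NotAuthenticated")

// session is a hypothetical abstraction over the vSphere client.
// It is NOT a govmomi type.
type session interface {
	Login() error
	Fetch() (string, error)
}

// fetchWithRelogin calls Fetch and, if the server reports that the session
// is no longer authenticated, logs in again and retries exactly once.
func fetchWithRelogin(s session) (string, error) {
	out, err := s.Fetch()
	if !errors.Is(err, errNotAuthenticated) {
		return out, err // success, or an unrelated error
	}
	if err := s.Login(); err != nil {
		return "", fmt.Errorf("re-login failed: %w", err)
	}
	return s.Fetch()
}

// fakeSession simulates a server that has dropped the session,
// for example after a service restart.
type fakeSession struct {
	authenticated bool
	logins        int
}

func (f *fakeSession) Login() error {
	f.authenticated = true
	f.logins++
	return nil
}

func (f *fakeSession) Fetch() (string, error) {
	if !f.authenticated {
		return "", errNotAuthenticated
	}
	return "metrics", nil
}

func main() {
	s := &fakeSession{} // starts unauthenticated, as after a vSphere restart
	out, err := fetchWithRelogin(s)
	fmt.Println(out, err, s.logins)
}
```

Retrying only once keeps a permanently rejected login (e.g. changed credentials) from looping forever; the error is then surfaced to the caller instead.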
Hi, is there any news on this? Can I help with further testing?
@exekias
First I tried to use iptables to block the access and it worked fine: it reconnected after I deleted the rule.
Then I tried terminating the metricbeat session using the vSphere Web Client. This way I got the same "NotAuthenticated" error all the time.
I'm not sure if this is the best way to fix it and reconnect, but at least in the test that I made, it worked.
Update: I was getting a nil pointer but just needed to recreate the view after reauthenticating.
@amandahla from what I see we could benefit from using Session instead of Client? It would handle keepalive and some more goodies, what do you think? https://github.com/vmware/vic/blob/master/pkg/vsphere/session/session.go#L50-L87
@exekias I'm just not sure it will work on both versions (6.0/6.5), because I see here that it uses finder to populate. That is the same thing I had to change because of the 'datastore (or host) '*' not found' error. But I believe it's worth giving it a try. I'll try to follow this to make the changes and test. What do you think?
Sounds good to me, I guess we can wait for that before merging https://github.com/elastic/beats/pull/4883?
Yes, I think it would be better.
Please, can you help me with something? I made a test and now I get this:
./metricbeat flag redefined: version
panic: ./metricbeat flag redefined: version
goroutine 1 [running]:
flag.(*FlagSet).Var(0xc42001c120, 0x38f87a0, 0x396f426, 0x29f1b1c, 0x7, 0x2a026f9, 0x11)
/home/user/Documentos/CDGIN/2017/ESTALEIRO/metricbeat/go-1.8.3/go/src/flag/flag.go:793 +0x420
flag.BoolVar(0x396f426, 0x29f1b1c, 0x7, 0x0, 0x2a026f9, 0x11)
/home/user/Documentos/CDGIN/2017/ESTALEIRO/metricbeat/go-1.8.3/go/src/flag/flag.go:572 +0x72
github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/pkg/version.init.1()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/pkg/version/version.go:56 +0x5c
github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/pkg/version.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/pkg/version/version.go:146 +0x5d
github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/lib/config/executor.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/lib/config/executor/network_interface.go:85 +0x58
github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/lib/config.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/lib/config/virtual_container_host.go:351 +0x67
github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/pkg/vsphere/session.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/module/vsphere/vendor/github.com/vmware/vic/pkg/vsphere/session/session.go:330 +0x8e
github.com/elastic/beats/metricbeat/module/vsphere/host.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/module/vsphere/host/host.go:128 +0x6c
github.com/elastic/beats/metricbeat/include.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/include/list.go:114 +0x1ec
github.com/elastic/beats/metricbeat/cmd.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/cmd/root.go:30 +0x71
main.init()
/home/user/monvm1/src/github.com/elastic/beats/metricbeat/main.go:21 +0x49
I needed to import "github.com/vmware/vic/pkg/vsphere/session" and added it to the vsphere vendor folder. I'm not sure how to resolve this. :-(
You can patch https://github.com/vmware/vic/blob/master/pkg/version/version.go#L56 in the vendor folder temporarily and keep going, then we can deal with that issue once/if it's working. Try changing "version" to something else.
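For context on why renaming helps: Go's standard `flag` package panics when the same flag name is registered twice on one FlagSet, which is what the `flag redefined: version` panic in the trace reports. The sketch below reproduces that behaviour in isolation on a private FlagSet; `defineTwice` and the `demo` set name are made up for illustration and have nothing to do with the vic or beats code.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// defineTwice registers two bool flags on a fresh FlagSet and returns the
// recovered panic value, or nil if both registrations succeeded.
func defineTwice(first, second string) (recovered interface{}) {
	defer func() { recovered = recover() }()

	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the duplicate-flag message off stderr
	fs.Bool(first, false, "first definition")
	fs.Bool(second, false, "second definition") // panics if second == first
	return nil
}

func main() {
	// Same name twice: the flag package panics, as in the trace above.
	fmt.Println("version/version panicked: ", defineTwice("version", "version") != nil)
	// Renaming one of them avoids the clash.
	fmt.Println("version/version1 panicked:", defineTwice("version", "version1") != nil)
}
```

In the real binary both registrations go to the global `flag.CommandLine` during package `init`, so the panic fires before `main` runs and cannot be recovered this way; renaming the vendored flag only sidesteps the clash.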
Thanks @exekias, I tested with both versions and it was fine. Now, when the session is deactivated, it re-authenticates after the keepalive time.
For this commit, I changed 'version' to 'version1' in https://github.com/vmware/vic/blob/master/pkg/version/version.go#L56
Log:
WARN[0060] session keepalive error: ServerFaultCode: The session is not authenticated.
INFO[0060] session keepalive re-authenticated
Hi @amandahla,
I've been testing the vsphere module in 6.0.0-rc1 since last week and the bug still seems to be present:
Hi all,
I can confirm it still happens regularly due to the vSphere daily backup activities: vSphere closes connections, metricbeat then starts logging "unable to connect" errors and after a while switches to "not authenticated" and stays stuck in that condition even when vSphere is available again. The only way to get metricbeat to collect data again is to restart it.
Hi @alextxm. The fix has not been merged yet; I'm still working on it.
https://github.com/elastic/beats/pull/4883
I gave #4883 a try without success :(
I have another idea, what about initializing a new client on every fetch? That would be moving https://github.com/elastic/beats/blob/master/metricbeat/module/vsphere/virtualmachine/virtualmachine.go#L58 to the Fetch method. Any opinion on this @amandahla ?
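A rough sketch of the "new client on every fetch" idea, again without govmomi: the metricset keeps only a factory and builds, uses and logs out a client inside each `Fetch`, so a session invalidated between two fetches is never reused. The `client` interface, `metricSet`, `newClient` and `fakeClient` are hypothetical names for illustration, not the actual Metricbeat or govmomi types.

```go
package main

import "fmt"

// client is a hypothetical stand-in for a vSphere client.
// It is NOT a govmomi type.
type client interface {
	Collect() ([]string, error)
	Logout() error
}

// metricSet stores only what is needed to build a client, not the client itself.
type metricSet struct {
	newClient func() (client, error) // hypothetical factory: connect and log in
}

// Fetch creates a fresh, freshly authenticated client on every call and
// logs out when done, so a stale session cannot survive between fetches.
func (m *metricSet) Fetch() ([]string, error) {
	c, err := m.newClient()
	if err != nil {
		return nil, fmt.Errorf("connecting to vSphere: %w", err)
	}
	defer c.Logout() // logout error is ignored in this sketch
	return c.Collect()
}

// fakeClient counts logouts so the per-fetch lifecycle can be observed.
type fakeClient struct{ logouts *int }

func (f fakeClient) Collect() ([]string, error) { return []string{"vm-1"}, nil }
func (f fakeClient) Logout() error              { *f.logouts++; return nil }

func main() {
	logins, logouts := 0, 0
	m := &metricSet{newClient: func() (client, error) {
		logins++
		return fakeClient{logouts: &logouts}, nil
	}}
	for i := 0; i < 2; i++ {
		events, err := m.Fetch()
		fmt.Println(events, err)
	}
	fmt.Println("logins:", logins, "logouts:", logouts)
}
```

The trade-off is one login/logout round trip per collection period in exchange for not having to detect and repair an expired session.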
Hi all,
I understand this is still a WIP, but is there any chance of getting a fix in time for 6.0 GA? It would be really nice!
Thank you
@exekias I'll try that and see whether logout, as it's used here, really works fine.
Still using the PR #4883