My salt environment has been set up for 2 years now, and every now and then I've been seeing GPG decryption failures.
From the salt master logs:
{ "highstate-timestamp": "2018-12-16 17:03:58,595", "name": "salt.loaded.int.render.gpg", "levelname": "WARNING ", "message": "Could not decrypt cipher -----BEGIN PGP MESSAGE-----
Version: GnuPG v1
hQEMAw9iLKXotYWZAQgAqWanFkBMos5tEeH0P6dyHonlA5pkUv2xFj4oDGoaBCG2
TwbvD3/HXI6rTvvvSK757nvKiTHR+Aq6LLVYJ0+S2HYuhG5A33FWCvZ/pMQBCHhg
1hQ/oTzuHLSZAsk9dhtEskNnXJRt0iGkmAeH5qUk8ZOBHNUHsLzggQ56BosSuglA
e63QAf9PMvd0tgNGnC7ZBmchWWDh8VPHJCotQFrEb6BvDvKXvp5fSVed/7QFPQK3
5mlasbKWwPQBZwe66HZ2xs/XhF60cvz1Jh5lv/KjQtrGnkzXcr6pe9vBZjmItPw7
d8GyWy9BFSAGFY5iINkFpg8bnPAYM9oIKjHw+B/jxNJJAWHvb54Fggl8wNdA3gDj
Yd74jjvM3OC0i7JDUAacjg7hdTxLbL9JxNNKvBhCtRzcTfy9Uinm2N6W1tuEBu4N
TT9L3ZlUAAn0nA==
=+diQ
-----END PGP MESSAGE-----
, received: [GNUPG:] ENC_TO <redacted>
gpg: can't connect to the agent: IPC connect call failed
[GNUPG:] KEY_CONSIDERED <redacted>
gpg: encrypted with 2048-bit RSA key, ID <redacted>, created 2017-02-07
"companykey"
[GNUPG:] NO_SECKEY <redacted>
[GNUPG:] BEGIN_DECRYPTION
[GNUPG:] DECRYPTION_FAILED
gpg: decryption failed: No secret key
[GNUPG:] END_DECRYPTION
" }
But on the next scheduled highstate, I don't experience the problem.
/srv/salt/pillar/schedule.sls:
schedule:
  highstate:
    function: state.highstate
    seconds: 300
    splay: 60
/srv/salt/pillar/db.sls:
#!jinja|yaml|gpg
api:
  db_username: |
    -----BEGIN PGP MESSAGE-----
    Version: GnuPG v1

    hQEMAw9iLKXotYWZAQgAqWanFkBMos5tEeH0P6dyHonlA5pkUv2xFj4oDGoaBCG2
    TwbvD3/HXI6rTvvvSK757nvKiTHR+Aq6LLVYJ0+S2HYuhG5A33FWCvZ/pMQBCHhg
    1hQ/oTzuHLSZAsk9dhtEskNnXJRt0iGkmAeH5qUk8ZOBHNUHsLzggQ56BosSuglA
    e63QAf9PMvd0tgNGnC7ZBmchWWDh8VPHJCotQFrEb6BvDvKXvp5fSVed/7QFPQK3
    5mlasbKWwPQBZwe66HZ2xs/XhF60cvz1Jh5lv/KjQtrGnkzXcr6pe9vBZjmItPw7
    d8GyWy9BFSAGFY5iINkFpg8bnPAYM9oIKjHw+B/jxNJJAWHvb54Fggl8wNdA3gDj
    Yd74jjvM3OC0i7JDUAacjg7hdTxLbL9JxNNKvBhCtRzcTfy9Uinm2N6W1tuEBu4N
    TT9L3ZlUAAn0nA==
    =+diQ
    -----END PGP MESSAGE-----
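For context, a ciphertext block like the one above is normally produced on the master with something along these lines (just a sketch: the recipient name companykey is taken from the warning above, the plaintext is a placeholder, and it assumes the renderer keydir /etc/salt/gpgkeys is used as the GPG home):
$ echo -n 'example-db-username' | gpg --homedir /etc/salt/gpgkeys --armor --batch --trust-model always --encrypt -r companykey
The armored output is then pasted into the pillar file under db_username as shown.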
Also, I don't generate my keys anymore. I just copy them manually from my stash to /etc/salt/gpgkeys, then import the public key:
/usr/bin/gpg --import /etc/salt/gpgkeys/public_key.gpg
Here are the current contents of my /etc/salt/gpgkeys:
$ ls /etc/salt/gpgkeys/
S.gpg-agent
S.gpg-agent.browser
S.gpg-agent.extra
S.gpg-agent.ssh
private-keys-v1.d
public_key.gpg
pubring.gpg
secring.gpg
trustdb.gpg
From the salt master I tried listing the pillar items, and it doesn't fail:
$ sudo salt 'server-1' pillar.items
server-1:
----------
api:
----------
db_username:
db_username
I'm not exactly sure how to reproduce this, because it only happens once in a blue moon (though it has happened multiple times before).
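A loop like the following (just a sketch) could be used to hammer the renderer and check whether the raw ciphertext ever leaks through into the rendered pillar, which is what happens when decryption fails:
for i in $(seq 1 50); do
  sudo salt 'server-1' pillar.items | grep -q 'BEGIN PGP MESSAGE' \
    && echo "undecrypted pillar returned on iteration $i"
done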
I was reading some posts I found on Google about this same issue:
http://joshbolling.com/2017/05/21/quick-tip-fix-gpg-cant-connect-to-the-agent-ipc-connect-call-failed-error/
https://blog.badgerops.net/2017/05/31/gpg-encrypting-secrets-with-saltstack/
Based on those two posts: am I doing something wrong here? Is it because of the directory where the keys are located? Is it because my scheduler runs too frequently?
Salt Version:
Salt: 2018.3.3
Dependency Versions:
cffi: 1.11.5
cherrypy: Not Installed
dateutil: 2.7.5
docker-py: Not Installed
gitdb: 2.0.3
gitpython: 2.1.8
ioflo: Not Installed
Jinja2: 2.10
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: 1.0.7
msgpack-pure: Not Installed
msgpack-python: 0.5.6
mysql-python: Not Installed
pycparser: 2.19
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 2.7.15rc1 (default, Nov 12 2018, 14:31:15)
python-gnupg: 0.4.1
PyYAML: 3.13
PyZMQ: 16.0.2
RAET: Not Installed
smmap: 2.0.3
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.2.5
System Versions:
dist: Ubuntu 18.04 bionic
locale: UTF-8
machine: x86_64
release: 4.15.0-1021-aws
system: Linux
version: Ubuntu 18.04 bionic
Can you check the gpg log file and see if there is any useful information there?
Also, when this occurs, can you check the salt debug log to see whether there is another process right before it that's trying to render the gpg pillar?
I just started the GPG logging. I'll update this thread once I encounter it again and post the GPG log.
But here's the output of the salt debug log:
kenan@salt-master:~$ sudo salt 'server-1' pillar.items -l debug
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/engine.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/engine.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/lxc_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/lxc_profiles.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/reactor.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/reactor.conf
[DEBUG ] Changed git to gitfs in master opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt-master
[DEBUG ] Missing configuration file: /home/kenan/.saltrc
[DEBUG ] Configuration file path: /etc/salt/master
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/engine.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/engine.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/lxc_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/lxc_profiles.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/reactor.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/reactor.conf
[DEBUG ] Changed git to gitfs in master opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt-master
[DEBUG ] Missing configuration file: /home/kenan/.saltrc
[DEBUG ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Popen(['git', 'version'], cwd=/home/kenan, universal_newlines=False, shell=None)
[DEBUG ] Popen(['git', 'version'], cwd=/home/kenan, universal_newlines=False, shell=None)
[DEBUG ] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/master', u'salt-master', u'tcp://127.0.0.1:4506', u'clear')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://127.0.0.1:4506
[DEBUG ] Trying to connect to: tcp://127.0.0.1:4506
[DEBUG ] Initializing new IPCClient for path: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] LazyLoaded local_cache.get_load
[DEBUG ] Reading minion list from /var/cache/salt/master/jobs/2c/9d118024bbe4db9ff2b585b3635c1c597e8493bcfab9aba8c06d250675ee3a/.minions.p
[DEBUG ] get_iter_returns for jid 20181218070623416000 sent to set(['server-1']) will timeout at 07:06:53.428735
[DEBUG ] jid 20181218070623416000 return from server-1
[DEBUG ] return event: {u'server-1', u'retcode': 0, u'ret': { <reducted>
Hi @Ch3LL, it happened to me again. This is the output of gpg-agent.log:
2019-01-11 10:06:25 gpg-agent[26930] gpg-agent (GnuPG) 2.2.4 stopped
2019-01-11 23:24:22 gpg-agent[10138] gpg-agent (GnuPG) 2.2.4 started
2019-01-11 23:24:22 gpg-agent[10138] socket file has been removed - shutting down
2019-01-11 23:24:22 gpg-agent[10138] gpg-agent (GnuPG) 2.2.4 stopped
2019-01-11 23:24:22 gpg-agent[10150] gpg-agent (GnuPG) 2.2.4 started
2019-01-11 23:24:22 gpg-agent[10150] error enabling daemon termination by socket removal: No such file or directory
2019-01-11 23:25:22 gpg-agent[10150] can't connect my own socket: IPC connect call failed
2019-01-11 23:25:22 gpg-agent[10150] this process is useless - shutting down
2019-01-11 23:26:22 gpg-agent[10150] gpg-agent (GnuPG) 2.2.4 stopped
2019-01-12 00:52:50 gpg-agent[29756] gpg-agent (GnuPG) 2.2.4 started
2019-01-12 00:52:50 gpg-agent[29756] socket file has been removed - shutting down
Any ideas how to resolve this?
And do you have the salt timestamps to go along with those logs, by chance? The salt debug log timestamps, that is.
Hi,
It happened again. Sorry for the late reply. During that time I wasn't able to turn on the debug log level.
Here's the output:
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/engine.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/engine.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/lxc_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/lxc_profiles.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/reactor.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/reactor.conf
[DEBUG ] Changed git to gitfs in master opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt-master
[DEBUG ] Missing configuration file: /home/kenan/.saltrc
[DEBUG ] Configuration file path: /etc/salt/master
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG ] LazyLoaded jobs.lookup_jid
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG ] Including configuration from '/etc/salt/minion.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/f_defaults.conf
[DEBUG ] Changed git to gitfs in minion opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt-master
[DEBUG ] Grains refresh requested. Refreshing grains.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/engine.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/engine.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/lxc_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/lxc_profiles.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/reactor.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/reactor.conf
[DEBUG ] Please install 'virt-what' to improve results of the 'virtual' grain.
[DEBUG ] Retrieving credentials from metadata server.
[DEBUG ] Retrieved credentials will expire in 5:59:55.817917 at: 2019-01-31T09:37:47Z
[DEBUG ] Method: POST
[DEBUG ] Path: /
[DEBUG ] Data:
[DEBUG ] Headers: {}
[DEBUG ] Host: ec2.us-east-1.amazonaws.com
[DEBUG ] Port: 443
[DEBUG ] Params: {'Filter.1.Name': 'resource-id', 'Filter.2.Value.1': 'instance', 'Action': 'DescribeTags', 'Version': '2014-10-01', 'Filter.2.Name': 'resource-type', 'Filter.1.Value.1': u'i-04'}
[DEBUG ] establishing HTTPS connection: host=ec2.us-east-1.amazonaws.com, kwargs={'port': 443, 'timeout': 70}
[DEBUG ] Response headers: [('date', 'Thu, 31 Jan 2019 03:37:50 GMT'), ('vary', 'Accept-Encoding'), ('content-length', '2329'), ('content-type', 'text/xml;charset=UTF-8'), ('server', 'AmazonEC2')]
[DEBUG ] <?xml version="1.0" encoding="UTF-8"?>
<DescribeTagsResponse xmlns="http://ec2.amazonaws.com/doc/2014-10-01/">
<requestId>4734f5b7-f43a-4853-b36b-72966ee6beb8</requestId>
<tagSet>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>Name</key>
<value>salt-master</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>aws:autoscaling:groupName</key>
<value>cf-prod-saltmaster-asg</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>aws:cloudformation:logical-id</key>
<value>saltMasterAutoScalingGroup</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>environment</key>
<value>prod</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>location</key>
<value>us-east-1</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>role</key>
<value>salt</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>type</key>
<value>master</value>
</item>
</tagSet>
</DescribeTagsResponse>
[DEBUG ] Loading static grains from /etc/salt/grains
[DEBUG ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Popen(['git', 'version'], cwd=/home/kenan, universal_newlines=False, shell=None)
[DEBUG ] Popen(['git', 'version'], cwd=/home/kenan, universal_newlines=False, shell=None)
[DEBUG ] Initializing new IPCClient for path: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Sending event: tag = salt/run/20190131033750545717/new; data = {u'fun': u'runner.jobs.lookup_jid', u'fun_args': [u'20190131033359699746'], u'jid': u'20190131033750545717', u'user': u'sudo_kenan', u'_stamp': '2019-01-31T03:37:51.513845'}
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG ] Including configuration from '/etc/salt/minion.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/f_defaults.conf
[DEBUG ] Changed git to gitfs in minion opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt-master
[DEBUG ] Grains refresh requested. Refreshing grains.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/engine.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/engine.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/lxc_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/lxc_profiles.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/reactor.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/reactor.conf
[DEBUG ] Please install 'virt-what' to improve results of the 'virtual' grain.
[DEBUG ] Retrieving credentials from metadata server.
[DEBUG ] Retrieved credentials will expire in 5:59:55.072222 at: 2019-01-31T09:37:47Z
[DEBUG ] Method: POST
[DEBUG ] Path: /
[DEBUG ] Data:
[DEBUG ] Headers: {}
[DEBUG ] Host: ec2.us-east-1.amazonaws.com
[DEBUG ] Port: 443
[DEBUG ] Params: {'Filter.1.Name': 'resource-id', 'Filter.2.Value.1': 'instance', 'Action': 'DescribeTags', 'Version': '2014-10-01', 'Filter.2.Name': 'resource-type', 'Filter.1.Value.1': u'i-04'}
[DEBUG ] establishing HTTPS connection: host=ec2.us-east-1.amazonaws.com, kwargs={'port': 443, 'timeout': 70}
[DEBUG ] Response headers: [('date', 'Thu, 31 Jan 2019 03:37:51 GMT'), ('vary', 'Accept-Encoding'), ('content-length', '2329'), ('content-type', 'text/xml;charset=UTF-8'), ('server', 'AmazonEC2')]
[DEBUG ] <?xml version="1.0" encoding="UTF-8"?>
<DescribeTagsResponse xmlns="http://ec2.amazonaws.com/doc/2014-10-01/">
<requestId>b0cfd6a2-5f5d-43af-b3a1-977a3d48444b</requestId>
<tagSet>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>Name</key>
<value>salt-master</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>aws:autoscaling:groupName</key>
<value>cf-prod-saltmaster-asg</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>aws:cloudformation:logical-id</key>
<value>saltMasterAutoScalingGroup</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>environment</key>
<value>prod</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>location</key>
<value>us-east-1</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>role</key>
<value>salt</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>type</key>
<value>master</value>
</item>
</tagSet>
</DescribeTagsResponse>
[DEBUG ] Loading static grains from /etc/salt/grains
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG ] Including configuration from '/etc/salt/minion.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/f_defaults.conf
[DEBUG ] Changed git to gitfs in minion opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt-master
[DEBUG ] Grains refresh requested. Refreshing grains.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/engine.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/engine.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/lxc_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/lxc_profiles.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/reactor.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/reactor.conf
[DEBUG ] Please install 'virt-what' to improve results of the 'virtual' grain.
[DEBUG ] Retrieving credentials from metadata server.
[DEBUG ] Retrieved credentials will expire in 5:59:54.508691 at: 2019-01-31T09:37:47Z
[DEBUG ] Method: POST
[DEBUG ] Path: /
[DEBUG ] Data:
[DEBUG ] Headers: {}
[DEBUG ] Host: ec2.us-east-1.amazonaws.com
[DEBUG ] Port: 443
[DEBUG ] Params: {'Filter.1.Name': 'resource-id', 'Filter.2.Value.1': 'instance', 'Action': 'DescribeTags', 'Version': '2014-10-01', 'Filter.2.Name': 'resource-type', 'Filter.1.Value.1': u'i-04'}
[DEBUG ] establishing HTTPS connection: host=ec2.us-east-1.amazonaws.com, kwargs={'port': 443, 'timeout': 70}
[DEBUG ] Response headers: [('date', 'Thu, 31 Jan 2019 03:37:52 GMT'), ('vary', 'Accept-Encoding'), ('content-length', '2329'), ('content-type', 'text/xml;charset=UTF-8'), ('server', 'AmazonEC2')]
[DEBUG ] <?xml version="1.0" encoding="UTF-8"?>
<DescribeTagsResponse xmlns="http://ec2.amazonaws.com/doc/2014-10-01/">
<requestId>ee6077fc-b748-4931-9cdf-ba134c35deb9</requestId>
<tagSet>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>Name</key>
<value>salt-master</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>aws:autoscaling:groupName</key>
<value>cf-prod-saltmaster-asg</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>aws:cloudformation:logical-id</key>
<value>saltMasterAutoScalingGroup</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>environment</key>
<value>prod</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>location</key>
<value>us-east-1</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>role</key>
<value>salt</value>
</item>
<item>
<resourceId>i-04</resourceId>
<resourceType>instance</resourceType>
<key>type</key>
<value>master</value>
</item>
</tagSet>
</DescribeTagsResponse>
[DEBUG ] Loading static grains from /etc/salt/grains
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] LazyLoaded local_cache.get_load
[DEBUG ] Reading minion list from /var/cache/salt/master/jobs/64/209b78813d5e0f2f4282ceeccba12782eee5dfa8beb5350a266596d050d0d3/.minions.p
[DEBUG ] LazyLoaded local_cache.prep_jid
[DEBUG ] Adding minions for job 20190131033750545717: []
[DEBUG ] Sending event: tag = salt/run/20190131033750545717/ret; data = {
u 'fun_args': [u '20190131033359699746'], u 'jid': u '20190131033750545717', u 'return': {
u 'outputter': 'highstate',
u 'data': {
u 'salt-minion': {
'pkgrepo_|-add-yarn-repo_|-deb https://dl.yarnpkg.com/debian/ stable main_|-managed': {
'comment': "Package repo 'deb https://dl.yarnpkg.com/debian/ stable main' already configured",
'name': 'deb https://dl.yarnpkg.com/debian/ stable main',
'start_time': '03:33:51.277731',
'result': True,
'duration': 25.234,
'__run_num__': 39,
'__sls__': 'api.install',
'changes': {},
'__id__': 'add-yarn-repo'
},
'pkgrepo_|-add-newrelic-repo_|-deb http://apt.newrelic.com/debian/ newrelic non-free_|-managed': {
'comment': "Package repo 'deb http://apt.newrelic.com/debian/ newrelic non-free' already configured",
'name': 'deb http://apt.newrelic.com/debian/ newrelic non-free',
'start_time': '03:33:51.250564',
'result': True,
'duration': 27.04,
'__run_num__': 38,
'__sls__': 'api.install',
'changes': {},
'__id__': 'add-newrelic-repo'
},
'cron_|-watch_tcp_cron_|-/opt/watch_tcp_connections.sh_|-present': {
'comment': 'Cron /opt/watch_tcp_connections.sh already present',
'name': '/opt/watch_tcp_connections.sh',
'start_time': '03:33:51.340448',
'result': True,
'duration': 75.293,
'__run_num__': 44,
'__sls__': 'api.install',
'changes': {},
'__id__': 'watch_tcp_cron'
},
'file_|-dnsmasq_conf_|-/etc/dnsmasq.conf_|-managed': {
'comment': 'File /etc/dnsmasq.conf is in the correct state',
'pchanges': {},
'name': '/etc/dnsmasq.conf',
'start_time': '03:33:49.900309',
'result': True,
'duration': 59.757,
'__run_num__': 21,
'__sls__': 'dnsmasq',
'changes': {},
'__id__': 'dnsmasq_conf'
},
'file_|-diamond_init_|-/etc/init.d/diamond_|-managed': {
'comment': 'File /etc/init.d/diamond is in the correct state',
'pchanges': {},
'name': '/etc/init.d/diamond',
'start_time': '03:33:50.744594',
'result': True,
'duration': 20.954,
'__run_num__': 29,
'__sls__': 'diamond',
'changes': {},
'__id__': 'diamond_init'
},
'pkg_|-filebeat_|-filebeat_|-installed': {
'comment': 'All specified packages are already installed',
'name': 'filebeat',
'start_time': '03:33:59.176673',
'result': True,
'duration': 8.348,
'__run_num__': 71,
'__sls__': 'filebeat.install',
'changes': {},
'__id__': 'filebeat'
},
'service_|-dnsmasq_|-dnsmasq_|-running': {
'comment': 'The service dnsmasq is already running',
'name': 'dnsmasq',
'start_time': '03:33:50.034913',
'result': True,
'duration': 49.404,
'__run_num__': 23,
'__sls__': 'dnsmasq',
'changes': {},
'__id__': 'dnsmasq'
},
'file_|-Remove existing npm_|-/usr/local/lib/node_modules/npm_|-absent': {
'comment': 'State was not run because none of the onchanges reqs changed',
'__state_ran__': False,
'start_time': '03:33:51.004466',
'result': True,
'duration': 0.011,
'__run_num__': 35,
'__sls__': 'node.binary',
'changes': {}
},
'pkg_|-dnsmasq_|-dnsmasq_|-installed': {
'comment': 'All specified packages are already installed',
'name': 'dnsmasq',
'start_time': '03:33:49.882203',
'result': True,
'duration': 4.572,
'__run_num__': 19,
'__sls__': 'dnsmasq',
'changes': {},
'__id__': 'dnsmasq'
},
'pkg_|-nginx_|-nginx_|-installed': {
'comment': 'All specified packages are already installed',
'name': 'nginx',
'start_time': '03:33:51.303584',
'result': True,
'duration': 5.442,
'__run_num__': 40,
'__sls__': 'api.install',
'changes': {},
'__id__': 'nginx'
},
'file_|-/etc/diamond/collectors_|-/etc/diamond/collectors_|-directory': {
'comment': 'Directory /etc/diamond/collectors is in the correct state\nDirectory /etc/diamond/collectors updated',
'pchanges': {},
'name': '/etc/diamond/collectors',
'start_time': '03:33:50.719816',
'result': True,
'duration': 0.836,
'__run_num__': 26,
'__sls__': 'diamond',
'changes': {},
'__id__': '/etc/diamond/collectors'
},
'archive_|-Extract binary package_|-/usr/local/src/_|-extracted': {
'comment': 'Path /usr/local/src/node-v8.11.3-linux-x64 exists',
'name': '/usr/local/src/',
'start_time': '03:33:50.967691',
'result': True,
'duration': 0.32,
'__run_num__': 33,
'__sls__': 'node.binary',
'changes': {},
'__id__': 'Extract binary package'
},
'file_|-/etc/update-motd.d/motd-message_|-/etc/update-motd.d/motd-message_|-managed': {
'comment': 'File /etc/update-motd.d/motd-message is in the correct state',
'pchanges': {},
'name': '/etc/update-motd.d/motd-message',
'start_time': '03:33:49.704051',
'result': True,
'duration': 21.457,
'__run_num__': 9,
'__sls__': 'pigeon.config',
'changes': {},
'__id__': '/etc/update-motd.d/motd-message'
},
'service_|-filebeat_service_|-filebeat_|-running': {
'comment': 'The service filebeat is already running',
'name': 'filebeat',
'start_time': '03:33:59.259874',
'result': True,
'duration': 78.15,
'__run_num__': 73,
'__sls__': 'filebeat.install',
'changes': {},
'__id__': 'filebeat_service'
},
'file_|-/etc/nginx/nginx.conf_|-/etc/nginx/nginx.conf_|-managed': {
'comment': 'File /etc/nginx/nginx.conf is in the correct state',
'pchanges': {},
'name': '/etc/nginx/nginx.conf',
'start_time': '03:33:51.423733',
'result': True,
'duration': 15.982,
'__run_num__': 52,
'__sls__': 'api.config',
'changes': {},
'__id__': '/etc/nginx/nginx.conf'
},
'npm_|-install-forever_|-forever_|-installed': {
'comment': "Package(s) 'forever' satisfied by [email protected]",
'name': 'forever',
'start_time': '03:33:54.854859',
'result': True,
'duration': 1400.46,
'__run_num__': 62,
'__sls__': 'api.config',
'changes': {},
'__id__': 'install-forever'
},
'pkg_|-yarn_|-yarn_|-installed': {
'comment': 'All specified packages are already installed',
'name': 'yarn',
'start_time': '03:33:51.314743',
'result': True,
'duration': 4.385,
'__run_num__': 42,
'__sls__': 'api.install',
'changes': {},
'__id__': 'yarn'
},
'cmd_|-Copy lib_|-cp -r bin/ include/ lib/ share/ /usr/local/_|-run': {
'comment': 'unless execution succeeded',
'name': 'cp -r bin/ include/ lib/ share/ /usr/local/',
'start_time': '03:33:51.004514',
'skip_watch': True,
'__id__': 'Copy lib',
'duration': 211.827,
'__run_num__': 36,
'__sls__': 'node.binary',
'changes': {},
'result': True
},
'file_|-filebeat_config_|-/etc/filebeat/filebeat.yml_|-managed': {
'comment': 'File /etc/filebeat/filebeat.yml is in the correct state',
'pchanges': {},
'name': '/etc/filebeat/filebeat.yml',
'start_time': '03:33:59.187541',
'result': True,
'duration': 71.152,
'__run_num__': 72,
'__sls__': 'filebeat.config',
'changes': {},
'__id__': 'filebeat_config'
},
'file_|-watch_tcp_connections_|-/opt/watch_tcp_connections.sh_|-managed': {
'comment': 'File /opt/watch_tcp_connections.sh is in the correct state',
'pchanges': {},
'name': '/opt/watch_tcp_connections.sh',
'start_time': '03:33:51.319222',
'result': True,
'duration': 21.117,
'__run_num__': 43,
'__sls__': 'api.install',
'changes': {},
'__id__': 'watch_tcp_connections'
},
'pkgrepo_|-filebeat_repo_|-deb https://artifacts.elastic.co/packages/6.x/apt stable main_|-managed': {
'comment': "Package repo 'deb https://artifacts.elastic.co/packages/6.x/apt stable main' already configured",
'name': 'deb https://artifacts.elastic.co/packages/6.x/apt stable main',
'start_time': '03:33:59.099836',
'result': True,
'duration': 75.371,
'__run_num__': 70,
'__sls__': 'filebeat.install',
'changes': {},
'__id__': 'filebeat_repo'
},
'file_|-diamond_service_|-/etc/systemd/system/diamond.service_|-managed': {
'comment': 'File /etc/systemd/system/diamond.service is in the correct state',
'pchanges': {},
'name': '/etc/systemd/system/diamond.service',
'start_time': '03:33:50.767076',
'result': True,
'duration': 13.341,
'__run_num__': 30,
'__sls__': 'diamond',
'changes': {},
'__id__': 'diamond_service'
},
'file_|-Remove existing include directory_|-/usr/local/include/node_|-absent': {
'comment': 'unless execution succeeded',
'name': '/usr/local/include/node',
'start_time': '03:33:50.968463',
'skip_watch': True,
'__id__': 'Remove existing include directory',
'duration': 35.069,
'__run_num__': 34,
'__sls__': 'node.binary',
'changes': {},
'result': True
},
'file_|-dnsmasq_conf_dir_|-/etc/dnsmasq.d_|-recurse': {
'comment': 'The directory /etc/dnsmasq.d is in the correct state',
'pchanges': {},
'name': '/etc/dnsmasq.d',
'start_time': '03:33:49.960696',
'result': True,
'duration': 73.066,
'__run_num__': 22,
'__sls__': 'dnsmasq',
'changes': {},
'__id__': 'dnsmasq_conf_dir'
},
'file_|-Get binary package_|-/usr/local/src/node-v8.11.3-linux-x64.tar.gz_|-managed': {
'comment': 'File /usr/local/src/node-v8.11.3-linux-x64.tar.gz is in the correct state',
'pchanges': {},
'name': '/usr/local/src/node-v8.11.3-linux-x64.tar.gz',
'start_time': '03:33:50.824000',
'result': True,
'duration': 142.988,
'__run_num__': 32,
'__sls__': 'node.binary',
'changes': {},
'__id__': 'Get binary package'
},
'pip_|-diamond_|-diamond_|-installed': {
'comment': 'Python package diamond was already installed\nAll specified packages are already installed',
'name': 'diamond',
'start_time': '03:33:50.084489',
'result': True,
'duration': 633.598,
'__run_num__': 24,
'__sls__': 'diamond',
'changes': {},
'__id__': 'diamond'
},
'pkgrepo_|-add-nginx-repo_|-deb http://nginx.org/packages/ubuntu xenial nginx_|-managed': {
'comment': "Package repo 'deb http://nginx.org/packages/ubuntu xenial nginx' already configured",
'name': 'deb http://nginx.org/packages/ubuntu xenial nginx',
'start_time': '03:33:51.216530',
'result': True,
'duration': 33.909,
'__run_num__': 37,
'__sls__': 'api.install',
'changes': {},
'__id__': 'add-nginx-repo'
},
'file_|-/etc/diamond_|-/etc/diamond_|-directory': {
'comment': 'Directory /etc/diamond is in the correct state\nDirectory /etc/diamond updated',
'pchanges': {},
'name': '/etc/diamond',
'start_time': '03:33:50.718373',
'result': True,
'duration': 1.298,
'__run_num__': 25,
'__sls__': 'diamond',
'changes': {},
'__id__': '/etc/diamond'
},
'service_|-newrelic_service_|-newrelic-daemon_|-running': {
'comment': 'Service newrelic-daemon has been enabled, and is in the desired state',
'name': 'newrelic-daemon',
'start_time': '03:33:58.708508',
'__id__': 'newrelic_service',
'duration': 390.929,
'__run_num__': 69,
'__sls__': 'api.config',
'changes': {},
'result': True
},
'cmd_|-disable_nginx_default_site_|-rm -rf /etc/nginx/conf.d/default.conf_|-run': {
'comment': 'unless execution succeeded',
'name': 'rm -rf /etc/nginx/conf.d/default.conf',
'start_time': '03:33:51.440235',
'skip_watch': True,
'__id__': 'disable_nginx_default_site',
'duration': 11.623,
'__run_num__': 53,
'__sls__': 'api.config',
'changes': {},
'result': True
},
'service_|-nginx_service_|-nginx_|-running': {
'comment': 'The service nginx is already running',
'name': 'nginx',
'start_time': '03:33:58.550449',
'result': True,
'duration': 157.816,
'__run_num__': 68,
'__sls__': 'api.config',
'changes': {},
'__id__': 'nginx_service'
},
'file_|-/var/log/diamond_|-/var/log/diamond_|-directory': {
'comment': 'Directory /var/log/diamond is in the correct state\nDirectory /var/log/diamond updated',
'pchanges': {},
'name': '/var/log/diamond',
'start_time': '03:33:50.720776',
'result': True,
'duration': 0.819,
'__run_num__': 27,
'__sls__': 'diamond',
'changes': {},
'__id__': '/var/log/diamond'
},
'pkg_|-new_relic_|-newrelic-php5_|-installed': {
'comment': 'All specified packages are already installed',
'name': 'newrelic-php5',
'start_time': '03:33:51.309730',
'result': True,
'duration': 4.525,
'__run_num__': 41,
'__sls__': 'api.install',
'changes': {},
'__id__': 'new_relic'
},
'file_|-/etc/update-motd.d/9999-pigeon_|-/etc/update-motd.d/9999-pigeon_|-managed': {
'comment': 'File /etc/update-motd.d/9999-pigeon is in the correct state',
'pchanges': {},
'name': '/etc/update-motd.d/9999-pigeon',
'start_time': '03:33:49.725615',
'result': True,
'duration': 13.342,
'__run_num__': 10,
'__sls__': 'pigeon.config',
'changes': {},
'__id__': '/etc/update-motd.d/9999-pigeon'
},
'file_|-diamond_conf_|-/etc/diamond/diamond.conf_|-managed': {
'comment': 'File /etc/diamond/diamond.conf is in the correct state',
'pchanges': {},
'name': '/etc/diamond/diamond.conf',
'start_time': '03:33:50.722618',
'result': True,
'duration': 21.883,
'__run_num__': 28,
'__sls__': 'diamond',
'changes': {},
'__id__': 'diamond_conf'
}
}
}
}, u 'success': True, u '_stamp': '2019-01-31T03:37:52.686848', u 'user': u 'sudo_kenan', u 'fun': u 'runner.jobs.lookup_jid'
}
[DEBUG ] LazyLoaded highstate.output
[DEBUG ] LazyLoaded nested.output
[DEBUG ] LazyLoaded nested.output
[DEBUG ] LazyLoaded nested.output
[DEBUG ] LazyLoaded nested.output
[DEBUG ] LazyLoaded nested.output
[DEBUG ] LazyLoaded nested.output
Hope this helps.
Thanks for the additional information. I was specifically looking for the logs in the log file /var/log/salt/master, as those include timestamps. I am trying to tie your error "socket file has been removed - shutting down" to what happens in salt. It looks like something is shutting down your gpg agent.
Are you handling the gpg-agent service anywhere else, or possibly in your state files?
Unfortunately, my debug logs weren't turned on during that period. It's happening more regularly now.
February 17th 2019, 20:56:44.039 | gpg: decryption failed: No secret key
February 17th 2019, 20:56:44.039 | gpg: can't connect to the agent: IPC connect call failed
February 17th 2019, 20:56:44.039 | gpg: encrypted with 2048-bit RSA key, ID 0F622CA5E8B58599, created 2018-02-07
February 15th 2019, 06:01:35.675 | gpg: decryption failed: No secret key
February 15th 2019, 06:01:35.675 | gpg: encrypted with 2048-bit RSA key, ID 0F622CA5E8B58599, created 2018-02-07
February 15th 2019, 06:01:35.674 | gpg: can't connect to the agent: IPC connect call failed
February 14th 2019, 14:13:33.970 | gpg: encrypted with 2048-bit RSA key, ID 0F622CA5E8B58599, created 2018-02-07
February 14th 2019, 14:13:33.970 | gpg: can't connect to the agent: IPC connect call failed
February 14th 2019, 14:13:33.970 | gpg: decryption failed: No secret key
February 13th 2019, 23:00:39.255 | gpg: encrypted with 2048-bit RSA key, ID 0F622CA5E8B58599, created 2018-02-07
February 13th 2019, 23:00:39.255 | gpg: decryption failed: No secret key
February 13th 2019, 23:00:39.255 | gpg: can't connect to the agent: IPC connect call failed
February 12th 2019, 22:25:20.725 | gpg: decryption failed: No secret key
February 12th 2019, 22:25:20.725 | gpg: encrypted with 2048-bit RSA key, ID 0F622CA5E8B58599, created 2018-02-07
The GPG service runs on the salt-master server itself. I already had the keys; I just imported them with this set of commands:
cp public_key.gpg /etc/salt/gpgkeys
cp pubring.gpg /etc/salt/gpgkeys
cp secring.gpg /etc/salt/gpgkeys
cp trustdb.gpg /etc/salt/gpgkeys
chmod 600 -R /etc/salt/gpgkeys
/usr/bin/gpg --import /etc/salt/gpgkeys/public_key.gpg
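As an aside, a bare gpg --import writes to the calling user's default keyring (~/.gnupg), not necessarily to the keydir the gpg renderer uses. If the renderer's GPG home is /etc/salt/gpgkeys (the default gpg_keydir), an explicit import would look roughly like this; note that GnuPG >= 2.1 no longer reads an old-style secring.gpg in place, so the secret key has to be imported as well (a sketch, not a verified fix):
gpg --homedir /etc/salt/gpgkeys --import /etc/salt/gpgkeys/public_key.gpg
gpg --homedir /etc/salt/gpgkeys --import /etc/salt/gpgkeys/secring.gpg
gpg --homedir /etc/salt/gpgkeys --list-secret-keys   # verify the secret key is visible here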
ping @saltstack/team-core: any other ideas on what's going on here?
I'll take a look at this either today or tomorrow
Thanks @waynew. Let me know what other details you need from me.
@kenanbiel do you have gpg logs and salt master logs from the same time period?
I've done a cursory check for gpg-agent, S.gpg-agent (the default socket filename), and gpg-connect-agent (which gives me "can't connect to the agent: IPC connect call failed"), but I haven't found anything suspect.
Also, what is the output of ps aux | grep gpg when the call is failing?
I have just reactivated gpg logging and salt debug logging, and I will capture ps aux | grep gpg when this happens again. I'll update this thread as soon as I experience it again.
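In the meantime, a simple watcher along these lines (just a sketch; the log path is arbitrary) can capture the agent's process and socket state around the failures:
while true; do
  { date; ps aux | grep '[g]pg-agent'; ls -l /etc/salt/gpgkeys/S.gpg-agent* 2>&1; } >> /var/log/gpg_watch.log
  sleep 60
done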
@kenanbiel I presume you are using Ubuntu 18.04, since you have gpg v2.2.4 (the version that ships with Ubuntu 18.04). I have found that with gpg-agent's move to using systemd internally, things now need to be done as a non-root user rather than as root for gpg-agent to behave as it did in the past, especially with passphrase caching. I first noticed this non-root instantiation issue on Raspbian 9 (approximately Debian 9.2). The standard sockets are created differently for root and non-root users.
I had to change how debbuild.py (debbuild_pkgbuild.py in develop) in 2019.2.0 used gpg internally for a non-root user; also check salt-pack or salt-pack-py3 for how file_roots/setup/ubuntu/gpg_agent.sls starts and stops gpg-agent.
I still run into some issues with orchestration on Ubuntu 18.04 with gpg-agent v2.2.4; however, they do not occur on Ubuntu 18.10 with gpg-agent v2.2.8 (some fixes for pinentry may explain why the issues are gone).
Hope this helps resolve your issue.
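If it helps while testing, the agent for a given GPG home can also be started, stopped, and probed explicitly rather than relying on auto-launch when the renderer fires (a sketch, assuming the renderer keydir is used as GNUPGHOME):
GNUPGHOME=/etc/salt/gpgkeys gpgconf --launch gpg-agent    # start (or confirm) the agent
GNUPGHOME=/etc/salt/gpgkeys gpgconf --kill gpg-agent      # stop it cleanly
GNUPGHOME=/etc/salt/gpgkeys gpg-connect-agent /bye        # quick reachability check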
@waynew, here are the debug logs I got.
From /var/log/gpg_agent.log, there have been basically no new entries since the moment I restarted it:
2019-03-25 16:12:30 gpg-agent[12649] gpg-agent (GnuPG) 2.2.4 started
/var/log/gpg_agent.log (END)
From /var/log/salt/master:
2019-03-28 05:58:26,917 [salt.utils.lazy :100 ][DEBUG ][7941] LazyLoaded local_cache.clean_old_jobs
2019-03-28 05:58:26,985 [salt.utils.gitfs :2675][DEBUG ][7954] Re-using gitfs object for process 7954
2019-03-28 05:58:26,986 [salt.utils.gitfs :2675][DEBUG ][7954] Re-using gitfs object for process 7954
2019-03-28 05:58:26,989 [salt.utils.gitfs :2675][DEBUG ][7947] Re-using gitfs object for process 7947
2019-03-28 05:58:26,991 [salt.utils.gitfs :2675][DEBUG ][7947] Re-using gitfs object for process 7947
2019-03-28 05:58:26,991 [salt.utils.lazy :100 ][DEBUG ][7941] LazyLoaded localfs.list_tokens
2019-03-28 05:58:26,992 [salt.utils.gitfs :852 ][DEBUG ][7941] Set update lock for git_pillar remote 'master git@gitrepo:project/
devops-saltpillar.git'
2019-03-28 05:58:26,992 [salt.utils.gitfs :755 ][DEBUG ][7941] Fetching git_pillar remote 'master git@@gitrepo:project/devops-salt
pillar.git'
2019-03-28 05:58:26,993 [git.cmd :719 ][DEBUG ][7941] Popen(['git', 'fetch', '-v', 'origin'], cwd=/var/cache/salt/master/git_pillar/02fa59166
b906c9ae290dec334229800a9b78a3309cc11766b2fc77dd9e8e719, universal_newlines=True, shell=None)
2019-03-28 05:58:26,994 [salt.utils.gitfs :2675][DEBUG ][7956] Re-using gitfs object for process 7956
2019-03-28 05:58:27,172 [salt.utils.gitfs :2675][DEBUG ][7955] Re-using gitfs object for process 7955
2019-03-28 05:58:27,179 [salt.utils.gitfs :2675][DEBUG ][7955] Re-using gitfs object for process 7955
2019-03-28 05:58:27,182 [salt.utils.gitfs :2675][DEBUG ][7945] Re-using gitfs object for process 7945
2019-03-28 05:58:27,189 [salt.utils.gitfs :2675][DEBUG ][7945] Re-using gitfs object for process 7945
2019-03-28 05:58:27,192 [salt.utils.gitfs :2675][DEBUG ][7954] Re-using gitfs object for process 7954
2019-03-28 05:58:27,400 [salt.loaded.int.returner.local_cache:249 ][DEBUG ][7947] Adding minions for job 20190328055827399749: [u'salt-minion']
2019-03-28 05:58:27,401 [salt.utils.job :68 ][INFO ][7947] Got return from salt-minion for job 20190328055
827399749
2019-03-28 05:58:27,401 [salt.utils.event :739 ][DEBUG ][7947] Sending event: tag = salt/job/20190328055827399749/ret/salt-minion; data data = {u'tgt_type': u'glob', 'fun_args': [], 'jid': u'20190328055827399749', 'return': { <<< REDACTED >>> apikey: <redacted> \n+ fullstory.apikey: -----BEGIN PGP MESSAGE-----\\nVersion: GnuPG v1\\n\\nhQEMAw9iLKXotYWZAQgAqKs6bj2GT1c99IR32lmiW/e7Rtm/8F9OqtCQ5lL7/X3t\\nTEjpLwzjuGlvL0i4hgeKIZ4IkfnlMpTwkfBk57GSn1ftPvBr71NCsrSRlbDrGjK0\\ntpfPm//4UIxAHYmaZPz6zDAN2fpetXl/IKIvH8m6VX/kkPN9cNBQbA8sZpHxHIm8\\nuEcg0ZkHrQ+lg4caXocdLNWoP8oJS8AKyEKKbu9/aBoVeNnio3EMwbwE7vtwYQJg\\nhhIypLGAzuw+FWPee69K3b6S6326cC1Bh+ctW/0TKNpGsYFh4CEFEagVDlRGeR3w\\n84Lnffxb5Z9fZXSRaPfzkX1u0ycF91n7XOspsbp6/dJbAa4h9CSRxi9zizFiMaB8\\nC1+DGfx9QGTU6YcDOeYSbPfDIsXziUcnAB8x7cYmzEllLhB9yXZhJ4CI9pLhvirB\\nA0J4QWn9aL/nwTAqb+7TfvPh3Ib1jgQM0JtW/w==\\n=UbsW\\n-----END PGP MESSAGE-----\\n\n}}, , 'retcode': 0, 'success': True, 'schedule': 'highstate', u'tgt': salt-minion', 'cmd': '_return', 'pid': 9763, '_stamp': '2019-03-28T05:58:27.401316', u'arg': [], 'fun': 'state.highstate', 'id': 'salt-minion', 'out': 'highstate'}
2019-03-28 05:58:27,407 [salt.utils.reactor:109 ][DEBUG ][7940] Gathering reactors for tag salt/job/20190328055827399749/ret/salt-minion
2019-03-28 05:58:27,407 [salt.utils.reactor:196 ][DEBUG ][7940] Compiling reactions for tag salt/job/20190328055827399749/ret/salt-minion
2019-03-28 05:58:27,407 [salt.utils.gitfs :2675][DEBUG ][7940] Re-using gitfs object for process 7940
2019-03-28 05:58:27,404 [salt.loaded.int.returner.local_cache:249 ][DEBUG ][7947] Adding minions for job 20190328055827399749: [u'salt-minion']
2019-03-28 05:58:27,414 [salt.utils.gitfs :2675][DEBUG ][7940] Re-using gitfs object for process 7940
2019-03-28 05:58:27,414 [salt.utils.gitfs :2675][DEBUG ][7940] Re-using gitfs object for process 7940
According to the debug logs, there's no indication of why decryption failed. I have redacted the other states in the debug file. The weird part is that I am decrypting around 20-30 keys for this server and only one key failed to be decrypted. Of course, on the next highstate, the salt-master was able to decrypt it and provide the correct value.
@dmurphy18 Yes, you are right: I'm using Ubuntu 18.04 and gpg v2.2.4.
This really doesn't happen regularly, just once in a while, but it has affected my prod environment multiple times already. So your advice is to upgrade my salt-master's Ubuntu to 18.10 so it would use gpg 2.2.8?
That seems to be the suggestion, though I wonder if there's a backport of gpg 2.2.8.
@kenanbiel @waynew Most of the time 18.10 works for me, but I am still fighting gpg-agent v2.2.4 on 18.04 and have not found a satisfactory solution myself yet. When I get time, I plan on looking at the changes and back-porting them, since attempting to use v2.2.8 directly was a non-starter: too many dependent changes.
We've seen this on our server too. The problem was that we were running out of locked memory and gpg was simply failing to run and decrypt.
Sadly, salt does not detect the failure and passes on the undecrypted file, overwriting the correct decrypted file and breaking whatever relies on it.
It seems like a pretty bad flaw. You may want to try this in your systemd file:
LimitMEMLOCK=infinity
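On a systemd-managed master, that would typically go into a drop-in for the salt-master unit, roughly like this (a sketch; the drop-in file name is arbitrary):
sudo mkdir -p /etc/systemd/system/salt-master.service.d
printf '[Service]\nLimitMEMLOCK=infinity\n' | sudo tee /etc/systemd/system/salt-master.service.d/memlock.conf
sudo systemctl daemon-reload
sudo systemctl restart salt-master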
We are having the same issue. It is impossible to target more than a handful of nodes at once, or else GPG will fail to decrypt all pillars for all nodes. We run the salt master in a Docker container and I tried setting all the necessary flags for unlimited mlock, but I can't seem to get it working. Setting LimitMEMLOCK in the salt-master unit file has no effect, and the gpg-agent isn't managed by systemd. Any tips on how I can solve this issue?
I can start the whole container with --privileged --ulimit memlock=-1:-1, but it has no effect. When I run saltutil.refresh_pillar on a handful of nodes, I see a gpg-agent process with limits
Max locked memory unlimited unlimited bytes
But when I run it on more hosts, the gpg-agent just crashes at some point and I start seeing the decryption errors in the log.
I enabled debugging for the gpg-agent and this is what appears in the log:
2020-10-21 15:49:20 gpg-agent[15763] gpg-agent (GnuPG) 2.2.4 stopped
2020-10-21 15:49:21 gpg-agent[15770] gpg-agent (GnuPG) 2.2.4 started
2020-10-21 15:49:50 gpg-agent[15770] Warning: using insecure memory!
2020-10-21 15:49:50 gpg-agent[15770] decryption failed: Cannot allocate memory
2020-10-21 15:49:50 gpg-agent[15770] command 'PKDECRYPT' failed: Cannot allocate memory <gcrypt>
2020-10-21 15:49:50 gpg-agent[15770] decryption failed: Cannot allocate memory
2020-10-21 15:49:50 gpg-agent[15770] command 'PKDECRYPT' failed: Cannot allocate memory <gcrypt>
2020-10-21 15:49:50 gpg-agent[15770] Ohhhh jeeee: ... this is a bug (../../src/sexp.c:1433:do_vsexp_sscan)
2020-10-21 15:49:51 gpg-agent[16203] gpg-agent (GnuPG) 2.2.4 started
There is enough free memory left on the host and the memlock limits of the gpg-agent process are unlimited.
I can reduce the number of workers on the master to 30, which gets rid of the problem. Any number above that will reliably trigger the bug. Unfortunately, this means that now we are getting the occasional minion response timeout again when the worker pool is exhausted.
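For reference, that cap is just the worker_threads setting in the master config, something like this (a sketch of the tradeoff described above):
# /etc/salt/master
# Keep the worker pool, and therefore concurrent gpg renders, bounded
worker_threads: 30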
What version of libgcrypt do you have installed? There were some improvements implemented to help gpg handle more concurrent connections in libgcrypt 1.8.2 and gpg 2.2.4 (which you have).
There is also this flag you can use when using secure memory: --auto-expand-secmem (https://dev.gnupg.org/T3530)
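For anyone who wants to try it: that option was added to gpg-agent in 2.2.4, so it would go into the agent configuration for the keydir in question, roughly like this (a sketch; the 0x30000 chunk size is only the value floated around the upstream ticket and is an assumption here, not something verified):
# /etc/salt/gpgkeys/gpg-agent.conf   (assumes the renderer keydir is the GNUPGHOME)
auto-expand-secmem 0x30000
Then restart the agent, e.g. with GNUPGHOME=/etc/salt/gpgkeys gpgconf --kill gpg-agent.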
libgcrypt 1.8.1, which is the latest in Ubuntu 20.04 at the moment. I already tried auto-expand-secmem, but that doesn't solve the problem.
@phoerious I suspect the libgcrypt version is the problem. Updating it past 1.8.1 was the resolution for Ubuntu 18.04 in another case I've recently been involved with. To summarize (and this is essentially what @Ch3LL posted above):
- libgcrypt added support for auto-expand-secmem in 1.8.2.
- GnuPG added the --auto-expand-secmem option in 2.2.4.
- You have libgcrypt 1.8.1 and GnuPG 2.2.4.
- See also #51356.
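A quick way to confirm which libgcrypt a host is actually using (a sketch; the Debian/Ubuntu package name libgcrypt20 is assumed):
gpg --version | head -n 2          # the second line lists the libgcrypt version gpg runs against
dpkg -s libgcrypt20 | grep '^Version'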