(Please provide relevant configs and/or SLS files (Be sure to remove sensitive info).)
cat minion | grep -v "^#" | grep -v "^$"
master: ***
user: root
pidfile: /var/run/salt2019-minion.pid
conf_file: /etc/salt2019/minion
pki_dir: /etc/salt2019/pki/minion
id: ***
cachedir: /var/cache/salt2019/minion
sock_dir: /var/run/salt2019/minion
file_roots: /srv/salt2019
log_file: /var/log/salt2019/minion
key_logfile: /var/log/salt2019/key
log_level: debug
tcp_keepalive: True
user: ***
publisher_acl:
  user***:
    - test.*
    - state.*
    - cmd.*
(Include debug logs if possible and relevant.)
2020-03-20 14:50:40,465 [salt.minion :1465][INFO ][1397] User saltapi Executing command state.sls with jid 20200320145808931182
2020-03-20 14:50:40,493 [salt.minion :1472][DEBUG ][1397] Command details {'fun': 'state.sls', 'ret': '', 'tgt': '1.2.3.4', 'user': 'saltapi', 'tgt_type': 'glob', 'jid': '20200320145808931182', 'arg': ['------sensitive info is replaced ----------', 'pillar={"tasktype":"ent-serial-collect"}', {'queue': True, '__kwarg__': True}]}
2020-03-20 14:50:40,526 [salt.minion :1605][INFO ][17093] Starting a new job 20200320145808931182 with PID 17093
2020-03-20 14:50:40,550 [salt.utils.lazy :107 ][DEBUG ][17093] Could not LazyLoad {0}.allow_missing_func: '{0}.allow_missing_func' is not available.
2020-03-20 14:50:40,570 [salt.utils.lazy :104 ][DEBUG ][17093] LazyLoaded state.sls
2020-03-20 14:50:40,582 [salt.utils.lazy :104 ][DEBUG ][17093] LazyLoaded saltutil.is_running
2020-03-20 14:50:40,587 [salt.utils.lazy :104 ][DEBUG ][17093] LazyLoaded grains.get
2020-03-20 14:50:40,589 [salt.loader.2.2.2.2.int.module.config :398 ][DEBUG ][17093] key: test, ret: _|-
2020-03-20 14:50:40,611 [salt.transport.zeromq :132 ][DEBUG ][17093] Initializing new AsyncZeroMQReqChannel for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506', 'aes')
2020-03-20 14:50:40,611 [salt.crypt :463 ][DEBUG ][17093] Initializing new AsyncAuth for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506')
2020-03-20 14:50:40,613 [salt.transport.zeromq :203 ][DEBUG ][17093] Connecting the Minion to the Master URI (for the return server): tcp://2.2.2.2:4506
2020-03-20 14:50:41,995 [salt.minion :1465][INFO ][1397] User saltapi Executing command state.sls with jid 20200320145810461355
2020-03-20 14:50:41,995 [salt.minion :1472][DEBUG ][1397] Command details {'fun': 'state.sls', 'ret': '', 'tgt': '1.2.3.4', 'user': 'saltapi', 'tgt_type': 'glob', 'jid': '20200320145810461355', 'arg': ['------sensitive info is replaced ----------', 'pillar={"tasktype":"ent-serial-collect"}', {'queue': True, '__kwarg__': True}]}
2020-03-20 14:50:42,006 [salt.minion :1605][INFO ][17213] Starting a new job 20200320145810461355 with PID 17213
2020-03-20 14:50:42,010 [salt.utils.lazy :107 ][DEBUG ][17213] Could not LazyLoad {0}.allow_missing_func: '{0}.allow_missing_func' is not available.
2020-03-20 14:50:42,014 [salt.utils.lazy :104 ][DEBUG ][17213] LazyLoaded state.sls
2020-03-20 14:50:42,018 [salt.utils.lazy :104 ][DEBUG ][17213] LazyLoaded saltutil.is_running
2020-03-20 14:50:42,020 [salt.utils.lazy :104 ][DEBUG ][17213] LazyLoaded grains.get
2020-03-20 14:50:42,022 [salt.loader.2.2.2.2.int.module.config :398 ][DEBUG ][17213] key: test, ret: _|-
2020-03-20 14:50:42,038 [salt.transport.zeromq :132 ][DEBUG ][17213] Initializing new AsyncZeroMQReqChannel for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506', 'aes')
2020-03-20 14:50:42,038 [salt.crypt :463 ][DEBUG ][17213] Initializing new AsyncAuth for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506')
2020-03-20 14:50:42,040 [salt.transport.zeromq :203 ][DEBUG ][17213] Connecting the Minion to the Master URI (for the return server): tcp://2.2.2.2:4506
2020-03-20 15:01:15,337 [salt.utils.schedule :1627][DEBUG ][1397] schedule: Job __mine_interval was scheduled with jid_include, adding to cache (jid_include defaults to True)
2020-03-20 15:01:15,337 [salt.utils.schedule :1630][DEBUG ][1397] schedule: Job __mine_interval was scheduled with a max number of 2
2020-03-20 15:01:15,337 [salt.utils.schedule :1647][INFO ][1397] Running scheduled job: __mine_interval
2020-03-20 15:01:15,458 [salt.utils.schedule :689 ][DEBUG ][10760] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'pid': 10760, 'id': '1.2.3.4', 'jid': '20200320150115452799', 'schedule': '__mine_interval', 'fun_args': []}
2020-03-20 15:01:15,461 [salt.utils.lazy :104 ][DEBUG ][10760] LazyLoaded mine.update
2020-03-20 15:01:15,463 [salt.utils.lazy :104 ][DEBUG ][10760] LazyLoaded config.merge
2020-03-20 15:01:15,463 [salt.utils.schedule :836 ][DEBUG ][10760] schedule.handle_func: Removing /var/cache/salt2019/minion/proc/20200320150115452799
(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
[root@localhost chenyanyan]# salt-master -V
Salt Version:
Salt: 2018.3.4
Dependency Versions:
cffi: 1.6.0
cherrypy: 5.6.0
dateutil: 1.5
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.7.2
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: 0.21.1
Mako: 0.8.1
msgpack-pure: Not Installed
msgpack-python: 0.4.6
mysql-python: Not Installed
pycparser: 2.14
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 2.7.5 (default, Aug 27 2018, 16:21:36)
python-gnupg: Not Installed
PyYAML: 3.10
PyZMQ: 15.3.0
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.2.1
ZMQ: 4.1.4
System Versions:
dist:
locale: UTF-8
machine: x86_64
release: 3.10.0-693.21.1.el7.x86_64
system: Linux
version: Not Installed
linux-jc57:/var/log # /root/miniconda3/bin/salt-minion -V
Salt Version:
Salt: 2019.2.0
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: Not Installed
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.10.1
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.6.1
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.5.2 |Continuum Analytics, Inc.| (default, Jul 2 2016, 17:53:06)
python-gnupg: Not Installed
PyYAML: 3.11
PyZMQ: 15.3.0
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.1.5
System Versions:
dist: SuSE 11 x86_64
locale: UTF-8
machine: x86_64
release: 3.0.76-0.11-default
system: Linux
version: SUSE Linux Enterprise Server 11 x86_64
How many minions do you have connected to your master? And what I'm reading here is that _some_ minions will no longer connect to the master after a few days, or all minions will no longer connect to the master after a few days?
Also, I noticed your master is on 2018.3.4, which is a version we are no longer officially supporting. Is there any way you'd be able to try this setup in another environment with your master on 3000 or also on 2019.2?
In our test environment we have about 6 minions running 3 types of OS: SUSE and two others developed by our company. Only the minions installed on the SUSE servers occasionally hit that 'not return' problem; the others run very well.
Yes, we can set up with the latest version and see if this occurs again.
I should point out that this problem happens very occasionally and we have no idea when it will happen again, but in our production environment we have thousands of minion boxes, so it could be a serious problem.
Thank you very much for your reply.
Are you able to run basic commands such as ip ping <address> and also check the ports on the suse minions? It could be a configuration issue on the suse minion end since it seems your other in-house minions are running fine without any hiccups!
After the minion stops returning, test.ping also doesn't return, but ip ping is OK. When we restart the minion manually, it runs well again.
I didn't check the ports on SUSE, and now I am waiting for it to happen again.
I found some system logs:
Mar 20 14:50:40 linux kernel: [89579.613557] python[17093]: segfault at 5f436 ip 000000000005f436 sp 00007fff29e20658 error 14 in python3.5[400000+1000]
Mar 20 14:50:42 linux kernel: [89581.040805] python[17213]: segfault at 5f436 ip 000000000005f436 sp 00007fff29e20658 error 14 in python3.5[400000+1000]
At the time 'Mar 20 14:50:40', the minion's log is:
2020-03-20 14:50:40,613 [salt.transport.zeromq :203 ][DEBUG ][17093] Connecting the Minion to the Master URI (for the return server): tcp://xxxxx
Is there any connection?
I am facing the same issue in Salt 2019.2.5, where the salt master and minion run inside a CentOS 7 container. Roughly the first 24 test.ping calls work; after that, even with several retries, I keep getting "Minion did not return. [No response]".
If I reboot the salt-minion, the next ~24 test.ping calls work and then the problem occurs again.
What's your Python version?
Python 2.7.5
I found the observation below in the salt-minion trace logs for a test.ping that got no response.
It looks like the salt-minion responded to that test.ping:
[DEBUG ] minion return: {u'fun_args': [], u'jid': u'20200715071230671893', u'return': True, u'retcode': 0, u'success': True, u'fun': u'test.ping'}
It looks like the salt-master did not receive the above response, so it asks the salt-minion about that jid again:
[DEBUG ] Command details {u'tgt_type': u'list', u'jid': u'20200715071235735268', u'tgt': [u' node-name'], u'ret': u'', u'user': u'root', u'arg': [u'20200715071230671893'], u'fun': u'saltutil.find_job'}
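For readers following along, the pattern described here can be sketched roughly as below. This is a hypothetical simplification, not Salt's actual code; `wait_for_return`, `returns`, and `find_job` are invented names for illustration only:

```python
# Hypothetical sketch of the master-side "did the minion answer?" decision.
# All names are made up; Salt's real implementation is considerably more involved.
def wait_for_return(returns, jid, find_job):
    """Decide what the master reports for a published job `jid`."""
    if jid in returns:           # a return event for this jid arrived
        return returns[jid]
    if find_job(jid):            # saltutil.find_job says the job is still running
        return "still running"
    return None                  # no return and not running -> "No response"

# The unhappy case above: the minion did return, the master never saw it,
# and find_job comes back empty ({} is falsy), so the job looks dead.
print(wait_for_return({}, "20200715071230671893", lambda jid: {}))
```

In other words, a lost return event plus an empty `find_job` reply is exactly the combination that produces "Minion did not return. [No response]".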
I have the same problem in Salt 3000.3 (master and minions) too, but it's been going on for quite some time across prior versions as well. I've struggled to find any ideas on how to identify it or how to even write a bug report for it.
We're running Python 2.7 too, on AWS Linux.
What I can determine is that there appear to be two cases.
You can see it play out by using salt-run state.event pretty=True.
In the happy case, the following happens:
20200716191834527792 {
"_stamp": "2020-07-16T19:18:34.528083",
"minions": [
"the-minion,"
]
}
salt/job/20200716191834527792/new {
"_stamp": "2020-07-16T19:18:34.528555",
"arg": [],
"fun": "test.ping",
"jid": "20200716191834527792",
"minions": [
"the-minion,"
],
"missing": [],
"tgt": "the-minion,",
"tgt_type": "glob",
"user": "root"
}
salt/job/20200716191834527792/ret/the-minion, {
"_stamp": "2020-07-16T19:18:34.642001",
"cmd": "_return",
"fun": "test.ping",
"fun_args": [],
"id": "the-minion,",
"jid": "20200716191834527792",
"retcode": 0,
"return": true,
"success": true
}
The job is created, the return is received and all is right with the world.
In the unhappy case:
20200716191533486948 {
"_stamp": "2020-07-16T19:15:33.487146",
"minions": [
"the-minion,"
]
}
salt/job/20200716191533486948/new {
"_stamp": "2020-07-16T19:15:33.487613",
"arg": [],
"fun": "test.ping",
"jid": "20200716191533486948",
"minions": [
"the-minion,"
],
"missing": [],
"tgt": "the-minion,",
"tgt_type": "glob",
"user": "root"
}
20200716191538601450 {
"_stamp": "2020-07-16T19:15:38.601653",
"minions": [
"the-minion,"
]
}
salt/job/20200716191538601450/new {
"_stamp": "2020-07-16T19:15:38.602129",
"arg": [
"20200716191533486948"
],
"fun": "saltutil.find_job",
"jid": "20200716191538601450",
"minions": [
"the-minion,"
],
"missing": [],
"tgt": [
"the-minion,"
],
"tgt_type": "list",
"user": "root"
}
salt/job/20200716191538601450/ret/the-minion, {
"_stamp": "2020-07-16T19:15:38.711512",
"cmd": "_return",
"fun": "saltutil.find_job",
"fun_args": [
"20200716191533486948"
],
"id": "the-minion,",
"jid": "20200716191538601450",
"retcode": 0,
"return": {},
"success": true
}
It's broadly the same, but you can see a find_job instance is created, as @SaravanakumarSivasankaran suggests.
The minion logs similarly in both cases:
In the happy case:
2020-07-16 19:18:34,536 [salt.minion :1491][INFO ][8630] User root Executing command test.ping with jid 20200716191834527792
2020-07-16 19:18:34,619 [salt.minion :1621][INFO ][5442] Starting a new job 20200716191834527792 with PID 5442
2020-07-16 19:18:34,637 [salt.minion :1949][INFO ][5442] Returning information for job: 20200716191834527792
In the unhappy case, there is the additional find_job:
2020-07-16 19:15:33,494 [salt.minion :1491][INFO ][8630] User root Executing command test.ping with jid 20200716191533486948
2020-07-16 19:15:33,595 [salt.minion :1621][INFO ][5192] Starting a new job 20200716191533486948 with PID 5192
2020-07-16 19:15:33,616 [salt.minion :1949][INFO ][5192] Returning information for job: 20200716191533486948
2020-07-16 19:15:38,610 [salt.minion :1491][INFO ][8630] User root Executing command saltutil.find_job with jid 20200716191538601450
2020-07-16 19:15:38,701 [salt.minion :1621][INFO ][5195] Starting a new job 20200716191538601450 with PID 5195
2020-07-16 19:15:38,706 [salt.minion :1949][INFO ][5195] Returning information for job: 20200716191538601450
Note there are two returns - one for the first test.ping and one for the following find_job.
Some print debugging shows that the minion is, as best as I can tell, returning sensible messages. I modified the minion function here to:
def _do_transfer():
    # Requires: import base64, pickle (added for the debug dump below)
    log.debug("_do_transfer")
    log.debug(base64.b64encode(pickle.dumps(load)))
    # Yield control to the caller. When send() completes, resume by populating data with the Future.result
    data = yield self.message_client.send(
        self._package_load(self.auth.crypticle.dumps(load)),
        timeout=timeout,
        tries=tries,
    )
And used this to dump the output in JSON:
print(json.dumps(pickle.loads(base64.b64decode("...."))))
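If anyone wants to reproduce that round-trip offline, the dump/decode pair looks like this (standard library only; the `load` dict is a made-up example payload, not captured data):

```python
import base64
import json
import pickle

# A representative minion return payload (made-up values).
load = {"cmd": "_return", "fun": "test.ping", "jid": "20200716191834527792",
        "return": True, "retcode": 0, "success": True, "id": "the-minion"}

# Minion side: what the log.debug() call above would emit.
encoded = base64.b64encode(pickle.dumps(load)).decode("ascii")

# Offline: paste the base64 string back in and pretty-print it as JSON.
decoded = pickle.loads(base64.b64decode(encoded))
print(json.dumps(decoded, sort_keys=True))
```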
Here's the successful and unsuccessful cases for two runs.
Successful case:
{"fun_args": ["sleep 10"], "jid": "20200716224848367462", "return": "", "retcode": 0, "success": true, "cmd": "_return", "fun": "cmd.run", "id": "the-minion"}
Unsuccessful case:
{"fun_args": ["sleep 10"], "jid": "20200716224718720155", "return": "", "retcode": 0, "success": true, "cmd": "_return", "fun": "cmd.run", "id": "ip-172-20-5-15.eu-west-1.compute.internal"}
{"fun_args": ["20200716224718720155"], "jid": "20200716224723832936", "return": {}, "retcode": 0, "success": true, "cmd": "_return", "fun": "saltutil.find_job", "id": "the-minion"}
Nothing to suggest that the minions are behaving differently.
tcpdump shows that messages are going from the minion to the master - I can see no obvious differences between the two cases.
It looks to me, from running with -l trace and some print debugging, that the minions are doing exactly what they're supposed to and the problem lies with the master not picking up the response from the minion, or at least not doing anything with it.
I find that restarting the master almost always resolves the problem, which makes tracking it down really hard. As far as I can tell - my best guess is that something about the socket receiving logic means that the messages aren't received correctly. I found that the following code: https://github.com/saltstack/salt/blob/master/salt/ext/tornado/iostream.py#L1051-L1054
if modified to:
def read_from_fd(self):
    try:
        chunk = self.socket.recv(self.read_chunk_size)
        print(chunk)  # print the chunk out when received
    except socket.error as e:
        ...  # (rest of the original error handling unchanged)
would print the chunk message from the minion immediately after sending test.ping when things were working, but would never print those messages when they were not. But I'm unfamiliar with how this socket is created, what it's used for, and whether this is indicative of an error or not. I'm running out of ideas on how to try and track this down - my socket, Python, Tornado and IPC knowledge is very limited, so I'm struggling to determine where to even look.
I don't think there's anything unusual about our environment - we have around 50-70 nodes. Most are static, with ~15 auto-scaling; the rest don't change. It's all in AWS within peered VPCs - and as the tcpdump output above suggests, the traffic seems to be traversing the network perfectly fine. Sometimes it works, sometimes it doesn't, and it's proving impossible to tell why.
Any thoughts or recommendations on how best to proceed would be utterly and completely magnificent.
@edhgoose I apologize it took me so long to actually follow up, here. Wondering if we should schedule a session with a Core Team member to attempt to hammer on this, I can help to schedule something, is there a time that is best for you?
@sagetherage, that'd be great.
If you want to take a look at https://calend.ly/edhgoose that's probably a good start? Generally evenings (UK time) are pretty good too.
@sagetherage - hey, I'd really love to get that help. This issue is driving me mad, sorry to chase - can we try and setup that session with a team member?
yes! I have assigned @DmitryKuzmenko who can help here.
@DmitryKuzmenko have you been able to get with @edhgoose on this issue?
@sagetherage @DmitryKuzmenko and I had a conversation about turning on certain debugging, but I'm on holiday atm and haven't seen the issue for a little while. Unfortunately it seems extremely intermittent - but I'll be sure to confirm if it comes up again what the logs are.
@edhgoose no worries! I was checking in because I didn't want things to be left undone, thank you for the extra effort and have a nice holiday.
Unfortunately, I'm getting this same problem, and I manage thousands of servers.
[root@master]# salt-master -V
Salt Version:
Salt: 3001.1
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: Not Installed
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 2.11.1
libgit2: Not Installed
M2Crypto: 0.35.2
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.6.2
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.6.8 (default, Apr 2 2020, 13:34:55)
python-gnupg: Not Installed
PyYAML: 3.13
PyZMQ: 17.0.0
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.1.4
System Versions:
dist: centos 7 Core
locale: UTF-8
machine: x86_64
release: 3.10.0-1127.19.1.el7.x86_64
system: Linux
version: CentOS Linux 7 Core
[root@server salt]# salt-minion -V
Salt Version:
Salt: 3001.1
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: Not Installed
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 2.11.1
libgit2: Not Installed
M2Crypto: 0.35.2
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.6.2
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: Not Installed
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.6.8 (default, Apr 2 2020, 13:34:55)
python-gnupg: Not Installed
PyYAML: 3.13
PyZMQ: 17.0.0
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.1.4
System Versions:
dist: centos 7 Core
locale: UTF-8
machine: x86_64
release: 3.10.0-1127.19.1.el7.x86_64
system: Linux
version: CentOS Linux 7 Core
As @edhgoose mentioned, a restart usually fixes it; however, it's been happening more and more lately. The company I work for adds new dedicated servers weekly, and the issue persists even on newly provisioned, up-to-date servers.
We use the Python 3 versions of Salt.
@marilyn6483 is there any chance you could provide the corresponding minion debug log?
We found out that Python 3.5 had a bug that caused the problem; here's the bug report:
https://bugs.python.org/issue29386
Please check your Python version first. Also, what I want to know is what you were doing - did you use Salt to execute some nested shell scripts?
Well, I'm running Python 3.6 and the issue persists.
I do run a few cmd.script states that execute shell scripts; however, it's never been this bad.
Just a follow-up as it's been 10 days without a reply.
I checked this bug and ran:
strace python3.6 -c 'import select; select.epoll().poll(timeout=-1)' &
stat("/usr/lib64/python3.6", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
stat("/usr/lib64/python3.6/lib-dynload", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/local/lib/python3.6/site-packages", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib64/python3.6/site-packages", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib/python3.6/site-packages", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
brk(NULL) = 0x1184000
brk(0x11a5000) = 0x11a5000
getcwd("/root", 1024) = 6
stat("/root", {st_mode=S_IFDIR|0550, st_size=246, ...}) = 0
stat("/root", {st_mode=S_IFDIR|0550, st_size=246, ...}) = 0
stat("/root", {st_mode=S_IFDIR|0550, st_size=246, ...}) = 0
openat(AT_FDCWD, "/root", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
getdents(3, /* 17 entries */, 32768) = 528
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
stat("/usr/lib64/python3.6", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
stat("/usr/lib64/python3.6/lib-dynload", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib64/python3.6/lib-dynload/select.cpython-36m-x86_64-linux-gnu.so", {st_mode=S_IFREG|0755, st_size=33536, ...}) = 0
futex(0x7f3acc3a80d0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("/usr/lib64/python3.6/lib-dynload/select.cpython-36m-x86_64-linux-gnu.so", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000 \0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=33536, ...}) = 0
mmap(NULL, 2127544, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3ac5187000
mprotect(0x7f3ac518d000, 2093056, PROT_NONE) = 0
mmap(0x7f3ac538c000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x5000) = 0x7f3ac538c000
close(3) = 0
mprotect(0x7f3ac538c000, 4096, PROT_READ) = 0
epoll_create1(EPOLL_CLOEXEC) = 3
epoll_wait(3,
Just waits...
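A quicker sanity check that doesn't need strace: with no registered descriptors, poll(timeout=0) should return immediately and empty, whereas the timeout=-1 call above blocks in epoll_wait indefinitely. (This assumes Linux, since select.epoll is Linux-only.)

```python
import select
import time

ep = select.epoll()                  # Linux-only API
start = time.monotonic()
events = ep.poll(timeout=0)          # 0 = return immediately, never block
elapsed = time.monotonic() - start
ep.close()

print(events)                        # [] - no fds registered, no events
print(elapsed < 1.0)                 # True - it did not block
```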
You mentioned nested shell scripts. We do something like this:
install_backups:
  cmd.script:
    - name: install_backups.sh
    - source: salt://backups/files/install_backups.sh
updates applied.
@JustOneMoreBlock I've asked before and I'm asking again: could you please provide a hung minion debug log?
I got the same problem; my OS is CentOS, the Python version is 3.6.8, and the salt-master version is 3001.1.
It seems like we might have a few different issues all in the same ticket here:
@marilyn6483 Just to confirm, was the original issue you posted related to Python 3.5, and has it been resolved with an upgrade?
@JustOneMoreBlock Following on the comments from @DmitryKuzmenko, can you provide a minion log with information from when the issue occurs?
@edhgoose Is the issue you're seeing still persisting? Does it appear to be a timeout issue between master and minion? Have you tried increasing the timeout values? Are you running Salt in the same AWS region, or is it between regions? From outside AWS into AWS?
Thanks!
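On the timeout question: if it does turn out to be latency-related, the master-side knobs to experiment with are timeout and gather_job_timeout. The values below are illustrative, not recommendations:

```yaml
# /etc/salt/master - illustrative values only
timeout: 30             # seconds the CLI waits for minion returns (default 5)
gather_job_timeout: 15  # seconds to wait for saltutil.find_job replies (default 10)
```

Note that raising these only helps if the return eventually arrives; it won't fix a return that the master drops entirely, as described earlier in this thread.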