I was streaming Kafka on AWS EC2 CentOS 7. My Session Manager Idle Timeout is set to 60min. And yet, after running for much less than that, the terminal got frozen, saying My session has been terminated
. Of course, the Kafka streaming for disrupted as well.
When I tried to restart a new session with a new terminal, I got this error popup
Your session has been terminated for the following reasons: Plugin with name Standard_Stream not found. Step name: Standard_Stream
and I am still unable to restart a terminal.
What does this error mean and how to resolve it? Thanks.
CodePudding user response:
- So far you need to access the EC2 using SSH with key-pem to debug (ask your admin)
Running tail -f
got issue
tail: inotify resources exhausted
tail: inotify cannot be used, reverting to polling
Restart ssm-agent service also got issue
No space left on device
but it's not about disk space[root@env-test ec2-user]# systemctl restart amazon-ssm-agent.service Error: No space left on device
[root@env-test ec2-user]# df -h |grep dev devtmpfs 32G 0 32G 0% /dev tmpfs 32G 0 32G 0% /dev/shm /dev/nvme0n1p1 100G 82G 18G 83% /
So the error itself means that system is getting low on inotify watches, that enable programs to monitor file/dirs changes. To see the currently set limit (including output on my machine)
$ cat /proc/sys/fs/inotify/max_user_watches
8192
Check which processes using inotify to improve your apps or increase max_user_watches
for foo in /proc/*/fd/*; do readlink -f $foo; done | grep inotify | sort | uniq -c | sort -nr
5 /proc/1/fd/anon_inode:inotify
2 /proc/7126/fd/anon_inode:inotify
2 /proc/5130/fd/anon_inode:inotify
1 /proc/4497/fd/anon_inode:inotify
1 /proc/4437/fd/anon_inode:inotify
1 /proc/4151/fd/anon_inode:inotify
1 /proc/4147/fd/anon_inode:inotify
1 /proc/4028/fd/anon_inode:inotify
1 /proc/3913/fd/anon_inode:inotify
1 /proc/3841/fd/anon_inode:inotify
1 /proc/31146/fd/anon_inode:inotify
1 /proc/2829/fd/anon_inode:inotify
1 /proc/21259/fd/anon_inode:inotify
1 /proc/1934/fd/anon_inode:notify
- Notice that the above inotify list include PID of ssm-agent
processes, it explains why we got issue with SSM when
max_user_watches
reached limit
ps -ef | grep ssm-ag
root 3841 1 0 00:02 ? 00:00:05 /usr/bin/amazon-ssm-agent
root 4497 3841 0 00:02 ? 00:00:33 /usr/bin/ssm-agent-worker
- Final Solution: Permanent solution (preserved across restarts)
echo "fs.inotify.max_user_watches=1048576" >> /etc/sysctl.conf sysctl -p
Verify:
$ aws ssm start-session --target i-123abc456efd789xx --region ap-northeast-2
Starting session with SessionId: userdev-03ccb1a04a6345bf5
sh-4.2$
- This issue comes from EC2 instance not about SSM agent Go to link to undestanding SSM agent.
optional link