fix proc_num monitor #77

Open
wants to merge 5 commits into
base: master


onlymellb commented Dec 19, 2016

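Current behavior

The agent periodically scans /proc for processes that match the configured cmd and cmdline. If a matching process exists under /proc, proc_num is greater than 0; otherwise it is 0.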
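For readers unfamiliar with this kind of check, here is a minimal Python sketch of what one scan could look like. It is an illustration only, not the agent's actual code: the function name `count_matching_procs`, its parameters, and the choice to read `/proc/<pid>/comm` and `/proc/<pid>/cmdline` are assumptions made for this example.

```python
import os


def count_matching_procs(name, cmdline_substr="", proc_root="/proc"):
    """Count processes whose command name equals `name` and whose
    command line contains `cmdline_substr` (hypothetical helper)."""
    count = 0
    for entry in os.listdir(proc_root):
        # Only numeric directory names under /proc are process IDs.
        if not entry.isdigit():
            continue
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "comm")) as f:
                comm = f.read().strip()
            with open(os.path.join(pid_dir, "cmdline"), "rb") as f:
                # Arguments in cmdline are separated by NUL bytes.
                raw = f.read().replace(b"\0", b" ")
                cmdline = raw.decode(errors="replace").strip()
        except OSError:
            # The process may have exited between listdir() and open().
            continue
        if comm == name and cmdline_substr in cmdline:
            count += 1
    return count
```

The result plays the role of proc_num for a single scan: it only says how many matching processes exist at that instant, and nothing about what happened since the previous scan.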

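Problem

Consider this case: a program has a slow memory leak, runs for a long time, and is eventually killed by the OOM killer. A process manager such as supervisor then restarts it immediately. If the whole sequence, from the crash to the restart, falls between two of the agent's /proc scans, the agent sees the program as healthy even though it restarted abnormally. This greatly delays the discovery of such hidden problems. Process port monitoring has the same issue.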
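The blind spot can be shown with a small Python sketch. It is not taken from this pull request, and it does not claim to be how the fix works. The helper names and the PID values are made up for illustration, and comparing PID sets between scans is just one possible way to notice a restart.

```python
def proc_num(pids):
    """Number of matching processes seen in one scan."""
    return len(pids)


def restarted_between_scans(prev_pids, curr_pids):
    """Return True if a PID seen in the previous scan is gone now.

    Assumption: a missing PID means the process exited. PIDs can be
    reused, so a real check might also compare process start times.
    """
    return len(prev_pids - curr_pids) > 0


# Hypothetical scans: the program was OOM-killed and restarted by the
# process manager between the two scans, so its PID changed.
scan_before = {4711}
scan_after = {5290}

# The count-based check sees no difference between the two scans.
same_count = proc_num(scan_before) == proc_num(scan_after)

# Comparing the PID sets exposes the restart.
restart_seen = restarted_between_scans(scan_before, scan_after)
```

With these inputs the count is 1 in both scans, so a proc_num check alone reports nothing unusual, while the PID comparison flags the change.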
