Hello,
We are encountering strange behavior on a Windows 2012 Event Collector.
This server has more than 8 vCPUs and 20 GB of RAM, and monitoring does not show any particular usage peaks.
NXLog is used to fetch logs and send them to a SIEM.
We are using a subset of the wef-subscriptions (~40), customized to filter at the source, on more than 4000 workstations.
For some unknown reason, the Event Collector randomly stops "working", sometimes after 2 days, sometimes after 3 days or 7 days. By "stops working" I mean that the service is still running but no events arrive in the ForwardedEvents log.
Regarding deployed subscriptions, the only modifications we performed were:
- Everything in ForwardedEvents
- ~40 subscriptions
- MaxItems set to 25 or 50 (depending on the subscription)
25000
We also tried some of the customizations recommended in:
"Windows™ Event Forwarding (WEF) into ArcSight at Scale"
We tried to correlate this issue with user activity, but there does not seem to be any link: the most recent time it stopped sending logs was at 6 in the morning.
The analysis we performed showed the following behavior:
- The subscription URL suddenly becomes inaccessible (404), which generates error 2150859027 on the client side.
- BUT WinRM and WSMan are still accessible from the client to the collector (tested with winrm and Test-WSMan).
- The wecutil commands (wecutil es/gs/gr) cannot be used: the process never ends.
- A deeper analysis using Process Explorer showed that all wecsvc.exe process threads are in a "Lock"/Waiting state, with 0% CPU and 0% RAM used.
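For anyone searching for the client-side error, the decimal code above is easier to look up in its hexadecimal HRESULT form. A minimal sketch of the conversion (the helper name `to_hresult_hex` is ours, purely for illustration):

```python
def to_hresult_hex(code: int) -> str:
    """Format a 32-bit error code (signed or unsigned) as 0xXXXXXXXX."""
    # Masking to 32 bits lets the same helper accept the negative
    # (signed) form that some tools and logs display.
    return f"0x{code & 0xFFFFFFFF:08X}"


print(to_hresult_hex(2150859027))  # 0x80338113
```

The same helper also accepts the signed form of the code (-2144108269), which some tools show instead.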
Sometimes we are able to restart the service, sometimes we need to kill it before restarting.
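Since a hung collector shows up as a `wecutil` call that never returns, one way to detect the condition automatically is to run the command with a timeout. This is only a sketch under our own assumptions: the function name `probe`, the 30-second timeout, and the choice of `wecutil es` as the probe command are illustrative, not something recommended by Microsoft or by this repository.

```python
import subprocess


def probe(cmd, timeout_s=30.0):
    """Run cmd and classify the outcome as 'ok', 'failed', or 'hung'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        # The command did not finish in time, which matches the
        # "process never ends" symptom described above.
        return "hung"
    except OSError:
        # The command could not be started (for example, not found).
        return "failed"
    return "ok" if result.returncode == 0 else "failed"
```

On the collector this could be called as `probe(["wecutil", "es"])` from a scheduled task, with a "hung" result raising an alert. Whether restarting or killing the service at that point is safe is a separate question.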
I have multiple questions on this:
- Your configs use MaxItems set to 1. Do we agree that performance should be better with a little buffering (as we do by setting it to 25 or 50)?
- One advantage of multiple subscriptions is the ability to manage ACLs separately, but it also creates 4000*40 registry keys to maintain state in the registry. Could this have any impact?
- Could having 40 dedicated subscriptions have an impact on the number of parallel network connections from source-initiated clients? (I mean each of the 4000 clients opening 40 parallel TCP sockets instead of only 1.)
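To put numbers on the last two questions, here is a rough back-of-the-envelope sketch. It assumes one registry state entry per source per subscription, and it takes the one-socket-per-subscription case as a hypothesis to be confirmed, not as documented behavior. The function name `scale` is ours.

```python
def scale(sources: int, subscriptions: int) -> dict:
    """Estimate per-collector state and connection counts."""
    return {
        # One state entry per (source, subscription) pair.
        "registry_state_entries": sources * subscriptions,
        # Hypothetical worst case: one socket per subscription per source.
        "connections_if_one_per_subscription": sources * subscriptions,
        # Best case: one shared socket per source.
        "connections_if_shared": sources,
    }


for name, value in scale(sources=4000, subscriptions=40).items():
    print(f"{name}: {value}")
```

With our figures this gives 160000 state entries, and between 4000 and 160000 connections depending on which hypothesis holds.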
This is a very interesting topic, as deployment, tuning, and troubleshooting in large environments are not well documented by MS (not to say not documented at all).
Any feedback, ideas, or answers on this issue would be more than appreciated and, I assume, would help the community!
Kind regards,