
🐛 [BUG]: Nginx/Canarie logging parsing bottleneck #593

@fmigneault

Description


Summary

There is an issue with the logging/parsing strategy of nginx access logs.

Details

Since the conversion of CanarieAPI to separate cron/nginx/api definitions, the cron command that used to be invoked (https://github.com/Ouranosinc/CanarieAPI/blob/39c32e7ee2b21b5a3a6ba671ab41aa6195fde5b7/Dockerfile#L39), which caused log rotation to occur (the logrotate config is available by default, but the cron and logrotate utilities are not; see below), is no longer run. The current proxy only calls nginx -g daemon off;.

> docker exec -ti proxy bash 
> service --status-all
 [ ? ]  hwclock.sh
 [ + ]  nginx
 [ + ]  nginx-debug
> cat /etc/logrotate.d/nginx 
/var/log/nginx/*.log {
        daily
        missingok
        rotate 52
        compress
        delaycompress
        notifempty
        create 640 nginx adm
        sharedscripts
        postrotate
                if [ -f /var/run/nginx.pid ]; then
                        kill -USR1 `cat /var/run/nginx.pid`
                fi
        endscript
}
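
For reference, the missing rotation tooling can be confirmed directly inside the container with generic shell checks (nothing specific to this stack):

> command -v logrotate || echo "logrotate: not installed"
> command -v cron || echo "cron: not installed"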

Eventually, this leads to very large log files being collected.

3.9G Oct 2 06:26 access_file.log

This was not that big a deal before.

However, since the log parser fix (Ouranosinc/CanarieAPI#53) and its integration (#588), the operation run by CanarieAPI now actually executes rather than failing preemptively. It therefore proceeds to parse the log, which can take ~4 minutes for a file of the above size:

[2025-10-02 06:24:44,766] [13] [INFO] canarieapi.app_object : Loading log file : /var/logs/access_file.log
[2025-10-02 06:28:43,812] [13] [INFO] canarieapi.app_object : Compiling stats from 27483724 records

Since this command is launched every minute (https://github.com/Ouranosinc/CanarieAPI/blob/master/docker/canarieapi-cron), this leads to an ever-growing number of duplicate parsing tasks running in parallel.

I received memory monitoring alarms from our server and found that dozens of cron jobs were hogging the memory, bringing the server to a halt.

Potential Strategies to Fix

Note

I can think of a few strategies to fix the problem.
Others may be possible; suggestions/concerns are welcome.

  1. use a Docker image with nginx + logrotate preinstalled and actually run logrotate
    • mostly a drop-in replacement, but such an image must be defined
    • it could also be considered to provide a template/variables to override the default logrotate config shipped with nginx
  2. discard the log-file method (https://github.com/search?q=repo%3Abird-house%2Fbirdhouse-deploy+%28PROXY_LOG_FILE+OR+proxy-logs%29&type=code) and use stdout (the nginx default) to leverage Docker's log rotation
    • a separate config from our current x-logging: &default-logging in the compose files would be needed to obtain a dedicated log (see the sketch after this list)
    • prometheus and canarieapi would need to be remapped to the resulting file
  3. define a cron on the host
    • since the file is mapped to the host, the logrotate config could live on the server directly, making it transparent to the stack
    • however, that is not very portable, and will most likely break on new servers where we forget to set it up
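
For option 2, a minimal sketch of what a dedicated compose logging definition could look like, relying on Docker's built-in json-file driver rotation (the anchor name proxy-logging and the size/count limits are illustrative assumptions, not existing birdhouse-deploy settings):

x-proxy-logging: &proxy-logging
  driver: "json-file"
  options:
    max-size: "100m"   # rotate once the current file reaches ~100MB
    max-file: "10"     # keep at most 10 rotated files

services:
  proxy:
    logging: *proxy-logging

With this approach, the rotated logs end up under Docker's own container log directory rather than the current PROXY_LOG_FILE location, which is why prometheus and canarieapi would need their log path remapped as noted above.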

To Reproduce

Steps to reproduce the behavior:

  1. Let the server run for a while with nginx + canarieapi.
  2. Trigger a lot of requests to force the creation of many access logs.
  3. Observe the log file growing to a large size.
  4. Inspect running commands on the host (e.g. ps faux); there should be many cron processes with canarie.logparser calls.
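
As a quick check for step 4, the number of accumulating parser processes can be counted with something like the following (the exact module name in the command line may differ):

> ps faux | grep "[l]ogparser" | wc -l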

Concerned Organizations

@tlvu @mishaschwartz
