After requesting a backup, Nginx failed with nginx.service: Failed with result 'core-dump' error

Ade · May 22, 2024, 7:50am

Hi everyone,

I’ve just run my first website backup and I see that Nginx failed with the following (taken from syslog):

May 22 07:33:02 hp systemd[1]: Reloaded A high performance web server and a reverse proxy server.
May 22 07:33:02 hp systemd[1]: Reloading Dovecot IMAP/POP3 email server.
May 22 07:33:02 hp systemd[1]: Reloaded Dovecot IMAP/POP3 email server.
May 22 07:33:02 hp systemd[1]: Reloading LSB: exim Mail Transport Agent.
May 22 07:33:02 hp exim4[667270]:  * Reloading exim4 configuration files
May 22 07:33:02 hp systemd[1]: nginx.service: Main process exited, code=dumped, status=11/SEGV
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565812 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565813 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565814 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565815 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565816 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565817 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565819 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565821 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565822 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565812 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565813 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565814 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565815 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565816 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565817 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565819 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565821 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Killing process 565822 (nginx) with signal SIGKILL.
May 22 07:33:02 hp systemd[1]: nginx.service: Failed with result 'core-dump'.
May 22 07:33:02 hp exim4[667270]:    ...done.
May 22 07:33:02 hp systemd[1]: Reloaded LSB: exim Mail Transport Agent.
May 22 07:34:01 hp CRON[667738]: (admin) CMD (sudo /usr/local/hestia/bin/v-update-sys-queue restart)
May 22 07:35:01 hp CRON[667767]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 22 07:35:01 hp CRON[667768]: (admin) CMD (sudo /usr/local/hestia/bin/v-update-sys-queue backup)

A simple systemctl restart nginx fixed the issue, but can anyone think of a reason why this happened in the first place?

sahsanu · May 22, 2024, 8:26am

It’s hard to know the reason it failed without analyzing the core file. I’ve no idea where the system has left the core file but you could try to find it.

find / -type f -name core

If you find it, you could use gdb to debug the core file:

Note: The path to nginx binary is usually here: /usr/sbin/nginx

gdb /path/to/nginx /path/to/nginx/core
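
For a non-interactive backtrace, something like the sketch below may help. The function name nginx_backtrace and the default binary path are assumptions for illustration. gdb must be installed, and the output is only really useful if debug symbols for nginx are available.

```shell
# Hypothetical helper: print a full backtrace of every thread from a core file.
# $1 = path to the core file, $2 = nginx binary (assumed default: /usr/sbin/nginx)
nginx_backtrace() {
    core_file=$1
    nginx_bin=${2:-/usr/sbin/nginx}
    if [ ! -r "$core_file" ]; then
        echo "core file not readable: $core_file" >&2
        return 1
    fi
    # -batch exits after running the commands given with -ex
    gdb -batch -ex "thread apply all bt full" "$nginx_bin" "$core_file"
}
```

Usage would be nginx_backtrace /path/to/nginx/core, run as root so the core file can be read.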

Ade · May 22, 2024, 8:57am

Unfortunately, no joy:

root@oo:/# find / -type f -name core
root@oo:/#

sahsanu · May 22, 2024, 9:04am

Maybe it is disabled:

ulimit -c

If the output is 0, it is disabled.
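
Keep in mind that ulimit -c reports the limit of the current shell. A service started by systemd has its own limits, which can be read from /proc/<pid>/limits. A rough sketch, where the helper name core_limit_from is hypothetical and /run/nginx.pid is an assumed PID file location:

```shell
# Hypothetical helper: print the soft and hard core-size limits
# found in a /proc/<pid>/limits file.
core_limit_from() {
    awk '/^Max core file size/ { print "soft=" $5, "hard=" $6 }' "$1"
}

# Where the kernel writes core files; a leading "|" means they are
# piped to a handler program instead of being written as a plain file.
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi

# /run/nginx.pid is an assumption; adjust it to your setup.
if [ -r /run/nginx.pid ]; then
    core_limit_from "/proc/$(cat /run/nginx.pid)/limits"
fi
```

If the soft limit printed for the nginx master process is 0, no core file will be written for it, whatever the shell reports.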

Ade · May 22, 2024, 9:28am

Yes, the output is indeed 0. What is a reasonable size? I’ll try to change that and then run a new backup to see if I can replicate the problem.

sahsanu · May 22, 2024, 9:36am

To be safe, it should be unlimited.
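
Note that running ulimit -c unlimited in a shell only changes that shell and its children. For a service managed by systemd, the limit is normally set with LimitCORE= in the unit. A minimal sketch, assuming a drop-in file; the function name write_core_dropin and the file name core-limit.conf are made up for illustration:

```shell
# Hypothetical helper: write a systemd drop-in that lifts the core-size limit.
# $1 = drop-in directory (assumed default: the nginx.service drop-in directory)
write_core_dropin() {
    dropin_dir=${1:-/etc/systemd/system/nginx.service.d}
    mkdir -p "$dropin_dir" &&
    printf '[Service]\nLimitCORE=infinity\n' > "$dropin_dir/core-limit.conf"
}

# Then, as root:
#   write_core_dropin
#   systemctl daemon-reload
#   systemctl restart nginx
#   systemctl show nginx -p LimitCORE
```

The last command should report the new limit if the drop-in was picked up.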

Ade · May 22, 2024, 9:51am

Changed. I’ll run a backup now. If I use v-backup-user instead of the web interface, will the process send an email notification to the user?

sahsanu · May 22, 2024, 9:56am

Yes, it will.

Ade · May 22, 2024, 10:40am

I tried again, but so far Nginx is doing fine. I guess I now have the logs enabled in case it happens again.

Is there any way to run site backups as admin and avoid a user notification?

sahsanu · May 22, 2024, 11:18am

By default, when executing automatic backups, users shouldn’t be notified (if the backup fails, the user will receive a mail).

Ade · June 1, 2024, 1:54pm

Hi!

I’d like to reopen this because the same issue just happened again. This time, Hestia failed to install an SSL certificate (rightly so), but this caused Nginx to fail.

nginx -t shows that everything is fine, and restarting Nginx fixes it, but I’m trying to understand what is causing this.

ulimit -c shows unlimited, but find / -type f -name core gives no results.

Any other logs I can check?

sahsanu · June 1, 2024, 2:46pm

How did you try to install the certificate? Did you see the nginx error? When it happens, check the status first instead of restarting: systemctl status nginx --no-pager -l

You can always check these logs:

/var/log/hestia/error.log
/var/log/hestia/system.log
/var/log/nginx/error.log
/var/log/nginx/domains/YourDomain.error.log
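
To narrow those logs down to likely crash traces, a small filter along these lines might help. The function name crash_lines and the pattern are assumptions; the pattern targets the "exited on signal" message nginx writes when a worker dies, plus the most severe log levels.

```shell
# Hypothetical helper: show lines that hint at a crash in the given log files.
crash_lines() {
    grep -nE 'exited on signal|\[(emerg|alert|crit)\]' "$@"
}

# Examples:
#   crash_lines /var/log/nginx/error.log
#   journalctl -u nginx --since "2024-06-01" --no-pager
```

The journalctl line shows what systemd itself recorded for the nginx unit, which is where the 'core-dump' result appears.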

Ade · June 1, 2024, 5:29pm

I tried to install the SSL certificate from the site settings.

I did check the status before restarting Nginx, but there was no error, just the process-killing messages and a Failed with result ‘core-dump’ message.

sahsanu · June 1, 2024, 5:41pm

Without a core dump it’s really hard to analyze the problem.

You could try installing systemd-coredump; it should save core dumps in /var/lib/systemd/coredump:

apt install systemd-coredump

And if a core dump is created, you could use coredumpctl to interact with it.
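
As a rough sketch of what that could look like once a dump exists (the function name latest_nginx_dump is made up, and /usr/sbin/nginx is the usual binary path, which may differ on your system):

```shell
# Hypothetical helper: list nginx core dumps and show details of the newest one.
latest_nginx_dump() {
    if ! command -v coredumpctl >/dev/null 2>&1; then
        echo "coredumpctl not found; install systemd-coredump first" >&2
        return 127
    fi
    coredumpctl list /usr/sbin/nginx --no-pager &&
    coredumpctl -1 info /usr/sbin/nginx --no-pager
}

# To open the newest dump in gdb (gdb must be installed):
#   coredumpctl debug /usr/sbin/nginx
# On systemd versions older than 239 the subcommand is called "gdb" instead of "debug".
```

Inside gdb, bt full prints the backtrace of the crashed process.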

Ade · June 1, 2024, 6:22pm

Thank you @sahsanu. I’ve installed systemd-coredump and I’ll post again if I’m able to replicate the issue.

Ade · June 1, 2024, 10:00pm

It just happened again while I was trying to generate an SSL certificate (simply enabling the checkbox and the HTTPS redirect checkbox).

Ade · June 1, 2024, 10:16pm

Maybe it’s related to the nginx extras I installed some time ago.
I’m going to try what has been suggested here.