So the solution was:
innodb_buffer_pool_size = 64G
and
max_connections=500
Pointed out by @alber.
Then @eris spent some time on the server tweaking php-fpm.
Thank you both guys. Many thanks. Hope that’ll end my problems for now.
Unfortunately I celebrated way too early. The server crashed again under a DDoS. This time MariaDB and PHP-FPM crashed. I had to reboot the server.
What’s weird to me is that the server is not reaching max CPU/memory usage. It’s just "stuck" without any obvious reason in the logs.
So far I have changed my php-fpm.tpl a little.
@eris set it to:
pm = ondemand
pm.max_children = 32
pm.start_servers = 8
pm.max_requests = 8000
I changed this to:
pm.max_children = 42
Maybe someone has another idea. I’m fine with the server crashing at 100% CPU/RAM usage, but it’s strange when it dies at 30% CPU load and 30GB/90GB RAM usage.
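One thing that can be checked for a pool that gets stuck at low CPU/RAM is whether it simply ran out of workers. Below is a minimal sketch, not a definitive tool: the helper names are made up for illustration, and the log path and process name in the usage note are assumptions that depend on the distribution. It counts the "server reached pm.max_children" warnings that PHP-FPM writes to its log, and gives a rough upper bound for pm.max_children from a memory budget and the average worker size.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, adjust paths to your setup.

# Count how often PHP-FPM logged that the pool hit pm.max_children.
count_max_children_hits() {
    local logfile="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    grep -c 'server reached pm.max_children setting' "$logfile" || true
}

# Average resident size in MiB, from RSS values in KiB read on stdin.
avg_rss_mib() {
    awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n / 1024; else print 0 }'
}

# Rough upper bound: memory budget (MiB) divided by average worker size (MiB).
estimate_max_children() {
    local budget_mib="$1" worker_mib="$2"
    if [ "$worker_mib" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( budget_mib / worker_mib ))
}
```

Possible usage, assuming the log is at /var/log/php8.3-fpm.log and the worker process is named php-fpm8.3 (both are assumptions): `count_max_children_hits /var/log/php8.3-fpm.log`, then `estimate_max_children 8192 "$(ps -C php-fpm8.3 -o rss= | avg_rss_mib)"`. A non-zero warning count during the attack would suggest requests are queueing for a free worker rather than the server running out of CPU or RAM.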
I also tweaked opcache a little.
Changed:
;opcache.force_restart_timeout=180
to:
opcache.force_restart_timeout=0
as mentioned here: https://www.cogmentis.com/php-fpm-crashing-on-cpanel-server-fixed/
The rest of my opcache settings are unchanged. A couple of weeks ago I tweaked some opcache memory limits, but those look fine.
Off-topic: meanwhile, you can auto-restart failed services with a script run from cron, e.g.:
#!/usr/bin/env bash
# Restart any listed systemd service that is in the "failed" state and log the result.
set -e
# set -x
# Adjust this list to the services installed on your server.
services="mysql nginx php7.4-fpm php8.1-fpm"
for service in $services; do
    if systemctl is-failed --quiet "$service"; then
        if systemctl restart "$service"; then
            message="$service is down, restarted at $(date +"%Y-%m-%d %T")"
        else
            message="ERROR: $service is down, can't be autorestarted at $(date +"%Y-%m-%d %T")"
        fi
        # echo "$message"
        echo "$message" >> /var/log/check-restart-services.log
    fi
done
Maybe it was killed by the OOM killer?
dmesg -T | grep -Ei 'killed process'
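Note that dmesg only shows the kernel log of the current boot, so after a reboot the messages from the crash are gone from it. A small sketch of a filter that can be fed from either dmesg or the journal (the function name is made up for illustration, and the journal variant only works if the systemd journal is persistent):

```shell
#!/usr/bin/env bash
# Sketch only: oom_events is a hypothetical helper name.

# Filter kernel log text on stdin down to lines that look like OOM-killer events.
oom_events() {
    # "|| true" keeps the exit status clean when nothing matches.
    grep -Ei 'out of memory|oom-kill|killed process' || true
}
```

Possible usage: `dmesg -T | oom_events` for the current boot, or `journalctl -k -b -1 --no-pager | oom_events` for the boot before the last reboot.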
Thanks for that. The problem is a bit more complicated than that.
These services just "hang" with no obvious error. For example, when the site is down, running
systemctl status mariadb/php8.3-fpm
reports that everything’s fine.
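Since systemd reports a hung service as active, a check based on `systemctl is-failed` will not catch this case. A minimal sketch of a check that asks the service itself to answer within a time limit (the function names and the example probe commands in the usage note are assumptions, not part of any existing tool):

```shell
#!/usr/bin/env bash
# Sketch only: probe and check_service are hypothetical helper names.

# Run a probe command with a time limit; succeed only if it exits 0 in time.
probe() {
    local limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
}

# Print a one-line verdict for a named service based on its probe command.
check_service() {
    local name="$1" limit="$2"
    shift 2
    if probe "$limit" "$@"; then
        echo "$name: responding"
    else
        echo "$name: NOT responding within ${limit}s"
    fi
}
```

Possible usage: `check_service mariadb 5 mysqladmin ping` for the database, and `check_service php-fpm 5 curl -fsS https://example.com/health.php` for PHP, where the URL is a placeholder for any lightweight PHP page on the site. A failed probe while `systemctl status` still says "active" would confirm the hang.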
The dmesg command returns nothing, so it’s not the OOM killer. This server has 96GB of memory and only one WordPress site on it (someone decided to throw money at the problem instead of fixing it).
Also, current RAM usage under normal site load is:
Memory: 8525MiB / 96312MiB
Also, you can protect the server from direct connections to its IP that bypass Cloudflare by validating the Cloudflare certificate with Authenticated Origin Pulls:
# Cloudflare Origin CA
ssl_client_certificate /etc/nginx/certs/cloudflare.crt;
ssl_verify_client optional;
# ssl_verify_client on;
Enable Authenticated Origin Pulls in Cloudflare under SSL > Origin Server,
and download the certificate from there (see "Zone-level authenticated origin pulls").
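To see whether the origin still answers requests that skip Cloudflare, you can connect to the server IP directly and look at the status code. A rough sketch, with made-up function names and an assumed interpretation of the codes: with `ssl_verify_client on` nginx answers a request without a client certificate with 400, while with `optional` it only rejects if the template adds its own check (403 is assumed here).

```shell
#!/usr/bin/env bash
# Sketch only: function names and the status-code interpretation are assumptions.

# Request the site straight from the origin IP, bypassing DNS (and so Cloudflare).
# Prints the HTTP status code; curl prints 000 if the connection itself fails.
origin_status() {
    local host="$1" ip="$2"
    curl -sk -o /dev/null -m 10 -w '%{http_code}' \
        --resolve "${host}:443:${ip}" "https://${host}/" || true
}

# Turn a status code into a rough verdict.
classify_origin_status() {
    case "$1" in
        400|403) echo "blocked" ;;      # request without client certificate was refused
        000)     echo "unreachable" ;;  # no HTTP response at all
        *)       echo "open" ;;         # origin served the request directly
    esac
}
```

Possible usage: `classify_origin_status "$(origin_status example.com 203.0.113.10)"`, where the domain and IP are placeholders for your own. Run it from a machine outside the server.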
And killing the process doesn’t help? What does lsof show
for the process that hung?
I’m already using a Cloudflare certificate on this server. Where should I put these settings? In the Nginx template?
/usr/local/hestia/data/templates/web/nginx/php-fpm/
Or directly in
/home/user/conf/web/domain/nginx.conf
I’m using this Nginx template:
I rebooted this server. If it hangs again I’ll check with lsof.
Is:
lsof -i | grep mariadb
enough?
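`lsof -i` only lists network sockets, so it says little about why a process is stuck. A small sketch that reads the process state from /proc on Linux (the function names are made up for illustration, and the process name in the usage note is an assumption): a process in state D is in uninterruptible sleep, which would fit a service that hangs at low CPU and cannot be killed.

```shell
#!/usr/bin/env bash
# Sketch only: proc_state and describe_state are hypothetical helper names.

# Print the one-letter kernel state of a process (Linux, reads /proc).
proc_state() {
    local pid="$1"
    awk '/^State:/ { print $2 }' "/proc/${pid}/status"
}

# Translate the state letter into a short description.
describe_state() {
    case "$1" in
        D) echo "uninterruptible sleep (usually waiting on I/O)" ;;
        R) echo "running" ;;
        S) echo "sleeping" ;;
        T) echo "stopped" ;;
        Z) echo "zombie" ;;
        *) echo "other ($1)" ;;
    esac
}
```

Possible usage, assuming the server process is called mariadbd (it is mysqld on older versions) and there is a single instance: `pid=$(pidof mariadbd); describe_state "$(proc_state "$pid")"; lsof -p "$pid"`. The `lsof -p` call lists all open files of that process, not just sockets.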
No, I can’t start this process (mariadb) again.
Restarting PHP-FPM is fine.
Reboot is my only option then.
For some reason query logging was enabled,
so the drives could be a bottleneck.
I enabled that a few days ago, hoping it would help me find the issue.
I don’t think the drives are a bottleneck. It’s 2 SSDs in RAID.
Here’s a screenshot from New Relic, with the moment it crashed highlighted.
Try downgrading the PHP version back to 7.4 and see whether it works fine.
The issue might be due to memory leaks.
Unfortunately this is not helping.
I’m marking this post as the solution for the crashing database. The problem still persists (getting DDoSed and the server crashing), but now only PHP8.3-FPM is crashing, without any error in the log. I’ll open another thread with another question. Also, @alber offered to help me tomorrow, but I hope I’ll resolve this issue sooner.
Thanks again.
Yes, in the nginx templates, in the server section. It’s described in the Cloudflare manual at the link.