We have an Azure-based VPS instance (2 vCPU, 4 GB RAM, 30 GB SSD) running an up-to-date Ubuntu 18.04, with a fresh Hestia install, a PrestaShop e-commerce site and a single domain.
User traffic is fairly low, about 100 visits/day to the e-commerce site, and the average CPU load is about 7%.
Every 5 or 6 days, at around 15:30 (+/- 10 minutes), the server hangs and the only course of action is to turn off the instance and restart it.
We have disabled all mail services, increased max_children to 20 (we were seeing errors in the PHP log: “server reached max_children setting (8), consider raising it”) and checked that there are no cron jobs running at that time.
Has anything like this happened to anyone? Any idea where to investigate the problem?
The symptoms we see are: the web service times out, CPU usage is around 100%, memory is reported as entirely free (?!), and network IN and OUT are at 0 MB/s.
We think we cannot get into the console because the server responds so slowly.
Shutting down the VM takes 15 minutes, when it normally takes 2-3 minutes.
Try adding the venerable Munin, along with additional plugins, specifically multips_memory and the ones for CPU/disk activity. It could just be a runaway process, such as the infamous clamd, but it sounds to me like a buffer/cache isn’t getting trimmed and the server then runs out of resources.
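Until Munin is collecting data, a rough stand-in is to log which processes hold the most resident memory, so the last sample before a hang shows what was growing. The sketch below is only an illustration and is not a Munin plugin: it reads the standard Linux /proc/<pid>/status files, and the function names (parse_status, top_memory_processes) are made up for this example.

```python
import os


def parse_status(text):
    """Return (process name, resident memory in kB) from /proc/<pid>/status text."""
    name, rss_kb = None, 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # Line looks like "VmRSS:     12345 kB"; kernel threads have no VmRSS.
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_memory_processes(limit=10, proc_root="/proc"):
    """Return up to `limit` tuples (rss_kb, pid, name), largest resident set first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were reading it
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]
```

Calling top_memory_processes() every minute from cron and appending the result to a file would leave a trail to read after the next forced restart.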
Take a look at the size of the files in /var/log/* and make sure they are truncated when they reach a sensible size, say 5 MB. A reboot may well be clearing one of these logs, making it less work for the processes that read from or write to it. Then, as the log grows to GB sizes over time, the process in question gets “its knickers in a twist”.
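A quick way to spot oversized logs is to walk /var/log and list everything above the threshold. This is a minimal sketch using only the Python standard library; the 5 MB default mirrors the figure suggested above, and the function name find_large_files is just an illustrative choice.

```python
import os


def find_large_files(root="/var/log", threshold_bytes=5 * 1024 * 1024):
    """Return (path, size_in_bytes) for files under root above the threshold, biggest first."""
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # skip symlinks so nothing is counted twice
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; ignore it
            if size > threshold_bytes:
                large.append((path, size))
    large.sort(key=lambda item: item[1], reverse=True)
    return large


if __name__ == "__main__":
    for path, size in find_large_files():
        print(f"{size / 1024 / 1024:8.1f} MB  {path}")
```

Run it as root so unreadable subdirectories are not silently skipped. It only reports sizes; the actual trimming is better left to logrotate.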
Azure Linux images do not have swap configured … The reason is that “the user should decide on the size and location of the swap and do it post provisioning” … And that despite having a temporary disk attached for this purpose.
Anyway, I just configured 4GB of swap and in a few days I will know if this was the reason for the apparent lack of resources.
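To check that the swap is really active (and to keep an eye on available memory in the days before the next expected hang), the kernel's figures in /proc/meminfo can be read directly. The sketch below assumes the usual "Key: value kB" layout of that file; parse_meminfo and memory_summary are names invented for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Reduce the parsed values to the few numbers relevant here, in whole MB."""
    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)
    return {
        "mem_available_mb": info.get("MemAvailable", 0) // 1024,
        "swap_total_mb": swap_total // 1024,
        "swap_used_mb": (swap_total - swap_free) // 1024,
    }
```

On the server, passing the contents of /proc/meminfo through parse_meminfo and then memory_summary should show swap_total_mb close to 4096 if the 4 GB of swap is enabled; a value of 0 would mean it did not survive the reboot.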
Thanks for your contributions, they have helped us!