Since few weeks, mysql server regulary (and randomly) stop working.
On the logs, I get these messages for all my hosted databases:
...
2024-08-13 23:13:30 100731 [Warning] Aborted connection 100731 to db: 'MSH_mairie' user: 'MSH_mairie' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:39 101338 [Warning] Aborted connection 101338 to db: 'LUD_luda' user: 'LUD_luda' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:39 100002 [Warning] Aborted connection 100002 to db: 'CGA_coco' user: 'CGA_coco' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:39 101207 [Warning] Aborted connection 101207 to db: 'LGO_nespa' user: 'LGO_nespa' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:39 100212 [Warning] Aborted connection 100212 to db: 'KCO_lady' user: 'KCO_lady' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:39 101206 [Warning] Aborted connection 101206 to db: 'LGO_nespa' user: 'LGO_nespa' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:40 101342 [Warning] Aborted connection 101342 to db: 'SSO_sano' user: 'SSO_sano' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:40 101344 [Warning] Aborted connection 101344 to db: 'LGO_tef' user: 'LGO_tef' host: 'localhost' (Got timeout reading communication packets)
2024-08-13 23:13:51 99984 [Warning] Aborted connection 99984 to db: 'CDA_ecole' user: 'CDA_ecole' host: 'localhost' (Got timeout reading communication packets)
...
After some searches, I read the issue comes from settings inside /etc/mysql/my.cnf, by increasing some values, but itās look hard to know how exactly to find the correct values.
In my case, I have changed these values :
max_allowed_packet = 256M (instead of 32M)
wait_timeout=30 (instead of 10)
long_query_time=10 (instead of 5)
But unfortunatly, I steel have mysql server randomly stop
How can I tune finely mysql ?
Which values could solve the issue ?
If I was in your position I would probably run a test on a different server perhaps in a different country with very good latency just to be sure it is a software issue. It may cost you around $5 to obtain a month of VPS hosting to run such a test, but it would be worth it in my view.
Hopefully someone at this forum may have a solution, but if not then a bit of trial and error may be needed. Good luck
Looks like your mysql is on the same server as hestia . Is it overloaded ? Is it extremely cheap with poor disk speeds ? Of the values you mentioned I would think these would help the most.
@soulyears thanks for your answer, but I donāt understand your tip to test my server from another one.
This HestiaCP is hosting approximatively 100 websites peacefully, except from few weeks with this mysql instabilityā¦
@jperkins Youāre right, the mysql serevr is on the same server than Hestia. But I have a good VPS hosted by Infomaniak, with good resourcesā¦ I donāt thing the issue comes from hereā¦
Here the begining of the log when launching mysql :
2024-08-14 15:55:09 0 [Note] Starting MariaDB 10.11.6-MariaDB-0+deb12u1 source revision as process 1188562
2024-08-14 15:55:09 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
2024-08-14 15:55:09 0 [Note] InnoDB: Number of transaction pools: 1
2024-08-14 15:55:09 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2024-08-14 15:55:09 0 [Note] InnoDB: Using liburing
2024-08-14 15:55:09 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
2024-08-14 15:55:09 0 [Note] InnoDB: Completed initialization of buffer pool
2024-08-14 15:55:09 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
2024-08-14 15:55:09 0 [Note] InnoDB: End of log at LSN=386783369028
2024-08-14 15:55:09 0 [Note] InnoDB: 128 rollback segments are active.
2024-08-14 15:55:10 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
2024-08-14 15:55:10 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.
2024-08-14 15:55:10 0 [Note] InnoDB: log sequence number 386783369028; transaction id 301432878
2024-08-14 15:55:10 0 [Note] Plugin 'FEEDBACK' is disabled.
2024-08-14 15:55:10 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2024-08-14 15:55:10 0 [Warning] You need to use --log-bin to make --expire-logs-days or --binlog-expire-logs-seconds work.
2024-08-14 15:55:10 0 [Note] Server socket created on IP: '127.0.0.1'.
2024-08-14 15:55:11 0 [Note] /usr/sbin/mariadbd: ready for connections.
Version: '10.11.6-MariaDB-0+deb12u1' socket: '/run/mysqld/mysqld.sock' port: 3306 Debian 12
2024-08-14 15:55:14 0 [Note] InnoDB: Buffer pool(s) load completed at 240814 15:55:14
As you can see, the is a note about Setting file ā./ibtmp1ā but I donāt know if it significantā¦
Sorry that was about all I had as I dont know much about mysql. But would look at ātopā to be sure it wasnt overloaded and pay attention to these numbers, especially WA and ST.
us : time running un-niced user processes
sy : time running kernel processes
ni : time running niced user processes
id : time spent in the kernel idle handler
wa : time waiting for I/O completion
hi : time spent servicing hardware interrupts
si : time spent servicing software interrupts
st : time stolen from this vm by the hypervisor
It is possible that it is trying to load the data and create indexes. But the memory is not available.
1.
How much RAM and virtual memory is available on the system? Are the databases huge? Whats the size in total of all DB?
2.
Does your VPS have a dedicated core available or does the package have shared CPU made available?
All this is normal. You do not have any problems in starting the mysql server. It started normally, which the logs prove. The connection is lost due to a different reason, either due to bottleneck configuration (which I doubt if the total size of the databases is huge) or due to hardware bottlenecks.
@soulyears Itās an unmanaged one : I have full control on it
@Deepak :
the total memory of the server is 12Go. You can have a glances screenshot :
myserver.net (Debian GNU/Linux 12 64bit / Linux 6.1.0-23-cloud-amd64) Uptime: 2 days, 9:12:41
Cloud cpu4-ram12-disk20-perf1 instance i-0000a405 (AZ1)
AMD EPYC-Rome Processor CPU ā 19.4% idle 74.0% ctx_sw 1K MEM ā 25.0% active 3.55G SWAP - 0.0% LOAD - 4core
CPU [|||||||| 19.4%] user 13.9% irq 0.0% inter 2K total 11.7G inacti 5.18G total 0 1 min 1.38
MEM [||||||||||| 25.0%] system 10.2% nice 0.0% sw_int 2K used 2.92G buffer 474M used 0 5 min 1.28
SWAP [ 0.0%] iowait 1.6% steal 0.0% free 8.77G cached 7.72G free 0 15 min 1.54
NETWORK Rx/s Tx/s TASKS 145 (268 thr), 1 run, 89 slp, 55 oth Threads sorted automatically by CPU consumption
ens3 267Kb 9.20Mb
lo 148Kb 148Kb CPU% MEM% VIRT RES PID USER TIME+ THR NI S R/s W/s Command ('k' to kill)
>20.5 0.6 422M 71.0M 1833054 SSO 0:00 1 0 S 718K 0 php-fpm: pool sanoah.fr
DefaultGateway 20ms 2.0 0.4 197M 43.2M 1833029 root 0:00 1 0 R 0 0 python3 /usr/bin/glances
1.0 0.5 100M 65.4M 1420925 www-data 5:23 1 0 S 0 0 nginx: worker process
DISK I/O R/s W/s 0.5 8.0 2.67G 959M 1188562 mysql 30:21 26 0 S 149K 47K mariadbd
sda 935K 352K 0.5 0.5 1.88G 56.9M 1421080 www-data 1:57 27 0 S 0 4K apache2 -k start
sdb 0 77K 0.5 0.1 311M 15.6M 514 root 1:09 1 0 S 0 0 php-fpm.conf)
sdb1 0 77K 0.5 0.1 20.7M 7.29M 418 systemd-r 1:11 1 0 S 0 0 systemd-resolved
sdb14 0 0 0.5 0.0 0 0 15 root 2:49 1 0 I 0 0 [rcu_preempt]
sdb15 0 0 0.0 1.1 486M 137M 1833045 GRE 0:01 1 0 S 0 0 php-fpm: pool site1.com
sr0 0 0 0.0 1.1 486M 136M 1832948 MMA 0:07 1 0 S 0 0 php-fpm: pool site2.com
0.0 1.1 475M 127M 1833051 GRE 0:01 1 0 S 0 0 php-fpm: pool site3.com
FILE SYS Used Total 0.0 1.0 464M 117M 1832961 CSA 0:01 1 0 S 0 0 php-fpm: pool site14.fr
/ (sdb1) 9.62G 19.5G 0.0 1.0 401M 116M 1833049 GRE 0:01 1 0 S 0 0 php-fpm: pool site15.com
/backup (sda) 158G 245G 0.0 0.8 442M 91.6M 1833028 CSA 0:01 1 0 S 0 0 php-fpm: pool site6.fr
/home (sda) 158G 245G 0.0 0.7 368M 80.3M 1832952 MMA 0:01 1 0 S 0 0 php-fpm: pool site16.com
/var/lib/mysql (sda) 158G 245G 0.0 0.7 368M 79.7M 1833052 EBA 0:02 1 0 S 0 0 php-fpm: pool site17.com
0.0 0.5 352M 64.9M 1833056 SSO 0:00 1 0 S ? ? php-fpm: pool site12.fr
No warning or critical alert detected
2024-08-15 09:48:31 CEST 2024-08-15 09:48:17 (0:00:08) - CRITICAL on CPU_IOWAIT (Min:25.4 Mean:28.4 Max:33.2): php-fpm8.2, mariadbd, php-fpm8.2
The VPS has 4 cores, but I didnāt tune it to allow one to mysql for example.
Should I ?
About the wait_timeout setting : which value can I set ? 100 ?
And last question/answer, about the DB size : the size directory of /var/lib/mysql is nearly 11Go
I donāt know how to get more specific stats about itā¦
pasted this from the weblink above ( he left out the trailing semicolon) SELECT table_schema AS "Database", SUM(data_length + index_length) / 1024 / 1024 AS "Size (MB)" FROM information_schema.TABLES GROUP BY table_schema;
on my idling hestia server it spit this out
MariaDB [(none)]> SELECT table_schema AS "Database", SUM(data_length + index_length) / 1024 / 1024 AS "Size (MB)" FROM information_schema.TABLES GROUP BY table_schema;
+--------------------+-------------+
| Database | Size (MB) |
+--------------------+-------------+
| information_schema | 0.20312500 |
| jperkins_joomla | 15.57812500 |
| mysql | 3.60156250 |
| performance_schema | 0.00000000 |
| phpmyadmin | 0.39257813 |
| roundcube | 1.20312500 |
| sys | 0.03125000 |
+--------------------+-------------+
7 rows in set (0.107 sec)
MariaDB [(none)]>
maybe you can see which is the large database
hope this helps and it is probably all I got
I suspected that was the case, as if it was managed you presumably would be asking for some assistance from your provider. If the provider you have is a good one (as mine is fortunately) they may be willing to help you solve the problem with some suggestions which may work anyway on a one off basis. The reality is though that this may not be a problem you can solve through software adjustments at all, and itās one I feel you must face and deal with in a timely manner if the 100 domains you mention belong to other people.
I already told you what I would do in my first post if no software adjustment could be found, and to spell it out further I would begin one by one to move the domains to a different server, starting with those that belong to you. If there was no reoccurrence of the mysql issue I would then inform my clients one by one that you are moving to a different data center and they will experience some downtime possibly.
In the end itās upto you what you do, obviously. Iām just telling you what I myself would do if i was in your position, thatās all.
Yes, itās worth exploring it, if tuning helps. I donot know if you have a swap file. If not, you can create one swap file in the size of 2x the RAM and reboot. As you have large databases, you need to go through tuning parameters, if something makes it better. But these are only suggestions and not solutions so one never knows if they work and how to solve bottleneck that you have.
Thank @Deepak and @soulyears for your answers.
Not easy to see clearly what to do.
Of course, I can move heavy site one by one until the crash stopā¦ but as they are appends randomly, itās a lot time process !
I will check for the swap file, but I wonder if there exist some sysadmin tools to fine tune mysql serverā¦
Thanks !
I will start by the more āeasyā for me : MySQLTunner.
I have made some adjustments, and I have to wait few days now to see if the situation comes better (or worse hahahaha).