MySQL and Memory: a love story (part 1)

As you may know, sometimes MySQL can be memory-hungry. Of course having data in memory is always better than disk… RAM is still much faster than any SSD disk.

This is the reason why we recommended to have the working set as much as possible in memory (I assume you are using InnoDB of course).

Also this why you don’t want to use Swap for MySQL, but don’t forget that a slow MySQL is always better than no MySQL at all, so don’t forget to setup a Swap partition but try to avoid using it. In fact, I saw many people just removing the Swap partition… and then OOM Killer did its job… and mysqld is often its first victim.

MySQL allocates buffers and caches to improve performance of database operations. That process is explained in details in the manual.

In this article series, I will provide you some information to check MySQL’s memory consumption and what configuration settings or actions can be made to understand and control the memory usage of MySQL.

We will start the series by the Operating System.

Operating System

In the OS level, there are some commands we can use to understand MySQL’s memory usage.

Memory Usage

You can check mysqld‘s memory usage from the command line:

# ps -eo size,pid,user,command --sort -size | grep [m]ysqld \
  | awk '{ hr=$1/1024 ; printf("%13.2f Mb ",hr) } { for ( x=4 ; x<=NF ; x++ ) { printf("%s ",$x) } print "" }' \ 
  |cut -d "" -f2 | cut -d "-" -f1

1841.10 Mb /usr/sbin/mysqld 
   0.46 Mb /bin/sh /usr/bin/mysqld_safe

top can also be used to verify this.

For top 3.2:

# top -ba -n1 -p $(pidof mysqld) | grep PID -A 1
PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                
 1752 mysql     20   0 1943m 664m  15m S  0.0 11.1  19:55.13 mysqld                                                                                                                                 
# top -ba -n1 -m -p $(pidof mysqld) | grep PID -A 1
  PID USER      PR  NI  USED  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                
 1752 mysql     20   0  664m 664m  15m S  2.0 11.1  19:55.17 mysqld

For more recent top, you can use top -b -o %MEM -n1 -p $(pidof mysqld) | grep PID -A 1

VIRT represents the total amount of virtual memory used by mysql. It includes all code, data and shared libraries plus pages that have eventually been swapped out.

USED reports the sum of process rss (resident set size, the portion of memory occupied by a process that is held in RAM) and swap total count.

We will see later what we can check from MySQL client.

SWAP

So we see that this can eventually include the swapped pages too. Let’s check if mysqld is using the swap, and the first thing to do is to check is the machine has some information in swap already:

# free -m
             total       used       free     shared    buffers     cached
Mem:          5965       4433       1532        128        454       2359
-/+ buffers/cache:       1619       4346
Swap:         2045         30       2015

We can see that a little amount of swap is used (30MB), is it by MySQL ? Let’s verify:

# cat /proc/$(pidof mysqld)/status | grep Swap
VmSwap:	       0 kB

Great, mysqld si not swapping. In case you really want to know which processes have swapped, run the following command:

for i in $(ls -d /proc/[0-9]*) 
do  
   out=$(grep Swap $i/status 2>/dev/null)
   if [ "x$(echo $out | awk '{print $2}')" != "x0" ] && [ "x$(echo $out | awk '{print $2}')" != "x" ]
   then    
	  echo "$(ps -p $(echo $i | cut -d'/' -f3) \
         | tail -n 1 | awk '{print $4'}): $(echo $out | awk '{print $2 $3}')" 
   fi
done

Of course the pages in the swap could have been there for a long time already and never been used since…  to be sure, I recommend to use vmstat and verify the columns si and so (a trending system is highly recommended):

# vmstat  1 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  31252 1391672 505840 2523844    0    0     2    57    5    2  3  1 96  0  0	
 1  0  31252 1392664 505840 2523844    0    0     0   328  358  437  6  1 92  1  0	
 0  0  31252 1390820 505856 2523932    0    0     0  2024 1312 2818 28  3 67  2  0	
 0  0  31252 1391440 505860 2523980    0    0     0   596  484  931  1  1 98  1  0	
 0  0  31252 1391440 505860 2523980    0    0     0  1964  500  970  0  1 96  3  0	
 0  0  31252 1391440 505860 2523980    0    0     0    72  255  392  0  0 98  2  0	
 0  0  31252 1391440 505860 2523980    0    0     0     0  222  376  0  0 99  0  0	
 0  0  31252 1391440 505908 2524096    0    0     0  3592 1468 2095 34  6 52  9  0	
 0  0  31252 1391688 505928 2524092    0    0     0  1356  709 1179 12  1 85  2  0	
 0  0  31252 1390696 505928 2524092    0    0     0   152  350  950  4  6 90  1  0
On this server, we can see that mysqld is not using the swap, but if it was the case and some free RAM was still available, what could have been done ?

If this was the case, you must check 2 direct causes:

  1. swappiness
  2. numa

Swappiness

The swappiness parameter controls the tendency of the kernel to move processes out of physical memory and put them onto the swap disk partition. As I explained earlier,  disks are much slower than RAM, therefore this leads to slower response times for system and applications if processes are too aggressively moved out of memory. A high swappiness value means that the kernel will be more apt to unmap mapped pages. A low swappiness value means the opposite, the kernel will be less apt to unmap mapped pages. This means that the higher is the swappiness value, the more the system will swap !

The default value (60) is too high for a dedicated MySQL Server and should be reduced. Pay attention that with older Linux kernels (prior 2.6.32), 0 meant that the kernel should avoid swapping processes out of physical memory for as long as possible. Now the same value totally avoid swap to be used. I recommend to set it to 1 or 5.

# sysctl -w vm.swappinness=1

Numa

For servers having multiple NUMA cores, the recommendation is to set the NUMA mode to interleaved which balances memory allocation to all nodes. MySQL 8.0 supports NUMA for InnoDB. You just need to enable it in your configuration: innodb_numa_interleave = 1

To check if you have multiple NUMA nodes, you can use numactl -H

These are two different output:

# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 64379 MB
node 0 free: 2637 MB
node distances:
node 0 
0: 10
# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 2 4 6
node 0 size: 8182 MB
node 0 free: 221 MB
node 1 cpus: 8 10 12 14
node 1 size: 8192 MB
node 1 free: 49 MB
node 2 cpus: 9 11 13 15
node 2 size: 8192 MB
node 2 free: 4234 MB
node 3 cpus: 1 3 5 7
node 3 size: 8192 MB
node 3 free: 5454 MB
node distances:
node 0 1 2 3 
0: 10 16 16 16 
1: 16 10 16 16 
2: 16 16 10 16 
3: 16 16 16 10

We can see that when there are multiple NUMA nodes (right column), by default the memory is not spread equally between all those nodes. This can lead to more swapping. Check these two nice articles from Jeremy Cole explaining this behavior:

Filesystem Cache

Another point we can check from the OS is the filesystem cache.

By default, Linux will use the filesystem cache for all I/O accesses (this is one of the reason why using MyISAM is not recommended, as this storage engine relies on the FS cache and can lead in loosing data as Linux sync those writes up to every 10sec). Of course as you are using InnoDB, with O_DIRECT as innodb_flush_method, MySQL will bypass the filesystem cache (InnoDB has already enough optimized caches anyway and one extra is not necessary). InnoDB will then not use any FS Cache Memory for the data files (*.ibd).

But there are of course other files used in MySQL that will still use the FS Cache. Let’s check this example:

# ~fred/dbsake fincore binlog.000017
binlog.000017: total_pages=120841 cached=50556 percent=41.84

# ls -lh binlog.000017 
-rw-r----- 1 mysql mysql 473M Sep 18 07:17 binlog.000017

# free -m
             total       used       free     shared    buffers     cached
Mem:          5965       4608       1356        128        435       2456
-/+ buffers/cache:       1716       4249
Swap:         2045         30       2015

# ~fred/dbsake uncache binlog.000017 
Uncached binlog.000017
# free -m
             total       used       free     shared    buffers     cached
Mem:          5965       4413       1552        128        435       2259
-/+ buffers/cache:       1718       4247
Swap:         2045         30       2015

Some explanations. I started checking how much of one binary log was present in the filesystem cache (using dbsake fincore), and we could see that 42% of 473M were using the RAM as FS cache. Then I forced an unmap of those pages in the cache (using fincore uncache) and finally, you could see that we freed +/- 195MB of RAM.

You could be surprised to see which logs or datafiles are using the FS cache (making a file copy for example). I really encourage you to verify this 😉

The next article will be about what can be seen from MySQL’s side and what are the best configuration practices.

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

5 Comments

Leave a Reply to Shobhit RathoreCancel Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

As MySQL Community Manager, I am an employee of Oracle and the views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

You can find articles I wrote on Oracle’s blog.