Say Hello to Libcoredumper – A New Way to Generate Core Dumps, and Other Improvements

In a perfect world, we expect all software to run flawlessly and never suffer from bugs or crashes. We also know that this perfect world doesn’t exist, so we had better be as prepared as possible to troubleshoot those situations. Historically, generating core dumps has been a task delegated to the kernel. If you are curious about how to enable it via the Linux kernel, you can check out Getting MySQL Core file on Linux. That approach has a few drawbacks that either limit it or make it a huge strain to get working, such as:

  • System-wide configuration is required. This is not something a DBA always has access to.
  • It is impossible or very difficult to enable it for a specific binary only. The standard methods enable it for every program running on the box.
  • Nowadays, with cloud and containers, this task has become even more difficult because it sometimes requires containers to run in privileged mode and the host OS to be properly configured by the provider.

The above issues have driven the exploration of alternative ways to create a core dump to help troubleshoot bugs and crashes. More details can be found in PS-7114.

The Libcoredumper

Libcoredumper is a fork of the archived project google-coredumper. Percona has forked it under Percona-Lab Coredumper, cherry-picked a few improvements from other forks of the same project, and enhanced it to compile and work on newer versions of Linux as well as on newer versions of GCC and Clang.

This project is a Tech Preview, as you may infer from the repository name (Percona Lab). We might not maintain compatibility with future kernel versions and/or patches. You should test the core dumper in your own environment before putting this tool into production. We have tested on kernel versions up to 5.4.

This functionality is present in all versions of Percona Server for MySQL and Percona XtraDB Cluster starting from 5.7.31 and 8.0.21. If you compile PS/PXC from source, you can control whether the library is compiled in by setting -DWITH_COREDUMPER to ON or OFF (the default is ON).

How To Configure It

A new variable named coredumper has been introduced. Include it under the [mysqld] section of my.cnf; it works independently of the older core-file option. The variable can be given either with no value, in which case it acts as a boolean, or with a value. It follows a few rules:

  • No value – the core dump will be saved under the MySQL datadir and will be named core.
  • A path ending with / – the core dump will be saved under the specified directory and will be named core.
  • A full path with filename – the core dump will be saved under the specified directory and will use the specified name.
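
To make the three rules concrete, here is a minimal Python sketch of how a coredumper value maps to a file location. It is only an illustration of the rules as listed, not the server's actual implementation; the function name resolve_core_path and the example paths are hypothetical, and the timestamp that the server appends to the file name is not modelled.

```python
import posixpath
from typing import Optional


def resolve_core_path(value: Optional[str], datadir: str) -> str:
    """Return the base path of the core file for a given coredumper value.

    Illustrative only: mirrors the three rules listed above. The crash
    timestamp that the server appends to the name is left out.
    """
    if not value:
        # No value: the file is named "core" inside the datadir.
        return posixpath.join(datadir, "core")
    if value.endswith("/"):
        # Directory (trailing slash): the file is named "core" in that directory.
        return posixpath.join(value, "core")
    # Full path with file name: used as given.
    return value


if __name__ == "__main__":
    # Hypothetical datadir and target paths, for demonstration only.
    for setting in (None, "/tmp/dumps/", "/tmp/dumps/mysqld_core"):
        print(repr(setting), "->", resolve_core_path(setting, "/var/lib/mysql"))
```

The trailing slash is what distinguishes a directory from a file name here, so /tmp/dumps and /tmp/dumps/ lead to different results under these rules.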

Every core file name will end with the timestamp of the crash instead of the PID, for two main reasons:

  • It makes it easier to correlate a core dump with a crash, as MySQL always prints a Zulu/UTC timestamp in the log when it crashes:
    10:02:09 UTC - mysqld got signal 11 ;
  • Operators/containers will always run MySQL (or whatever application they run) as PID 1. If MySQL has crashed multiple times, we don’t want the core dump to be overwritten by the last crash.
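
When matching core files to crashes, it can help to pull the crash times out of the error log first. The sketch below assumes the log line format shown in the example above (time, "UTC", signal number) and nothing more; the function name find_crashes is hypothetical, and the exact timestamp format used in the core file name is not assumed here.

```python
import re
from typing import List, Tuple

# Matches lines such as "10:02:09 UTC - mysqld got signal 11 ;"
CRASH_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}) UTC - mysqld got signal (\d+)")


def find_crashes(log_text: str) -> List[Tuple[str, int]]:
    """Return (UTC time, signal number) for every crash line in the log text."""
    crashes = []
    for line in log_text.splitlines():
        match = CRASH_LINE.match(line.strip())
        if match:
            crashes.append((match.group(1), int(match.group(2))))
    return crashes


if __name__ == "__main__":
    sample = "10:02:09 UTC - mysqld got signal 11 ;\nsome other log line\n"
    print(find_crashes(sample))
```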

How To Know If I Am Using libcoredumper

When MySQL attempts to write a core file, it stamps the log with a message saying so. When it delegates the action to the Linux kernel, you always see a message like the one below:

. . .
Writing a core file

The above behavior remains the same. However, when MySQL is using libcoredumper to generate the core file, you should see a message stating that the library is responsible for the action:

. . .
Writing a core file using lib coredumper
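
The two messages can also be told apart programmatically. The following is a minimal sketch that relies only on the two log strings quoted above; the function name core_dump_writer and its return labels are hypothetical. Because the kernel message is a prefix of the libcoredumper message, the longer one has to be checked first.

```python
from typing import Optional

KERNEL_MESSAGE = "Writing a core file"
LIBCOREDUMPER_MESSAGE = "Writing a core file using lib coredumper"


def core_dump_writer(log_text: str) -> Optional[str]:
    """Return "libcoredumper", "kernel", or None, based on the last matching line."""
    writer = None
    for line in log_text.splitlines():
        stripped = line.strip()
        # Check the longer message first: the kernel message is its prefix.
        if stripped.startswith(LIBCOREDUMPER_MESSAGE):
            writer = "libcoredumper"
        elif stripped.startswith(KERNEL_MESSAGE):
            writer = "kernel"
    return writer


if __name__ == "__main__":
    print(core_dump_writer("Writing a core file using lib coredumper"))
    print(core_dump_writer("Writing a core file"))
```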

Other Improvements

Apart from libcoredumper, starting from the same 5.7 and 8.0 releases a stack trace will also:

  • Print binary BuildID – This information is very useful for support/development people in case the MySQL binary that crashed is a stripped binary. Stripping is a technique that removes the parts of a binary that are not essential for it to run, so the binary occupies less space on disk and in memory. When computers had tight memory restrictions, this technique was widely used. Nowadays memory is no longer a limitation on most hardware; however, stripping is becoming popular once again with containers, where image size matters. Stripping the binary removes the symbol table, which is required to resolve a stack trace and to read the core dump. The BuildID is how we can link things together again.
  • Print the server version – This information is also useful to have at a glance. Recent versions of MySQL/Percona Server for MySQL have fixes for many known issues, so having this information helps to establish the starting point of the investigation. MySQL only prints the server version when it starts, and by the time a server crashes, its log may have grown significantly or even been rotated/truncated.

Here is an example of what a crash with a stack trace looks like:

14:23:52 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.

Build ID: 55b4b87f230554777d28c6715557ee9538d80115
Server Version: 8.0.21-12-debug Source distribution

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x46000
/usr/local/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x55) [0x55943894c280]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x2e0) [0x559437790768]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13f40) [0x7f9d413bcf40]
/lib/x86_64-linux-gnu/libc.so.6(__poll+0x49) [0x7f9d40858729]
/usr/local/mysql/bin/mysqld(Mysqld_socket_listener::listen_for_connection_event()+0x64) [0x55943777db6a]
/usr/local/mysql/bin/mysqld(Connection_acceptor<Mysqld_socket_listener>::connection_event_loop()+0x30) [0x55943737266e]
/usr/local/mysql/bin/mysqld(mysqld_main(int, char**)+0x30c6) [0x559437365de1]
/usr/local/mysql/bin/mysqld(main+0x20) [0x559437114005]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f9d4076db6b]
/usr/local/mysql/bin/mysqld(_start+0x2a) [0x559437113f2a]
Please help us make Percona Server better by reporting any
bugs at https://bugs.percona.com/

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file using lib coredumper
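
For anyone collecting crash reports from many servers, the two new fields are easy to extract. This is a minimal sketch that assumes the "Build ID:" and "Server Version:" line formats shown in the example above; the function name crash_report_fields is hypothetical.

```python
import re
from typing import Dict

# Matches "Build ID: <hex>" and "Server Version: <text>" lines as in the example.
FIELD = re.compile(r"^(Build ID|Server Version):\s*(.+)$")


def crash_report_fields(log_text: str) -> Dict[str, str]:
    """Collect the Build ID and Server Version lines from a crash report."""
    fields = {}
    for line in log_text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


if __name__ == "__main__":
    sample = (
        "14:23:52 UTC - mysqld got signal 11 ;\n"
        "Build ID: 55b4b87f230554777d28c6715557ee9538d80115\n"
        "Server Version: 8.0.21-12-debug Source distribution\n"
    )
    print(crash_report_fields(sample))
```

The extracted Build ID can then be compared with the one reported for a candidate binary by a tool such as readelf -n, to find the matching unstripped build.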

TL;DR

Libcoredumper serves as an alternative to the current --core-file functionality for generating memory dumps. If MySQL crashes, a core dump is written and can later be processed/read via GDB to understand the circumstances of the crash.
Users can enable it by adding the variable below to the [mysqld] section of my.cnf:

[mysqld]
coredumper

Percona Server for MySQL versions starting from 5.7.31 and 8.0.21 include the library by default. Refer to the documentation pages below for more details:

https://www.percona.com/doc/percona-server/5.7/diagnostics/libcoredumper.html

https://www.percona.com/doc/percona-server/5.7/diagnostics/stacktrace.html

Summary

If you have faced any issue or limitation when enabling core dumps before, feel free to test the new versions of Percona Server for MySQL/Percona XtraDB Cluster and use libcoredumper. Also, any feedback on how we can improve the troubleshooting of bugs/crashes even further is very welcome.


by Marcelo Altmann via Percona Database Performance Blog
