MariaDB monitor
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
MariaDB is currently the fastest growing open source database solution. It is mainly developed by the MariaDB corporation and is a fork of MySQL. The DBMS MariaDB with its various free features offers much, which MySQL does not provide or only at a cost, e.g. a storage engine for processing huge amounts of data; a database proxy for secure and highly-manageble administration of scalable installations and more. Unlike MySQL, MariaDB does not have its own client like the Workbench.
A good free alternative is HeidiSQL, but it does not have a dashboard, which, for example, visualizes the server functionality and thus facilitates optimization decisions.
Content of this article
- The background of the MariaDB monitor
- The functions of the MariaDB monitor
- The installation of the MariaDB monitor
The idea for developing a MariaDB monitoring app
There are some MariaDB tools available for reporting and monitoring, for example:
- MONyog (fee required)
- Percona Toolkit (free)
- AppDynamics (fee required, SAS solution)
The disadvantage of the solutions is that they offer only limited flexibility, are partly liable to pay, or require that internal company data have to be given to the outside. This led to the idea of developing an own solution for monitoring and optimizing our internal database infrastructure.
For the implementation of the solution the open source programming environment R was chosen in conjunction with the Shiny package. A first functional version was published on Github and is presented here. There are numerous improvements and additional features planned. However, given the low priority of the project, the project is running parallel to the day-to-day business and with relatively few resources. We are pleased about every participation in the further development.
The challenge
Our database servers are used exclusively for classic data warehouse scenarios. Problems arising from data storage for web services do not apply to us. Above all, problems arise from processing and retrieving large amounts of data, e.g. resulting from BI and OLAP applications. The MariaDB monitor is designed to meet these objectives, but it is easy to customize with help of the programming language R.
The functions of the MariaDB monitor
The MariaDB Monitor is divided into three areas: a large area in the middle of the app visualizes the requested information. The left sidebar is used to select different topics. The third area is located in the upper right corner. This is a notification window, which informs about abnormalities.
### CUSTOM CONTENT ELEMENT ###
Text / Bild [erweitert]
- Überschrift
- Text
Sidebar – Overview of the content
The MariaDB monitor’s sidebar provides access to a variety of information about the status and functionality of the MariaDB server. In the top part of the sidebar is a selector for selecting a database server. By default, the database server specified in the configuration (config.file) is selected. If this is in a master-slave environment, all connected slaves can be selected via the selector. A prerequisite for changing the database server is that all are available via port 3306. In addition, the database access (account) must be the same on all database servers.
Below the selector is a block with general status information about the selected database server. These include, among others, the server run time, the number of current processes, and the current buffer size. All key figures were highlighted by indentation to the right and green icons. They link to tabs with detailed information. A directory of all pages of the MariaDB monitor is located below the key figure block. An interaction with the links provides access to more detailed server information.
This includes:
### CUSTOM CONTENT ELEMENT ###
Text / Bild [erweitert]
- Überschrift
- Text
- MariaDB Status: General information about the operation of the database server.
- Statement Analysis: Detailed information about the running database queries.
- Index Statistics: Information on the use and quality of the database indexes.
- User Statistics: General information about the connections of the database users.
- Events: Information about the configured server events.
- InnoDB Status Output: Overview of the operational information of the InnoDB storage engine.
- Allocated Mem: Overview of the allocated memory.
- Server-Variables: Overview of all server variables.
- MaxScale: Information about how MaxScale works (if any).
The last option “Configuration” allows the configuration of the MariaDB monitor, e.g. the limits for the coloring of values can be adjusted. The contents of the individual pages are briefly described below.
MariaDB Status – the start page of the app
The MariaDB status is at the same time the home page of the MariaDB monitor and contains general information on the operation of the database server. The upper graphic shows the total number of client connections (or threads), the number of active connections, and the amount of memory used. Via the selector of the surrounding tab, the user is given the option to request diagrams for log processes and buffer pool processes.
Under the diagram, there are four sections with circle diagrams or bar charts for important key performance indicators (KPIs):
- Buffer Pool: This diagram shows the level of the buffer. Ideally, it should be close to 100%. If the TokuDB engine is used in addition to the InnoDB storage engine, the graphics expands automatically.
- InnoDB Buffer Pool Hitrate: The percentage of queries answered by the buffer.
- Table Cache Hitrate: Share of open tables on all tables opened since the last reboot.
- Temp Tables on Disk: Percentage of temporary tables created in and out of memory.
Below the four sections is a summary of the most important server variables and the associated status. They can be accessed via the “More” button and are thematically grouped.
Statement Analysis – How the server answers queries
The “Statement Analysis” tab provides deeper information for server optimization. Database queries have a significant impact on server utilization and should be monitored from a variety of perspectives. The top part of the page contains general information and links to query optimization and general server optimization. Below is a table with five tabs:
This table shows the most common queries. The average runtime is highlighted in color (gray, orange, red). In general, it is important to avoid running times in the orange or red range among the most frequent queries. Optimization measures can be carried out, e.g. based on the SQL statement (column “query”) and / or the information on whether a table scan has been performed (column “full_scan”). The table and database structure must be checked if a table scan has been performed. Indexes may help to avoid the table scan. The SQL code must also be checked and, if necessary, optimized. Thus, e.g. running the query on the database with a prefixed “explain” shows the execution plan of the query engine.
The number of SQL statements issued and the limits for coloring the values can be adjusted in the “Configuration” area.
This table lists the slowest queries. Particular attention should be given to queries with frequent execution (column “exec_count”). The SQL code (column “query”) can give you an initial access to the causes of a too long execution time. This can be done for a more detailed investigation with a prefixed “explain” on the database. The query or table can be optimized with the information obtained with it. A further solution may be an aggregation table. The limits for coloring the values in the “avg_latency” column can also be adjusted in the “Configuration” area.
This table lists queries with at least one table scan. Table scans typically result in longer run times because each row in the table is considered for the query. If only a subset of the table is sufficient to answer a query, the query and / or the index structure of the table must be adjusted.
When the queries are displayed, attempts are made to consolidate identical patterns. This means that queries that differ only by filter elements are grouped using a wildcard (this also applies to all queries displayed in the MariaDB monitor). Thus:
where date < 01.01.2001
where date < 01.01.2002
grouped into
where data < ?
The query engine of MariaDB decides in a query whether the use of an index is meaningful. If the use of an index is more expensive than a table scan, MariaDB waives the use of the index. This assessment depends on the filtered elements so that the case may occur that the same SQL query (with different filtered elements) can be executed with or without the use of an index. The metric “no_index_used_pct” shows the proportion of the executions without an index. In this context, consideration should be given to whether filtering to another variable might lead to a more consistent use of the index.
The number of execution, the duration of the query, and the share of unused indexes help in prioritizing. For a better overview, the proportion of the queries with a table scan can be colored by limit values (gray, orange, red). The limit value can be set in the “Configuration” area.
This area provides information about which queries were created for temporary tables and whether RAM was used for this purpose. Typically, creating temporary tables in memory takes less time than on the hard disk. In addition, the operations on such memory tables are usually more performant. Therefore, temporary “disc tables” should to be avoided. The metric “tmp_tables_to_dsik_pct” shows the fraction of temporary hard disk tables – high values are unfavorable. Should the share of the temporary hard disk tables be high, Customizing the server variables [tmp_table_size] and [max_heap_table_size].
The last table lists queries with errors or warnings. The number of executions and the proportion of errors and warnings help in prioritization. Note that a query can produce multiple errors and warnings (for example, an insert of several hundred rows: an incorrect data format can lead to an error message per line). A proportion greater than one hundred percent is therefore possible.
Index Statistics
Indexes can speed up queries. However, the opposite is also possible. The tables “Unused Indexes” and “Index Cardinality” are intended to help keep an overview.
In this table, all indexes are separated according to schema and table, which have not been used since the last server start. Indexes that have not been used for a long time are good candidates for deletion.
Indexes can significantly speed up a query, e.g. by only having to access the rows of interest of a table. Writing processes to this table, on the other hand, are slower because the index must be updated in addition to the new rows. It is therefore important to consider whether an index is necessary and meaningful.
The cardinality, simply spoken, describes the ratio of different values to the total number of lines. Thus the gender attribute would have a low cardinality in a table with 1 million persons, since only two different types (male, female) exist. When filtering for gender, it is not to be assumed that the filtered line count is small (presumably 0.5 million). In the case of indices, a high cardinality is usually better, since a few lines are addressed via a filter. In the example mentioned, it may be – depending on the query – that the query engine decides against the use of the index and starts a complete table scan despite a filtering on “woman” and an index on gender.
The Index Cardinality table lists low-cardinal indexes. For the indices shown, consider removing them. However, further analysis of their use is recommended.
User Statistics
User statistics provide clues as to the impact of individual users on the overall performance of the server.
Events
The server event table provides an overview of defined events. This includes the database on which they were created and the responsible user. The status in conjunction with a respective date stamp for the creation, modification and the next execution help to judge the correct functioning.
InnoDB Status Output
This page shows the report that is automatically made available via the InnoDB storage engine. It contains various information, e.g. to transactions, I / O, buffer, and much more.
Allocated Mem
See the “Allocated Mem” page for detailed information on how to use the memory in MariaDB. As a rule, the performance of the database server increases with a larger RAM, since the majority of the data can then be kept “in memory”. Depending on the RAM size, some variables of the MariaDB server can be optimized. The largest RAM part is usually scheduled for the buffer pool. The larger it is, the higher the likelihood that a query will not trigger a hard disk operation, i.e. the query can be answered using RAM only. This usually leads to faster response times.
Another large proportion of RAM is reserved for the storage of temporary tables, especially for BI applications. These allow e.g. sorting or grouping operations. A significant difference between buffer pool and temporary table storage is that the buffer pool is used together for all threads, but for temporary tables, each thread is separate. This means that the total memory of temporary tables can increase with the number of simultaneous connections. In the worst case, the “maximum number of temporary tables” times the “maximum number of simultaneous connections”. In addition to the two “memory eaters” [innodb_buffer_pool_size] and [tmp_table_size] (or [max_heap_table_size]), there are other variables that are thread-specific or thread-spanning. They are also displayed in this tab. They usually require significantly less memory than the buffer pool and the temporary tables.
If more memory is scheduled using the MariaDB variable than the server actually provides, the database server may crash.
Server Variables
This tab shows all server variables and status variables including their values. For a selective display, the search mask can be used at the upper right of the table.
MaxScale
On this page you will find all the information that the MaxScale plugin “MaxInfo” makes available. Since this can be queried via a MySQL client and SQL, it was also integrated.
Important information are among others the attached backend servers with their associated status, and various status variables related to past events (e.g., hangup events and error_events). In order for the MariaDB monitor to query the data of the MaxInfo plug-in, the stored database account must also have been enabled for MaxInfo. This can be implemented in MaxScale via the “maxscale.cnf”. An example follows below. In addition, the MaxInfo plugin must be accessible via port 9003.
Configuration
The Configuration tab allows you to customize the appearance and functionality of the app. These include e.g. the limits for coloring values, the update frequency of the schedules, the number of rows displayed per table, or the RAM size. The latter has to be stored, since it can not currently be queried via the database server.
Notifications
There are three notification icons in the upper right corner of the app. The former displays various abnormalities, which when red colored may require an intervention. This includes the number of unused indices since the last server startup. They are to be subjected to a detailed examination and, if necessary, to be deleted. It also displays the top 50 queries with a runtime greater than one second. Depending on the usage of the database server, it may be useful to adjust the limit value. The background of the key figure is that frequently executed queries should be answered as quickly as possible.
In this context, a further key figure is printed. It shows the number of the slowest query that was executed more than three times. Again, it may make sense to adjust the limits.
The four key figures below summarize the allocated memory. The reason is to check whether the allocated memory exceeds RAM.
Installation and Configuration
The MariaDB monitor app has been released as R package and is available on Github. The installation in R is done by:
library(devtools) devtools::install_github("INWT/MariaDB_Monitor")
To run the app, the cnf.file configuration file should be customized. The app’s database user must have sufficient privileges to retrieve performance data (e.g. performance_schema). The following create script is a possibility, but should be adapted under the aspect of security:
CREATE USER 'MariaDBstat'@'%' IDENTIFIED BY 'abc'; GRANT SELECT, PROCESS ON *.* TO 'MariaDBstat'@'%'; GRANT SELECT ON `information\_schema`.* TO 'MariaDBstat'@'%'; GRANT SELECT ON `mysql`.* TO 'MariaDBstat'@'%'; GRANT SELECT ON `performance\_schema`.* TO 'MariaDBstat'@'%'; FLUSH PRIVILEGES;
The username, password, database host and database port must be entered into the **cnf.file** file, located in [~/.INWTdbMonitor/cnf.file`]. Specifying a database is optional. Instead of manually entering the data a configuration wizard can be used by runnung the function:
promptCnfData()
The configuration file has the following structure:
[client] user= password= database= host= port=
In addition, it is necessary that the following entry is set under [mysqld] in the my.cnf of the MariaDB server. It ensures that the server’s performance data is collected.
performance_schema = on
The MariaDB monitor can be started with correct configuration on the desktop PC with RStudio or on a Shiny server. In RStudio the following code is used to start the MariaDB monitor:
library("INWTdbMonitor") startApplication()
The MariaDB monitor automatically detects whether the configured database server is connected to MaxScale. In order to display data of the MaxInfo plugin, it is necessary that the configured database account was also created for MaxInfo.
Here an example configuration in the maxscale.cnf:
[MaxInfo] type=service router=maxinfo user=MariaDBstat passwd=abc [MaxInfo Listener] type=listener service= MaxInfo protocol=MySQLClient address=xxx.xxx.xxx.xx port=9003
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.