Installation and Usage of MySQL Extended Statistics by John Murtari Introduction MySQL has almost become the standard in Open Source databases. It's easy to understand why when you consider its simple install, high performance, and reliable operation. The MySQL Extended Statistics subsystem should improve that reputation even more. DBAs will now be able to monitor and account for server usage down to the individual database, user, or even connecting client machine. How would you like to be able to type: SHOW STATISTICS select, update, insert FROM user WHERE select > 100 HISTORY LIMIT 24 and see a nice display of all user activity in the last 24 hours displayed in hourly increments and showing the number of queries of each type issued. Building a new mysqld Obviously, the information stored in a database can well be considered the 'company jewels' and you don't want it broken by some untested piece of software! At times certain MySQL upgrades do make changes that can introduce incompatibility with prior releases -- even though their release notes well document the changes. The MySQL Extended Statistics subsystem is just a change to the MySQL server daemon (mysqld) itself. It was designed to be easy to install and also easy to remove. The entire installation process assumes you already have a source distribution of MySQL installed and operating normally. Please, let me repeat that again, make sure you start from a MySQL source distribution you have built and installed yourself and is operating normally! If you do not want to use the source, the download web site has compiled binaries for various OS/hardware platforms available for evaluation and licensing. Download the changed and new files for Extended Statistics from http://www.softwareworkshop.com/mysql for your version of the server and put the .gz file in the top of your MySQL distribution. If you give the command tar -ztf mysql_ext_stat-4.1.20.tar.gz you will see something like: README.ExtendedStatistics mysys/my_pthread.h sql/lex.h sql/mysql_priv.h sql/sql_class.h sql/log_event.cc sql/mysqld.cc sql/sql_parse.cc sql/sql_yacc.yy sql/statistics.cc sql/statistics.h As always, please view the README before doing anything! It contains a description of the most current procedures. Also, this article only describes a Linux installation. If you are installing under Windows the README will contain appropriate instructions. The simplest way for you to preserve your existing source is to make backup copies of each of the affected directories: cp -a sql sql.sav ; cp -a mysys mysys.sav Then unpack the files: tar -zxvf mysql_ext_stat-4.1.20-es-1.1.0.tar.gz You can then type 'make' to generate a new mysqld. DO NOT do a 'make install'. Installing the Extended Statistics mysqld The whole point of this procedure is again safety. Instead of doing a 'make install' which normally copies a lot of files to different destinations -- you will install just the new mysqld. We recommend the following steps (below we assume the base directory for MySQL is /usr/local/mysql): 1. If possible, try your first install of the new mysqld on a non-production server. 2. Backup all your databases. 3. Make a backup copy of your current mysqld: cd /usr/local/mysql/libexec cp -a mysqld mysqld.sav 4. Extended Statistics prints status information to the MySQL error file. In another window you can monitor that file by typing: tail -f /usr/local/mysql/data/your-hostname.err & 5. At this point we are ready to install and activate the new server. We will use the basic commands available with MySQL to stop and start the server. If your installation uses other wrapper programs -- you should probably use those. Stop your MySQL server: /usr/local/mysql/bin/mysqladmin -uroot -p"$password" shutdown 060612 07:58:08 [Note] /usr/local/mysql/bin/mysqld: Shutdown complete 060612 07:58:08 mysqld ended 6. Copy the new mysqld into place: cp -a /path/to/your/installation/sql/mysqld mysqld 7. Start the new server: /usr/local/mysql/bin/mysqld_safe --user=$mysql_user $other_args & 060612 7:58:38 [Note] Extended statistics build (mysql-4.1.20-es-1.1.0) 060612 7:58:38 [Note] Extended statistics disabled by user, no statistics_class_list defined 8. Take a breather! The preceding two lines confirm that your server was built with the Extended Statistics subsystem and that it is currently disabled since no statistics were requested. We recommend you run for a while with this configuration and convince yourself things are operating normally. If an error occurs during start this indicates a problem with the build. To get back to a known good configuration all you have to do is restore your version of the mysqld and restart the server. 9. You are now ready to start recording statistical data. Depending on local policies and your preference perform either of the following: Change /etc/my.cnf and add the following line (the next section will describe the syntax): statistics_class_list="user,max-100,time-120,units-m,(Com_insert,Com_select, Com_update,Com_delete,Com_replace,Slow_queries,Bytes_sent,Bytes_received, Sort_rows)" NOTE - enter it exactly as above, no extra white space, and it should be ONE physical line in the file. If you change the /etc/my.cnf be sure to remove this line if you revert to a standard mysqld. It will not understand this additional parameter and will generate an error. OR You can restart the server and add it as a command line option (again as ONE physical line): mysqld_safe --user=mysql --statistics_class_list="user,max-100, time-120,units-m,(Com_insert,Com_select,Com_update,Com_delete, Com_replace,Slow_queries,Bytes_sent,Bytes_received, Sort_rows)" & 10. When the server restarts you should now see: 060612 8:26:12 mysqld started 060612 8:26:12 [Note] Extended statistics build (4.1.20-es-1.1.0) 060612 8:26:12 [Note] Extended statistics tracking classes: user,max-100,time-120,units-m,(Com_insert,Com_select, Com_update,Com_delete,Com_replace,Slow_queries,Bytes_sent, Bytes_received, Sort_rows) 060612 8:26:12 [Note] Extended statistics ACTIVE tracking memory: 435600 bytes for 1 classes Any error during server initialization or operation will be noted to the error log and the subsystem will disable itself. I hope the configuration syntax was self explanatory! I will now give you a more complete definition of how Extended Statistics is configured and used. Terminology [SIDE BAR] In the world of Extended Statistics the different types of things that can interact with the server are called Classes. In its present version these are: db - a MySQL database. user - a MySQL user. host - the host machine of a connection. server - the server itself. Within a Class, specific instantiations are called Instances (does all this sound a bit familiar? It should. One goal of the project was not to invent new lingo or syntax). So ... an instance of a user could be domenic or melanie. The server Class is unique, the only instance is the server itself. For each instance we are allowed to track individual items called Vars. These include almost all of the variables displayed in the present SHOW STATUS command or available on an admin tool such as phpMyAdmin as "MySQL runtime information." You do need to use the name of the Var as displayed in the SHOW STATUS command, e.g. Com_select tracks select statements. We have made a shortcut for the more common statement allowing you to omit the Com_ prefix. Configuration Options Extended Statistics data is not recorded unless you have defined a 'statistics_class_list' in your /etc/my.cnf file OR have defined it as a command line option to the server, e.g. mysqld-safe -user=mysql --statistics_class_list="user,max-50,time-60,units-m,(Bytes_received)" As is standard for MySQL options, if you have something already defined in the /etc/my.cnf, a value specified on the command line will take precedence. Here is a more complete example: statistics_class_list=" user,max-50,time-60,units-m,(Queries,Com_select,Com_update,Bytes_sent), db,max-50,time-10,units-h,(Bytes_sent,Com_select,Com_update), host,max-5,time-3,units-d,(Com_select,Com_update,Bytes_sent)" NOTE: no white space is allowed in the string and it must be on one physical line in the config file. This is a standard for all MySQL options. In the above we want the server to track information on three different Classes: user, db, and host. We will also keep cumulative statistics for the server as a whole. Similar to SHOW STATUS, but with historical data. For each Class, the 'max-' precedes the number of Instances of that Class to be supported. The 'time-' precedes the number of historical time units to store and 'units-' designates the period of interest: 'm' minutes, 'h' hours, and 'd' days. After limits for each Class, a comma separated list of Vars is included. The names are identical to what is presented in SHOW STATUS. We can now read this as track up to a maximum of 50 users (user,max-50) and record historical activity for the last 60 minutes(time-60,units-m). The Vars we are interested in are: queries, select, update, bytes sent, and bytes received. Security Statistical information is something you may not want to share with every user of the system. The config file also supports another option: statistics_users="john,domenic,ron,kathy,melanie" This allows a DBA to give selected access to named MySQL users of the system. The root user is always allowed access. Using your new toy! -- the SHOW STATISTICS Syntax A goal of the implementation was to keep the syntax used to query the statistics as close to standard SQL as possible. The syntax described below should look very familiar, especially if you replace SHOW STATISTICS with SELECT. The significant differences are the use of LIKE to match specific Instances and the new keyword HISTORY to produce historical output. The LIMIT option operates differently depending on whether HISTORY has been specified. Without HISTORY it operates as in normal SQL, limiting the number of result rows displayed. If HISTORY is chosen it functions as a time limit, e.g. show me just the last 3 hours for each user. SHOW STATISTICS ( * | Var list) FROM Class [WHERE var ( '<' | '>' | '=' ) num] [LIKE 'Instance pattern'] [ORDER BY Var] [HISTORY] [LIMIT num] Because this is a new SQL addition, if you type just SHOW STATISTICS, you will get some usage and Extended Statistics subsystem information -- not just an error message. In normal SQL you could have expected: SHOW STATISTICS * FROM user WHERE user like '%joe%' and Bytes_sent > 500 Yes, we had to cheat by making LIKE a separate clause in the syntax that matched just Class instances. That is our development goal, but at present syntax is limited in the WHERE clause and ORDER BY is also being worked on. Usage Examples [SIDE BAR] The examples below show a cross section of commands and output. When using the "*" column headings are in the same order as specified in the statistics_class_list. # Limiting output to just two rows. mysql> SHOW STATISTICS * FROM user LIMIT 2; +-------------+--------+--------+------------+ | user | select | update | Bytes_sent | +-------------+-----------------+------------+ | kelly | 8 | 0 | 27350 | | brian | 1 | 0 | 28303 | +-------------+--------+--------+------------+ 2 rows in set (0.00 sec) # Using HISTORY to see historical data. 0 represents the current # minute, -1, -2, etc. are the prior minutes. mysql> SHOW STATISTICS * FROM user HISTORY LIMIT 5; +-------------+---------+--------+--------+------------+ | user | minutes | select | update | Bytes_sent | +-------------+---------+--------+--------+------------+ | kelly | 0 | 0 | 0 | 0 | | kelly | -1 | 0 | 0 | 8506 | | kelly | -2 | 5 | 0 | 9151 | | kelly | -3 | 3 | 0 | 4758 | | kelly | -4 | 0 | 0 | 4057 | | brian | 0 | 0 | 0 | 0 | | brian | -1 | 0 | 0 | 0 | | brian | -2 | 0 | 0 | 0 | | brian | -3 | 0 | 0 | 0 | | brian | -4 | 1 | 0 | 28303 | +-------------+---------+--------+--------+------------+ 10 rows in set (0.00 sec) # Display of specific columns. Use of WHERE clause to remove # uninteresting data. mysql> SHOW STATISTICS Queries, Bytes_sent FROM user WHERE Bytes_sent > 0 HISTORY LIMIT 5; +-------------+---------+----------+------------+ | user | minutes | Queries | Bytes_sent | +-------------+---------+----------+------------+ | kelly | 0 | 3 | 8853 | | kelly | -1 | 2 | 8506 | | kelly | -2 | 7 | 9151 | | kelly | -3 | 6 | 4758 | | kelly | -4 | 1 | 4057 | | brian | -4 | 32 | 28303 | +-------------+---------+----------+------------+ 6 rows in set (0.00 sec) # Note change in time units. DB data is being binned # at hourly intervals. mysql> SHOW STATISTICS * FROM db HISTORY limit 2; +-------------+-------+------------+--------+--------+ | db | hours | Bytes_sent | select | update | +-------------+-------+------------+--------+--------+ | accounting | 0 | 48774 | 8 | 0 | | accounting | -1 | 0 | 0 | 0 | | webstat | 0 | 28303 | 1 | 0 | | webstat | -1 | 0 | 0 | 0 | +-------------+-------+------------+--------+--------+ 6 rows in set (0.00 sec) # Using LIKE to limit hosts displayed. The ORDER BY clause # should be implemented by the time you read this. mysql> SHOW STATISTICS * FROM host LIKE '%kids%' ORDER BY Bytes_sent; +--------------------+--------+--------+------------+ | host | select | update | Bytes_sent | +--------------------+--------+--------+------------+ | www.akidsright.org | 8 | 0 | 53113 | | s1.kids-right.org | 2 | 0 | 495 | +--------------------+--------+--------+------------+ 2 rows in set (0.00 sec) Monitoring and controlling Extended Statistics with SHOW and SET Other commands allow you to know what is going on within the subsystem and even exercise control over its operation. You can see all the available items with: SHOW VARIABLES LIKE 'statistics%'; statistics_active: 1 - active, 0 - disabled Allows you to suspend statistical tracking and reactivate it later without a server restart. When disabled any existing data is purged and memory freed. statistics_debug: Bit mask value. The source contains various test points that will output data to the log. To be used with caution! statistics_users: comma separated list The current list of authorized Extended Statistics users. Each of the values above can be modified by the root user using the SET command, e.g. SET GLOBAL statistics_users="brian,kelly" Implementation Notes Many of you may be wondering how this was done. A detailed description is beyond the scope of this article, but here are a few notes that may help as you examine the source. The existing MySQL, in both the 4.x and 5.x series, uses a simple invocation of a #define called statistic_increment or statistic_add when data is to be stored and displayed as part of SHOW STATUS. The Extended Statistics addition added a function call to the existing #define. No change was made to how SHOW STATUS data was recorded or displayed, but additional data structures and processing were added to record more context information for each increment, e.g. what host, what user, what database. All the data structures used to record statistical information remain in RAM and the algorithms are tuned to limit the amount of processing that is performed at the moment a statistic is incremented. Future Directions As this is going to press work is being done in the following areas: * Allowing a complete reinitialization of the statistics_class_list without requiring a server restart. * Extending the syntax in SHOW STATISTICS to support a more complete syntax in the WHERE and ORDER BY clauses. * Addition of triggers to support administrator notification when predefine usage thresholds are exceeded. * Development of an external tool similar to MRTG to query the server over time and create detailed usage graphs for historical tracking. Like any new feature we expect a strong amount of development to occur and the Software Workshop plans on providing a focal point for community discussion, bug reports and fixes, along with enhancements. As the Extended Statistics subsystem matures we expect it will also be picked up into the main MySQL distribution. ---------------- Author Bio Tag John Murtari, a former Air Force Instructor Pilot, has been working with UNIX for over 20 years. He is also a crusader for Family Law reform and hosts a group for Moms and Dads at www.AKidsRight.Org. John Murtari (jmurtari@thebook.com) http://www.softwareworkshop.com/ Software Workshop Incorporated 55 E. Genesee Street Baldwinsville, NY 13027 USA Version: $Id: mysql_linux_journal.txt,v 1.1 2007/11/25 13:56:26 thebook Exp $