Configuration & Usage of MySQL Extended Statistics Introduction The Extended Statistics were designed to give the DBA complete control over the type and amount of data to be recorded by the server. The syntax to display information is fairly simple and is similar to SQL. This document describes how to configure and use Extended Statistics. For information on how to install them on your server, please consult documentation at http://www.extendedstatistics.com/documentation.html Terminology In the world of Extended Statistics the different types of things that can interact with the server are called Classes. In its present version these are: db - a MySQL database. user - a MySQL user. host - the host machine of a connection. conhost - a connection combination of user@host condb - a connection combination of db@host server - the server itself. Within a Class, specific instantiations are called Instances (does all this sound a bit familiar? It should. One goal of the project was not to invent new lingo or syntax). So ... an instance of a user could be domenic or melanie. The server Class is unique, the only instance is the server itself. For each instance we are allowed to track individual items called Vars. These include almost all of the variables displayed in the present SHOW STATUS command or available on an admin tool such as phpMyAdmin as "MySQL runtime information." You do need to use the name of the Var as displayed in the SHOW STATUS command, e.g. Com_select tracks select statements. Configuration Options Extended Statistics remains disabled on a server unless you have defined a 'statistics_class_list' in your /etc/my.cnf file OR have defined it as a command line option to the server, e.g. mysqld_safe -user=mysql --statistics_class_list="user,max-50,time-60,units-m,(Bytes_received)" As is standard for MySQL options, if you have something already defined in the /etc/my.cnf, a value specified on the command line will take precedence. The syntax is defined as: statistics_class_list=" Class,max-numInstances,time-timeLimit,units-(m|h|d),(Var,...),Class..." Here is a more complete example: statistics_class_list=" user,max-50,time-60,units-m,(Queries,Com_select,Com_update,Bytes_sent), db,max-50,time-10,units-h,(Bytes_sent,Com_select,Com_update), host,max-5,time-3,units-d,(Com_select,Com_update,Bytes_sent)" NOTE: no white space is allowed in the string and it must be on one physical line in the config file. This is a standard for all MySQL options. In the above we want the server to track information on three different Classes: user, db, and host. We will also keep cumulative statistics for the server as a whole. Similar to SHOW STATUS, but with historical data. For each Class, the 'max-' precedes the number of Instances of that Class to be supported. The 'time-' precedes the number of historical time units to store and 'units-' designates the period of interest: 'm' minutes, 'h' hours, and 'd' days. After limits for each Class, a comma separated list of Vars is included. The names are identical to what is presented in SHOW STATUS. We can now read this as track up to a maximum of 50 users (user,max-50) and record historical activity for the last 60 minutes(time-60,units-m). The Vars we are interested in are: queries, select, update, bytes sent, and bytes received. Security Statistical information is something you may not want to share with every user of the system. The config file also supports another option: statistics_users="john,domenic,ron,kathy,melanie" This allows a DBA to give selected access to named MySQL users of the system. The root user is always allowed access. The SHOW STATISTICS command, without any options, will show authorized users. The command will not display any information to unauthorized users. Using your new toy! -- the SHOW STATISTICS Syntax A goal of the implementation was to keep the syntax used to query the statistics as close to standard SQL as possible. The syntax described below should look very familiar, especially if you replace SHOW STATISTICS with SELECT. The significant differences are the use of LIKE to match specific Instances and the new keyword HISTORY to produce historical output. The LIMIT option operates differently depending on whether HISTORY has been specified. Without HISTORY it operates as in normal SQL, limiting the number of result rows displayed. If HISTORY is chosen it functions as a time limit, e.g. show me just the last 3 hours for each user. SHOW STATISTICS ( * | Var list) FROM Class [WHERE var ( '<' | '>' | '=' ) num] [LIKE 'Instance pattern'] [ORDER BY Var] [HISTORY] [LIMIT num] Because this is a new SQL addition, if you type just SHOW STATISTICS, you will get some usage and Extended Statistics subsystem information -- not just an error message. In normal SQL you could have expected: SHOW STATISTICS * FROM user WHERE user like '%joe%' and Bytes_sent > 500 Yes, we had to cheat by making LIKE a separate clause in the syntax that matched just Class instances. That is our development goal, but at present syntax is limited in the WHERE clause and ORDER BY is also being worked on. Much of this will also become simpler when Extended Statistics is implented as an INFORMATION_SCHEMA table. Usage Examples The examples below show a cross section of commands and output. When using the "*" column headings are in the same order as specified in the statistics_class_list. # Limiting output to just two rows. mysql> SHOW STATISTICS * FROM user LIMIT 2; +-------------+--------+--------+------------+ | user | select | update | Bytes_sent | +-------------+-----------------+------------+ | kelly | 8 | 0 | 27350 | | brian | 1 | 0 | 28303 | +-------------+--------+--------+------------+ 2 rows in set (0.00 sec) # Using HISTORY to see historical data. 0 represents the current # minute, -1, -2, etc. are the prior minutes. mysql> SHOW STATISTICS * FROM user HISTORY LIMIT 5; +----------+-------------+--------+--------+------------+ | db | minutes | select | update | Bytes_sent | +----------+-------------+--------+--------+------------+ | account | 11/24 14:04 | 733 | 6 | 1412513 | | account | 11/24 14:03 | 1095 | 13 | 2366218 | | account | 11/24 14:02 | 615 | 9 | 1856370 | | develop | 11/24 14:05 | 611 | 42 | 2917972 | | develop | 11/24 14:04 | 1226 | 46 | 6981811 | | develop | 11/24 14:03 | 968 | 36 | 6884774 | +----------+-------------+--------+--------+------------+ 6 rows in set (0.00 sec) # Display of specific columns. Use of WHERE clause to remove # uninteresting data. mysql> SHOW STATISTICS Queries, Bytes_sent FROM user WHERE Bytes_sent > 2000000 HISTORY LIMIT 5; +---------+-------------+---------+------------+ | db | minutes | Queries | Bytes_sent | +---------+-------------+---------+------------+ | account | 11/24 14:15 | 2464 | 2031356 | | account | 11/24 14:14 | 2097 | 2218834 | | account | 11/24 14:12 | 2784 | 2368126 | +---------+-------------+---------+------------+ 3 rows in set (0.00 sec) # Using LIKE to limit hosts displayed. The ORDER BY clause # should be implemented by the time you read this. mysql> SHOW STATISTICS * FROM host LIKE '%kids%' ORDER BY Bytes_sent; +--------------------+--------+--------+------------+ | host | select | update | Bytes_sent | +--------------------+--------+--------+------------+ | www.akidsright.org | 8 | 0 | 53113 | | s1.kids-right.org | 2 | 0 | 495 | +--------------------+--------+--------+------------+ 2 rows in set (0.00 sec) Monitoring and controlling Extended Statistics with SHOW and SET Other commands allow you to know what is going on within the subsystem and even exercise control over its operation. You can see all the available items with: SHOW [GLOBAL] VARIABLES LIKE 'statistics%'; statistics_active: 1 - active, 0 - disabled Allows you to suspend statistical tracking and reactivate it later without a server restart. Data recording will pick up where it left off, no old data will be lost. statistics_debug: Bit mask value. The source contains various test points that will output data to the log. To be used with caution under the direction of our service staff! Each of the values above can be modified by the root user using the SET command, e.g. SET GLOBAL statistics_active = 0; Implementation Notes Many of you may be wondering how this was done. A detailed description is beyond the scope of this article, but here are a few notes that may help as you examine the source. The existing MySQL, in both the 4.x and 5.x series, uses a simple invocation of a #define called statistic_increment or statistic_add when data is to be stored and displayed as part of SHOW STATUS. The Extended Statistics addition added a function call to the existing #define. No change was made to how SHOW STATUS data was recorded or displayed, but additional data structures and processing were added to record more context information for each increment, e.g. what host, what user, what database. All the data structures used to record statistical information remain in RAM and the algorithms are tuned to limit the amount of processing that is performed at the moment a statistic is incremented. Future Directions As this is going to press work is being done in the following areas: * Allowing a complete reinitialization of the statistics_class_list without requiring a server restart. * Extending the syntax in SHOW STATISTICS to support a more complete syntax in the WHERE and ORDER BY clauses. * Addition of triggers to support administrator notification when predefine usage thresholds are exceeded. * Development of an external tool similar to MRTG to query the server over time and create detailed usage graphs for historical tracking. Like any new feature we expect a strong amount of development to occur and the Software Workshop plans on providing a focal point for community discussion, bug reports and fixes, along with enhancements. As the Extended Statistics subsystem matures we expect it will also be picked up into the main MySQL distribution. Version: $Id: usage_conf.txt,v 1.2 2008/04/17 13:29:05 thebook Exp $