Wednesday, September 2, 2009

How GDB helped me fix a Drizzle Bug


The other day I found a nice surprise in my inbox. Jay Pipes asked me if I'd like to try fixing a small bug in Drizzle. It looked pretty simple, and the bug report included a big part of the fix. I accepted without hesitation.

I decided to first change trans_prealloc_size from uint32_t to uint64_t. That was done in drizzled/session.h.
Then, I went to drizzled/set_var.cc and changed sys_trans_prealloc_size from sys_var_session_uint32_t to sys_var_session_uint64_t (and removed the two extra parameters).

At first, that looked like everything I needed to do. I compiled Drizzle, executed the queries included in the bug report, and that almost worked!

set session transaction_prealloc_size=1024*1024*1024*4;
set session transaction_prealloc_size=1024*1024*1024*5;
set session transaction_prealloc_size=1024*1024*1024*6;

were all being truncated to "4294966272".

The pending problem was that I no longer received a warning telling me that those values had been truncated.

I looked at the code again and used gdb to step through it, but I just couldn't tell what the problem was.
I was then forced to do what I always do when I'm stuck on a computer problem: take a break :).

... And that was a great idea, because while away from the computer, I thought of looking at sys_var_session_uint64_t::update and comparing it to sys_var_session_uint32_t::update.

It turned out that sys_var_session_uint64_t::update was missing a call to throw_bounds_warning(); no wonder I wasn't getting the truncation warnings.

It was time to enable the test suite that targeted this bug, run that particular test, commit, push... and then I suddenly thought that I should also run the regular tests (you know, make test). And I was glad I did: there were about 7 test cases that failed.

My first thought was that the changes I had just made uncovered some mysterious bugs in Drizzle. But after looking closely at the test cases, I noticed that there was something else going on.

If I executed this query: set group_concat_max_len = 4;, the session variable group_concat_max_len ended up with a value of 4294967295.

It was time for some serious GDB'ing. There are a few articles here, here and here that explain how to use gdb with Drizzle, but I start it a little differently.

Starting GDB and Drizzle

On the terminal, I run:
[wizard]$ gdb --args /Applications/drizzle/sbin/drizzled --port=3306 --basedir=/Applications/drizzle --datadir=/Applications/drizzle/var

After a few seconds, I am at the (gdb) prompt. This is the time to set breakpoints, which would help me find this little issue.
I tried doing this:
(gdb)break drizzled/set_var.cc:744
but gdb told me: "No source file named drizzled/set_var.cc". I had this same problem trying to use gdb with the MySQL Proxy, and after trying different things, I found the correct way to specify the file name:
(gdb)break set_var.cc:744

*Note: when debugging plugins, you specify the filenames like this:
(gdb)break plugin/logging_gearman/logging_gearman.cc:150

(include the plugin folder)

That one breakpoint was all I needed. I then started Drizzle by typing:
(gdb)run
After pressing enter, you see many lines, but the last few look similar to this:

. done
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins.
090827 23:45:16 InnoDB: highest supported file format is Barracuda.
090827 23:45:16 InnoDB Plugin 1.0.3 started; log sequence number 46439
Listening on :::3306
Listening on 0.0.0.0:3306
/Applications/mysql/enterprise/drizzle/sbin/drizzled: ready for connections.
Version: '2009.08.1124' Source distribution (bug-fixes)

Drizzle was ready for my tests. I went to a new terminal and executed

set group_concat_max_len = 4;
The drizzle prompt hung there (waiting on the breakpoint I set earlier).

This is the code near the breakpoint

bool sys_var_session_uint64_t::update(Session *session, set_var *var)
{
  uint64_t tmp= var->save_result.uint64_t_value;

  if (tmp > max_system_variables.*offset) // <-- this was line 744
    throw_bounds_warning(session, true, true, getName(), (int64_t) tmp);
    tmp= max_system_variables.*offset;
  if (option_limits)
    tmp= fix_unsigned(session, tmp, option_limits);
  if (var->type == OPT_GLOBAL)
  {
    /* Lock is needed to make things safe on 32 bit systems */
    pthread_mutex_lock(&LOCK_global_system_variables);
    global_system_variables.*offset= (uint64_t) tmp;
    pthread_mutex_unlock(&LOCK_global_system_variables);
  }
  else
    session->variables.*offset= (uint64_t) tmp;
  return 0;
}


Back on the GDB terminal, I saw:

Breakpoint 1, sys_var_session_uint64_t::update (this=0x4dec60, session=0x1044e00, var=0x10470c8) at set_var.cc:744
744 if (tmp > max_system_variables.*offset)

It was time to look at the value of tmp, so I typed
(gdb) print tmp
$4 = 4

So far, so good; 4 was exactly what I wanted to see. Let's step through the code and see where tmp gets a much higher value.
(gdb) step
746 tmp= max_system_variables.*offset;

Check the value of tmp


(gdb) print tmp
$5 = 4
(gdb) step
748 if (option_limits)
(gdb) print tmp
$6 = 4294967295

There it was: whatever happened between lines 746 and 748 was causing the problem. By now you may have already spotted the issue, but I just didn't see it. I stared at that piece of code for about a minute, and then I realized what the problem was.
The if statement on line 744 had a single-statement body before I added the call to throw_bounds_warning(). Because I am so used to having braces, even for single-line if statements, I forgot to add a pair of {}. Without them, the assignment tmp= max_system_variables.*offset; ran unconditionally, clamping every value to the maximum.
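
For clarity, this is what the corrected block looks like with the braces in place (reconstructed from the snippet above):

if (tmp > max_system_variables.*offset)
{
  throw_bounds_warning(session, true, true, getName(), (int64_t) tmp);
  tmp= max_system_variables.*offset;
}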

*Note: For completeness, once you are done stepping through the code, you can type
(gdb)cont
and the program will continue to run as normal. In my case, I could then run on the drizzle terminal:


drizzle> show variables like "group_concat_max_len";
+----------------------+------------+
| Variable_name        | Value      |
+----------------------+------------+
| group_concat_max_len | 4294967295 |
+----------------------+------------+
1 row in set (0.01 sec)


Conclusion.
If you don't have another pair of eyes to look over your code, you could ask gdb for some help :)

Friday, August 28, 2009

Drizzle and the Gearman logging plug-in

Disclaimer:
This blog post is about things I did on my own free time, not endorsed by my employer.

I have been meaning to look at Gearman for a long time, but I just couldn't find any project where I could use it.

Well, that was true until a couple of weeks ago, when I started to put together Drizzle, the Gearman logging plug-in, Perl, and the Enterprise Monitor.

As I was finishing writing the agent.pl script, I thought it would be a good idea to split the script into at least two components: one that would just collect the queries, and another that would process the log entries (replacing literals with "?", grouping queries by query text, etc.).

It was right there that I realized this was my chance to look at Gearman! My first thought was to keep using the regular query logging plug-in.
But there is already a Gearman logging plugin, and I was curious about how that worked.

A quick Google search returned very little information, but I did find the doxygen docs, and reading the code was fairly straightforward.

By reading the code, I found out that the plug-in registers the function drizzlelog with the Gearman Job server, and that it passes the same string that the query logging plug-in sends to the log file.

The next step was to find a hello-world Perl + Gearman example. I found a sample for the client and the worker. That almost worked out of the box, but I got this error:

Can't call method "syswrite" on an undefined value at /Library/Perl/5.8.8/Gearman/Taskset.pm line 201.

A little Google search later, I found an example where the port was appended to the host. I then added port 4730 to worker.pl and client.pl, and it all worked as expected.
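
For reference, a minimal worker along those lines might look like this (a sketch: process_log is a hypothetical handler, while the drizzlelog function name and port 4730 come from the setup above):

use strict;
use warnings;
use Gearman::Worker;

my $worker = Gearman::Worker->new;
# host:port -- the explicit port is what fixed the syswrite error
$worker->job_servers('127.0.0.1:4730');

# The Drizzle plug-in submits jobs under the function name "drizzlelog"
$worker->register_function(drizzlelog => sub {
    my $job = shift;
    process_log($job->arg);    # $job->arg holds the raw log line
    return 1;
});

sub process_log {
    my ($line) = @_;
    # ... parse the log entry, replace literals with "?", etc. ...
}

$worker->work while 1;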

Once I got the simple example working, I added most of the agent.pl code to the worker.pl script, made a few small changes, and added comments. I was done!

The Gearman logging plugin sends query logs to the job server, and the job server asks the worker to do the actual job.
In the end, the service manager ends up with all the information related to the queries that go to the Drizzle server.

Layout.
For this initial version, one worker cannot handle jobs for more than one Drizzle server. This is not a Gearman limitation; when I wrote this script, there was simply no way to tell the worker which Drizzle server was sending the log entry.

And that was an excellent excuse to add a few more fields to the Gearman logging plugin. (That patch was already approved and will soon be part of the main Drizzle source.)



worker-1 handles requests for drizzled-1 and worker-2 handles jobs for drizzled-2. I am already looking into ways to change this.

Where is the code?
As usual, I posted the worker.pl script on the MySQL Forge.

How do I start the worker?
Like this:

$ DEBUG=1 perl worker.pl --serveruuid="22222222-5555-5555-5555-222222222211"\
--serverhostuuid="ssh:{11:11:11:11:11:11:11:11:11:11:11:11:11:11:11:21}" \
--serverdisplayname="Main-Drizzle-web2"



How do I start the client?
In this case, the Gearman client is the Drizzle plug-in, so all you need to do is add these lines to your drizzled.cnf:

$ cat /etc/drizzle/drizzled.cnf
[drizzled]
logging_gearman_host = 127.0.0.1
logging_gearman_enable = true


Restart the Drizzle server and you are ready to go (well, you also need the MySQL Enterprise Monitor)

Final Note.
I was amazed at how easy it was to have it all working. I will keep looking for other projects where I could use Gearman.

Thursday, August 27, 2009

Drizzle query monitoring

Disclaimer:
This blog post is about things I did on my own free time, not endorsed by my employer.

A little over a month ago, Ronald posted a blog about the different query logging plug-ins that are available for Drizzle. This was pretty exciting news, especially when I saw the details that were included in the logs.

Meanwhile, a few weeks ago, I started looking at the REST API that comes with the MySQL Enterprise Monitor.

The result is that we can now see most of the information returned by the plug-in on the Dashboard.




How?
A colleague at work wrote a little Perl script that interacts with the REST API, and I took his work as the foundation for my agent.pl script.

The next problem was to find a way to call this script as soon as there was a new entry in the log. After a little Google search, I went ahead and decided to ask my friend Adriano Di Paulo (who, among other things, introduced me to MySQL).
A few minutes later, he showed me a working example of the File::Tail Perl module.
That was exactly what I needed: as soon as there is a new entry, I call the function assemble_queries() and pass the new log entry as the parameter.


sub tail_log {
    my $file = File::Tail->new(name => $log_file, maxinterval => 1, reset_tail => 0);
    while ( defined (my $line = $file->read) ) {
        print "\n" . $line . "\n" if $DEBUG > 3;
        assemble_queries( $line );
    }
}



The assemble_queries() function is mostly based on what MC blogged about some time ago. In his blog post, he shows how to collect query-related data using DTrace and Perl.

Then, every n queries, I use send_json_data() to send the query information to the Dashboard and delete the sent data, and the script is ready to process more queries.

Now that I'm writing this, I realize that if send_json_data() fails, the information related to those queries is lost :|. (Note to self: fix it.)

There are other functions in there, but they are mostly for housekeeping.

How do I use it?
Very simple: get the agent.pl script from the MySQL Forge website and edit the credentials, hosts, and ports to fit your needs (future versions may include some kind of config file).

And then you call the script like this:

$ DEBUG=1 perl agent.pl --serveruuid="22222222-5555-5555-5555-222222222211" \
--serverhostuuid="ssh:{11:11:11:11:11:11:11:11:11:11:11:11:11:11:11:21}"\
--serverdisplayname="Main-Drizzle-web2" \
--log-file=/path/to/log/file

As soon as the script starts, it will add the Drizzle server to the service manager, and once you start sending queries to Drizzle, those queries will end up in the UI.

Next?
Next is already done :). I modified the agent.pl script to use the gearman logging plugin. I'll write a blog about it very soon.

Thanks for reading and enjoy!


Monday, August 24, 2009

More Drizzle plug-ins

Last weekend, I finally got some time to look around Drizzle. I had already compiled it on my laptop, but hadn't really looked at the code.
Then, I thought that looking over some of the blueprints on Launchpad would be a good way to get familiar with the code base.
After a quick search, I found "move function/time/ functions into plugin(s)".

This blueprint is basically about creating UDF plug-ins for the different time-related functions.
There was no priority assigned, and it was on the low-hanging-fruit milestone, which was perfect for someone who doesn't really know how much time he can spend and wants to get to know the code.

The first step was to read a bit about the process of contributing to the Drizzle project. I went to the wiki here and read about the coding standards.

I then went ahead and saw how easy the code looked. And I proceeded to email the list, asking for feedback and also telling others what I was up to. This is important, to avoid duplicating the work of others.

Code?
This is where the fun began. I had a fresh branch, and it was time to pick the first function to turn into a UDF plugin.
By luck (and you will soon see why it was luck), I picked unix_timestamp() first.

The Process
There are already some great plugins in the Drizzle branch. I went ahead and duplicated the md5 plugin (in plugin/md5): renamed the folder to unix_timestamp, renamed md5.cc to unix_timestamp.cc, and edited the plugin.ini file in the same folder.

The md5 plugin folder also has a plugin.ac file, but it turned out I didn't need this file, so I just removed it.

It was then time to do the actual code moving. To start, I opened drizzled/function/time/unix_timestamp.cc and drizzled/function/time/unix_timestamp.h.
It was pretty much copy and paste from those two files into plugin/unix_timestamp/unix_timestamp.cc.

And the rest was to replace md5 with unix_timestamp :)

Notes:
When I first started, I had both the built-in unix_timestamp() and the plugin version. To make sure the plugin was returning the correct values, I temporarily named the plugin function unix_timestamp2(). You can do that by changing code in just two lines:

Error messages
Whenever there is an error with your function, the error message will call the plugin's func_name(); the string you return there will be shown in error messages. One way to force such an error is to include either too many or too few parameters.

const char *func_name() const
{
  return "unix_timestamp2";
}

To tell Drizzle the name of your plugin function, you use this line:

Create_function unix_timestampudf(string("unix_timestamp2"));

Most (all?) plugin files will be named lib + <name of the plugin> + _plugin.la. You specify this name using this line:

drizzle_declare_plugin(unix_timestamp)
The rest should be pretty easy to figure out.

Tip
Something I wish I had known before, and that took me way too long to find out: when you add a new plugin folder, you need to run ./config/autorun.sh and then ./configure ... && make && make install. This makes sure your new plugin gets compiled; if you skip autorun.sh, your new plugin will not be compiled.

Final steps
Once I had compiled the new plugin and verified that it all worked well, it was time to delete the built-in function.
1) Went to drizzled/Makefile.am and removed function/time/unix_timestamp.h from there.
2) Removed the files drizzled/function/time/unix_timestamp.cc and drizzled/function/time/unix_timestamp.h.
3) Edited drizzled/item/create.cc and removed the #include and some other references to the unix_timestamp function.
4) drizzled/sql_parse.cc also had to be edited, to remove its #include.
5) Added the new plugin/unix_timestamp/ folder and files to the bzr branch.
6) Ran the tests (and here I found a new problem).

I'm still working on a fix for it. I'm going with one error message for built-in functions as well as plugins. I hope to push those changes soon.

Oh, why was I lucky to pick unix_timestamp() as the first function to tackle? Well, I have been working on timestamp_diff for many hours, and it just does not want to work: it somehow does not see the first parameter. I'm pretty sure I'll be asking Drizzle-discuss for help :)

The end.


select * from information_schema.plugins where plugin_name like '%time%';
+-------------------------+----------------+---------------+---------------+--------------------------------+----------------+
| PLUGIN_NAME             | PLUGIN_VERSION | PLUGIN_STATUS | PLUGIN_AUTHOR | PLUGIN_DESCRIPTION             | PLUGIN_LICENSE |
+-------------------------+----------------+---------------+---------------+--------------------------------+----------------+
| unix_timestamp_function | 1.0            | ACTIVE        | Diego Medina  | UDF for getting unix_timestamp | GPL            |
+-------------------------+----------------+---------------+---------------+--------------------------------+----------------+
1 row in set (0 sec)

drizzle>
Well, not really the end; I still have plenty of functions to move into plugins.

Thanks!

Monday, August 10, 2009

Vote for me! ... widget for your blog.

Most likely you have seen Giuseppe's post showing the latest feature of Planet MySQL. Voting from RSS readers was one feature I was really hoping for since the day voting was announced, as I read most blogs using Google Reader.

Now, I don't remember if it was Dups who asked me, or if I asked him, but all I remember is that I ended up writing a little JavaScript widget that you can add to your blog. This widget allows readers to vote for your blog on Planet MySQL, all from within your blog.

Why would you want to add this JavaScript to your blog?
Because you want to make it very easy for your readers to vote if they like or dislike what they just read.

Requirements/Limitations.
Yes, there are a few (small?) things that have to be in place for this widget to work.
* Your readers will have to have an account on the MySQL.com website.
* Most importantly, your blog post has to already be in the Planet MySQL database.
* If you are using FeedBurner and the URL in your feeds is not the same as your post's URL, this does not work (which is my case :( ), but I'll look for a workaround.

Code.

All you need to do is add these lines of code to your template:


<script language="JavaScript"><!--
var planet = "http://planet.mysql.com/entry/vote/?apivote=1&";
var lk=encodeURIComponent(document.location);
var thumb_up="<img src=\"http://planet.mysql.com/images/thumbs_up_blue.jpg\" border=\"0\" />";
var thumb_down="<img src=\"http://planet.mysql.com/images/thumbs_down_blue.jpg\" border=\"0\" />";
document.write('Vote on the <br /><a href=\"http://planet.mysql.com/\" >Planet MySQL</a><br />');
document.write('<a title=\"Vote me Up on the Planet MySQL\" href=\"' + planet + 'vote=1&url=' + lk + '\">' + thumb_up + '</a>');
document.write('&nbsp;');
document.write('<a title=\"Vote me Down on the Planet MySQL\" href=\"' + planet + 'vote=-1&url=' + lk + '\">' + thumb_down + '</a>');
// --></script>


How do I add it to Blogger?

1- On the left side of this blog, you will see an "Add Voting to your blog" button; click on it.


2- On the "Add page element" section, select the blog you would like to add this widget to.
3- Click "Add widget"


4- You will now see a widget named "Vote on Planet MySQL"; you can leave it there or move it around.
This widget will appear on every single post you have.
5- Click on save, and you are done!



How do I add it to XYZ?

I'll talk to the Community team and ask them to document, either on a wiki page on the Forge or somewhere else, the steps to add a widget like this to other blog platforms.

Enjoy!

Sunday, July 26, 2009

MockLoad on Launchpad - MySQL Proxy

Several months ago, I started a little project at work called MockLoad. It started as a fun way of using the MySQL Proxy to test our monitoring agent, as well as the rules engine and graphs on the Service Manager.

Why?
I needed a tool that would be easy to use and improve over time, and that would allow our QA team to send custom values to the service manager. The goal was to pretend we had some very busy MySQL servers.

And what better tool than the MySQL Proxy itself to pretend to be a busy MySQL server!
Our agent collects MySQL-related data by issuing different SQL queries, so I thought I could put a MySQL Proxy instance between our agent and the MySQL server we were monitoring.
This proxy would then intercept the queries that our agent was issuing, and return some custom values.

Writing Lua scripts
I started by looking at all the queries that our agent sends to the MySQL server it monitors. Once I had the list, I went ahead and looked at which values our graphs use, and finally at the values our rules engine uses.

I then started by mocking just two queries, SHOW GLOBAL STATUS and SHOW GLOBAL VARIABLES.

What I do is replace the values from a normal SHOW GLOBAL STATUS query with Lua variables, and I manipulate those Lua variables to trigger alerts and produce interesting graphs.

In the utils.lua script, I have this function



function increase_val(incoming_val, method, increaser)
    if (method == "multiply") then
        incoming_val = proxy.global.round( incoming_val * 1.1, 0)
    elseif (method == "add") then
        incoming_val = incoming_val + increaser
    end
    return incoming_val
end
Pretty simple: this allows me to increase specific counters by just adding to the original value, or by multiplying the original value by 1.1.
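
For example, a mocked status counter can grow a little on every agent poll. A minimal sketch (the questions variable and its values are made up for illustration):

-- starting value for a mocked Questions counter
local questions = 140000

-- each time the agent polls SHOW GLOBAL STATUS, bump the counter
questions = increase_val(questions, "add", 750)

-- or make it grow by 10% per poll instead (the increaser argument is ignored)
questions = increase_val(questions, "multiply", 0)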

Replication.
But the fun does not stop there: with this script, I also mock a replication setup where the slaves show some serious replication lag.



This was actually pretty neat to implement. I had to make our agent go through the proxy to query the slave servers, so that the master and the slave would report the same binlog names, etc.

Oh, and for the most part, the proxy reports having 700 binlogs, 700 users without a password, and 700 users with root access to the server. These are all values meant to put stress on our monitoring system.

The other challenge I had with simulating replication was how to tell each proxy instance whether its backend was a master or a slave server.
My first idea was to have two different Lua scripts, but that just did not look clean. I then thought of sending a custom query through each proxy to tell it what the backend was. But I wasn't happy with either approach, so I finally decided to do some query injection.

Master or Slave?
Whenever our agent sends a SHOW MASTER STATUS query, the Lua script injects a SHOW SLAVE STATUS query into the queue. If the proxy finds an empty resultset for the slave status, it assumes that the backend server is a master. While this is not a foolproof method, it works fairly well for now.
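
A rough sketch of that injection, using the Proxy's read_query()/read_query_result() hooks (the injection ids and the backend_is_master flag are illustrative, not the exact code from mockload.lua, and the append() options table varies between Proxy versions):

function read_query(packet)
    if packet:sub(2) == "SHOW MASTER STATUS" then
        -- forward the original query (id 1), then inject our probe (id 2)
        proxy.queries:append(1, packet, { resultset_is_needed = true })
        proxy.queries:append(2, string.char(proxy.COM_QUERY) .. "SHOW SLAVE STATUS",
                             { resultset_is_needed = true })
        return proxy.PROXY_SEND_QUERY
    end
end

function read_query_result(inj)
    if inj.id == 2 then
        local rows = 0
        for row in inj.resultset.rows do rows = rows + 1 end
        -- an empty SHOW SLAVE STATUS means the backend is a master
        proxy.global.backend_is_master = (rows == 0)
        return proxy.PROXY_IGNORE_RESULT
    end
end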

You can see the replication mocking on the mockload.lua file.

Graphs?
These are just some of the graphs our service manager produces when we use MockLoad:




Controlling the counters.
To prevent the counters from increasing without limit, I use a global variable that sets the maximum value any counter could get. Once any counter reaches that value, all counters are reset to their initial values.

Source code?
You can find the scripts that make up MockLoad on Launchpad. You may ask why I'm hosting them there. The answer is that I still have a long way to go with MockLoad, and I thought that some people would benefit from seeing how this was implemented. Who knows, someone may be able to use it as is, or without many changes.

How do I use it?
You need at least one MySQL server and a Proxy. You tell the proxy where the lua modules are by doing this:

Assuming the main file is
/usr/local/scripts/mockload.lua
and the modules are in

/usr/local/scripts/mockload/

$ export LUA_PATH=/usr/local/scripts/?.lua
$ ./sbin/mysql-proxy --defaults-file=/path/to/ini/file/proxy.ini

You can then send a query like SHOW GLOBAL STATUS through the proxy port 4040, and each time you send it, you will notice some counters returning increased values (higher than normal for an idle server :) ).

I hope you all enjoy it and as always, feedback is welcome.

Wednesday, May 27, 2009

MySQL Proxy => proxydb + replication

A couple of weeks ago I wrote a lua script to use with the MySQL Proxy that transforms the Proxy into a key=>value lookup dictionary.

But I couldn't just stop there. So I decided to add replication to it :).


The basic idea is that you have one proxy that acts like a master; it handles all write/read operations. Once it receives a write query, it sends that query to the slave proxy instances, and after the last slave gets the query, the master returns a confirmation to the mysql client.
And of course, you send your read queries to the slave proxy instances.

Show me the code.
It is available on the MySQL Forge. And I'd recommend using Oldag's version of the sandbox, which allows you to easily set up different topologies of MySQL servers and Proxies. (It saved me lots of time.)

Complex situation, lots of ins, outs and what have you.
(or how does it work?)

It uses the connection pool feature of the MySQL Proxy that the rw-splitting.lua script is based on.
This means that you will need to start a few different connections to the master proxy instance to get the pool going.
I do this by using this simple command line:

$ for i in `seq 1 100` ; do ./proxy1/use -e "exit"; done ;
It also makes use of mocking a MySQL server. The script checks whether there is a backend defined (proxy-backend-addresses); if there is one, it will try to open a connection; otherwise, it is the last node in the chain and it will return an OK packet.

Oh, look at the proto.to_challenge_packet({}) function, which is better than working directly with the protocol.

We then have read_query(), which is pretty easy to understand; I tried to add as many comments as I could to the actual code.
We have different functions for each type of query; I think this is a clean way of doing it :).
You will also see that each function that handles a query type has a line like this one:

if proxy.global.last_node == 0 then

This tells the proxy that if it is the last node in the chain, it has to return a resultset itself.
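
In the simplest case (a write that succeeded), that can look something like this (a minimal sketch, not the exact code from the Forge script):

if proxy.global.last_node == 0 then
    -- no more backends to forward to: answer the client ourselves
    proxy.response.type = proxy.MYSQLD_PACKET_OK
    proxy.response.affected_rows = 1
    proxy.response.insert_id = 0
    return proxy.PROXY_SEND_RESULT
end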

read_query_result()
This was fun to work on. Here is where you send the queries to the slave proxies. The proxy loops over the list of backends and appends the same query. Once it reaches the last proxy, it resets the current backend server and returns an OK packet to the mysql client.

This implementation could help you duplicate and reroute your queries to a different MySQL server, so that you can send them to a server with different values in the my.cnf file, or with any other changes that you'd like to test.

I hope you find it useful and feedback is always welcome.

*I got seq for my Mac from this post.

Monday, May 25, 2009

MySQL Proxy - is it a MySQL server or what is it?



There are times when you don't need a MySQL server to handle your queries; this can come in handy when you are automating tests or writing interesting Lua scripts.

Yes, you can use the MySQL Proxy for this as well, and have it pretend to be a MySQL server.
Today we'll focus on a simple implementation: we will handle the initial connection, a SHOW DATABASES query, and the exit (QUIT) command.

You can see all the hooks that the proxy handles in this Lua script written by Giuseppe.

Connecting to a server.


function packet_auth(fields)
    fields = fields or { }
    return "\010" ..                          -- proto version
        (fields.version or "5.0.45-proxy") .. -- version
        "\000" ..                             -- term-null
        "\255\255\255\255" ..                 -- thread-id
        "\000\020\000\000" ..
        "\000\000\000\000" ..                 -- challenge - part I
        "\000" ..                             -- filler
        "\001\130" ..                         -- server cap (long pass, 4.1 proto)
        "\008" ..                             -- charset
        "\002\000" ..                         -- status
        ("\000"):rep(26)                      -- filler & challenge - part II
end

This function prepares the raw response that the MySQL Proxy will return to all clients connecting through the proxy port (default 4040).

If you are into protocols and what not, you can see the Client - Server protocol on the forge.

Then you need to use this function:

function connect_server()
    -- emulate a server
    proxy.response = {
        type = proxy.MYSQLD_PACKET_RAW,
        packets = {
            packet_auth()
        }
    }
    return proxy.PROXY_SEND_RESULT
end

There, you see how we return a RAW packet (proxy.MYSQLD_PACKET_RAW); most of the time you see a proxy.MYSQLD_PACKET_OK
or proxy.MYSQLD_PACKET_ERR type of result. And then you see that
packets gets the result of the packet_auth() function.

Handle the Query types.



function read_query( packet )
    if (string.byte(packet) == proxy.COM_QUIT)
        or (string.byte(packet) == proxy.COM_INIT_DB) then
        -- This gets called when you disconnect from the server
        -- or when you specify a database at connection time
        proxy.response.type = proxy.MYSQLD_PACKET_OK
        return proxy.PROXY_SEND_RESULT
    elseif string.byte(packet) == proxy.COM_QUERY then
        if packet:sub(2) == "SHOW DATABASES" then
            return show_databases()
        else
            -- Unknown query
            return error_result(
                "I haven't learnt how to handle that query"
                , '101'
                , '!101')
        end
    end
end

If the query type is either a COM_QUIT or COM_INIT_DB, you can just send an OK packet, and that will be fine.

Then we check for COM_QUERY, and once we find a query, we try to match the query text. In this example, we only handle SHOW DATABASES (all upper case), but you can handle any other query, even queries that are not real SQL :).

If the MySQL Proxy got a SHOW DATABASES query, it will call the show_databases() function and return a resultset of 70 databases.

For all other queries, the proxy will return an error message back to the mysql client.
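
The error_result() helper is not shown above; a minimal version might look like this (a sketch using the error fields the Proxy exposes on proxy.response):

function error_result(msg, code, sqlstate)
    proxy.response = {
        type     = proxy.MYSQLD_PACKET_ERR,
        errmsg   = msg,       -- the message the mysql client will print
        errcode  = code,      -- numeric error code
        sqlstate = sqlstate,  -- SQLSTATE value
    }
    return proxy.PROXY_SEND_RESULT
end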

Send a result back to the client.

function show_databases()
    proxy.response.type = proxy.MYSQLD_PACKET_OK
    proxy.response.resultset = {
        fields = {
            { type = proxy.MYSQL_TYPE_STRING, name = "Database", },
        },
        rows = {}
    }
    -- here we add 70 database names to the resultset
    for i = 1, 70 do
        table.insert(
            proxy.response.resultset.rows,
            {"db_" .. i})
    end
    return proxy.PROXY_SEND_RESULT
end

Here we create the table for proxy.response.resultset; note the empty rows table. We then use a for loop to add 70 items that become 70 rows in the resultset.

Full code.

You can find this and many other lua scripts to use with the MySQL Proxy on the MySQL forge site.

How can I use it?
Because this Lua script simulates a MySQL server, you do not need to specify a proxy-backend-addresses value; just start the proxy like this:

$ sbin/mysql-proxy --proxy-lua-script=/path/to/server_mock.lua

Tuesday, May 19, 2009

MySQL Proxy - what if it crashes?


While I hope the MySQL Proxy never crashes, it will happen: there will be some strange (or maybe not so strange) usage or workload, and it will die.

To work around this, you could decide not to use it, or you could use something like Linux-HA to have more than one MySQL Proxy running at all times. Or you could use one of the new features that come with version 0.7.0.

What is it?
We now have a --keepalive option. As the name indicates, if the mysql proxy process dies/crashes, it will come back up in a few seconds (less than 5 seconds).

How does it work?
If you start the MySQL Proxy with the keepalive option, there will be two processes with the same name. One will be very small, about 600KB, and then you will also see the "real" mysql-proxy process. The "angel" process has a lower PID than the "real" mysql-proxy process.

If there is some kind of error that causes the proxy to die, the "angel" process will wait about 3 seconds, and try to restart the proxy process.

You can "force" this behavior by sending a

$ kill -SIGKILL < pid of the proxy >

or

$ kill -SEGV < pid of the proxy >

How do I use it?
Very simple: if you use a configuration file, you need to add a new line with

keepalive = true;

or you can start the mysql proxy like this:

sbin/mysql-proxy --proxy-lua-script=/mysql-lua-scripts/trunk/scripts/proxydb.lua --keepalive

Enjoy! ... And next time I'll write about using GDB to get some debugging information about the crash.

Sunday, May 10, 2009

MySQL Proxy - proxydb

What I really like about having Lua and the MySQL Proxy together is that the combination turns out to be very flexible: you can have the proxy do all kinds of things. The latest thing I made the MySQL Proxy do is act like memcached.

Well, maybe not, but it handles key => value pairs now :P




What does it do?

It handles 5 basic query types:

mysql> INSERT "key" = "value";

Very simple: it inserts a key => value pair into the proxy.global.db table; if there is already a value for that key, it overwrites it.

mysql> SELECT "key";

It retrieves the value for the specified key.

mysql> DELETE "key";

Deletes the key => value pair from the lua table.

mysql> SELECT *;

It returns all the key=> value pairs.

mysql> DROP;

It deletes all rows from the table.

Internals.
This time I used the tokenizer to parse the queries, and I'm so glad I did. It is much easier and cleaner than writing regular expressions to match your queries.

I also wrote different functions to handle each statement, which I call based on the tokens I find.
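
A rough sketch of that dispatch, assuming the proxy.tokenize() API (the token positions follow the INSERT "key" = "value" syntax above; add() and delete() stand for the handlers described below):

function read_query(packet)
    if string.byte(packet) ~= proxy.COM_QUERY then return end

    -- tokenize the query text, skipping the leading command byte
    local tokens = proxy.tokenize(packet:sub(2))

    if tokens[1].token_name == "TK_SQL_INSERT" then
        -- INSERT "key" = "value";  ->  key is token 2, value is token 4
        return add(tokens[2].text, tokens[4].text)
    elseif tokens[1].token_name == "TK_SQL_DELETE" then
        -- DELETE "key";
        return delete(tokens[2].text)
    end
end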

Looking at the script, you will find different ways of returning results back to the mysql client. Something that took me a while to figure out was how to return an empty resultset, but with a number of affected rows, like what you get from an INSERT statement.

This is the answer:

function add( key, value )
    -- We add the key value pair to the global table
    proxy.global.db[key] = value
    -- we return a nice "affected rows count"
    proxy.response.type = proxy.MYSQLD_PACKET_OK
    proxy.response.affected_rows = 1
    proxy.response.insert_id = 0
    return proxy.PROXY_SEND_RESULT
end

If you call this function, you need to do so like this:

return add(tokens[i + 1]["text"], tokens[i + 3]["text"])

and not just like this:

add(tokens[i + 1]["text"], tokens[i + 3]["text"])

* Note the "return" there. Otherwise, you will not get the custom resultset on the client.

I also added the connect_server() function, which allows you to run the MySQL Proxy without any MySQL server on the backend.

How do I use it?
Get the code from the MySQL Forge site, and save it as proxydb.lua
Then, start the MySQL Proxy like this:

./bin/mysql-proxy --proxy-lua-script=/src/mysql-lua-scripts/trunk/scripts/proxydb.lua

You can then connect to the port 4040 using your mysql client and start sending queries. If you use any of the queries I showed before, the proxydb.lua script will handle them.

Numbers?
Yesterday I ran some tests and found that:
  • Having 3,908,492 entries
  • Key length of about 11 characters
  • Value length of about 17 characters
  • The MySQL Proxy process used 1.5GB of RAM
Next?
Who knows, maybe someone will use this script and let me know how it runs on a busy setup :).

Thursday, April 30, 2009

How do I use the MySQL Proxy Admin plugin?

We have an Admin plugin for the MySQL Proxy, but people started asking how to use it. I only found one example, thanks to Giuseppe, but people wanted more.

While the admin plugin is somewhat limited, it already provides some nice features. One of the use cases is to give access to information only to authorized users. The Admin plugin uses its own username and password to authenticate users; these are not related to any user on your MySQL server.

The example.

I went ahead and put two scripts together in about an hour. They are basic, but should give you more of an idea of what you can do.

Number of queries processed.

Connecting to the admin port 4041, you can execute this query

mysql> show querycounter;
+---------------+
| query_counter |
+---------------+
|            15 |
+---------------+
1 row in set (0.00 sec)

And you can see how many queries went through the proxy (it does not count the queries sent to the admin plugin).


Backends list.

You can see some information about the backends. Type 1 means master, and type 2 means slave.

mysql> SELECT * FROM backends;
+-------------+-----------------+-------+------+
| backend_ndx | address         | state | type |
+-------------+-----------------+-------+------+
|           1 | 127.0.0.1:22547 | 1     | 1    |
|           2 | 127.0.0.1:22548 | 0     | 2    |
|           3 | 127.0.0.1:22549 | 0     | 2    |
+-------------+-----------------+-------+------+
3 rows in set (0.00 sec)


Client connections list

mysql> SHOW PROXY PROCESSLIST;
+------+--------------------+--------------------------+
| Id   | IP Address         | Time                     |
+------+--------------------+--------------------------+
| 249  | 127.0.0.1:53830    | Fri May  1 00:13:28 2009 |
| 248  | 127.0.0.1:53827    | Fri May  1 00:13:27 2009 |
| 247  | 127.0.0.1:53825    | Fri May  1 00:13:26 2009 |
| 250  | 192.168.0.99:53836 | Fri May  1 00:13:38 2009 |
+------+--------------------+--------------------------+
4 rows in set (0.00 sec)


The way it works is basically this: the proxy plugin collects data and makes it available in proxy.global.* variables, and the admin plugin reads those global variables and returns the data in a pretty format :)
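
A minimal sketch of the two halves (the real admin-1.lua and reporter.lua do more, and the exact resultset plumbing may differ):

-- reporter.lua (proxy plugin): count every query that passes through
function read_query(packet)
    if string.byte(packet) == proxy.COM_QUERY then
        proxy.global.query_counter = (proxy.global.query_counter or 0) + 1
    end
end

-- admin-1.lua (admin plugin): answer "show querycounter" on port 4041
function read_query(packet)
    if packet:sub(2):lower() == "show querycounter" then
        proxy.response = {
            type = proxy.MYSQLD_PACKET_OK,
            resultset = {
                fields = { { type = proxy.MYSQL_TYPE_LONG, name = "query_counter" } },
                rows   = { { proxy.global.query_counter or 0 } }
            }
        }
        return proxy.PROXY_SEND_RESULT
    end
end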

You can start the proxy with a command like this:
./sbin/mysql-proxy --admin-lua-script=/Users/wizard/etools/mysql-lua-scripts/trunk/scripts/admin-1.lua \
--proxy-backend-addresses=127.0.0.1:22547 \
--proxy-read-only-backend-addresses=127.0.0.1:22548 \
--proxy-read-only-backend-addresses=127.0.0.1:22549 \
--proxy-lua-script=/Users/wizard/etools/mysql-lua-scripts/trunk/scripts/reporter.lua \
--plugins=admin


Where are the Lua scripts?

You can find them on the MySQL Forge site here and here.

And remember that the default username and password for the admin plugin is root / secret, which you can change by using
--admin-username=proxy \
--admin-password=guesswhat


Enjoy!

Tuesday, April 14, 2009

A feature, a bug, a new feature and a bug fix

After I wrote about reading a master.info file using the MySQL Proxy, I went ahead and added the missing piece: creating a master.info file using the MySQL Proxy.

A bug?
As I went back to lib/mysql-proto.c looking for a function that I could duplicate and modify to add the to_masterinfo_string() function, I realized that I missed a few master_ssl_* fields. It turned out that I was not exporting all the fields from the master.info file.

This time, the bug fix was easy enough. After modifying the test case to account for the missing fields, I added a few lines of code and I was ready to add the new feature.

New feature
You can now use the function to_masterinfo_string(), which takes as a parameter a table. You can create one using from_masterinfo_string()

You could read this example to get an idea of how it could be used.
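
A round trip would look something like this (a sketch; the file path is a placeholder, and master_log_pos is one of the fields the parser returns):

local proto = assert(require("mysql.proto"))

-- read an existing master.info file
local f = assert(io.open("/path/to/master.info", "r"))
local content = f:read("*a")
f:close()

-- parse it into a Lua table, tweak a field, and serialize it back
local info = proto.from_masterinfo_string(content)
info.master_log_pos = 4
local new_content = proto.to_masterinfo_string(info)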

Next?
I'll continue my work on MockLoad(*) and wait for the merge proposal to be accepted.

(*)More about what it does once I finish some legal paperwork here at Sun.

Friday, April 10, 2009

Reading master.info files using the MySQL Proxy

After I wrote about a new feature in the MySQL Proxy that helps you read master.info files, I thought that showing an example could come in handy.

You can find the complete file on the MySQL Forge, and once I have a test case for this script, it will be available on Launchpad.

Explaining the code.
You can see at the top I have
local proto = assert(require("mysql.proto"))
This is important, as it makes the from_masterinfo_string() function available for use.

I also included a function called get_command(), which is a modification of code found here.
It basically does a simple parse of the statement you send through the proxy and returns two variables.

read_query()
Here, I look for the query "read masterinfo;" and convert it into
SHOW GLOBAL VARIABLES LIKE "datadir"
I do this to get the location of the master.info file. You could just hardcode it, but that's not fun :)

read_query_result()
Here, I read the value of datadir and use it to load the content of master.info into a Lua variable, convert the content into a MySQL resultset, and send it back to the client.
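
Roughly, the handler does something like this (a sketch, not the exact Forge script; error handling is omitted and datadir is assumed to end with a slash):

function read_query_result(inj)
    -- grab the datadir value from the SHOW GLOBAL VARIABLES resultset
    local datadir
    for row in inj.resultset.rows do
        datadir = row[2]
    end

    -- load master.info and parse it into a Lua table
    local f = assert(io.open(datadir .. "master.info", "r"))
    local info = proto.from_masterinfo_string(f:read("*a"))
    f:close()

    -- turn the table into a Variable_name / Value resultset
    local rows = {}
    for k, v in pairs(info) do
        rows[#rows + 1] = { k, tostring(v) }
    end
    proxy.response = {
        type = proxy.MYSQLD_PACKET_OK,
        resultset = {
            fields = {
                { type = proxy.MYSQL_TYPE_STRING, name = "Variable_name" },
                { type = proxy.MYSQL_TYPE_STRING, name = "Value" },
            },
            rows = rows
        }
    }
    return proxy.PROXY_SEND_RESULT
end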

How to use.
You need to connect to a MySQL Proxy that sits in front of a slave MySQL server (and tell the proxy to use this Lua script)(*). Then execute:

mysql> read masterinfo;

And you would get something like this:

mysql> read masterinfo;
+----------------------+----------------+
| Variable_name        | Value          |
+----------------------+----------------+
| master_user          | slave          |
| master_ssl           | 0              |
| master_port          | 43309          |
| master_log_file      | dv1-bin.000005 |
| master_connect_retry | 60             |
| master_password      | slave          |
| master_log_pos       | 272            |
| master_host          | 192.168.2.1    |
+----------------------+----------------+
8 rows in set (0.00 sec)

And you are done.


(*)You can find more information about the MySQL Proxy on the MySQL Forge site.

Tuesday, April 7, 2009

MySQL Proxy and master.info (contribution 2 of N)

This time I'll write about a nice feature we now have in the MySQL Proxy.

Parsing master.info
master.info is a file that the MySQL server creates when you are running a slave server; you can read the manual for more details.
And now you can use the MySQL Proxy to read that file and parse the data in it.

Background
It all started when Jan emailed the proxy-discuss mailing list with the Changelog from 0.6.1 to 0.7.0. The part that got my attention was:
"...
Core
  [skip]
  * added parser for master.info files
..."
As I was working on a Lua script that did some simulation, I asked Jan how that worked, and I found out that it was only implemented in C-land, and only compatible with MySQL server 5.1. That meant you could not write a Lua script and use the parser we had.



Nothing like motivation
So I went ahead and asked how much work it would take to implement this feature and I was pointed to the lib/mysql-proto.c file. As I really wanted to have this implemented, I couldn't resist and made a local branch to work on it.

I first looked at that whole file, trying to figure out how things worked and after reading lib/mysql-proto.c I found out I also needed to look at src/network-mysqld-masterinfo.c. After some more reading, I was ready to write some code.

Thought process
Well, I was ready to modify what was there. The way I usually add code when I'm not very familiar with a code base is by modifying the current code little by little: compile, test, fix what I broke, and continue. I do this a few times until I have something new working :).

This time I took this function:

static int lua_proto_get_ok_packet (lua_State *L) {
    size_t packet_len;
    const char *packet_str = luaL_checklstring(L, 1, &packet_len);
    network_mysqld_ok_packet_t *ok_packet;
    network_packet packet;
    GString s;
    int err = 0;

    s.str = (char *)packet_str;
    s.len = packet_len;

    packet.data = &s;
    packet.offset = 0;

    ok_packet = network_mysqld_ok_packet_new();

    err = err || network_mysqld_proto_get_ok_packet(&packet, ok_packet);
    if (err) {
        network_mysqld_ok_packet_free(ok_packet);

        luaL_error(L, "%s: network_mysqld_proto_get_ok_packet() failed", G_STRLOC);
        return 0;
    }

    lua_newtable(L);
    LUA_EXPORT_INT(ok_packet, server_status);
    LUA_EXPORT_INT(ok_packet, insert_id);
    LUA_EXPORT_INT(ok_packet, warnings);
    LUA_EXPORT_INT(ok_packet, affected_rows);

    network_mysqld_ok_packet_free(ok_packet);

    return 1;
}

I duplicated it and renamed it "lua_proto_get_masterinfo_string". I chose that function as it had some "LUA_EXPORT_INT" lines which sounded like what I needed.

As you can see from here, I started by commenting out some lines and renaming some function calls and other lines of code.

I also added a line to "luaL_reg mysql_protolib", which is where things really get exposed to Lua (I found this out by searching the whole branch for "from_ok_packet", which I found in test/unit/lua/mysql-proto.lua).

I then compiled the new code and wrote a simple Lua script to test this (which was not the best way, but more on that later).
At this point all I wanted to test was that I could call "get_masterinfo()" from Lua, and it worked!

I then removed the lines that I had commented out, and after showing Jan where I was, he suggested renaming the function, so I did.

Where I should've started.
We all know that we should first have a test and then write the code. Well, I wasn't sure what I was doing, so I started the other way around, but eventually I got to writing the test for this piece of code.
There was already a Lua unit test for the mysql-proto.c file, so I added some tests for the master.info parsing.

At that point, it only worked for master.info files from MySQL 5.1, so the next step was to add support for 4.1 and 5.0.

The way I implemented the support for 4.1 and 5.0 master.info files took me a few days. I really wanted to return a NULL (nil in Lua) value for "master_ssl_verify_server_cert" if it was not present, but I did not know how to do it. Either Jan or Kay came to the rescue there, and we ended up with this.

Basically, if you don't export it, it will be NULL. And having the unit test made it so much easier to test the changes.

Once I was done with this, I proposed a merge, and a few days later it was part of the main Proxy code.

Next?
So far we can parse a master.info file, but we cannot "create" one, so this is next: I want to have a function that creates the master.info file. I know I could just use the io library from Lua, but it will be more fun to add a "to_masterinfo_string()" function in Lua land :)

Thanks and enjoy!

Monday, April 6, 2009

Contributing to MySQL Proxy (1 of N)

Thanks to Kay, it is now much easier to contribute to the MySQL Proxy project. And it turns out you don't have to be a super C developer to help out.
Yes, you may think: you can report bugs, help answer questions, even submit Lua scripts. But there is another way you can contribute.

Blueprints
You can check out the blueprints for the Proxy and pick one that seems "simple" to implement. That's what I did. I went there and picked "Chassis/Proxy should have short names for often used options". It sounded simple enough for me, and I thought it would help me get more familiar with the source code.
After a few emails on the mailing list, I gathered all the info I needed and created a branch to work on it.

It turned out not to be much work at all :), so after I pushed my changes to the branch I created on Launchpad, I proposed a merge. Now I'm waiting for a code review, and hopefully it will become part of the main Proxy branch.


Conclusion
No matter what your skill set is, there are plenty of ways to get involved in this project.

Thursday, March 26, 2009

Lua script repository for MySQL Proxy


While doing QA on the MySQL Proxy and the Enterprise tools, I started writing a Lua script to use with the proxy. The goal was to tell our monitoring agent that the server it was monitoring was very busy. The script basically intercepts queries like SHOW GLOBAL VARIABLES or SHOW GLOBAL STATUS, among others, and returns a custom resultset. (More details in a future post.)

It is still a work in progress, but I wanted to give the community access to it. After some emails on the MySQL Proxy Discuss mailing list, I created a project on Launchpad that will host this script (one main script and 4 modules so far). The idea is to make this a community-owned project, so that we can have all kinds of Lua scripts there.

What about the MySQL Forge?
I know we have many Lua scripts on the forge site, but having the scripts on Launchpad would enable anyone to maintain those scripts.

In the last few months, there have been some changes in the MySQL Proxy code that broke some of those scripts; by moving them to this new repository, we can start fixing them.


Now, the legal department at Sun is doing some work so that this script can be available under the GPLv2 license. This process may take a few weeks :(

As soon as I get this approved, I'll blog about it!

Tuesday, March 24, 2009

Simulating workload with MySQL Proxy

On April 2nd at 10 AM PST, I'll be giving a webinar with Giuseppe. I will be talking about how I use the MySQL Proxy to test itself and to test our Monitoring Agent using some Lua scripts.

I'll also talk about a new Launchpad repository for Lua scripts to use with the MySQL Proxy. One of the nice things about it is that it will be community-owned. So even though a Sun employee registered it, the community will be able to make contributions and/or decide what gets included there. In some ways, it will be like the Drizzle project.

I hope to see you there!

P.S. You can register here
