[Linux-disciples] How to figure out who's busying my machine

Stephen R Laniel steve at stevereads.com
Tue Feb 9 17:51:39 EST 2010


At work I get Nagios alerts pretty regularly from this one database server. Maybe every 12 or 15 minutes I'll get notified that the server is experiencing elevated system load; a couple of minutes later I'll get another message saying that the load is back to normal. If I connect to the MySQL instance on that box and run "SHOW PROCESSLIST", I almost always find that there's nothing interesting happening. Quite often it's just 100 connections in the 'sleep' state. (That many sleeping connections is fine: apps open pooled connections, then keep them open in case anyone else wants to stuff some data down the pipe.)

What I'd like to know, at moments of high system load, is who is using up my CPU. Obviously "ps" does the job, and lo: it turns out that mysqld is consuming 70+% of the CPU. This is to be expected; it's a DB server.
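To capture that "who is using my CPU" answer at the moment the load spikes, rather than whenever I happen to run ps, something like the following might work. It is only a rough sketch, assuming a Linux /proc filesystem: it reads the 1-minute load from /proc/loadavg and the utime and stime counters from /proc/<pid>/stat. The helper names (load_1min, cpu_ticks, sample_all, busiest) are made up for illustration.

```python
import os

def load_1min(loadavg_text):
    """Return the 1-minute load average from the contents of /proc/loadavg."""
    return float(loadavg_text.split()[0])

def cpu_ticks(stat_text):
    """Return utime + stime (clock ticks) from the contents of /proc/<pid>/stat."""
    # The command name is parenthesised and may contain spaces,
    # so split on the last ')' before splitting on whitespace.
    fields = stat_text.rsplit(")", 1)[1].split()
    # After the command name, field 0 is the state; utime and stime
    # (fields 14 and 15 of the full line) land at indexes 11 and 12.
    return int(fields[11]) + int(fields[12])

def sample_all():
    """Map pid -> cumulative CPU ticks for every process we can read."""
    ticks = {}
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            with open(f"/proc/{name}/stat") as f:
                ticks[int(name)] = cpu_ticks(f.read())
        except (OSError, IndexError, ValueError):
            # The process exited or the line was unreadable; skip it.
            continue
    return ticks

def busiest(before, after, n=5):
    """Return the n (pid, tick delta) pairs that grew most between two samples."""
    deltas = {pid: after[pid] - before[pid] for pid in after if pid in before}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

The idea would be to poll /proc/loadavg every few seconds and, once load_1min() crosses some threshold (the right value is a guess that depends on the box), call sample_all() twice a second or so apart and log busiest(before, after). On this server I'd expect mysqld at the top, but the log would at least confirm that at the exact moment of each alert.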

I could use some domain-specific knowledge to figure out what's going on here, and indeed I'm doing that. Maybe, for instance, there's a big disk commit happening every 12 minutes, for some reason. Is there some more generic way of figuring out what's happening at those exact moments of high system load, though? For instance, is there any way to see which files a specific pid is writing to at a specific moment? lsof is normally helpful, but doesn't really help in this case: I know that mysqld will be holding open all the InnoDB files corresponding to a given set of tables, so lsof is unlikely to tell me anything surprising.
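One possible approach to the "which files is this pid touching right now" question, sketched under assumptions: on Linux, /proc/<pid>/fdinfo/<fd> has a "pos:" line giving the current file offset of each open descriptor, so two snapshots taken a moment apart show which descriptors moved. The function names below (parse_pos, snapshot, moved) are hypothetical. Two caveats: the offset also moves on reads and seeks, so this shows activity rather than writes specifically, and positional writes such as pwrite() do not move the offset at all, so a quiet result does not prove nothing was written.

```python
import os

def parse_pos(fdinfo_text):
    """Return the file offset from the contents of /proc/<pid>/fdinfo/<fd>."""
    for line in fdinfo_text.splitlines():
        if line.startswith("pos:"):
            return int(line.split()[1])
    return None

def snapshot(pid):
    """Map fd -> (target path, offset) for every descriptor of pid we can read."""
    result = {}
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(f"{fd_dir}/{fd}")
            with open(f"/proc/{pid}/fdinfo/{fd}") as f:
                pos = parse_pos(f.read())
        except OSError:
            # The descriptor was closed between listing and reading.
            continue
        result[fd] = (target, pos)
    return result

def moved(before, after):
    """Return sorted (path, offset delta) pairs for fds whose offset changed."""
    changed = []
    for fd, (target, pos) in after.items():
        if fd not in before:
            continue
        old_target, old_pos = before[fd]
        # Skip fds that were reused for a different file or lack an offset.
        if target != old_target or pos is None or old_pos is None:
            continue
        if pos != old_pos:
            changed.append((target, pos - old_pos))
    return sorted(changed)
```

Reading another user's /proc/<pid>/fd generally needs root or the same uid as the process. Calling snapshot() on the mysqld pid twice during a load spike and printing moved(before, after) would narrow the list of open files down to the ones whose offsets actually changed.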

I've not done much with this sort of performance debugging. I'd appreciate any help you can pass along. As far as domain-specific stuff goes, I'm going to spend some quality time tomorrow at work with the book High Performance MySQL.

-- 
Stephen R. Laniel
steve at stevereads.com
Cell: +(617) 308-5571
http://stevereads.com/
PGP key: http://stevereads.com/slaniel.key


