How not to test that mysqld is alive (openark.org)
65 points by kirubakaran on Oct 5, 2009 | 22 comments


'ps ax | grep [m]ysqld' (The regex '[f]oo' will match 'foo' but not itself.)


That's one of my favorite shell techniques.

Still, I think they're doing it wrong. For keeping daemons alive I rely on http://cr.yp.to/daemontools.html .


This is the correct answer -- when you fork the process yourself, you don't have to guess about it dying, you get SIGCHLD and can handle it accordingly.

If you use daemontools, this code is already written for you.
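
A minimal shell sketch of that idea, assuming the supervisor launches the daemon itself and the daemon stays in the foreground. The `supervise` function and its start limit are hypothetical names for illustration only; this is not how daemontools is implemented.

```shell
# supervise MAX_STARTS CMD [ARGS...]
# Hypothetical helper: run CMD as our own child, block until it exits,
# then start it again, up to MAX_STARTS starts in total.
supervise() {
    max_starts=$1
    shift
    starts=0
    while [ "$starts" -lt "$max_starts" ]; do
        starts=$((starts + 1))
        "$@" &              # fork the child ourselves
        child=$!
        wait "$child"       # returns when this particular child exits
        status=$?
        echo "start $starts: pid $child exited with status $status"
    done
}
```

Because the parent waits on its own child, there is no process table to scan and no name to match. A real supervisor would loop indefinitely and pause between restarts; the limit here only keeps the sketch easy to try.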


Can you use daemontools to monitor/kill/restart processes that run out of memory but don't exit?


Don't... You will still see someone else's 'grep mysql', as well as your own 'man mysql' and any other command that just happens to include that string :/
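
To make the false positives concrete, here is a small sketch that runs on invented, ps-style sample lines rather than a live process table. The sample data and variable names are made up for illustration; the point is only the difference between a substring match and an exact match on the command-name column.

```shell
# Invented sample in the style of "ps -eo pid,comm,args"; not real data.
sample='101 mysqld /usr/sbin/mysqld --user=mysql
202 grep grep mysql
303 man man mysql
404 vim vim notes-on-mysqld.txt'

# Substring match: counts every line that mentions "mysql" anywhere.
loose_count=$(printf '%s\n' "$sample" | grep -c 'mysql')

# Exact match on the command-name column (field 2) only.
exact_count=$(printf '%s\n' "$sample" |
    awk '$2 == "mysqld" { n++ } END { print n + 0 }')

echo "loose=$loose_count exact=$exact_count"
```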


Well, but why does that work? Doesn't it rely on the opposite outcome of the race condition mentioned in the article?


regex background: [abc] matches any of 'a', 'b', or 'c'. [m] matches only 'm'.

It works because the text to match is no longer included in the command line invoked to do the matching. Whether or not 'grep [m]ysqld' appears in the list produced by ps is irrelevant, as the regex [m]ysqld does not match the literal text "[m]ysqld". i.e. the 'm' matches, but then the regex needs a 'y' next and the match fails when it hits ']'.

tl;dr It's just a short way of matching "mysqld" without the matching pattern itself being "mysqld".
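
A quick way to convince yourself, using invented lines that stand in for 'ps ax' output (the PIDs and columns are made up). The pattern is quoted here because an unquoted [m]ysqld is also a shell glob and would be expanded to mysqld if a file of that name existed in the current directory.

```shell
# Invented listing where the grep was started with the plain pattern.
plain_listing='1234 ?     Sl 0:42 /usr/sbin/mysqld
5678 pts/0 S+ 0:00 grep mysqld'

# Invented listing where the grep was started with the bracketed pattern.
bracket_listing='1234 ?     Sl 0:42 /usr/sbin/mysqld
5678 pts/0 S+ 0:00 grep [m]ysqld'

# The plain pattern also matches the grep command line itself.
plain=$(printf '%s\n' "$plain_listing" | grep -c 'mysqld')

# The bracketed pattern matches the daemon line but not its own command line.
bracket=$(printf '%s\n' "$bracket_listing" | grep -c '[m]ysqld')

echo "plain=$plain bracket=$bracket"
```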


nope. doesn't.


I guess it was I that fell into a race condition: replying before my brain had finished parsing the command. :)

It's obvious now, that regexp does not match, for example, the "]".


Sounds very hacktastical to count the entries as a way to know if the service is alive.

If I found someone doing this on one of my production database machines, I think I'd have to dig out the paintball gun and start chasing them around, especially if the person runs around with the title of sysadmin. (I'd cut a slight bit of slack if they were a webdev.)


I wonder why they didn't try the command a few times first from the shell.

Anyone who's ever done 'ps | grep' will have noticed that the grep sometimes appears, sometimes doesn't.

Course you could do 'ps | grep | grep -v grep'


If it restarted every few hours and the test executed once a minute, then the bug would appear only once in a hundred executions. It's a good post, I think I could have made the same mistake easily.


It takes me about 10 runs to get a line of output without the grep. Sometimes less. Try it :)

EDIT:

test.sh:

  for i in {1..1000}
  do
    ps | grep grep | wc -l
  done

./test.sh | sort | uniq -c

  13    0
  987   1

Which would indeed suggest 1.3%. That likely depends on how quickly you're running it, whether it's already in memory, etc.


I'd never noticed it, even though I worked in a Unix shell for a while and used to pipe ps to grep. I suspect I often used 'grep -v' to filter the output anyway when I really needed just the non-grep lines.


You should check out monit ( http://mmonit.com/monit/ ). It does, as the name implies, monitoring of processes.


Unmentioned is that there could easily be other 'mysqld' processes running on your box, even without someone screwing with you -- a lot of commercial apps bundle their own copy, even on the desktop (like Acrobat 8 Pro).


Had they just run

  /etc/init.d/mysql start
instead of "restart," their script would have worked fine. The spurious starts when mysqld was already started would have been ignored.


And best way, that will set your mind at ease even if you’re worried that “mysql is running but not responding; it is stuck”: connect to MySQL and issue SELECT 1, SELECT NOW(), SELECT something.

Fortunately this is easy:

  $ echo "select now();" | mysql
  now()
  2009-10-05 19:24:22
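
For the "running but stuck" case, the query itself can hang, so it may help to put a time limit around it. A minimal sketch, assuming timeout(1) from GNU coreutils is available; the `probe` helper is a hypothetical name, and the commented usage assumes client credentials are already configured.

```shell
# probe TIMEOUT_SECONDS CMD [ARGS...]
# Hypothetical helper: succeed only if CMD exits 0 within the time limit.
# timeout(1) stops the command and returns non-zero once the limit passes.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
}

# Intended use:
#   probe 5 mysql -e 'SELECT 1' || echo "mysqld is not answering"
```

This treats "no answer within the limit" the same as "connection refused", which is usually what a liveness check wants.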


Informative. I wasn't aware of the fact that all of the processes would be started simultaneously, but in retrospect it makes sense.


The pgrep command mentioned in the article was new to me, and it does the job with a single program. It came preinstalled on my Debian system.

http://en.wikipedia.org/wiki/Pgrep
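
A small sketch of how that might look in a script. The `is_running` wrapper is a hypothetical name; `-x` asks pgrep to match the process name exactly instead of as a substring.

```shell
# is_running NAME: succeed if a process whose name is exactly NAME exists.
is_running() {
    pgrep -x "$1" >/dev/null
}

if is_running mysqld; then
    echo "mysqld is running"
else
    echo "mysqld is not running"
fi
```

Like any process-table check, this only says that a process with that name exists, not that the server is answering queries.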


Use "mysqladmin"

run: mysqladmin ping

output: mysqld is alive


While the future is uncertain re: Oracle, Solaris's SMF takes care of this problem automatically.



