Monthly Archive for March, 2008

How to Tune Your MySQL Server (The Easy Way)

My previous experience with tuning MySQL servers was that you should start with the example my.cnf that most closely matches your hardware and then Google around for the little tweaks that others have reportedly had good luck with. Definitive step-by-step guides on exactly what settings to change and why have historically been a little bit hard to come by. I guess that’s what the MySQL DBAs get paid for, huh?

Well now there are a couple of scripts out there that will examine your server’s runtime statistics and provide you with advice on what to change (kind of like what Bastille does for Linux).

Either one of these scripts will likely save you hours of Googling/experimenting. Both do essentially the same thing, but it never hurts to have a second opinion. And if you’re still not sure about something, a trip to MySQL Performance Blog should fix you right up.
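
To give a rough idea of what this kind of script does under the hood, here is a minimal sketch in Python. It is not taken from either script. It assumes you have already copied the output of SHOW GLOBAL STATUS into a dict, and the two thresholds are rule-of-thumb guesses of mine rather than official recommendations. The status counters (Key_reads, Key_read_requests, Created_tmp_disk_tables, Created_tmp_tables) and the settings it mentions are real MySQL names; the function names and sample numbers are made up.

```python
def ratio(status, numerator, denominator):
    """Return numerator/denominator from a status dict, or None if undefined."""
    denom = int(status.get(denominator, 0))
    if denom == 0:
        return None
    return int(status.get(numerator, 0)) / denom


def advise(status, key_miss_limit=0.01, disk_tmp_limit=0.25):
    """Turn a few runtime counters into tuning hints.

    Both limits are assumed rules of thumb, not values from MySQL itself.
    """
    advice = []

    # Share of MyISAM index reads that had to go to disk.
    key_miss = ratio(status, "Key_reads", "Key_read_requests")
    if key_miss is not None and key_miss > key_miss_limit:
        advice.append(
            "key cache miss rate is %.1f%%; consider raising key_buffer_size"
            % (key_miss * 100)
        )

    # Share of temporary tables that spilled to disk.
    disk_tmp = ratio(status, "Created_tmp_disk_tables", "Created_tmp_tables")
    if disk_tmp is not None and disk_tmp > disk_tmp_limit:
        advice.append(
            "%.0f%% of temporary tables went to disk; consider raising "
            "tmp_table_size and max_heap_table_size" % (disk_tmp * 100)
        )

    return advice


# Made-up sample counters, as strings because that is how the server reports them.
sample = {
    "Key_reads": "5000",
    "Key_read_requests": "100000",
    "Created_tmp_disk_tables": "10",
    "Created_tmp_tables": "1000",
}

for hint in advise(sample):
    print(hint)
```

The real scripts check far more than two counters, and the ratios only mean something once the server has been running under normal load for a while.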

Windows 2008 Telnet (not SSH) Server

Have you heard that Windows 2008 will be able to run in a command-line only mode, but will continue to ship with a telnet server instead of SSH? This is awesome, seeing as how telnet is an insecure, antiquated method of remote access that should not be used by anyone under any circumstances. Congratulations Microsoft! Welcome to the 1970s! Should we expect the SSH server in Windows Server 2033?

Seriously, what the fuck are those people doing over there?

Update: According to Microsoft, there will be “a technology like this included in Windows Server 2008 called WinRS; or Windows Remote Shell. This command line tool allows administrators to remotely execute most cmd.exe commands using the WS_Management protocol.” Too bad it sucks!

See Also: “Not Invented Here” Syndrome

Thoughts on GUI Automation

Until today, I never thought much about writing scripts that rely on a GUI. I knew there were tools out there to do it, but I never looked into it, because it always seemed kind of tasteless to me; like something a VB programmer would do out of ignorance. And then when I did some research for a project I was working on today, I found confirmation of that in some hilarious quotes about one particular GUI scripting language being “the nicest procedural programming language I’ve [sic] ever worked with.” Obviously these people spend too much time dragging around windows and clicking on buttons to know what the hell they’re talking about, so before I go any further with this post, I’m going to give you a list of reasons why it’s generally not a good idea to write a script for a GUI:

  • It requires a GUI. That’s a lot of overhead for a script. So you either leave your computer logged in with some predetermined user account, or you somehow script the logon and logoff part too. Both methods are insecure and a waste of system resources.
  • It’s basically reverse “screen scraping,” and the same caveats apply. Your script is relying on display elements that are more or less guaranteed to change in the future (upgrades, etc.). That means your script will most likely have a very short lifespan, lots of updates, or both.
  • It’s an inefficient method of programming. A GUI is designed for interaction between a human and a computer; not between a computer and itself. All those mouse clicks are translated into machine instructions anyway, so if I already know the machine instructions I want to execute, why would I want to waste time telling the computer where to type text and click the mouse? A GUI script is working with an unnecessary layer of abstraction, which is only useful when building Rube Goldberg machines, and Rube Goldberg machines have no place in any IT department.

There are a few situations, however, where automating a GUI is a very good idea. The most obvious use for GUI automation is in unit testing and software quality control. Obviously, good unit tests are a lot cheaper and less error-prone than QA people, and from what I can tell, software developers are the target audience for most commercial GUI scripting tools.

Another good use for GUI automation (as I discovered today) is when working with shitty old software whose proprietary databases you have no hope of accessing with Python. In my case, I had to come up with a way to pull a report out of the biometric security system for one of our datacenters and email it to someone on a weekly basis. My first thought was to connect directly to the database using the Python DB-API and do the query myself, but I gave up on that idea when I discovered that not only was everything written in German (the filenames, the documentation, everything), but the actual “database” was just a bunch of crusty old binary files.

The ultimate solution was a free scripting language called AutoIt. It isn’t exactly pretty, but it works. And seeing as how the GUI for our security system looks like something out of Windows 95, I figure I probably won’t have to worry about it changing and breaking my script any time soon.
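
The emailing half of a job like this doesn’t need the GUI at all. Here is a rough Python sketch of that part, assuming the GUI script has already exported the report to a file. This is not the script I actually used; the function name, subject line, addresses, and file name are all placeholders.

```python
from email.message import EmailMessage


def build_report_email(report_bytes, filename, sender, recipient):
    """Build a message with the exported report attached (nothing is sent here)."""
    msg = EmailMessage()
    msg["Subject"] = "Weekly access report"  # placeholder subject
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The weekly report is attached.")
    # Attach the raw bytes as a generic binary file.
    msg.add_attachment(
        report_bytes,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg


# Placeholder data; in practice the bytes would be read from the exported file.
message = build_report_email(
    b"example report contents",
    "report.txt",
    "reports@example.com",
    "someone@example.com",
)
print(message["Subject"])
```

Sending it is then a matter of handing the message to smtplib (for example, smtplib.SMTP("localhost").send_message(message), assuming a mail relay is listening there) and running the whole thing from a weekly scheduled task.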

More Hardware Upgrades

Hopefully you’ve noticed some drastic improvements in load times, as I’ve moved the website to new hardware. My old “server” was a little Mini-ITX box with a 1GHz VIA Nehemiah CPU, 256MB of memory, and a laptop hard drive. It certainly was quiet and power efficient, but it was frustratingly slow sometimes. The new server is actually my old desktop with an AMD Athlon XP 2500+ CPU, 512MB of memory, and two brand new 750GB WD Caviars running in a RAID 1 array (yes, these drives are dead silent like the reviews say). I decided to go with Linux software RAID after reading some interesting opinions on Linux Software RAID vs Hardware RAID. I also found some surprising benchmarks showing that Linux software RAID is actually faster than a lot of consumer-level SATA RAID cards.

Compared to the old hardware…

root@old-server:~# hdparm -Tt /dev/hda
 
/dev/hda:
 Timing cached reads:   278 MB in  2.01 seconds = 138.26 MB/sec
 Timing buffered disk reads:   88 MB in  3.00 seconds =  29.32 MB/sec

the new hardware shows huge improvements in disk transfer rates, pretty much in line with what I expected.

root@new-server:~# hdparm -tT /dev/md0
 
/dev/md0:
 Timing cached reads:   950 MB in  2.00 seconds = 474.49 MB/sec
 Timing buffered disk reads:  222 MB in  3.01 seconds =  73.81 MB/sec

With twice as much memory as before, I’m also not swapping nearly as much (or really at all for that matter). But what I didn’t expect was for Linux software RAID and LVM2 to have such a big impact on my load average. It actually seems to be around 10-20% higher on average, even with a much faster CPU. Interesting…
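
To put a number on “huge improvements,” the two hdparm runs above can be compared directly. This is just a quick sketch: the helper name and the regular expression are my own, and it only understands the two “Timing …” lines that hdparm -tT prints.

```python
import re


def parse_hdparm(output):
    """Extract the MB/sec figures from hdparm -tT output, keyed by test name."""
    rates = {}
    for line in output.splitlines():
        match = re.match(r"\s*Timing (.+?):.*=\s*([\d.]+) MB/sec", line)
        if match:
            rates[match.group(1)] = float(match.group(2))
    return rates


old = parse_hdparm("""
/dev/hda:
 Timing cached reads:   278 MB in  2.01 seconds = 138.26 MB/sec
 Timing buffered disk reads:   88 MB in  3.00 seconds =  29.32 MB/sec
""")

new = parse_hdparm("""
/dev/md0:
 Timing cached reads:   950 MB in  2.00 seconds = 474.49 MB/sec
 Timing buffered disk reads:  222 MB in  3.01 seconds =  73.81 MB/sec
""")

# Ratio of new to old throughput for each test.
speedup = {name: new[name] / old[name] for name in old}

for name, factor in speedup.items():
    print("%s: %.2fx faster" % (name, factor))
```

That works out to roughly 3.4x for cached reads and 2.5x for buffered disk reads. Keep in mind that the cached figure mostly reflects CPU and memory speed rather than the disks themselves.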

How to PROPERLY Choose your Internal DNS Domain

One of my biggest IT-related pet peeves is a broken DNS infrastructure. Since nobody seems to know how to implement this properly, I have decided to write a little howto to help put an end to the insanity.

  • Don’t just use whatever the hell domain name you want and justify it by saying “we’ll only be using this domain internally, so it doesn’t matter if we actually own it or not.” That’s just as dumb as using someone else’s public IP addresses on your LAN, and if you don’t understand what’s wrong with that, you’re fired. Make sure the domain you want to use is unique on the internet, and register it.
  • Do use a standard TLD; not that .local bullshit. Using a non-standard TLD like .local is a great way of showing the world that you have absolutely no taste (see below).
  • Don’t go out and register two entirely different domains (e.g. example.com and example.net) for your internal and external namespaces. This is unnecessary, will confuse your users, and will tell the world that you don’t understand how DNS works. Just use sub-domains (e.g. hq.example.com, office.example.com, etc.) for your internal networks, and reserve the root domain (i.e. example.com) for your external resources.
  • Do use different internal and external namespaces. If your external namespace is example.com, don’t use example.com for your internal (i.e. Active Directory) namespace too. Otherwise, you’ll run into problems when your internal users can’t resolve external resources (like your website which may be hosted off-site). If you were stupid enough to make this mistake, one solution is to mirror all your external resource records on your internal DNS servers, but then you’ll have to add/change every record in two places.
  • Do run your internal and external namespaces on separate servers (or at least in separate views). It’s not a good idea to make your internal resource records available to the whole world in the first place, although if you’re using proper private IP addresses on your LAN, those records won’t help anyone access your servers over the internet anyway.

Before you ask, if you think you have a good reason not to follow any of the above rules, you are wrong. Don’t do it. I’m begging you.
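
For the rules that can be checked mechanically, here is a small Python sketch. It is only an illustration: the function name and messages are mine, and it can’t tell you whether you actually registered the domain or whether your servers are properly separated.

```python
def check_internal_domain(internal, external):
    """Return a list of rule violations for a proposed internal DNS domain."""
    internal = internal.lower().rstrip(".")
    external = external.lower().rstrip(".")
    problems = []

    # Rule: use a standard TLD, not .local.
    if internal == "local" or internal.endswith(".local"):
        problems.append("uses the non-standard .local TLD")

    # Rule: internal and external namespaces must differ...
    if internal == external:
        problems.append("internal and external namespaces are identical")
    # ...but the internal one should be a sub-domain of the external one.
    elif not internal.endswith("." + external):
        problems.append("not a sub-domain of the external domain")

    return problems


for candidate in ("hq.example.com", "example.com", "example.net", "corp.local"):
    issues = check_internal_domain(candidate, "example.com")
    print(candidate, "->", "; ".join(issues) if issues else "ok")
```

Only hq.example.com passes. The bare root domain, the second registered domain, and the .local name each break at least one of the rules above.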