The Real or Official MySQL? Does Not Matter!

Posted by Jeremy@Zawodny.com | Internet | Tuesday 31 March 2009 8:23 am

Yesterday Patrick Galbraith asked “What is the official branch of MySQL?”, which got a lot of attention, including on Slashdot (and the token PostgreSQL comments quickly appeared).

Here’s the funny thing. It doesn’t matter anymore. Patrick’s question is interesting in an academic sense, but it’s mainly a distraction from what really matters. (Hint: What’s the official Linux and who really cares? Ubuntu? RedHat? Debian? CentOS?)

Storage Engines

Nowadays what matters is the set of available storage engines. InnoDB, Percona’s XtraDB, PrimeBase’s PBXT, Maria, Falcon, and several others are available or will be soon. I predict that for the foreseeable future, any MySQL distribution or derivative must support the storage engine plug-in API that MySQL 5.1 defined. And since that’s the case, it largely won’t matter which flavor you’re using.

Protocol(s)

Look at what’s happened in the world of key/value databases in the last few years. More than a few of them speak the memcached protocol, either as their native, default protocol or as an optional add-on. I suspect the same thing will be the case here. All MySQL distributions and derivatives will speak the “traditional” MySQL protocol (just like memcached has the old protocol). Some of them, notably Drizzle, will have other (newer, better) protocols available as well (much like memcached has the new binary protocol).
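To make the protocol point concrete, here is a minimal Python sketch of the old memcached text protocol. The helper names (`build_set`, `build_get`, `parse_get_response`) are made up for illustration; only the wire format itself comes from the memcached protocol. It builds and parses raw bytes without opening a socket.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a text-protocol 'set' request: header line, data block, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a text-protocol 'get' request for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes) -> dict:
    """Decode a complete 'get' reply into {key: value}.

    Each hit looks like 'VALUE <key> <flags> <bytes>\\r\\n<data>\\r\\n',
    and the reply always ends with 'END\\r\\n'.
    """
    results = {}
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        line = data[pos:line_end]
        if line == b"END":
            return results
        _, key, _flags, length = line.split(b" ")
        start = line_end + 2
        size = int(length)
        results[key.decode("ascii")] = data[start:start + size]
        pos = start + size + 2  # skip the data block and its trailing CRLF
```

Any server that accepts these bytes can sit behind the same client code, whatever it does internally. That is the property the “traditional” MySQL protocol would give the various distributions and derivatives.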

Summary

In summary, the choice of MySQL version or derivative won’t matter as much as you might think because they’ll have the same Storage Engine plug-ins available (thanks to the shared plug-in API), they’ll all speak a common protocol (this may not be true for replication–watch that area closely), and will largely offer the same subset of SQL and SQL extensions.

They’ll all be supported by different groups/companies (including some “database appliance” vendors), will all be tuned differently and aimed at slightly different use cases, and will certainly benefit from a lot of cross-pollination.

That doesn’t sound so bad to me.

The fact that nobody can point to the “real” MySQL in a few years just won’t matter. Does anyone ask (anymore) which is the “real” Linux? Nope. And for very similar reasons. Think of MySQL as “kernel” and Storage Engine as “filesystem” and you’ll realize we’ve been down this road before.

We’re looking at the upgrade from 5.0 to 5.1 soon at Craigslist and don’t know if we’ll be using InnoDB or XtraDB yet. Time will tell.

See Also: The New MySQL Landscape, which I wrote a few months back–before a good chunk of the MySQL team had left Sun.


Enable Visual Effects in Ubuntu to Increase Performance

Posted by Jeremy@Zawodny.com | Internet | Tuesday 24 March 2009 12:29 pm

For a while now, I’d been running Ubuntu 8.04 on my Thinkpad T61 (work machine) with Visual Effects disabled.

Why?

There were weird bugs with compiz and xterm that caused corruption at times. So I shut it off and never thought about it again. But a few days ago, I upgraded to 8.10 despite the apparent increase in WiFi related lock-ups I can expect to see (apparently I don’t have the Intel wireless in this machine… grumble).

Switching virtual desktops, or “Workspaces” as they’re called, seemed to be even slower than before–almost intolerable. Just for kicks I decided to go play with the settings.

Ubuntu Visual Effects Preferences

Imagine my surprise when switching that selection from “None” to “Normal” resulted in a dramatic increase in virtual desktop redraw performance.

Yay!

Counterintuitive, but yay anyway.


New Craigslist Search Features

Posted by Jeremy@Zawodny.com | Internet | Thursday 5 March 2009 4:47 pm

I haven’t said a lot here about what I’ve been working on at Craigslist recently. But Craig mentioned me today in his blog and that made me remember that I should say something. :-)

Much of my work has been behind the scenes infrastructure stuff, but some of that is translating into new features that craigslist users can see. And, as of this morning, a lot more users are seeing the fruits of that labor.

As I noted a few weeks back in Sphinx Search at Craigslist, I’ve been hacking a lot on search. Here’s a screen shot to show you what I’ve been calling “nearby search” (though “nearby results” is probably more appropriate).

Craigslist Nearby Results in Toledo

If you run a search in a city and there aren’t many results, we’ll also run the search in nearby areas to see if we can find matches there too. The above example was a search for “2008 mazda” in my hometown of Toledo, Ohio. The “nearby” results are clearly separated from local matches and local matches are still given priority.
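The fallback described above can be sketched roughly like this in Python. This is not the Craigslist implementation: the `search` callable, the `nearby_areas` mapping, and the `min_local` threshold are all hypothetical stand-ins, since the post does not say how “not many results” is defined or how nearby areas are chosen.

```python
from typing import Callable, Dict, List


def search_with_nearby(
    query: str,
    area: str,
    search: Callable[[str, str], List[str]],
    nearby_areas: Dict[str, List[str]],
    min_local: int = 10,  # hypothetical cutoff for "not many results"
) -> Dict[str, List[str]]:
    """Run a local search and widen it to nearby areas only when local hits are sparse."""
    local = search(query, area)
    nearby: List[str] = []
    if len(local) < min_local:
        for other in nearby_areas.get(area, []):
            nearby.extend(search(query, other))
    # Keep the two groups separate so local matches can be shown first.
    return {"local": local, "nearby": nearby}
```

Returning the two lists separately, instead of merging them, mirrors the behavior in the screen shot: nearby matches never push local ones down the page.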

The feedback has been generally positive so far. Though, as with any change, some folks aren’t happy. I can’t say it’s going to stay in this exact form. We may need to tweak the interface, the radius of the nearby search areas, and so on. But on the whole I think it’s a helpful improvement when you’re looking for something that’s a bit harder to find and you’re willing to drive an hour or two.

As of earlier today, it’s available in most small and medium-sized US cities. It’ll probably come to the remainder of cities before long too. I’ve been testing it for about a week and a half, starting with about a dozen cities and then adding about twenty more late last week. This morning I mostly flipped the big switch.

Of course, this opened the flood gates for similar feature requests: custom radius searches, state wide searching, search ALL of craigslist, etc.

In related news, a couple months back I expanded the search help page to include advanced search syntax, including grouping, negation, OR queries, and more.


Sphinx Search at Craigslist

Posted by admin | Internet | Friday 16 January 2009 10:55 am

A couple days ago, Andrew posted a news item titled “Sphinx goes billions” to the Sphinx web site. An excerpt:

Last but not least, Powered By section, now at 113 sites and counting, was updated and restyled. I had long wondered how much Sphinx search queries are performed per month if we sum all the sites using it, and whether we already hit 1B page views per month or not. Being open-source, there’s no easy way to tell. But now with the addition of craigslist to Powered By list I finally know that we do. Many thanks to Jeremy Zawodny who worked hard on making that happen, my itch is no more. :-)

Well, I guess the cat’s out of the bag! My first project at Craigslist was replacing MySQL FULLTEXT indexing with Sphinx. It wasn’t the easiest road in the world, for a variety of reasons, but we got it all working and it’s been humming along very well ever since. And I learned a heck of a lot about both Sphinx and craigslist internals in the process too.

I’m not going to go into a lot of details on the implementation here, other than to say Sphinx is faster and far more resource efficient than MySQL was for this task. In the MySQL and Search and Craigslist talk I’m giving at the 2009 MySQL Users Conference, I’ll go into a lot more detail about the unique problems we had and how we solved them.

For what it’s worth, the implementation isn’t really done. I did update the search help page on the site to reflect some of the capabilities (hey, look! OR searches!) but there are features I have planned that I’d like to expose as time allows.
