The Real or Official MySQL? Does Not Matter!

Posted by Jeremy@Zawodny.com | Internet | Tuesday 31 March 2009 8:23 am

Yesterday Patrick Galbraith asked “What is the official branch of MySQL?”, which got a lot of attention, including on Slashdot (where the token PostgreSQL comments quickly appeared).

Here’s the funny thing. It doesn’t matter anymore. Patrick’s question is interesting in an academic sense, but it’s mainly a distraction from what really matters. (Hint: What’s the official Linux and who really cares? Ubuntu? Red Hat? Debian? CentOS?)

Storage Engines

Nowadays what matters is the set of available storage engines. InnoDB, Percona’s XtraDB, PrimeBase’s PBXT, Maria, Falcon, and several others are available or will be soon. I predict that for the foreseeable future, any MySQL distribution or derivative must support the storage engine plug-in API that MySQL 5.1 defined. And since that’s the case, it largely won’t matter which flavor you’re using.

Protocol(s)

Look at what’s happened in the world of key/value databases in the last few years. More than a few of them speak the memcached protocol, either as their native default or as an optional add-on. I suspect the same thing will be the case here. All MySQL distributions and derivatives will speak the “traditional” MySQL protocol (just like memcached has the old protocol). Some of them, notably Drizzle, will have other (newer, better) protocols available as well (much like memcached has the new binary protocol).

Summary

In summary, the choice of MySQL version or derivative won’t matter as much as you might think because they’ll have the same Storage Engine plug-ins available (thanks to the shared plug-in API), they’ll all speak a common protocol (this may not be true for replication, so watch that area closely), and they’ll largely offer the same subset of SQL and SQL extensions.

They’ll all be supported by different groups/companies (including some “database appliance” vendors), will all be tuned differently and aimed at slightly different use cases, and will certainly benefit from a lot of cross-pollination.

That doesn’t sound so bad to me.

The fact that nobody can point to the “real” MySQL in a few years just won’t matter. Does anyone ask (anymore) which is the “real” Linux? Nope. And for very similar reasons. Think of MySQL as “kernel” and Storage Engine as “filesystem” and you’ll realize we’ve been down this road before.

We’re looking at the upgrade from 5.0 to 5.1 soon at Craigslist and don’t know if we’ll be using InnoDB or XtraDB yet. Time will tell.

See Also: The New MySQL Landscape, which I wrote a few months back, before a good chunk of the MySQL team had left Sun.


Playing With CouchDB: First Impressions

Posted by Jeremy@Zawodny.com | Technology | Tuesday 10 February 2009 11:58 am

About a week ago, Nat posted Open Source NG Databases on O’Reilly Radar. That caught my interest because I’m playing with some “alternative” databases for some of our data at Craigslist. Don’t get me wrong, MySQL is great. But MySQL isn’t well suited to every use case out there either. (I’ll talk more about this at the MySQL Conference.)

Meanwhile, I left a comment on that posting about CouchDB and have been playing with it a bit more since then, mostly loading in test data and figuring out the data footprint, performance, etc.

Overall, I’m impressed and encouraged. I agree with what Ben Bangert said. The simple API is great, but it’s the lack of a schema to worry about that really makes my life simple in this application. I don’t have any initial plans for views, but writing them in JavaScript is an interesting idea. I can definitely appreciate the flexibility there. And having good replication built in solves one of my big needs.
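
To give a sense of how little ceremony the API involves, here is a minimal sketch in Python of storing one schema-less document over HTTP. It is only an illustration, not the code used for these tests. It assumes a CouchDB server on localhost at the default port 5984; the database name `postings`, the document ID, and the document fields are all hypothetical.

```python
import json
import urllib.request

# Assumption: a local CouchDB listening on its default port, 5984.
COUCH_URL = "http://localhost:5984"


def build_put_request(db, doc_id, doc, base_url=COUCH_URL):
    """Build (but do not send) the HTTP request that stores one document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request; needs a running CouchDB and an existing database."""
    with urllib.request.urlopen(request) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


# Hypothetical document: no schema to declare, just whatever fields exist.
posting = {"title": "Free couch", "category": "free", "tags": ["furniture"]}
request = build_put_request("postings", "posting-0001", posting)
```

Calling `send(request)` against a live server should store the document. The point of the sketch is that the document is just JSON sent with a `PUT` to `/database/document-id`, with no table definition beforehand.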

I’m sure my thinking will have evolved after I’ve loaded a few hundred million documents in, but so far I’m really liking it. The CPAN modules in Net::CouchDb do a pretty good job and get you up and running quickly. I had a knee-jerk response to tweak a few things there but quickly realized that they’re far from being the bottleneck anyway.

It seems that without any tuning or fancy work, I can get about 75-100 inserts/sec on my desktop-class Ubuntu box (Intel Core 2 Duo, 2.66GHz, 1GB RAM, single 80GB SATA disk). That’s not bad for out-of-the-box performance. And doing the math on space used for a document set (after compaction), I’m seeing roughly 3KB/doc. That’s a bit more than I expected but really not bad at all.
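
Those two numbers make for a quick back-of-the-envelope projection. The sketch below is just arithmetic on the figures above, with one loud assumption: “a few hundred million” documents is taken to be 200 million, which is a stand-in and not a real count. It also assumes the insert rate and per-document size stay constant as the database grows, which is optimistic.

```python
# Figures from the measurements above.
INSERTS_PER_SEC = (75, 100)   # observed range, untuned
BYTES_PER_DOC = 3 * 1024      # roughly 3KB per document after compaction

# Assumption: "a few hundred million" documents taken as 200 million.
DOC_COUNT = 200_000_000


def load_days(doc_count, inserts_per_sec):
    """Days of continuous inserting at a steady rate."""
    return doc_count / inserts_per_sec / 86_400


def footprint_gib(doc_count, bytes_per_doc=BYTES_PER_DOC):
    """On-disk size in GiB, assuming per-document size stays constant."""
    return doc_count * bytes_per_doc / 1024**3


# The slower rate (75/sec) comes first in the tuple, so it yields more days.
slow, fast = (load_days(DOC_COUNT, rate) for rate in INSERTS_PER_SEC)
print(f"load time: {fast:.1f} to {slow:.1f} days")
print(f"footprint: {footprint_gib(DOC_COUNT):.0f} GiB")
```

Under those assumptions the load works out to somewhere around 23 to 31 days of continuous inserting and a bit over 570 GiB on disk, far more than a single 80GB disk holds. The untuned desktop numbers are a starting point, not a plan.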

I wonder if there’s a future for gzip compression in CouchDB. Or maybe we should just use ZFS…
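
One rough way to gauge what compression might buy is to gzip a representative JSON document outside the database and compare sizes. This sketch does that with Python’s standard library. The sample document is hypothetical and its repeated body text compresses unrealistically well, so treat the ratio as an upper bound on optimism, not a prediction. It also says nothing about how CouchDB stores documents on disk.

```python
import gzip
import json

# Hypothetical sample document; real documents will compress differently.
doc = {
    "_id": "posting-0001",
    "title": "Free couch",
    "category": "free",
    "body": "Comfortable couch, free to a good home. Pick up only. " * 20,
}

raw = json.dumps(doc).encode("utf-8")
packed = gzip.compress(raw)

ratio = len(packed) / len(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes, ratio: {ratio:.2f}")
```

Running the same measurement over a sample of real documents would give a far more honest number than this toy input.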
