15 December 2007

Amazon's SimpleDB

Amazon is entering the DBMS market... with a non-DBMS?

Let's look at the article found at http://aws.typepad.com/aws/2007/12/a-place-for-eve.html.

"Amazon SimpleDB makes it really easy and straightforward to store and to retrieve structured data. You no longer need to worry about creating, maintaining, or migrating database schemas, monitoring and tuning the performance of your queries, outgrowing the storage or processing capacity of your database server, making backups, or replicating data."

Already I am confused... 'retrieve structured data,' 'no longer need to worry about creating... database schemas.' But... structured data... schema... structure... schema... Do you see my confusion? Let's read on...

"Instead you simply create up to 100 SimpleDB domains (each of which can hold up to 10 GB of data, for a total of 1 TB) and then start to store structured data in the form of items."

So, create a database and put some tables in it? 'Item' is a very vague term. 'Domain' has a few meanings in the IT world, and a very specific meaning in the database world. But everyone brands their own terminology...

"Each item consists of multiple name/value pairs (which we call attributes)"

Novel idea.

"With SimpleDB there is no need for a time-consuming schema change when you need to store additional information in your database. You simply store the additional attributes as desired."

So domain became database. And we are adding columns where desired. How is this not maintaining a schema? More accurately, this is what programmers often do when there isn't a DBA around to smack their wrists... So again, I fail to see any innovation.
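
To make that point concrete, here is a minimal sketch in plain Python. It is not Amazon's API, and the item names, attribute names, and helper functions are all hypothetical. Adding an attribute "where desired" costs nothing at write time, but every reader then has to cope with items that lack it until someone backfills them, which looks a lot like schema maintenance moved into application code.

```python
# Hypothetical items as plain dicts; an illustration, not Amazon's API.
items = {
    "cust-1": {"name": "Ann", "email": "ann@example.com"},
    # A newer entry that already carries the added "phone" attribute.
    "cust-2": {"name": "Bob", "email": "bob@example.com", "phone": "555-0100"},
}

def missing_attribute(items, name):
    """Return the sorted keys of items that do not carry the attribute."""
    return sorted(key for key, attrs in items.items() if name not in attrs)

def backfill(items, name, default):
    """Give every item lacking the attribute a default value."""
    for key in missing_attribute(items, name):
        items[key][name] = default

print(missing_attribute(items, "phone"))  # ['cust-1']
backfill(items, "phone", "unknown")
print(missing_attribute(items, "phone"))  # []
```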

"For example, if you were building a tag cloud to represent information about a collection of web sites, you could store the site URL as the first attribute and the entire set of tags as the second."

So it has object oriented capabilities? Cool! There are people that would really like to see that functionality.
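
For what it is worth, here is how I read the data model in the quote, sketched in plain Python: a domain holds items, and each item is a bag of name/value pairs in which one name can carry several values. This is only an in-memory stand-in, not Amazon's API; `domain`, `put_attributes`, `items_with`, and the example URLs are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for one domain: item name -> attribute name -> set of values.
# An illustration of the model described in the quote, not Amazon's API.
domain = defaultdict(lambda: defaultdict(set))

def put_attributes(item_name, pairs):
    """Attach (name, value) pairs to an item; repeating a name adds another value."""
    for name, value in pairs:
        domain[item_name][name].add(value)

def items_with(name, value):
    """Return the sorted names of items that hold the given attribute value."""
    return sorted(item for item, attrs in domain.items()
                  if value in attrs.get(name, ()))

put_attributes("site-1", [("url", "http://example.com"),
                          ("tag", "books"), ("tag", "retail")])
put_attributes("site-2", [("url", "http://example.org"), ("tag", "books")])

print(items_with("tag", "books"))   # ['site-1', 'site-2']
print(items_with("tag", "retail"))  # ['site-1']
```

Whether you store the tags as one multi-valued attribute or as a single delimited string is exactly the kind of design decision a schema would normally record.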

"After the system has been running for a while, you decide to add a thumbnail for each URL (of course the Alexa Site Thumbnail Service would be perfect for this) and simply add a third attribute to the new entries. Later, as desired, you can go back and add this attribute to the older entries. This ability to improve your data model on a dynamic, as-needed basis makes Amazon SimpleDB a perfect match for today's fast-paced world of agile development, where flexibility and adaptability are of paramount importance."


"Applications which require long-running queries and/or complex table joins, such as those for data warehouse applications, are probably not a good fit for SimpleDB today. While RDBMS offerings provide deep functionality, for many use cases, they introduce more complexity (and more cost) than is necessary. Many developers simply want to store, process, and query their data without worrying about managing schemas, maintaining indexes, tuning performance or scaling access to their data."

This sounds like data management, '80s style.

I love Amazon. I think that they are a great company with incredible customer service. It took them years to get in the black and I rooted for them the whole way. Even this nonsense does not turn me off to them. However, I do hope that they revisit their presentation. There are many many reasons why the relational model is better than flat-file and hierarchical models. Dismissing the relational model, RDBMSes and DBAs as by and large unnecessary is ludicrous, especially when you can't even get away from the basic concepts and terminology in your 'solution.' I also doubt that SimpleDB will be able to scale very large while remaining cost effective--it is simple, after all.

After glancing at the links referenced at the bottom of the article, particularly the FAQ, this really does not sound like a bad product for what it is intended to do, or at least for how I could see it being used in practice: it is a redundant, affordable directory service for data, which has great potential for web applications. How they are billing it, though, is rather ridiculous. I really hope that I have completely misunderstood this and/or that this will have a positive impact on the data management world.


the Monk said...

Slashdot has an interesting discussion going here.

'Amazon SimpleDB is designed to store relatively small amounts of data and is optimized for fast data access and flexibility in how that data is expressed.'

That sounds a bit more reasonable. It is still functionality that can easily be implemented by existing DBMSes, but with automatic redundancy and scalability (in the utilization sense).

Still, I highly doubt that it is appropriate for 80% of database applications.

peter said...

"There are many many reasons why the relational model is better than flat-file and hierarchical models."

You're missing the point though. It scales way beyond your regular old db. Cheaply. Without complex administration. That's the pain point of a lot of webapps, and that's what they're solving.

the Monk said...

Throw that much hardware at a 'regular old db' and it will scale too; look at Google.

I do however see your point. And I do hope that SimpleDB is a good solution for smaller web apps. As well, I do hope that I am missing something... Something more than... Berkeley DB.

I hope that you see one of my points: the cost scales with the performance. And the performance is limited to 'relatively small amounts of data.' All things being relative, that is very vague.

I just think that many people are setting high expectations without really reading. Amazon is making marketing claims (that is more polite than 'FUD,' right?) about something that technologically is not that revolutionary (note the qualifier 'that': a redundant DS (Directory Service, not Nintendo) is pretty cool). And without a great step forward, you are not going to be able to get rid of us evil DBAs with our crazy normalization.

Anonymous said...

Google doesn't just "throw hardware at a regular old db". They've already built something just like this (http://labs.google.com/papers/bigtable.html) and use it throughout their application base. Amazon's just finally catching up.

Perhaps you ought to be catching up too. This is the future of databases!

the Monk said...

Your first paragraph was spot on. I was not referring to Google's use of Bigtable, but to their gignormous (that's a word, right?) MySQL environment. But then you got belligerent... And then silly.

Google is not claiming that Bigtable replaces the relational model and DBAs and that their product works for 80% of applications--if you read about it, they actually say 'In many ways, Bigtable resembles a database.' It fulfills a specific need. Google is too smart to dismiss the relational model. If you read what I said, I do see merit in Amazon's offering; I just do not like the marketing and think many are failing to see the limitations.

If directory services are the future of databases, why does Google bother with MySQL? And why don't they market Bigtable (which has been around for years and is a proven solution)? Because they are not idiots. It is great for what it does, but it is also a specialized service.