RAD Blog & Commentary

EnterpriseDB announces clustered database in Amazon's EC2

EnterpriseDB announced that their PostgreSQL-based database offering will be available inside Amazon's EC2 cloud. Some details are available from Information Week.


Hosted PostgreSQL services have been around for years, but a database cluster that can scale rapidly is less common. The appeal of Amazon's compute cloud is that you can start off small, with low resource utilization, and have an application running in the cloud fairly inexpensively. You can then devote more resources to the application on an as-needed basis. EC2 also lets you keep costs down during less busy times of the month (or year) by paying on a metered basis. Traditionally, scaling the database tier up quickly has required careful application architecture to make the application compatible with replication- or partitioning-based solutions. Oracle's RAC offers an alternative approach, but the cost of scaling a RAC cluster from 2 nodes to 20 in a short period of time exceeds the cash most web companies have on hand.


What I'm interested in seeing is what constraints EnterpriseDB/Elastra puts on the application and database. Elastra claims their cluster is 'infinitely scalable', and EnterpriseDB claims their database is going to be transactional. Over the years I've looked into a lot of ways of clustering databases, especially PostgreSQL-based ones, and it always comes down to accepting some set of constraints or limitations, often involving a trade-off between transactional consistency and performance. It will be interesting to see what trade-offs this solution involves and how it works.

Flash adds support for H.264

Interestingly enough, Adobe has added H.264 support in Flash. I wonder what this means for VP 6.2?

Here's an article about it

Migrating data in the JCR

We’ve been trying to come up with a migration strategy for moving data between two versions of our JCR/Jackrabbit-based application. This was motivated by the following goals:

  1. we were previously storing large media files, including images and video, as binary properties of a node, and we wanted to store them as individual files on the filesystem instead
  2. we had some changes to our nodetype definitions that involved incrementing the version number of our namespace
  3. we wanted the ability to switch persistence managers

Extracting the media files sounded simple: just write a program to iterate through nodes with media items, save the content to the filesystem and refactor the node structure. This worked fine for the current version of each node, but our workspace is versioned and our application needs to work with historical versions of the nodes. Older versions of nodes are frozen and can’t be changed in Jackrabbit.

Changing the node definitions and persistence managers required exporting the content and adding it to a new repository. However, as far as I can tell, the Jackrabbit import/export features do not allow you to restore old versions. This means that we would not be able to migrate our version history.

We tried using the exportsysview command to export our repository as XML, running the result through an XML transformation to remove any versionHistory and baseVersion properties, switching our node type definitions and re-importing the data into a new repository. We excluded the binary data from our XML export and ended up with a 22 MB file. When we tried to import this through Jackrabbit’s importxml we kept getting OutOfMemory exceptions from the JVM. While we eventually got the import to work on a Win64 machine using 8 GB of memory, this isn't a practical long-term solution. Needing 8 GB to import 22 MB of data just doesn't cut it.
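A minimal sketch of the filtering step described above, written in Python with only the standard library. It is an illustration, not the transformation used for the migration. It assumes the export is in the JCR system view format, where properties appear as sv:property elements carrying an sv:name attribute, and that the two properties are named jcr:versionHistory and jcr:baseVersion. The function name, the sample document and its UUID values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the JCR system view XML format.
SV_NS = "http://www.jcp.org/jcr/sv/1.0"
ET.register_namespace("sv", SV_NS)

# Assumed full names of the two properties the post says were removed.
VERSIONING_PROPS = {"jcr:versionHistory", "jcr:baseVersion"}

# Hypothetical sample export of a single node.
SAMPLE = """<sv:node xmlns:sv="http://www.jcp.org/jcr/sv/1.0" sv:name="article">
  <sv:property sv:name="jcr:primaryType" sv:type="Name"><sv:value>nt:unstructured</sv:value></sv:property>
  <sv:property sv:name="jcr:baseVersion" sv:type="Reference"><sv:value>hypothetical-uuid-1</sv:value></sv:property>
  <sv:property sv:name="jcr:versionHistory" sv:type="Reference"><sv:value>hypothetical-uuid-2</sv:value></sv:property>
</sv:node>"""


def strip_versioning_properties(xml_text, names=VERSIONING_PROPS):
    """Return (filtered_xml, removed_count) for a system view document."""
    root = ET.fromstring(xml_text)
    prop_tag = f"{{{SV_NS}}}property"
    name_attr = f"{{{SV_NS}}}name"
    removed = 0
    # Snapshot both levels with list() because the tree is mutated in the loop.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == prop_tag and child.get(name_attr) in names:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    filtered, removed = strip_versioning_properties(SAMPLE)
    print(removed, "properties removed")
    print(filtered)
```

Note that this sketch parses the whole document in one go, so it says nothing about the memory behaviour of the import itself.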

What we ended up doing was writing a program that iterates through each of the nodes in the repository in the following fashion: run an exportxml on the node but not its children, apply any filters to the XML and then run importxml to bring the data into a second repository; saving the node at this point is important. Repeat for all children. Result: success.

This worked because we don’t use a lot of references, and they are structured in a way that let us export all of the reference targets first. In any event, Jackrabbit's import/export mechanism needs a longer look.
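The "reference targets first" constraint can be pictured as a dependency ordering. The Python sketch below is an assumption-laden illustration rather than the migration program itself: it takes a hypothetical map of node paths to parent paths and a hypothetical map of node paths to the paths they reference, and produces an order in which every parent and every reference target comes before the node that depends on it. The function name and the sample paths are made up for the example.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def import_order(parents, references):
    """Order node paths so parents and reference targets come first.

    parents:    dict mapping node path -> parent path (None for the root)
    references: dict mapping node path -> set of referenced node paths
    """
    deps = {}
    for path, parent in parents.items():
        # A node depends on everything it references...
        deps[path] = set(references.get(path, ()))
        # ...and on its parent, which must exist before the child is imported.
        if parent is not None:
            deps[path].add(parent)
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(deps).static_order())


# Hypothetical repository layout: one article referencing one media node.
PARENTS = {
    "/": None,
    "/media": "/",
    "/media/logo": "/media",
    "/articles": "/",
    "/articles/intro": "/articles",
}
REFERENCES = {"/articles/intro": {"/media/logo"}}

if __name__ == "__main__":
    for path in import_order(PARENTS, REFERENCES):
        print(path)
```

If the references formed a cycle, no such order would exist and a node-at-a-time import along these lines would need a different strategy.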



© 2007 RAD International Ltd.