Working with NoSQL is like being a pioneer
– for tough guys
Interview with Senior Developer Thomas Brask Jørgensen
Almost 15 years of experience with NoSQL and NoSQL-type databases have given ProData consultant Thomas Brask Jørgensen a deep insight into working with the new, fast-growing database technologies. Here he talks about his experience and also offers good advice to both businesses and colleagues.
Thomas Brask Jørgensen has both the age and the experience to have worked with NoSQL databases before they were even called NoSQL. That was back in 2000, when he was working in the Jubii community department responsible for the chat platform. This was in the early Web 2.0 days, when Danish users were beginning to use public instant messaging to talk with other users about all kinds of topics. Thomas Brask Jørgensen remembers it as a fun project, but also as one that, right from the beginning, presented developers with a number of new challenges.
"Back when I started, Jubii Chat was a relatively small Danish product, but you could see that it had great potential. Then Jubii was acquired by Lycos, and their ambition was to introduce the product across Europe, and so all of a sudden we were talking about a completely different size of user base. At that time you used ASP and SQL Server for chat, so it was obvious that there would be scalability problems. It was not remotely fast enough and it would be expensive to solve the problem with the existing technology," explains Thomas Brask Jørgensen.
"So instead, we chose to create our own database. We had a developer in-house for seven years who didn’t do much else besides developing the database. It actually ended up becoming a really good product, and one that is still on the market. Today, the database has full SQL, but it didn't back then. It was pure NoSQL. Solely because it had to be capable of scaling across multiple servers and doing so as quickly as possible."
Even to this day, Thomas Brask Jørgensen believes that building their own database to solve the challenge was the right decision. But when they decided to use the database for other types of tasks at Lycos, problems arose.
"We simply moved all of the community products over to the platform and also developed new products that used the same database. This resulted in a few crashes from time to time where everything went down, and then our technician had to sit there for 36 hours straight to restore the system. Downtime might not matter so much when it's just chat and guest books, but it still ended up costing money. This was because chat was based on advertising revenue, and when the chat was down, the users couldn't see the adverts," he explains.
Online backgammon and ZYB
When Lycos moved the entire department to Germany in 2005, Thomas Brask Jørgensen did not go along. He was still head consultant for the back-end for a while, but preferred to work from a Danish base.
The next time he worked with NoSQL databases was in 2006, when he was hired for a project at a small Danish start-up that wanted to create online games. At that time, with top player Gus Hansen as their role model, the Danes had really taken to poker, online poker and other online games in a big way.
"I was asked to handle the back-end for an online backgammon game, and I said yes. One of the databases we used was called Memcached, which is a NoSQL-like database for games and chat functionality. It was fine for that, but it was a bit of a problem that the company also wanted to use it for financial transactions. There are some problems with consistency and concurrency, which means that you should not use NoSQL for the financial transactions; instead, you need to use transactional databases," says Thomas Brask Jørgensen, adding that the small entrepreneurial company never came to market with their online game, but closed the project down prematurely.
"In 2008 I came to ZYB, which was a backup solution for the mobile phone. Before I arrived, ZYB was acquired by Vodafone. At that time, Vodafone had the ambition to build a large community. My job included working on a proof of concept for a new version of the ZYB solution, and in this respect we were also looking at NoSQL databases, including MongoDB and Amazon’s SimpleDB," says Thomas Brask Jørgensen, explaining that the developers were working on one overall solution, where MongoDB, SimpleDB and other databases were 'wrapped' in an abstraction layer so you could not see which databases were running underneath. It was not a package solution they would choose in a use scenario; it was only an experiment to test the fitness of the various databases for use against each other.
"Unfortunately, ZYB eventually closed down, so we never completed the project, but it was very exciting to work on. In retrospect, we probably erred when we tried to build a generic SQL interface on top of the NoSQL databases. We took a NoSQL database and tried to turn it into SQL. We probably shouldn't have done that. It is possible to have a query language in this way, but you don’t get consistency and atomic transactions,” says Thomas Brask Jørgensen, explaining that the requirement for SQL and query opportunities arose from the wish to use the packing solution for data analysis with business intelligence tools, data mining, etc.
"But it wasn't the right way to go about it. The right way would have been to move the data and analyse it somewhere else where you would have had more advanced query options. You don't get that kind of functionality in a NoSQL database."
In 2012, Thomas Brask Jørgensen was brought in to the Atea Tele project, where the company entered the business telephony market. When Thomas Brask Jørgensen was brought in along with other consultants, the project should – as always – have been completed 'yesterday', he says with a smile. "And after all, that's what consultants do." Atea wanted telerating functionality and the possibility of split billing, allowing the user and, not least, the company to separate work calls from private calls. Originally, Atea wanted to perform this rating itself, which would mean millions of records each time a call was made or a text message was sent. Atea had looked into it, and the project did not suit Microsoft SQL, where they had in-house expertise. So they decided to use RavenDB, which is a very .NET-friendly NoSQL database.
"An obvious choice if that was the route you wanted to take," says Thomas Brask Jørgensen. "The problem was that before we were brought in, they had decided that they didn't want to perform the telerating after all. They just wanted a user database with profiles, configuration, etc. This meant a standard database with not much data, and the data that was available would fit into a relational database, where you could do advanced queries, etc. The task was not well suited to RavenDB, which they had retained even though they had dropped the idea of telerating. After six months on the project, we argued to remove RavenDB and bring in a relational database, and that's how things ended up."
Today, Thomas Brask Jørgensen is at Saxo Bank, where he is currently finalising an Open API project that makes it easier for external companies’ banking applications to work with Saxo Bank's internal systems.
"Our system records large volumes of log data. So far we have used log files, but we want to get this into a database. In this case, a NoSQL database such as MongoDB would be a good choice because of the large amount of data and the relatively modest lookup requirements. Saxo Bank already use MongoDB in other contexts, so it's an obvious choice," says Thomas Brask Jørgensen.
Listen to us!
Today, when Thomas Brask Jørgensen begins work on a project in which NoSQL technology is considered, he often experiences that he is asked for advice because of his many years of experience in the field. And so it should be, according to him.
"My experience is that in 9 out of 10 cases the clients listen to us consultants as consultants because we have seen similar projects before. It's very rare that you're told to do things in a certain way. But you do experience this from time to time. I’ve been in a situation where the choice was between relational databases and NoSQL databases. But now I usually speak my mind if people are going off the deep end with their solutions," says Thomas Brask Jørgensen, who stresses that the best advice he can give companies facing a database project is to think through their IT environment and future needs very carefully before making the technological decisions.
"You need the right tool for the right task. Making changes along the way is painful. In an ideal world, you should look at the databases’ technical features. But the reality is usually that people end up choosing the product they are familiar with," he says.
If Thomas Brask Jørgensen has one piece of good advice for colleagues who may not yet have become acquainted with NoSQL databases, it is that they must first and foremost get an overview of the myriad of NoSQL and NoSQL-like technologies.
"There are well over 100 NoSQL databases on the market. So it's very much about being able to distinguish between the different types of databases and not believing that NoSQL is just one thing. Because it absolutely is not. Specifically, I would recommend looking at RavenDB and MongoDB, if you have a Microsoft background. If you're more web-orientated, you could look at CouchDB. And if you have the cloud option, take a look at Microsoft's Azure platform, Amazon's SimpleDB or Google Cloud Datastore," says Thomas Brask Jørgensen.
Declarative versus imperative
There is a huge difference between working with relational databases based on more than 20 years of accumulated knowledge, proven technologies and a broad palette of tools, and working with the newer and less mature NoSQL databases. But on a personal level, Thomas Brask Jørgensen likes the challenge inherent in NoSQL technologies.
"Working with NoSQL is kind of like being a pioneer – for tough guys. You feel like a pioneer every time, because you don't have the same big toolbox available that you have when working with relational databases. With NoSQL you have to do a lot of the work yourself. You could make some of the same queries as in a relational database, but you have to write this yourself. Most NoSQL databases do not have declarative query options like SQL, where you write what you want and how it should be presented. NoSQL databases are usually imperative. You describe how data is to be found. It's like writing a program. First do this, then do this, then do this, etc. It's a completely different thought process," concludes Thomas Brask Jørgensen.
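The contrast between declarative and imperative querying can be made concrete with a small sketch. It answers one question, "who posted the most chat messages?", in two ways: as a SQL statement run through Python's built-in sqlite3 module, and as hand-written steps over the raw records, in the spirit of "first do this, then do this". The sample data and function names are invented for illustration and do not come from any of the projects described here.

```python
import sqlite3

# Invented sample data: (sender, message body)
MESSAGES = [
    ("anna", "hi"),
    ("bo", "hello"),
    ("anna", "how are you"),
    ("carl", "hey"),
    ("anna", "bye"),
    ("bo", "later"),
]


def top_posters_declarative(messages, limit):
    # Declarative: state WHAT is wanted; the database decides how to find it.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (sender TEXT, body TEXT)")
    conn.executemany("INSERT INTO messages VALUES (?, ?)", messages)
    rows = conn.execute(
        "SELECT sender, COUNT(*) AS n FROM messages "
        "GROUP BY sender ORDER BY n DESC, sender ASC LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return rows


def top_posters_imperative(messages, limit):
    # Imperative: spell out HOW, step by step.
    counts = {}
    for sender, _body in messages:  # 1. scan every record
        counts[sender] = counts.get(sender, 0) + 1  # 2. count per sender
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))  # 3. sort
    return ranked[:limit]  # 4. keep the top entries


if __name__ == "__main__":
    print(top_posters_declarative(MESSAGES, 2))
    print(top_posters_imperative(MESSAGES, 2))
```

Both functions return the same ranking. The difference is who does the work: the query engine in the first case, the developer in the second.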