RethinkDB: first things first

Recently, on a new project, one of the team members encouraged me to use RethinkDB, saying it could be the key to the solution, mainly because of a feature called “change feeds”. However, I’m very used to MongoDB for projects that depend heavily on denormalized data. Being a bit stubborn, I had my doubts, but decided to give it a try (at least it would be a new toy).

RethinkDB stores plain JSON documents, whereas MongoDB uses the BSON format. This can be an advantage because values such as dates and IDs reach your code as native JavaScript types, so there is no need to convert formats to be compatible: it's all JavaScript!

Another nice thing is the change feeds I mentioned earlier. Once you subscribe to a feed, an event is raised every time a document is created, changed or deleted in the database. You can rely on this feed of events to build important parts of your application. In practice it behaves much like a message queue, and it will serve you well in most use cases.
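
To make the idea concrete, here is a minimal sketch of how an application might interpret feed entries. It assumes the { old_val, new_val } shape that RethinkDB delivers when you run changes() on a table. The classifyChange function and the sample entries are hypothetical and stand in for a live feed, so the snippet runs without a database.

```javascript
// Classify one change-feed entry by looking at which side is null.
// Assumes the { old_val, new_val } shape emitted by RethinkDB change feeds.
function classifyChange(change) {
  if (change.old_val === null && change.new_val !== null) return 'insert';
  if (change.old_val !== null && change.new_val === null) return 'delete';
  return 'update';
}

// Hypothetical sample entries, shaped like what a feed on an "articles" table might deliver.
var sampleChanges = [
  { old_val: null, new_val: { id: 1, Title: 'Angular 2 first steps' } },
  { old_val: { id: 1, Title: 'Angular 2 first steps' },
    new_val: { id: 1, Title: 'Angular 2 more than the first steps' } },
  { old_val: { id: 1, Title: 'Angular 2 more than the first steps' }, new_val: null }
];

var kinds = sampleChanges.map(classifyChange);
console.log(kinds);
```

With a real connection, the same function could be called for each entry delivered by the feed's cursor.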
First things first

Using RethinkDB alongside Node.js particularly makes sense for realtime applications, as it can be used as a PubSub engine, which allows Node to scale in a very effective way.
RethinkDB has atomic events raised by document changes, which I’ll mention in this article. It was at this point I was convinced RethinkDB is great!
Since I wanted to start the project as soon as possible and this was my first time working with RethinkDB, I had a few initial questions:

How do I handle the connections?
Are there decent ORMs, or do I need one at all?
How do I define an object schema?
What is the proper way to do the basic operations insert, update, delete, get, list and filter results?

There were of course other things I thought about, such as performance and server configuration. Both of these topics will however be covered in a separate article.
Thinky did the job well
To accelerate development, I started out by looking for a database driver and ORM which could help me with repetitive and basic tasks such as setting up a connection pool, data models and schemas. Fortunately I bumped into neumino on GitHub (https://github.com/neumino). He is the creator and main contributor of the Thinky (ORM) and rethinkdbdash (third-party driver) projects.

Thinky is the most popular JavaScript ORM for RethinkDB. The documentation is not perfect, but it has the essential information we need to start being productive. It is built on top of the great third-party driver rethinkdbdash, which handles a connection pool and TLS secure connections for you. It also uses the Bluebird promise library, which provides promises on versions of Node that lack native ES6 support and adds some extra features.
RethinkDB has its own query language called “ReQL”, which is quite good, as you can combine a bunch of operations in one go. A practical example:

r.table('articles')
    .filter(r.row("URL").eq("http://blog.cloudoki.com/angular-2-first-steps/"))
    .update({Title: "Angular 2 more than the first steps"})
    .run(connection, function(err, result) {
        if (err) throw err;
        console.log(JSON.stringify(result, null, 2));
    });

In this example I find an article by filtering on its URL and update its Title in a single ReQL command. With Thinky you can use the same syntax for your queries, and on top of that you won’t have to worry about repetitive tasks. The ORM will take care of things such as:

  • Connection pool handling (thanks to rethinkdbdash)
  • Model definition and schema validation
  • Model relations made easy
  • Create tables automatically
  • Create indexes automatically
  • Virtual fields
  • All the CRUD operations to the models
  • Pre and Post changes hooks to a model instance
  • Change feeds

Connect to the DB

The first task for any implementation is connection handling. Most of the time it’s a good idea to have your own connection module in your project structure, and with rethinkdbdash (used by Thinky) this is a must, because every call to require('thinky')(options) creates a new connection pool. Wrapping that call in a single module means the pool is created once and shared by everything that requires the module.

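var thinky = require('thinky')({
    host: 'db01.example.com',
    port: 28015,
    authKey: 'z1B85ZP80dK791772K0o4K9y405W8PW4di1E0y11F2u3G0wY',
    db: 'superdatabase'
});

// log what the connection pool is doing
thinky.r.getPoolMaster().on('log', console.log);

exports.thinky = thinky;

// drain the connection pool, then notify the caller
// (assumes drain() returns a promise, as in recent rethinkdbdash versions)
exports.close = function(callback) {
    thinky.r.getPoolMaster().drain().then(function() {
        if (callback) callback();
    });
};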
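
Hardcoding the host and auth key is fine for a demo, but in a real project you would probably read them from the environment. The following is only a sketch: the helper buildDbOptions and the variable names RDB_HOST, RDB_PORT, RDB_AUTH_KEY and RDB_DB are hypothetical, and the fallbacks (localhost, 28015, the test database) are RethinkDB's usual defaults.

```javascript
// Build the options object passed to require('thinky')(...) from environment variables.
// The variable names below are hypothetical; pick whatever fits your deployment.
function buildDbOptions(env) {
  return {
    host: env.RDB_HOST || 'localhost',
    // parseInt yields NaN for a missing value, so the default port is used instead
    port: parseInt(env.RDB_PORT, 10) || 28015,
    authKey: env.RDB_AUTH_KEY || '',
    db: env.RDB_DB || 'test'
  };
}

// Example with a hand-written environment object instead of process.env
var options = buildDbOptions({ RDB_HOST: 'db01.example.com', RDB_DB: 'superdatabase' });
console.log(options);
```

In the connection module this could be used as require('thinky')(buildDbOptions(process.env)).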

Data modeling

As you may know by now, RethinkDB is a NoSQL database that stores JSON documents and supports joins, and this is exactly where the Thinky ORM does a great job of simplifying things. RethinkDB itself does not enforce schemas; Thinky adds models and schemas on top of it. You get the advantage of defining barebones schemas while still enjoying table joins, a feature usually only found in relational databases.

Creating models (tables)

Because I have been thinking about food since I woke up today, I’ll base my examples on restaurant dishes, ingredients and everything else that is food related.

Let’s say that you want the menus from the nearby restaurants. To start, you will need at least two tables: the restaurants and the menu dishes.

// require our db connection module
var thinky = require('./db').thinky;
// the ReQL namespace, needed for r.now() below
var r = thinky.r;
// the type validation exported by Thinky to use in the field validator
var type = thinky.type;

var Restaurant = thinky.createModel("Restaurants", {
    // not needed, just to ensure the type. The id field plays the same role
    // as _id in MongoDB, and RethinkDB lets you set it yourself.
    id: type.string().uuid(4),
    // the restaurant name
    Name: type.string(),
    // the restaurant’s location
    Geolocation: type.string(),
    // the score given by the users
    Score: type.number(),
    // date of the last update. If the field is missing, it defaults to the current date
    Updated: type.date().default(r.now())
});

var Dish = thinky.createModel("Dishes", {  
    id: type.string().uuid(4),
    // the dish title
    Title: type.string().min(10),
    // extended description
    Description:type.string().min(10)
});

Thinky provides you with a createModel function to define your schema and a “type” object to define the property types. You can also define relationships between these tables; I’ll get into this in a later article about advanced RethinkDB data modeling.
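
To show what the rules in the Dish schema mean in practice, here is a plain JavaScript stand-in for the checks they express. This is not Thinky's implementation: validateDish is a hypothetical helper, and it assumes that fields are optional unless explicitly marked as required, so missing fields are accepted.

```javascript
// Stand-in for the checks the Dish schema expresses: Title and Description
// must be strings of at least 10 characters when they are present.
function validateDish(doc) {
  var errors = [];
  ['Title', 'Description'].forEach(function (field) {
    var value = doc[field];
    // assumption: a missing field is acceptable because it is not marked as required
    if (value === undefined) return;
    if (typeof value !== 'string') {
      errors.push(field + ' must be a string');
    } else if (value.length < 10) {
      errors.push(field + ' must be at least 10 characters long');
    }
  });
  return errors;
}

var good = validateDish({
  Title: 'Bacalhau à brás',
  Description: 'Shredded salted cod with potatoes and eggs'
});
var bad = validateDish({ Title: 'Soup', Description: 42 });
console.log(good, bad);
```

With Thinky you do not write this yourself; the model runs its own validation when a document is saved.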
CRUD operations
Thinky helps you with all your CRUD needs by abstracting some of the operations. The documentation suggests creating a new document from the model and then saving it, as in the example below:

var dish = new Dish({
    Title: "Bacalhau à brás",
    Description: "Bacalhau à Brás is made from shreds of salted cod, onions and thinly chopped fried potatoes in a bound of scrambled eggs."
});

dish.save().then(function(newDish){  
    // here it is the new dish saved in the DB
});

However, as it is common practice in Node.js to work with JSON objects, these can easily be used as well, by using the Model’s API instead of the base class:

Dish.save( dish ).then(function(newDish){  
    // here it is the new or updated dish saved on the DB
}).error(function(err){
    console.error("something went wrong:", err);
});

A summary of the most common operations

// create or update a dish. How a document with an existing id is handled
// (error, replace or update) can be controlled with an optional argument:
// https://thinky.io/documentation/api/model/#save
function saveDish(dish, callback) {
    Dish.save(dish).then(function(newDish) {
        // here it is the new or updated dish saved on the DB
        return callback(null, newDish);
    }).error(function(err) {
        console.error("something went wrong:", err);
        return callback(err);
    });
}
// create or update, keeping stored fields that are missing from the incoming object
function saveDish(dish, callback) {
    // try to find an existing dish with the same id
    Dish.getAll(dish.id).run().then(function(existingDishes) {
        var existingDish = existingDishes.shift();

        // if it does not exist, create it
        if (existingDish == null) {
            return Dish.save(dish);
        }

        // otherwise merge the new object into the current one and save
        return existingDish.merge(dish).save();
    }).then(function(result) {
        callback(null, result);
    }, function(err) {
        // any failure in the lookup or the save ends up here
        callback(err);
    });
}
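
The difference between the two approaches is easier to see with plain objects. The sketch below does not touch the database: the sample documents are hypothetical, and Object.assign performs a shallow merge only, which is a simplification of what an ORM-level merge may do with nested fields.

```javascript
// A stored document and a partial update that only carries the changed field.
var stored = {
  id: 'dish-1',
  Title: 'Bacalhau à brás',
  Description: 'Shredded salted cod with potatoes and eggs'
};
var incoming = { id: 'dish-1', Title: 'Bacalhau à Brás (house recipe)' };

// Replace: the incoming object becomes the whole document, so Description is lost.
var replaced = incoming;

// Merge: copy the stored fields first, then overwrite them with the incoming ones.
// Shallow merge only; nested objects would need a recursive merge.
var merged = Object.assign({}, stored, incoming);

console.log(replaced.Description);
console.log(merged.Description);
```

This is why the second version looks up the existing dish first: a partial update should not wipe out fields the caller did not send.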
Get one or more entries by id

These operations use native ReQL syntax:

// get one entry by its id

Dish.get("0e4a6f6f-cc0c-4aa5-951a-fcfc480dd05a")  
.run()
.then(function(dish) {
    console.log("The dish:", dish);
})
.catch(function(err){
    console.error(err);
});
// get multiple entries by id

Dish.getAll("0e4a6f6f-cc0c-4aa5-951a-fcfc480dd05a", "7eab9e63-73f1-4f33-8ce4-95cbea626f59", "a9849eef-7176-4411-935b-79a6e3c56a74")  
.run()
.then(function(dishes) {
    console.log("The 3 dishes:", dishes);
})
.catch(function(err){
    console.error(err);
});

You can also query on other fields with getAll by using a secondary index, see here: https://www.rethinkdb.com/api/javascript/#get_all

Get entries by filter
// get all the restaurants scored 5
// (r is the ReQL namespace exposed by Thinky: var r = thinky.r)
Restaurant.filter(r.row('Score').eq(5))
    .run()
    .then(function(restaurants) {
        console.log("The top restaurants:", restaurants);
    })
    .catch(function(err) {
        console.error(err);
    });
Delete
// delete one entry by its id
Dish.get("0e4a6f6f-cc0c-4aa5-951a-fcfc480dd05a")
    .delete()
    .run()
    .then(function(result) {
        console.log("The delete result:", result);
    }).catch(function(err) {
        console.error(err);
    });
Pre and Post hooks

By using Thinky models instead of the native RethinkDB syntax, you get access to pre- and post-hooks, which let you define data transformations before and after actions:

Pre-hooks: save, delete, validate
Post-hooks: save, delete, validate, init, retrieve

Example:

// runs after restaurant object/s are retrieved from the DB
Restaurant.post("retrieve", function (next) {
    // in this context you have access to the restaurant object
    console.log(this.Name);
    // a hook that accepts next must call it, otherwise the operation never completes
    next();
});
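
If the role of next is unclear, the following self-contained sketch shows the general pattern. It is not Thinky's implementation: runHooks is a hypothetical helper that runs a list of hooks in order, with the document bound to this, and only moves on when a hook calls next.

```javascript
// Run a list of hooks in order against a document, then call done.
// Each hook receives next and must call it, otherwise the chain stops.
function runHooks(hooks, doc, done) {
  var index = 0;
  function next(err) {
    if (err) return done(err);
    if (index === hooks.length) return done(null, doc);
    var hook = hooks[index++];
    hook.call(doc, next);
  }
  next();
}

// Two hypothetical hooks: tidy up the name, then stamp a fixed date.
var hooks = [
  function (next) { this.Name = this.Name.trim(); next(); },
  function (next) { this.Updated = new Date(0); next(); }
];

var result = null;
runHooks(hooks, { Name: '  Tasca do Chico  ' }, function (err, doc) {
  result = doc;
});
console.log(result.Name);
```

A hook that forgets to call next leaves done uncalled, which is the same reason a Thinky hook declared with a next argument has to call it.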

What’s next?

This is the first article in a series of articles on how to use RethinkDB. In the next articles, you can expect information on how to create joins, how to index data and about the change feeds functionality that RethinkDB is shipped with.

If this article piqued your interest, I can recommend the following sources for further reading:

https://www.rethinkdb.com/
https://thinky.io/
https://github.com/neumino/rethinkdbdash

Happy reading and stay tuned for the next article!
