
Suffering from amnesia? Use Mnesia!

May 31, 2010

As you may already know, I’m currently working with Erlang and exmpp to build some XMPP gateways. Some of them need only the JID (Jabber ID) of the user, while others need more specific information, such as the credentials required to retrieve the user’s personal information from the internet (e.g. Twitter ID and password).

Actually I could use OAuth and ask the user only for his ID, but I decided to keep it simple and go for the full credentials, so the question now is: how and where should I store this data? Well, MySQL is definitely among the best solutions if you have to store a large amount of data, but since that is not my case, I decided to focus my attention on Mnesia, a distributed database management system for Erlang applications that require continuous operation and exhibit soft real-time properties.

The first thing to do is to create a schema, a sort of description of your database (e.g. is the data stored in RAM? On disk? Both?).

When you create a new Mnesia schema, you just save an empty schema table that you will populate later.

To create the schema for your application, start your distributed Erlang nodes and connect them. If you don’t want to distribute Mnesia over several nodes, just start a single non-distributed Erlang node. Be careful: no old schema may exist on the nodes involved, and Mnesia must NOT be running when you create the schema.
OK, let’s see some code!

Let’s define the record user_info as:

-record(user_info, {jid, presence, uid, pwd}).

In the init([]) function we add the following code:

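case mnesia:create_schema([node()]) of
  ok ->
    %% Fresh schema: start Mnesia, then create the table and its extra index.
    ok = mnesia:start(),
    mnesia:create_table(user_table,
                        [{disc_copies, [node()]},
                         {type, set},
                         {attributes, record_info(fields, user_info)}]),
    mnesia:add_table_index(user_table, presence);
  _ ->
    %% The schema is already there: just start Mnesia.
    mnesia:start()
end,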

As you can see, we try to create a new schema by calling mnesia:create_schema(Nodes), which has to be executed on one of the connected nodes. If you want to set up the schema only on the local node, leave the call as it is in the example above; otherwise pass [node()|nodes()] as the argument.
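To make the distributed variant concrete, here is a minimal sketch of the same call using [node()|nodes()]. The module name schema_setup and the function create/0 are made up for illustration, and the sketch assumes Mnesia is stopped on every node involved.

```erlang
-module(schema_setup).
-export([create/0]).

%% Hypothetical helper: create the schema on this node and on every node
%% currently connected to it. Mnesia must be stopped on all of them.
create() ->
    Nodes = [node() | nodes()],
    case mnesia:create_schema(Nodes) of
        ok ->
            {ok, Nodes};
        {error, Reason} ->
            %% For example, a schema already exists on one of the nodes.
            {error, Reason}
    end.
```

On a single non-distributed node, nodes() is empty, so this behaves like the [node()] call above.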

If the creation was successful, we start the application by calling mnesia:start() or application:start(mnesia) (note that to stop Mnesia you call either application:stop(mnesia) or mnesia:stop()).

After the creation of the schema, we create the first table (user_table). The first parameter of the function is of course the table name, followed by a list of options for the table. First we provide the list of nodes where we want disc and RAM replicas of the table (disc_copies), then we state that the table is a set (other possible values are bag and ordered_set), and finally we specify the fields of the table by saying that they are those of the user_info record we declared above.

After this, we add a secondary index on presence so that we can also search the table by that field (remember that the first attribute, here jid, is always the key of the table).
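To show what the extra index buys us, here is a small sketch that looks users up by presence with mnesia:dirty_index_read/3. The module name, the sample JIDs and the passwords are invented, and for simplicity the table is RAM-only (ram_copies), so the sketch can run on a scratch node without creating a schema on disc first.

```erlang
-module(presence_lookup).
-export([demo/0]).

-record(user_info, {jid, presence, uid, pwd}).

%% Returns the JIDs of all users whose presence is "online".
demo() ->
    ok = mnesia:start(),
    %% Assumption: a RAM-only copy of the table, just for this sketch.
    {atomic, ok} = mnesia:create_table(user_table,
        [{ram_copies, [node()]},
         {type, set},
         {attributes, record_info(fields, user_info)}]),
    {atomic, ok} = mnesia:add_table_index(user_table, presence),
    ok = mnesia:dirty_write({user_table, "alice@example.org", "online", "alice", "secret"}),
    ok = mnesia:dirty_write({user_table, "bob@example.org", "away", "bob", "hunter2"}),
    %% Search by the secondary index instead of the key.
    Online = mnesia:dirty_index_read(user_table, "online", presence),
    Jids = [Jid || {user_table, Jid, _Presence, _UId, _Pwd} <- Online],
    %% Clean up so the sketch can be run again.
    {atomic, ok} = mnesia:delete_table(user_table),
    Jids.
```

Without the index on presence, the dirty_index_read call would fail, which is why the index is added right after the table is created.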

It is worth noting that when the init function runs and the schema is already there, we just start Mnesia.

One of the coolest features of Mnesia is transactions: a way to prevent race conditions while touching the database. Anyhow, in this post I’m not going to cover transactions, because I want to discuss dirty operations, which do nothing to prevent race conditions but are roughly 10 times faster.

Here are two functions; let’s analyze them:

set_subscription(BareFrom, UId, Pwd) ->
    mnesia:dirty_write({user_table, BareFrom, "online", UId, Pwd}).

set_subscription takes three values as input: a JID, a user ID and a password. dirty_write writes these values into the specified table (user_table), with the presence set to "online".
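Just for comparison, the same write wrapped in a transaction could look roughly like the sketch below. The module and function names are invented, and it assumes user_table already exists with the four attributes used in this post.

```erlang
-module(tx_sketch).
-export([set_subscription_tx/3]).

%% Hypothetical transactional counterpart of set_subscription/3.
set_subscription_tx(BareFrom, UId, Pwd) ->
    Write = fun() ->
        %% mnesia:write/1 takes the table name from the first tuple element.
        mnesia:write({user_table, BareFrom, "online", UId, Pwd})
    end,
    case mnesia:transaction(Write) of
        {atomic, ok} -> ok;
        {aborted, Reason} -> {error, Reason}
    end.
```

The transaction takes locks and can be aborted, which is exactly the overhead that dirty_write skips.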

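get_subscription(BareFrom) ->
    case mnesia:dirty_read({user_table, BareFrom}) of
        [] ->
            %% No row for this JID. The atom name is assumed here: the
            %% original return value is missing from the snippet.
            not_subscribed;
        [{user_table, BareFrom, _Status, UId, Pwd}] ->
            {subscribed, UId, Pwd}
    end.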

get_subscription takes only the JID as input and checks whether the user identified by it is registered with our service: if dirty_read returns an empty list, the user is not subscribed, otherwise the user data is returned. Note that dirty_read looks records up by the key of the table (in this case jid). To search by an indexed attribute such as presence, use mnesia:dirty_index_read instead.
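To see the empty-list case and the found case side by side, here is a small round trip on a scratch node. The module name and the sample data are invented, and the table is created RAM-only (ram_copies) so the sketch needs no schema on disc.

```erlang
-module(subscription_demo).
-export([run/0]).

-record(user_info, {jid, presence, uid, pwd}).

%% Reads a JID before and after writing it, and returns both results.
run() ->
    ok = mnesia:start(),
    %% Assumption: a RAM-only copy of the table, just for this sketch.
    {atomic, ok} = mnesia:create_table(user_table,
        [{ram_copies, [node()]},
         {attributes, record_info(fields, user_info)}]),
    Jid = "alice@example.org",
    %% Nothing stored yet, so this is the "not subscribed" case.
    Before = mnesia:dirty_read({user_table, Jid}),
    ok = mnesia:dirty_write({user_table, Jid, "online", "alice", "secret"}),
    %% Now the lookup by key finds exactly one row.
    After = mnesia:dirty_read({user_table, Jid}),
    %% Clean up so the sketch can be run again.
    {atomic, ok} = mnesia:delete_table(user_table),
    {Before, After}.
```

Because the table is a set, writing the same JID again would simply overwrite the existing row.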