Home > English, Erlang > How to handle configuration in init/1 function without slowing down your erlang supervisor startup

How to handle configuration in init/1 function without slowing down your erlang supervisor startup


Many times if you work with Erlang and follow the OTP design principles in your implementation, you may end up having one or more supervisors spawning a set of processes that can be either other supervisors or workers.

Most likely the child processes implementing the workers will be based on gen_server behaviour and whether you’re working on your small side project or in some big company project they will need some sort of initialization during the start-up phase: in fact creating an ets or mnesia table, reading a configuration file or accepting connections on a socket are pretty common operations that you want to be executed before the worker handles  other messages and executes the operations connected to such messages.

According to the relative documentation, the gen_server process calls Module:init/1 to initialize and therefore the first strategy you may think to employ consists in doing the operations listed above within this function. What I mean here is something like:

start_link() ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, [], []).

init([]) ->
    %% Some configuration operation here (e.g. handle ets/mnesia table)
    {ok, #state{}}.

This kind of approach is pretty common when the operations we want to take during initialization are cheap in terms of time, but what happens if the initialisation is expected to take a long time?

Suppose you have a supervisor that spawns many children and each child has some long time taking configuration. The supervisor will probably call the function start_link/3,4 of each child in sequence and will not be able to return until Module:init/1 of the child it is starting has returned.  This means that the supervisor won’t be able to start the next children on the fly and this will somehow slow down the whole supervisor startup phase. 

How can we solve this issue? Well, there are a couple of different ways to do it, but all of them are based on splitting the gen_server initialisation into two phases, a first phase implemented in the init/1 function during which we trigger some internal message for a future configuration and that returns immediately to the supervisor and  a second phase in which the configuration  actually takes place. In such a way we can free the supervisor startup from the time burden of all the children configurations.

Let’s see with some code what are the most common ways to achieve this results. My favourite technique consists into triggering the future configuration using a gen_server cast inside the init/1 function as follows:

start_link() ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, [], []).

init([]) ->
    gen_server:cast(self(), startup),
    {ok, state{}}.

As you can see within the init/1 function we trigger a cast message to our process and immediately return. At this point we just need to handle the cast in the function handle_cast/2 and perform the needed configuration. This can be done in this way:

handle_cast(startup, State) ->
    %% Do your configuration here
    {noreply, State}.

A different way to achieve the same “two phases” result can be implemented as follows:

start_link() ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, [], []).

init([]) ->
    self() ! startup,
    {ok, #state{}}.

This time we first send the atom ‘startup’ to the gen_server and then we return. Of course we need to handle that message within our gen_server as follows:

handle_info(startup, State) ->
    %% Do your configuration here
    {noreply, State}.

As you can see the logic is pretty much the same here so I won’t go into further details. 

The last way to achieve our result can be implemented by taking advantage of a timeout message. In practice in the init/1 function, instead of returning the tuple {ok, #state{}} we return the tuple {ok, #state{}, Timeout}. By including the value Timeout in the last tuple we specify that a ‘timeout’ atom will be sent to our gen_server unless a request or a message is received within Timeout milliseconds.

The ‘timeout’ atom should be handled by the handle_info/2 callback function. By setting the value of Timeout to 0 and adapting handle_info we can implement once again in an easy way our “two phases” configuration. Let’s see how this can be obtained:

start_link() ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, [], []).

init([]) ->
    {ok, #state{}, 0}.

And the ‘timeout’ message can be handled as:

handle_info(timeout, State) ->
    %% Do your configuration here
    {noreply, State}.

Personally I don’t like the last approach because the atom ‘timeout’ is not so meaningful and it can lead to some misunderstanding. By the way the real problem here is that this approach is implemented taking advantage of an internal timer that should be evaluted: we can’t be sure that the message will be sent immediately, we just know that the message will be sent after at least 0 milliseconds.

Some reader here may say that this “two phases” approach is risky, because no one assures us that the configuration message will be the first message handled either in handl_cast/2 or handle_info/2.  Actually this is not completely true.

We can’t be sure 100% that a message sent in init/1 using either a cast of the operator ! will be the first one in the process queue of our gen_server, but considering that start_link/3,4 is synchronous there are really few chances for another process to send a message to our gen_server before we send the configuration message.

Final consideration: there is a more elegant way to achieve the same result that consists in the combination of the functions start_link/3 and init_ack/1 of the module proc_lib. For those of you interested in the topic I suggest the user guide of ranch.

  1. No comments yet.
  1. September 12, 2013 at 12:11 pm

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: