Posts Tagged ‘distributed’

An interview with Steve Vinoski (@stevevinoski)

May 14, 2013

Today you can read my interview with Steve Vinoski, a famous Erlang developer/speaker and distributed systems expert. Steve will give the talk “Addressing Network Congestion in Riak Clusters” at Erlang User Conference 2013.

Some questions, some answers

Paolo – Hi Steve! It’s really good to have one of the most famous Erlangers here on my blog. Would you mind introducing yourself to our readers in a few words?

Steve – I’m Steve Vinoski, a member of the architecture group at Basho Technologies, the makers of Riak and RiakCS. I have a background in middleware and distributed systems, and have been an Erlang user since 2006.

Paolo – I know you are an expert in several programming languages. How did you end up using Erlang? Did you have any previous experience with functional languages?

Steve – As far as functional languages go, I’ve played with them on and off for decades, but never used one in production until I found Erlang.

I worked in middleware from 1991 to 2007, and in 2004 at IONA Technologies I started looking into innovative ways of expanding our product line and reducing the cost of product development. IONA’s products were written in C++, which I’ve used since 1988 and so I am well aware of its complexity, and Java, which frankly I’ve never liked (I like the JVM but don’t like the Java language). Neither language lends itself to rapid development or easy maintenance. I built a prototype that layered Ruby over one of our C++ products that allowed for an order of magnitude decrease in the number of lines of code required to write client applications, and built another prototype that provided a JavaScript layer for writing server applications, but customers didn’t seem interested, and both approaches only increased development and maintenance costs.

Then I found Erlang/OTP. I grew more and more intrigued as I discovered that it already provided numerous features that we had spent years developing and maintaining in our middleware systems, things like internode messaging, node monitoring, naming and discovery, portability across multiple network stacks, logging, tracing, etc. Not only did it provide all the features we needed, but its features were much more powerful and elegant. I put together a proposal for the IONA executive team that suggested we rebuild all of our product servers in Erlang so we could reduce maintenance costs, but the proposal was rejected because, as I later learned, they were trying to sell the company so it didn’t make sense to make such large changes to the code. I left IONA and joined Verivue, where we built video delivery hardware, and there I trained seven or eight other developers in Erlang and we used it to great advantage. After Verivue, I wanted to continue working with Erlang, which is part of the reason I joined Basho.

Paolo – In your blog you state that Erlang is your favourite programming language. Why?

Steve – To me Erlang/OTP is the type of system my middleware colleagues and I spent years trying to create. It’s got so many things a distributed systems developer wants: easy access to networking, libraries and utilities to make interacting with distributed nodes straightforward, wonderful concurrency support, all the upgrading and reliability capabilities, and the Erlang language itself is sort of a “distributed systems DSL” where its elegance and small size make it easy to learn and easy to use to quickly become productive building distributed applications. And as if that’s not enough, the Erlang community is great, pleasantly supporting each other and newcomers while avoiding pointless arguments and rivalries you find in other communities. My use of other programming languages has actually decreased in recent years due primarily to my continued satisfaction with Erlang/OTP; it’s not great for every problem, but it’s fantastic for the types of problems I tend to work on.

Paolo – I know that in a previous job you had to deal with multimedia systems, a field where Erlang still has a minor impact compared to languages like C++. Do you think Erlang will be able to find its place in this field as well? Can you give reasons for your answer?

Steve – Erlang/OTP is excellent for server systems in general, including multimedia servers. The Verivue system I worked on a few years ago had special TCP offload hardware for video delivery, so we didn’t need Erlang for that. Rather, we used Erlang for the control plane, which for example handled incoming client requests, looked up subscriber details in databases, and interacted with the hardware to set up multimedia data flows. Multimedia systems also have to integrate with billing systems, monitoring systems, and hardware from other vendors, and Erlang shines there as well, especially when it comes to finding bugs in the other systems and hot-loading code to compensate for those bugs. Customers tend to love you when you can quickly turn around fixes like that.

Another Erlang developer, Max Lapshin, built and supports erlyvideo, which seems to work well. I’ve never met Max but I know he faced some challenges along the way, as we did at Verivue, but I think he’s generally happy with how erlyvideo has turned out.

Paolo – Currently you are working at Basho, a very important company in the Erlang world. Do you mind telling our readers something more about your job?

Steve – At Basho I work in CTO Justin Sheehy’s architecture group. It’s a broad role with a lot of freedom to speak at and attend conferences and meetups, and I also work on research projects and pick up development tasks and projects from our Engineering team and Professional Services team when they need my help.

Paolo – At Erlang User Conference 2013 you will give a talk about Riak, its behaviour under extreme loads and the issues we may face when we want to scale it. Can you tell us something more about the topic?

Steve – At Basho we’re fortunate to have customers who continually push the boundaries of Riak’s comfort zone. Network traffic in Riak all goes over TCP — client requests, intracluster messages, and distributed Erlang communication. When clusters are extremely busy with client requests and transfer of data and messages between nodes, under certain conditions network throughput can drop significantly and messages can be lost, including messages intended for client applications. I am currently investigating the use of alternative network protocols to see if they can help prioritize different kinds of network traffic. This work is not yet finished, so my talk will give an overview of the problems along with the current status of the solution I’m investigating.

Paolo – I heard that during the talk you will also introduce a new Erlang network driver that should tackle some of these issues. Is this correct? Can you give us some insight?

Steve – Yes, I have been working on a new network driver. It implements an alternative UDP-based protocol for data transfer that can utilize full bandwidth when available but can also watch for congestion on network switches and quickly back off when detected. It also yields to TCP traffic under congestion conditions, preventing background data transfer tasks from shutting out more important messages like client requests and responses.

Paolo – Who should be interested in this talk? What are the minimum prerequisites needed to fully understand the topics of the talk?

Steve – Attendees should have a high-level understanding of Erlang’s architecture, what drivers are, and how they fit into the system. Other than that, my talk will explain in detail the problems I’m trying to address as well as the solution I’ve been investigating, so neither deep networking expertise nor deep understanding of Erlang internals is required.

Paolo – I can say without a doubt that you are an expert in middleware and distributed computing systems. Can you suggest some books or internet resources to our readers interested in those topics?

Steve – The nice thing about distributed systems is that they never seem to get any easier, so there has been interesting research and development in this area for decades. The downside of that is that there are an enormous number of papers I could point to. In no particular order, here are some interesting papers and articles, most of which are currently sitting open in my browser tabs:

“Eventual Consistency Today: Limitations, Extensions, and Beyond”, Peter Bailis, Ali Ghodsi. This article provides an excellent description of eventual consistency and recent work on eventually consistent systems.

“A comprehensive study of Convergent and Commutative Replicated Data Types”, M. Shapiro, N. Preguiça, C. Baquero, M. Zawirski. This paper explores and details data types that work well for applications built on eventually consistent systems.

“Notes on Distributed Systems for Young Bloods”, J. Hodges. This excellent blog post succinctly summarizes the past few decades of distributed systems research and discoveries, and also explains some implementation concerns we’ve learned along the way to keep in mind when building distributed applications.

“Impossibility of Distributed Consensus with One Faulty Process”, M. Fischer, N. Lynch, M. Paterson. This paper is nearly 30 years old but is critical to understanding fundamental properties of distributed systems.

“Dynamo: Amazon’s Highly Available Key-value Store”, G. DeCandia, et al. A classic paper detailing trade-offs for high availability distributed systems.

Paolo – Day-by-day Erlang becomes more popular. In your opinion what can we expect from Erlang in the future? What are the next goals the Erlang community should try to reach?

Steve – Under the guidance of Ericsson’s OTP team and with valuable input from the open source community, Erlang/OTP continues to evolve gracefully to address production systems. I expect Erlang will continue to improve as a language and platform for building large-scale systems that perform well and are relatively easy to understand, reason about, and maintain without requiring an army of developers. In particular I’m looking forward to the OTP team’s continued work on optimizing multicore Erlang process scheduling. The Erlang community is very good at proving how good Erlang/OTP is through the results of the systems they build, so they need to keep doing that to broaden Erlang’s appeal. If you’re a developer building practical open source or commercial software, the presentations given by community members at events like the Erlang User Conference and the Erlang Factory conferences are amazing sources of knowledge and wisdom for what works well for Erlang/OTP applications and what can be problematic.

REMOTE PROCEDURE CALLs: C vs. Erlang (part 1)

December 21, 2009

Over the last few weeks I have been geeking out a bit with Remote Procedure Calls.

The RPC system used in this post (Sun RPC, also known as ONC RPC) was developed at Sun Microsystems in the 80s. An RPC can be seen as a “normal” procedure call on a distributed system. The concept is the following: you don’t have to define the function you want to use in your client. You define that function on a server, and when the client invokes it, a so-called “middleware” forwards the request from the client to the server, where the function is executed, and the result is returned to the client.
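To make the forwarding idea concrete, here is a minimal in-process sketch of what the middleware does on each side. Every name in it (struct rpc_message, PROC_ADD, add_impl, server_dispatch, client_stub_add) is hypothetical and invented for illustration. A real RPC system would send the message over the network in a standard encoding instead of passing a struct around.

```c
#include <assert.h>

/* Hypothetical message: which procedure to run, its arguments, the reply. */
struct rpc_message {
    int proc_id;   /* identifies the remote procedure */
    int args[2];   /* arguments packed by the client stub */
    int result;    /* filled in by the server */
    int ok;        /* 1 if the server knew the procedure, 0 otherwise */
};

enum { PROC_ADD = 1 };

/* The real implementation: it exists only on the server side. */
static int add_impl(int a, int b)
{
    return a + b;
}

/* Server side: decode the request, run the procedure, store the reply. */
static void server_dispatch(struct rpc_message *msg)
{
    switch (msg->proc_id) {
    case PROC_ADD:
        msg->result = add_impl(msg->args[0], msg->args[1]);
        msg->ok = 1;
        break;
    default:
        msg->ok = 0;   /* unknown procedure */
        break;
    }
}

/* Client side: looks like a local call, but only packs and forwards. */
static int client_stub_add(int a, int b)
{
    struct rpc_message msg = { PROC_ADD, { a, b }, 0, 0 };
    server_dispatch(&msg);   /* stands in for the network round trip */
    assert(msg.ok);          /* a real stub would report an error instead */
    return msg.result;
}
```

From the caller’s point of view, client_stub_add(2, 3) behaves like an ordinary function returning 5, even though the addition itself happens in the code that plays the server.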

Since I’m still confident that Erlang is much better than C for distributed systems, I’ll try to code in both languages a server that keeps a counter: each time a client calls a function via RPC, the server increments that value.

In C I can do it with three files. Let’s start with the first: a file (counter.x) with the specifications of the service provided by the functions that can be called remotely. The specifications must be converted to stubs and header files for the middleware using the tool rpcgen (we will see it later).

When you write the specifications, you must identify the functions that are going to be called remotely. Each function must specify its return type and arguments. Every remote procedure must be assigned an ID and included in a program, which is in turn uniquely identified by an ID and a version.

Our counter.x may look like this:

program COUNTERPROG {
  version COUNTERVERS {
    int COUNTER(void) = 1;
  } = 1;
} = 0x20000001;

Here the remote procedure COUNTER has ID 1 and belongs to version 1 of the program COUNTERPROG. The ID of the program is 0x20000001.
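For orientation, the identifiers in counter.x end up as plain constants in the generated header. The sketch below reproduces only those three definitions, written by hand rather than copied from rpcgen output, and adds a hypothetical helper, is_counter_call, to show how the (program, version, procedure) triple pins down exactly one remote procedure. The real counter.h also declares the client and server stubs and depends on the RPC headers.

```c
/* Constants corresponding to counter.x (hand-written for illustration). */
#define COUNTERPROG 0x20000001  /* program ID */
#define COUNTERVERS 1           /* program version */
#define COUNTER     1           /* procedure ID within that version */

/* Hypothetical helper: does this triple name our COUNTER procedure? */
static int is_counter_call(unsigned long prog, unsigned long vers,
                           unsigned long proc)
{
    return prog == COUNTERPROG && vers == COUNTERVERS && proc == COUNTER;
}
```

Changing any one of the three numbers, for instance asking for version 2, no longer matches the procedure. This is why the client later has to pass both COUNTERPROG and COUNTERVERS when it connects.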

After that you must write the procedure code as you would write it locally, with slight changes. The procedure must use the conventional name procedure_VERSION_svc, that is the procedure name in lowercase, the version number, and the suffix _svc (in our case counter_1_svc). The procedure uses pointers to return the result and receive the arguments. One more argument is required to identify the execution context.

Thus in our case we can create a file named counter_proc.c of the following form:

#include <stdio.h>
#include "counter.h"

int *counter_1_svc(void *msg, struct svc_req *req)
{
  // static: the value survives between calls
  static int result = 0;
  result++;
  return (&result);
}

The header file counter.h is one of the files generated by rpcgen. As you can see, we create a static variable result. It lives in the space allocated for static variables, so it survives when the execution of the function ends. Whenever the function is called, we increment that value and return the address of that static variable.
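The role of static is easy to check without any RPC machinery. In this sketch, next_count is a hypothetical stand-in for counter_1_svc with the RPC arguments left out. Because result is static, its address stays valid after the function returns and its value persists from one call to the next. With a plain local variable the returned pointer would dangle.

```c
/* Hypothetical stand-in for counter_1_svc, without the RPC arguments. */
static int *next_count(void)
{
    static int result = 0;  /* initialized once, kept between calls */
    result++;
    return &result;         /* safe: static storage outlives the call */
}
```

Two consecutive calls return the same address, which holds 1 after the first call and 2 after the second.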

Now we have to write the client! Before performing the remote call, the client must connect to the server; after the call it can disconnect. The connection is made with the function clnt_create(server address, program id, program version, transport protocol), and the disconnection with the function clnt_destroy().

Let’s create a file called rcounter.c of the following form:

#include <stdio.h>
#include <string.h>
#include "counter.h"

int main(int argc, char **argv)
{
  CLIENT *clnt;
  int *result;
  char *server;

  // try to get the server address from command line
  if (argc != 2) {
    fprintf(stderr, "Usage is the following: %s server\n", argv[0]);
    return -1;
  }
  server = argv[1];

  // create the client
  clnt = clnt_create(server, COUNTERPROG, COUNTERVERS, "udp"); //udp may be set to tcp
  if (clnt == NULL) {
    clnt_pcreateerror(server); // report why the client could not be created
    return -1;
  }

  // call the function
  result = counter_1(NULL, clnt);
  if (result == NULL) {
    clnt_perror(clnt, server);
    clnt_destroy(clnt);
    return -1;
  }

  // print the value obtained from the call
  printf("The value of the counter is %d\n", *result);

  // destroy the client
  clnt_destroy(clnt);
  return 0;
}

Now let’s compile all the files:

bellerofonte@pegaso:~$ rpcgen counter.x

This generates: counter.h counter_svc.c counter_clnt.c

bellerofonte@pegaso:~$ gcc counter_svc.c counter_proc.c -o counter_svr -lnsl
bellerofonte@pegaso:~$ gcc counter_clnt.c rcounter.c -o counter_clt -lnsl

and execute them:

bellerofonte@pegaso:~$ ./counter_svr &
[1] 4898
bellerofonte@pegaso:~$ ./counter_clt localhost
The value of the counter is 1
bellerofonte@pegaso:~$ ./counter_clt localhost
The value of the counter is 2
bellerofonte@pegaso:~$ kill -9 4898

As you can see at every call the value returned is the previous one plus 1 (starting from zero).

I killed the server in the end because I started it in background mode…if you want you can run the two files in different shells to avoid this.

Ok, this post is getting long…so for the Erlang code I will open a new one 🙂

Categories: C, English, Erlang