Talking about Erlang, Riak and Vector Clocks with Christopher Meiklejohn (@cmeik)
Hello! Today you may read here my interview to Christopher Meiklejohn, one of the speakers at the upcoming Erlang Factory in San Francisco. Christopher is working on Riak at Basho Technologies.
Erlang, Vector Clocks and Riak!
Paolo – Hello Chris! It’s great to have another Basho Erlanger here! Can you introduce yourself please?
Christopher – It’s great to be here! My name is Christopher Meiklejohn, and I’m currently a software engineer at Basho Technologies, Inc., where I work on our distributed datastore, Riak. In addition to that, I’m also a graduate student at Brown University, in Providence, RI, where I study distributed computing.
Paolo – Before joining Basho you were working in a different company (i.e., Swipely) where you dealt with Ruby code. Did you already know Erlang when you started at Basho? How would you describe the switch between these two languages?
Christopher – During the time I was at Swipely they had Riak deployed in production, which was what initially got me interested in Basho, Riak, in particular, Erlang. When I joined Basho, I knew very little Erlang and spent my first few weeks at the company learning it.
That said, I love Erlang as a language and as platform to build application on. I wouldn’t necessarily say that the change from Ruby to Erlang was anything that was unexpected, specifically because I already had functional programming experience using languages like Scheme and Haskell.
Paolo – Rubyists tend to be addicted to TDD. Were you able to maintain such a good practice also when coding Erlang?
Christopher – Well, I’ll start with a disclaimer. I was primarily responsible for the introduction of behavior driven development at Swipely for feature development, in addition to promoting pair programming within the development team.
That said, testing and verification of software is a very interesting topic to me.
While I believe that all software should be properly tested, I’ve never been particularly dogmatic about when in the cycle of development testing is performed: whether it’s done during development to guide the design of the software or whether it’s done afterwards to validate the authored components. I do, however, have one major exception to this rule: when attempting to reproduce a customer issue and validate a fix for the issue.
This is purely a pragmatic decision that’s come from working on large scale distributed systems: testing and verification of distributed systems is extremely hard given the number of cooperating components involved in completing a task.
At Basho, we take two major approaches to testing Riak: integration testing using a open source test harness we’ve developed that allow us to validate high level operations of the system, and QuickCheck for randomized testing of smaller pieces of functionality.
Paolo – At the upcoming Erlang Factory in San Francisco you will give the following talk: “Verified Vector Clocks: An Experience Report”. Can you introduce in a few words the arguments you will treat during the talk?
Christopher – My talk is going to look at an alternative way of asserting correct operation of software components, commonly known as formal verification.
The talk will specifically focus on modeling vector clocks for use in the Riak datastore using an interactive theorem prover called Coq. This allows us to assert certain mathematical properties about our implementation, and perform extraction of the component into Erlang codewhich we can directly use in our system.
Paolo – Who should be interested in following your talk and why?
Christopher – Given the topics involved, I’m planning on keeping the talk pretty high level and will touch a variety of topics: the theorem prover Coq, which implements a dependently-typed functional programming language, the basics of using Core Erlang, a de-sugared subset of the Erlang programming language, and how we put all of the pieces together.
Paolo – Lamport’s vector clocks are well known by people working in fields connected to distributed systems. Can you explain briefly what they are and in what fields they can be used?
Christopher – Vector clocks provide a model for reasoning about events in a distributed system. It’s a bit involved for this interview to get into the specifics about how they work and when they should be used, so I’ll refer to you two excellent articles written by Bryan Fink and Justin Sheehy of Basho.
“Why Vector Clocks Are Easy”
“Why Vector Clocks Are Hard”
Paolo – About the application vvclocks, are you planning to keep the development on? If so how can people contribute?
Christopher – At this point, the project mainly serves as a playground for exploring how we might begin to approach building verifiable software components in Erlang. What has been done so far is available on GitHub, it’s actively being worked on by myself as my time allows, and if you’re interested in helping to explore this further, feel free to reach out to me via e-mail or on Twitter.