Hello! Today you can read my interview with Christopher Meiklejohn, one of the speakers at the upcoming Erlang Factory in San Francisco. Christopher works on Riak at Basho Technologies.
Erlang, Vector Clocks and Riak!
Paolo – Hello Chris! It’s great to have another Basho Erlanger here! Can you introduce yourself please?
Christopher – It’s great to be here! My name is Christopher Meiklejohn, and I’m currently a software engineer at Basho Technologies, Inc., where I work on our distributed datastore, Riak. In addition to that, I’m also a graduate student at Brown University, in Providence, RI, where I study distributed computing.
Paolo – Before joining Basho you were working at a different company, Swipely, where you dealt with Ruby code. Did you already know Erlang when you started at Basho? How would you describe the switch between these two languages?
Christopher – During the time I was at Swipely they had Riak deployed in production, which was what initially got me interested in Basho, Riak, and, in particular, Erlang. When I joined Basho, I knew very little Erlang and spent my first few weeks at the company learning it.
That said, I love Erlang as a language and as a platform to build applications on. I wouldn’t say that the change from Ruby to Erlang brought anything unexpected, specifically because I already had functional programming experience using languages like Scheme and Haskell.
Paolo – Rubyists tend to be addicted to TDD. Were you able to maintain such a good practice also when coding Erlang?
Christopher – Well, I’ll start with a disclaimer. I was primarily responsible for the introduction of behavior driven development at Swipely for feature development, in addition to promoting pair programming within the development team.
That said, testing and verification of software is a very interesting topic to me.
While I believe that all software should be properly tested, I’ve never been particularly dogmatic about when in the cycle of development testing is performed: whether it’s done during development to guide the design of the software or whether it’s done afterwards to validate the authored components. I do, however, have one major exception to this rule: when attempting to reproduce a customer issue and validate a fix for the issue.
This is purely a pragmatic decision that’s come from working on large scale distributed systems: testing and verification of distributed systems is extremely hard given the number of cooperating components involved in completing a task.
At Basho, we take two major approaches to testing Riak: integration testing using an open source test harness we’ve developed that allows us to validate high-level operations of the system, and QuickCheck for randomized testing of smaller pieces of functionality.
Paolo – At the upcoming Erlang Factory in San Francisco you will give the following talk: “Verified Vector Clocks: An Experience Report”. Can you introduce in a few words the topics you will cover during the talk?
Christopher – My talk is going to look at an alternative way of asserting correct operation of software components, commonly known as formal verification.
The talk will specifically focus on modeling vector clocks for use in the Riak datastore using an interactive theorem prover called Coq. This allows us to assert certain mathematical properties about our implementation, and to extract the component into Erlang code which we can directly use in our system.
Paolo – Who should be interested in following your talk and why?
Christopher – Given the topics involved, I’m planning on keeping the talk pretty high level and will touch a variety of topics: the theorem prover Coq, which implements a dependently-typed functional programming language, the basics of using Core Erlang, a de-sugared subset of the Erlang programming language, and how we put all of the pieces together.
Paolo – Lamport’s vector clocks are well known by people working in fields connected to distributed systems. Can you explain briefly what they are and in what fields they can be used?
Christopher – Vector clocks provide a model for reasoning about events in a distributed system. It’s a bit involved for this interview to get into the specifics about how they work and when they should be used, so I’ll refer you to two excellent articles written by Bryan Fink and Justin Sheehy of Basho.
“Why Vector Clocks Are Easy”
“Why Vector Clocks Are Hard”
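For readers who want a concrete picture before following those links, here is a minimal sketch of the general vector clock idea in Python. It is not Riak's implementation and not the verified Coq model from the talk; the function names and the dictionary representation are assumptions made purely for illustration.

```python
from typing import Dict

# A clock maps a node name to the number of events seen from that node.
# (Hypothetical representation chosen for this sketch.)
VClock = Dict[str, int]


def increment(clock: VClock, node: str) -> VClock:
    """Return a copy of the clock with one more event recorded for `node`."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VClock, b: VClock) -> VClock:
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def descends(a: VClock, b: VClock) -> bool:
    """True if `a` has seen every event that `b` has seen."""
    return all(a.get(n, 0) >= count for n, count in b.items())


def concurrent(a: VClock, b: VClock) -> bool:
    """True if neither clock descends from the other (a conflict)."""
    return not descends(a, b) and not descends(b, a)
```

Two updates that both start from the same clock but happen on different nodes come out as concurrent, which is the situation a datastore has to surface or resolve.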
Paolo – About the application vvclocks, are you planning to continue its development? If so, how can people contribute?
Christopher – At this point, the project mainly serves as a playground for exploring how we might begin to approach building verifiable software components in Erlang. What has been done so far is available on GitHub. I’m actively working on it as my time allows, and if you’re interested in helping to explore this further, feel free to reach out to me via e-mail or on Twitter.
Hello there! Today I want to introduce my interview with John Koenig. John is a PhD student at the University of Minnesota in the field of distributed, real-time simulation. John also works at the game studio he founded. They are currently developing the game The Electric Adventures of Watt, which has some Erlang in it.
Learning something more about Ymir
Paolo – Hello John and welcome to my blog! Can you please introduce yourself to our readers?
John – Hi Paolo, thanks for having me. My name is John Koenig and I am a PhD student at the University of Minnesota (UMN) studying distributed, real-time simulation. I am going into the third year of my PhD program preparing for my written and oral defenses. I have been a regular Erlang-user for about 6 years.
Prior to, and intermixed with, my time at UMN, I worked at Cray Inc. Most recently I was contracted under Cray’s Chapel team, where I worked on several language improvements in the area of portability. A majority of my time at Cray was spent as part of their Custom Engineering initiative. Together, we engineered unique supercomputing platforms and software stacks for various customers.
In 2010, I founded a game studio, Called Shot LLC, with my good friends Gabriel Brockman and William Block. We are currently in the first round of funding for our flagship game title: The Electric Adventures of Watt.
Paolo – This is a common question I ask during my interviews: how did you start using Erlang? What are the features of Erlang that made you learn it?
John – I was first introduced to Erlang while pursuing my undergraduate degree at University of Wisconsin – Eau Claire (UWEC), I think it was around 2006. As part of a Programming Languages course we were tasked with picking a new language and implementing a solution to a sufficiently interesting problem which applied to the language’s domain. At the time, I was big into Plan 9 and distributed software in general so I chose Erlang and implemented a distributed prime number sieve.
Being more of an applied school, UWEC had me spending most of my time programming C and C++ and I remember being really impressed with how Erlang modeled processes and inter-process communication directly in the language. Once I got past Erlang’s syntax learning curve and my newness to functional programming, I found myself able to express distributed solutions very naturally in Erlang. After that, I was hooked. I picked up Joe’s book, Programming Erlang, and started keeping up with the Erlang community online.
I first started using Erlang professionally at Basho in early 2008 when I was brought on as a Reliability Engineer. I had thought, coming out of UWEC, that I knew Erlang fairly well, but I grew considerably during my six months at Basho. Justin and his development team are incredibly talented and being around that level of skill and enthusiasm was highly contagious. I remember that time fondly.
Paolo – Would you like to introduce and describe in a few lines what Ymir is? Where will Ymir be used?
John – Ymir is an open-source (GPL), cross-platform, distributed 3D game engine written in Erlang.
With the number of cores available to gamers on the rise, Ymir’s purpose is to break games out of the traditional game loop dominated by a single core and, in doing so, achieve faster, larger simulations which grow in proportion to the number of available cores.
Paolo – Why did you decide to use Erlang for Ymir? Was there any other candidate language at the beginning of the project?
John – Ymir grew out of a desire to create a multi-player RPG that got away from the traditional client/server model. A few friends and I enjoyed online RPGs but didn’t enjoy the MMORPG scene. We were interested in the approach of Neverwinter Nights 2, however, which featured smaller worlds developed and hosted by members of the community. Hosting these worlds could get terribly expensive, as each world was hosted on a single server and, as the player base grew, admins would be required to either co-locate their server or pay for expensive home internet access with sufficient upload speed. I set out to change this, wanting instead to see a game capable of simulating a world in a more peer-to-peer fashion. Namely, a game engine capable of utilizing the additional computational power and bandwidth present as players log in to enjoy the simulated world.
I didn’t consider anything other than Erlang for this task. Along with OTP, Erlang is still the best language for distributed development as it allows me to focus more on the high-level challenges of distributed real-time simulation and less on the gritty details of implementing my own task-queues, inter-process communication, etc. This choice was further cemented when I proved that communication to port drivers, with minimal trickery, was sufficiently fast to support online rendering.
Paolo – Reading your paper I spotted many words often used among Erlang developers: scalable, soft-thread, message passing and minimal amount of synchronization. Would you like to discuss the meaning of each term with respect to Erlang and Ymir?
John – Game engines are traditionally frame-centric. Their primary goal is to compute and render frames as quickly as possible. Two aspects make this approach difficult when scaling over multiple cores: first, computing a frame is recursively dependent on the frames which came before it and, second, traditional spatial data structures used in collision detection require all game entities to be synchronized in order to function.
Ymir takes an object-centric approach and aggregates frames as quickly as possible. Game objects (entities) are represented as Erlang processes (soft-threads) and each entity is responsible for simulating itself locally. Discrete events (e.g. collisions, user-input) are modeled as messages which occur at a specific point in simulation time. As an optimization we allow entities to exist at various points in simulation time and to resolve events in the recent past by application of timewarp. Entities proceed through their local simulations, streaming updates to their physical state to relevant renderers. To enforce fairness, a sense of global time is defined as the minimum of all entity simulation times. This introduces a small amount of global synchronization as entities “vote” on the value of this global time through various shared ETS-based counters.
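The "global time as a minimum" idea can be sketched in a few lines of Python. This is only an illustration of the definition John gives, not Ymir's code: Ymir uses Erlang processes and shared ETS counters, while the names below and the `max_lead` window that limits how far an entity may run ahead are hypothetical additions for the example.

```python
from typing import Dict


def global_time(entity_times: Dict[str, float]) -> float:
    """Global time is the minimum of all local entity simulation times."""
    return min(entity_times.values())


def may_advance(entity_times: Dict[str, float], entity: str,
                step: float, max_lead: float) -> bool:
    """Allow a step only if the entity stays within `max_lead` of global time.

    `max_lead` is an assumed fairness window, not something taken from Ymir.
    """
    proposed = entity_times[entity] + step
    return proposed <= global_time(entity_times) + max_lead
```

With local times of 1.0, 1.5 and 3.0 and a window of 2.0, the slowest entity is free to step forward, while the one already at 3.0 has to wait for the others to catch up.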
To break away from traditional spatial data-structures, Ymir applies map/reduce to spatial reasoning in order to achieve scalable collision detection. When simulating forward in time, entities volumetrically hash their physical extents against a fixed cube to various buckets (also soft-threads) and aggregate contacts which result from writing their latest physical states into each selected bucket. The act of mapping to spatial buckets is analogous to selecting nearest-neighbors (broadphase) and the buckets themselves compute points of contact for each pair of entities overlapping within its given volume of simulation space. In short, Ymir serializes only those objects which are sufficiently close together while permitting objects sufficiently separated in space to simulate unimpeded.
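The map and reduce steps described above can be sketched roughly as follows. This is a sequential Python illustration under simplifying assumptions (entities are axis-aligned boxes, buckets are dictionary entries rather than Erlang processes, and a "contact" is just an overlapping pair); every name in it is hypothetical and none of it is taken from Ymir.

```python
from collections import defaultdict
from itertools import combinations, product
from math import floor

# An entity's physical extent: (min corner, max corner) of an axis-aligned box.


def cells_for(lo, hi, cell):
    """Map step: yield every grid cell (bucket key) the box touches."""
    ranges = [range(floor(l / cell), floor(h / cell) + 1)
              for l, h in zip(lo, hi)]
    return product(*ranges)


def overlaps(a, b):
    """Narrow test: do two boxes overlap on every axis?"""
    (alo, ahi), (blo, bhi) = a, b
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(alo, ahi, blo, bhi))


def find_contacts(entities, cell=1.0):
    """Hash entities into buckets, then check pairs only within each bucket."""
    buckets = defaultdict(list)
    for name, (lo, hi) in entities.items():
        for key in cells_for(lo, hi, cell):
            buckets[key].append(name)

    # Reduce step: each bucket reports the overlapping pairs it contains.
    # A set removes duplicates when a pair shares more than one bucket.
    contacts = set()
    for members in buckets.values():
        for a, b in combinations(sorted(members), 2):
            if overlaps(entities[a], entities[b]):
                contacts.add((a, b))
    return contacts
```

Entities that never share a bucket are never compared, which is the property that lets well-separated objects simulate without waiting on each other.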
Using map/reduce in this fashion allows Ymir to scale out over many cores very well. Currently, we are able to realize a ~11x speedup in overall simulation time on 16 cores and sustained frame rates of ~500 fps. Ymir’s performance depends on many factors; chief among these, however, is the degrees of freedom between entities. As entities are serialized based on spatial proximity, scenes where all entities exist in persistent contact are currently unable to obtain such lofty speedups. I am currently expanding our methods to better model persistent contact, which will help Ymir obtain better speedups in these scenarios. Furthermore, using map/reduce to compute contacts works well locally or over low-latency networks, but as we scale up to many machines connected with higher latency, other approaches will be needed. I am currently investigating network overlays between entities which capitalize on spatial assumptions present in game simulations.
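To put the quoted figures in perspective, a quick back-of-the-envelope calculation helps. The sketch below computes parallel efficiency and, assuming Amdahl's law is a reasonable model here (an assumption of this example, not a claim from the interview), the parallel fraction such a speedup would imply. The function names are made up for illustration.

```python
def efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


def implied_parallel_fraction(speedup: float, cores: int) -> float:
    """Solve Amdahl's law, S = 1 / ((1 - p) + p / N), for p."""
    return (1 - 1 / speedup) / (1 - 1 / cores)


print(f"efficiency: {efficiency(11, 16):.2f}")
print(f"implied parallel fraction: {implied_parallel_fraction(11, 16):.3f}")
```

An 11x speedup on 16 cores works out to roughly 69% efficiency, and under Amdahl's model it corresponds to about 97% of the work running in parallel. Since the ~11x figure is itself approximate, these numbers are only indicative.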
Paolo – Do you have any partial results about Ymir our readers can take a look at? What kind of tests do you do on Ymir?
John – We are actively maintaining performance results on Ymir’s indiedb page, and as time permits I will be documenting Ymir more completely on our development blog.
This video showcases the three testing scenarios we used to gather our preliminary results. All three scenes are rendered offline using Ymir’s built-in support for Mitsuba. Parallel is rather boring to watch, but provides a best-case performance. Cylinders features a stack of spheres falling onto a static array of cylinders and is more representative of the types of rigid body interactions one might see in an interactive game. Last is Bounce, in which spheres move randomly within fixed scene boundaries. Bounce is currently being used to measure Ymir’s performance as it relates to scene density.
Paolo – Is there any way our readers can contribute to the development of Ymir? Is there any fund-raising? Can other developers join the project?
John – Glad you asked, yes! We are currently on indiegogo seeking funding for The Electric Adventures of Watt which will be powered by Ymir. In supporting The Electric Adventures of Watt, contributors will be directly helping us mature Ymir into its first public release.
We will be advertising the public repositories for Ymir concurrently with its first official release. In the meantime, if developers are interested in working on Ymir, please, don’t hesitate to get in touch with me: john calledshot.org.
Paolo – You are a PhD student at the University of Minnesota and your studies are mainly focused on parallel and distributed real time simulation. Do you think Erlang could be widely used in these fields?
John – Without a doubt, that is what Erlang was designed for. There is even a precedent for using Erlang server-side for games: MuchDifferent and SMASH. I feel that as CPUs continue to have more cores and affordable CPU accelerators (e.g. Parallella) become available, game developers will turn to solutions like Ymir to grow their games. Scalable, real-time simulation is not an easy undertaking and savvy developers will be looking for the right tools for the job.
As many Erlang enthusiasts know, there is at times substantial resistance to using languages that are outside of other developers’ comfort zones. This is especially true for academia. I can’t even count the number of times I have had to defend Erlang to my lab mates at UMN. That said, we are not expecting that all developers wishing to use Ymir will embrace Erlang, and we have on our roadmap to develop front-ends for languages most game developers will find familiar: C/C++/Lua.
Paolo – Did you find any help in the Erlang community? Did any Erlang developer give you feedback or support online during your development?
John – Several times I got frustrated with the performance of Mnesia and ETS for Ymir’s collision detection and I turned to the Erlang IRC channel for support and guidance. The Erlang community has been nothing but insightful and supportive every time I have turned to them. Although for the life of me I cannot remember the handles of those who offered help, I owe the Erlang community thanks.
We also owe special thanks to you, Paolo, for featuring Ymir on your blog, Peer Stritzinger for helping us reach out to the Erlang community, and to your readers.