Hello! Today you can read my interview with Christopher Meiklejohn, one of the speakers at the upcoming Erlang Factory in San Francisco. Christopher works on Riak at Basho Technologies.
Erlang, Vector Clocks and Riak!
Paolo – Hello Chris! It’s great to have another Basho Erlanger here! Can you introduce yourself please?
Christopher – It’s great to be here! My name is Christopher Meiklejohn, and I’m currently a software engineer at Basho Technologies, Inc., where I work on our distributed datastore, Riak. In addition to that, I’m also a graduate student at Brown University, in Providence, RI, where I study distributed computing.
Paolo – Before joining Basho you worked at a different company (namely, Swipely) where you dealt with Ruby code. Did you already know Erlang when you started at Basho? How would you describe the switch between these two languages?
Christopher – During the time I was at Swipely they had Riak deployed in production, which was what initially got me interested in Basho, Riak, and, in particular, Erlang. When I joined Basho, I knew very little Erlang and spent my first few weeks at the company learning it.
That said, I love Erlang as a language and as a platform to build applications on. I wouldn’t necessarily say that the change from Ruby to Erlang was anything unexpected, specifically because I already had functional programming experience with languages like Scheme and Haskell.
Paolo – Rubyists tend to be addicted to TDD. Were you able to maintain this good practice when coding Erlang as well?
Christopher – Well, I’ll start with a disclaimer. I was primarily responsible for the introduction of behavior-driven development at Swipely for feature development, in addition to promoting pair programming within the development team.
That said, testing and verification of software is a very interesting topic to me.
While I believe that all software should be properly tested, I’ve never been particularly dogmatic about when in the cycle of development testing is performed: whether it’s done during development to guide the design of the software or whether it’s done afterwards to validate the authored components. I do, however, have one major exception to this rule: when attempting to reproduce a customer issue and validate a fix for the issue.
This is purely a pragmatic decision that’s come from working on large scale distributed systems: testing and verification of distributed systems is extremely hard given the number of cooperating components involved in completing a task.
At Basho, we take two major approaches to testing Riak: integration testing using an open-source test harness we’ve developed that allows us to validate high-level operations of the system, and QuickCheck for randomized testing of smaller pieces of functionality.
Paolo – At the upcoming Erlang Factory in San Francisco you will give the following talk: “Verified Vector Clocks: An Experience Report”. Can you briefly introduce the topics you will cover during the talk?
Christopher – My talk is going to look at an alternative way of asserting correct operation of software components, commonly known as formal verification.
The talk will specifically focus on modeling vector clocks for use in the Riak datastore using an interactive theorem prover called Coq. This allows us to assert certain mathematical properties about our implementation, and to extract the component into Erlang code which we can use directly in our system.
Paolo – Who should be interested in following your talk and why?
Christopher – Given the topics involved, I’m planning on keeping the talk pretty high level, and I will touch on a variety of topics: the theorem prover Coq, which implements a dependently-typed functional programming language; the basics of using Core Erlang, a de-sugared subset of the Erlang programming language; and how we put all of the pieces together.
Paolo – Lamport’s vector clocks are well known to people working in fields related to distributed systems. Can you explain briefly what they are and in which fields they can be used?
Christopher – Vector clocks provide a model for reasoning about events in a distributed system. It’s a bit involved for this interview to get into the specifics of how they work and when they should be used, so I’ll refer you to two excellent articles written by Bryan Fink and Justin Sheehy of Basho.
“Why Vector Clocks Are Easy”
“Why Vector Clocks Are Hard”
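For readers who want a feel for the idea before diving into the articles above, here is a minimal, language-neutral sketch in Python. It is an illustration only, not Riak’s actual implementation; the function names (`increment`, `merge`, `descends`) are chosen for this example.

```python
# Minimal vector clock sketch: each clock maps an actor id to a counter.
# Clock A "descends" clock B when A has seen everything B has seen.

def increment(clock, actor):
    """Return a new clock with the given actor's counter bumped by one."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new

def merge(a, b):
    """Merge two clocks by taking the per-actor maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

def descends(a, b):
    """True if clock a dominates (has seen at least as much as) clock b."""
    return all(a.get(k, 0) >= v for k, v in b.items())

# Two actors update independently: the resulting clocks are concurrent,
# which is how a system like Riak detects conflicting sibling values.
c1 = increment({}, "node_a")
c2 = increment({}, "node_b")
assert not descends(c1, c2) and not descends(c2, c1)   # concurrent

# Merging, then incrementing, produces a clock that dominates both.
resolved = increment(merge(c1, c2), "node_a")
assert descends(resolved, c1) and descends(resolved, c2)
```

Neither of the two concurrent clocks dominates the other, so neither update can safely be discarded; that is precisely the situation the articles above discuss.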
Paolo – About the application vvclocks: are you planning to continue its development? If so, how can people contribute?
Christopher – At this point, the project mainly serves as a playground for exploring how we might begin to approach building verifiable software components in Erlang. What has been done so far is available on GitHub; it’s actively being worked on by myself as my time allows, and if you’re interested in helping to explore this further, feel free to reach out to me via e-mail or on Twitter.
Hello there! In this post you can read my interview with Kenji Rikitake. Kenji is a famous Erlang developer and security expert. I really loved this interview because Kenji provided some really interesting anecdotes connected to his personal life and many insights into the IT industry in Japan.
The Erlanger from Japan
Paolo – Hello Kenji! It’s great to have you here! Please, can you describe yourself to our readers.
Kenji – My name is Kenji Rikitake. I am a relatively new user and programmer of Erlang; my experience is only about five years.
I’ve been working on various aspects of the internet and distributed computing for 25 years. I started as an intern VAX/VMS sysadmin in 1987. A couple of years later, in 1990, I became a VAX/VMS Asian screen management library programmer and a product tester at Digital Equipment Corporation Japan.
After leaving Digital in 1992, I decided to start my career as an internet sysadmin, or “devops” in today’s trendy word, and as a volunteer evangelist explaining how the internet would change the world. I worked for a systems integration company called TDI, where I co-designed and implemented a corporate firewall with BSD/OS systems and dedicated routers, including simple fault tolerance. The firewall system was running until 2000, when I left the company. I’ve also written two books in Japanese about internet engineering and technologies.
From 2001 to 2005, I was a researcher at KDDI R&D Labs, working on network security, intrusion detection systems, the DNS protocol, and teleworking. During that period, I also conducted joint research with Osaka University as a PhD student. My PhD thesis was about DNS reliability and security.
From 2005 to 2010, I was a researcher at the National Institute of Information and Communications Technology (NICT), a research body of the Telecom Ministry of Japan. I was involved in the preliminary design of a network intrusion early warning and analysis system called “nicter”, and later I pursued DNS reliability research, especially on the behavior of DNS packet fragments. I also worked on IPv6 and NGN security.
After I met Erlang/OTP in 2008, my research interests shifted to concurrent programming and various related issues, including security, efficiency, and robustness. Distributed database design is my latest research topic, for the obvious reason that I am currently working on building Riak. I’ve presented four talks at Erlang Factory SF Bay Area from 2010 to 2013, one each year.
From 2010 to 2012, I was a full professor at Kyoto University, though my primary role there was to implement and supervise the campus network security policies and procedures. I worked on two Mersenne Twister random number generator implementations for Erlang, called SFMT and TinyMT, which were published at the ACM Erlang Workshop in 2011 and 2012. I also organized the 2011 Workshop, held in Tokyo, as the Workshop General Chair.
I’m currently working for Basho Japan, a Japanese subsidiary of Basho Technologies.
I’m an electronics geek, and my Twitter handle @jj1bdx is derived from my primary ham radio call sign in Japan, which I’ve held since 1976. Morse code on the shortwave bands is one of my favorite radio activities, though from 1986 to 1990 I was also involved in packet radio activities based on TCP/IP. Music is another thing that makes me happy.
Paolo – First real question: how did you meet the functional programming world?
Kenji – I first read a Lisp book in the early 1980s when I was a teenager. I was not that interested in the S-expression, though, because I didn’t have an execution environment then. This was even before the C language was available for personal computers; I was playing around with my Apple II, mostly in assembly language and in two tiny programming languages called GAME and TL/1. I even wrote a GAME compiler for the 6502 running on the Apple II.
Before starting my real career, I was a lab member of Professor Eiiti Wada from 1988 to 1990, at the University of Tokyo. Prof. Wada and his lab members created a Lisp implementation called UtiLisp, and the lab was the most advanced place on campus for networking. I also learned some basic ideas about functional and even logic programming, because of the nationwide buzzword The Fifth Generation Computers. Some of the Wada lab alumni were the key designers and implementers of the language called Guarded Horn Clauses, which has a surprisingly similar design philosophy to Erlang’s, although it is a logic programming language.
My problem with understanding functional/logic programming, however, was that I couldn’t really grasp the core reasons why those programming paradigms were effective, or even required, for large-scale system design. I also failed a Prolog course in 1989, because I didn’t find the unification principle meaningful. So I was a very bad student. I wish I could have learned it then through Erlang-style pattern matching!
And unfortunately my mind in the late 1980s was too focused on how to run UUCP and email systems inexpensively without UNIX, so any functional or logic programming paradigm seemed redundant to me, because they were so slow. I didn’t like the regular commute from my home to the university, so I wanted to discover a way of working from home. At that time my main target of code hacking at home was MS-DOS; I had to wait until 1993, when I could use BSD/OS at home, to experience real UNIX there. I later moved to FreeBSD in 1997. And I’ve been running Erlang/OTP mostly on FreeBSD since 2008.
Paolo – And when did you first hear about Erlang?
Kenji – I first saw a Japanese translation of Joe Armstrong’s “Programming Erlang”, published by Ohmsha in November 2007, at a bookstore in downtown Tokyo in February 2008, on my way back home from Tokyo to Osaka. I instinctively knew this was the one I had to learn and go for, so I immediately bought it, and I have been discovering the world of Erlang ever since.
Paolo – You told me that you had some bad times during your experiences as a developer and university professor, but also that Erlang and functional programming helped you overcome your difficulties. Can you tell our readers something about that?
Kenji – Let me start with my programming midlife crisis first.
I was looking for something completely new and innovative in a programming system to learn, after realizing that working only in C was no longer sufficient to keep myself current as a modern programmer. I was sick and tired of understanding and modifying the BIND 9 DNS server code, written mostly in C, for a DNS research paper I was writing at the time. I don’t blame the BIND 9 programmers, because the code does really complex, magic things, and I admire the ISC people, especially Paul Vixie, one of my mentors at Digital Equipment and the father of BIND. Nevertheless, having to read hundreds of header macro lines to reach the actual code no longer looked practical to me. And I thought I would lose my competitiveness as a programmer if I stuck to the old way of C programming. So eventually I became a polyglot programmer; I use C, awk, Python, Perl, and Erlang.
I knew multi-core and massively parallel computing hardware was coming, and I wanted to learn something very different from the past sequential and inherently procedural programming languages and systems. While Erlang is *not* specifically designed for a massively parallel execution environment, Erlang does embed a lot of practical constraints for modern computing hardware in the language, for example the single-assignment variable principle, and in the OTP system itself, for example the gen_server behaviour framework, to encourage programmers to do the least wrong things. This is something which other languages cannot emulate or mimic.
Next, about the university professor life crisis.
During my Kyoto University career, most of what I was doing was talking, negotiating, and dealing with people, not with computers. The university is a very large organization, and keeping the campus network secure is practically impossible without help from the university members: the administrative, education, and research staff, and of course the students. I am an introverted person, and most of the university people are not geeks, although many are excellent researchers, so the human communication tasks were the toughest thing I have done in my life. Also, the long commute from my house to the office, four hours in total every day, literally killed me.
Fortunately, I was still allowed to do CS research during my Kyoto University career. And I was eligible to run large batch jobs on a large Linux supercomputer cluster. So I decided to run some Erlang code and do fun things over there. One good thing about Erlang is that it is mostly OS independent, so I did the prototyping on my home FreeBSD machines and let the huge multi-core jobs run on the cluster. I’ve put the research results on GitHub. So I didn’t have to throw away the possibility of a career as a CS researcher 🙂
Paolo – You are widely respected not only for your knowledge on Erlang/OTP but also for your expertise on distributed system security. What is the intersection between these two fields?
Kenji – Erlang/OTP is a very good candidate for building a reliable system. This means it would be a prospective candidate even for a secure system, if properly designed. In other words, an unreliable system could *never* be secure. And no system is 100% reliable.
The word “security” has a lot of implications in many different aspects, and is widely misused in many contexts, even if I exclude the militaristic and socialistic implications, which may be out of scope of this interview, though very serious issues themselves indeed.
I believe that the foundation of a secure system is a reliable and fault-tolerant system. This has been frequently ignored even by many “security” experts; for many of them, security is only about cryptography, or about restricting the user’s behavior in a system, or just about analyzing the behavior of pieces of malware. I do not deny those aspects; they are very important, and the outcomes of those research activities are surely essential for making better computer systems. But those aspects alone are not *the* security. A very broad perspective is needed for a computer security expert.
Also, I have to stress that security is mostly about people and how people behave. People want convenient systems, and in many circumstances security and convenience do not coexist. For example, if you really want a secure system, do not connect it to the internet. But such a special system, one that provides sufficient communication capability within the system while rejecting all attacks and exposing zero attack vectors, is virtually impossible to build from a financial point of view. Look at the Stuxnet case. Consider what might have happened if the power plant had been using Erlang/OTP for the core and the end-point controllers.
I wish Erlang/OTP developers would always think about making reliable software. It’s not that difficult; thinking carefully while programming solves most cases.
Paolo – What is the best way to “secure” an Erlang distributed system?
Kenji – Traditionally, putting the whole system in a protected network is the only solution. And unfortunately it still is.
This is a very good question to answer, because in the current Distributed Erlang (disterl) system on OTP, the security model is very weak, if one exists at all. TLS-based disterl (with the ssl and crypto modules) is a good solution for protecting the communication between BEAMs, but the problem is that the communication between the BEAMs and the port-mapper daemons is plain text, and it’s not trivial to incorporate the necessary authentication and cryptographic features.
Erlang/OTP has long depended on the assumption that the whole disterl cluster is in a protected network without any attack vectors. In other words, the disterl cluster itself has been considered a system without internal protection. Opening the communication ports to the internet, however, makes this assumption rather unrealistic; Erlang/OTP devops must think about all the possible attack vectors against the disterl cluster as a whole system.
One possibility for protecting BEAM-to-BEAM communication is to establish cryptographically authenticated links between the BEAMs and let the links be used persistently, with proper periodic re-keying, without using any port-mapper daemon. I believe incorporating such a facility into Erlang is not that difficult, though the rendezvous problem between the multiple BEAMs would have to be solved in another way.
Paolo – During your time as a professor at Kyoto University, you also did research using Erlang and OTP. In particular, you worked on SFMT and TinyMT. Would you like to introduce these two projects to our readers?
Kenji – Mersenne Twister (MT), a BSD-licensed innovative long-period (typically 2^19937 – 1) non-cryptographic pseudo random number generator (PRNG) by Profs. Makoto Matsumoto and Takuji Nishimura, has become the de facto standard in popular programming languages such as Python and R. SIMD-oriented Fast MT (SFMT) and TinyMT are improved algorithms by Profs. Makoto Matsumoto and Mutsuo Saito. The MT algorithms all have a very high order of equidistribution, which fits very well with large-scale simulation, including software testing.
SFMT is an improved version of the original MT, which is even faster than MT and has tunable characteristics of generation period and sequence generation. TinyMT is another variant of MT, which has a much shorter generation period (2^127 – 1) and a smaller memory footprint, but is still suitable for most simulation uses. The TinyMT algorithm is much more compact than SFMT or MT, and it can generate a massive number (~2^56) of independent orthogonal number sequences, which makes it suitable for massively parallel asynchronous PRNG.
For the further details on SFMT and TinyMT, please take a look at: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
I am not a mathematician, so I cannot mathematically prove how MT and its derivatives are better than other algorithms. But I have to emphasize very strongly that Erlang/OTP’s random module still uses an archaic algorithm invented in the 1980s, which has a significantly shorter generation period (~2^43), and which has already been an indirect source of a security vulnerability once (CVE-2011-0766, discovered by Geoff Cant). SFMT and TinyMT have much better characteristics than the random module, and I strongly suggest you try them out if you really need a better non-cryptographic PRNG.
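The archaic algorithm behind Erlang’s legacy random module is AS183 (Wichmann-Hill, 1982). A small Python sketch (an illustration of the algorithm, not the actual Erlang code) shows how little state it carries, which explains both its speed and its short period:

```python
# AS183 (Wichmann-Hill, 1982): the algorithm behind Erlang's legacy
# 'random' module. It keeps only three small integer states, so its
# combined period is roughly 7e12 (~2^43), versus 2^19937 - 1 for
# Mersenne Twister.

def as183_next(state):
    """Advance the three-part state; return (uniform_float, new_state)."""
    s1, s2, s3 = state
    s1 = (171 * s1) % 30269
    s2 = (172 * s2) % 30307
    s3 = (170 * s3) % 30323
    # Combine the three streams into one uniform value in [0.0, 1.0).
    u = (s1 / 30269 + s2 / 30307 + s3 / 30323) % 1.0
    return u, (s1, s2, s3)

state = (3172, 9814, 20125)   # an arbitrary non-zero seed triple
for _ in range(5):
    u, state = as183_next(state)
    assert 0.0 <= u < 1.0
```

With so little state, any simulation drawing more than a few trillion numbers will cycle, and the small seed space is part of what made the generator guessable in the vulnerability Kenji mentions.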
Recently I published 256M (= 2^28) precomputed keys of TinyMT 32-bit and 64-bit generation parameters. The archive is huge (~82GB), but if you would like to use TinyMT for a serious simulation, it is worth taking a look at. The archive is at: https://github.com/jj1bdx/tinymtdc-longbatch/
Paolo – Currently you are working at Basho Japan. Can I ask you what it is like to work at one of the most acknowledged Erlang companies? How much Erlang code do you see in your daily working routine?
Kenji – Basho developers are all superb and are very energetic about making Riak and the recently open-sourced Riak CS even better products. Working with such talented engineers and keeping up with them is very, very tough, but if you are capable of pointing out bugs and proposing contributions that are proven to work correctly in Basho’s open-source projects, you will surely be welcome.
I would also like to emphasize that Basho is not just an Erlang company. You need to know many programming languages and computer science fundamentals, from C, C++, Java, Python, and Ruby, to the gory details of distributed databases, including how vector clocks work and commutative/conflict-free replicated data types (CRDTs). Riak, Riak CS, and rebar include a lot of by-products of these. Look at the deps/ directory under Riak and you will be astonished. On the other hand, this means there are many ways to contribute your skills.
I would also like to emphasize that Basho’s client service engineers, sales and marketing people (including the documentation experts), and all the other staff members work closely together with the developers and maintain a high standard of delivering quality service and products.
I can only answer that the amount of Erlang code I have to see is *enormous*. 🙂
Paolo – I am very interested in Erlang and Japan. Is Erlang a niche programming language there as well, or is it spreading as fast as in the US and northern Europe?
Kenji – I would rather ask the question back: is Erlang a popular language anywhere in the world? I think the answer is probably no, compared to the popularity of Java or C++. Looking at the TIOBE index will prove this. And I’d rather say nobody cares, because whether a language is spreading fast or not has become irrelevant compared to the jobs or tasks you want to get done with the language.
I do understand Erlang has gained a larger momentum in Sweden, where the language is from. And I see many people solving problems with Erlang in Europe. And in the USA and Canada (hi Fred!). And in Japan too, especially for server-side programming solutions. So I feel the developers in Japan are slowly but surely showing more interest.
Getting back to the situation in Japan: I think not many people are interested in any new programming paradigm, except for a relatively small number of communities. Fortunately, those communities surely exist. And some visionaries have discovered languages such as Haskell, OCaml, or Erlang to solve *their* problems and to help others solve theirs. But for the majority of programmers, most of the details are “not really something to be carefully taken care of, but something to be blindly delegated to the experts”, also called the *omakase* attitude in Japan. So most programmers just do the omakase to Rails, or to Java libraries, or to pre-built C++ libraries. And that irresponsible attitude towards their profession, though not necessarily their sole responsibility, causes a lot of sometimes lethal or disastrous bugs in production systems. Unfortunately, many programmers in Japan are not well educated as software engineers, and their supervisors are sometimes even worse. Their mindset of dumping the risks (or *doing the marunage*) for every difficult problem makes things even worse.
I think programming is not something for omakase, and the quality of code will not be sufficient so long as the major users of computers in Japan keep doing the marunage to the developers. And I believe Erlang/OTP is not for people who are unwilling to take the risk of their own computer systems. On the other hand, for those who want to maintain their systems themselves, or at least to eagerly, deliberately, and willingly take responsibility for running a system without major outages, Erlang/OTP will be a great tool, because it provides critical and essential functions such as non-stop module replacement.
Paolo – Like many other Erlang gurus out there, you are very active not only in promoting new Erlang applications but also when Erlang newbies ask for support or suggestions. In your opinion, what are the factors that make the Erlang community so nice?
Kenji – I was very much impressed by the friendly environment of the erlang-questions mailing list and the modest attitude of the experienced community-driving people there when I first asked some questions. I just read and read and read everything in the Erlang-related mailing lists, as much as I could. Erlang Workshop papers were also an excellent source of information. And now we have plenty of good code on GitHub, including OTP itself. So we have many, many more things ready to learn from, for free!
I’ve heard that one of the old sayings in the Erlang community is “no prima donna allowed”. This is so important for maintaining a community. I understand everybody wants to get grumpy sometimes, and quite often flame wars occur, but many people just endure and keep silent. I respect this rather European or even Swedish way of getting rid of chaos 🙂
Paolo – I think that the Erlang community is growing fast: many applications, conferences, and new books. Still, most developers out there don’t know that behind many of the tools they use every day there is a piece of Erlang. How would you explain that?
Kenji – I think this is in fact a very good thing. People want to solve their own problems with whatever tools they have to use, or think suitable to use. Erlang has flexible release-packaging tools which free the users of a package from thinking about installing Erlang/OTP itself. In many popular applications, the Erlang virtual machine and the necessary libraries are silently built in and just there, and most people don’t care whether an application uses Erlang/OTP or not, so long as the software works fine. Erlang/OTP has become a part of the infrastructural ecosystem.
Of course, there is a strong negative side to this trend, too: developers do the marunage, with the omakase attitude, to the developers of those infrastructural tools, with no knowledge about the tools. I try not to fall into this trap by building all the user-land programs, kernels, and Ports of my FreeBSD development servers myself, and I have done so for at least the past ten years. You have to think about the bugs if you have to build your own tools; this is a very good way to learn new things. You need to force yourself to do so frequently.
Paolo – OK, Kenji. Many thanks for the interview!
Kenji – You’re welcome!
If you often end up on my blog (which of course I hope you do), you will know that I try to spread news and information about Erlang-related events all around the world, as I did for Erlang Camp.
A couple of days ago, I heard about Tech Mesh: The Alternative Programming Conference which will take place in London on December 4th – 6th, 2012.
IMHO, the event is very interesting, mostly because the main focus of the conference is supporting and promoting fabulous but still non-mainstream technologies like Erlang, Clojure, Scala, and Haskell, alongside other emerging technologies. During the conference there will be time to listen to users and inventors of various languages, but also time to learn about and discuss what the future will bring.
The lineup of speakers is really amazing! Speaking about Erlang, I can tell you that Joe Armstrong, Mike Williams and Robert Virding will be together for the very first time (if you don’t count “Erlang the movie”!), but many other famous Erlangers will be there too, so if you want to know more about them just take a look here.