Posts Tagged ‘basho’

Talking about Erlang, Riak and Vector Clocks with Christopher Meiklejohn (@cmeik)

February 16, 2014

Hello! Today you can read my interview with Christopher Meiklejohn, one of the speakers at the upcoming Erlang Factory in San Francisco. Christopher works on Riak at Basho Technologies.

Erlang, Vector Clocks and Riak!

 

Paolo – Hello Chris! It’s great to have another Basho Erlanger here! Can you introduce yourself please?

Christopher – It’s great to be here! My name is Christopher Meiklejohn, and I’m currently a software engineer at Basho Technologies, Inc., where I work on our distributed datastore, Riak. In addition to that, I’m also a graduate student at Brown University, in Providence, RI, where I study distributed computing.

Paolo – Before joining Basho you worked at a different company (i.e., Swipely), where you dealt with Ruby code. Did you already know Erlang when you started at Basho? How would you describe the switch between these two languages?

Christopher – During the time I was at Swipely they had Riak deployed in production, which was what initially got me interested in Basho, Riak, and, in particular, Erlang. When I joined Basho, I knew very little Erlang and spent my first few weeks at the company learning it.

That said, I love Erlang as a language and as a platform to build applications on. I wouldn’t necessarily say that the change from Ruby to Erlang was anything unexpected, specifically because I already had functional programming experience using languages like Scheme and Haskell.

Paolo – Rubyists tend to be addicted to TDD. Were you able to maintain this good practice when coding in Erlang as well?

Christopher – Well, I’ll start with a disclaimer. I was primarily responsible for the introduction of behavior driven development at Swipely for feature development, in addition to promoting pair programming within the development team.

That said, testing and verification of software is a very interesting topic to me.

While I believe that all software should be properly tested, I’ve never been particularly dogmatic about when in the cycle of development testing is performed: whether it’s done during development to guide the design of the software or whether it’s done afterwards to validate the authored components. I do, however, have one major exception to this rule: when attempting to reproduce a customer issue and validate a fix for the issue.

This is purely a pragmatic decision that’s come from working on large scale distributed systems: testing and verification of distributed systems is extremely hard given the number of cooperating components involved in completing a task.

At Basho, we take two major approaches to testing Riak: integration testing using an open source test harness we’ve developed that allows us to validate high level operations of the system, and QuickCheck for randomized testing of smaller pieces of functionality.

Paolo – At the upcoming Erlang Factory in San Francisco you will give the following talk: “Verified Vector Clocks: An Experience Report”. Can you introduce in a few words the topics you will cover in the talk?

Christopher – My talk is going to look at an alternative way of asserting correct operation of software components, commonly known as formal verification.

The talk will specifically focus on modeling vector clocks for use in the Riak datastore using an interactive theorem prover called Coq. This allows us to assert certain mathematical properties about our implementation, and perform extraction of the component into Erlang code, which we can directly use in our system.

Paolo – Who should be interested in following your talk and why?

Christopher – Given the topics involved, I’m planning on keeping the talk pretty high level and will touch on a variety of topics: the theorem prover Coq, which implements a dependently-typed functional programming language; the basics of using Core Erlang, a de-sugared subset of the Erlang programming language; and how we put all of the pieces together.

Paolo – Lamport’s vector clocks are well known by people working in fields connected to distributed systems. Can you explain briefly what they are and in what fields they can be used?

Christopher – Vector clocks provide a model for reasoning about events in a distributed system. It’s a bit involved for this interview to get into the specifics about how they work and when they should be used, so I’ll refer you to two excellent articles written by Bryan Fink and Justin Sheehy of Basho.

“Why Vector Clocks Are Easy”
http://basho.com/why-vector-clocks-are-easy/

“Why Vector Clocks Are Hard”
http://basho.com/why-vector-clocks-are-hard/
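
For readers who want a concrete picture before following the links, here is a minimal sketch of the general vector clock idea in Python. It is not Riak’s implementation and not the Coq model discussed in the talk; the function names (increment, merge, descends, concurrent) and the node names are hypothetical and chosen only for illustration. Each clock maps a node name to a counter, and one clock “descends” from another when it has seen at least as many events from every node.

```python
from typing import Dict

# A vector clock is modelled here as a plain mapping: node name -> event counter.
VClock = Dict[str, int]


def increment(clock: VClock, node: str) -> VClock:
    """Return a copy of the clock with one more event recorded for `node`."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VClock, b: VClock) -> VClock:
    """Combine two clocks by taking the per-node maximum counter."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def descends(a: VClock, b: VClock) -> bool:
    """True if `a` has seen every event that `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a: VClock, b: VClock) -> bool:
    """True if neither clock descends from the other (a potential conflict)."""
    return not descends(a, b) and not descends(b, a)


# Usage: two updates made independently from the same starting point.
base = increment({}, "alice")
left = increment(base, "bob")
right = increment(base, "carol")
merged = merge(left, right)
```

In this sketch, left and right both descend from base but are concurrent with each other, and merged descends from both of them. How a datastore resolves such concurrent updates is a separate design decision that this sketch does not cover.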

Paolo – About the application vvclocks, are you planning to continue its development? If so, how can people contribute?

Christopher – At this point, the project mainly serves as a playground for exploring how we might begin to approach building verifiable software components in Erlang. What has been done so far is available on GitHub, it’s actively being worked on by myself as my time allows, and if you’re interested in helping to explore this further, feel free to reach out to me via e-mail or on Twitter.

An interview with Kenji Rikitake (@jj1bdx)

June 27, 2013

Hello there! In this post you can read my interview with Kenji Rikitake. Kenji is a famous Erlang developer and security expert. I really loved this interview because Kenji provided some really interesting anecdotes connected to his personal life and many insights into IT in Japan.

The Erlanger from Japan

Paolo – Hello Kenji! It’s great to have you here! Can you please describe yourself to our readers?

Kenji – My name is Kenji Rikitake. I am a relatively new user and programmer of Erlang; my experience is only about five years.

I’ve been working on various aspects of internet and distributed computing for 25 years. I started as an intern VAX/VMS sysadmin in 1987. A couple of years later, in 1990, I became a programmer of the VAX/VMS Asian screen management library and a product tester at Digital Equipment Corporation Japan.

After leaving Digital in 1992, I decided to start my career as an internet sysadmin, or “devops” in the latest trendy word, and as a volunteer evangelist explaining how the internet would change the world. I worked for a systems integration company called TDI, and co-designed and implemented a corporate firewall with BSD/OS systems and dedicated routers, including simple fault tolerance. The firewall system was running until 2000, when I left the company. I’ve also written two books in Japanese about internet engineering and technologies.

From 2001 to 2005, I was a researcher at KDDI R&D Labs, working on network security for intrusion systems, the DNS protocol, and teleworking. During that period, I also conducted joint research with Osaka University as a PhD student. My PhD thesis was about DNS reliability and security.

From 2005 to 2010, I was a researcher at the National Institute of Information and Communications Technology (NICT), a research body of the Telecom Ministry of Japan. I was involved in the preliminary design of a network intrusion early warning and analysis system called “nicter”, and later I pursued DNS reliability research, especially on the behavior of DNS packet fragments. I also worked on IPv6 and NGN security.

After I met Erlang/OTP in 2008, my research interests shifted to concurrent programming and various related issues, including security, efficiency, and robustness. Distributed database design is my latest research topic, for the obvious reason that I am currently working on building Riak. I presented four talks at Erlang Factory SF Bay Area from 2010 to 2013, one each year.

From 2010 to 2012, I was a full professor at Kyoto University, though my primary role there was to implement and supervise the campus network security policies and procedures. I worked on two Mersenne Twister random number implementations for Erlang, called SFMT and TinyMT, which were published at the ACM Erlang Workshop in 2011 and 2012. I also organized the 2011 Workshop, held in Tokyo, as the Workshop General Chair.

I’m currently working for Basho Japan, a Japanese subsidiary of Basho Technologies.

I’m an electronics geek, and my Twitter handle @jj1bdx is derived from my primary ham radio call sign in Japan, which I’ve held since 1976. Morse Code on shortwave is one of my favorite things on the radio, though from 1986 to 1990 I was also involved in packet radio activities based on TCP/IP. Music is another thing that makes me happy.

Paolo – First real question: how did you meet the functional programming world?

Kenji – I first read a Lisp book in the early 1980s when I was a teenager. I was not that interested in S-expressions though, because I didn’t have an execution environment then. This was even before the C language was available for personal computers; I was playing around with my Apple II, mostly in assembly language, and with two tiny programming languages called GAME and TL/1. I even wrote a GAME compiler for the 6502 running on the Apple II.

Before starting my real career, I was a member of Professor Eiiti Wada’s lab at the University of Tokyo, from 1988 to 1990. Prof. Wada and his lab members created a Lisp implementation called UtiLisp, and the lab was the most advanced place on campus for networking. I also learned some basic ideas about functional and even logic programming, because of the nationwide buzzword called The Fifth Generation Computers. Some of the Wada lab alumni were the key designers and implementers of the language called Guarded Horn Clauses, which has a surprisingly similar design philosophy to Erlang, although it is a logic programming language.

My problem with understanding functional/logic programming was, however, that I couldn’t really grasp the core reasons why those programming paradigms were effective and even required for large-scale system design. I also failed a Prolog course in 1989, because I didn’t find the unification principle meaningful. So I was a very bad student. I wish I could have learned it the Erlang way, through pattern matching!

And unfortunately my mind in the late 1980s was too focused on how to run UUCP and email systems inexpensively without UNIX, so any functional or logic programming paradigms seemed redundant to me, because they were so slow. I didn’t like commuting regularly from my home to the university, so I wanted to find a way of working from home. At that time my main target for code hacking at home was MS-DOS; I had to wait until 1993, when I could use BSD/OS, to experience real UNIX at home. I later moved to FreeBSD in 1997. And I’ve been running Erlang/OTP mostly on FreeBSD since 2008.

Paolo – And when did you first hear about Erlang?

Kenji – I first saw a Japanese translation of Joe Armstrong’s “Programming Erlang”, published by Ohmsha in November 2007, at a bookstore I visited in downtown Tokyo in February 2008, on my way back home from Tokyo to Osaka. I instinctively felt this was the one I had to learn and go for, so I immediately bought it, and I have been discovering the world of Erlang ever since.

Paolo – You told me that you had some bad times during your experiences as a developer and University Professor, but also that Erlang and functional programming helped you to overcome your difficulties. Can you tell our readers something about that?

Kenji – Let me start from my programming middle-age crisis first.

I have concentrated my programming effort on C since 1986. I haven’t really grasped the idea of the strict control of the module name space in Java, nor the template-based extensions made by C++, even now, as I answer this interview in 2013. Of course I can manage to handle script-based languages such as awk, Python (which is quite good), Ruby, or even JavaScript. I know programmers can no longer choose their languages, because every system has already chosen the language that suits it best. But that doesn’t mean you can just improvise all the code; you need a deep knowledge base in at least a few languages.

I was looking for something completely new and innovative to learn as a programming system, after I realized that working only in C was no longer sufficient to keep myself up to date as a modern programmer. I was sick and tired of understanding and modifying the BIND 9 DNS server code, written mostly in C, for a DNS research paper I was writing then. I don’t blame the BIND 9 programmers, because it does really complex magic things, and I admire the ISC people, especially Paul Vixie, one of my mentors at Digital Equipment and the father of BIND. Nevertheless, having to read hundreds of header macro lines to reach the actual code no longer looked practical to me at that time. And I thought I would lose my competitiveness as a programmer if I stuck to the old way of C programming. So eventually I became a polyglot programmer; I use C, awk, Python, Perl, and Erlang.

I knew multi-core or massively parallel computing hardware was coming, and I wanted to learn something very different from the past sequential and inherently procedural programming languages and systems. While Erlang is *not* specifically designed for a massively parallel execution environment, both the language and the OTP system embed a lot of practical constraints suited to modern computing hardware, for example the single-assignment variable principle in the language and the gen_server behaviour [sic] framework in OTP, which encourage programmers to do the least wrong things. This is something which other languages cannot emulate or mimic.

Next, about the University Professor life crisis.

During my Kyoto University career, most of what I did there was talking, negotiating, and dealing with people, not with computers. The university is a very large organization, and keeping the campus network secure is practically impossible without the help of the university members, namely the administrative, education, and research staff, and of course the students. I am an introverted person, and most of the university people are not geeks, although many are excellent researchers, so the human communication tasks were the toughest thing I have done in my life. Also the long commute from my house to the office, four hours in total every day, literally killed me.

Fortunately, however, I was allowed to do CS research during my Kyoto University career. And I was eligible to run large batch jobs on a large Linux supercomputer cluster. So I decided to run some Erlang code and do the fun things over there. One good thing about Erlang is that it is mostly OS independent, so I did the prototyping on my home FreeBSD machines, and let the huge multi-core jobs run on the cluster. I’ve put the research results on GitHub. So I didn’t have to throw away the possibility of my career as a CS researcher 🙂

Paolo – You are widely respected not only for your knowledge of Erlang/OTP but also for your expertise in distributed system security. What is the intersection between these two fields?

Kenji – Erlang/OTP is a very good candidate for making a reliable system. This means it would be a prospective candidate even for a secure system, if properly designed. In other words, an unreliable system could *never* be secure. And no system is 100% reliable.

The word “security” has a lot of implications in many different aspects, and is widely misused in many contexts, even if I exclude the militaristic and socialistic implications, which may be out of scope of this interview, though very serious issues themselves indeed.

I believe that the foundation of a secure system is a reliable and fault-tolerant system. This has been frequently ignored even by many “security” experts; for many of them, security is only about cryptography, or about restricting the user’s behavior in a system, or just about analyzing the behavior of pieces of malware. I do not deny those aspects; they are very important, and the outcomes of those research activities are surely essential for making a better computer system, but those aspects alone are not *the* security. A very broad perspective is needed for a computer security expert.

Also, I have to stress that security is mostly about people and how people behave. People want convenient systems; in many circumstances, security and convenience do not coexist. For example, if you really want a secure system, do not connect it to the internet. But such a special system, which could provide sufficient communication capability within the system while rejecting all attacks and leaving zero attack vectors, is virtually impossible to make from the financial point of view. See the Stuxnet case? Consider: what if the power plant were using Erlang/OTP as the core and the end-point controllers?

I wish Erlang/OTP developers would always think about making reliable software. It’s not that difficult; thinking carefully when programming will solve most of the cases.

Paolo – What is the best way to “secure” a distributed Erlang system?

Kenji – Traditionally, putting the whole system in a protected network has been the only solution. And unfortunately this is still the case.

This is a very good question to answer, because in the current Distributed Erlang (disterl) system on OTP, the security model is very weak, if it exists at all. TLS-based disterl (with the ssl and crypto modules) will be a good solution to protect the communication between BEAMs, but the problem is that the communication between BEAMs and the port-mapper daemons is in plain text, and it’s not trivial to incorporate the necessary authentication and cryptographic features.

Erlang/OTP has been depending on the assumption that the whole disterl cluster is in a protected network without any attack vectors. In other words, the disterl cluster itself was considered a system without protection. Opening the communication ports to the internet, however, makes this assumption rather unrealistic; the Erlang/OTP devops must think about all the possible attack vectors for the disterl cluster as a whole system.

One possibility for protecting BEAM-to-BEAM communication is to establish cryptographically authenticated links between the BEAMs and let the links be used persistently, with proper periodic re-keying, without using any port-mapper daemon. I believe incorporating such a facility into Erlang is not that difficult, though the rendez-vous problem between the multiple BEAMs should be solved in another way.
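
To make the idea of an authenticated link with periodic re-keying a little more concrete, here is a minimal sketch in Python using only the standard hmac and hashlib modules. It is an illustration under loud assumptions, not an existing Erlang/OTP facility: it assumes the two endpoints already share a secret, it derives a fresh key per “epoch” (the re-keying step), and it only authenticates messages without encrypting them. The function names derive_key, seal, and verify are hypothetical.

```python
import hashlib
import hmac


def derive_key(shared_secret: bytes, epoch: int) -> bytes:
    """Derive a per-epoch key; moving to a new epoch is the re-keying step."""
    label = b"rekey:" + str(epoch).encode("ascii")
    return hmac.new(shared_secret, label, hashlib.sha256).digest()


def seal(shared_secret: bytes, epoch: int, seq: int, payload: bytes) -> bytes:
    """Compute an authentication tag over the epoch, sequence number, and payload."""
    header = f"{epoch}:{seq}:".encode("ascii")
    key = derive_key(shared_secret, epoch)
    return hmac.new(key, header + payload, hashlib.sha256).digest()


def verify(shared_secret: bytes, epoch: int, seq: int, payload: bytes, tag: bytes) -> bool:
    """Check a tag in constant time; False means the message must be rejected."""
    expected = seal(shared_secret, epoch, seq, payload)
    return hmac.compare_digest(expected, tag)


# Usage: the sender tags a message, the receiver verifies it.
secret = b"assumed pre-shared secret"  # assumption: distributed out of band
tag = seal(secret, epoch=1, seq=0, payload=b"hello from node a")
accepted = verify(secret, 1, 0, b"hello from node a", tag)
tampered = verify(secret, 1, 0, b"hello from node b", tag)
stale_epoch = verify(secret, 2, 0, b"hello from node a", tag)
```

Here a tag made in epoch 1 no longer verifies once the receiver has moved to epoch 2, which is the effect periodic re-keying is meant to have. A real link would also need encryption, key agreement, and replay protection, none of which this sketch attempts.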

Paolo – During your time as a Professor at Kyoto University, you also did research using Erlang and OTP. You worked in particular on SFMT and TinyMT. Would you like to introduce these two projects to our readers?

Kenji – Mersenne Twister (MT), a BSD-licensed innovative long-period (typically 2^19937 – 1) non-cryptographic pseudo random number generator (PRNG) by Profs. Makoto Matsumoto and Takuji Nishimura, has become the de facto standard in popular programming languages such as Python and R. SIMD-oriented Fast MT (SFMT) and TinyMT are the improved algorithms, by Profs. Makoto Matsumoto and Mutsuo Saito. The MT algorithms all have a very high order of equidistribution, which suits large-scale simulation, including software testing, very well.

SFMT is an improved version of the original MT, which is even faster than MT, and has tunable characteristics for the generation period and the sequence generation. TinyMT is another variant of MT, which has a much shorter generation period (2^127 – 1) and a smaller memory footprint, but is still suitable for most simulation use. The algorithm of TinyMT is much more compact than SFMT or MT, and can generate a massive number (~ 2^56) of independent orthogonal number sequences, which is suitable for massively parallel asynchronous PRNG.

For the further details on SFMT and TinyMT, please take a look at: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html

I am not a mathematician, so I cannot mathematically prove how MT and its derivatives are better than other algorithms. But I have to emphasize very much that Erlang/OTP’s random module is still using an archaic algorithm invented in the 1980s which has a significantly shorter generation period (~ 2^43), and that has already become an indirect source of a security vulnerability once (CVE-2011-0766, discovered by Geoff Cant). SFMT and TinyMT have much better characteristics than the random module, and I strongly suggest you try them out if you really need a better non-cryptographic PRNG.
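
To give a feel for what these generation periods mean in practice, here is a small back-of-the-envelope calculation in Python. The periods are the figures quoted above; the rate of one billion draws per second is purely an assumption made for illustration, and the helper names are hypothetical.

```python
# Generation periods as quoted in the interview.
PERIODS = {
    "random module (approx.)": 2**43,
    "TinyMT": 2**127 - 1,
    "MT": 2**19937 - 1,
}

ASSUMED_DRAWS_PER_SECOND = 10**9  # assumption: one billion numbers per second
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_exhaust(period: int, rate: int = ASSUMED_DRAWS_PER_SECOND) -> int:
    """Whole seconds needed to consume one full period at the given rate."""
    return period // rate


def years_to_exhaust(period: int, rate: int = ASSUMED_DRAWS_PER_SECOND) -> int:
    """Whole years needed to consume one full period at the given rate."""
    return seconds_to_exhaust(period, rate) // SECONDS_PER_YEAR


short_period_seconds = seconds_to_exhaust(PERIODS["random module (approx.)"])
tinymt_years = years_to_exhaust(PERIODS["TinyMT"])
mt_year_bits = years_to_exhaust(PERIODS["MT"]).bit_length()
```

Under that assumed rate, a period of about 2^43 is used up in 8796 seconds, roughly two and a half hours, while the TinyMT period would take more than 10^21 years. Integer division is used throughout because the MT period is far too large to fit in a floating-point number.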

The sfmt-erlang repository is: https://github.com/jj1bdx/sfmt-erlang/
The tinymt-erlang repository is: https://github.com/jj1bdx/tinymt-erlang/

Recently I have published 256M (= 2^28) precomputed keys of TinyMT 32-bit and 64-bit generation parameters. This archive is huge (~82GB), but if you would like to use TinyMT for a serious simulation, it is worth taking a look at. The archive is at: https://github.com/jj1bdx/tinymtdc-longbatch/

Paolo – Currently you are working at Basho Japan. Can I ask you what it is like to work in one of the most acknowledged Erlang companies? How much Erlang code do you see in your daily working routine?

Kenji – Basho developers are all superb and put a lot of energy into making Riak and the recently open-sourced Riak CS even better products. Working with such talented engineers and keeping up with them is very, very tough, but if you are capable of pointing out bugs and proposing contributions to Basho’s open-sourced projects which have been proven to work correctly, you will surely be welcome.

I would also like to emphasize that Basho is not just an Erlang company. You need to know all kinds of programming languages and computer science elements, from C, C++, Java, Python, and Ruby to the gory details of distributed databases, including how vector clocks work and commutative/conflict-free replicated data types (CRDTs). Riak, Riak CS, and rebar include a lot of their by-products. See the deps/ directory under Riak and you will be astonished. On the other hand, there might be many ways to contribute your skills.

I would also like to emphasize that Basho’s client service engineers, sales and marketing people including the documentation experts, and all the other staff members, are closely working together with the developers and maintain the high standard of delivering the quality service and products.

I can only answer that the amount of Erlang code I have to see is *enormous*. 🙂

Paolo – I am very interested in Erlang and Japan. Is Erlang a niche programming language there as well or is it spreading fast as in the US and northern Europe?

Kenji – I would rather ask the question back: is Erlang a popular language anywhere in the world? I think the answer is probably no, compared to the popularity of Java or C++. Looking at the TIOBE index will prove this. And I’d rather say nobody cares about that, because whether a language is spreading fast or not has already become irrelevant, compared to the jobs or tasks that you want to get done with the language.

I do understand Erlang has gained a larger momentum in Sweden, where the language is from. And I see many people solving problems with Erlang in Europe. And in the USA and Canada (hi Fred!). And in Japan too, especially for server-side programming solutions. So I feel the developers in Japan are slowly but surely showing more interest.

Getting back to the situation in Japan: I think not many people are interested in any new programming paradigm, except for a relatively small number of communities. Fortunately, those communities do exist. And some visionaries have discovered languages such as Haskell, OCaml, or Erlang to solve *their* problems and to help others solve theirs. But for the majority of programmers, most of the details are “not really something to be carefully taken care of, and to be blindly delegated to the experts”, which is called the *omakase* attitude in Japan. So most programmers just do the omakase to Rails, or to Java libraries, or to pre-built C++ libraries. And that irresponsible attitude towards their profession, though the responsibility is not solely theirs, causes a lot of sometimes lethal or disastrous bugs in production systems. Unfortunately, many programmers in Japan are not well educated as software engineers, and their supervisors are sometimes even worse. Their mindset of dumping the risks (or *doing the marunage*) for every difficult problem makes things even worse.

I think programming is not something for omakase and the quality of code will not be sufficient so long as major users of computers are doing the marunage to the developers in Japan. And I believe Erlang/OTP is not for the people who are not willing to take the risk of their own computer systems. On the other hand, for those who want to maintain the system by themselves or at least to eagerly, deliberately, and willingly take the responsibility of running the system without major outage, Erlang/OTP will become a great tool because the system provides the critical and essential functions such as non-stopping module replacement.

Paolo – Like many other Erlang gurus out there, you are very active not only when it comes to promoting new Erlang applications but also when Erlang newbies ask for support or suggestions. In your opinion, what are the factors that make the Erlang community so nice?

Kenji – I was very much impressed by the friendly environment of the erlang-questions mailing list and the modest attitude of the experienced people driving the Erlang community there, when I first asked some questions. I just read and read and read all the things in the Erlang-related mailing lists as much as I could. Erlang Workshop papers were also an excellent source of information. And now GitHub is full of good code, including the OTP itself. So we’ve got many, many more things ready to learn now for free!

I’ve heard that one of the old sayings in the Erlang community is “no prima donna allowed”. This is so important for maintaining a community. I understand everybody wants to get grumpy sometimes, and quite often flame wars occur, but many people just endure and keep silent. I respect this rather European or even Swedish way of getting rid of chaos 🙂

Paolo – I think that the Erlang community is growing fast: many applications, conferences and new books. Still, most developers out there don’t know that behind many of the tools they use every day there is a piece of Erlang. How would you explain that?

Kenji – I think this is in fact a very good thing. People want to solve their own problems with whatever tools they have to use, or think suitable to use. Erlang has flexible package release tools which minimize the need for users of a package to think about the installation of Erlang/OTP itself. In many popular applications, the Erlang virtual machine and the necessary libraries are silently built in; and most people don’t care whether the software uses Erlang/OTP or not, so long as it works OK. Erlang/OTP has become a part of the infrastructural ecosystem.

Of course, there is a strong negative side to this trend, too; developers are doing the marunage with the omakase attitude to the developers of those infrastructural tools, with no knowledge about the tools. I have tried not to fall into this trap by building all the user-land programs, kernels, and Ports programs of my FreeBSD development servers myself, at least for the past ten years. You have to think about the bugs if you have to build your own tools; this is a very good way to learn new things. You need to force yourself to do so frequently.

Paolo – OK, Kenji. Many thanks for the interview!

Kenji – You’re welcome!

An interview with Steve Vinoski (@stevevinoski)

May 14, 2013

Today you can read my interview with Steve Vinoski, a famous Erlang developer/speaker and distributed systems expert. Steve will give the talk “Addressing Network Congestion in Riak Clusters” at Erlang User Conference 2013.

Some questions, some answers

Paolo – Hi Steve! It’s really good to have one of the most famous Erlangers here on my blog. Would you mind introducing yourself to our readers in a few words?

Steve – I’m Steve Vinoski, a member of the architecture group at Basho Technologies, the makers of Riak and RiakCS. I have a background in middleware and distributed systems, and have been an Erlang user since 2006.

Paolo – I know you are an expert in several programming languages. How did you end up using Erlang? Did you have any previous experience with functional languages?

Steve – As far as functional languages go, I’ve played with them on and off for decades, but never used one in production until I found Erlang.

I worked in middleware from 1991 to 2007, and in 2004 at IONA Technologies I started looking into innovative ways of expanding our product line and reducing the cost of product development. IONA’s products were written in C++, which I’ve used since 1988 and so I am well aware of its complexity, and Java, which frankly I’ve never liked (I like the JVM but don’t like the Java language). Neither language lends itself to rapid development or easy maintenance. I built a prototype that layered Ruby over one of our C++ products that allowed for an order of magnitude decrease in the number of lines of code required to write client applications, and built another prototype that provided a JavaScript layer for writing server applications, but customers didn’t seem interested, and both approaches only increased development and maintenance costs.

Then I found Erlang/OTP. I grew more and more intrigued as I discovered that it already provided numerous features that we had spent years developing and maintaining in our middleware systems, things like internode messaging, node monitoring, naming and discovery, portability across multiple network stacks, logging, tracing, etc. Not only did it provide all the features we needed, but its features were much more powerful and elegant. I put together a proposal for the IONA executive team that suggested we rebuild all of our product servers in Erlang so we could reduce maintenance costs, but the proposal was rejected because, as I later learned, they were trying to sell the company so it didn’t make sense to make such large changes to the code. I left IONA and joined Verivue, where we built video delivery hardware, and there I trained seven or eight other developers in Erlang and we used it to great advantage. After Verivue, I wanted to continue working with Erlang, which is part of the reason I joined Basho.

Paolo – In your blog you state that Erlang is your favourite programming language. Why?

Steve – To me Erlang/OTP is the type of system my middleware colleagues and I spent years trying to create. It’s got so many things a distributed systems developer wants: easy access to networking, libraries and utilities to make interacting with distributed nodes straightforward, wonderful concurrency support, all the upgrading and reliability capabilities, and the Erlang language itself is sort of a “distributed systems DSL” where its elegance and small size make it easy to learn and easy to use to quickly become productive building distributed applications. And as if that’s not enough, the Erlang community is great, pleasantly supporting each other and newcomers while avoiding the pointless arguments and rivalries you find in other communities. My use of other programming languages has actually decreased in recent years due primarily to my continued satisfaction with Erlang/OTP. It’s not great for every problem, but it’s fantastic for the types of problems I tend to work on.

Paolo – I know that in a previous job you had to deal with multimedia systems, a field where Erlang still has a minor impact compared to languages like C++. Do you think Erlang will be able to find its place in this field as well? Can you give reasons for your answer?

Steve – Erlang/OTP is excellent for server systems in general, including multimedia servers. The Verivue system I worked on a few years ago had special TCP offload hardware for video delivery, so we didn’t need Erlang for that. Rather, we used Erlang for the control plane, which for example handled incoming client requests, looked up subscriber details in databases, and interacted with the hardware to set up multimedia data flows. Multimedia systems also have to integrate with billing systems, monitoring systems, and hardware from other vendors, and Erlang shines there as well, especially when it comes to finding bugs in the other systems and hot-loading code to compensate for those bugs. Customers tend to love you when you can quickly turn around fixes like that.

Another Erlang developer, Max Lapshin, built and supports erlyvideo, which seems to work well. I’ve never met Max but I know he faced some challenges along the way, as we did at Verivue, but I think he’s generally happy with how erlyvideo has turned out.

Paolo – Currently you are working at Basho, a very important company in the Erlang world. Do you mind telling our readers something more about your job?

Steve – At Basho I work in CTO Justin Sheehy’s architecture group. It’s a broad role with a lot of freedom to speak at and attend conferences and meetups, and I also work on research projects and pick up development tasks and projects from our Engineering team and Professional Services team when they need my help.

Paolo – At Erlang User Conference 2013 you will give a talk about Riak, its behaviour under extreme loads and the issues we may face when we want to scale it. Can you tell us something more about the topic?

Steve – At Basho we’re fortunate to have customers who continually push the boundaries of Riak’s comfort zone. Network traffic in Riak all goes over TCP — client requests, intracluster messages, and distributed Erlang communication. When clusters are extremely busy with client requests and transfer of data and messages between nodes, under certain conditions network throughput can drop significantly and messages can be lost, including messages intended for client applications. I am currently investigating the use of alternative network protocols to see if they can help prioritize different kinds of network traffic. This work is not yet finished, so my talk will give an overview of the problems along with the current status of the solution I’m investigating.

Paolo – I heard that during the talk you will also introduce a new Erlang network driver that should tackle some of these issues. Is this correct? Can you give us an insight?

Steve – Yes, I have been working on a new network driver. It implements an alternative UDP-based protocol for data transfer that can utilize full bandwidth when available but can also watch for congestion on network switches and quickly back off when detected. It also yields to TCP traffic under congestion conditions, preventing background data transfer tasks from shutting out more important messages like client requests and responses.

Paolo – Who should be interested in this talk? What are the minimum prerequisites for fully understanding the topics of the talk?

Steve – Attendees should have a high-level understanding of Erlang’s architecture, what drivers are, and how they fit into the system. Other than that, my talk will explain in detail the problems I’m trying to address as well as the solution I’ve been investigating, so neither deep networking expertise nor deep understanding of Erlang internals is required.

Paolo – I can say without a doubt that you are an expert in middleware and distributed computing systems. Can you suggest some books or internet resources to our readers interested in those topics?

Steve – The nice thing about distributed systems is that they never seem to get any easier, so there has been interesting research and development in this area for decades. The downside of that is that there are an enormous number of papers I could point to. In no particular order, here are some interesting papers and articles, most of which are currently sitting open in my browser tabs:

“Eventual Consistency Today: Limitations, Extensions, and Beyond”, Peter Bailis, Ali Ghodsi. This article provides an excellent description of eventual consistency and recent work on eventually consistent systems.

“A comprehensive study of Convergent and Commutative Replicated Data Types”, M. Shapiro, N. Preguiça, C. Baquero, M. Zawirski. This paper explores and details data types that work well for applications built on eventually consistent systems.

“Notes on Distributed Systems for Young Bloods”, J. Hodges. This excellent blog post succinctly summarizes the past few decades of distributed systems research and discoveries, and also explains some implementation concerns we’ve learned along the way to keep in mind when building distributed applications.

“Impossibility of Distributed Consensus with One Faulty Process”, M. Fischer, N. Lynch, M. Paterson. This paper is nearly 30 years old but is critical to understanding fundamental properties of distributed systems.

“Dynamo: Amazon’s Highly Available Key-value Store”, G. DeCandia, et al. A classic paper detailing trade-offs for high availability distributed systems.

Paolo – Erlang becomes more popular day by day. In your opinion what can we expect from Erlang in the future? What are the next goals the Erlang community should try to reach?

Steve – Under the guidance of Ericsson’s OTP team and with valuable input from the open source community, Erlang/OTP continues to evolve gracefully to address production systems. I expect Erlang will continue to improve as a language and platform for building large-scale systems that perform well and are relatively easy to understand, reason about, and maintain without requiring an army of developers. In particular I’m looking forward to the OTP team’s continued work on optimizing multicore Erlang process scheduling. The Erlang community is very good at proving how good Erlang/OTP is through the results of the systems they build, so they need to keep doing that to broaden Erlang’s appeal. If you’re a developer building practical open source or commercial software, the presentations given by community members at events like the Erlang User Conference and the Erlang Factory conferences are amazing sources of knowledge and wisdom for what works well for Erlang/OTP applications and what can be problematic.

How to log your stuff with Lager

August 11, 2012

When you start coding something more complex in Erlang, eventually you will need some sort of logging framework for your Erlang/OTP applications.
There are many solutions available out there for you, but today I would like to write a post about Lager. Who knows, maybe in the future I will write something about the others as well.

Lager is tasty!

Lager (like the beer) is a logging framework introduced by Basho, a company that you should already know since they are the authors of some killer applications such as rebar and riak. If you want to know more about Basho, check out their web site or my interview with Dave Smith.

Why should you pick Lager for your project? In my humble opinion, Lager is easy to integrate into your project, easy to use and (last but not least) produces log files that your sysadmins will love.

I have to be honest here: I have always been “scared” by the multi-line messages (for example from SASL) you receive when something goes wrong in your code. Actually, I strongly believe most Erlang developers reading this post felt the same when they started with Erlang.

Yes, it is true that with SASL you have much more information about what went wrong and why, but sometimes (especially if you are a beginner) too much information can confuse you. Lager helps a lot in this sense. It gives you a friendlier one-line message, but if you need more information you can still get it, since Lager stores the details in a crash log which can be inspected later if needed.

What? You say this is still not enough? Well, I can list some more cool features!

  1. Lager lets you choose among different log levels (debug, info, notice, warning, error, critical, alert, emergency). I have also seen this in ejabberd’s custom logger, and I find it very useful.
  2. Lager supports multiple backends (e.g. console and file). What does this mean in practice? You just need to know that Lager is a gen_event with multiple handlers installed. Basho is planning to add more, but currently it provides a console handler and a file handler. The interesting thing about file logging is that you can have different files for different log levels, so for example you can have a log of only errors and above. Notice also that log levels may be changed at runtime.
  3. With Lager you do not need to add a newline (~n) at the end of the message. I’m lazy…so I love this small feature!
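
To make the points about backends and per-file levels more concrete, here is a minimal configuration sketch. It assumes the property-list handler syntax documented in recent Lager READMEs (releases from around 2012 used a different tuple format), and the file name lager.config is only an illustrative choice, so check the README of the version you actually fetched before copying it.

```erlang
%% lager.config (hypothetical file name), loaded with: erl -config lager ...
%% Assumes the property-list handler syntax of recent Lager releases.
[
 {lager, [
   {handlers, [
     %% Print info and above on the console.
     {lager_console_backend, [{level, info}]},
     %% One file that only receives error and above...
     {lager_file_backend, [{file, "error.log"}, {level, error}]},
     %% ...and a second file that receives info and above.
     {lager_file_backend, [{file, "console.log"}, {level, info}]}
   ]}
 ]}
].
```

With a configuration like this, a level can be raised or lowered at runtime from the Erlang shell, for example with lager:set_loglevel(lager_console_backend, debug) for the console, or lager:set_loglevel(lager_file_backend, "console.log", debug) for a single file.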

If you liked this introduction to Lager, you may want to add it to your project and try some of the API it provides! Let’s see how to do it.

I assume you build your projects using rebar, therefore the first thing to do is to add Lager as a dependency in your rebar.config file, which in the end should look like this:

{sub_dirs, []}.

{deps, [
        {lager, ".*", {git, "git://github.com/basho/lager.git", "master"}}
       ]}.

{erl_opts, [{parse_transform, lager_transform}]}.

For this tutorial I will show you how to build a simple application (myapp) which consists of just a supervisor and a gen_server (I assume you know how to start the top supervisor from the application and the gen_server from the top supervisor).

Since this is not a tutorial on rebar, I will just use a common way to compile and start an application for development purposes. If you want something that fits production, take a look at this tutorial.

Ok, this is my directory structure:

paolo@paolo-laptop:~/blog/lager-article$ ls
deps ebin rebar rebar.config src

Let’s see the contents of src:

paolo@paolo-laptop:~/blog/lager-article$ ls src
myapp.app.src myapp.erl myapp_server.erl myapp_sup.erl

Where myapp.app.src is:

paolo@paolo-laptop:~/blog/lager-article$ less src/myapp.app.src

{application, myapp, [
        {description, "experimenting with lager."},
        {vsn, "0.1.0"},
        {modules, [myapp_sup, myapp_server]},
        {registered, [myapp_sup, myapp_server]},
        {applications, [
                kernel,
                stdlib,
                compiler,
                syntax_tools
        ]},
        {env, []},
        {mod, {myapp, []}}
]}.

Cool, everything is in place! After fetching and compiling the dependencies with ./rebar get-deps compile, we can run the application using:

paolo@paolo-laptop:~/blog/lager-article$ erl -sname mynode -pa deps/lager/ebin -pa ebin -eval "application:start(compiler), application:start(syntax_tools), application:start(lager), application:start(myapp)."

Erlang R15B (erts-5.9)  [64-bit] [smp:2:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9  (abort with ^G)
(mynode@paolo-laptop)1> 20:24:04.070 [info] Application lager started on node mynode@paolo-laptop
20:24:04.072 [info] Application myapp started on node mynode@paolo-laptop

Lager starts, awesome! Now you can experiment more by calling some of these functions in your gen_server (for example in init/1):

     lager:info("~s is ~s!", [lager, cool]),
     lager:warning("but pay ~s!", [attention]),
     lager:error("there is always some ~s", [error]),

You should now have a nice directory called “log”. Open it and inspect the files!
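
If you are unsure where those calls go, a minimal myapp_server could look like the sketch below. The original module is not shown in this post, so everything apart from the three lager calls is an assumption: the registered name, the empty-list state and the do-nothing callbacks are placeholders. The lager:info/2 style calls rely on the lager_transform parse transform enabled in rebar.config.

```erlang
%% Hypothetical myapp_server: a do-nothing gen_server that logs from init/1.
-module(myapp_server).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Started by myapp_sup; registered locally under the module name.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    %% No trailing ~n needed: Lager terminates each message itself.
    lager:info("~s is ~s!", [lager, cool]),
    lager:warning("but pay ~s!", [attention]),
    lager:error("there is always some ~s", [error]),
    {ok, []}.

%% The remaining callbacks only keep the behaviour complete.
handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

When the application starts, the three messages should show up on the console and in the files of the handlers whose level admits them.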

There is much more I would love to say about Lager, but I think this post is growing too long, so I suggest you read the readme file you can find with Lager on github and this post by Andrew Thompson.

Categories: Erlang

An interview with Dave Smith (@dizzyd)

June 24, 2011

Hi all! Many of you liked the interviews I did during the Erlang Factory, so I decided to interview a couple more famous erlangers. I hope these interviews will give Erlang newbies some insight into the world out there. Today’s interview is with Dave Smith (a.k.a. dizzyd), Director of Engineering at Basho Technologies, Inc. In my humble opinion his answers are far from trivial, so I strongly suggest you read our conversation.

Ask and Answer

Paolo – Hi Dave, thanks for making yourself available for this interview. Please, would you like to introduce yourself to our readers?
Dave – My name is Dave Smith, but a lot of people know me as “Dizzy” — there are a lot of “Dave Smith”s in the world. 🙂 I’ve had the opportunity to contribute to a couple of different Open Source projects over the years, including Jabber and more recently things like rebar, riak, bitcask, etc.

Paolo – Do you remember when you first heard of Erlang?
Dave – I was working on the first commercial Jabber server and had just completed writing a library (in C++!) to manage async sockets on a threadpool. The intent of the library was to make it easy for people to write scalable Jabber components. I read about Erlang and wondered how they had solved that same problem…only to find their implementation was far, far ahead of mine. I thought the syntax was weird and the string handling (all lists, in those days) was poorly suited for Jabber — nonetheless I liked the ideas.

Paolo – Can you describe to our readers your first experiences with this language?
Dave – I think it was a pretty typical experience pre-good books on Erlang. There was lots of fumbling with seemingly random syntax and scratching my head over the point of the OTP stuff. However, even with this pain, I was able to rewrite a business specific system over the course of a long weekend and the end result was a more stable, scalable system with a 90% reduction in lines of code. All very typical for Erlang. 🙂

Paolo – In your past work experiences, you used C++. Was it difficult to switch to Erlang? What features of C++ (if any) would you like to see in Erlang?
Dave – I can’t say there’s anything in C++ that I miss in Erlang. If anything, Erlang has renewed my interest in simpler syntax — I much prefer C over C++ whenever possible now.

Paolo – You are currently Director of Engineering at Basho Technologies, Inc. How does it feel to work in one of the most famous companies using Erlang?
Dave – It’s an honor and privilege to work at Basho. We have a rare freedom to pursue building a product with the best-of-breed language for distributed systems and we do so in a decentralized working environment. More importantly, we’ve managed to assemble a roster of world-class engineers who constantly push each other to improve. I like coming into work and knowing that I’m going to have to work hard to keep up with my co-workers. 🙂

Paolo – Basho is well known among erlangers for riak and rebar. Can you tell us something more about these two products?
Dave – Riak is the product for which Basho is best known. It’s an eventually-consistent, distributed data store that’s designed to scale in a manageable way. Writing this system in Erlang has enabled us to focus on the distributed algorithms and not worry so much about the details of socket, thread management, etc. The expressiveness of Erlang has also enabled us to be a more efficient engineering team and deliver the high quality code one expects from a data store. Rebar is an Erlang build tool that makes it easy to compile and test Erlang code. It uses standard Erlang conventions for choosing what files to compile and makes it very, very easy to create new projects that other Erlang devs will understand.

Paolo – In your opinion, what is the biggest benefit a company can get from using riak?
Dave – Low, predictable latency data access and operational ease-of-use. When used appropriately (i.e. don’t expect Riak to be a RDBMS), Riak is able to provide an excellent latency profile, even in the case of multiple node failures. In addition, it’s easy to add/remove nodes while the system is running without a lot of operational juggling. Another useful benefit is that multiple nodes can take writes for the same key, thus making it easier to construct a geographically distributed data store.

Paolo – Do you think that you could have reached the same level of product “quality” using another programming language?
Dave – Quality is determined less by the language and more by the values of the team writing the code. The process model in Erlang makes it easier to construct systems that tolerate partial failure; the expressiveness makes it easier to write less code (and thus less bugs) for a given application. However, just because these things are “easier” doesn’t mean you’re going to get a high-quality system. 🙂 Ultimately, quality requires engineering discipline to take advantage of Erlang’s strengths and wield those powers appropriately.

Paolo – Many Erlang programmers I know are students at university. Does Basho provide opportunity for internships?
Dave – At this time, we are not providing internships, unfortunately.

Paolo – If you were a newcomer in the Erlang world, where would you focus your attention?
Dave – Erlang is a pretty wide-open language in terms of stuff left to do. Pick something that’s fun and challenging and go for it!