When I started working on this task, I regarded Erlang as a telecom platform which happens to be suitable for the development of push servers. In the process, I have learned that it is a general purpose development platform, not limited to telecoms in any way. Now I think Erlang is probably the best development platform I have had the pleasure of working with, and I consider it the optimal choice for the development of server-side systems which should handle many dynamic requests.
In this article, I will try to give a basic introduction to Erlang from the perspective of a seasoned OO developer. You might find the topic especially interesting if you have been working mostly with mainstream platforms, such as Java, .NET, Ruby, Python, etc. By the end of the article, you should have some idea of what Erlang is, which problems it solves, how it solves them, what its pros and cons are, and which similar technologies exist.
In the 80s and early 90s, most applications were single user desktop programs, and properties such as scalability, availability and fault tolerance mattered only in a few domains (such as telecom systems). Today, however, we face a very different situation, with many Internet based systems serving a large number of clients, often pushing fast changing data to them, and sometimes even allowing them to communicate with each other using servers as proxies. On the server side, these properties are now a necessity.
Erlang was built to address exactly these types of requirements, so it comes as no surprise that its popularity has been increasing recently. Today, it powers various large systems such as the Facebook chat backend, parts of the Heroku application cloud and even NoSQL databases such as Amazon SimpleDB, CouchDB or Riak. In addition, there are many anecdotal reports on the Internet of Erlang based systems which serve a couple of thousand requests per second to thousands of concurrent users without any problems.
Erlang processes are completely independent: they share no memory and communicate via asynchronous messages. This makes it much easier to reason about thousands of processes in Erlang than to synchronize two threads in a classic shared memory, locking based multithreaded system. The development of concurrent systems is therefore simplified, and it becomes easier to divide your application into many concurrent units, which the VM can distribute over available CPU resources and execute in parallel as much as possible. In this way, your system obtains vertical scalability: when the load increases, you can scale up by adding more CPU resources, without the need to modify the code.
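To make the idea of independent processes and asynchronous messages more concrete, here is a minimal sketch. The module name ping_demo and the message shapes {ping, Value} and {pong, Value} are made up for this illustration; only spawn, the ! send operator and receive are standard Erlang.

```erlang
-module(ping_demo).
-export([run/0]).

%% Spawn a separate process, send it a message, and wait for its reply.
run() ->
    Parent = self(),
    Pid = spawn(fun() -> worker(Parent) end),
    Pid ! {ping, 42},              % asynchronous send, returns immediately
    receive
        {pong, Value} -> Value
    after 1000 ->
        timeout                    % give up if no reply arrives within 1 second
    end.

%% The worker shares nothing with its parent. It only sees what arrives
%% in its own mailbox, handles one message, and then terminates.
worker(Parent) ->
    receive
        {ping, Value} ->
            Parent ! {pong, Value + 1}
    end.
```

Calling ping_demo:run() should return 43. The two processes never touch each other's data; everything they exchange is copied through messages.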
When you want to maintain changing state, you will most often start a new Erlang process, and communicate with it via messages. This is called the Actor model and is one of the biggest advantages of Erlang. The actors are similar to objects: they encapsulate mutable state, and can interact with other actors. The biggest difference is that actors are inherently concurrent. Therefore, when you organize your code into many actors, the application automatically uses all available CPU resources and obtains vertical scalability.
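As a rough sketch of an actor that encapsulates mutable state, consider a simple counter. The module name counter_demo, its functions and the message formats are hypothetical choices for this example, not part of any library.

```erlang
-module(counter_demo).
-export([start/0, increment/1, value/1]).

%% Start the actor. Its state is simply the argument of loop/1.
start() ->
    spawn(fun() -> loop(0) end).

%% Fire-and-forget request: the caller does not wait for completion.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Request/response: send our own pid and wait for the answer.
value(Pid) ->
    Pid ! {value, self()},
    receive
        {counter_value, Pid, N} -> N
    after 1000 ->
        timeout
    end.

%% The "mutable" state changes by recursing with a new argument.
loop(N) ->
    receive
        increment ->
            loop(N + 1);
        {value, From} ->
            From ! {counter_value, self(), N},
            loop(N)
    end.
```

After C = counter_demo:start() and two calls to counter_demo:increment(C), counter_demo:value(C) should return 2, since messages from one process to another arrive in the order they were sent. In real projects you would normally build this on top of OTP's gen_server rather than a hand-written loop.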
Due to its nature, Erlang concurrency provides your applications with additional properties. When a runtime error occurs and an Erlang process crashes, other processes are not affected, which increases the stability of your system. The accompanying framework allows you to create so-called supervisors: special processes which observe other processes and restart them if they crash. In this way, the system can absorb runtime errors and recover from them, minimizing the need for manual intervention by a developer.
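The following sketch shows what a supervisor might look like. The supervisor behaviour and supervisor:start_link/3 come with the standard distribution; the module name demo_sup, the registered name demo_worker and the crash message are assumptions made for this example. To keep everything in one module, the worker is a bare linked process, whereas a real application would typically supervise a gen_server.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, start_worker/0, init/1]).

start_link() ->
    supervisor:start_link({local, demo_sup}, ?MODULE, []).

%% Restart a crashed child on its own (one_for_one). Give up if more
%% than 5 restarts happen within 10 seconds.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => demo_worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Called by the supervisor, so the new process is linked to it.
start_worker() ->
    Pid = spawn_link(fun worker_loop/0),
    register(demo_worker, Pid),
    {ok, Pid}.

%% Hypothetical worker: crashes on demand, ignores everything else.
worker_loop() ->
    receive
        crash -> exit(simulated_failure);
        _Other -> worker_loop()
    end.
```

After demo_sup:start_link(), sending demo_worker ! crash kills the worker, and shortly afterwards whereis(demo_worker) should return a different pid: the supervisor has started a fresh process in place of the crashed one.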
Since the processes share no memory, the garbage collection can occur at the Erlang process level. Instead of long garbage collections, we have many smaller, shorter ones, meaning that the system will be generally more responsive.
Erlang also offers simple, yet powerful primitives for distributed computing. You can communicate between multiple VM instances, called Erlang nodes, running on separate machines, which gives you the possibility to scale out to multiple servers and/or increase stability by implementing some kind of fail-over mechanism. The nice thing here is that the communication between two Erlang processes, or rather the underlying source code, is always the same, regardless of whether the processes reside on the same node or on different ones. This means that your code is ready to be rearranged and redistributed over multiple nodes in any way you like, with minimal modifications, even if you initially didn't give any thought to distribution.
Finally, Erlang offers a way to deploy a new version of your code without having to restart the VM or your application, which further increases the availability of your system.
These are unique features which distinguish Erlang from most, if not all, modern development platforms. Couple that with the fact that Erlang has been extensively used in large systems for more than two decades, and that it is constantly being improved, and you get a very attractive choice for the development of modern server systems.
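Here is a small sketch of this location transparency. The module and registered name echo_demo and the message format are invented for the example. The only thing that identifies the target machine is the Node argument; the send and receive code is the same in both cases.

```erlang
-module(echo_demo).
-export([start/0, call/2]).

%% Start the echo process on the current node and register it by name.
start() ->
    Pid = spawn(fun loop/0),
    register(echo_demo, Pid),
    Pid.

%% Node may be node() (the local node) or the name of another connected
%% node on which echo_demo:start() has been called.
call(Node, Msg) ->
    Ref = make_ref(),
    {echo_demo, Node} ! {self(), Ref, Msg},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        timeout
    end.

loop() ->
    receive
        {From, Ref, Msg} ->
            From ! {Ref, {echoed, Msg}},
            loop()
    end.
```

On a single node, echo_demo:start() followed by echo_demo:call(node(), hello) should return {echoed, hello}. Passing the name of another connected node instead of node() would send the same message over the network, assuming the module is loaded and started there as well.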
The standard distribution comes with an application framework (OTP) which gives you patterns and abstractions for building standard applications and the means to deploy them. There are also many libraries giving you all kinds of services: different data structure implementations (trees, sets, hashes etc.), network I/O (TCP, SSL, HTTP), an ODBC client, and you even get a powerful NoSQL key-value database.
The community is also very active. You will find many 3rd party libraries, such as web servers and frameworks, and clients for various 3rd party components (e.g. Redis, 0MQ, MongoDB, etc.). Therefore, when using Erlang, you will be able to interact with many mainstream technologies.
Finally, Erlang is a full blown development platform, so in the standard distribution you will find a typical set of tools, including a unit testing framework, a static code analysis checker, and tools for debugging, profiling, tracing, monitoring etc. Especially useful is the possibility to connect to a remote running Erlang system and interact with it.
The Learning Process
In my experience, it takes some time to get used to all of these concepts. At first, Erlang will seem very strange and feel a bit low level. I read a nice statement somewhere that "Erlang makes hard things easy, and easy things hard". This is true in the beginning, but with time, I found that Erlang actually makes it easier to think in abstract patterns, instead of dealing with low level mechanics of the code. I would say that, once you have had some time to adjust, Erlang makes hard things easy, and easy things a bit more complicated than classical OO.
For learning the platform, I can recommend Programming Erlang by Joe Armstrong, the language creator. There is also a very popular online book available: Learn you some Erlang for great good! Both books start from the very basics, and take you to advanced topics such as concurrency and distribution. The current Erlang distribution can be obtained from the official site. Other than that, there are many great resources on the topic, such as official documentation, blogs, groups, forums, GitHub repos etc. Googling will usually take you there.
On a technical level, the only disadvantage that comes to mind is the execution speed. Erlang is a dynamic language, and programs are run inside the VM. When not using concurrency, it will usually be slower than many other languages, especially statically typed ones such as C++, C# or Java. Therefore, it is usually not the most appropriate solution when doing many CPU intensive operations, such as mathematical calculations. However, if your server is performing a lot of concurrent operations which combine I/O (network, files, databases, etc.) with some CPU processing, Erlang's benefits will outweigh its speed problems.
The combination of Scala/Akka takes many of Erlang's features and combines them with an OO approach, so you are supposed to get the "best of both worlds". I read a book on Scala, and personally found the language too complex, but that is just my opinion. Scala is used in many systems (with Twitter being probably the most famous reference), and there is the added benefit of interoperating with Java, so you have access to a wide range of existing libraries.
The best alternative, in my opinion, is Google Go, which is sort of a simplified C/C++ like language with lightweight concurrency similar to Erlang. Go lacks many features that Erlang (and even Scala/Akka) has, and helps you only with the scalability part of your system. To gain other properties, such as fault tolerance and high availability, you have to develop your own solution.
Reactor pattern based approaches, such as Node.js, EventMachine or Twisted, are often used today to implement scalable systems, especially Comet (HTTP push) servers. I find this approach inferior, sort of a Visual Basic 6 of scalable systems. It will take you there quickly, but the underlying code will soon become very complex, degenerating into a strange mixture of callbacks, deferrables, futures/promises and whatnot. Reactors are by their nature single threaded, so you will have to do additional work to make them utilise multiple cores and scale vertically.
There are also other functional language alternatives, such as Haskell or Clojure. I can't really comment on these, since my knowledge on the topic is non-existent, but I've seen many religious wars between the corresponding camps, which indicates that they are probably viable alternatives to Erlang.
At the end of the day, despite its downsides, I would still choose Erlang over anything else when high concurrency, scalability and availability are called for. In situations where some (but not high) concurrency is called for, and a lot of imperative logic is expected, I would probably go with Go :-).