Hi, I am Saša Jurić, a software developer with 10+ years of professional experience in programming of web and desktop applications using Elixir, Erlang, Ruby, JavaScript, C# and C++. I'm also the author of the upcoming Elixir in Action book. In this blog you can read about Erlang and other programming related topics. You can subscribe to the feed, follow me on Twitter or fork me on GitHub.

Immutable programming, FP style

In the previous post I presented an OO-like technique for manipulating a complex hierarchy with immutables. Today we will see how to perform the same task using an (almost) pure functional approach, which is the standard way of writing Elixir code.


From OO to FP

We have already come a long way from typical OO programming by introducing pure immutable structures to manipulate complex data. One OO-ish thing that still remains is polymorphic dispatch, such as company.add_employee(...), which is transformed at runtime into Company.add_employee(..., company).

In a way, we are already almost using a functional approach, but it is hidden behind the polymorphic dispatch. The primary benefit of the OO style is the elegant use of the dot (.) operator, which allows us to chain multiple statements without using intermediate variables:

Company.new(name: "Initech").
  add_employee(Employee.new(name: "Peter Gibbons",  salary: 10000)).
  add_employee(Employee.new(name: "Michael Bolton", salary: 12000))

To mimic this in a pure functional style, we can use the pipeline operator (|>), which feeds the result of the previous function to the next call as the first argument. So the call something |> fun1(a,b) |> fun2(c,d) is transformed at compile time into fun2(fun1(something, a, b), c, d). This makes it easy to chain function calls, much like in the OO approach, but without the need for runtime dispatch.
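As a minimal sketch of this rewriting, the snippet below compares a piped call with its hand-nested equivalent. The module PipelineDemo and the bodies of fun1 and fun2 are made up purely for illustration; only the names fun1 and fun2 come from the text above. It assumes a current Elixir release.

```elixir
defmodule PipelineDemo do
  # Hypothetical functions: the names mirror fun1 and fun2 from the text,
  # but the bodies are invented just to have something to call.
  def fun1(x, a, b), do: x + a * b
  def fun2(x, c, d), do: {x, c, d}
end

# Piped form: each result becomes the first argument of the next call.
piped = 1 |> PipelineDemo.fun1(2, 3) |> PipelineDemo.fun2(:c, :d)

# Nested form, which is what the pipeline is rewritten into.
nested = PipelineDemo.fun2(PipelineDemo.fun1(1, 2, 3), :c, :d)

IO.inspect(piped)            # prints {7, :c, :d}
IO.inspect(piped == nested)  # prints true
```

Both forms call exactly the same functions with the same arguments; the pipeline only changes how the code reads.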

Notice that the pipeline operator feeds the previous result as the first argument, while the dot operator feeds "this" as the last argument. This is the reason for the incompatibility between the OO and functional styles in Elixir, which makes it harder to combine the two approaches or to switch from one to the other. Consequently, you should decide upfront which approach to use. If you wish to adhere to Elixir conventions and best practices, you should almost always opt for the functional approach.

One exception to this rule is the set of built-in record functions, i.e. the accessors/modifiers which are auto-generated when a record is defined. In this particular case, Elixir resorts to OO-like syntax, which significantly simplifies record manipulation but does not combine data with behavior. Polymorphic calls are used only to get and set the fields of a record. It is also worth noting that, if some hints are provided, the Elixir compiler can actually resolve calls to standard record operations at compile time.


Manipulating complex data

Due to similarities between pipeline and OO chaining, it is fairly straightforward to convert OO code to the FP version. Here is the functional equivalent of the "Complex mutations" example from the previous article:

defrecord Employee, id: nil, name: nil, salary: nil

defrecord Company, [
  name: nil, employees: HashDict.new, autoid: 1
] do

  def add_employee(company, employee) do
    company
    |> store_employee(employee.id(company.autoid))
    |> inc_autoid
  end
  
  defp store_employee(company, employee) do
    company.update_employees(HashDict.put(&1, employee.id, employee))
  end
  
  defp inc_autoid(company) do
    company.update_autoid(&1 + 1)
  end

  def get_employee(company, id), do: Dict.get(company.employees, id)
end

c = Company.new(name: "Initech")
    |> Company.add_employee(Employee.new(name: "Peter Gibbons",  salary: 10000))
    |> Company.add_employee(Employee.new(name: "Michael Bolton", salary: 12000))

IO.inspect c
IO.inspect Company.get_employee(c, 1)
IO.inspect Company.get_employee(c, 5)

This is very similar to the OO version presented last time, so I will not discuss all of the details.

One subtle change is the introduction of the function inc_autoid (line 17), which didn't exist in the OO version. Its sole purpose is to wrap the OO-styled call company.update_autoid so that we can use it in the chain (line 10). Unfortunately, standard record operations are currently not compatible with pipeline chaining (there has been some talk on the mailing list about tackling this issue).

Another subtle but important improvement is that the internal functions store_employee and inc_autoid are now private. This is possible because we no longer rely on the runtime OO dispatch mechanism.

The usage of the record has also changed. Whenever we call a Company module function from the outside, we have to explicitly reference the module. In this example, repeated calls are issued to Company.some_fun in lines 24-26 and 29-30. This is the obvious consequence of not using polymorphic dispatch: the code will be a bit polluted with duplicated module references. While I do regard this as a downside (many functional programmers would probably disagree with me), it is not as big a problem as it might initially seem. Usually, a chain of multiple calls to functions from a single module can be moved into that module as a distinct function. Once inside the module, we can omit the module prefix when calling functions, just as is done in lines 9-10.
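To illustrate moving a chain into the module, here is a rough sketch. It is not the record-based code from this post: it assumes a current Elixir release, where structs have replaced records, so the data is modeled with defstruct and a plain map. The functions new/1 and add_employees/2 are hypothetical additions made up for this example.

```elixir
defmodule Employee do
  defstruct id: nil, name: nil, salary: nil
end

defmodule Company do
  defstruct name: nil, employees: %{}, autoid: 1

  # Hypothetical constructor, standing in for the record's Company.new.
  def new(name), do: %Company{name: name}

  def add_employee(company, employee) do
    company
    |> store_employee(%{employee | id: company.autoid})
    |> inc_autoid()
  end

  # Hypothetical addition: the repeated Company.add_employee chain now
  # lives inside the module, so no module prefix is needed here.
  def add_employees(company, employees) do
    Enum.reduce(employees, company, fn employee, acc ->
      add_employee(acc, employee)
    end)
  end

  defp store_employee(company, employee) do
    %{company | employees: Map.put(company.employees, employee.id, employee)}
  end

  defp inc_autoid(company), do: %{company | autoid: company.autoid + 1}
end

c =
  Company.new("Initech")
  |> Company.add_employees([
    %Employee{name: "Peter Gibbons", salary: 10000},
    %Employee{name: "Michael Bolton", salary: 12000}
  ])

IO.inspect(c.autoid)                     # prints 3
IO.inspect(Map.get(c.employees, 1).name) # prints "Peter Gibbons"
```

The caller now references Company twice instead of once per employee, and the chaining details stay hidden behind one function.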

By abandoning OO-styled syntax, we also lose the polymorphic nature of our code: the function which gets invoked is now determined at compile time. When you do need polymorphism, i.e. a run-time decision about which function to call, there are two standard ways of doing it. This topic really deserves its own post, so I will only mention them briefly without a detailed explanation.

The basic Erlang way of doing run-time polymorphism is to use multi-clause functions and pattern matching. Each clause represents a specific implementation which will be invoked depending on the actual values of the arguments. The problem with this technique is that it is not flexible. If a new type must be supported, the definition of the polymorphic function must be extended with an appropriate clause. This means that all variants must reside in one module. Furthermore, the problem becomes harder to solve if for some reason you can't modify the module. In such situations, you must add another layer of indirection (essentially another multi-clause function) in front of it, which is somewhat clumsy.
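A minimal sketch of the multi-clause technique, assuming a current Elixir release. The Shape module and the tagged tuples are hypothetical and exist only for this illustration.

```elixir
defmodule Shape do
  # One clause per supported variant. Pattern matching on the argument
  # selects the clause at run time.
  def area({:circle, radius}), do: 3.14 * radius * radius
  def area({:rectangle, width, height}), do: width * height
  # Supporting a new variant, say {:triangle, base, height}, means
  # adding another clause right here, in this module.
end

IO.inspect(Shape.area({:rectangle, 2, 3}))  # prints 6
IO.inspect(Shape.area({:circle, 1}))        # prints 3.14
```

The inflexibility is visible in the structure itself: every variant has to live inside Shape.area, so code that cannot modify this module cannot add a new shape.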

To overcome this, Elixir introduces protocols, a construct somewhat similar to OO interfaces. A protocol is essentially an abstract module, where functions are declared but not implemented. The generic piece of code uses the protocol, while clients provide implementations for their own types. By implementing the protocol, you get the benefit of reusing the generic code without modifying it. However, compared to OO interfaces, protocols are again based on pure functions and do not rely on coupling data with behavior. The main benefit of protocols is that new implementations can be freely provided and are not required to reside in one place in the code.
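A minimal sketch of a protocol, assuming a current Elixir release with structs. The protocol HasArea, the structs Circle and Rectangle, and the Geometry module are all hypothetical names invented for this illustration.

```elixir
defprotocol HasArea do
  # Declared here, implemented separately for each type.
  def area(shape)
end

defmodule Circle do
  defstruct radius: 0
end

defmodule Rectangle do
  defstruct width: 0, height: 0
end

# Implementations can live anywhere, for example next to each type.
defimpl HasArea, for: Circle do
  def area(circle), do: 3.14 * circle.radius * circle.radius
end

defimpl HasArea, for: Rectangle do
  def area(rect), do: rect.width * rect.height
end

defmodule Geometry do
  # Generic code: it depends only on the protocol, not on concrete types.
  def total_area(shapes) do
    shapes |> Enum.map(&HasArea.area/1) |> Enum.sum()
  end
end

shapes = [%Rectangle{width: 2, height: 3}, %Rectangle{width: 1, height: 1}]
IO.inspect(Geometry.total_area(shapes))  # prints 7
```

Supporting another shape requires only a new struct and a new defimpl block. Neither HasArea nor Geometry has to change.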


Conclusion

This post is not very long, since most of the underlying theory has already been covered in the two previous articles. After understanding the basic principles and experimenting with OO-style complex data manipulation, it is really easy to switch to a pure FP style. It boils down to moving the "subject" argument (aka "this") to the first place in the parameter list, and using |> instead of the OO-ish dot (.) operator.

By doing this, we have decoupled data from behavior and are now programming in a functional style. The code is divided into small functions which depend only on their input arguments, and not on some global or private instance state. Such functions are highly reusable, composable, and easy to test and debug, which should be a good reason for using this approach. Hopefully, the example has demonstrated that it is not hard.

2 comments:

  1. Hi Saša. I can't believe I'm the first one to comment, because I found your articles really illuminating. I think you filled not one, but many of the gaps that create misunderstanding and misinformation about programming. You highlighted great points about
    – Elixir
    – Immutability
    – Functional Styles
    I'm recommending your articles to everyone and I'm going to make fruitful use of the understanding that I gained from reading them.

  2. Thank you for the feedback, and it makes me happy that you find those posts useful!

    Due to the work on Elixir in Action, I didn't have much time to deal with this blog and write more posts, but hopefully this will change soon(ish) :-)

    Thanks again!
