Archive for the ‘Functional programming’ Category

Reaction to “Clojure: Towards The Essence Of Programming” from a Scala perspective

Sunday, August 28th, 2011

How Scala addresses those problems

Just watched Howard Lewis Ship’s Clojure: Towards The Essence Of Programming, which is an excellent introduction to Clojure and how it addresses the problems in Java. As a Scala fan, I thought it would be a good idea to see how one could use Scala to do the same.

Ceremony vs Essence

Howard uses the example below to show how much ceremony (code with little real info) is required, hiding the essence (the real intention) when using Java to sort a list of Stock objects (a portfolio) either using different properties such as open values or last trade values:

 public static void sortByOpen(List<Stock> portfolio) {
    //ceremony: a lot of code but very little real info
    Comparator<Stock> c = new Comparator<Stock>() {
       public int compare(Stock o1, Stock o2) {
          return o1.getOpen() - o2.getOpen(); //essence: the beef is here
       }
    };
    Collections.sort(portfolio, c);
}

The Clojure version is:

//a Stock object is just a map with each field as a key. ":open" is the key to get the open value.
(sort-by :open portfolio)

In Scala, it can be as simple as the Clojure version:

//full version: using an anonymous function instead of a comparator object
portfolio.sortBy((s: Stock) => s.open)

//simpified version: sortBy is expecting the argument to be a function
// (Stock) => SOME-RESULT-TYPE, so we don't need to declare the
//type for "s".
portfolio.sortBy((s) => s.open)

//final version: as "s" only appears once in the function body, just use
//an underscore and get rid of the parameter declaration.
portfolio.sortBy(_.open)

The Clojure version is like the final Scala version in the first attempt because it doesn’t need typing information in the code. In Scala, the compiler infer it from the code.

In addition, in Clojure the object is just a map so people can refer to a property directly with its name without referring to the object. In Scala, we must use something to denote the object in the form of obj.property. However, the underscore syntax eliminates the need to declare that object, so the code is still very succinct.

The benefits of the Scala approach are:

  • Compile time checking: if the property name mismatches the Stock class, in the Scala the compiler will tell us immediately.
  • Runtime performance: it is a lot faster to access the property by offset than by looking up a key.

The cost we pay is mainly the learning curve: we must learn how Scala infers the typing information so that we know what to type and what to NOT type and learn the meaning of the underscore in an anonymous function.

Form is structure

I may not be fully understanding this example. Howard uses the Fahrenheit conversion, f =9*c/5+32, to show that in Clojure the form (the look on the surface) is the same as the structure (the actual relationship):

//the look is the same as the actual syntax tree
(+
  (/
    (* 9 c)
    5)
  32)

I definitely think the Java or Scala version is much simpler and more concise:

9*c/5+32

This is, after all, what we call DSL. Arithmetics has its rules of precedence and associativity, allowing us to write in a very succinct form to express more complex structure.

Read Eval Print Loop

Scala has it too. I like it very much too.

Functional programming

Howard that moves onto functional programming with examples using filter, reduce, map, for comprehension, stream. Obviously, Scala can do all these in a similar fashion. For example, to get the last trade total value of the portfolio, we can map each stock to its last trade value and then use reduce to sum them all up. In Clojure:

//#() means anonymous function. % refers to the one and only parameter (the stock).
(reduce + (map #(* (% :last-trade) (% :shares)) portfolio))

In Scala:

//+ in Scala is not a function, but a method. So we can't use it directly;
//instead, use an anonymous function with underscores again.
portfolio.map((s)=>s.lastTrade*s.shares).reduceLeft(_+_)

There is not much difference. However, I think the significance is much deeper. The question is not whether functional programming is supported or not, but how much it is intended to be used in place of procedural programming. Scala allows both styles of programming, which may or may not be a good thing, depending on how you view it. If you’re committed to the functional programming style for concurrency, testability, predictability, then using a language that is less capable of procedural programming is actually a good thing. If you’re more comfortable with procedural programming and just using functional programming in a smaller scale, then a hybrid language would be better.

Extending the language with DSL

Then Howard moves onto extending Clojure with DSL. One example he uses is building a DOM tree with a DSL.

(defview root-index
  [env]
  :html [
    :head [
      :title [ "Cascade Blog" ]
    ]
    :body [
      :h1 [ "Cascade Blog" ]
      :ul { :class "recent-postings" } [
        (template-for [posting (recent-postings env)]
          :li [
             (render-link env show-posting (posting :id) (posting :title))
           ])
      ]
    ]
  ])

As an exercise I implemented a similar DOM DSL in Scala:

//the DSL is implemented in the Html object
import Html._

//using the DSL to define a DOM tree
p @@ ("style" -> "color: red") containing (
        "This is an ",
        b containing ("interesting"),
        "DOM tree"
        )

They aren’t that different except that in Clojure each list item is separate from its neighbors by being enclosed in its own () but in Scala we must use a comma to separate the list items. So, choose your own poison: either tolerate a million ()’s or the ugly commas in the DSL.

Another difference is that in Clojure text can be freely mixed in the DOM tree as there is no typing restriction, while in Scala text is magically converted into a TextNode object. Again, the latter provides structure, compile-time protection and runtime performance, but we must learn the implicit conversion mechanism to understand what is going on.

Just to be complete, the DOM DSL is implemented in Scala as below:

//a DOM node
class Node

//a text node
case class TextNode(val text: String) extends Node

//an element node
case class Element(val name: String,
                   val attrs: Map[String, String],
                   val children: Seq[Node]) extends Node {
  def this(name: String) = this (name, Map(), List())

  //we can use symbols as method names, but @ is a reserved keyword in Scala,
  //so I used @@ as the name of the method that adds attributes to the element.
  def @@(attrPairs: (String, String)*) =
    //just add each pair to the map
    Element(name, attrPairs.foldLeft[Map[String, String]](Map())(_ + _), children)

  //add the child nodes (elements or text nodes) to this element
  def containing(children: Node*) = Element(name, attrs, children)
}

object Html {
  //convert strings into text nodes automatically
  implicit def string2TextNode(s: String) = TextNode(s)
  def p = new Element("p")
  def b = new Element("b")
}

Howard also shows the need for lazy evaluation so that some code is only executed on demand. For example, the DSL code to generate each <li> is repeatedly called in a loop. In Clojure lazy evaluation is implemented with macros which generate well-form code instead of arbitrary text. This seems to be more powerful (able to do more) than the call-by-name mechanism Scala, although I don’t know if/when this extra power is really needed.

Learn the basic concepts and significance of functional programming in 10 minutes

Sunday, July 4th, 2010

Introduction

If you’re a Java programmer, this short article will show you the basic concepts of functional programming and the reason why it may be getting more and more important in the near future. All sample code is in Java so it should be very easy to understand.

What is functional programming?

Functional programming is just like our programming in Java or C#, but without assignments. You may wonder is this possible at all? For example, to calculate the total price of a list of product prices, one might write in functional programming style (the code is still in Java):

abstract class List {
}

class EmptyList extends List {
}

class NonEmptyList extends List {
  int head;
  List tail;
}

class Foo {
  int getTotalPrice(List ps) {
    if (ps instanceof EmptyList) {
      return 0;
    } else {
      final NonEmptyList nl = (NonEmptyList) ps; //initialization, NOT assignment
      return nl.head + getTotalPrice(nl.tail);
    }
  }
}

Note that in line 17 there seems to be an assignment, but it is not: It is an initialization and the variable is declared as final, so no modification can be made. This is perfectly allowed in functional programming.

So you can see that assignment is not really required. But why the lack of assignment is a good thing?

Why functional programming may be huge in the near future?

To see the benefit, let’s assume that if a price is >= 100 then you’ll give a 20% off discount. So, you may modify the code as:

class Foo {
  int getTotalPrice(List ps) {
    if (ps instanceof EmptyList) {
      return 0;
    } else {
      final NonEmptyList nl = (NonEmptyList) ps;
      final int headDiscountedPrice = nl.head >= 100 ? (int) (nl.head * 0.8) : nl.head;
      final int tailPrice = getTotalPrice(nl.tail);
      return headDiscountedPrice + tailPrice;
    }
  }
}

Note the three initializations. Because all the variables and fields can’t be modified, their order is unimportant and, in a real functional programming language, the order can be changed at wish without changing the return value. For example, you could change it like:

class Foo {
  int getTotalPrice(List ps) {
    if (ps instanceof EmptyList) {
      return 0;
    } else {
      final int tailPrice = getTotalPrice(nl.tail); //nl is not initialized yet, how it works?
      final int headDiscountedPrice = nl.head >= 100 ? (int) (nl.head * 0.8) : nl.head;
      final NonEmptyList nl = (NonEmptyList) ps;
      return headDiscountedPrice + tailPrice;
    }
  }
}

Note that when initializing the tailPrice variable, the nl variable hasn’t been initialized yet. Will this cause a problem? No. In a real functional programming language, each of three variables will be initialized at the same time with a lazy expression. When the value is really needed, the lazy expression will be evaluated. So, if the value of tailPrice is needed but nl hasn’t been evaluated yet, it will be evaluated and the calculation will proceed. No matter what execution order is, the final total price will be the same.

Now, let’s get to the core issue of why this is important. As the order of these expressions are unimportant, they can be evaluated concurrently. As nowadays we’re getting more CPU cores instead of speedier single CPU, this programming model may become the mainstream in the future as the evaluations of different expressions can be done in different cores.

Note that the semicolon in the code now has a different meaning: In Java, it means sequential execution, but in a real functional programming language, it only separates the expressions but there is no ordering at all. That is, in a real functional programming language, there is no obvious concept of sequential execution.

How to maintain states or write to a database?

How to write to a database in functional programming? Let’s say there is special built-in methods to perform reading and writing to a certain database record. Let’s try to increment the value of the record twice with the code below:

class Foo {
  int read() {
    //...
  }
  void write(int v) {
    //...
  }
  void inc() {
    final int v = read();
    final int w = v+1;
    write(w);
    final int t = w+1;
    write(t); //Problem: It may occur before write(w)!
  }
}

The problem with the code is that, because there is no ordering, write(t) may be evaluated before write(w), so finally the value will have been increased by one, not two! The problem here is that, once we have assignment (the write operation to the record), the ordering becomes important, but there is no obvious ordering in functional programming.

Creating sequential ordering in functional programming

In fact, it is possible to create the effect of sequential evaluation in functional programming. For the above example, you can restructure the code as:

interface Function {
  Object eval(Object arg);
}

class Foo {
  int write(int v) {
    ...
  }
  void inc() {
    final int v = read();
    new Function() {

      public Object eval(Object arg) {
        final int w = v+1;
        final int r = write(w);
        return new Function() {

          public Object eval(Object arg) {
            final int t = w+1;
            return write(t);
          }
        }.eval(r); //Force r to be evaluated before going into eval()
      }
    }.eval(v); //Force v to be evaluated before going into eval()
  }

Now, you’re putting the logic into three steps: step 1: read the record. step 2: write w to the record. step 3: write t to the record. To make sure step 1 is executed before step 2, you wrap step 2 into a Function object and pass the result of step 1 (the “v” variable) as an argument to its eval() method. In a real functional programming language, when calling a method, its arguments will be fully evaluated. So, this guarantees that step 1 is executed before step 2. You use the same trick to put step 3 into a Function object inside step 2 and pass the result of step 2 (the “r” variable) as an argument to its eval() method. For this to work, the write() method must return something instead of void. So, I changed it to return an int.

It works, but the code looks complex. To make it simpler, you can introduce a method like:

class Foo {
  Object chain(Object arg, Function f) {
    return f.eval(arg);
  }
  void inc() {
    final int v = read();
    chain(v, new Function() {

      public Object eval(Object arg) {
        final int w = v+1;
        final int r = write(w);
        return chain(r, new Function() {

          public Object eval(Object arg) {
            final int t = w+1;
            return write(t);
          }
        });
      }
    });
  }

Once you have such a chain() method, you can perform more interesting things. For example, instead of passing the previous result directly as the argument, you could expect it to be an instruction to access the database so that you perform the database access in the chain() in the top level of your program, so that all the code inside is free of any side-effect:

class Read {
}

class Write {
  int v;

  public Write(int v) {
    this.v = v;
  }
}
class Foo {
  Object chain(Object arg, Function f) {
    if (arg instanceof Read) { //arg is now the instruction
      return f.eval(read()); //Perform the side effect here
    }
    if (arg instanceof Write) {
      final int v = ((Write)arg).v;
      return f.eval(write(v)); //Perform the side effect here
    }
    return null;
  }
  void inc() {
    chain(new Read(), new Function() {

      public Object eval(Object arg) {
        final int w = (Integer)arg+1;
        final Write r = new Write(w);
        return chain(r, new Function() {

          public Object eval(Object arg) {
            final int t = w+1;
            return chain(new Write(t), ...);
          }
        });
      }
    });
  }
}

In addition, you can perform other interesting things in the chain() method. For example, the argument could also contain an integer indicating the number of available resources. Every time you could reduce it by one before passing to the next step. If it is zero, you could abort.

BTW, you’ve just learned one of the most intriguing concepts in functional programming: monad. It is just the way above to achieve sequencing, optionally manipulating the previous result, and combining it with the next step.

Better syntax?

Even though it works, the multiple levels of embedding is still very complicated. In a real functional programming, you won’t need to write so much to create the anonymous Function object, to define the eval() method and you probably can use an infix operator in place of the chain() method. For example, in Haskell, the code can be written something like this:

  inc =
    new Read() >>=
    \v ->
        w = v+1
        new Write(w) >>=
    \r ->
        t = w+1
        new Write(t)

Here the >>= infix operator has replaced for the chain() method. The \x -> expresssion defines an anonymous function and x is the argument. It looks simple, right?

What can you do next?

Now you’ve learned the significance of functional programming. What to do next? If you’d like to learn more about functional programming, if you prefer static typing, I’d recommend that you study Haskell by going through this excellent tutorial. Then, for production, you may check out Scala which supports functional programming and integration with JVM and all existing Java classes. It borrows a lot of concepts from Haskell. It also supports assignments and OO, so it can ease the migration from Java.

If you prefer dynamic typing, for production, you may check out Clojure which also integrates with JVM and existing Java classes.