Monday, July 02, 2018

Simple, robust code: part one, simplicity

1. Simple as the opposite of complex


Complexity in software is the root of all evil, and simplicity is the opposite of complexity. Simple is not the same as easy: sometimes we make software complex precisely because it is easy (think of adding a whole library when you need just one function from it; later it needs to be upgraded, the new version is incompatible with other libraries, and so on).

A complex system is like this, where it is very hard to figure out what is going on, so it cannot be debugged, extended or changed:




and a simple one is the opposite:



2. Simplicity in software: DATA and FUNCTIONS

We use computers to compute (apply functions) some data we need (the final state of a system), given some initial data (initial state of the system). So if we drastically reduce what software does, we end up with just data and functions.





Example:




Obviously this sounds overly simplistic. Real code is more complex, and more functions are needed:

function greet(name){
    var a = ["hello ", name];
    var b = capitalize_first_letter(a);
    var c = concat(b);
    return c;
}


or we could:





Which starts to look like a pipe, where you send the initial_state, and expect at the end the final state.

Now if we need to solve a real world problem, I guess we could solve it by having:
- lots of simple functions that take one parameter as input and return one value
- because they have one parameter in and one value out, they can be composed
- simple functions put together as a pipeline, which can solve very complex problems in a very simple way
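The list above can be sketched in Python 3 as follows. The helper names (`pipe`, `add_greeting`, `capitalize_first_letter`, `concat`) are made up for illustration and only mirror the greeting example earlier in this post; this is a minimal sketch, not the only way to build a pipeline.

```python
from functools import reduce

def add_greeting(name):
    # one value in, one value out
    return ["hello ", name]

def capitalize_first_letter(parts):
    # capitalize only the first element of the list
    return [parts[0].capitalize()] + parts[1:]

def concat(parts):
    return "".join(parts)

def pipe(*functions):
    # hypothetical helper: feed the output of each function into the next
    def run(initial_state):
        return reduce(lambda state, f: f(state), functions, initial_state)
    return run

greet = pipe(add_greeting, capitalize_first_letter, concat)
print(greet("dan"))  # Hello dan
```

Because every step takes one value and returns one value, steps can be added, removed or reordered without touching the others.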

3. Functional composition


Now we could compose the two functions into just one:



4. Example: From complex to simple using functional composition


A few years ago, I made a practical example. I'll add it simplified here.

Requirement: in the JSON that we receive on a server, we need to have a key “measurement” that is mandatory, cannot be null, needs to be a string and cannot be an empty string. We also need to make sure the length of the string is between 3 and 10 characters, and that it is not a reserved word like “password” or “archived”. So the code is like:



 def validate_simplest_json(json):  
   errors = []  
   if not json.has_key("measurement"):  
     errors.append("measurement cannot be missing")  
   else:  
     if json["measurement"]==None:  
       errors.append("measurement cannot be null")  
     else:  
       if not isinstance(json["measurement"], str) and not isinstance(json["measurement"], unicode):  
         errors.append("measurement needs to string or unicode")  
       else:  
         lenm=len(json["measurement"].strip())  
         if lenm==0:  
           errors.append("measurement cannot be an empty string")  
         else:  
           if lenm<3:
             errors.append("measurement needs at least 3 characters")
           elif lenm>10:
             errors.append("measurement needs at most 10 characters")  
           elif json["measurement"].strip().lower() in ["archived","password"]:  
             errors.append("measurement has a value which is not allowed")  
   return errors  


Removing complexity can mean more linear code, an initial state, and simple composable functions:

 from collections import namedtuple
 ValidationState = namedtuple("ValidationState","json key errors exit")

then I will extract the actual validations in simple functions, like:

 def validate_simplest_json_imperative_linear_with_state(json):  
   initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False)  
   
   state = validate_key_exists(initial_state)  
   
   if not state.exit:  
     state = validate_not_null(state)  
   
   if not state.exit:  
     state = validate_string_or_unicode(state)  
   
   if not state.exit:  
     state = validate_not_empty_string(state)  
   
   if not state.exit:  
     state = validate_length(state, 3,10)  
   
   if not state.exit:  
     state = validate_not_in(state, ["archived","password"])  
   
   return state.errors  

And the functions are like:

 def validate_key_exists(state):  
   print validate_key_exists.__name__,state  
   if not key_exists(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} cannot be missing".format(state.key)])._replace(exit=True)  
   return state  
   
 def validate_not_null(state):  
   print validate_not_null.__name__,state  
   if value_null(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} cannot be null".format(state.key)])._replace(exit=True)  
   return state  
   
...


The code is now a series of functions, each run with the result of the previous one unless the exit field is set to True. So two functions f and g would be composed like:

initial_state = …
state = f(initial_state)
if not state.exit:
    return g(state)

And putting this in a function:

 def compose2(f, g):  
   def run(x):  
     result_f = f(x)  
     if not result_f.exit:  
       return g(result_f)  
     else:  
       return result_f  
   return run  

 #compose n functions  
 def compose(*functions):  
   return reduce(compose2, functions)  
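In Python 3, `reduce` is no longer a builtin and has to be imported from `functools`. A minimal, runnable sketch of the same idea follows; the two validators (`require_positive`, `require_even`) and the `State` tuple are invented here purely to exercise `compose`.

```python
from collections import namedtuple
from functools import reduce

# illustrative state; the article's ValidationState has more fields
State = namedtuple("State", "value errors exit")

def compose2(f, g):
    def run(x):
        result_f = f(x)
        # stop the chain as soon as a step asks to exit
        return result_f if result_f.exit else g(result_f)
    return run

def compose(*functions):
    return reduce(compose2, functions)

def require_positive(state):
    if state.value <= 0:
        return state._replace(errors=state.errors + ["must be positive"], exit=True)
    return state

def require_even(state):
    if state.value % 2 != 0:
        return state._replace(errors=state.errors + ["must be even"], exit=True)
    return state

check = compose(require_positive, require_even)
print(check(State(value=-3, errors=[], exit=False)).errors)  # ['must be positive']
print(check(State(value=3, errors=[], exit=False)).errors)   # ['must be even']
print(check(State(value=4, errors=[], exit=False)).errors)   # []
```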

And now the final validation code:

 def validate_simplest_functional_composition(json):  
   initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False)  
   
   composed_function = compose(
          validate_key_exists,
          validate_not_null,
          validate_string_or_unicode,
          validate_not_empty_string,
          create_validate_length(3, 10),
          create_validate_not_in(["archived","password"]))
   final_state = composed_function(initial_state)  
   
   return final_state.errors  

It is much better. It basically says: given an initial state of the system, run all these functions (validators) and at the end get a final state. Code: http://runnable.com/VNMhoTKLSn9Tm0GI/fighting-complexity-through-functional-composition-for-python
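The `create_validate_length(3, 10)` call above returns a validator instead of being one: `compose` only passes the state along, so any extra settings have to be captured beforehand. This post does not show that factory, so the version below is a guess at its shape, written for Python 3 with the same `ValidationState` fields.

```python
from collections import namedtuple

ValidationState = namedtuple("ValidationState", "json key errors exit")

def create_validate_length(min_len, max_len):
    # assumed implementation: closes over the limits and
    # returns a validator that takes only the state
    def validate_length(state):
        length = len(state.json[state.key].strip())
        if length < min_len:
            message = "{0} needs at least {1} characters".format(state.key, min_len)
        elif length > max_len:
            message = "{0} needs at most {1} characters".format(state.key, max_len)
        else:
            return state
        return state._replace(errors=state.errors + [message], exit=True)
    return validate_length

validate = create_validate_length(3, 10)
state = ValidationState(json={"measurement": "ab"}, key="measurement", errors=[], exit=False)
print(validate(state).errors)  # ['measurement needs at least 3 characters']
```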


And it is:




5. So where can I use this?


If you're a backend developer, you can use it on a server (python example):

@mod.route('/api/1/save/', methods=['POST'])
@pi_service()
def generic_save(version=1, typ=None):
    composed_func = compose_list(
    [
        can_write("tags"),
        change("json", request.json),
        change("session", get_session()),
        change("type", get_pi_type(typ)),
        change("object", None),
        change("transformer", get_pi_transformer(typ)),
        get_database_object,
        transform_from_json,
        save_database_object,
        index_tag_or_tag_group,
        pi_transform_to_json,
    ])
    return composed_func({})

or in a PDF generating server, written in Clojure over Apache Batik (using transducers but that's another discussion)



If you're a front-end developer, you could use JavaScript promises for piping, with React. The state of the system is the model (immutable), and rendering is done by views.render:

StoryboardController.prototype.move_point_by = function(page_object, point_index, dx, dy) {
    pi.startWith(model,"MOVE POINT BY")
        .then(function move_point_by(state){
            pi.info("move point by", page_object, point_index, dx, dy);
            var cursor = get_selected_layer_cursor(state) + ".children" + find_cursor_pageobject(page_object, state);
            if (cursor) {
                var point_cursor = cursor+".points["+point_index+"]";
                var point = pi.pi_value(state, point_cursor);
                var changes = {};
                var nx=point.x+dx;
                var ny=point.y+dy;
                changes[point_cursor+".x"]=nx;
                changes[point_cursor+".y"]=ny;

                state = pi.pi_change_multi(state, changes);

                return resize_shape(state, cursor);
            }
            return state;
        })
        .then(views.render)
        .then(swap_model)
        .then(REST.try_save_page)
}

or


 
or in Clojurescript, where the state of the system is an atom (model) and every time it changes, the view is rerendered:




6. Conclusion


Using this model, code is easier to understand, debug, change, extend. Why: 
- all the data is in one place

initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False) 

- functions are simple 

 def validate_key_exists(state):  
   print validate_key_exists.__name__,state  
   if not key_exists(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} cannot be missing".format(state.key)])._replace(exit=True)  
   return state 

- intermediary states can be easily debugged

 composed_function = compose(
          validate_key_exists,
          validate_not_null,
          debug,
          validate_string_or_unicode,
          validate_not_empty_string,
          create_validate_length(3, 10),
          create_validate_not_in(["archived","password"]))
 final_state = composed_function(initial_state)


 def debug(state):  
   print state.json, state.key, state.errors, state.exit
   return state 

- data changes flow in a single direction

In part two (robustness) we'll see how we could also make the code robust by running it transactionally, the same way databases do: either everything runs, or nothing does and the state is reverted to the previous one.








Thursday, April 13, 2017

What is wrong with static typing in JavaScript and how clojure.spec solves the problem (Part 2)

The problem


The main problem with dynamic typing seems to be fear that the wrong type of data will end up in the wrong place (function input for instance).

However it turns out static typing is pretty useless.

Example 1: Simple types Age


Let's say you have to record an age for a person in a variable, or have it as a parameter in a function.

You would do something like:

int age = 25;

or

function something(int age...)

Some people even claim that static typing shows intent. But the only thing stated here is that it will be an int. Nothing protects the age from being negative (-2) or too big (4000, which might work if you're talking about the age of the pyramids). So it is not intent, it is just int, with no further protection, so pretty much useless.

Solution 1

In a Clojure REPL, using spec (require '[clojure.spec :as s]), we define a spec saying we want a natural int (a non-negative int) and that it should be smaller than, let's say, 150:

(s/def ::age (s/and nat-int? (fn [x] (< x 150))))

Now:

user=> (s/valid? ::age "a")
false
user=> (s/valid? ::age -12)
false
user=> (s/valid? ::age true)
false
user=> (s/valid? ::age 1.21)
false
user=> (s/valid? ::age 4000)
false
user=> (s/valid? ::age -2)
false
user=> (s/valid? ::age 0)
true
user=> (s/valid? ::age 12)
true
user=> (s/valid? ::age 29)
true
user=> (s/valid? ::age 99)
true
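For readers who do not have a Clojure REPL at hand, the same idea, a value described by a conjunction of predicates rather than by a type, can be approximated in Python. This is only a rough analogue: `all_of`, `nat_int` and `valid_age` are names invented here, and clojure.spec does far more than this.

```python
def nat_int(x):
    # mirrors nat-int?: a non-negative integer (bool is excluded on purpose,
    # because in Python True and False are instances of int)
    return isinstance(x, int) and not isinstance(x, bool) and x >= 0

def all_of(*predicates):
    # hypothetical helper playing the role of s/and;
    # all() stops at the first predicate that fails
    return lambda x: all(p(x) for p in predicates)

valid_age = all_of(nat_int, lambda x: x < 150)

for candidate in ["a", -12, True, 1.21, 4000, 0, 12, 99]:
    print(repr(candidate), valid_age(candidate))
```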


Example 2: Composed types: Person


Let's say we get through a web call a json like:

{
  "id":6,
  "name":"Dan",
  "age":28
}

Usually people would create a class

class Person
{
    int id;
    string name;
    int age;
}

So we have the same problems, for instance name might be null, or age might be negative.

Then in modern apps, you get JSON and you send JSON, so you need to be able to serialize and deserialize this class to and from JSON. What happens if one of the fields does not conform or is missing?

Solution 2

(s/def ::id nat-int?)
(s/def ::name string?)
(s/def ::person (s/keys :req-un [::id ::name ::age]))

Now:

user=> (s/valid? ::person {:id 1, :name "Adi"})
false
user=> (s/valid? ::person {:id 1, :name "Adi" :age -1})
false
user=> (s/valid? ::person {:id 1, :name "Adi" :age 40})
true

What's even cooler, is that if it isn't valid, you can get an explanation:

user=> (s/explain ::person {:id 1, :name "Adi" :age -1})
In: [:age] val: -1 fails spec: :user/age at: [:age] predicate: nat-int?

user=> (s/explain ::person {:id 1})
val: {:id 1} fails spec: :user/person predicate: (contains? % :name)
val: {:id 1} fails spec: :user/person predicate: (contains? % :age)

Example 3: Hierarchies 


What if you have:

{
    "id": 6,
    "name": "Dan",
    "age": 28,
    "children": [{
            "id": 7,
            "name": "Alex",
            "age": 5
        }
    ]
}

What if the first object is a parent in a school, and must have at least one child? You can enforce the relationship by writing a function and calling it in the constructor, but then you also need to change the serialization/deserialization from JSON to enforce the rules. Of course you will write more code, you will forget the check once, and there will be a bug.

And the most common problem of our times. The json is like:

{
    "id": 11,
    "name": "Maria",
    "age": 95,
    "children": [{
            "id": 5,
            "name": "Elena",
            "age": 28,
            "children": [{
                    "id": 6,
                    "name": "Dan",
                    "age": 60,
                    "children": [{
                        "id": 7,
                        "name": "Alex",
                        "age": 5
                    }]
                },
                {
                    "id": 9,
                    "name": "Alina",
                    "age": 32,
                    "children": [{
                        "id": 121,
                        "name": "Luiza",
                        "age": 0
                    }]
                }
            ]
        },
        {
            "id": 23,
            "name": "Petru",
            "age": 70,
            "children": [{
                "id": 4,
                "name": "Adrian",
                "children": [{
                    "id": 45,
                    "name": "Denis",
                    "age": 12
                }]
            }]
        }
    ]
}

You have a single error but where? (Maria / Petru / Adrian - missing "age"). It is not only hard to validate it but it is hard to show explicitly where the error occurred.
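To make "showing where" concrete: a recursive walk can carry the chain of names down the tree and return the path of every person missing a key. This is a plain Python sketch with invented names (`find_missing`, `REQUIRED`), not how clojure.spec does it, and the data is a cut-down version of the JSON above.

```python
REQUIRED = ("id", "name", "age")

def find_missing(person, path=()):
    # path is the chain of names from the root down to this person
    here = path + (person.get("name", "?"),)
    problems = [(" / ".join(here), key) for key in REQUIRED if key not in person]
    for child in person.get("children", []):
        problems.extend(find_missing(child, here))
    return problems

family = {
    "id": 11, "name": "Maria", "age": 95,
    "children": [
        {"id": 23, "name": "Petru", "age": 70,
         "children": [
             {"id": 4, "name": "Adrian",
              "children": [{"id": 45, "name": "Denis", "age": 12}]}]}],
}

print(find_missing(family))  # [('Maria / Petru / Adrian', 'age')]
```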

Solutions

s/+ says that there will be a collection of persons with a minimum of 1:

(s/def ::children (s/+ ::person))
(s/def ::parent (s/keys :req-un [::id ::name ::age ::children]))

Now:

user=> (s/valid? ::parent {:id 1, :name "Adi" :age 40 :children []})
false
user=> (s/valid? ::parent {:id 1, :name "Adi" :age 40 :children [{:id 1, :name "Adi" :age 40}]})
true
user=> (s/valid? ::parent {:id 1, :name "Adi" :age 40 :children [{:id 1, :name "Adi" :age 40}, {:id 2, :name "Dan" :age 20}]})
true

and if we don't have children:

(s/explain ::parent {:id 1, :name "Adi" :age 40 :children []})
In: [:children] val: () fails spec: :user/person at: [:children] predicate: :user/person,  Insufficient input


Even cooler is that you can check relations between data, like whether the children are younger than their parents:

(defn parent-older-than-children? [parent] (every? #(> (:age parent) (:age %)) (:children parent)))

we redefine the ::parent

(s/def ::parent (s/and (s/keys :req-un [::id ::name ::age ::children]) parent-older-than-children?))

user=> (s/valid? ::parent {:id 1, :name "Adi" :age 40 :children [{:id 1, :name "Adi" :age 50}]})
false
user=> (s/explain ::parent {:id 1, :name "Adi" :age 40 :children [{:id 1, :name "Adi" :age 50}]})
val: {:id 1, :name "Adi", :age 40, :children [{:id 1, :name "Adi", :age 50}]} fails spec: :user/parent predicate: parent-older-than-children?

user=> (s/valid? ::parent {:id 1, :name "Adi" :age 45 :children [{:id 1, :name "Adi" :age 25}, {:id 2, :name "Dan" :age 20}]})
true


In Part 3 we will look at functions and one more thing ...

Thursday, March 30, 2017

What is wrong with static typing in JavaScript and how clojure.spec solves the problem (Part 1)

The problem


The main problem with dynamic typing seems to be fear that the wrong type of data will end up in the wrong place (function input for instance).

However it turns out static typing is pretty useless.

Example 1: Simple types Age


Let's say you have to record an age for a person in a variable, or have it as a parameter in a function.

You would do something like:

int age = 25;

or

function something(int age...)

Some people even claim that static typing shows intent. But the only thing stated here is that it will be an int. Nothing protects the age from being negative (-2) or too big (4000, which might work if you're talking about the age of the pyramids). So it is not intent, it is just int, with no further protection, so pretty much useless.

Example 2: Composed types: Person


Let's say we get through a web call a json like:

{
  "id":6,
  "name":"Dan",
  "age":28
}

Usually people would create a class

class Person
{
    int id;
    string name;
    int age;
}

So we have the same problems, for instance name might be null, or age might be negative.

Then in modern apps, you get JSON and you send JSON, so you need to be able to serialize and deserialize this class to and from JSON. What happens if one of the fields does not conform or is missing?

Example 3: Hierarchies 


What if you have:

{
    "id": 6,
    "name": "Dan",
    "age": 28,
    "children": [{
            "id": 7,
            "name": "Alex",
            "age": 5
        }
    ]
}

What if the first object is a parent in a school, and must have at least one child? You can enforce the relationship by writing a function and calling it in the constructor, but then you also need to change the serialization/deserialization from JSON to enforce the rules. Of course you will write more code, you will forget the check once, and there will be a bug.

And the most common problem of our times. The json is like:

{
    "id": 11,
    "name": "Maria",
    "age": 95,
    "children": [{
            "id": 5,
            "name": "Elena",
            "age": 28,
            "children": [{
                    "id": 6,
                    "name": "Dan",
                    "age": 60,
                    "children": [{
                        "id": 7,
                        "name": "Alex",
                        "age": 5
                    }]
                },
                {
                    "id": 9,
                    "name": "Alina",
                    "age": 32,
                    "children": [{
                        "id": 121,
                        "name": "Luiza",
                        "age": 0
                    }]
                }
            ]
        },
        {
            "id": 23,
            "name": "Petru",
            "age": 70,
            "children": [{
                "id": 4,
                "name": "Adrian",
                "children": [{
                    "id": 45,
                    "name": "Denis",
                    "age": 12
                }]
            }]
        }
    ]
}

You have a single error but where? (Maria / Petru / Adrian - missing "age"). It is not only hard to validate it but it is hard to show explicitly where the error occurred.

Example 4: Functions


Let's try to find a string in another string. A function would be like:

int indexOf(string search, string what) ...

Which tells you that you will get an int, and that you can pass two strings. First, what if the strings are null? What if both strings are empty ("", "")? What if the result for "ab", "b" is 1248764 or -12? According to the function definition it is an int, and should be valid.
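The questions above are about behaviour the signature cannot express. One way to write them down is as checks on the result. The sketch below does that in Python, with `index_of` and `check_index_of` as made-up names and Python's own `str.find` standing in for the implementation.

```python
def index_of(search, what):
    # stand-in implementation: str.find returns -1 when `what` is absent
    return search.find(what)

def check_index_of(search, what):
    result = index_of(search, what)
    # the type only says "int"; these are the rules the int must also obey
    if result == -1:
        assert what not in search
    else:
        assert 0 <= result <= len(search) - len(what)
        assert search[result:result + len(what)] == what
    return result

print(check_index_of("ab", "b"))   # 1
print(check_index_of("ab", "z"))   # -1
print(check_index_of("", ""))      # 0
```

A result such as 1248764 or -12 for "ab", "b" would pass the type check but fail these assertions.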

Example 5: Unit testing


To ensure the function above is well specified, we also use unit testing. Problem 1: unit testing doesn't care whether typing is static or dynamic. Problem 2 is specific to unit testing itself: having enough tests, maintaining them when the function changes (like adding a new parameter), and avoiding tests that are too few or too optimistic.


The solution proposed by Clojure.spec will be shown in part 2. And it is pretty cool! :)

Friday, March 17, 2017

Sorting maps in Clojure

Problem 

If you like to keep data in maps in ClojureScript to be able to access it fast, but also need sorting, maybe you should read this.

 Cause

 Let's say you have a map like:

 (def a {:0 0, :1 1, :2 2, :3 3, :4 4, :5 5, :6 6, :7 7}) 
 (vals a) would return: (0 1 2 3 4 5 6 7) 

 But what about

 (def b {:0 0, :1 1, :2 2, :3 3, :4 4, :5 5, :6 6, :7 7, :8 8})

 where

 (vals b) returns: (6 7 4 5 1 0 3 2 8) 

 The trick is how the data is represented internally. If the number of pairs is at most 8, as in a, then (type a) is a clojure.lang.PersistentArrayMap, but (type b) is a clojure.lang.PersistentHashMap, which is optimized for access but loses order as a compromise.

 If we generate our maps using a function:

 (defn gen [x] (doall (map (fn [x] [(keyword (str x)) x]) (range x))))

 try:

 (->> (gen 8) 
         (into {}) 
          type ) 

 (->> (gen 9) 
         (into {}) 
          type ) 

and you'll see for yourself.

Wednesday, October 19, 2016

Learn clojure.spec (cljs,spec) interactively

There is one fantastic tool called klipse which allows you to run Clojurescript interactively in the browser. It is even more helpful when it allows you to learn something like clojure.spec by examples that actually run in realtime and which you can change and see how everything works: http://blog.klipse.tech/clojure/2016/05/30/spec.html

Friday, December 18, 2015

Clojurescript/Reagent: How to start in 1 minute


What is a great development environment for web applications?


One that:

  1.  can be started and configured extremely easily
  2.  allows instant feedback
  3.  can be done in a great programming language  
  4.  doesn't need expensive tools


So ClojureScript. For Clojure/ClojureScript projects you need one tool installed: Leiningen, which will be used from the command line. (Install from here. You need Java installed first.)

1. can be started and configured extremely easily


Create a new project

lein new figwheel hello_world -- --reagent

And start it:

cd hello_world

lein figwheel

Now, go in your browser to localhost:3449



It is already there!!!

2. instant feedback


Open the code using some tool, preferably LightTable and do this change in src/hello_world/core.cljs in the hello_world component:

{:style {:color "red"}}

Save the file.



It is already in the browser!!

Now, let's do the same from the console.


(in-ns 'hello_world.core)

Now let's change the text to Hello Dan!

(swap! app-state assoc :text "Hello Dan!")


Boom!



 It is already in the browser. Figwheel takes care of all that!


Now 3 and 4 are answered by clojurescript which is based on the great programming language clojure. We're also using Reagent (clojurescript library on top of Facebook React) and figwheel which allows all the instant feedback stuff.



Saturday, October 17, 2015

REST api using Clojure and MySql. Is Clojure the most productive, robust language on the planet?

In my life, I have written a lot of web applications, using Java, then .NET, PHP and lately REST APIs using Python. Until today, I thought nothing could match the productivity of Python with Flask and SQLAlchemy.

Making a REST api in Clojure using Ring/Compojure and SqlKorma 

There is a great package manager called Leiningen (http://leiningen.org). To create a new web application, in a Terminal:

> lein new compojure todoapp2
> cd todoapp2

Now you have the skeleton application with a fairly familiar structure. Next we configure which packages will be used, by editing project.clj:

(defproject todoapp2 "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :min-lein-version "2.0.0"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [compojure "1.3.1"]
                 [ring/ring-core "1.3.2"]
                 [ring/ring-json "0.3.1"]
                 [ring/ring-defaults "0.1.4"]
                 [korma "0.3.0-RC5"]
                 [mysql/mysql-connector-java "5.1.6"]]
  :plugins [[lein-ring "0.8.13"]]
  :ring {:handler todoapp2.handler/app}
  :profiles
  {:dev {:dependencies [[javax.servlet/servlet-api "2.5"]
                        [ring-mock "0.1.5"]]}})

Now to add all dependencies:

> lein deps 

Basically we need the json package, sqlkorma and the mysql JDBC driver. All installed. 

In MySQL we'll create a database todo, where we create a table items, with id, title (varchar) and is_complete.

Now let's create a database.clj file where we configure the database connection details:

(ns todoapp2.database
  (:require [korma.db :as korma]))

(def db-connection-info (korma/mysql 
  {:classname "com.mysql.jdbc.Driver"
   :subprotocol "mysql"
   :user "root"
   :subname "//localhost:3306/todo"}))

; set up korma
(korma/defdb db db-connection-info)

Now let's write the database access functions, in a new file: query.clj

(ns todoapp2.query
  (:require [todoapp2.database]
            [korma.core :refer :all]))

(defentity items)

(defn get-todos []
  (select items))

(defn add-todo [title]
  (insert items
          (values {:title title})))

(defn delete-todo [id]
  (delete items
          (where {:id [= id]})))

(defn update-todo [id title is-complete]
  (update items
          (set-fields {:title title
                       :is_complete is-complete})
          (where {:id [= id]})))

(defn get-todo [id]
  (first
    (select items
          (where {:id [= id]}))))

All done. SqlKorma is extremely easy to use, very composable.

Ok, now let's write the REST services:

(ns todoapp2.handler
  (:require [compojure.core :refer :all]
            [compojure.handler :as handler]
            [compojure.route :as route]
            [ring.middleware.json :as json]
            [ring.util.response :refer [response]]
            [todoapp2.query :refer :all]))

(defroutes app-routes
  (GET "/api/todos" []
       (response (get-todos)))
  (GET "/api/todos/:id" [id]
       (response (get-todo (Integer/parseInt id))))
  (POST "/api/todos" [title]
       (response (add-todo title)))
  (PUT "/api/todos/:id" [id title is_complete]
       (response (update-todo (Integer/parseInt id) title is_complete)))
  (DELETE "/api/todos/:id" [id]
        (response (delete-todo (Integer/parseInt id))))
  (route/resources "/")
  (route/not-found "Not Found"))

(def app
  (-> (handler/api app-routes)
      (json/wrap-json-params)
      (json/wrap-json-response)))

Starting the server:

> lein ring server

Using a tool like Advanced REST Client plugin for Chrome will allow you to use the API:




And accessing http://localhost:3000/api/todos will show you what you created. 

Conclusion


Pretty awesome!

Friday, February 06, 2015

Fighting complexity through functional composition, part 1: how to implement functional composition


The absolute enemy in software (and other things as well) is complexity. Taking complexity as the opposite of simplicity, it makes our systems hard to understand, hard to debug and hard to extend or adapt. There is a very good talk about this by the great Rich Hickey, called Simplicity Matters. The video: https://www.youtube.com/watch?v=rI8tNMsozo0. So let's see how we can fight complexity in a practical example by using functional composition.

The problem 



Considering that these days most of the integration is done through web services and JSON, we'll try to illustrate the complexity problem using an example from this area.

Requirement: in the json, we need to have a key “measurement”, that is mandatory, cannot be null, needs to be a string and cannot be empty string.

We’ll do a little bit of TDD here, starting with a test:

Python:
 class TestValidations(unittest.TestCase):  
   
   def validate_pair(self,json, is_valid, number_of_errors):  
     errors = validate_simplest_json(json)  
     print "",is_valid, errors, json  
     self.assertEquals(len(errors)==0,is_valid)  
     self.assertEquals(len(errors),number_of_errors)  
   
   def test_json_validation(self):  
     self.validate_pair({},False,1)  
   
   
 def validate_simplest_json(json):  
   errors = []  
   return errors  



All fail, which is great. Now let’s write the code, to check if the key is there:

 def validate_simplest_json(json):  
   errors = []  
   if not json.has_key("measurement"):  
     errors.append("measurement cannot be missing")
   return errors    

Pass. Now what about null? The test extends:

   def test_json_validation(self):  
     self.validate_pair({},False,1)  
     self.validate_pair({"measurement":None},False,1)  

the code to pass:

 def validate_simplest_json(json):  
   errors = []  
   if not json.has_key("measurement"):  
     errors.append("measurement cannot be missing")  
   else:  
     if json["measurement"]==None:  
       errors.append("measurement cannot be null")
   return errors  

Pass. Now let’s check if it is a string (or unicode):

   def test_json_validation(self):  
     self.validate_pair({},False,1)  
     self.validate_pair({"measurement":None},False,1)  
     self.validate_pair({"measurement":-1},False,1)  
     self.validate_pair({"measurement":{}},False,1)  
     self.validate_pair({"measurement":False},False,1)  
     self.validate_pair({"measurement":"abc"},True,0)  
     self.validate_pair({"measurement":u"Citroën"},True,0)  
   
 def validate_simplest_json(json):  
   errors = []  
   if not json.has_key("measurement"):  
     errors.append("measurement cannot be missing")  
   else:  
     if json["measurement"]==None:  
       errors.append("measurement cannot be null")  
     else:  
       if not isinstance(json["measurement"], str) and not isinstance(json["measurement"], unicode):  
         errors.append("measurement needs to string or unicode")  
   return errors  

Now we also need to check that it is not an empty string:

   def test_json_validation(self):  
     self.validate_pair({},False,1)  
     self.validate_pair({"measurement":None},False,1)  
     self.validate_pair({"measurement":-1},False,1)  
     self.validate_pair({"measurement":{}},False,1)  
     self.validate_pair({"measurement":False},False,1)  
     self.validate_pair({"measurement":"abc"},True,0)  
     self.validate_pair({"measurement":u"Citroën"},True,0)  
     self.validate_pair({"measurement":""},False,1)  
   
 def validate_simplest_json(json):  
   errors = []  
   if not json.has_key("measurement"):  
     errors.append("measurement cannot be missing")  
   else:  
     if json["measurement"]==None:  
       errors.append("measurement cannot be null")  
     else:  
       if not isinstance(json["measurement"], str) and not isinstance(json["measurement"], unicode):  
         errors.append("measurement needs to string or unicode")  
       else:  
         if len(json["measurement"].strip())==0:  
           errors.append("measurement cannot be an empty string")  
   return errors  

All of a sudden, we hear we also need to make sure the length of the string is between 3 and 10 characters, and that it cannot be a reserved word like “password” or “archived”.

Ok, so let’s code, expanding our tests:

   def test_json_validation(self):  
     self.validate_pair({},False,1)  
     self.validate_pair({"measurement":None},False,1)  
     self.validate_pair({"measurement":-1},False,1)  
     self.validate_pair({"measurement":{}},False,1)  
     self.validate_pair({"measurement":False},False,1)  
     self.validate_pair({"measurement":"abc"},True,0)  
     self.validate_pair({"measurement":u"Citroën"},True,0)  
     self.validate_pair({"measurement":""},False,1)  
     self.validate_pair({"measurement":"a"},False,1)  
     self.validate_pair({"measurement":"abcdefghijklmnefghij"},False,1)  
     self.validate_pair({"measurement":"password"},False,1)  
     self.validate_pair({"measurement":"archived"},False,1)  
     self.validate_pair({"measurement":"arCHived"},False,1)  

Then gradually we start coding the validation, arriving at:

 def validate_simplest_json(json):  
   errors = []  
   if not json.has_key("measurement"):  
     errors.append("measurement cannot be missing")  
   else:  
     if json["measurement"]==None:  
       errors.append("measurement cannot be null")  
     else:  
       if not isinstance(json["measurement"], str) and not isinstance(json["measurement"], unicode):  
         errors.append("measurement needs to be a string or unicode")
       else:  
         lenm=len(json["measurement"].strip())  
         if lenm==0:  
           errors.append("measurement cannot be an empty string")  
         else:  
           if lenm<3:
             errors.append("measurement needs at least 3 characters")
           elif lenm>10:
             errors.append("measurement needs at most 10 characters")  
           elif json["measurement"].strip().lower() in ["archived","password"]:  
             errors.append("measurement has a value which is not allowed")  
   return errors  

As requirements are added, complexity grows. Of course this code could be refactored, but eliminating the underlying complexity is very hard. Just imagine what happens if in version 1.2 the customer changes the API to only allow measurement values such as “0.12mm” or “13.2mg”. The code will grow again and become even more complex. Not pretty!

And JSON with only one key is rare. Usually there are many more keys, and the code is correspondingly bigger. Bigger and more complex = disaster. In terms of code quality, such code is neither easy to extend nor easy to debug.


The solution: implementing functional composition



Removing complexity can mean more linear code, so let’s refactor it to be more linear:

 def validate_simplest_json_imperative_linear(json):  
   errors = []  
   should_exit=False  
   key = "measurement"  
   if not key_exists(json,key):  
     errors.append("{0} cannot be missing".format(key))
     should_exit=True
   
   if not should_exit:  
     if value_null(json, key):  
       errors.append("{0} cannot be null".format(key))  
       should_exit=True  
   
   if not should_exit:  
     if not is_string_or_unicode(json, key):  
       errors.append("{0} needs to be a string or unicode".format(key))
       should_exit=True  
   
   if not should_exit:  
     if is_empty_string(json, key):  
       errors.append("{0} cannot be an empty string".format(key))  
       should_exit=True  
   
   
   if not should_exit:  
     lenm=len(json[key].strip())  
     if lenm<3:
       errors.append("{0} needs at least 3 characters".format(key))
       should_exit=True
     elif lenm>10:
       errors.append("{0} needs at most 10 characters".format(key))  
       should_exit=True  
   
   if not should_exit:  
     if json[key].strip().lower() in ["archived","password"]:  
       errors.append("{0} has a value which is not allowed".format(key))  
       should_exit=True  
   
   return errors  
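
The helpers `key_exists`, `value_null`, `is_string_or_unicode` and `is_empty_string` are not shown above. A minimal sketch of what they might look like follows. It is written for Python 3, which is an assumption on my part: there `str` already covers unicode text and `in` replaces `has_key`.

```python
def key_exists(json, key):
    # True when the key is present, whatever its value.
    return key in json

def value_null(json, key):
    # True when the key maps to None (JSON null).
    return json[key] is None

def is_string_or_unicode(json, key):
    # On Python 3 every text string is a str; on Python 2 this check
    # would also have to accept unicode.
    return isinstance(json[key], str)

def is_empty_string(json, key):
    # Whitespace-only values count as empty, as in the nested version.
    return len(json[key].strip()) == 0
```

Each helper answers exactly one question about one key, which is what lets the linear version read as a list of checks.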

And yes, all the tests still pass. We are far from done, but a pattern is emerging: each check runs only if the previous one did not fail. Next I’ll move the variables json, key, errors and exit into a single object (a namedtuple), so that we don’t pass four parameters back and forth:

 from collections import namedtuple

 ValidationState = namedtuple("ValidationState","json key errors exit")
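
A namedtuple is immutable, so a validator cannot change the state it receives; `_replace` returns a new tuple and leaves the old one untouched. A small sketch of that behaviour (the sample values are made up for illustration):

```python
from collections import namedtuple

ValidationState = namedtuple("ValidationState", "json key errors exit")

before = ValidationState(json={}, key="measurement", errors=[], exit=False)

# _replace builds a new tuple; several fields can be changed in one call.
after = before._replace(errors=before.errors + ["measurement cannot be missing"],
                        exit=True)
```

Here `before` still has an empty error list and `exit=False`, while `after` carries the error. That is what makes every step easy to inspect on its own.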

Then I extract the actual validations into simple functions, so the main function becomes:

 def validate_simplest_json_imperative_linear_with_state(json):  
   initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False)  
   
   state = validate_key_exists(initial_state)  
   
   if not state.exit:  
     state = validate_not_null(state)  
   
   if not state.exit:  
     state = validate_string_or_unicode(state)  
   
   if not state.exit:  
     state = validate_not_empty_string(state)  
   
   if not state.exit:  
     state = validate_length(state, 3,10)  
   
   if not state.exit:  
     state = validate_not_in(state, ["archived","password"])  
   
   return state.errors  

And the functions:

 def validate_key_exists(state):  
   print validate_key_exists.__name__,state  
   if not key_exists(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} cannot be missing".format(state.key)])._replace(exit=True)  
   return state  
   
 def validate_not_null(state):  
   print validate_not_null.__name__,state  
   if value_null(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} cannot be null".format(state.key)])._replace(exit=True)  
   return state  
   
 def validate_string_or_unicode(state):  
   print validate_string_or_unicode.__name__,state  
   if not is_string_or_unicode(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} needs to be a string or unicode".format(state.key)])._replace(exit=True)
   return state  
   
 def validate_not_empty_string(state):  
   print validate_not_empty_string.__name__,state  
   if is_empty_string(state.json,state.key):  
     return state._replace(errors = state.errors+["{0} cannot be an empty string".format(state.key)])._replace(exit=True)  
   return state  
   
 def validate_length(state, min, max):
   print validate_length.__name__,state
   lenm=len(state.json[state.key].strip())
   # report the limits that were passed in instead of hard-coding 3 and 10
   if lenm<min:
     return state._replace(errors = state.errors+["{0} needs at least {1} characters".format(state.key, min)])._replace(exit=True)
   elif lenm>max:
     return state._replace(errors = state.errors+["{0} needs at most {1} characters".format(state.key, max)])._replace(exit=True)
   return state
   
 def validate_not_in(state,vals):  
   print validate_not_in.__name__,state  
   if state.json[state.key].strip().lower() in vals:  
     return state._replace(errors = state.errors+["{0} has a value which is not allowed".format(state.key)])._replace(exit=True)  
   return state  
   

The code is now a series of functions, each of which runs on the result of the previous one unless the exit flag has been set to True. So two functions f and g are composed like this:

initial_state = …
state = f(initial_state)
if not state.exit:
    return g(state)

And putting this in a function:

 def compose2(f, g):  
   def run(x):  
     result_f = f(x)  
     if not result_f.exit:  
       return g(result_f)  
     else:  
       return result_f  
   return run  
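
To see the short circuit at work, here is a small self-contained sketch. The two toy validators `fail_always` and `record_call` are hypothetical and exist only for this demonstration.

```python
from collections import namedtuple

ValidationState = namedtuple("ValidationState", "json key errors exit")

def compose2(f, g):
    def run(x):
        result_f = f(x)
        # Only run g when f did not ask to stop.
        if not result_f.exit:
            return g(result_f)
        return result_f
    return run

calls = []  # records which validators actually ran

def fail_always(state):
    calls.append("fail_always")
    return state._replace(errors=state.errors + ["failed"], exit=True)

def record_call(state):
    calls.append("record_call")
    return state

start = ValidationState(json={}, key="measurement", errors=[], exit=False)

stopped = compose2(fail_always, record_call)(start)  # record_call is skipped
passed = compose2(record_call, record_call)(start)   # both steps run
```

In the first composition the second function never runs, because the first one set `exit`; in the second composition both run and the state comes out unchanged.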

Using this we can now compose 2 functions into one:

 def validate_simplest_json_imperative_linear_with_state(json):  
   initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False)  
   
   # state = validate_key_exists(initial_state)  
   #  
   # if not state.exit:  
   #   state = validate_not_null(state)  
   
   composed_function = compose2(validate_key_exists, validate_not_null)  
   state = composed_function(initial_state)  


But we don’t have only 2 functions, we have more, so we write a reduce:

 from functools import reduce  # a builtin on Python 2, must be imported on Python 3

 #compose n functions
 def compose(*functions):
   return reduce(compose2, functions)
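
`reduce` folds `compose2` over the list from left to right, so `compose(f, g, h)` is the same as `compose2(compose2(f, g), h)`. The sketch below checks this with hypothetical toy steps that only log their names; `State` and `step` are stand-ins invented for the example.

```python
from collections import namedtuple
from functools import reduce

State = namedtuple("State", "log exit")  # simplified stand-in for ValidationState

def compose2(f, g):
    def run(x):
        result_f = f(x)
        return g(result_f) if not result_f.exit else result_f
    return run

def compose(*functions):
    return reduce(compose2, functions)

def step(name, stop=False):
    # Hypothetical factory: builds a step that appends its name to the log.
    def run(state):
        return state._replace(log=state.log + [name], exit=stop)
    return run

f, g, h = step("f"), step("g", stop=True), step("h")

folded = compose(f, g, h)(State(log=[], exit=False))
by_hand = compose2(compose2(f, g), h)(State(log=[], exit=False))
```

Both calls produce the same state, and because `g` sets `exit`, the step `h` never shows up in the log.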

and our function becomes:

 def validate_simplest_json_imperative_linear_with_state(json):  
   initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False)  
   
   composed_function = compose(validate_key_exists, validate_not_null,validate_string_or_unicode,validate_not_empty_string)  
   state = composed_function(initial_state)  
   
   if not state.exit:  
     state = validate_length(state, 3,10)  
   
   if not state.exit:  
     state = validate_not_in(state, ["archived","password"])  
   
   return state.errors  

But we just hit a problem: some functions take extra parameters, and we need a way to pass them. We’ll use closures:

 def create_validate_length(min, max):
   def validate_length(state):
     print validate_length.__name__,state
     lenm=len(state.json[state.key].strip())
     # min and max are captured from the enclosing call
     if lenm<min:
       return state._replace(errors = state.errors+["{0} needs at least {1} characters".format(state.key, min)])._replace(exit=True)
     elif lenm>max:
       return state._replace(errors = state.errors+["{0} needs at most {1} characters".format(state.key, max)])._replace(exit=True)
     return state
   return validate_length
   
 def create_validate_not_in(vals):  
   def validate_not_in(state):  
     print validate_not_in.__name__,state  
     if state.json[state.key].strip().lower() in vals:  
       return state._replace(errors = state.errors+["{0} has a value which is not allowed".format(state.key)])._replace(exit=True)  
     return state  
   return validate_not_in  

And now the final validation code:

 def validate_simplest_functional_composition(json):  
   initial_state = ValidationState(json=json, key="measurement",errors=[], exit=False)  
   
   composed_function = compose(validate_key_exists, validate_not_null,validate_string_or_unicode,validate_not_empty_string, create_validate_length(3, 10), create_validate_not_in(["archived","password"]))  
   final_state = composed_function(initial_state)  
   
   return final_state.errors  

It is much better. It basically says: given an initial state of the system, run all these functions (validators) and get a final state at the end.
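
As a rough illustration of how such a pipeline absorbs a new requirement, here is a self-contained Python 3 sketch for the hypothetical “0.12mm” rule mentioned earlier. The names `fail`, `validate_is_string`, `create_validate_matches` and `validate_measurement_unit`, the regular expression and the error text are all assumptions made for this example, not code from the original project.

```python
import re
from collections import namedtuple
from functools import reduce

ValidationState = namedtuple("ValidationState", "json key errors exit")

def compose2(f, g):
    def run(x):
        result_f = f(x)
        return g(result_f) if not result_f.exit else result_f
    return run

def compose(*functions):
    return reduce(compose2, functions)

def fail(state, message):
    # Hypothetical shared helper: add one error and stop the pipeline.
    return state._replace(errors=state.errors + [message.format(state.key)],
                          exit=True)

def validate_key_exists(state):
    if state.key not in state.json:
        return fail(state, "{0} cannot be missing")
    return state

def validate_is_string(state):
    if not isinstance(state.json[state.key], str):
        return fail(state, "{0} needs to be a string")
    return state

def create_validate_matches(pattern):
    # Hypothetical validator factory; the pattern is captured by the closure.
    compiled = re.compile(pattern)
    def validate_matches(state):
        if compiled.match(state.json[state.key].strip()) is None:
            return fail(state, "{0} is not a valid measurement")
        return state
    return validate_matches

def validate_measurement_unit(json):
    initial_state = ValidationState(json=json, key="measurement",
                                    errors=[], exit=False)
    pipeline = compose(
        validate_key_exists,
        validate_is_string,
        # The new requirement is one more entry in the list.
        create_validate_matches(r"^\d+(\.\d+)?(mm|mg)$"),
    )
    return pipeline(initial_state).errors
```

The existing validators are not touched; the new rule is one more function added to the composition.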


a preview:


Or in JavaScript: http://jsfiddle.net/danbunea1/gz87dt5a/





Conclusion: Why is this better?



Now you might wonder how a solution with ... lines of code can be better than one with just 21. In part 2 of the article, called "Why is functional composition better", I will illustrate why, and how functional composition makes our code simpler and easier to understand, debug and change.