Thursday, June 28, 2012

An intuitive view of Algorithms in terms of Category Theory


Inspired by this wonderful article: Why do monads matter?

A function gives exactly one value of the codomain for each element of the domain set. Thus it can be evaluated a single time for each input value: the result is kept and the computation never has to be repeated (this is called graph reduction).
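As a minimal sketch of that reuse (the names expensive and twice are my own, for illustration): a pure result bound with let is computed at most once, and every later use refers to the already reduced value.

expensive :: Int -> Int
expensive x = x * x             -- pure: always the same result for the same input

twice :: Int -> Int
twice x = let y = expensive x   -- evaluated at most once
          in  y + y             -- the second use reuses the already reduced result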

A category (forget about laws and properties for a moment) is built from a "generalization of functions" that admits many arrows from each element of the domain to many elements of the codomain. This wider concept is called a morphism (if I'm not wrong). For that matter, a statement like "getChar", which returns a different output each time, can be understood mathematically only by using category theory. This is the reason why category theory is so important for a mathematical understanding of any algorithm in any programming language. Because an algorithm is a chain of statements, mathematically it is a chain of morphisms. Generally an algorithm has many statements like getChar that are impure (because they do something with the external world), so the algorithm, when executed through these morphisms, travels a different graph on each execution. To discover where the chain of arrows goes each time, it is necessary to execute the chain of statements every time.
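To make the contrast concrete, here is a small sketch (the names pureNext and impureNext are mine): the first is an ordinary function with a single arrow per input, the second is an IO action whose result may differ on every execution.

pureNext :: Char -> Char
pureNext c = succ c        -- always the same output for the same input

impureNext :: IO Char
impureNext = getChar       -- may yield a different character each time it is run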

Imperative languages always work "categorically", so they don't need any special syntax for doing so. If an imperative language finds a pure function it will execute it again and again: it does not know what is pure and what is not. Haskell permits the programmer to discriminate purely functional code from "normal" "categorical/imperative" code, so the programmer and the compiler can make use of the mathematical properties of pure code, which has a unique path of execution. That permits equational reasoning and some optimizations, for example graph reduction, explained above (equational reasoning may be considered a mental form of graph reduction).
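As an illustrative sketch of equational reasoning (my example, not from the article I cite): because f and g below are pure, the two expressions can be rewritten into each other and must produce the same list, which both the programmer and the compiler can rely on.

f, g :: Int -> Int
f x = x + 1
g x = x * 2

lhs, rhs :: [Int]
lhs = map f (map g [1,2,3])   -- two traversals
rhs = map (f . g) [1,2,3]     -- one traversal; equal to lhs because f and g are pure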


Because a pure function cannot give a different result, Haskell can defer its execution until the result is actually needed: laziness is another optimization made possible by the separation of pure functional code from impure code. Something that is rarely said about laziness is that it permits optimal coarse-grained interfaces without losing performance. You can deserialize a database query result and only the fields that you actually use will be deserialized, at the moment you use them.
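A minimal sketch of that idea (the Row type and its field names are hypothetical, just for illustration): only the field that is demanded gets evaluated, so the expensive one is never touched.

data Row = Row { name :: String, payload :: String }

row :: Row
row = Row { name    = "alice"
          , payload = error "expensive deserialization, never forced" }

demo :: Int
demo = length (name row)   -- forces only the 'name' field; 'payload' stays unevaluated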

Because Haskell is lazy by default, it needs a special syntax for chaining impure morphisms like getChar, where the sequence of execution matters for the resulting output. This imperative execution model is not hardcoded in the language; it is defined in a Monad instance written by a library programmer. And this execution model is not unique: it is different for each category. The Monad instance of each category tells Haskell how to compose the morphisms of that category in order to traverse the path of execution.
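That special syntax is the do notation, which is only sugar over the bind operator; a small sketch (the echo name is mine):

echo :: IO ()
echo = do c <- getChar      -- the order of these two actions matters
          putChar c

-- the same chain written directly with bind
echo' :: IO ()
echo' = getChar >>= \c -> putChar c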

In the same way that '+' or '*' composes two numbers, the 'bind' method of the Monad instance for each category tells how to compose arrows until the path of execution is complete (that is the sense of the awkward sentence "a monad is a monoid in the category of endofunctors"). A monoid defines the way of composing the elements of a category. An endofunctor is a particular class of morphism, but I said from the beginning that we must leave details aside.
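For reference, this is a simplified form of the Monad class that each category instantiates: return injects a plain value, and >>= (bind) composes one morphism with the one that follows it.

class Monad m where
  return :: a -> m a                  -- wrap a plain value into the category
  (>>=)  :: m a -> (a -> m b) -> m b  -- bind: compose with the next morphism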


Imperative languages work in a single category and have a unique execution model. Haskell mimics other imperative languages by using the Monad instance of the IO category, but new categories and execution models can be defined by the programmer. There are other monadic execution models already defined. The List category, for example, works with all the graph paths of a morphism at the same time, not only with a single result (see below). There are categories defined to handle exceptional paths in an otherwise functional computation (with the Maybe, Exception or Error monads of their respective categories). There are even executions that can restart graphs until a transaction completes (STM), and so on.
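A small sketch of one of those exceptional-path models, using the Maybe monad (safeDiv and calc are my own illustrative names): as soon as one morphism yields Nothing, the rest of the chain is skipped.

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing                -- the exceptional path
safeDiv x y = Just (x `div` y)

calc :: Int -> Int -> Int -> Maybe Int
calc a b c = safeDiv a b >>= \r ->   -- if this is Nothing, the chain stops here
             safeDiv r c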


Concerning the List category, a (pure) function  

f :: a -> [a]

that returns a list can alternatively be understood as a morphism with many graphs from each element of the domain to many elements of the codomain. So this function can be considered a morphism in the List category, with its corresponding monadic execution model. The Monad instance of List permits handling all the graphs resulting from each morphism simultaneously. The computation is pure and imperative at the same time: the List monad operates with (ordered) sets instead of individual results.

Let's look at the execution model of the List category:

instance Monad [] where
  m >>= k  = concat (map k m)
  return x = [x]

In the bind (>>=) method, the morphism k is applied (via map) to all the results of the previous morphism m, producing a set for each value in m. Then this set of sets is flattened into a single set with concat.
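A tiny usage sketch of that instance (the pairs name is mine): every value produced by the first morphism is combined with every value produced by the second, so all graph paths are explored at once.

pairs :: [(Int, Char)]
pairs = [1,2] >>= \n ->
        ['a','b'] >>= \c ->
        return (n, c)
-- pairs == [(1,'a'),(1,'b'),(2,'a'),(2,'b')]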


Oddly enough, the List monad is the one usually called non-deterministic. Actually, the List monad computes all possible graphs simultaneously, while the IO monad traverses a different path of execution each time, so IO should be the non-deterministic monad.


DISCLAIMER: This text is provided "as is" and any express or implied warranties of mathematical accuracy are disclaimed. In no event shall the author be liable for any direct, indirect, incidental, parallel or sequential damages, including but not limited to loss of use, or business interruption. Blah blah blah.


