tag:blogger.com,1999:blog-78508141703091118482018-03-02T07:52:40.241-08:00The Joy of TypesPhilip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.comBlogger8125tag:blogger.com,1999:blog-7850814170309111848.post-41789683853894006512014-01-13T21:01:00.000-08:002014-01-13T21:02:42.863-08:00On Graphs and Matrices<p><i>Warning: Everything here is probably either wrong or obvious. It is about an epiphany I had today, and nothing more. It has been almost a year since I have blogged, and I want to get back into it. </i></p><p>A student in my office today was asking about graphs. It is easy to prove that the nth power of an adjacency matrix is counts the number of paths of length n. Proving this was enough to placate me as an undergrad, but today for some reason showing this result got me thinking. The "matrix" is not an arbitrary notion. Matrix multiplication isn't just some arcane ritual that happens to have useful properties. Rather, matrices are a computationally efficient implementation for a certain class of functions--namely, linear transformation. Matrix multiplication is just function composition. </p><p>So that lead me to thinking--is there a way to view adjacency matrices (and graphs) as linear transformation? The answer is that: of course there is. </p><p>One can think of an element in a set $S$ as being a function $S \to \mathbb{Z}_2$ that is $1$ at the point and $0$ everywhere else. In fact, one can generalize this to work in any ring, not just $\mathbb{Z}_2$. </p><p>This then produces a notion of "super position." A subset of a set $S$ can be seen as an arbitrary function $S \to \mathbb{Z}_2$, while a general notion of super position is achieved using other rings. In the integers we get multi-sets, while if we use something like real numbers we get a more general notion where we can include fractions of elements... </p><p>All of these notions define "K-modules" for what ever ring $K$ we use. If $K$ is a field we get vector spaces. For example: given a set $S$, the set of subsets of $S$ form a $\mathbb{Z}_2$-vector space. Bizarre, but true. </p><p>What does this fave to do with graphs? Well recall that a graph is usually defined as a tuple $(V,E)$ of the vertices and edges. Usually $E$ is thought of as a subset of $V \otimes V$. Equivalently, we could, using the construction above, express $E : (V \otimes V) \to \mathbb{Z}_2$. </p><p>And this motivates the adjacency matrix representation. For each pair of vertices, we assign a zero if there is not an edge between them, and a one if there is. </p><p>This though was the unsatisfying position I was in at the beginning. The question is can we do better. </p><p>Here is the trick though, we define the function $A : (V \to \mathbb{Z}_2) \to (V \to \mathbb{Z}_2)$ as $A(f)(y) = \Sigma_{x \in V}f(x)E(x,y)$. Why this odd function? Well, we know that for any particular vertex $v$, there exists a function $\bar{v}$ that is $1$ at $v$ and zero everywhere else. Observe that $A(\bar{v})(y) = E(v,y)$, and thus we can recover $E$ from $A$. In this respect, $A$ is just a very odd way of writing $E$. </p><p>More importantly, we can observe two things about $A$ <ol><li>$A$ is linear.</li><li>The matrix representation of $A$ is exactly the adjacency matrix.</li></ol></p><p>We can further generalize. If we replace $\mathbb{Z}_2$ with the integers, then a function $V \to \mathbb{Z}$ assigns an integer "weight" to each element in $V$. While the function $A$ produces a new assignment, copying the weight to each adjacent vertex, and then summing. If we let this weight be the number of ways to get from some starting positions $s$ to each position in $V$ using $n$ steps, then it should be obvious that the action of $A$ is to compute the number of ways to get from $s$ to each position in $V$ using $n+1$ steps. </p><p>Whats more, we have a natural way of encoding multigraphs--instead of the adjacency matrix consisting of elements from $\mathbb{Z}_2$, it consists of elements from $\mathbb{Z}$ encoding the number of edges between two vertices. From the linear algebra perspective, $A$ is precisely the linear operator that given a vector of paths from some $s$ to each position in $V$ using $n$ steps, returns a vector of paths from $s$ to each vertex in $V$ using $n+1$ steps. </p><p>In fact, it seems to me that switching to integers makes everything more logical, not less. The idea that $A^n$ is the number of paths of length $n$ should be obvious in this framework. </p><p>I don't know why I didn't learn this as an undergrad, but it seems to me now that graph theory is really just linear algebra in disguise. </p>Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com0tag:blogger.com,1999:blog-7850814170309111848.post-18940471111323165762013-01-25T00:16:00.002-08:002013-01-25T00:28:10.197-08:00Using Compiler Bugs for Fun and Profit: Introducing Cartesian Closed Constraints<figure><cite>Anyway, implicit parameters are not idiomatic Haskell these days, and unlikely to make it into a standard, and once you start to involve unsafeCoerce then you're clearly off the reservation. </cite><figcaption>--sclv</figcaption></figure><figure><cite>You can't have an implicit parameter in the context of a class or instance declaration. For example, both these declarations are illegal: class (?x::Int) => C a where ... instance (?x::a) => Foo [a] where ... Reason: exactly which implicit parameter you pick up depends on exactly where you invoke a function. But the ``invocation'' of instance declarations is done behind the scenes by the compiler, so it's hard to figure out exactly where it is done. Easiest thing is to outlaw the offending types. </cite><figcaption><a href=http://www.haskell.org/ghc/docs/7.4.2/html/users_guide/other-type-extensions.html>--The Glorious Glasgow Haskell Compilation System User's Guide, Version 7.4.2</a></figcaption></figure><p>Modern Haskell allows you to pass typeclass instances between functions by wrapping them with a GADT. But this is very limited. Wouldn't it be nice if there were a simple way to produce a typeclass instance from a record? This would only be useful if we could have multiple instances of the same typeclass/type combination, and everyone knows that is not permitted. Right? Almost a year ago I started this blog with a post on how to generate "First Class Instances" but the techniques were hackish and used unsafeCoerce in an unsafe way (okay, only as unsafe as GeneralizedNewtypeDeriving). </p><p>What would you say if I showed you a way to build arbitrary instances of typeclasses from dictionaries locally? A way that didn't use unsafeCoerce? A way that except for violating the "single instance" assumption, appears to be totally safe? </p>First thing is first. Haskell posts need to begin with the requisite language pragmas. <p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE ImplicitParams, Safe #-} ]]></script>yes, you got that right, today we will be using SafeHaskell. </p><p>For an example, we will define differing ordering relations on a type. The techniques presented in this post will not allow one to reliably "shadow" an instance declaration in scope, so we define a newtype to work with <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype MultiOrd a = MultiOrd {fromMultiOrd :: a} ]]></script></p><p>Since we plan to be working with ordering relations, lets define a record variant of the Ord type class. Actually, a single newtype will do. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype OrdFun a = OrdFun {runOrdFun :: a -> a -> Ordering} isEq :: Ordering -> Bool isEq EQ = True isEq _ = False ]]></script></p><p>Now here is the trick. The GHC docs say we can't use ImplicitParams in an instance declaration. Turns out the docs are wrong. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance (?ordFun :: OrdFun a) => Eq (MultiOrd a) where (MultiOrd a) == (MultiOrd b) = isEq $ runOrdFun ?ordFun a b instance (?ordFun :: OrdFun a ) => Ord (MultiOrd a) where compare (MultiOrd a) (MultiOrd b) = runOrdFun ?ordFun a b ]]></script></p><p>Now we can test it. Sometimes I want to compare strings by looking at length. Other times I want to alphabetize them. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ test1 :: Bool test1 = let ?ordFun = OrdFun $ \a b -> compare (length a) (length b) in (MultiOrd "hat") == (MultiOrd "cat") test2 :: Bool test2 = let ?ordFun = OrdFun compare in (MultiOrd "hat") == (MultiOrd "cat") main :: IO () main = print test1 >> print test2 ]]></script></p><p>I have tried this code in several versions of GHC. Just now I tried it on GHC 7.4.2 and 7.6.1. In each case I ran it both in GHCi and compiled at varios optimization levels. It always compilers with out error and produces the expected result, namely "<tt>True False</tt>." </p><h1>This isn't scary</h1><p>I noticed this behavior a long time ago, and other than showing it to some people at the Oregon Programming Language Summer School, I have kept pretty quite about it. The truth is I'm terrified that people will want to fix it. Don't. The compiler bug here is doing the right thing. </p><p>The common assumption of the "single global instance" simply does not hold anyways. At least, it doesn't hold in a way that can be used to ensure the kind of invariants that things like <tt>Data.Set</tt> and <tt>Data.Map</tt> assume. And, this should be obvious. Imagine a module (call it <tt>M1</tt>) that exports a type. Then two other modules (<tt>M2</tt> and <tt>M3</tt>) each import that type, and declare the same (orphan) instance for that type. A forth module (<tt>M4</tt>) imports both instances. Obviously, M4 can't use either instance directly. But, this restriction does not prevent violating module invariants. If <tt>M2</tt> exposes a <tt>Set</tt> or our type, and <tt>M3</tt> exposes a function to modify <tt>Sets</tt> of our type, composing these two functions will have unpredictable module boundary breaking behavior: even though <b>all instances obeyed the typeclass laws</b>. </p><p>Try it. It is not that hard to break <tt>Data.Set</tt> and similar modules. Of course, using things like GADTs we can push the invariant breaking much further. The simple fact is that modules that rely on the single instance property for safety are fundamentally broken. </p><h1>A Categorical Use Case</h1><p>I have made several bold claims in this post. Here is another one. <b>The ImplicitParams trick is useful and should be made official behavior.</b></p><p>Edward Kmett has written some pretty cool category theory packages, and has a very nice package for using GHC's support for ConstraintKinds. These packages don't quite work for what I want to show here, but they come close so I will use similar ideas. </p><p>After enabling some extensions (we are no longer in SafeHaskell land as I wrote this up quickly) and hiding a bunch of the Prelude, we can define some useful typeclasses for working with categories. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class Category k where id :: k a a (.) :: k b c -> k a b -> k a c class Category k => Bifunctor k p where (***) :: k a b -> k a' b' -> k (p a a') (p b b') class Bifunctor k p => Associative k p where associate :: k (p (p a b) c) (p a (p b c)) coassociate :: k (p a (p b c)) (p (p a b) c) class Associative k p => Monoidal k p i | k p -> i where idl :: k (p i a) a idr :: k (p a i) a coidl :: k a (p i a) coidr :: k a (p a i) class Braided k p where braid :: k (p a b) (p b a) class (Monoidal k prod i, Braided k prod) => Cartesian k prod i | k -> prod i where fst :: k (prod a b) a snd :: k (prod a b) b diag :: k a (prod a a) (&&&) :: k a b -> k a c -> k a (prod b c) f &&& g = (f *** g) . diag class Cartesian k p i => CCC k p i e | k -> p i e where apply :: k (p (e a b) a) b curry :: k (p a b) c -> k a (e b c) uncurry :: k a (e b c) -> k (p a b) c ]]></script>CCCs show up a lot, as they provide a categorical version of the simply typed lambda calculus. Thus, a good example of a CCC is the category Hask. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance Category (->) where id = P.id (.) = (P..) instance Bifunctor (->) (,) where f *** g = \(a,b) -> (f a,g b) instance Associative (->) (,) where associate ((a,b),c) = (a,(b,c)) coassociate (a,(b,c)) = ((a,b),c) instance Monoidal (->) (,) () where idl ((),a) = a idr (a,()) = a coidl a = ((),a) coidr a = (a,()) instance Braided (->) (,) where braid (a,b) = (b,a) instance Cartesian (->) (,) () where fst = P.fst snd = P.snd diag x = (x,x) instance CCC (->) (,) () (->) where apply (f,a) = f a curry = P.curry uncurry = P.uncurry ]]></script>We could even write a program that took Haskell like lambda calculus syntax (perhaps generated with a TemplateHaskell or a finally tagless interface) and produced a Haskell term polymorphic in the CCC. </p><p>Put aside the category stuff, and think about the algebra of Constraints in Haskell. With current versions of GHC we can develop a rich set of tools for talking about constraints <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data Dict c where Dict :: c => Dict c newtype a :- b = Sub (a => Dict b) buildSub :: (Dict a -> Dict b) -> a :- b buildSub f = Sub (f Dict) runSub :: (a :- b) -> Dict a -> Dict b runSub (Sub d) Dict = d class a :=> b where ins :: a :- b ]]></script>that last type class encodes <i>as a constraint</i> the judgment that one constraint implies another. That is, it internalizes the entailment relation. </p><p>We would actually like some instances of this relation, and that is where the trick in this post becomes useful. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance (?inserter :: (a :- b)) => a :=> b where ins = ?inserter ]]></script>some more stuff that is not very interesting <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ insWith :: Dict (a :=> b) -> a :- b insWith Dict = ins data Proxy x = Proxy dictToProxy :: Dict c -> Proxy c dictToProxy _ = Proxy ]]></script>and we can begin to define some other standard connectives. For example, Haskell allows you to combine Constraints using tuple notation, but this isn't really considered a constructor in GHC. So, we define our own "product" constraint <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class a :/\: b where fstDict :: Proxy (a :/\: b) -> Dict a sndDict :: Proxy (a :/\: b) -> Dict b instance (?hasFst :: Dict a, ?hasSnd :: Dict b) => a :/\: b where fstDict _ = ?hasFst sndDict _ = ?hasSnd fstSub :: (a :/\: b) :- a fstSub = buildSub (\d@Dict -> fstDict (dictToProxy d)) sndSub :: (a :/\: b) :- b sndSub = buildSub (\d@Dict -> sndDict (dictToProxy d)) splitPair :: Dict (a :/\: b) -> (Dict a, Dict b) splitPair d@Dict = (fstDict (dictToProxy d),sndDict (dictToProxy d)) buildPair :: (Dict a, Dict b) -> Dict (a :/\: b) buildPair (d1,d2) = let ?hasFst = d1 in let ?hasSnd = d2 in Dict ]]></script></p><p>What is the point of all this? Well, Edward Kmett noted that the judgment form of type class entailment (that is, the type version) can form a category <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance Category (:-) where id = Sub Dict a . b = Sub ((\(Sub Dict) -> (\(Sub Dict) -> Dict) a) b) ]]></script>using the pairs above though, we can make this into a Symmetric Monoidal Category <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance Bifunctor (:-) (:/\:) where f *** g = buildSub $ \d -> let (x,y) = splitPair d in buildPair (runSub f x,runSub g y) instance Associative (:-) (:/\:) where associate = buildSub (\abc -> let (ab,c) = splitPair abc (a,b) = splitPair ab in buildPair (a,buildPair (b,c))) coassociate = buildSub (\abc -> let (a,bc) = splitPair abc (b,c) = splitPair bc in buildPair (buildPair (a,b),c)) instance Monoidal (:-) (:/\:) (() :: Constraint) where idl = sndSub idr = fstSub coidl = buildSub (\d -> buildPair (Dict,d)) coidr = buildSub (\d -> buildPair (d,Dict)) instance Braided (:-) (:/\:) where braid = buildSub $ buildPair . braid . splitPair ]]></script>and for that matter a CCC <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance Cartesian (:-) (:/\:) (() :: Constraint) where fst = fstSub snd = sndSub diag = Sub $ buildPair (Dict,Dict) instance CCC (:-) (:/\:) (() :: Constraint) (:=>) where apply = buildSub $ \d -> let (f,x) = splitPair d in runSub (insWith f) x curry f = buildSub $ \a -> let ?inserter = buildSub $ \b -> runSub f (buildPair (a,b)) in Dict uncurry f = buildSub $ \ab -> let (a,b) = splitPair ab in runSub (insWith (runSub f a)) b ]]></script></p><p>Is this useful? Possibly. But it is super cool regardless. I strongly think that this should be made the official behavior and kept in the language. Having tons of categorical structure show up around one of your languages core features (in this case, typeclasses) is usually going to be a good thing. Let's let people see if we can make this useful. </p><p>The GHC docs give the justification that the behavior of what instance would be picked would be unclear. I don't think this is true. We already have both local evidence and implicit parameters which present exactly the same problems. It is time to document this behavior, and start using it for good. </p>Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com7tag:blogger.com,1999:blog-7850814170309111848.post-5843892480342404862012-08-14T18:42:00.000-07:002012-08-14T18:42:36.207-07:00GeneralizedNewtypeDeriving is Profoundly Unsafe<p>It is a well known fact in the Haskell community that GeneralizedNewtypeDeriving as currently implemented is unsafe. What I don't think people generally realize is just how unsafe it is. </p><p>Perhaps the most obvious problem comes when GeneralizedNewtypeDeriving is combined with TypeFamilies <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE TypeFamilies, GeneralizedNewtypeDeriving #-} ]]></script>using this all to common pair of extensions we can start doing stuff that looks innocuous <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ type family Result a b class Convert a where coerce :: a -> Result a b newtype SAFE a = SAFE a type instance Result (SAFE a) b = a instance Convert (SAFE a) where coerce (SAFE a) = a ]]></script>but then suddenly starts to get ever so slightly suspicious <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype IS_NO_LONGER a = IS_NO_LONGER a deriving Convert type instance Result (IS_NO_LONGER a) b = b ]]></script>The problem comes when we combine the little functions we have built <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ --infered type is --unsafeCoerce :: a -> b unsafeCoerce = coerce . IS_NO_LONGER . SAFE ]]></script>this is really full featured unsafeCoerce <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ --use it safely id' :: a -> a id' = unsafeCoerce --segfault (with high probability) crash :: segfault crash = unsafeCoerce . tail . tail . tail . unsafeCoerce $ True ]]></script>and can even break the most holy Haskell's type system enforced invariants <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ import Control.Monad.ST --time for side effects unsafePerformIO :: IO a -> a unsafePerformIO x = runST $ unsafeCoerce x ]]></script>try running this stuff in GHCi. <p>Okay, so the combination of GeneralizedNewtypeDeriving and open type functions is clearly unsound. More interesting is the fact that you don't need TypeFamilies to type unsafe coerce. Just a healthy GADT (note: I believe this is a new presentation). <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE GeneralizedNewtypeDeriving, GADTs #-} ]]></script>using GADTs we can pass around proofs of type equality <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data SameType a b where Refl :: SameType a a coerce :: SameType a b -> a -> b coerce Refl = id trans :: SameType a b -> SameType b c -> SameType a c trans Refl Refl = Refl sameUnit :: SameType () () sameUnit = Refl ]]></script>and define classes that talk about them <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class IsoUnit a where iso1 :: SameType () b -> SameType a b iso2 :: SameType b () -> SameType b a instance IsoUnit () where iso1 = id iso2 = id ]]></script>by this point you should know what comes next. We just define a type <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Tagged a b = Tagged b deriving IsoUnit ]]></script>and start committing atrocities <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ sameTagged :: SameType (Tagged a b) (Tagged a' b') -> SameType a a' sameTagged Refl = Refl unsafe' :: SameType (Tagged a ()) (Tagged a' ()) unsafe' = (iso1 sameUnit) `trans` (iso2 sameUnit) unsafe :: SameType a b unsafe = sameTagged unsafe' --once again inferred type is a -> b unsafeCoerce = coerce unsafe ]]></script>Shucks. </p><p>Okay, so now 2 of my favorite extensions (TypeFamilies and GADTs) have been shown to be incompatible with GeneralizedNewtypeDeriving. But, we still are not at the end of the story. If you look at the now 5 year old <a href="http://hackage.haskell.org/trac/ghc/ticket/1496">trac</a> for the "GeneralizedNewtypeDeriving is unsafe" bug, you see that it can also be used to break system invariants in code that is otherwise Haskell98. GeneralizedNewtypeDeriving will lay waste to module boundaries, so many of the assumptions made by common Haskell libraries are trivial to break. In fact, I had wanted to be able to demo yet a third way of typing unsafeCoerce using an <a href="dl.acm.org/citation.cfm?id=581494">alternative representation of type equality proofs</a><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE GeneralizedNewtypeDeriving, RankNTypes #-} newtype Id a = Id {runId :: a} newtype SameType a b = SameType {coerce' :: forall f. f a -> f b} coerce :: SameType a b -> a -> b coerce s = runId . (coerce' s) . Id ]]></script>for this blog post, but have not been able to get it to work in time. I believe that not only may this work, but it is highly possible that Haskell 98 + GeneralizedNewtypeDeriving admits unsafeCoerce, using type classes to simulate higher ranked types. Research for another day. </p><p>The moral is that GeneralizedNewtypeDeriving really needs to be fixed. In an ideal world it would re-type check the definitions to ensure no fishy business was going on. Sadly, I suspect this would break large amounts of existing code. Until then, GeneralizedNewtypeDeriving should be treated as on par with unsafeCoerce. It might be correct, but it just as well might cause your machine to catch fire. </p><p>PS: I feel a bit guilty posting this on the same day as (the always brilliant) Bob Harper posted one of his anti-Haskell polemics on the unsafe nature of Typeable. At the Oregon Programming Language Summer School this year, several other Haskellers asked me to publish some of the more demonic Haskell code I have written. I have been traveling since then, and will be again in the coming week, but I felt compelled to show off at least some of the stuff I have done. </p> Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com0tag:blogger.com,1999:blog-7850814170309111848.post-72457694812418681632012-04-23T23:27:00.001-07:002012-04-23T23:27:04.033-07:00Memory Regions à la carte<p>In the previous post we saw how to implement checked exceptions using free monads. The crucial idea was that, so long as the underlying monad was written using the "à la carte" style, we could add exceptions in such a way that <ol><li>Programms would only compile if all exceptions were caught</li><li>We could program exactly as we would in the underlying (free) monad</li><li>Type inference was sufficient such that we did not require any type signatures</li><li>Client code only required language extensions to write functions with uncaught exceptions</li></ol></p><p>Related to checked exceptions is another kind of fine grained effect control: memory regions. The basic idea is to allow blocks of mutable memory to be accessed in such a way that the type system ensures external purity. This is already achievable in Haskell using the ST monad. But, ST has some clunky-ness. In particular ST does not compose well with other monads. </p><p>An alternative to the ST monad is the <a href="http://hackage.haskell.org/packages/archive/STMonadTrans/0.2/doc/html/Control-Monad-ST-Trans.html">ST monad transformer</a>. There are problems with the ST transformer though. The one we fix is that it is not well suited for code that involves multiple memory regions. The second problem is that the ST transformer does not play nice with all monads. I strongly suspect this second problem still exists with the code presented here. In fact, our implementation uses STMonadTrans internally, and I am not convinced it can be implemented in terms of ST alone. More about this at the end. </p><p>As much as possible we would like to reuse the machinery built for checked exceptions. Lets get started.</br>Nothing that interesting in the extensions <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE TypeOperators, FlexibleContexts, GADTs, Rank2Types #-} ]]></script>and the API we expose is again very simple <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ module Regions( RG, RGKey, RGRef, region, newRGRef, readRGRef, writeRGRef, ) where ]]></script>With the imports out of the way <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ import Control.Monad.ST.Trans import Control.Monad.Trans import AlaCarte ]]></script>the first thing we need to do is to define a datatype to express the operations we want our monad to be able to perform. There are three essential operations involving references, so we implement all three of them as a single functor. The one complexity is that reference are quantified over a type, so our data constructors need to be existentially quantified. We this use a GADT. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data RG s x where WriteRef :: (STRef s a) -> a -> x -> RG s x ReadRef :: (STRef s a) -> (a -> x) -> RG s x NewRef :: a -> ((STRef s a) -> x) -> RG s x instance Functor (RG s) where fmap f (WriteRef r a x) = WriteRef r a (f x) fmap f (ReadRef s g) = ReadRef s (f . g) fmap f (NewRef a g) = NewRef a (f . g) ]]></script>The meat of our implementation is a function that converts a region into a ST monad transformer. We can easily implement this using the removeStep function from the last post <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ execRegion :: Functor f => Free (RG s :+: f) a -> STT s (Free f) a execRegion reg = do x <- lift $ removeStep reg case x of (Right r) -> return r (Left (WriteRef r a g)) -> (writeSTRef r a) >> execRegion g (Left (ReadRef s g)) -> (readSTRef s) >>= execRegion . g (Left (NewRef a g)) -> (newSTRef a) >>= execRegion . g ]]></script>Now, we define the public API. The first thing is a type for references in a region <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype RGRef s a = RGRef (STRef s a) ]]></script>The big problem implementing memory regions by stacking ST transformers on top of each other, is that we have no way of knowing which of the various STs is intended when we use newSTRef. We solve this by adding "keys" associated with each region <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data RGKey r = RGKey ]]></script>now, when a user wants to use a new region, they will define a function taking a key and returning an appropriate action <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ region :: Functor f => (forall r. RGKey r -> Free (RG r :+: f) a) -> Free f a region f = runST $ (return RGKey) >>= execRegion . f ]]></script>the remainder of the API is trivial <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newRGRef :: (RG s :<: f) => RGKey s -> a -> Free f (RGRef s a) newRGRef RGKey a = mkFree . inj $ NewRef a RGRef readRGRef :: (RG s :<: f) => RGRef s a -> Free f a readRGRef (RGRef r) = mkFree . inj $ ReadRef r id writeRGRef :: (RG s :<: f) => RGRef s a -> a -> Free f () writeRGRef (RGRef r) a = mkFree . inj $ WriteRef r a () ]]></script></p><p>What can we do with this? Well, lets test it out <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ testRegion1 key = do r <- newRGRef key "in region" readRGRef r ]]></script>running it produces the expected results. <pre><br />*Main> run $ region $ testRegion1<br />"in region"<br /></pre>we can do some more interesting things <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ testRegion2 key1 = do r1 <- newRGRef key1 "in region 1" region $ \key2 -> do r2 <- newRGRef key2 "set in region 2" s <- readRGRef r2 writeRGRef r1 s readRGRef r ]]></script>which produces the expected <pre><br />*Main> run $ region $ testRegion2<br />"set in region 2"<br /></pre>we can even combine regions and exceptions <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ testRegion3 key1 = do r1 <- newRGRef key1 "in region 1" (region $ \key2 -> do r2 <- newRGRef key2 "set in region 2" s <- readRGRef r2 writeRGRef r1 s raise Ex1 writeRGRef r1 "should not get here" ) `recover` ( \Ex1 -> readRGRef r1 >>= \s -> writeRGRef r1 $ s ++ " before exception was thrown") readRGRef r1 ]]></script><pre><br />*Main> run $ region $ testRegion3<br />"set in region 2 before exception was thrown"<br /></pre>Fine grained effect control. And it didn't cost too much either. Of course, the STMonadTrans library provides support for arrays, but those can easily be added by amending the definition of RG to have more constructors. </p> <p>Now, we haven't solved the second problem with STMonadTrans, namely, that we can have effects happen unexpected number of times if we are careful about it. This is obvious, since we can define a simple monad transformer as a free monad: <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype M m x = M {runMFunctor :: m x} instance Monad m => Functor (M m) where fmap f (M mx) = M $ liftM f mx liftToMFree :: (Monad m, M m :<: f) => m a -> Free f a liftToMFree = mkFree . inj . M ]]></script>and an algebra for running it <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class MAlg f m where runMAlg :: f a -> m a instance MAlg (M m) m where runMAlg = runMFunctor instance MAlg VoidF m where runMAlg = ex_falso_voidF instance (MAlg f m, MAlg g m) => MAlg (f :+: g) m where runMAlg (InjR f) = runMAlg f runMAlg (InjL g) = runMAlg g ]]></script>supporting functions such as <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ runM :: Monad m => Free ((M m) :+: VoidF) a -> m a runM = foldFree return (join . runMAlg) ]]></script>now, consider the following function which has no control flow operators <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ unexpected :: RGKey s -> Free (RG s :+: M [] :+: VoidF) Integer unexpected key = do r <- newRGRef key (0 :: Integer) x <- liftToMFree [1..30 :: Integer] v <- readRGRef r writeRGRef r (x + v) readRGRef r ]]></script>the problem is that if we run it <pre><br />*Main> runM $ region $ unexpected<br />[1,3,6,10,15,21,28,36,45,55,66,78,91,105,120,136,153,171,190,210,231,253,276,300,325,351,378,406,435,465]<br /></pre>we see that combining mutation with a monad that represent multiple answers produces interesting results. The documentation for STMonadTrans warns that it might break referential transparency--I'm not sure that I understand why this is (how is this different than a StateT over a list?) but, this warning is enough for me to not be willing to advocate the use of these free regions just yet. </p>Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com0tag:blogger.com,1999:blog-7850814170309111848.post-41716320884945989112012-04-22T22:32:00.000-07:002012-05-18T11:40:37.107-07:00Checked Exception for Free<p>Monad transformers are a very effective abstraction mechanism. But, they can make code difficult to reason about, and they sometimes suffer from a form of what C++ developers call the "n times m" problem. That is, if you separate implementation in the form of transformer stacks, from interfaces in the form of typeclasses, you potentially need an instance for each typeclass/transformer combination. The GHC developers have generally avoided transformers<sup><a href="#fn1" id="ref1">1</a></sup>. While, it has been suggested that best practice (at least most of the time) is to define custom monads, either by defining your own machinery, or by wrapping a transformer stack in a newtype. </p><p>One of my fellow students suggested that "the right way" to do it was to list the operations you wanted as functor, and then to simply use the free monad over this set of operations. This approach is quite elegantly described in Swierstra's <a href="http://www.cs.ru.nl/~W.Swierstra/Publications/DataTypesALaCarte.pdf"> Data types à la carte</a></p><p>Given this discusion, I spent a fair chunk of my weekend thinking about just how far one could go with this approach. It is known that most of the common monad transformers (really everything except Cont) can be encode as free monads. We need something a bit more interesting. </p><p>I am bit obsessed with fine grain control of side effects. Perhaps, the most widely used system for fine grained effect control is Java's checked exceptions. Checked exceptions in Java get a bad rap, and with reason. In a language without type inference they are clunky to work with. And, the hierarchical model makes it too appealing to <script type="syntaxhighlighter" class="brush: java"><![CDATA[ catch (Exception e) {} ]]></script>But, Haskell does not have these problems. What's more, Haskell is flexible enough that checked exceptions can be implemented as a library. You can download a checked exception library on <a href="http://hackage.haskell.org/package/control-monad-exception-0.10.0">Hackage</a> today. In this post we will implement a simple checked exceptions library using an interesting variant of the "À la Carte" approach. </p><H2>Machinery</H2><p>First thing is to define some of the machinery we will use for building the exceptions library. We need some extensions <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE TypeOperators, MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-} ]]></script>we also need one of IncoherentInstances or OverlappingInstances. It does not matter which we use for this post. But, for somethings we will want to do with this machinery we need IncoherentInstances <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE IncoherentInstances #-} ]]></script></p><p>Right now we are only interested in the supporting machinery <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ module AlaCarte ( Free, --Free monads mkFree, --usefull utility :: f a -> Free f a (:+:)(InjR,InjL), --functor coproduct (:<:)(inj,prj), --sub-sumption relation VoidF, --the void functor ex_falso_voidF, -- :: VoidF a -> b run, -- :: Free VoidF a -> a removeStep) --?? where ]]></script>We need free monads. There is a library for that. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ import Control.Monad.Free --Control.Monad.Free does not provide a function I find very useful mkFree :: Functor f => f a -> Free f a mkFree fa = wrap (fmap return fa) ]]></script></p><p>Data types a la carte heavily uses functor coproducts, as will we. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ infixr 9 :+: data (f :+: g) a = InjL (f a) | InjR (g a) instance (Functor f, Functor g) => Functor (f :+: g) where fmap f (InjL x) = InjL $ fmap f x fmap f (InjR x) = InjR $ fmap f x ]]></script></p><p>We also need a way to express that a given functor is subsumed by a coproduct. This is the one spot where we have overlapping instances. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ infixr 1 :<: class (Functor sub, Functor sup) => sub :<: sup where inj :: sub a -> sup a prj :: sup a -> Maybe (sub a) instance (Functor f, Functor g) => f :<: (f :+: g) where inj = InjL prj (InjL x) = Just x prj _ = Nothing instance (Functor g, Functor h, f :<: h) => f :<: (g :+: h) where inj = InjR . inj prj (InjR y) = prj y prj _ = Nothing ]]></script></p><p>Swierstra's model for how we use these is to define algebras over functors, and build evaluators to "run" free monads by folding over those algebras. We will make special use of one such algebra. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype VoidF a = VoidF (VoidF a) ex_falso_voidF :: VoidF a -> b ex_falso_voidF (VoidF x) = ex_falso_voidF x instance Functor VoidF where fmap f v = ex_falso_voidF v ]]></script>The void functor is uninhabited. It is, though, an honest functor. In particular, it is the initial object in the category of endofunctors, and the initial object under functor coproducts, making it surprisingly useful. Often we would need to define a typeclass to encode an interesting algebra, but here we can express it trivially <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ run :: Free VoidF a -> a run = foldFree id ex_falso_voidF ]]></script>foldFree is just like fold over lists, but over free monads. </p><p>So far we haven't deviated much from the norm. Now though, we build some unusual machinery. One of the reasons to prefer free monads over monad transformers is to avoid the linear nature of a stack of monad transformers. Sometimes though, a stack is just what we need. We can recover the stack by assuming that the functor inside a free monad is not an arbitrary tree of coproducts, but rather is one long list (at least it has a list like head) <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ without :: (x :+: xs) a -> Either (x a) (xs a) without (InjL x) = Left x without (InjR y) = Right y ]]></script>The reason for this is that in addition to the algebraic style, sometimes we want to run one functor at a time <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ removeStep :: Functor f => Free (x :+: f) a -> Free f (Either (x (Free (x :+: f) a)) a) removeStep (Pure a) = Pure $ Right a removeStep (Impure f) = case without f of Left x -> Pure $ Left x Right x -> Impure $ fmap removeStep x ]]></script></p><p>That is all the machinery for "A la Carte" programming. Now, we just have to use it. </p><h2>Checked exceptions</h2><p>We don't need much this time as far as extensions <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE TypeOperators, FlexibleContexts #-} ]]></script>the API is small and self documenting <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ module Checked( Throws, raise, recover ) where import AlaCarte ]]></script>Now it is easy. We need a functor to represent exceptions. Since, Maybe is the free monad over the functor that has only one constructor taking no parameters, Either is the free monad over the functor that has only one constructor taking a single arbitrary object as a parameter. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Throws err x = Bad err instance Functor (Throws err) where fmap _ (Bad err) = Bad err ]]></script>raising exception is now trivial <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ raise :: (Throws err :<: f) => err -> Free f a raise = mkFree . inj . Bad ]]></script>catching exceptions is a bit more involved, but still rather simple. The idea is to use the removeStep function to "remove" the appropriate Throws functor from the code we are catching <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ recover :: Functor f => Free (Throws t :+: f) b -> (t -> Free f b) -> Free f b recover act f = do x <- removeStep act case x of Right fx -> return fx Left (Bad err) -> f err ]]></script>we have to assume that this is the front of the functor co-list. </p><p>So now you are probably wondering how this works in practice. Lets get a new file with <b>no extensions</b><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data Ex1 = Ex1 deriving Show data Ex2 = Ex2 deriving Show test1 = return "test1" test2 = (raise Ex1) `recover` (\e@Ex1 -> return $ "test2, expect e1 " ++ show e) test3 = (raise Ex2) `recover` (\e@Ex1 -> return $ "test3, that is odd " ++ show e) `recover` (\e@Ex2 -> return $ "test3, expect e2 " ++ show e) test4 = (raise Ex1) `recover` (\e@Ex1 -> return $ "test4, expect e1 " ++ show e) `recover` (\e@Ex2 -> return $ "test4, that is odd " ++ show e) ]]></script>all of these work without a hitch. Type inference correctly figures them all out (we can't write the type signatures without turning some extensions on, however). The more interesting test is <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ test5 = (raise Ex1) `recover` (\e@Ex1 -> return $ "test5, first handler " ++ show e) `recover` (\e@Ex1 -> return $ "test5, second handler " ++ show e) ]]></script>Which handler gets called? Turns out it is always the first (as we would hope). Interestingly enough, it compiles and uses the first even if we use OverlappingInstances instead of IncoherentInstances. <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ testExceptions = run $ sequence [test1,test2,test3,test4,test5] ]]></script><pre><br />Prelude Main> testExceptions<br />["test1","test2, expect e1 Ex1","test3, expect e2 Ex2","test4, expect e1 Ex1","test5, first handler Ex1"]<br /></pre></p><p>Next up: memory regions. </p><sup id="fn1">1. Norman Ramsey <a href="http://stackoverflow.com/questions/2759968/has-anyone-ever-encountered-a-monad-transformer-in-the-wild">Stack Overflow</a> <a href="#ref1" title="Jump back to footnote 1 in the text.">↩</a></sup><br>Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com0tag:blogger.com,1999:blog-7850814170309111848.post-36049650347723827222012-04-22T17:19:00.000-07:002012-04-22T18:49:23.863-07:00Free Monad Primer<p><i>I plan a set of posts over the next few days on cool things you can do with free monads. This post is an attempt to get all the abstract nonsense out of the way first.</i></p> <a href="http://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html">A Brief, Incomplete, and Mostly Wrong History of Programming Languages</a> provides some background <blockquote>1990 - A committee formed by Simon Peyton-Jones, Paul Hudak, Philip Wadler, Ashton Kutcher, and People for the Ethical Treatment of Animals creates Haskell, a pure, non-strict, functional language. Haskell gets some resistance due to the complexity of using monads to control side effects. Wadler tries to appease critics by explaining that "a monad is a monoid in the category of endofunctors, what's the problem?" </blockquote><p>A monoidal category is a category C equipped with an additional operation called the tensor product $\otimes$ which is a bifunctor from $C \times C \to C$, an identity object $I$, such that forall objects $x,y,z \in C$, $x \otimes I \cong I \otimes x \cong x$, and $(x \otimes y) \otimes z \cong x \otimes (y \otimes z)$. </p><p>Hask (the category of Haskell types...ignoring non-totality) is a monoidal category with product types (two-tuples) serving as the bifunctor, and the terminal object () serving as the identity object. Conveniently, two-tuples are also products in Hask. </p><p>In such a category, a monoid object $m$ is any object equipped with two morphism $$\eta : I \to m$$ $$\mu : m \otimes m \to m$$ such that some diagrams commute. We can express this in Haskell as <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class MonoidalObject m where eta :: () -> m mu :: (m,m) -> m ]]></script>and the diagrams correspond to three laws <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ mu (eta (),a) = a mu (a,eta ()) = a mu (a,mu (b,c)) = mu (mu (a,b), c) ]]></script>this class is of course, equivalent to the usual <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class Monoid m where mempty :: m mappend :: m -> m -> m ]]></script></p><p>We can construct a category where the objects are all the endofunctors over Hask, and the morphisms are all the natural transformations over these functors. </p>Aside: due to parametricity any function (in a large subset of Haskell) with a type like <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ forall a. F a -> G a -- where both F and G are functors ]]></script>is a natural transformation. </p><p>This category of endofunctors is a monoidal category with functor composition as the bifunctor <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Tensor f g a = Tensor (f (g a)) instance (Funtor f, Functor g) => Functor (Tensor f g) where fmap f (Tensor fga) = Tensor $ fmap (fmap f) fga ]]></script>and the identity functor <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Id a = Id a instance Functor Id where fmap f (Id a) = Id $ f a ]]></script>so, then we should be able to define monoids in this category <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class Functor f => MonoidFunctor f where etaF :: Id a -> f a muF :: Tensor f f a -> f a ]]></script>with the same laws as above. A moment of reflection and we see this is equivalent to <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class Functor f => Monad f where return :: a -> f a join :: f (f a) -> f a ]]></script>which is what the monad class in Haskell should be. </p><p>Okay, so in Hask we also have coproducts of the form <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data Either a b = Left a | Right b ]]></script>we also get coproducts in the cateogry of endofunctors <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data (f :+: g) a = InjL (f a) | InjR (g a) instance (Functor f, Functor g) => Functor (f :+: g) where fmap f (InjL fa) = InjL $ fmap f fa fmap f (InjR ga) = InjR $ fmap f ga ]]></script>Why is this useful? Well every Haskeller knows that given some type we can construct a monoid over that type <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data [a] = [] | (a:[a]) ]]></script>which we can express perhaps more explitly as <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype List a = List (Either () (a,List a)) ]]></script>that is, as the coproduct of the tensor identity and the tensor product of a and List a. This is the "free monoid". </p><p>Lets translate this structure to funtors <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Free f a = Free ((Id :+: Tensor f (Free f)) a) ]]></script>which is itself equivalent to <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data Free f a = Pure a | Impure (f (Free f a)) ]]></script>lets define the monad instance <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance Functor f => Functor (Free f) where fmap f (Pure a) = Pure $ f a fmap f (Impure (ffa) = Impure $ fmap (fmap f) ffa ]]></script>that was easy, now <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ instance Functor f => Monad (Free f) where return = Pure m (>>=) f = joinFree $ fmap f m ]]></script>okay, so the hard part is join. How do we define that? </p><p>Recall that <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ cancat [] = [] concat (x:xs) = x:concat xs ]]></script>so <script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ joinFree (Pure a) = a joinFree (Impure f) = Impure $ fmap joinFree f ]]></script>and now we have free monads! </p>Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com1tag:blogger.com,1999:blog-7850814170309111848.post-25091802451077684202012-02-29T22:04:00.000-08:002012-02-29T22:48:39.987-08:00Haskell Supports First-Class Instances<p>Haskell’s type classes are one of its most powerful and distinctive features. They are also, at times, rather annoying. One of the most annoying facts about Haskell’s type classes is that they are decidedly not first class. GHC 7.4 brings first class constraints to Haskell, but most of the difficulties come from the lack of first class instance. Not having first class instances has a range of negative consequences. </p><p> For example, type class instances are automatically carried over module boundaries. If you don’t think this is a problem you have never had to deal with <strong>Control.Monad.Either</strong> and <strong>Control.Monad.Error</strong> in the same program. </p><p>The ability to have only a single instance for each class associated with a type makes little theoretical sense. Integers form a monoid under addition. Integers also form a monoid under multiplication. In fact, forming a monoid under two different operations is the crux of the definition of a semi-ring. So, why does it make any sense at all that in Haskell there can be at most one definition of “Monoid” for any type? </p><p>Scala’s implicit objects do not suffer from these limitations<sup><a href="#fn1" id="ref1">1</a></sup>. Neither do Coq’s type classes<sup><a href="#fn2" id="ref2">2</a></sup>. There have been various proposals for “first class instances”<sup><a href="#fn3" id="ref3">3</a></sup>, but none have made it into Haskell proper. </p><p>Or have they? The remainder of this blog post will show how to get first class instance in GHC 7.4.1. To the best of my knowledge this has not been published by anyone else yet (although Edward Kmett's blog has gotten pretty close<sup><a href="#fn4" id="ref4">4</a></sup>). </p><p>First of all we are going to need some language extensions, and some imports for later on: </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ {-# LANGUAGE ConstraintKinds #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FlexibleContexts #-} import Unsafe.Coerce import Control.Applicative ]]></script><p>Other than the truly evil <strong>Unsafe.Coerce</strong>, there is nothing here that threatening. No Undecidable/Incoherent/CommunistInstances. No kind polymorphism, monomorphism enabling, or data type promoting hogwash. We didn’t even import <strong>GHC.Prim</strong>, which is pretty amazing considering that without it we can’t use the <strong>Constraint</strong> keyword. </p><p>Following Kmett we define a type for explicit dictionaries </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data Dict c where Dict :: c => Dict c ]]></script><p>Now is the magic part. We define a function that lets us use an explicit dictionary to do implicit things </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ withDict :: Dict c -> (c => b) -> b withDict Dict b = b ]]></script><p>This function works, but by itself isn’t very useful. So far to produce a dictionary, you have to already have a type class instance in scope. We need a way to keep instances from taking over and flooding across module boundaries. The standard Haskell solution to this is to define a <strong>newtype</strong>. So, lets define one <strong>newtype</strong> to rule them all </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Proxy label p = Proxy p ]]></script><p>Okay, this is very good for proxying types of kind *, but we really should also define </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ newtype Proxy1 label p x = Proxy1 (p x) ]]></script><p>What good is this proxy definition? Well, in GHC newtypes disappear completely at runtime. That means (ignoring type families) that the representation of an instance of some class for <strong>Proxy Label Type</strong> is the same as it is for <strong>Type</strong>, so we can define a truly mischievous little function </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ unProxyDict :: Dict (c (Proxy label p)) -> Dict (c p) unProxyDict = unsafeCoerce ]]></script><p>Lets add some stuff to make working with these things friendlier </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ proxyDictFromLabel :: c (Proxy label p) => label -> Dict (c (Proxy label p)) proxyDictFromLabel _ = Dict dictFromLabel :: c (Proxy label p) => label -> Dict (c p) dictFromLabel = unProxyDict . proxyDictFromLabel ]]></script><p>And that’s it. We have a fully functional first class instances system. Here is how you use it. Lets define a new version of the <strong>Show</strong> type class </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ class Show' t where show' :: t -> String ]]></script><p>Now, normally there is only one way to <strong>Show</strong> a thing. But, our <strong>Show'</strong> wont be so lame </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data StupidShow = StupidShow data NormalShow = NormalShow instance Show' (Proxy StupidShow Int) where show' _ = "from StupdiShow" instance Show' (Proxy NormalShow Int) where show' (Proxy p) = show p ]]></script><p>Some stuff to get the instance: </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ stupidShowIntDict :: Dict (Show' Int) stupidShowIntDict = dictFromLabel StupidShow normalShowIntDict :: Dict (Show' Int) normalShowIntDict = dictFromLabel NormalShow ]]></script><p>Now we need a function to test. Luckily, we turned on FlexibleContexts at the beginning </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ showInt :: Show' Int => Int -> String showInt = show' ]]></script><p>In GHCi we can confirm that it works. </p><pre><br />*Main> showInt 3<br /><br /><interactive>:118:1:<br /> No instance for (Show' Int)<br /> arising from a use of `showInt'<br /> Possible fix: add an instance declaration for (Show' Int)<br /> In the expression: showInt 3<br /> In an equation for `it': it = showInt 3<br />*Main> withDict stupidShowIntDict (showInt 3)<br />"from StupdiShow"<br />*Main> withDict normalShowIntDict (showInt 3)<br />"3"<br />*Main><br /></pre><p>“Big f-ing deal” you say. “That’s what we had newtypes for. This is pretty useless.” Well, I turn to the most popular example of bad legacy design in the Haskell standard library. <strong>Monad</strong> is not a subtype of <strong>Functor</strong>, and even worse, many types have nice <strong>Monad</strong> instances, but no instances of <strong>Applicative</strong>. We can fix that. </p><p>First of all, we need to get to recreate our dictionary making machinery for <strong>Proxy1</strong>. Disturbingly, the code (up to the numerals) is identical to what we had before </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ unProxyDict1 :: Dict (c (Proxy1 label p)) -> Dict (c p) unProxyDict1 = unsafeCoerce proxyDictFromLabel1 :: c (Proxy1 label p) => label -> Dict (c (Proxy1 label p)) proxyDictFromLabel1 _ = Dict dictFromLabel1 :: c (Proxy1 label p) => label -> Dict (c p) dictFromLabel1 = unProxyDict1 . proxyDictFromLabel1 ]]></script><p>Now we can define some functions to produce <strong>Functor</strong> and <strong>Applicative</strong> instances for all monads </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data AppLabel = AppLabel instance Monad m => Functor (Proxy1 AppLabel m) where fmap f (Proxy1 x) = Proxy1 $ x >>= (\y -> return (f y)) instance Monad m => Applicative (Proxy1 AppLabel m) where pure x = Proxy1 $ return x (Proxy1 x) <*> (Proxy1 y) = Proxy1 $ do x' <- x y' <- y return (x' y') functorFromMonad :: Monad m => Dict (Functor m) functorFromMonad = dictFromLabel1 AppLabel appFromMonad :: Monad m => Dict (Applicative m) appFromMonad = dictFromLabel1 AppLabel ]]></script><p>So, now when you run into someone who for unknown reasons defined their own list type and only made it a <strong>Monad</strong></p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ data List a = Nil | Cons a (List a) deriving Show listApp Nil x = x listApp (Cons x xs) ys = Cons x (listApp xs ys) instance Monad List where return x = Cons x Nil Nil >>= f = Nil (Cons a ls) >>= f = listApp (f a) (ls >>= f) ]]></script><p>You don’t need to be stuck. Just define a function so you don’t need to stick type signatures all over the place </p><script type="syntaxhighlighter" class="brush: haskell"><![CDATA[ listFunctor :: Dict (Applicative List) listFunctor = appFromMonad ]]></script><p>And you got your Functor! </p><pre><br />*Main> fmap (+1) $ Cons 1 Nil<br /><br /><interactive>:121:1:<br /> No instance for (Functor List)<br /> arising from a use of `fmap'<br /> Possible fix: add an instance declaration for (Functor List)<br /> In the expression: fmap (+ 1)<br /> In the expression: fmap (+ 1) $ Cons 1 Nil<br /> In an equation for `it': it = fmap (+ 1) $ Cons 1 Nil<br />*Main> withDict listFunctor $ fmap (+1) $ Cons 1 Nil<br />Cons 2 Nil<br /><br /></pre><p>I’m not quite ready to make this my first package on Hackage. The API still needs some kinks worked out. And, I need to do a lot more testing to see if this creates any opportunities for segmentation faults. Still, it is nice to know that passable instances are at least possible in Haskell. </p><sup id="fn1">1. Oliveira and Ordesky <a href="http://ropas.snu.ac.kr/~bruno/papers/TypeClasses.pdf">http://ropas.snu.ac.kr/~bruno/papers/TypeClasses.pdf</a> <a href="#ref1" title="Jump back to footnote 1 in the text.">↩</a></sup><br><sup id="fn2">2. Sozeau and Oury <a href="http://mattam.org/research/publications/First-Class_Type_Classes.pdf">http://mattam.org/research/publications/First-Class_Type_Classes.pdf</a> <a href="#ref2" title="Jump back to footnote 2 in the text.">↩</a></sup><br><sup id="fn3">3. Oliveira and Sulzmann <a href="http://www.cs.mu.oz.au/~sulzmann/manuscript/objects-unify-type-classes-gadts.ps">http://www.cs.mu.oz.au/~sulzmann/manuscript/objects-unify-type-classes-gadts.ps</a> <a href="#ref3" title="Jump back to footnote 3 in the text.">↩</a></sup><br><sup id="fn4">4. The Comonad.Reader <a href="http://comonad.com/reader/2011/what-constraints-entail-part-1/">http://comonad.com/reader/2011/what-constraints-entail-part-1/</a> <a href="#ref4" title="Jump back to footnote 3 in the text.">↩</a></sup><br>Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com5tag:blogger.com,1999:blog-7850814170309111848.post-6112651106356164082012-02-29T19:47:00.002-08:002012-02-29T19:47:26.235-08:00Hello WorldHello World!Philip J-Fhttp://www.blogger.com/profile/03701325990487750739noreply@blogger.com0