Here’s an interesting problem…
Given an undirected graph, find all the paths (not just the shortest) between two nodes that are less than some given length.
I decided to attack this problem using Haskell. Simply because I’ve been trying to get to grips with the language and felt that this was a nice “real” problem for me to practice my fledgling skills on. I’ve chosen to model a graph as a list of edges which are described as a simple pair (see the end of the following source code). I have therefore made the following assumptions in the code;
- An entry in the list (‘a’,'b’) explicitly states that an edge exists between the nodes ‘a’ and ‘b’ and further this implies that the edge (‘b’,'a’) exists since the graph is undirected.
- An entry (‘a’,'a’) implies that a node ‘a’ exists that does not have any edges, rather than an edge exists that self-references the node ‘a’. Nodes are not allowed to have edges to themselves.
The following code does work, and returns the correct number of paths given my test data and the requests I make of it. I can’t escape the feeling, though, that my Haskell could be made more simple and efficient. I just don’t know Haskell well enough yet to work out how!
The Solution
Edit 1: code to apply Hlint comments
- Only HLint error: use of “not (elem …)” rather than “notElem …” has been fixed
- HLint warning: use of “[p] ++ (some list)” rather than “p : (some list” has been fixed
- HLint warning: unnecessary parenthesis used in a few places, most of these have been fixed
Edit 2: corrected some cut and paste mistakes
import Data.List
-- finds all the paths with length <= l from u to v in g
findAllPathsTo l u v g
= filter endsAt (findAllPaths l u g)
where
endsAt ps = (last ps) == v
-- finds all the paths with length >= l from u in g
findAllPaths l n g
| notElem n (allNodes g) = []
| otherwise = allPaths' l g [[n]]
-------------------------------------------------------------------------------
-- Supporting functions below
-------------------------------------------------------------------------------
appendNeighboursToPath l p g
= append ns p
where
ns = neighbours n g
n = last p
append [] _ = []
append (n:ns) p = flatten (appendToAll l [p] n) : append ns p
neighbours a g
= reverse (foldl step [] g)
where step acc (u,v)
| u == a = v : acc
| v == a = u : acc
| otherwise = acc
allNodes :: (Eq a) => [(a,a)] -> [a]
allNodes [] = []
allNodes [(u,v)] = [u,v]
allNodes ((u,v):gs) = nub (u : v : allNodes gs)
allPaths' l g acc
| length acc == length paths = acc
| otherwise = allPaths' l g (nub (acc ++ paths))
where
paths = nub (appendNeighboursToPaths l acc g)
appendNeighboursToPaths l [] _ = []
appendNeighboursToPaths l (p:ps) g
= p : nub (appendNeighboursToPath l p g ++ appendNeighboursToPaths l ps g)
appendToAll l [] _ = []
appendToAll l (x:xs) y = maybeAdded : appendToAll l xs y
where
maybeAdded | elem y x = x
| length (x) > (l-1) = x
| otherwise = x ++ [y]
flatten = foldl (++) []
-------------------------------------------------------------------------------
-- Test code below
-------------------------------------------------------------------------------
test=do putStr "\n"
putStr "Simple tests to check return length of path lists\n"
putStr "1 == length (findAllPaths 100 'a' g0)? " ;
print (1 == length (findAllPaths 100 'a' g0))
putStr "5 == length (findAllPaths 100 'a' g1)? " ;
print (5 == length (findAllPaths 100 'a' g1))
putStr "9 == length (findAllPaths 100 'a' g2)? " ;
print (9 == length (findAllPaths 100 'a' g2))
putStr "9 == length (findAllPaths 100 'a' g3)? " ;
print (9 == length (findAllPaths 100 'a' g3))
putStr "4 == length (findAllPaths 100 'a' g4)? " ;
print (4 == length (findAllPaths 100 'a' g4))
putStr "21 == length (findAllPaths 100 'a' g5)? " ;
print (21 == length (findAllPaths 100 'a' g5))
putStr "\n"
putStr "2 == length (findAllPathsTo 100 'a' 'f' g5)? " ;
print (2 == length (findAllPathsTo 100 'a' 'f' g5))
putStr "1 == length (findAllPathsTo 6 'a' 'f' g5)? " ;
print (1 == length (findAllPathsTo 6 'a' 'f' g5))
putStr "0 == length (findAllPathsTo 100 'a' 'z' g5)? " ;
print (0 == length (findAllPathsTo 100 'a' 'z' g5))
putStr "\n"
putStr "Has all paths [\"abcdf\",\"abcjklmnf\"] in (findAllPathsTo 100 'a' 'f' g5)? " ;
print (checkSameContents (findAllPathsTo 100 'a' 'f' g5) ["abcdf","abcjklmnf"])
putStr "Has all paths [\"abcdf\"] in (findAllPathsTo 6 'a' 'f' g5)? " ;
print (checkSameContents (findAllPathsTo 6 'a' 'f' g5) ["abcdf"])
g0=[('a','a'),('b','b'),('c','c')]
g1=[('a','b'),('b','c'),('c','d'),('d','e')]
g2=[('a','b'),('b','c'),('c','d'),('c','e'),('d','f'),('e','f')]
g3=[('a','b'),('b','c'),('c','d'),('c','e'),('d','f'),('e','g'),('g','h'),('h','i')]
g4=[('a','b'),('b','c'),('b','d')]
g5=[('a','b'),('b','c'),('c','d'),('c','e'),('d','f'),('e','g'),('g','h'),('h','i'),('c','j'),('j','k'),('k','l'),('l','m'),('m','n'),('n','f')]
checkSameContents :: (Eq a) => [a] -> [a] -> Bool
checkSameContents a b
= (length a == length b) && (length b) == (length (takeWhile f a))
where
f x = elem x b
October 21, 2009 at 20:52 |
I’m curious if you looked at using a preexisting graph library and if so, what your thoughts of the library(s) are.
There’s graphalize [1] and fgl [2] for starters.
[1] http://hackage.haskell.org/package/Graphalyze
[2] http://hackage.haskell.org/package/fgl
October 21, 2009 at 20:57 |
HLint found an error in your code.
October 22, 2009 at 01:13 |
Maybe you would be interested in: http://web.engr.oregonstate.edu/~erwig/fgl/haskell/
October 23, 2009 at 08:39 |
@tommd and Diego Echeverri
Thanks for the links and references. I did have a quick look around for graph algorithms in Haskell and did come across at least one existing library.
I wanted to treat this problem as a Haskell learning exercise as well as an interesting graph problem. So I left all existing libraries alone and haven’t even looked at their documentation.
@P Jones
I’m very much a Haskell newbie and hadn’t heard of Hlint before. Thanks for pointing out that some errors and warnings existed. I’ve now edited the code to fix (most of) them. Whilst reading about Hlint I came across a comment by someone advising code reviewers not to review code that hasn’t already been Hlint-ed, so I’ll bare that advice in mind in the future.
I’m curious as to why the use of “not (elem …)” is an error and not a warn. I’ve taken Hlint’s advice and changed it to “notElem …”. The only thing I can think of is that “notElem” could fail fast while “not (elem …)” would have to munch the entire list before it could return. That still seems more of a ‘warning’ than an ‘error’ to me though.