Attoparsec Tutorial Part 3 - Parse Data in Variable Order

May 10, 2016
haskell, attoparsec

In this lesson, we will build upon the parsers from the previous lesson. The goal is to take a string in which the name and phone key-value pairs may occur in either order and successfully parse it into a value of type Person.

String with name first:

name:Isaac
phone:1122-3344

String with phone first:

phone:1122-3344
name:Isaac

My first intuition was to parse it like this:

parsePerson :: Parser Person
parsePerson = do
  first  <- parseName <|> parsePhone
  ...

The problem is that this cannot tell us which parser succeeded. parseName and parsePhone both return Text, so the value we get back looks the same in either case. A simple solution is to create a sum type that can represent either a name or a phone; then we can pattern match on the parse result.
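To make the ambiguity concrete, here is a minimal sketch. The real parseName and parsePhone come from the previous lesson and are not shown here, so the definitions below (including the helper parseValue and the name nameOrPhone) are hypothetical stand-ins; they assume each parser consumes its key, the value, and a trailing line ending or end of input.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative ((<|>))
import Data.Attoparsec.Text
  (Parser, endOfInput, endOfLine, parseOnly, string, takeWhile1)
import Data.Char (isSpace)
import Data.Text (Text)

-- Hypothetical helper: a run of non-space characters followed by
-- a line ending or the end of the input.
parseValue :: Parser Text
parseValue = takeWhile1 (not . isSpace) <* (endOfLine <|> endOfInput)

-- Hypothetical stand-ins for the parsers from the previous lesson.
parseName :: Parser Text
parseName = string "name:" *> parseValue

parsePhone :: Parser Text
parsePhone = string "phone:" *> parseValue

-- This type checks, but the result does not record which branch matched.
nameOrPhone :: Parser Text
nameOrPhone = parseName <|> parsePhone

fromNameLine, fromPhoneLine :: Either String Text
fromNameLine  = parseOnly nameOrPhone "name:Isaac\n"       -- Right "Isaac"
fromPhoneLine = parseOnly nameOrPhone "phone:1122-3344\n"  -- Right "1122-3344"
```

Both results are plain Text values, so nothing in the code that follows can decide whether it still needs to parse a phone or a name.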

data PersonItem = NameResult Text | PhoneResult Text

parsePerson :: Parser Person
parsePerson = do
  first <- (NameResult <$> parseName) <|> (PhoneResult <$> parsePhone)
  case first of
    NameResult n -> do
      p <- parsePhone
      return $ Person n p
    PhoneResult p -> do
      n <- parseName
      return $ Person n p

We try parseName and wrap its result in NameResult. If parseName fails, we try parsePhone and wrap its result in PhoneResult. If that also fails, parsePerson fails. Keep in mind that both parsers passed to (<|>) must return the same type. We have met this requirement by wrapping the results of the two parsers in constructors of the same type, PersonItem.

If the name is parsed first, we then parse the phone number; if the phone number is parsed first, we then parse the name. The end result is the same regardless of which key-value pair occurs first. However, this pattern does not scale very well. Think about what it would look like if we had four or five constructors in PersonItem: every case branch would have to handle each possible order of the remaining fields. The code would become very messy.
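One possible way to avoid the nested case branches, offered only as a sketch and not as the approach this series takes, is to collect the items in whatever order they appear and assemble the Person afterwards. The definitions of Person, parseName, and parsePhone below are assumptions standing in for the previous lesson's code, and parseValue, parseItem, and parsePersonAnyOrder are hypothetical names.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text
  (Parser, endOfInput, endOfLine, parseOnly, string, takeWhile1)
import Data.Char (isSpace)
import Data.Text (Text)

-- Assumed shape of the Person type from the previous lesson.
data Person = Person Text Text deriving (Eq, Show)

data PersonItem = NameResult Text | PhoneResult Text

-- Hypothetical stand-ins for the parsers from the previous lesson.
parseValue :: Parser Text
parseValue = takeWhile1 (not . isSpace) <* (endOfLine <|> endOfInput)

parseName, parsePhone :: Parser Text
parseName  = string "name:"  *> parseValue
parsePhone = string "phone:" *> parseValue

-- Parse a single key-value pair of either kind.
parseItem :: Parser PersonItem
parseItem = (NameResult <$> parseName) <|> (PhoneResult <$> parsePhone)

-- Collect all items first, then require exactly one of each field.
parsePersonAnyOrder :: Parser Person
parsePersonAnyOrder = do
  items <- many parseItem
  case ([n | NameResult n <- items], [p | PhoneResult p <- items]) of
    ([n], [p]) -> return (Person n p)
    _          -> fail "expected exactly one name and one phone"
```

With this shape, adding a field means adding one constructor, one alternative in parseItem, and one list in the final case expression, rather than a new branch for every ordering. The trade-off is that missing or duplicated fields are only detected after all items have been read.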

spec :: Spec
spec = do
  describe "parsePerson" $ do
    it "should parse \"name:\"" $
      parseNameKey `shouldSucceedOn` ("name:" :: Text)
    it "should parse \"name:James\nphone:867-5309\"" $
      ("name:James\nphone:867-5309" :: Text) ~> parsePerson `shouldParse` (Person "James" "867-5309")
    it "should parse \"phone:867-5309\nname:James\"" $
      ("phone:867-5309\nname:James" :: Text) ~> parsePerson `shouldParse` (Person "James" "867-5309" :: Person)

main :: IO ()
main = hspec spec

Run the code with stack --resolver lts-8.17 runghc 2016-05-10-attoparsec-tutorial-3.lhs.