<?xml version="1.0" encoding="UTF-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">Parsing – Haskell – Aelve Guide</title><id>https://guide.aelve.com/haskell/feed/category/lnwybqv9</id><updated>2017-10-03T15:23:56Z</updated><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/feed/category/lnwybqv9"/><entry><id>ktz9bewh</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">fastparser</title><updated>2017-10-03T15:23:56Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;span class=&#34;item-name&#34;&gt;fastparser&lt;/span&gt;

  
  (&lt;a href=&#34;https://hackage.haskell.org/package/fastparser&#34;&gt;Hackage&lt;/a&gt;)
&lt;/h1&gt;&lt;p&gt;A very simple, backtracking, fast parser combinator library.&lt;/p&gt;
&lt;p&gt;Do not use fastparser when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;performance is not the most pressing concern.&lt;/li&gt;
&lt;li&gt;you need to parse anything other than strict ByteString.&lt;/li&gt;
&lt;li&gt;you need a battle-tested library (fastparser is still experimental).&lt;/li&gt;
&lt;li&gt;you need to parse large inputs that are not easily cut into many smaller pieces that can be parsed independently&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Very, very fast. Measurably faster than attoparsec (36% in &lt;a href=&#34;https://hbtvl.banquise.net/posts/2015-12-14-fastParsing03.html&#34;&gt;this use case&lt;/a&gt;)&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;only works on strict &lt;code&gt;ByteString&lt;/code&gt;&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;lacks many helper functions&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;is not resumable&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-ktz9bewh"/></entry><entry><id>voyglb4x</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">uu-parsinglib</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;span class=&#34;item-name&#34;&gt;uu-parsinglib&lt;/span&gt;

  
  (&lt;a href=&#34;https://hackage.haskell.org/package/uu-parsinglib&#34;&gt;Hackage&lt;/a&gt;)
&lt;/h1&gt;&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;/ul&gt;</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-voyglb4x"/></entry><entry><id>r2axd97c</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">megaparsec</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;a href=&#34;https://mrkkrp.github.io/megaparsec/&#34; class=&#34;item-name&#34;&gt;megaparsec&lt;/a&gt;

  
  (&lt;a href=&#34;https://hackage.haskell.org/package/megaparsec&#34;&gt;Hackage&lt;/a&gt;)
&lt;/h1&gt;&lt;p&gt;An unofficial &lt;a href=&#34;https://notehub.org/w7037&#34;&gt;successor&lt;/a&gt; of parsec (which hasn&#39;t seen any updates in quite some time). Nothing particularly fancy – just a good, modern parsing library.&lt;/p&gt;
&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Very easy to use.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Error messages are good.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Lets you use custom error messages tailored to your domain of interest (that means you can signal errors using your own custom data constructors).&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;The API is largely similar to Parsec, so existing tutorials/code samples can be reused and migration is easy.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Works well with Text and custom streams of tokens, such as the result of running Alex/Happy.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has special combinators for parsing indentation (good if you&#39;re writing a parser for a small programming language or data format like YAML).&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has rudimentary &lt;a href=&#34;https://mrkkrp.github.io/megaparsec/tutorials/fun-with-the-recovery-feature.html&#34;&gt;error recovery&lt;/a&gt; – if a part of a parser fails, you can log a parse error and skip part of the input. Sometimes it&#39;s useful.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has a special combinator (as of 5.1.0) for debugging that shows what is going on at a lower level.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Well-tested and robust.&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Like all parsec-like libraries, it doesn&#39;t like left recursion – i.e. if you&#39;re parsing &lt;code&gt;1+2+3&lt;/code&gt;, you can&#39;t just write something like (in pseudocode) &lt;code&gt;expr = number | (expr &#39;+&#39; number)&lt;/code&gt; and expect it to work. See &lt;a href=&#34;http://blog.moertel.com/posts/2005-08-27-power-parsing-with-haskell-and-parsec.html&#34;&gt;this post&lt;/a&gt; for a more detailed explanation.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Doesn&#39;t have automatic backtracking. 
This means that if you write &lt;code&gt;expr = add | multiply&lt;/code&gt; and the parser for &lt;code&gt;add&lt;/code&gt; fails in the middle (e.g. after parsing a single number), it won&#39;t try &lt;code&gt;multiply&lt;/code&gt; unless you explicitly tell it to. This can be a good thing (saying when you want to backtrack explicitly can lead to better performance and better error messages), but it can still be somewhat annoying.&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Ecosystem&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://hackage.haskell.org/package/hspec-megaparsec&#34;&gt;hspec-megaparsec&lt;/a&gt; - utility functions for testing Megaparsec parsers with Hspec.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://hackage.haskell.org/package/cassava-megaparsec&#34;&gt;cassava-megaparsec&lt;/a&gt; - Megaparsec parser of CSV files that plays nicely with Cassava.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://hackage.haskell.org/package/tagsoup-megaparsec&#34;&gt;tagsoup-megaparsec&lt;/a&gt; - a Tag token parser and Tag specific parsing combinators.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://hackage.haskell.org/package/parser-combinators&#34;&gt;parser-combinators&lt;/a&gt; - lightweight package providing commonly useful parser combinators.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Notes&lt;/h2&gt;&lt;h1&gt;&lt;span id=&#34;item-notes-r2axd97c-links&#34;&gt;&lt;/span&gt;Links&lt;/h1&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;http://blog.jakubarnold.cz/2014/08/10/parsing-css-with-parsec.html&#34;&gt;Parsing CSS with Parsec&lt;/a&gt; is a very to-the-point tutorial and I recommend looking at it first – it&#39;s possible that after reading it you&#39;ll understand how to do parsing without any lengthy explanations. It&#39;s about Parsec, not Megaparsec, but the only difference is in combinator names (e.g. &lt;code&gt;many1&lt;/code&gt; should be &lt;code&gt;some&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://notehub.org/w7037&#34;&gt;Why Megaparsec was created and how it&#39;s better&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://mrkkrp.github.io/megaparsec/tutorials/switch-from-parsec-to-megaparsec.html&#34;&gt;Switching from Parsec to Megaparsec&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://mrkkrp.github.io/megaparsec/tutorials/parsing-simple-imperative-language.html&#34;&gt;An example of parsing a simple language&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://mrkkrp.github.io/megaparsec/tutorials/indentation-sensitive-parsing.html&#34;&gt;How to do indentation-sensitive parsing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://mrkkrp.github.io/megaparsec/&#34;&gt;official site&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://mrkkrp.github.io/megaparsec/tutorials.html&#34;&gt;list of tutorials&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-r2axd97c-imports&#34;&gt;&lt;/span&gt;Imports&lt;/h1&gt;&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.Megaparsec&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.Megaparsec.String&lt;/span&gt;    &lt;span class=&#34;co&#34;&gt;-- or Text / Text.Lazy / etc&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Additionally, if you&#39;re going to use number parsers (e.g. &lt;code&gt;integer&lt;/code&gt;):&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.Megaparsec.Lexer&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-r2axd97c-a-long-very-basic-example&#34;&gt;&lt;/span&gt;A long, very basic example&lt;/h1&gt;&lt;p&gt;(Only read it if you were left confused by &lt;a href=&#34;http://blog.jakubarnold.cz/2014/08/10/parsing-css-with-parsec.html&#34;&gt;Parsing CSS with Parsec&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;Parsers work like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;A parser consumes a piece of string and turns it into a value. &lt;code&gt;Parser a&lt;/code&gt; returns a value of type &lt;code&gt;a&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you combine parsers with &lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt; or &lt;code&gt;do&lt;/code&gt;, they are applied one after another (the 2nd parser starts consuming the string from the place where the 1st parser stopped).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you combine parsers with &lt;code&gt;&amp;lt;|&amp;gt;&lt;/code&gt;, they are applied to the same piece of the string, but the 2nd parser is only tried if the 1st parser fails.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
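&lt;p&gt;A minimal sketch of the last two points, assuming the Megaparsec 5.x module layout from the Imports section (where &lt;code&gt;Text.Megaparsec.String&lt;/code&gt; provides &lt;code&gt;Parser&lt;/code&gt;). The names &lt;code&gt;xy&lt;/code&gt; and &lt;code&gt;digitOrLetter&lt;/code&gt; are made up for illustration; &lt;code&gt;parseTest&lt;/code&gt; runs a parser on a string and prints either the result or the error:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;import Text.Megaparsec
import Text.Megaparsec.String    -- assumption: Megaparsec 5.x

-- Sequencing: &#39;y&#39; is parsed from the place where &#39;x&#39; stopped.
-- The result of the whole parser is the result of the last step.
xy :: Parser Char
xy = char &#39;x&#39; &amp;gt;&amp;gt; char &#39;y&#39;

-- Alternation: both parsers look at the same character;
-- letterChar is only tried if digitChar fails.
digitOrLetter :: Parser Char
digitOrLetter = digitChar &amp;lt;|&amp;gt; letterChar

main :: IO ()
main = do
  parseTest xy &amp;quot;xy&amp;quot;            -- prints &#39;y&#39;
  parseTest digitOrLetter &amp;quot;7&amp;quot;  -- prints &#39;7&#39;
  parseTest digitOrLetter &amp;quot;q&amp;quot;  -- prints &#39;q&#39;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;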
&lt;p&gt;Let&#39;s say you have something that is either a pair or a triplet of numbers, and has to be enclosed in double parens: &lt;code&gt;((1,2))&lt;/code&gt; or &lt;code&gt;((1,2,3))&lt;/code&gt;. We&#39;ll represent it with &lt;code&gt;Vec&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;data&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Vec&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;V2&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Integer&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Integer&lt;/span&gt;
         &lt;span class=&#34;fu&#34;&gt;|&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;V3&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Integer&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Integer&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Integer&lt;/span&gt;
  &lt;span class=&#34;kw&#34;&gt;deriving&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Show&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Let&#39;s write a parser for &lt;code&gt;Vec&lt;/code&gt;. First of all, notice that the parens are the same in both cases, and so we can write a special function for parsing something inside double parens:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;ot&#34;&gt;doubleParens ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; a &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; a
doubleParens p &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; between (string &lt;span class=&#34;st&#34;&gt;&amp;quot;((&amp;quot;&lt;/span&gt;) (string &lt;span class=&#34;st&#34;&gt;&amp;quot;))&amp;quot;&lt;/span&gt;) p&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;between&lt;/code&gt; is a combinator available in Megaparsec. There are many such combinators – you can find the whole list in &lt;a href=&#34;https://hackage.haskell.org/package/megaparsec/docs/Text-Megaparsec-Combinator.html&#34;&gt;&lt;code&gt;Text.Megaparsec.Combinator&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Next, parsers for &lt;code&gt;V2&lt;/code&gt; and &lt;code&gt;V3&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;ot&#34;&gt;v2 ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Vec&lt;/span&gt;
v2 &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; doubleParens &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt; &lt;span class=&#34;kw&#34;&gt;do&lt;/span&gt;
  a &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  char &lt;span class=&#34;ch&#34;&gt;&#39;,&#39;&lt;/span&gt;
  b &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  return (&lt;span class=&#34;dt&#34;&gt;V2&lt;/span&gt; a b)

&lt;span class=&#34;ot&#34;&gt;v3 ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Vec&lt;/span&gt;
v3 &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; doubleParens &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt; &lt;span class=&#34;kw&#34;&gt;do&lt;/span&gt;
  a &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  char &lt;span class=&#34;ch&#34;&gt;&#39;,&#39;&lt;/span&gt;
  b &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  char &lt;span class=&#34;ch&#34;&gt;&#39;,&#39;&lt;/span&gt;
  c &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  return (&lt;span class=&#34;dt&#34;&gt;V3&lt;/span&gt; a b c)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once we have those 2 parsers, we can combine them:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;ot&#34;&gt;vec ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Vec&lt;/span&gt;
vec &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; try v2 &lt;span class=&#34;fu&#34;&gt;&amp;lt;|&amp;gt;&lt;/span&gt; v3&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;try&lt;/code&gt; means that if &lt;code&gt;v2&lt;/code&gt; consumes some input and then fails, the next parser (i.e. &lt;code&gt;v3&lt;/code&gt;) will be tried. By default Megaparsec only tries the next parser if the previous one fails without consuming any input, which gives you more control over how your parser behaves. (There are some parsing libraries that do &lt;em&gt;automatic backtracking&lt;/em&gt; (i.e. when a parser fails, even at the last step, they always go back in time and try other parsers), but Megaparsec doesn&#39;t do it.)&lt;/p&gt;
&lt;p&gt;To actually use the parser, we need &lt;a href=&#34;https://hackage.haskell.org/package/megaparsec/docs/Text-Megaparsec.html#v:parse&#34;&gt;&lt;code&gt;parse&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;parse
&lt;span class=&#34;ot&#34;&gt;  ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; a
  &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;String&lt;/span&gt;                 &lt;span class=&#34;co&#34;&gt;-- Filepath (can be &amp;quot;&amp;quot;)&lt;/span&gt;
  &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;String&lt;/span&gt;                 &lt;span class=&#34;co&#34;&gt;-- String that is being parsed&lt;/span&gt;
  &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Either&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;ParseError&lt;/span&gt; a&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(The actual signature is more general because it can also work on &lt;code&gt;Text&lt;/code&gt; and so on.)&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode repl&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; parse vec &lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;((3,12))&amp;quot;&lt;/span&gt;
&lt;span class=&#34;dt&#34;&gt;Right&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;V2&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;12&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;What would happen if the input is bad?&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode repl&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; parse vec &lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;1,2))&amp;quot;&lt;/span&gt;

&lt;span class=&#34;dt&#34;&gt;Left&lt;/span&gt; line &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;, column &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;:&lt;/span&gt;
unexpected &lt;span class=&#34;ch&#34;&gt;&#39;1&#39;&lt;/span&gt;
expecting &lt;span class=&#34;st&#34;&gt;&amp;quot;((&amp;quot;&lt;/span&gt;

&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; parse vec &lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;((3,12,))&amp;quot;&lt;/span&gt;

&lt;span class=&#34;dt&#34;&gt;Left&lt;/span&gt; line &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;, column &lt;span class=&#34;dv&#34;&gt;7&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;:&lt;/span&gt;
unexpected &lt;span class=&#34;ch&#34;&gt;&#39;,&#39;&lt;/span&gt;
expecting &lt;span class=&#34;st&#34;&gt;&amp;quot;))&amp;quot;&lt;/span&gt; or rest &lt;span class=&#34;kw&#34;&gt;of&lt;/span&gt; integer

&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; parse vec &lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;((1,a))&amp;quot;&lt;/span&gt;

&lt;span class=&#34;dt&#34;&gt;Left&lt;/span&gt; line &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;, column &lt;span class=&#34;dv&#34;&gt;5&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;:&lt;/span&gt;
unexpected &lt;span class=&#34;ch&#34;&gt;&#39;a&#39;&lt;/span&gt;
expecting integer&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;By the way, there&#39;s a bit more repetition in our parsers that we could eliminate – specifically, parsing the first 2 numbers:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;ot&#34;&gt;vec ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Vec&lt;/span&gt;
vec &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; doubleParens &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt; &lt;span class=&#34;kw&#34;&gt;do&lt;/span&gt;
  a &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  char &lt;span class=&#34;ch&#34;&gt;&#39;,&#39;&lt;/span&gt;
  b &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
  choice [
    &lt;span class=&#34;kw&#34;&gt;do&lt;/span&gt; char &lt;span class=&#34;ch&#34;&gt;&#39;,&#39;&lt;/span&gt;
       c &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; integer
       return (&lt;span class=&#34;dt&#34;&gt;V3&lt;/span&gt; a b c),
    return (&lt;span class=&#34;dt&#34;&gt;V2&lt;/span&gt; a b) ]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here we used &lt;code&gt;choice&lt;/code&gt;, which is like &lt;code&gt;&amp;lt;|&amp;gt;&lt;/code&gt; but for many parsers instead of just 2.&lt;/p&gt;
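&lt;p&gt;The effect of &lt;code&gt;try&lt;/code&gt; can also be seen in isolation. The following is only a sketch, again assuming the Megaparsec 5.x modules from the Imports section; &lt;code&gt;ab&lt;/code&gt; and &lt;code&gt;ac&lt;/code&gt; are made-up parsers that share a common prefix:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;import Text.Megaparsec
import Text.Megaparsec.String    -- assumption: Megaparsec 5.x

-- Both alternatives start with &#39;a&#39;, so on the input &amp;quot;ac&amp;quot;
-- the first one consumes a character before it fails.
ab, ac :: Parser Char
ab = char &#39;a&#39; &amp;gt;&amp;gt; char &#39;b&#39;
ac = char &#39;a&#39; &amp;gt;&amp;gt; char &#39;c&#39;

main :: IO ()
main = do
  -- Prints a parse error: ab has consumed &#39;a&#39;, so ac is never tried.
  parseTest (ab &amp;lt;|&amp;gt; ac) &amp;quot;ac&amp;quot;
  -- Prints &#39;c&#39;: try rewinds the input when ab fails, and ac starts from the beginning.
  parseTest (try ab &amp;lt;|&amp;gt; ac) &amp;quot;ac&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;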
</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-r2axd97c"/></entry><entry><id>p6odina8</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">attoparsec</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;span class=&#34;item-name&#34;&gt;attoparsec&lt;/span&gt;

  
  (&lt;a href=&#34;https://hackage.haskell.org/package/attoparsec&#34;&gt;Hackage&lt;/a&gt;)
&lt;/h1&gt;&lt;p&gt;A very fast parsing library for &lt;code&gt;Text&lt;/code&gt; and &lt;code&gt;ByteString&lt;/code&gt;. Best suited for parsing things that aren&#39;t going to be seen by humans (like JSON, binary protocols, and so on). Not that good for parsing e.g. programming languages – for instance, it doesn&#39;t even tell you the positions of errors when they happen.&lt;/p&gt;
&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Performance (see &lt;a href=&#34;http://www.serpentine.com/blog/2014/05/31/attoparsec/&#34;&gt;this&lt;/a&gt; for a comparison of sorts). Can be 10× faster than Parsec.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has automatic backtracking, which means that you don&#39;t have to figure out where to put &lt;code&gt;try&lt;/code&gt; – everything just works.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has a simpler API than parsec/megaparsec.&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Can&#39;t report positions of parsing errors. (And the error messages are generally poor.)&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Doesn&#39;t provide a monad transformer. This means that if you want to do something while parsing (e.g. keep state, or print warnings, or whatever), you can&#39;t.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Backtracking can&#39;t be turned off or limited in scope (i.e. you can&#39;t say “if this parser didn&#39;t fail then commit to it”). It makes error messages worse and likely hurts performance (but I&#39;m not sure, given that attoparsec is still the fastest library around).&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Ecosystem&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Additional parsers:
&lt;a href=&#34;https://hackage.haskell.org/package/attoparsec-expr&#34;&gt;attoparsec-expr&lt;/a&gt; (Parsec-like expression parser),
&lt;a href=&#34;https://hackage.haskell.org/package/attoparsec-binary&#34;&gt;attoparsec-binary&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/http-attoparsec&#34;&gt;http-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/aeson/docs/Data-Aeson-Parser.html&#34;&gt;aeson&lt;/a&gt; (JSON),
&lt;a href=&#34;https://hackage.haskell.org/package/timeparsers&#34;&gt;timeparsers&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/html-entities&#34;&gt;html-entities&lt;/a&gt;,
&lt;a href=&#34;http://hackage.haskell.org/package/taggy/docs/Text-Taggy-Parser.html&#34;&gt;taggy&lt;/a&gt; (HTML/XML),
&lt;a href=&#34;https://hackage.haskell.org/package/css-text&#34;&gt;css-text&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/hweblib&#34;&gt;hweblib&lt;/a&gt; (HTTP, MIME, URI, ABNF),
&lt;a href=&#34;https://hackage.haskell.org/package/http-date&#34;&gt;http-date&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Iteratees:
&lt;a href=&#34;https://hackage.haskell.org/package/attoparsec-iteratee&#34;&gt;attoparsec-iteratee&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/pipes-attoparsec&#34;&gt;pipes-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/conduit-extra/docs/Data-Conduit-Attoparsec.html&#34;&gt;conduit-extra&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/conduit-tokenize-attoparsec&#34;&gt;conduit-tokenize-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/streaming-utils&#34;&gt;streaming-utils&lt;/a&gt;,
&lt;a href=&#34;http://hackage.haskell.org/package/io-streams/docs/System-IO-Streams-Attoparsec.html&#34;&gt;io-streams&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Other:
&lt;a href=&#34;https://hackage.haskell.org/package/foldl-transduce-attoparsec&#34;&gt;foldl-transduce-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/hspec-attoparsec&#34;&gt;hspec-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/list-t-attoparsec&#34;&gt;list-t-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/network-attoparsec&#34;&gt;network-attoparsec&lt;/a&gt;,
&lt;a href=&#34;https://hackage.haskell.org/package/attosplit&#34;&gt;attosplit&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Notes&lt;/h2&gt;&lt;h1&gt;&lt;span id=&#34;item-notes-p6odina8-links&#34;&gt;&lt;/span&gt;Links&lt;/h1&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.schoolofhaskell.com/school/starting-with-haskell/libraries-and-frameworks/text-manipulation/attoparsec&#34;&gt;Parsing log files with Attoparsec&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-p6odina8-imports&#34;&gt;&lt;/span&gt;Imports&lt;/h1&gt;&lt;p&gt;For parsing &lt;code&gt;ByteString&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Control.Applicative&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Data.Attoparsec.ByteString&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For parsing &lt;code&gt;Text&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Control.Applicative&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Data.Attoparsec.Text&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
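&lt;h1&gt;&lt;span id=&#34;item-notes-p6odina8-helpers&#34;&gt;&lt;/span&gt;Helpers&lt;/h1&gt;&lt;p&gt;This function wraps a parser so that it prints some debugging info when it runs (namely the input it consumed, the remaining input, and the value it parsed). As written it is for &lt;code&gt;Data.Attoparsec.Text&lt;/code&gt;, since it uses &lt;code&gt;takeText&lt;/code&gt;:&lt;/p&gt;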
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Debug.Trace&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Data.Attoparsec.Combinator&lt;/span&gt;

&lt;span class=&#34;ot&#34;&gt;debug ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Show&lt;/span&gt; a &lt;span class=&#34;ot&#34;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; a &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; a
debug p &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kw&#34;&gt;do&lt;/span&gt;
  (consumed, a) &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; match p
  remaining &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; lookAhead takeText
  traceM (&lt;span class=&#34;st&#34;&gt;&amp;quot;result    : &amp;quot;&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;++&lt;/span&gt; show a)
  traceM (&lt;span class=&#34;st&#34;&gt;&amp;quot;consumed  : &amp;quot;&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;++&lt;/span&gt; show consumed)
  traceM (&lt;span class=&#34;st&#34;&gt;&amp;quot;remaining : &amp;quot;&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;++&lt;/span&gt; show remaining)
  return a&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
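&lt;p&gt;A possible way to use it, as a self-contained sketch for the &lt;code&gt;Text&lt;/code&gt; flavour of attoparsec. The &lt;code&gt;pair&lt;/code&gt; parser is made up for illustration, and &lt;code&gt;debug&lt;/code&gt; is repeated so that the snippet compiles on its own:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text
import Debug.Trace (traceM)

-- Same helper as above: run p, then trace what it did.
debug :: Show a =&amp;gt; Parser a -&amp;gt; Parser a
debug p = do
  (consumed, a) &amp;lt;- match p
  remaining &amp;lt;- lookAhead takeText
  traceM (&amp;quot;result    : &amp;quot; ++ show a)
  traceM (&amp;quot;consumed  : &amp;quot; ++ show consumed)
  traceM (&amp;quot;remaining : &amp;quot; ++ show remaining)
  return a

-- Hypothetical example parser: two decimal numbers separated by a comma.
pair :: Parser (Int, Int)
pair = do
  a &amp;lt;- debug decimal
  _ &amp;lt;- char &#39;,&#39;
  b &amp;lt;- debug decimal
  return (a, b)

main :: IO ()
main = print (parseOnly pair &amp;quot;10,20&amp;quot;)  -- stdout: Right (10,20)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The trace lines go to stderr, so they don&#39;t mix with the parser&#39;s actual result on stdout.&lt;/p&gt;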
</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-p6odina8"/></entry><entry><id>msuib883</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">Earley</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;span class=&#34;item-name&#34;&gt;Earley&lt;/span&gt;

  
  (&lt;a href=&#34;https://hackage.haskell.org/package/Earley&#34;&gt;Hackage&lt;/a&gt;)
&lt;/h1&gt;&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Handles left recursion just fine&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Can &lt;a href=&#34;https://github.com/ollef/Earley#textearleygenerator&#34;&gt;generate strings given a parser that parses them&lt;/a&gt;.&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Doesn&#39;t have monadic parsing&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Ecosystem&lt;/h2&gt;&lt;p&gt;&lt;a href=&#34;https://hackage.haskell.org/package/pinchot&#34;&gt;pinchot&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Notes&lt;/h2&gt;&lt;h1&gt;&lt;span id=&#34;item-notes-msuib883-links&#34;&gt;&lt;/span&gt;Links&lt;/h1&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/ollef/Earley/blob/master/docs/implementation.md&#34;&gt;Implementation notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/ollef/Earley/tree/master/examples&#34;&gt;Examples&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-msuib883-imports-and-pragmas&#34;&gt;&lt;/span&gt;Imports and pragmas&lt;/h1&gt;&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;ot&#34;&gt;{-# LANGUAGE RecursiveDo #-}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Control.Applicative&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.Earley&lt;/span&gt;
&lt;span class=&#34;co&#34;&gt;-- If you want to construct a parser for a language with operators&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.Earley.Mixfix&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-msuib883-usage&#34;&gt;&lt;/span&gt;Usage&lt;/h1&gt;&lt;p&gt;The type for parsers is &lt;code&gt;Prod&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;data&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Prod&lt;/span&gt; 
       r    &lt;span class=&#34;co&#34;&gt;-- A phantom variable, like “s” in “ST s a”&lt;/span&gt;
       e    &lt;span class=&#34;co&#34;&gt;-- Type for names of parsers&lt;/span&gt;
       t    &lt;span class=&#34;co&#34;&gt;-- Type for characters (e.g. Char or Word8 or some Token)&lt;/span&gt;
       a    &lt;span class=&#34;co&#34;&gt;-- Result of the parser&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;e&lt;/code&gt; is usually going to be &lt;code&gt;String&lt;/code&gt; even if you&#39;re parsing e.g. &lt;code&gt;Text&lt;/code&gt;. &lt;code&gt;t&lt;/code&gt; will be &lt;code&gt;Char&lt;/code&gt; for &lt;code&gt;String&lt;/code&gt; and &lt;code&gt;Text&lt;/code&gt;, and &lt;code&gt;SomeToken&lt;/code&gt; if you have previously lexed/tokenized your input. It&#39;s usual to define a type synonym like this:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;type&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; r a &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Prod&lt;/span&gt; r &lt;span class=&#34;dt&#34;&gt;String&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Char&lt;/span&gt; a&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Writing parsers with Earley is similar to Parsec, with 2 major differences:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;You can&#39;t use &lt;code&gt;do&lt;/code&gt; and monadic parsing – for instance, it&#39;s impossible to write a parser that would parse a prime number, while with Parsec it&#39;s easy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You&#39;re given far fewer combinators by default, so you have to write many things (&lt;code&gt;between&lt;/code&gt;, etc) yourself.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If a parser depends on itself, it will loop. To avoid that, you have to define recursive parsers in the context of &lt;code&gt;Grammar&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;data&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Term&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Integer&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;|&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Term&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Term&lt;/span&gt;
  &lt;span class=&#34;kw&#34;&gt;deriving&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Show&lt;/span&gt;

&lt;span class=&#34;ot&#34;&gt;grammar ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Grammar&lt;/span&gt; r (&lt;span class=&#34;dt&#34;&gt;Parser&lt;/span&gt; r &lt;span class=&#34;dt&#34;&gt;Term&lt;/span&gt;)
grammar &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; mdo
  &lt;span class=&#34;kw&#34;&gt;let&lt;/span&gt; number &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;&amp;lt;$&amp;gt;&lt;/span&gt; integer
  add &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; rule &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt;
    &lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;&amp;lt;$&amp;gt;&lt;/span&gt; term &lt;span class=&#34;fu&#34;&gt;&amp;lt;*&amp;gt;&lt;/span&gt; (word &lt;span class=&#34;st&#34;&gt;&amp;quot;+&amp;quot;&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;*&amp;gt;&lt;/span&gt; number)
  term &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; rule &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt;
    number &lt;span class=&#34;fu&#34;&gt;&amp;lt;|&amp;gt;&lt;/span&gt; add
  return term&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(&lt;code&gt;word&lt;/code&gt; is the same as &lt;code&gt;string&lt;/code&gt; in Parsec.)&lt;/p&gt;
&lt;p&gt;As you can see, you can define parsers that depend on each other, but all such parsers have to be marked with &lt;code&gt;rule&lt;/code&gt;.&lt;/p&gt;
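&lt;p&gt;For reference, here is a sketch of the same grammar as a self-contained module. Some things in it are assumptions rather than part of the snippet above: &lt;code&gt;mdo&lt;/code&gt; needs the &lt;code&gt;RecursiveDo&lt;/code&gt; extension, &lt;code&gt;integer&lt;/code&gt; is not provided by Earley and is defined here in one possible way, &lt;code&gt;token &#39;+&#39;&lt;/code&gt; is used in place of &lt;code&gt;word&lt;/code&gt;, and &lt;code&gt;Prod&lt;/code&gt; is written out in full in case your version of Earley exports its own &lt;code&gt;Parser&lt;/code&gt; type:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;{-# LANGUAGE RecursiveDo #-}

import Control.Applicative
import Data.Char (isDigit)
import Text.Earley

data Term = Number Integer | Add Term Term
  deriving Show

-- Assumed helper: one or more decimal digits read as an Integer
integer :: Prod r String Char Integer
integer = read &amp;lt;$&amp;gt; some (satisfy isDigit)

grammar :: Grammar r (Prod r String Char Term)
grammar = mdo
  let number = Number &amp;lt;$&amp;gt; integer
  -- add refers to term before term is defined, hence mdo and rule
  add &amp;lt;- rule $
    Add &amp;lt;$&amp;gt; term &amp;lt;*&amp;gt; (token &#39;+&#39; *&amp;gt; number)
  term &amp;lt;- rule $
    number &amp;lt;|&amp;gt; add
  return term

main :: IO ()
main = print (fst (fullParses (parser grammar) &amp;quot;1+2+3&amp;quot;))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;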
&lt;p&gt;To run a &lt;code&gt;Grammar&lt;/code&gt;, use &lt;code&gt;fullParses&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode repl&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&#34;kw&#34;&gt;let&lt;/span&gt; (parses, rep) &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; fullParses (parser grammar) &lt;span class=&#34;st&#34;&gt;&amp;quot;1+2+3&amp;quot;&lt;/span&gt;

&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; parses
[&lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;) (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;))
     (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt;)]

&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; rep
&lt;span class=&#34;dt&#34;&gt;Report&lt;/span&gt; {position &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;5&lt;/span&gt;, expected &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; [], unconsumed &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can also use &lt;code&gt;allParses&lt;/code&gt; to get partial parses too, or &lt;code&gt;report&lt;/code&gt; to see how much of the input the parser can consume without actually getting all parses.&lt;/p&gt;
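&lt;p&gt;A small sketch of how &lt;code&gt;report&lt;/code&gt; and &lt;code&gt;allParses&lt;/code&gt; might be called, assuming a version of Earley with the same calling convention as &lt;code&gt;fullParses (parser grammar) input&lt;/code&gt;. The grammar &lt;code&gt;sumGrammar&lt;/code&gt; and the name &lt;code&gt;&amp;quot;digit&amp;quot;&lt;/code&gt; are made up for this example; &lt;code&gt;&amp;lt;?&amp;gt;&lt;/code&gt; attaches the name so that it can appear in the &lt;code&gt;expected&lt;/code&gt; field of a &lt;code&gt;Report&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;import Data.Char (isDigit)
import Text.Earley

-- A single digit, named for error reporting
digit :: Prod r String Char Char
digit = satisfy isDigit &amp;lt;?&amp;gt; &amp;quot;digit&amp;quot;

-- Two digits with a plus sign in between; not recursive, so no rule is needed
sumGrammar :: Grammar r (Prod r String Char (Char, Char))
sumGrammar = return ((,) &amp;lt;$&amp;gt; digit &amp;lt;*&amp;gt; (token &#39;+&#39; *&amp;gt; digit))

main :: IO ()
main = do
  -- report only tells how far the parser got and what it expected next
  print (report (parser sumGrammar) &amp;quot;1+x&amp;quot;)
  -- allParses pairs each result with the position where that parse ended
  print (fst (allParses (parser sumGrammar) &amp;quot;1+2&amp;quot;))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;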
&lt;p&gt;Note that our grammar is left-recursive, but Earley was still able to parse it. The mental model of parsing you may have from Parsec does not apply to Earley.&lt;/p&gt;
&lt;p&gt;Also note that there can be several parses. For instance, if we changed the definition of &lt;code&gt;add&lt;/code&gt; to&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;  add &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; rule &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt;
    &lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;&amp;lt;$&amp;gt;&lt;/span&gt; term &lt;span class=&#34;fu&#34;&gt;&amp;lt;*&amp;gt;&lt;/span&gt; (word &lt;span class=&#34;st&#34;&gt;&amp;quot;+&amp;quot;&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;*&amp;gt;&lt;/span&gt; term)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;we&#39;d get the following parses:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode repl&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; fst &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt; fullParses (parser grammar) &lt;span class=&#34;st&#34;&gt;&amp;quot;1+2+3&amp;quot;&lt;/span&gt;
[ &lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;) (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;))
      (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt;)
, &lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;)
      (&lt;span class=&#34;dt&#34;&gt;Add&lt;/span&gt; (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;) (&lt;span class=&#34;dt&#34;&gt;Number&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt;))]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-msuib883-mixfix-parsing&#34;&gt;&lt;/span&gt;Mixfix parsing&lt;/h1&gt;&lt;p&gt;See &lt;a href=&#34;https://hackage.haskell.org/package/Earley/docs/Text-Earley-Mixfix.html&#34;&gt;&lt;code&gt;Text.Earley.Mixfix&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://github.com/ollef/Earley/blob/master/examples/Mixfix.hs&#34;&gt;this example&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(TODO: write a better example here.)&lt;/p&gt;
</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-msuib883"/></entry><entry><id>izjbbjio</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">trifecta</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;span class=&#34;item-name&#34;&gt;trifecta&lt;/span&gt;

  
  (&lt;a href=&#34;https://hackage.haskell.org/package/trifecta&#34;&gt;Hackage&lt;/a&gt;)
&lt;/h1&gt;&lt;p&gt;A library that is supposed to give you much nicer error messages than those of Parsec. Can be hard to figure out, but worth trying if you&#39;re writing an interpreter/compiler.&lt;/p&gt;
&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Lets you report errors in a manner similar to Clang, with colors and &lt;code&gt;^~~~~~~~~&lt;/code&gt; and so on, which is very useful when writing e.g. a compiler. (For an example of what Clang does, see &lt;a href=&#34;http://clang.llvm.org/diagnostics.html&#34;&gt;here&lt;/a&gt;.)&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has a module for doing highlighting of parsed text (i.e. you assign &lt;a href=&#34;https://hackage.haskell.org/package/parsers/docs/Text-Parser-Token-Highlight.html#t:Highlight&#34;&gt;labels&lt;/a&gt; like &lt;code&gt;Number&lt;/code&gt;, &lt;code&gt;Operator&lt;/code&gt;, &lt;code&gt;Identifier&lt;/code&gt;, etc and you can generate colored text from them).&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Kinda complicated, doesn&#39;t have any tutorials available, and documentation doesn&#39;t help at all.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Can parse &lt;code&gt;String&lt;/code&gt; and &lt;code&gt;ByteString&lt;/code&gt; natively, but not &lt;code&gt;Text&lt;/code&gt;.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Depends on &lt;code&gt;lens&lt;/code&gt; and thus by depending on &lt;code&gt;trifecta&lt;/code&gt; you pull in half of Hackage too.&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Ecosystem&lt;/h2&gt;&lt;p&gt;&lt;a href=&#34;https://hackage.haskell.org/package/indentation&#34;&gt;indentation&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Notes&lt;/h2&gt;&lt;h1&gt;&lt;span id=&#34;item-notes-izjbbjio-imports&#34;&gt;&lt;/span&gt;Imports&lt;/h1&gt;&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Control.Applicative&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.Trifecta&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-izjbbjio-gotchas&#34;&gt;&lt;/span&gt;Gotchas&lt;/h1&gt;&lt;p&gt;If you want to parse &lt;code&gt;Text&lt;/code&gt;, either convert it to &lt;code&gt;String&lt;/code&gt; or do &lt;a href=&#34;https://github.com/ekmett/trifecta/issues/51#issue-122418200&#34;&gt;something like this&lt;/a&gt;.&lt;/p&gt;
</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-izjbbjio"/></entry><entry><id>fn7l37g7</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">ReadP</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;a href=&#34;http://hackage.haskell.org/package/base/docs/Text-ParserCombinators-ReadP.html&#34; class=&#34;item-name&#34;&gt;ReadP&lt;/a&gt;

&lt;/h1&gt;&lt;p&gt;A small, simple parsing module shipped with GHC by default. Pretty usable if you don&#39;t care about getting error messages. Good for writing &lt;code&gt;Read&lt;/code&gt; instances.&lt;/p&gt;
&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;It&#39;s in base, so you can use it even when you can&#39;t (or don&#39;t want to) depend on any parsing library.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Non-deterministic – all parse results will be returned. Hence doesn&#39;t need &lt;code&gt;try&lt;/code&gt; or backtracking, and doesn&#39;t leak space. (Left-biased Parsec-like choice is still possible with &lt;code&gt;&amp;lt;++&lt;/code&gt;.)&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Can be used for writing complicated &lt;code&gt;Read&lt;/code&gt; instances that are fully compliant with Haskell&#39;s precedence parsing requirements (see the &lt;a href=&#34;http://hackage.haskell.org/package/base/docs/Text-ParserCombinators-ReadPrec.html&#34;&gt;&lt;code&gt;ReadPrec&lt;/code&gt;&lt;/a&gt; module).&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Can be faster than Parsec (see &lt;a href=&#34;http://lpaste.net/157877&#34;&gt;this benchmark&lt;/a&gt; where parsing a simple config file is twice as fast with ReadP).&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Has a function for using the &lt;code&gt;Read&lt;/code&gt; instance as a parser (i.e. 
&lt;code&gt;readS_to_P reads&lt;/code&gt;).&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;p&gt;&lt;li&gt;Doesn&#39;t give any error messages whatsoever.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Non-determinism everywhere can be annoying if you don&#39;t need it (for instance, it&#39;s non-trivial to write a greedy &lt;code&gt;many&lt;/code&gt;).&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Can&#39;t parse &lt;code&gt;Text&lt;/code&gt;.&lt;/li&gt;&lt;/p&gt;&lt;p&gt;&lt;li&gt;Doesn&#39;t provide any advanced features like monad transformers or state.&lt;/li&gt;&lt;/p&gt;&lt;/ul&gt;&lt;h2&gt;Notes&lt;/h2&gt;&lt;h1&gt;&lt;span id=&#34;item-notes-fn7l37g7-imports&#34;&gt;&lt;/span&gt;Imports&lt;/h1&gt;&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.ParserCombinators.ReadP&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or if you have &lt;code&gt;Control.Applicative&lt;/code&gt; imported, write it like this to avoid clashes:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Control.Applicative&lt;/span&gt;
&lt;span class=&#34;kw&#34;&gt;import &lt;/span&gt;&lt;span class=&#34;dt&#34;&gt;Text.ParserCombinators.ReadP&lt;/span&gt; &lt;span class=&#34;kw&#34;&gt;hiding&lt;/span&gt; (many, optional)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1&gt;&lt;span id=&#34;item-notes-fn7l37g7-usage&#34;&gt;&lt;/span&gt;Usage&lt;/h1&gt;&lt;p&gt;It might not be obvious how to run a parser, so:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;co&#34;&gt;-- Return all possible parses&lt;/span&gt;
&lt;span class=&#34;ot&#34;&gt;parse ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;ReadP&lt;/span&gt; a &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;String&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; [(a, &lt;span class=&#34;dt&#34;&gt;String&lt;/span&gt;)]
parse &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; readP_to_S

&lt;span class=&#34;co&#34;&gt;-- Return all possible parses, and additionally require all input to be consumed&lt;/span&gt;
&lt;span class=&#34;ot&#34;&gt;parseAll ::&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;ReadP&lt;/span&gt; a &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;dt&#34;&gt;String&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;-&amp;gt;&lt;/span&gt; [a]
parseAll p &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; map fst &lt;span class=&#34;fu&#34;&gt;.&lt;/span&gt; readP_to_S (p &lt;span class=&#34;fu&#34;&gt;&amp;lt;*&lt;/span&gt; eof)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
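&lt;p&gt;As a quick illustration of running a parser this way, here is a sketch with a made-up &lt;code&gt;keyValue&lt;/code&gt; parser (the parser and its input format are assumptions for the example only):&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;import Data.Char (isAlpha, isDigit)
import Text.ParserCombinators.ReadP

-- Return all parses that consume the whole input
parseAll :: ReadP a -&amp;gt; String -&amp;gt; [a]
parseAll p = map fst . readP_to_S (p &amp;lt;* eof)

-- Hypothetical parser for lines like &amp;quot;port=8080&amp;quot;
keyValue :: ReadP (String, Int)
keyValue = do
  key &amp;lt;- munch1 isAlpha
  _ &amp;lt;- char &#39;=&#39;
  value &amp;lt;- munch1 isDigit
  return (key, read value)

main :: IO ()
main = do
  print (parseAll keyValue &amp;quot;port=8080&amp;quot;)  -- prints [(&amp;quot;port&amp;quot;,8080)]
  print (parseAll keyValue &amp;quot;port=&amp;quot;)      -- prints [] (no parse, and no error message)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;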
&lt;h1&gt;&lt;span id=&#34;item-notes-fn7l37g7-gotchas&#34;&gt;&lt;/span&gt;Gotchas&lt;/h1&gt;&lt;p&gt;&lt;code&gt;many&lt;/code&gt; is rather inefficient because it&#39;s nondeterministic and so all alternatives (0 elements consumed, 1 element consumed, 2 elements consumed, etc) have to be considered:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;p &lt;span class=&#34;fu&#34;&gt;=&lt;/span&gt; (,) &lt;span class=&#34;fu&#34;&gt;&amp;lt;$&amp;gt;&lt;/span&gt; many (satisfy isLetter)
        &lt;span class=&#34;fu&#34;&gt;&amp;lt;*&amp;gt;&lt;/span&gt; many (satisfy isUpper)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode repl&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;&lt;span class=&#34;fu&#34;&gt;&amp;gt;&lt;/span&gt; map fst &lt;span class=&#34;fu&#34;&gt;$&lt;/span&gt; readP_to_S (p &lt;span class=&#34;fu&#34;&gt;&amp;lt;*&lt;/span&gt; eof) &lt;span class=&#34;st&#34;&gt;&amp;quot;abcXYZ&amp;quot;&lt;/span&gt;
[(&lt;span class=&#34;st&#34;&gt;&amp;quot;abc&amp;quot;&lt;/span&gt;,&lt;span class=&#34;st&#34;&gt;&amp;quot;XYZ&amp;quot;&lt;/span&gt;),
 (&lt;span class=&#34;st&#34;&gt;&amp;quot;abcX&amp;quot;&lt;/span&gt;,&lt;span class=&#34;st&#34;&gt;&amp;quot;YZ&amp;quot;&lt;/span&gt;),
 (&lt;span class=&#34;st&#34;&gt;&amp;quot;abcXY&amp;quot;&lt;/span&gt;,&lt;span class=&#34;st&#34;&gt;&amp;quot;Z&amp;quot;&lt;/span&gt;),
 (&lt;span class=&#34;st&#34;&gt;&amp;quot;abcXYZ&amp;quot;&lt;/span&gt;,&lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt;)]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you know you don&#39;t need this, use &lt;code&gt;munch&lt;/code&gt; or &lt;code&gt;munch1&lt;/code&gt;; they are much faster. (Also note that in this example you&#39;ll get &lt;code&gt;(&amp;quot;abcXYZ&amp;quot;,&amp;quot;&amp;quot;)&lt;/code&gt;, not &lt;code&gt;(&amp;quot;abc&amp;quot;,&amp;quot;XYZ&amp;quot;)&lt;/code&gt;, if you use &lt;code&gt;munch&lt;/code&gt;, but this is to be expected – just like with &lt;code&gt;many&lt;/code&gt; in Parsec.)&lt;/p&gt;
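&lt;p&gt;For comparison, a sketch of the same parser written with &lt;code&gt;munch&lt;/code&gt; (the name &lt;code&gt;p&#39;&lt;/code&gt; is only used here). Since &lt;code&gt;munch&lt;/code&gt; always takes the longest possible run of matching characters, only one result is produced:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode&#34;&gt;&lt;code class=&#34;sourceCode&#34;&gt;import Data.Char (isLetter, isUpper)
import Text.ParserCombinators.ReadP

-- Greedy version: the first munch takes every letter it can
p&#39; :: ReadP (String, String)
p&#39; = (,) &amp;lt;$&amp;gt; munch isLetter
         &amp;lt;*&amp;gt; munch isUpper

main :: IO ()
main = print (map fst (readP_to_S (p&#39; &amp;lt;* eof) &amp;quot;abcXYZ&amp;quot;))
-- prints [(&amp;quot;abcXYZ&amp;quot;,&amp;quot;&amp;quot;)]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;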
&lt;p&gt;Other combinators are non-deterministic too, which can sometimes lead to &lt;a href=&#34;http://stackoverflow.com/a/22589396&#34;&gt;unexpected issues&lt;/a&gt;.&lt;/p&gt;
</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-fn7l37g7"/></entry><entry><id>ec1898fg</id><title xmlns:ns="http://www.w3.org/2005/Atom" ns:type="text">alex/happy</title><updated>2016-03-10T11:06:40Z</updated><content xmlns:ns="http://www.w3.org/2005/Atom" ns:type="html">&lt;h1&gt;  &lt;span class=&#34;item-name&#34;&gt;alex/happy&lt;/span&gt;

&lt;/h1&gt;&lt;h2&gt;Pros&lt;/h2&gt;&lt;ul&gt;&lt;/ul&gt;&lt;h2&gt;Cons&lt;/h2&gt;&lt;ul&gt;&lt;/ul&gt;</content><link xmlns:ns="http://www.w3.org/2005/Atom" ns:href="https://guide.aelve.com/haskell/parsing-lnwybqv9#item-ec1898fg"/></entry></feed>