This is part of a series of posts documenting Sprache.

This post covers a few miscellaneous helpers: Token, Contained, Identifier and LineTerminator.

Token

Token discards any whitespace before and after the element being parsed. Whitespace is identified using char.IsWhiteSpace, so spaces, tabs and newlines are all skipped.

  • Parser<T> Token<T>(this Parser<T> parser)

It is extremely useful when parsing things like expressions, where elements could be surrounded by any amount of whitespace, for example:

Parser<int> expression =
      from left in Parse.Number.Token()
      from plus in Parse.Char('+').Token()
      from right in Parse.Number.Token()
      select int.Parse(left) + int.Parse(right);

Assert.Equal(4, expression.Parse("2 + 2"));
Assert.Equal(4, expression.Parse(" 2 + 2"));
Assert.Equal(4, expression.Parse("\n2\n  +   \n 2 \n "));
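To make the behaviour above more concrete, here is a minimal sketch of a parser that behaves roughly like Token, built from Parse.WhiteSpace and Many. The helper name SkipWhiteSpaceAround is hypothetical and exists only for illustration; the library's own implementation may differ in detail.

```csharp
using Sprache;
using Xunit;

// Hypothetical helper (not part of Sprache): skips optional whitespace on
// both sides of `parser` and keeps only the parsed item.
static Parser<T> SkipWhiteSpaceAround<T>(Parser<T> parser) =>
    from leading in Parse.WhiteSpace.Many()
    from item in parser
    from trailing in Parse.WhiteSpace.Many()
    select item;

Parser<string> number = SkipWhiteSpaceAround(Parse.Number);

// Both parsers accept the same padded input and return just the digits
Assert.Equal("42", number.Parse("  42 \n"));
Assert.Equal("42", Parse.Number.Token().Parse("  42 \n"));
```

Because the whitespace parsers use Many, zero whitespace characters is also fine, which is why input without any padding still parses.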

Contained

Contained parses an element that sits between an opening and a closing parser. Only the element itself is returned; the results of open and close are discarded.

  • Parser<T> Contained<T, U, V>(this Parser<T> parser, Parser<U> open, Parser<V> close)

The following example shows parsing elements surrounded by brackets:

Parser<string> parser = Parse.Letter.Many().Text().Contained(Parse.Char('('), Parse.Char(')'));

Assert.Equal("foo", parser.Parse("(foo)"));
// Empty elements are allowed
Assert.Equal("", parser.Parse("()"));

// Unexpected end of input reached; expected )
Assert.Throws<ParseException>(() => parser.Parse("(foo"));
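Contained does not skip whitespace by itself, so it is often combined with Token. The following is a small sketch of that combination, reusing the bracket example; the variable names word and bracketed are only illustrative.

```csharp
using Sprache;
using Xunit;

// Token() on each piece allows whitespace around the brackets and the word
Parser<string> word = Parse.Letter.Many().Text().Token();
Parser<string> bracketed = word.Contained(Parse.Char('(').Token(), Parse.Char(')').Token());

Assert.Equal("foo", bracketed.Parse("( foo )"));
Assert.Equal("foo", bracketed.Parse("  (foo)  "));
// Whitespace-only content still yields an empty element
Assert.Equal("", bracketed.Parse("( )"));
```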

Identifier

Parser for identifiers starting with firstLetterParser and continuing with tailLetterParser.

  • Parser<string> Identifier(Parser<char> firstLetterParser, Parser<char> tailLetterParser)

It's common for identifiers (like variable or function names) to have extra restrictions on the first character. For example, 2d is not a valid identifier in C# (as it starts with a number), but _2d is. Identifier is a helper for parsing such identifiers. It works by combining two other parsers: one for the first character, and one for the rest of the identifier. For example:

Parser<string> identifier = Parse.Identifier(Parse.Letter, Parse.LetterOrDigit);

Assert.Equal("d1", identifier.Parse("d1"));

// unexpected '1'; expected letter
Assert.Throws<ParseException>(() => identifier.Parse("1d"));
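The _2d case mentioned earlier can be handled by widening both character parsers with Or. This is a simplified sketch that allows letters, digits and underscores only; it does not attempt to cover the full C# identifier rules.

```csharp
using Sprache;
using Xunit;

// First character: a letter or an underscore
Parser<char> first = Parse.Letter.Or(Parse.Char('_'));
// Remaining characters: letters, digits or underscores
Parser<char> tail = Parse.LetterOrDigit.Or(Parse.Char('_'));

Parser<string> identifier = Parse.Identifier(first, tail);

Assert.Equal("_2d", identifier.Parse("_2d"));
Assert.Equal("snake_case1", identifier.Parse("snake_case1"));

// Still rejected, because the first character is a digit
Assert.Throws<ParseException>(() => identifier.Parse("2d"));
```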

LineTerminator

LineTerminator parses a line ending (\n or \r\n) or the end of input. At the end of input it succeeds and returns an empty string.

Parser<string> parser = Parse.LineTerminator;

Assert.Equal("", parser.Parse(""));
Assert.Equal("\n", parser.Parse("\n foo"));
Assert.Equal("\r\n", parser.Parse("\r\n foo"));
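One place where LineTerminator is handy is reading a single line of text, where the last line of the input may not end with a newline. The sketch below assumes Parse.LineEnd and the Except combinator are available in the Sprache version in use; the parser name line is only illustrative.

```csharp
using Sprache;
using Xunit;

// Take every character up to a line ending, then consume the line ending
// (or accept the end of input) and return only the text.
Parser<string> line =
    from text in Parse.AnyChar.Except(Parse.LineEnd).Many().Text()
    from end in Parse.LineTerminator
    select text;

Assert.Equal("first", line.Parse("first\nsecond"));
Assert.Equal("first", line.Parse("first\r\nsecond"));
// No trailing newline: the end of input terminates the line
Assert.Equal("last", line.Parse("last"));
```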