One great feature of LPeg is that it is binary-safe: unlike many regular expression engines, it can safely match data containing arbitrary bytes, including NUL. This makes it an excellent tool for parsing binary protocols, especially network communication protocols such as the Action Message Format (AMF), which Adobe Flash uses for remote calls and which also appears in FLV movie files. I'll leave it to you to explore the possibilities...
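To make "binary-safe" concrete, here is a minimal sketch. The five-byte record layout is made up purely for illustration (it is not AMF), and the names `data`, `record` and `payload` are my own:

```lua
local lpeg = require 'lpeg'

-- Hypothetical record: a NUL byte, a 0xFF marker, then three payload bytes.
local data = '\0\255abc'

-- lpeg.P with a string matches it literally, embedded NUL included;
-- lpeg.P(1) matches any single byte, and lpeg.C captures what was matched.
local record = lpeg.P'\0' * lpeg.P'\255' * lpeg.C(lpeg.P(1)^0)

local payload = record:match(data)
print(#data, payload)  --> 5   abc
```

The NUL byte is treated like any other byte, both in the pattern and in the subject string.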
Be warned: from here on, I assume that you know your way around Lua and LPeg, and how they work.
The problem
That being said, this article is actually about an unusual roadblock I hit while using LPeg to build a Lua-based AMF parser, and the various solutions I found and/or came up with to overcome it (you didn't think that I mentioned AMF before by accident, did you?).
The issue is LPeg's implementation of repetition: it can match (or capture) a minimum or a maximum number of occurrences of a pattern, but not a fixed number of them. That is perfect for stream-oriented parsing (such as parsing programming languages) but insufficient for binary data, where a field is usually declared to hold an exact number of items.
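A small sketch of why "at least X" cannot stand in for "exactly X": LPeg repetition is greedy and never gives back what it has consumed. The variable names here are mine:

```lua
local lpeg = require 'lpeg'

local cloth = lpeg.P'cloth'
local input = 'clothclothcloth'  -- three occurrences, 15 bytes

-- "At least 2" does not stop at 2: it consumes all three occurrences
-- and returns the position just past the last one.
print((cloth^2):match(input))      --> 16

-- Nothing is left for the pattern that follows, and the repetition
-- does not backtrack, so the whole match fails.
local two_then_one = cloth^2 * cloth
print(two_then_one:match(input))   --> nil
```

In a binary format, that means a count of 2 followed by more data of the same shape would be swallowed whole.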
Just to clarify, here's a short list of LPeg patterns that correspond to the typical PCRE repetition constructs (in each case we're trying to match the string 'cloth'):
No. | Matching occurrences of 'cloth' | PCRE pattern | LPeg pattern
1 | 0 or more (at least 0) | /(cloth)*/ | lpeg.P'cloth'^0
2 | 1 or more (at least 1) | /(cloth)+/ | lpeg.P'cloth'^1
3 | X or more (at least X) | /(cloth){X,}/ | lpeg.P'cloth'^X
4 | 1 or fewer (at most 1) | /(cloth)?/ | lpeg.P'cloth'^-1
5 | X or fewer (at most X) | /(cloth){0,X}/ | lpeg.P'cloth'^-X
6 | precisely X (no more, no fewer) | /(cloth){X}/ | -- not implemented --
7 | anywhere between X and Y | /(cloth){X,Y}/ | -- not implemented --
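As a quick sanity check of cases 1 to 5, the following sketch runs each LPeg pattern against a few inputs, with X taken to be 2. A successful match returns the position just past the matched text; a failed match returns nil. The variable names are mine:

```lua
local lpeg = require 'lpeg'

local cloth = lpeg.P'cloth'
local three = 'clothclothcloth'  -- three occurrences, 15 bytes

print((cloth^0):match(''))       --> 1    case 1: zero occurrences are fine
print((cloth^1):match(''))       --> nil  case 2: at least one is required
print((cloth^2):match('cloth'))  --> nil  case 3: one occurrence is too few
print((cloth^-1):match(three))   --> 6    case 4: stops after one
print((cloth^-2):match(three))   --> 11   case 5: stops after two
print((cloth^-2):match(''))      --> 1    case 5: zero still matches
```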
For cases 6 and 7, LPeg does not offer a simple construct, so we have to build a more complex one ourselves. But let's put case 7 aside for a while and try to tackle case 6 first, then we'll see...