Pattern Matching
Pattern matching is a way to test whether data has a particular structure. It can be used for data validation and as a means to extract the part of a data item that matches against a specified element of the pattern. The pattern matching operations are the query processor LIKE and UNLIKE keywords, the QMBasic MATCHES operator and MATCHFIELD() function, and the M search of ED. All of these compare a character string with a pattern template.
Pattern matching breaks the character set into three classes of character, each represented by a character type code: A (alphabetic characters), N (numeric characters, i.e. digits) and X (any character).
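There are also three ways to specify how many characters are present: 0 (any number of characters, including none), n (exactly n characters) and n-m (between n and m characters).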
The template consists of one or more concatenated items formed from pairs of lengths and character type (for example 0X, nA or n-mN), or from literal strings enclosed in quotes ("string").
The values n and m are integers with any number of digits. m must be greater than or equal to n.
The 0X code is a wildcard that matches against anything. It has a commonly used synonym: three dots (...).
The 0A, nA, 0N, nN and "string" patterns may be preceded by a tilde (~) to invert the match condition. For example, ~4N matches four non-numeric characters such as ABCD. It does not mean "any string that is not four numeric characters", so 12C4 does not match.
A null string matches patterns ..., 0A, 0X, 0N, their inverses (~0A, etc) and "".
The 0X and n-mX patterns match against as few characters as necessary before control passes to the next pattern. For example, the string ABC123DEF matched against the pattern 0X2N0X matches the pattern components as ABC, 12 and 3DEF.
The 0N, n-mN, 0A, and n-mA patterns match against as many characters as possible. For example, the string ABC123DEF matched against the pattern 0X2-3N0X matches the pattern components as ABC, 123 and DEF.
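The difference between the two behaviours can be reproduced with lazy and greedy regular expression quantifiers. The following is a Python sketch using hand-written regular expressions that are assumed to be equivalent to the two templates. It illustrates the matching order described above and is not how QM itself is implemented.

```python
import re

data = "ABC123DEF"

# Assumed equivalent of 0X2N0X: the X items are lazy (.*?), 2N is exactly two digits.
minimal = re.fullmatch(r"(.*?)([0-9]{2})(.*?)", data, re.DOTALL)

# Assumed equivalent of 0X2-3N0X: the N item is greedy and takes up to three digits.
maximal = re.fullmatch(r"(.*?)([0-9]{2,3})(.*?)", data, re.DOTALL)

print(minimal.groups())  # ('ABC', '12', '3DEF')
print(maximal.groups())  # ('ABC', '123', 'DEF')
```

In the first case the leading wildcard stops as soon as two digits follow it, leaving the third digit for the trailing wildcard. In the second case the numeric item consumes all three digits.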
The template string may contain alternative patterns separated by value marks. The source data will match the overall pattern if any of the pattern values match.
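As a rough illustration of multi-valued templates, the Python sketch below splits a template on the value mark and accepts the data if any alternative matches. It assumes the value mark is character 253, as is conventional in multivalue systems. The EQUIVALENT table and the matches_any name are hypothetical, and the table holds hand-translated regular expressions for just the two templates used here.

```python
import re

VM = "\xfd"  # value mark, assumed to be character 253

# Hypothetical lookup of hand-translated regular expressions for two templates.
EQUIVALENT = {
    "3N": r"[0-9]{3}",
    "2A3N": r"[A-Za-z]{2}[0-9]{3}",
}

def matches_any(data, template):
    """Return True if data matches any value-mark separated alternative."""
    return any(
        re.fullmatch(EQUIVALENT[alternative], data) is not None
        for alternative in template.split(VM)
    )

template = "3N" + VM + "2A3N"
print(matches_any("123", template))    # True, matches the first alternative
print(matches_any("AB123", template))  # True, matches the second alternative
print(matches_any("AB12", template))   # False, matches neither
```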
Examples
"A123BCD" would match successfully against the patterns 1A3N3A, 1A1-3N3A, 'A'1-3N3A, 0A0N0A, 1A...3A, 1A~3A3A and many more.
It is often acceptable to omit the quotes around literal components. The above example would also match A1-3N3A. There is no confusion between the leading A as a literal or as a character type because it is not preceded by a length value. It is, however, recommended that the quotes be included. Omitting the quotes in a pattern used in the MATCHFIELD() function may affect the function's behaviour, as each character of the literal will be counted as a separate component of the pattern.
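The template rules above can be approximated outside QM by translating a template into a regular expression. The following is a minimal Python sketch, not QM's implementation. The names ITEM, CLASSES, template_to_regex and matches are hypothetical, alphabetic is assumed to mean the ASCII letters, type codes are assumed to be upper case, and inverted literals (~"string") are left unsupported because their exact behaviour is not described here. Each template component becomes one capture group, so an unquoted literal produces one group per character.

```python
import re

# One alternative per kind of template item.
ITEM = re.compile(
    r"""
      (?P<dots>\.\.\.)                                          # three dots: synonym for 0X
    | (?P<inv>~?)(?P<min>\d+)(?:-(?P<max>\d+))?(?P<type>[ANX])  # length (n or n-m) plus type
    | (?P<bad>~["'])                                            # inverted literal: not handled
    | (?P<q>["'])(?P<lit>.*?)(?P=q)                             # quoted literal
    | (?P<bare>.)                                               # unquoted literal character
    """,
    re.VERBOSE | re.DOTALL,
)

CLASSES = {"A": "A-Za-z", "N": "0-9"}  # assumption: ASCII letters and digits only

def template_to_regex(template):
    """Translate a template into a compiled regex, one group per component."""
    parts = []
    for m in ITEM.finditer(template):
        if m["dots"]:
            parts.append("(.*?)")
        elif m["type"]:
            low, high = int(m["min"]), m["max"]
            if high is not None:
                if int(high) < low:
                    raise ValueError("m must be greater than or equal to n")
                count = "{%d,%d}" % (low, int(high))
            else:
                count = "*" if low == 0 else "{%d}" % low
            if m["type"] == "X":
                if m["inv"]:
                    raise ValueError("~ is not defined for X items")
                parts.append("(." + count + "?)")  # X items match as little as possible
            else:
                cls = ("[^%s]" if m["inv"] else "[%s]") % CLASSES[m["type"]]
                parts.append("(" + cls + count + ")")  # A and N items are greedy
        elif m["bad"]:
            raise NotImplementedError("inverted literals are not covered by this sketch")
        elif m["lit"] is not None:
            parts.append("(" + re.escape(m["lit"]) + ")")
        else:
            parts.append("(" + re.escape(m["bare"]) + ")")
    return re.compile("".join(parts), re.DOTALL)

def matches(data, template):
    """Return True if the whole of data matches the template."""
    return template_to_regex(template).fullmatch(data) is not None

examples = ["1A3N3A", "1A1-3N3A", "'A'1-3N3A", "0A0N0A", "1A...3A", "1A~3A3A", "A1-3N3A"]
print(all(matches("A123BCD", t) for t in examples))  # True
print(template_to_regex("'AB'1N").groups)            # 2 components
print(template_to_regex("AB1N").groups)              # 3 components
```

The last two lines show why quoting matters when components are counted: the quoted literal is a single component, while the unquoted form contributes one component per character.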
A program might need to test whether data entered by a user is a non-negative integer (whole number) value. The QMBasic NUM() function can be used to test for numeric data but this would allow fractional or negative values. Testing against a pattern of "1-4N" would allow only integer values in the range 0 to 9999. To remove the upper limit, a pattern of 1N0N tests for one digit followed by any number of further digits, including none.
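The same two checks can be sketched in Python with regular expressions assumed to be equivalent to the 1-4N and 1N0N templates. The function names are hypothetical and the code only illustrates the validation logic; in a QM program the test would be written with the MATCHES operator.

```python
import re

def is_integer_up_to_9999(text):
    """Assumed equivalent of 1-4N: one to four digits."""
    return re.fullmatch(r"[0-9]{1,4}", text) is not None

def is_non_negative_integer(text):
    """Assumed equivalent of 1N0N: one digit followed by any number of digits."""
    return re.fullmatch(r"[0-9][0-9]*", text) is not None

for entry in ["0", "9999", "12345", "-5", "3.14", ""]:
    print(repr(entry), is_integer_up_to_9999(entry), is_non_negative_integer(entry))
```

Only "12345" separates the two tests: it fails the four-digit limit but passes the unlimited form. Negative, fractional and empty entries fail both.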