Regular Expressions provide a means for identifying strings of text such as particular characters, words or patterns of characters.
In Model Review, common uses for regular expressions include checking adherence to naming conventions and searching annotations within the project model for very specific information.
Regular Expressions employ special characters to provide more flexibility in defining matches. The "special characters" are:
+ * ? . [ ] ^ ( ) | \
The following sections describe how to use each of the special characters:
A period (".") will match any one character.
Expression | Meaning | Matches | Does Not Match |
390-. | Match the string "390-" followed by any one character. | 390-A, 390-1, 390-- | 390-A1, 1390-1 |
Revision . Released | Match the string "Revision " followed by any one character and then the string " Released". | Revision A Released, Revision 1 Released, Revision # Released | Revision A1 Released, RevisionAReleased |
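The period examples above can be tried out with a short script. This is a minimal sketch in Python, whose `re` module stands in for Model Review's own matching engine (an assumption; the two may differ in details). The helper name `matches` is hypothetical, and matching against the whole value is inferred from the Does Not Match column rather than stated explicitly.

```python
import re

def matches(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern.

    Assumption: the expression must cover the entire value,
    as the Does Not Match column suggests.
    """
    return re.fullmatch(pattern, value) is not None

# "." stands for exactly one character of any kind.
part_pattern = r"390-."
for value in ["390-A", "390-1", "390--", "390-A1", "1390-1"]:
    print(value, matches(part_pattern, value))

# The same idea inside a longer string.
revision_pattern = r"Revision . Released"
for value in ["Revision A Released", "Revision A1 Released", "RevisionAReleased"]:
    print(value, matches(revision_pattern, value))
```

Only the values with exactly one character in the position of the period report True.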
Square Brackets ("[ ]") define a character class, which matches a single character that is one of the characters inside the brackets. Inside the brackets, the special characters lose their special meaning, except for "^", which, when used as the first character inside the brackets, means NOT matching the specified characters.
Ranges such as a-z or 0-9 can also be used inside the square brackets.
Expression | Meaning | Matches | Does Not Match |
[akm] | One character: either a, k, or m. | a, k, m | Akm, ak, G |
[a-z] | Any single letter. | A, b, c, d | 1, 2, -, # |
[^akm] | One character, as long as it is NOT a, k, or m. | C, f, G | A, k, m, Am (because it is two characters) |
[0-9] | Any single digit. | 0, 4, 7 | A, #, z |
[a-z][a-z] | Any two letters. | AB, BC, DE | A (only one letter), A1, 12 |
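The character class rows above can be checked in bulk. The sketch below uses Python's `re` module as a stand-in for Model Review (an assumption), with a hypothetical `matches` helper. Case-insensitive, whole-value matching is also an assumption, inferred from the table listing "A" as a match for [a-z].

```python
import re

def matches(pattern: str, value: str) -> bool:
    # Assumption: whole-value, case-insensitive matching, inferred from
    # the table (for example, "A" is listed as a match for [a-z]).
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None

# Each expression is paired with a few sample values from the table.
cases = {
    r"[akm]": ["a", "k", "ak", "G"],
    r"[a-z]": ["A", "b", "1", "#"],
    r"[^akm]": ["C", "f", "A", "k"],
    r"[0-9]": ["0", "7", "z"],
    r"[a-z][a-z]": ["AB", "A", "A1", "12"],
}

# Map every expression to {value: matched?} for its sample values.
results = {
    pattern: {value: matches(pattern, value) for value in values}
    for pattern, values in cases.items()
}

for pattern, outcome in results.items():
    print(pattern, outcome)
```

Note that each pair of brackets consumes exactly one character, so a two-letter value needs two classes in a row.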
An Asterisk ("*") follows an expression and indicates that the preceding expression can occur zero or more times.
Expression | Meaning | Matches | Does Not Match |
Ab*c | "A" followed by zero or more b's, with a "c" on the end. | Ac, Abc, Abbbbbbbc | Bbb, Abcd |
[a-z]* | Zero or more letters (that is, only letters, including none at all). | A, Bob, AAAAA, Steel, <blank> (because * can indicate zero occurrences) | STEEL230, 12, AA-## |
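To see the "zero or more" behavior in action, the sketch below filters sample values from the table above. It assumes Python's `re` module behaves closely enough to Model Review for these expressions, and that matching is whole-value and case-insensitive (inferred from the tables). `matches` is a hypothetical helper name.

```python
import re

def matches(pattern: str, value: str) -> bool:
    # Assumption: whole-value, case-insensitive matching (inferred from the tables).
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None

# b* allows any count of "b", including none.
abc_values = ["Ac", "Abc", "Abbbbbbbc", "Bbb", "Abcd"]
abc_hits = [v for v in abc_values if matches(r"Ab*c", v)]

# [a-z]* accepts letters only, and also the empty (blank) value.
letter_values = ["A", "Bob", "", "STEEL230", "12", "AA-##"]
letter_hits = [v for v in letter_values if matches(r"[a-z]*", v)]

print(abc_hits)
print(letter_hits)
```

The blank value appears among the matches for [a-z]* because the asterisk permits zero occurrences.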
A Plus Sign ("+") follows an expression and indicates that the preceding expression can occur one or more times.
Expression | Meaning | Matches | Does Not Match |
Ab+c | "A" followed by one or more b's, with a "c" on the end. | Abc, Abbbbbbbc | Ac, Bbb, Abcd |
[a-z]+ | One or more letters (that is, only letters, and at least one). | Bob, AAAAA, Steel | STEEL230, 12, AA-##, <blank> |
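The practical difference between "*" and "+" is easiest to see by running both against the same values. This is a sketch only: Python's `re` module is used as a stand-in for Model Review, and the whole-value, case-insensitive matching in the hypothetical `matches` helper is an assumption drawn from the tables.

```python
import re

def matches(pattern: str, value: str) -> bool:
    # Assumption: whole-value, case-insensitive matching (inferred from the tables).
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None

values = ["Ac", "Abc", "Abbbbbbbc", ""]

star = {v: matches(r"Ab*c", v) for v in values}   # zero or more b's
plus = {v: matches(r"Ab+c", v) for v in values}   # one or more b's

# Values accepted by "*" but rejected by "+" are those with zero b's.
only_star = [v for v in values if star[v] and not plus[v]]
print(only_star)

# The same contrast for a blank value and a letters-only class.
blank_ok_with_star = matches(r"[a-z]*", "")
blank_ok_with_plus = matches(r"[a-z]+", "")
print(blank_ok_with_star, blank_ok_with_plus)
```

Choosing "+" over "*" is therefore a way to reject blank or missing sections of a value.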
A Question Mark ("?") follows an expression to indicate that the preceding expression is optional (it can occur zero times or one time).
Expression | Meaning | Matches | Does Not Match |
Ab?c | "A" followed by an optional "b", with a "c" on the end. | Ac, Abc | Abbc, Abcd |
390-[a-z][a-z]? | "390-" followed by a letter and an optional second letter. | 390-A, 390-AB | 390-11, 390-, 390-ABC |
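The optional-character rows above can be exercised as follows. This is a minimal sketch under the same assumptions as the tables suggest: Python's `re` module as a stand-in for Model Review, whole-value matching, and case-insensitivity. The function name `has_valid_suffix` is hypothetical.

```python
import re

# "390-" followed by one letter and an optional second letter.
SUFFIX_PATTERN = re.compile(r"390-[a-z][a-z]?", re.IGNORECASE)

def has_valid_suffix(value: str) -> bool:
    # Assumption: the expression must match the whole value.
    return SUFFIX_PATTERN.fullmatch(value) is not None

checked = {
    value: has_valid_suffix(value)
    for value in ["390-A", "390-AB", "390-11", "390-", "390-ABC"]
}
print(checked)
```

Because "?" allows at most one occurrence, a third letter is rejected just as a missing first letter is.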
The pipe ("|") character operates as an OR between two expressions (usually enclosed in parentheses).
Expression | Meaning | Matches | Does Not Match |
(390|241)-[a-z]+ | Either "390" or "241", followed by a "-" and one or more letters. | 390-A, 241-A, 241-AB | 200-A, 241, 241- |
As per (MS2377|CS123) | "As per " followed by either "MS2377" or "CS123". | As per MS2377, As per CS123 | As per, As per MS3222 |
390-([abc]|[123]) | "390-" followed by an "a", "b", or "c" OR a "1", "2", or "3". | 390-A, 390-3 | 390-F, 390- |
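The sketch below checks the alternation rows above and also shows why the alternatives are usually enclosed in parentheses. It relies on Python's `re` module, where "|" splits the entire expression unless it is grouped; whether Model Review treats an ungrouped "|" the same way is an assumption. The `matches` helper is hypothetical, as is its whole-value, case-insensitive behavior.

```python
import re

def matches(pattern: str, value: str) -> bool:
    # Assumption: whole-value, case-insensitive matching (inferred from the tables).
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None

grouped = r"(390|241)-[a-z]+"    # "390" or "241", then "-" and letters
ungrouped = r"390|241-[a-z]+"    # "390" alone, OR "241-" and letters

values = ["390-A", "241-AB", "200-A", "241", "390"]
grouped_hits = [v for v in values if matches(grouped, v)]
ungrouped_hits = [v for v in values if matches(ungrouped, v)]

print(grouped_hits)
print(ungrouped_hits)

# Character classes can be alternatives too.
class_hits = [v for v in ["390-A", "390-3", "390-F", "390-"]
              if matches(r"390-([abc]|[123])", v)]
print(class_hits)
```

Without the parentheses, "390-A" is no longer accepted in this sketch, while a bare "390" is, which is rarely the intent.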
If it is necessary to actually match against a character that is a "special character," a backslash in front of the special character tells Model Review that the character should be taken literally (rather than as the special character).
Expression | Meaning | Matches | Does Not Match |
[0-9]\+ | A single digit followed by a "+". | 1+, 2+ | 1, A, 1+1 |
What\? | "What" followed by a question mark. | What? | What's Up? |
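To illustrate what the backslash changes, the sketch below runs each expression from the table above with and without the escape. Python's `re` module is used as a stand-in for Model Review (an assumption), and the hypothetical `matches` helper assumes whole-value, case-insensitive matching.

```python
import re

def matches(pattern: str, value: str) -> bool:
    # Assumption: whole-value, case-insensitive matching (inferred from the tables).
    return re.fullmatch(pattern, value, re.IGNORECASE) is not None

# Escaped: "+" is a literal plus sign after a single digit.
escaped_plus = {v: matches(r"[0-9]\+", v) for v in ["1+", "2+", "1", "11", "1+1"]}

# Unescaped: "+" means "one or more digits", so no plus sign is expected.
unescaped_plus = {v: matches(r"[0-9]+", v) for v in ["1+", "2+", "1", "11", "1+1"]}

# Escaped "?" is a literal question mark; unescaped it makes the "t" optional.
escaped_question = {v: matches(r"What\?", v) for v in ["What?", "What", "Wha"]}
unescaped_question = {v: matches(r"What?", v) for v in ["What?", "What", "Wha"]}

print(escaped_plus)
print(unescaped_plus)
print(escaped_question)
print(unescaped_question)
```

Forgetting the backslash does not raise an error here; the expression simply means something different, which makes this mistake easy to miss.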
A common task in Model Review is to check that a value starts with or ends with a particular string. Because an expression is matched against the whole value (as the examples above show), this works differently from "search" style regular expressions, which look for a match anywhere in the text. The recommended approach is to add ".*" or ".+" to the back of the expression to indicate Starts With, or to the front to indicate Ends With.
Expression | Meaning | Matches | Does Not Match |
390-.* | Starts with "390-" (ends with anything, including nothing). | 390-1, 390-111 | 1390-1 |
390-.+ | Starts with "390-" (ends with anything, but at least one character). | 390-1, 390-111 | 1390-1, 390- |
.*-[a-z] | Ends with a "-" and a letter (starts with anything, including nothing). | Revision-A, Rev-A, -A | Revision-A1, Rev-1, 123, -1 |
.+-[a-z] | Ends with a "-" and a letter (starts with anything, but at least one character). | Revision-A, Rev-A | Revision-A1, Rev-1, 123, -1, -A |
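The contrast with "search" style matching can be sketched in Python, where `re.search` finds a match anywhere in the text and `re.fullmatch` requires the whole value to match. Treating `re.fullmatch` as the equivalent of Model Review's behavior is an assumption based on the tables above, as is the case-insensitive flag on the last expression.

```python
import re

values = ["390-1", "390-111", "390-", "1390-1"]

# Search style: "390-" may appear anywhere in the value.
search_hits = [v for v in values if re.search(r"390-", v)]

# Whole-value style: a trailing ".*" turns the expression into "starts with".
starts_with = [v for v in values if re.fullmatch(r"390-.*", v)]

# ".+" additionally requires at least one character after "390-".
starts_with_more = [v for v in values if re.fullmatch(r"390-.+", v)]

# A leading ".*" turns the expression into "ends with".
suffix_values = ["Revision-A", "-A", "Rev-1", "Revision-A1"]
ends_with = [v for v in suffix_values
             if re.fullmatch(r".*-[a-z]", v, re.IGNORECASE)]

print(search_hits)
print(starts_with)
print(starts_with_more)
print(ends_with)
```

The search-style check accepts "1390-1", which is exactly the value the Starts With expressions are meant to reject.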
Regular Expressions are a powerful (but somewhat complex) approach to matching text. To address more complex requirements, you may need to combine several of the expressions described above into a single expression.
Some examples of complex expressions:
Expression | Meaning | Matches | Does Not Match |
[0-9]+[-]?[0-9]+ | Numbers with an optional dash in the middle. | 123-45, 12345 | 12A32, 1232-A, A |
.*[^_] | Cannot end with an underscore (_). | 123324, PART1 | 12343_ |
(390|231)-[a-z0-9]+-[0-9]+ | Either 390 or 231, followed by a "-", an alphanumeric section of at least one character, then a "-" and at least one digit. | 390-mypart-1, 231-bracket-99 | 120-mypart-1, 380- - |
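A combined expression like the last row above can serve as a naming-convention check over a list of names. The sketch below is illustrative only: it uses Python's `re` module rather than Model Review itself, the names `PART_NAME` and `nonconforming` are hypothetical, and whole-value, case-insensitive matching is an assumption inferred from the tables.

```python
import re

# Naming rule taken from the last row of the table above.
PART_NAME = re.compile(r"(390|231)-[a-z0-9]+-[0-9]+", re.IGNORECASE)

def nonconforming(names):
    """Return the names that do not match the whole PART_NAME expression."""
    return [name for name in names if PART_NAME.fullmatch(name) is None]

names = [
    "390-mypart-1",    # conforms
    "231-bracket-99",  # conforms
    "120-mypart-1",    # prefix is neither 390 nor 231
    "390-mypart",      # missing the trailing "-" and number
    "231--7",          # middle section is empty
]
failures = nonconforming(names)
print(failures)
```

Reporting the failures rather than the matches mirrors how a naming check is typically used: the interesting values are the ones that break the convention.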