JavaScript – Regular Expressions

Regular Expression

Regular expressions are used to form patterns that can be matched against strings. For example, suppose a text string needs to be validated to check whether it is an email address. Comparing it against fixed strings cannot do this accurately, so regular expressions are used instead. A regular expression can be created with a RegExp literal (or with the RegExp constructor). In the literal form, the pattern is enclosed in forward slashes ( / ) instead of quotation marks and assigned to a variable. For example:

var varName = /pattern/flags;

Here pattern is the regular expression and flags is optional (for example, i for case-insensitive matching or g for global matching).
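As a small sketch of the literal syntax, the following creates one pattern without a flag and one with the i flag. The variable names and sample string are made up for illustration, and the RegExp constructor line is simply an alternative way of writing the same pattern.

```javascript
// Literal form: the pattern sits between forward slashes,
// and any flags follow the closing slash.
var plain = /abc/;
var ignoreCase = /abc/i;

// Constructor form: pattern and flags are passed as strings.
var built = new RegExp("abc", "i");

var plainResult = plain.test("xxABCxx");       // false: case matters
var ignoreResult = ignoreCase.test("xxABCxx"); // true: i flag ignores case
var builtResult = built.test("xxABCxx");       // true: same as /abc/i

console.log(plainResult, ignoreResult, builtResult);
```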

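The match() method matches a string against a regular expression and returns the matches found, or null if there are none.

The replace() method searches a string for a match of the regular expression and returns a new string in which the match is replaced.

The search() method searches a string for a match of the regular expression and returns the position of the first match.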
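The three String methods can be tried side by side. This is only an illustrative sketch; the sample text and variable names are assumptions chosen for the demonstration.

```javascript
var text = "cat bat rat";

// match() without the g flag returns details of the first match only.
var first = text.match(/at/);
// match() with the g flag returns every match as an array of strings.
var all = text.match(/at/g);
// replace() returns a new string; the original string is left unchanged.
var replaced = text.replace(/bat/, "dog");
// search() returns the index of the first match.
var where = text.search(/rat/);

console.log(first[0], first.index); // "at" 1
console.log(all.length);            // 3
console.log(replaced);              // "cat dog rat"
console.log(where);                 // 8
```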

 

Pattern matching can be performed using methods of the RegExp object and of the String object. For example:

var str = "abcxyz";

var position = str.search(/cx/);

In the above code, the variable str is declared and then searched for the regular expression /cx/. The variable position will have the value 2. The search() method returns the index of the first match of the regular expression, or -1 if no match is found.
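The example above uses a String method. As a hedged sketch of the RegExp side, the same check can be written with the RegExp methods test() and exec(); the variable names here are illustrative only.

```javascript
var sample = "abcxyz";
var pattern = /cx/;

// String method: index of the first match, or -1 when nothing matches.
var found = sample.search(pattern);
var missing = sample.search(/zz/);

// RegExp methods: test() gives a boolean,
// exec() gives match details (or null when nothing matches).
var hasMatch = pattern.test(sample);
var details = pattern.exec(sample);

console.log(found);                     // 2
console.log(missing);                   // -1
console.log(hasMatch);                  // true
console.log(details[0], details.index); // "cx" 2
```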

 

The regular expression consists of normal characters and meta characters. The meta characters used are:

\  |  (  )  [  ]  {  }  ^  $  *  +  ?  .

 

The meta character “.” period is used to match any character except newline. For example:

/prog./

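will match progr, progs and proga, but will not match prog on its own, because the period requires one more character. Since the pattern is not anchored, it is also found inside a longer string such as aprogram.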
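A quick way to check the behaviour of the period is to test a few strings. The strings below are arbitrary samples chosen for this sketch.

```javascript
var dot = /prog./;

var withLetter = dot.test("progr");    // true: "r" fills the period
var tooShort = dot.test("prog");       // false: nothing follows "prog"
var withNewline = dot.test("prog\n");  // false: the period does not match a newline

console.log(withLetter, tooShort, withNewline);
```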

 

The meta character “\” (backslash) is used to escape a meta character so that it is matched literally. For example:

/3.4/ will match 374, 304, 314

/3\.4/ will match 3.4
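The difference between the escaped and unescaped period can be sketched as follows; the variable names are assumptions made for illustration.

```javascript
var anyChar = /3.4/;      // period matches any single character
var literalDot = /3\.4/;  // escaped period matches only "."

var a = anyChar.test("374");     // true
var b = anyChar.test("3.4");     // true: "." is also "any character"
var c = literalDot.test("374");  // false
var d = literalDot.test("3.4");  // true

console.log(a, b, c, d);
```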

 

The meta characters “[ ]” are used to match a class of characters. For example:

[abc] will match any one of the characters a, b or c.

[a-h] will match any one lowercase character from a to h.

 

The meta character “^” is used as negation when it is the first character inside “[ ]”. For example:

[^abc] will match any character except a, b or c (it inverts the set).
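Character classes and their negation can be checked with a few sample strings. This is a minimal sketch and the test strings are made up.

```javascript
var inSet = /[abc]/.test("xbz");       // true: "b" is in the set
var inRange = /[a-h]/.test("g");       // true: "g" lies between a and h
var upperCase = /[a-h]/.test("H");     // false: the range is lowercase only
var onlyAbc = /[^abc]/.test("abc");    // false: every character is a, b or c
var hasOther = /[^abc]/.test("abcd");  // true: "d" is outside the set

console.log(inSet, inRange, upperCase, onlyAbc, hasOther);
```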

 

There are some predefined character classes:

\d      [0-9] a digit

\D     [^0-9] not a digit

\w     [A-Za-z0-9_] an alphanumeric character or underscore

\s      [\r\t\n\f ] a white space character

\S     [^\r\t\n\f ] not a white space character

 

For example:

/\d\d\d/  matches three digits

/\d\.\d\d/ matches a digit followed by “.” followed by 2 digits

/\w\w\w\w/ matches any 4 consecutive word characters
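The predefined classes can be exercised in the same way. The sample strings in this sketch are illustrative assumptions.

```javascript
var threeDigits = /\d\d\d/.test("a123");     // true: "123"
var twoDigits = /\d\d\d/.test("a12b");       // false: only two digits in a row
var decimalOk = /\d\.\d\d/.test("3.14");     // true
var decimalBad = /\d\.\d\d/.test("3x14");    // false: "x" is not a literal period
var fourWord = /\w\w\w\w/.test("ab12");      // true
var brokenWord = /\w\w\w\w/.test("a-b1");    // false: "-" breaks the run
var hasSpace = /\s/.test("a b");             // true
var allSpace = /\S/.test("  ");              // false: only white space present

console.log(threeDigits, twoDigits, decimalOk, decimalBad);
console.log(fourWord, brokenWord, hasSpace, allSpace);
```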

 

The meta characters “{  }” are used to match a repeated part of a pattern. The number inside the braces says how many times the preceding item must repeat. For example:

/xy{4}z/ matches xyyyyz
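A brief sketch shows that the count in braces is exact: three or five repetitions of y do not match. The variable names are chosen for illustration.

```javascript
var fourYs = /xy{4}z/;

var exact = fourYs.test("xyyyyz");    // true: exactly four y characters
var tooFew = fourYs.test("xyyyz");    // false: only three
var tooMany = fourYs.test("xyyyyyz"); // false: five

console.log(exact, tooFew, tooMany);
```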

 

There are three symbolic quantifiers:

*        used to match zero or more repetitions.

+        used to match one or more repetitions.

?        used to match zero or one repetition.

 

For example:

/\d+\.\d*/           one or more digits followed by a period followed by zero or more digits

/[A-Za-z]\w*/        a letter followed by zero or more word characters (letters, digits or underscore).
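The quantifiers can be sketched with the two patterns above plus one for “?”. The /colou?r/ pattern and all sample strings are assumptions added for illustration, and the letter pattern is written here as /[A-Za-z]\w*/.

```javascript
var decimal = /\d+\.\d*/;
var name = /[A-Za-z]\w*/;
var optionalU = /colou?r/;

var full = "12.5".match(decimal)[0];       // "12.5"
var noFraction = "7.".match(decimal)[0];   // "7.": * allows zero digits
var noLeading = decimal.test(".5");        // false: + needs at least one digit
var word = "9lives".match(name)[0];        // "lives": match starts at the first letter
var american = optionalU.test("color");    // true: zero "u"
var british = optionalU.test("colour");    // true: one "u"
var doubled = optionalU.test("colouur");   // false: two "u"

console.log(full, noFraction, noLeading, word);
console.log(american, british, doubled);
```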

 

The meta character “$” is used to match the end of the string. For example:

/cse$/        “This is cse” matches

“This is cse department” does not match
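The two cases above can be verified directly. This is a minimal sketch using the same sample sentences.

```javascript
var endsWithCse = /cse$/;

var atEnd = endsWithCse.test("This is cse");               // true
var notAtEnd = endsWithCse.test("This is cse department"); // false

console.log(atEnd, notAtEnd);
```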

 

More Examples:

/\d{3}-\d{4}$/      matches a string that ends with three digits, a hyphen and four digits, such as 123-4567
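As a final sketch, the pattern can be tested against a few inputs. Note that only the end of the string is anchored, so leading text is accepted; the sample inputs are made up for illustration.

```javascript
var phoneEnd = /\d{3}-\d{4}$/;

var exactNumber = phoneEnd.test("123-4567");      // true
var withPrefix = phoneEnd.test("call 123-4567");  // true: only the end is anchored
var extraDigit = phoneEnd.test("123-45678");      // false: five digits after the hyphen
var trailing = phoneEnd.test("123-4567 ext");     // false: text follows the number

console.log(exactNumber, withPrefix, extraDigit, trailing);
```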