# How to implement a regular expression matching

According to *Wikipedia*:

A

regular expression(shortened asregexorregexp; also referred to asrational expression) is a sequence of characters that define a search pattern. Usually such patterns are used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory.

To me, the regular expression is magical. It has many rules and symbols to represent a long string. In this post, I’ll introduce a simple regular expression matching rule and implementation.

## Problem description

Given an input string (`s`

) and a pattern (`p`

), implement regular expression matching with support for `'.'`

and `'*'`

where:

`'.'`

Matches any single character.`'*'`

Matches zero or more of the preceding element.

The matching should cover the **entire** input string (not partial).

**Example 1:**

**Input:** s = "aa", p = "a*"

**Output:** true

**Explanation:** '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

**Example 2:**

**Input:** s = "aab", p = "c*a*b"

**Output:** true

**Explanation:** c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".

**Example 3:**

**Input:** s = "mississippi", p = "mis*is*p*."

**Output:** false

Try to solve it here: https://leetcode.com/problems/regular-expression-matching/

**Solution**

For the given strings `s`

and `p`

, in order to know whether `p`

is a regular expression of `s`

, we need to check every character in both strings. And for the symbols in the string `p`

, we need to check whether those symbols are valid. Here we’ll use **dynamic programming** to keep track of each validation. And we will use a **matrix** to store the boolean result.

## Step 1: matrix initiation

If the length of the string `s`

is `n`

, and the length of the string `p`

is `m`

, we’ll create a matrix with a size `(m+1)*(n+1)`

. For the element in the `matrix[i][j]`

, it’s representing at the current index i and j, the boolean result of wether `p[:j]`

can represent `s[:i]`

. First, let’s create such a matrix with the input `s = “aab”, p = “c*a*b”`

.

After we create the matrix full of default boolean value False. We need to initialize the matrix, that is to fill out the values of `matrix[i][0]`

. Which means that whether `p[:i]`

can represent an empty string.

- case 1:
`p = “”, s = “”`

,`p`

can represent`s`

,`matrix[0][0] = True`

- case 2:
`p = “c”, s = “”`

,`p`

can not represent`s`

,`matrix[1][0] = False`

- case 3:
`p = “c*”, s = “”`

, since* can repeat preceding character 0 times,`p`

can represent`s`

,`matrix[2][0] = True`

- case 4:
`p = “c*a”, s = “”`

,`p`

can not represent`s`

,`matrix[3][0] = False`

- case 5:
`p = “c*a*”, s = “”`

, similar to case 3,`p`

can represent`s`

,`matrix[4][0] = True`

- case 6:
`p = “c*a*b”, s = “”`

,`p`

can not represent`s`

,`matrix[5][0] = False`

## Step 2: fill out the remaining index

For the remaining matrix[i][j], the boolean value depends on each character in string p. There are two cases we need to consider:

- if the current character is
`.`

,`.`

can represent any single character,`matrix[i][j]`

has the same value with`matrix[i-1][j-1]`

- if the current character is
`*`

, * can repeat the previous character 0 to n times,`matrix[i][j]`

depends on`matrix[i-1][j] or matrix[i][j-2]`

The last element of the matrix is the result we want.

**Implementation**

Following is the implementation in Ruby:

def is_match(s, p)

return false if !s || !p

column = [false] * (p.size + 1)

dp = [] (s.size + 1).times do

dp << Array.new(column)

end

dp[0][0] = true for i in 1..p.size do

if p[i - 1] == '*'

dp[0][i] = dp[0][i-2]

end

end

for i in 1..s.size do

for j in 1..p.size do

if p[j-1] == s[i-1] || p[j-1] == '.'

dp[i][j] = dp[i-1][j-1]

elsif p[j - 1] == '*'

if p[j - 2] == s[i - 1] || p[j - 2] == '.'

dp[i][j] = dp[i-1][j] || dp[i][j - 2]

else

dp[i][j] = dp[i][j - 2]

end

end

end

end

dp[s.size][p.size]

end

Thanks for reading!