Pattern matching

Pattern matching

Pattern matching is a class of problems which finds a sequence of tokens (often described in a grammar) in another sequence of tokens. It differs from pattern recognition in a manner that it looks for exact match.

Finding a sub-string in a string is a typical example of this problem. Native pattern matching algorithms offer complexity of O(m * (n-m+1)) where m is the length of the pattern and the n is the length of the string.

Here are few algorithms which yield better complexity.

  • Suffix tree / array – It’s variant of trie. Idea is to build a representation containing all suffixes of the given string. Once built, pattern search can be done on it. Read more on suffix tree/array.
  • Finite automata – It’s a simple state machine used to match patterns in the given input string. It’s usually represented as state diagram. It has finite set of states with a special start state and and a set of accepting states. The arc between states represent the transitions. RegEx is often implemented using automata. Learn more about automata. GeekforGeek has a good write-up on this.

One thought on “Pattern matching

Leave a Reply

Your email address will not be published. Required fields are marked *