Finding Duplicates with a Regular Expression

Published by Brandon on

Today we are going to take a look at how to find duplicates with a regular expression, using some examples. Let’s say we have a number, and we want to find a match where a digit is immediately followed by the same digit. For example:

11234567

In the number above, let’s write an expression that matches where there are two 1’s in a row. To do this, we will use what’s called backreferencing. A backreference matches the same text that an earlier group already matched. Sound confusing? Don’t worry, it will make more sense.

Here’s what we would write to match where there are two 1’s in a row.

(1)\1

Let’s talk about what’s going on in the regular expression above. There is a grouping in the parentheses with just a 1 in it, followed by a backreference. A backreference is a backslash followed by a number, and that number is the number of the grouping it refers to. We only have one grouping, so it is grouping 1. If there were multiple groupings, like this,

(1)(2)

then the grouping with (1) would be grouping number 1, and (2) would be grouping number 2. The backreference says: whatever text that grouping matched, match it again here. Let’s look at another example to get matches where two of the same digit are together.
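Before moving on, here is a minimal sketch of the two expressions so far. It is written in Python, which is our assumption; the post itself does not tie the examples to a language. The variable names are ours and only illustrative.

```python
import re

# (1)\1: grouping 1 captures a literal "1", and \1 must match that same text again.
pattern = re.compile(r"(1)\1")
match = pattern.search("11234567")

matched_text = match.group(0)  # the whole match
first_group = match.group(1)   # what grouping 1 captured
span = match.span()            # (start, end) of the whole match

# With two groupings, they are numbered left to right: (1) is 1, (2) is 2.
two_groups = re.compile(r"(1)(2)").search("11234567")
group_one = two_groups.group(1)
group_two = two_groups.group(2)

print(matched_text, first_group, span)  # 11 1 (0, 2)
print(group_one, group_two)             # 1 2
```

Note that (1)(2) on its own has no backreference. It simply matches a 1 followed by a 2, and is here only to show how the groupings are numbered.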

Here is our string

1123455567

and here is our regular expression.

(\d)\1

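If you try this at https://regex101.com/ (with the global flag enabled), you will see that there are matches at 11 and 55. The third 5 is not part of a match, because the first two 5’s were already used and only a 6 follows it. The grouping (\d) captures any digit, and the backreference \1 requires that exact same digit again. This is different from \d\d, which would match any two digits, such as 23. So the expression finds every place where the same digit appears twice in a row.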
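If you would rather check this in code than on regex101, the sketch below uses Python’s re module, which is our choice for illustration and not something the example depends on. It lists each match together with its starting position.

```python
import re

text = "1123455567"
doubled_digit = re.compile(r"(\d)\1")

# finditer yields one match object per non-overlapping match, left to right.
matches = [(m.group(0), m.start()) for m in doubled_digit.finditer(text)]
print(matches)  # [('11', 0), ('55', 5)]

# findall returns only what the grouping captured, not the whole match.
captured = doubled_digit.findall(text)
print(captured)  # ['1', '5']

# \d\d matches any two digits, so it is not a duplicate check.
any_two_digits = re.findall(r"\d\d", text)
print(any_two_digits)  # ['11', '23', '45', '55', '67']
```

The last line shows why the backreference matters: without it, pairs like 23 and 45 match too.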

Duplicate Strings

We don’t have to use these regular expressions only with numbers. The same can be done with letters. Let’s take a string with two o’s in a row and find a match on those two o’s.

Our string

Look

And our regular expression

(o)\1

Try that, and notice there is now a match on “oo” in Look.
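The same check as a small Python sketch (Python is again our assumption). The second string, “Lok”, is a made-up counterexample that is not in the original post. It shows what happens when there is no doubled o.

```python
import re

double_o = re.compile(r"(o)\1")

match = double_o.search("Look")
found = match.group(0)   # the text that matched
position = match.span()  # where it matched in the string
print(found, position)   # oo (1, 3)

# Hypothetical counterexample: only one "o", so there is nothing to match.
no_match = double_o.search("Lok")
print(no_match)  # None
```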

Hopefully this helped you understand backreferencing in a regular expression!

