Python Regular Expressions

By Meenakshi Agarwal

This free tutorial walks you through Python regular expressions, a.k.a. RegEx. It covers the Python RegEx search(), findall(), match(), and compile() functions, along with split() and sub(), with full code examples.

Note: The syntax used here is for Python 3. You may modify it to use with other versions of Python.

Python Regular Expression History

Before we begin, here is a brief history of regular expressions in Python:

  • Start of Python (1980s): Python began without built-in regular expressions.
  • Adding Regex (mid-1990s): Python got basic regex support with the “regex” module.
  • Switch to re (1998): The “regex” module was replaced with the better re module.
  • Improvements (2000s-2010s): re got faster, more reliable, and Unicode-friendly.
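  • Python 3.0 (2008): Patterns written as str became Unicode-aware by default; re itself had already been part of the standard library for about a decade.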
  • Extras (Third-party): Some use third-party libraries like “regex” or “re2” for extra features and speed.

What is Regular Expression?

Regex (short for “Regular Expression”) in Python is a powerful tool for searching, matching, and manipulating text based on patterns.

You describe a pattern in a specialized syntax, and the regex engine then finds, extracts, or replaces the text that matches it.

Python Regular Expression Support

Python provides the re module, which includes functions for matching patterns in strings and for manipulating the matched text, including string substitution.

The re module offers capabilities similar to Perl's RegEx. It comprises functions such as match(), sub(), split(), search(), findall(), and others.

How to Use Regular Expression?

To use a regular expression, you first need to import the re module. You also need to understand how to pass a raw string (r'expression') to a function, and how to interpret the result that a RegEx function returns.

1. Import Re Module

When you want to use a function from the re module, you can access it with the syntax below:

import re
re.function_name(list_of_arguments)

Or use this alternative approach.

from re import function_name
function_name(list_of_arguments)

2. Use Raw String Argument

Pass the pattern as a raw string (r"...") so that Python leaves the backslashes untouched and the regex engine receives sequences such as \d or \b exactly as written. The code below shows how to use it.

from re import search
search(r"[a-z]", "yogurt AT 24")
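
To see why the raw-string prefix matters, here is a small sketch. It relies on the fact that Python reads "\b" in an ordinary string literal as a backspace character, whereas the regex engine needs the two characters \ and b to mean a word boundary. The sample text is made up for illustration.

```python
import re

text = "cat concatenate cat"

# In a normal string literal, "\b" is a backspace character (ASCII 8),
# so this pattern looks for backspace + "cat" + backspace.
plain_hits = re.findall("\bcat\b", text)

# In a raw string, the backslash reaches the regex engine untouched,
# so \b means "word boundary".
raw_hits = re.findall(r"\bcat\b", text)

print(plain_hits)  # []
print(raw_hits)    # ['cat', 'cat']
```

The "cat" inside "concatenate" is skipped in the second call because it is not surrounded by word boundaries.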

3. RegEx Function Return Value

If a Python RegEx function (mainly search() or match()) succeeds, it returns a Match object; if it fails, it returns None.

We can call the group() method on the Match object to extract the matched string.

The group() method takes an optional number: group() or group(0) returns the whole match, while group(1), group(2), and so on return the corresponding captured subgroup.

import re
matchResult = re.search(r"(\d+)", "yogurt AT 24")  # one capturing group for the digits
print("matchResult.group() : ", matchResult.group())
print("matchResult.group(1) : ", matchResult.group(1))
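
Since a failed search returns None, it is usually safer to check the result before calling group(). The sketch below assumes a made-up pattern with two capturing groups and shows a few other Match methods alongside group().

```python
import re

# Hypothetical pattern: group 1 captures an uppercase word, group 2 a number.
result = re.search(r"([A-Z]+) (\d+)", "yogurt AT 24")

if result is not None:          # search() returns None when nothing matches
    print(result.group())       # whole match: 'AT 24'
    print(result.group(1))      # first group: 'AT'
    print(result.group(2))      # second group: '24'
    print(result.groups())      # all groups as a tuple: ('AT', '24')
    print(result.span())        # (start, end) of the whole match: (7, 12)
```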

6 Most Useful Regular Expression Functions in Python

The two most important functions are search() and match(). When you run a regular expression search on a string, the engine scans it from left to right. If the pattern matches, the function returns a Match object; otherwise, it returns None.

1. RegEx Search in Python

The search() function scans the string and returns a Match object for the first location where the pattern matches.

The syntax for regular expression search is:

import re
re.search(string_pattern, string, flags)

Please note that you can use the following metacharacters to form string patterns.

(+ ? . * ^ $ ( ) [ ] { } | \)

Apart from the previous set, there are some special sequences, such as:

\A, \n, \r, \t, \d, \D, \w, \Z, and so on.
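
To make a few of these symbols concrete, here is a short sketch. The sample strings are invented for illustration, and findall() is used only because it shows every match at once.

```python
import re

sample = "Order 66 shipped on 2024-01-15 to bob@example.com"

digits = re.findall(r"\d+", sample)         # \d+ : one or more digits
first_word = re.findall(r"^\w+", sample)    # ^   : anchor at the start
last_word = re.findall(r"\w+$", sample)     # $   : anchor at the end
spellings = re.findall(r"colou?r", "color colour")         # ?   : optional 'u'
vowel_pairs = re.findall(r"[aeiou]{2}", "book seed rain")  # {2} : exactly two
pets = re.findall(r"cat|dog", "cat, dog, bird")            # |   : either/or

print(digits)       # ['66', '2024', '01', '15']
print(first_word)   # ['Order']
print(last_word)    # ['com']
print(spellings)    # ['color', 'colour']
print(vowel_pairs)  # ['oo', 'ee', 'ai']
print(pets)         # ['cat', 'dog']
```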

Let’s see the Python RegEx search() example:

from re import search
result = search(r"[a-z]", "yogurt AT 24")
print(result)

The output is as follows:

<re.Match object; span=(0, 1), match='y'>
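
The flags argument in the search() syntax is optional. The sketch below shows two commonly used flags from the re module, re.IGNORECASE and re.MULTILINE, applied through findall() so that all matches are visible; the second sample string is made up.

```python
import re

text = "yogurt AT 24"

# Without a flag, [a-z] only matches lowercase letters.
lower_only = re.findall(r"[a-z]+", text)

# re.IGNORECASE lets the same character class match uppercase letters too.
any_case = re.findall(r"[a-z]+", text, re.IGNORECASE)

# Flags can be combined with the | operator. re.MULTILINE makes ^ match
# at the start of every line instead of only at the start of the string.
lines = "first line\nSecond line"
line_starts = re.findall(r"^[a-z]+", lines, re.IGNORECASE | re.MULTILINE)

print(lower_only)   # ['yogurt']
print(any_case)     # ['yogurt', 'AT']
print(line_starts)  # ['first', 'Second']
```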

2. RegEx Match

The match() function succeeds only if the pattern matches at the very start of the string.

The syntax for regular expression match in Python is:

import re
re.match(string_pattern, string, flags)

Let’s see the match() example:

from re import match
print(match(r"PVR", "PVR Cinemas is the best."))

The output is as follows:

<re.Match object; span=(0, 3), match='PVR'>
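
The difference between match() and search() is easiest to see side by side. The sketch below reuses the sample sentence and also includes re.fullmatch(), which this tutorial does not otherwise cover; it succeeds only when the pattern covers the entire string.

```python
import re

text = "PVR Cinemas is the best."

at_start = re.match(r"Cinemas", text)      # 'Cinemas' is not at the start
anywhere = re.search(r"Cinemas", text)     # search() looks through the whole string
whole_short = re.fullmatch(r"PVR", text)   # pattern covers only part of the string
whole_long = re.fullmatch(r"PVR.*", text)  # pattern covers the whole string

print(at_start)            # None
print(anywhere.span())     # (4, 11)
print(whole_short)         # None
print(whole_long.group())  # PVR Cinemas is the best.
```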

3. RegEx Split

The split() function splits a string at every place where the pattern matches and returns the pieces as a list.

The syntax for the split() is:

import re
re.split(string_pattern, string)

Let’s see the split() example:

from re import split
print(split(r"y", "Python"))

The output is as follows:

['P', 'thon']
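
Splitting on a single letter could be done with str.split() as well. Where re.split() becomes useful is with pattern-based delimiters, as in this sketch; the input strings are invented for illustration.

```python
import re

# Split on any run of commas, semicolons, or whitespace.
colors = re.split(r"[,;\s]+", "red, green;blue  yellow")

# maxsplit limits the number of splits; the rest of the string stays intact.
parts = re.split(r"-", "2024-01-15-draft", maxsplit=2)

# A capturing group keeps the delimiters in the result list.
tokens = re.split(r"([+-])", "10+20-5")

print(colors)  # ['red', 'green', 'blue', 'yellow']
print(parts)   # ['2024', '01', '15-draft']
print(tokens)  # ['10', '+', '20', '-', '5']
```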

4. RegEx Sub String

The sub() function replaces the parts of a string that match a pattern with a replacement string.

The syntax for sub() is:

import re
re.sub(string_pattern, replacement, string)

Let’s see the sub() example:

from re import sub
print(sub(r"Machine Learning", "Artificial Intelligence", "Machine Learning is the Future."))

The result is as follows:

Artificial Intelligence is the Future.
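
Beyond a fixed replacement string, sub() also accepts a count argument, backreferences to captured groups, and a function as the replacement. The sketch below illustrates all three; the helper name double and the sample strings are made up for this example.

```python
import re

# count limits how many matches are replaced (here, only the first one).
masked = re.sub(r"\d", "#", "PIN 1234", count=1)

# \1, \2, \3 in the replacement refer to the captured groups.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "Due 2024-01-15")

# The replacement can also be a function that receives each Match object.
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double, "3 apples and 10 pears")

print(masked)     # PIN #234
print(reordered)  # Due 15/01/2024
print(doubled)    # 6 apples and 20 pears
```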

5. Python RegEx Findall

The findall() function returns a list of all non-overlapping matches of the pattern in the string.

The syntax for findall() is:

import re
re.findall(string_pattern, string)

Let’s see the Python RegEx findall() example:

from re import findall
print(findall(r"[a-e]", "I am interested in Python Programming Language"))

The output is as follows:

['a', 'e', 'e', 'e', 'd', 'a', 'a', 'a', 'e']
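
One detail that often surprises newcomers is that the return value of findall() changes when the pattern contains capturing groups. The sketch below uses an invented key=value string to show the three cases.

```python
import re

text = "alice=30, bob=25, carol=41"

# No groups: findall() returns the full matches.
full = re.findall(r"\w+=\d+", text)

# One group: findall() returns only what the group captured.
ages = re.findall(r"\w+=(\d+)", text)

# Several groups: findall() returns a list of tuples.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(full)   # ['alice=30', 'bob=25', 'carol=41']
print(ages)   # ['30', '25', '41']
print(pairs)  # [('alice', '30'), ('bob', '25'), ('carol', '41')]
```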

6. RegEx Compile in Python

The compile() function turns a pattern into a reusable pattern object, so you can store it and call its methods later instead of passing the pattern string on every call.

The syntax for compile() is:

import re
re.compile(string_pattern)

Let’s see the Python RegEx compile() example:

import re
future_pattern = re.compile(r"[0-9]")  # A compiled pattern that can be stored for future use.
print(future_pattern.search("1 s d f 2 d f 3 f d f 4 A l s"))
print(future_pattern.match("1 s d f 2 d f 3 f d f 4 "))

The result is as follows:

<re.Match object; span=(0, 1), match='1'>
<re.Match object; span=(0, 1), match='1'>
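
A compiled pattern can also carry its flags with it and exposes the same operations as the module-level functions. The following is a minimal sketch of that reuse; the variable name word_pattern and the sample lines are assumptions made for illustration.

```python
import re

# Compile once, with a flag baked in, then reuse the pattern object.
word_pattern = re.compile(r"[a-z]+", re.IGNORECASE)

found = {}
for line in ["Hello World", "1234", "Python 3"]:
    found[line] = word_pattern.findall(line)
    print(line, "->", found[line])

# A compiled pattern offers the same methods as the module-level functions.
starred = word_pattern.sub("*", "Hello World 42")
print(starred)               # * * 42
print(word_pattern.pattern)  # [a-z]+
```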

Python RegEx Examples: Search, Findall, Match, and Compile

At this point, we can explore some more complex examples of using regular expressions in Python, covering re.search(), re.findall(), re.match(), and re.compile(). Each example starts with a problem statement:

Example#1: Use Python RegEx Search to Find Phone Numbers

Problem: Given a text, find and extract valid U.S. phone numbers in different formats (e.g., (123) 456-7890, 123-456-7890, 123.456.7890).

import re

# Sample text containing phone numbers
text = "Contact us at (123) 456-7890 or 123-456-7890. For support, call 555.555.5555 or (987) 654-3210."

# Define a regular expression pattern for matching U.S. phone numbers
phone_pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'

# Search for and print all phone numbers in the text
start = 0
while True:
    match = re.search(phone_pattern, text[start:])
    if match:
        phone_number = match.group()
        print("Found phone number:", phone_number)
        start += match.end()
    else:
        break

Output:

Found phone number: (123) 456-7890
Found phone number: 123-456-7890
Found phone number: 555.555.5555
Found phone number: (987) 654-3210
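
The loop above advances through the text manually with repeated search() calls. As an alternative sketch, re.finditer() (not otherwise covered in this tutorial) yields one Match object per match and removes the need to track the start offset. The text and pattern are the same as in the example.

```python
import re

text = "Contact us at (123) 456-7890 or 123-456-7890. For support, call 555.555.5555 or (987) 654-3210."
phone_pattern = re.compile(r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}')

# finditer() yields one Match object per non-overlapping match, left to right.
for match in phone_pattern.finditer(text):
    print("Found phone number:", match.group(), "at index", match.start())

# Collect just the matched strings into a list.
numbers = [m.group() for m in phone_pattern.finditer(text)]
```

If only the matched strings are needed and not their positions, phone_pattern.findall(text) would return the same list directly.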

Example#2: Use Python RegEx Findall to Parse HTML Tags

Problem: Extract and list all HTML tags from an HTML document.

import re

# Sample HTML document
html_text = "<h1>Hello, <b>World</b></h1> <p>Welcome to <a href='https://example.com'>Example</a></p>"

# Define a regular expression pattern for matching HTML tags
html_tag_pattern = r'<[^>]+>'

# Find and print all HTML tags in the document
html_tags = re.findall(html_tag_pattern, html_text)
print("Found HTML tags:")
for tag in html_tags:
    print(tag)

Output:

Found HTML tags:
<h1>
<b>
</b>
</h1>
<p>
<a href='https://example.com'>
</a>
</p>
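
If only the tag names are of interest, a capturing group can pull them out of each tag. This is a rough sketch for simple snippets like the one above; the pattern for tag names is an assumption and is not meant to handle every form of real-world HTML.

```python
import re

html_text = "<h1>Hello, <b>World</b></h1> <p>Welcome to <a href='https://example.com'>Example</a></p>"

# The group captures only the tag name; the optional "/" covers closing tags.
tag_names = re.findall(r'</?([a-zA-Z][a-zA-Z0-9]*)', html_text)
print(tag_names)

# Unique names in order of first appearance.
unique_names = list(dict.fromkeys(tag_names))
print(unique_names)  # ['h1', 'b', 'p', 'a']
```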

Example#3: Use Python RegEx Match to Validate Passwords

Problem: Check if a password meets certain criteria: it is at least 8 characters long, consists only of letters and digits, and contains at least one uppercase letter, one lowercase letter, and one digit.

import re

# Sample passwords
passwords = ["Passw0rd123", "Weakpass", "Secure@123", "ABCDabcd", "P@ss"]

# Define a regular expression pattern for password validation
password_pattern = r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[A-Za-z\d]{8,}$'

# Validate passwords
for password in passwords:
    if re.match(password_pattern, password):
        print(f"'{password}' is a valid password.")
    else:
        print(f"'{password}' is not a valid password.")

Output:

'Passw0rd123' is a valid password.
'Weakpass' is not a valid password.
'Secure@123' is not a valid password.
'ABCDabcd' is not a valid password.
'P@ss' is not a valid password.
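
Each (?=...) in the pattern above is a lookahead: it checks that something appears later in the string without consuming any characters. To show what the single pattern is checking, here is a sketch that performs the same tests one at a time and reports which rule failed. The function name check_password and the messages are hypothetical.

```python
import re

def check_password(password):
    """Return a list of the rules that the password fails."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    if not re.search(r"[a-z]", password):
        problems.append("no lowercase letter")
    if not re.search(r"[A-Z]", password):
        problems.append("no uppercase letter")
    if not re.search(r"\d", password):
        problems.append("no digit")
    if not re.fullmatch(r"[A-Za-z\d]*", password):
        problems.append("contains a character other than letters and digits")
    return problems

for pw in ["Passw0rd123", "Weakpass", "Secure@123"]:
    print(pw, "->", check_password(pw) or "valid")
```

The single-pattern version is more compact, while the step-by-step version makes it easier to tell the user which requirement was missed.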

Example#4: Use Python RegEx Compile to Validate Email

Problem: You need to validate a list of email addresses to check if they follow a valid email format. Valid email addresses should have the following characteristics:

  • They should contain an alphanumeric username (including dots and underscores) followed by ‘@’.
  • The domain name should contain alphanumeric characters, dots, and hyphens.
  • The top-level domain (TLD) should be 2-6 characters long and consist of only alphabetic characters.

import re

# Sample list of email addr
email_list = [
    "user@example.com",
    "user@my-website.net",
    "name.123@email.co.uk",
    "invalid.email@.com",
    "user@website.",
]

# Define a reg expr pattern for email validation
email_pat = re.compile(r'^[a-zA-Z0-9._]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$')

# Validate email addr
for email in email_list:
    if email_pat.match(email):
        print(f"'{email}' is a valid email addr.")
    else:
        print(f"'{email}' is not a valid email addr.")

Output:

'user@example.com' is a valid email addr.
'user@my-website.net' is a valid email addr.
'name.123@email.co.uk' is a valid email addr.
'invalid.email@.com' is not a valid email addr.
'user@website.' is not a valid email addr.

These examples demonstrate how regular expressions solve specific problems, including extracting phone numbers, parsing HTML tags, and validating passwords and email addresses. Regular expressions are a powerful tool for text manipulation and pattern matching in Python.

Further References

If you want to learn more about the module re in Python 3, visit the following link.

REF: https://docs.python.org/3/library/re.html

The official documentation may be a bit too abstract for beginners or intermediate users. However, advanced users will find it worth consulting.

Enjoy coding,
TechBeamers
