
Python Regex

Python's re module provides powerful tools for working with regular expressions. Let's explore some of the most commonly used functions building off of the previous lesson.

For setup, store the text from the prior lesson in a variable named text and import Python's re module.
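If the prior lesson's output is not at hand, the setup might look like the minimal sketch below. The sample text is a short, made-up excerpt in the style of Cisco show version output. It is an assumption for illustration only and should be replaced with the real text from the previous lesson.

```python
import re

# Hypothetical stand-in for the text from the prior lesson (not the real output).
text = """Cisco IOS XE Software, Version 16.09.03
Cisco IOS Software [Fuji], Virtual XE Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.9.3, RELEASE SOFTWARE (fc2)
License Level: ax
cisco CSR1000V (VXE) processor (revision VXE) with 2078006K/3075K bytes of memory.
"""

# Quick sanity check that the variable holds a multi-line string.
line_count = len(text.splitlines())
print("Lines loaded:", line_count)
```

With text and re in place, the snippets in the rest of this lesson can be pasted in as written.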

re.search

The re.search() function searches for a pattern anywhere within a string. It returns a match object for the first occurrence of the pattern, or None if the pattern is not found.

pattern = r"Virtual XE Software (\S+)"

result = re.search(pattern, text)
if result:
    print("Pattern found:", result.group())
else:
    print("Pattern not found")

re.Match Object

The re.Match object, not to be confused with the re.match() function, represents a single match. It is the type returned by both re.match() and re.search(). The object holds more than just the matched text, so let's explore it a bit.

Methods:

  • group(): Returns the actual text that matched the pattern.
  • groupdict(): Returns a dictionary that maps each named capture group to the text it matched.
  • start(): Returns the start index of the matched text in the input string.
  • end(): Returns the end index (exclusive) of the matched text in the input string.
  • span(): Returns a tuple containing the start and end indices of the matched text.
  • group(index): Returns the text that matched a specific group within the pattern. If the pattern contains capturing groups (enclosed in parentheses), you can access them using this method.

Example:

pattern = r"Cisco (?P<os>.+) Software, Version (?P<version>\S+)"

result = re.search(pattern, text)
if result:
    matched_text = result.group()
    group_dict = result.groupdict()
    start_index = result.start()
    end_index = result.end()
    span_indices = result.span()

    print("Matched text:", matched_text)
    print("Start index:", start_index)
    print("End index:", end_index)
    print("Span indices:", span_indices)
    print("Matched groups:", group_dict)
else:
    print("Pattern not found")

The re.Match object allows you to access details about the matched portion of the input string, which can be useful for further processing or manipulation.
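The example above does not exercise group(index), so here is a small sketch of reaching individual capture groups by position and by name. The one-line sample string is hypothetical; the real lesson text may differ.

```python
import re

# Hypothetical one-line sample, used only to keep the example self-contained.
line = "Cisco IOS XE Software, Version 16.09.03"
pattern = r"Cisco (?P<os>.+) Software, Version (?P<version>\S+)"

result = re.search(pattern, line)
if result:
    whole = result.group(0)                # entire match, same as result.group()
    os_by_index = result.group(1)          # first capture group, by position
    os_by_name = result.group("os")        # the same group, by name
    version = result.group("version")      # second capture group, by name
    all_groups = result.groups()           # tuple of every capture group
    version_span = result.span("version")  # (start, end) of a single group

    print("OS:", os_by_index)
    print("Version:", version)
    print("All groups:", all_groups)
    print("Version span:", version_span)
```

Named access tends to survive pattern changes better than positional access, since adding a new group earlier in the pattern shifts the indices but not the names.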

re.match

The re.match() function checks if the pattern matches at the beginning of the string. It returns a match object if successful, or None if the pattern is not found at the beginning.

match_text = "License Level: ax"
pattern = r"License Level: (\S+)"

result = re.match(pattern, match_text)
if result:
    print("Pattern found at the beginning:", result.group())
else:
    print("Pattern not found at the beginning")
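To make the contrast with re.search() concrete, the sketch below runs both functions against a string where the pattern does not start at position 0. The sample string is hypothetical.

```python
import re

# Hypothetical sample where the target text is not at the start of the string.
line = "Router1 License Level: ax"
pattern = r"License Level: (\S+)"

at_start = re.match(pattern, line)    # only tries to match at position 0
anywhere = re.search(pattern, line)   # scans the whole string

print("re.match result:", at_start)
print("re.search result:", anywhere.group(1) if anywhere else None)
```

Here re.match() returns None because the string begins with "Router1", while re.search() still finds the license level further along.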

re.findall

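The re.findall() function returns a list of all non-overlapping occurrences of the pattern in the string. If the pattern contains a single capture group, the list holds the text captured by that group.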

pattern = r"(Cisco|cisco)"

results = re.findall(pattern, text)
print("Matches found:", results)

If the pattern contains multiple capture groups, the matches are returned as a list of tuples, with one tuple per match and one element per group.

pattern = r"(Cisco|cisco) (\S+)"

results = re.findall(pattern, text)
print("Matches found:", results)
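The shape of the re.findall() result depends on how many capture groups the pattern has. The sketch below compares the three cases side by side on a short, hypothetical sample string.

```python
import re

# Hypothetical sample string, not taken from the lesson text.
sample = "Cisco IOS Software and cisco CSR1000V"

# No capture groups: a list of the full matches.
no_groups = re.findall(r"[Cc]isco", sample)

# One capture group: a list of that group's text.
one_group = re.findall(r"(Cisco|cisco)", sample)

# Two capture groups: a list of tuples, one tuple per match.
two_groups = re.findall(r"(Cisco|cisco) (\S+)", sample)

# re.IGNORECASE is an alternative to spelling out both cases.
any_case = re.findall(r"cisco", sample, flags=re.IGNORECASE)

print(no_groups)
print(one_group)
print(two_groups)
print(any_case)
```

Because the result is a plain list, len() on it gives the number of matches.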

re.split

The re.split() function splits the string at each occurrence of the pattern and returns a list of the resulting substrings.

results = re.split(",", "NYC-RT01,NYC-RT02,SFO-SW01,SFO-RT01")
print("Split results:", results)

pattern = r"(Cisco IOS Software.+)"
result = re.search(pattern, text)

results = re.split(",", result.group())
print("Split results:", ", ".join(results[2:4]))
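Splitting on a literal comma could also be done with str.split(). re.split() becomes more useful when the delimiter varies. The sketch below assumes a hypothetical device list with inconsistent separators and stray whitespace.

```python
import re

# Hypothetical input with mixed delimiters and uneven spacing.
devices = "NYC-RT01, NYC-RT02;SFO-SW01 ,SFO-RT01"

# Split on a comma or semicolon, swallowing any surrounding whitespace.
names = re.split(r"\s*[,;]\s*", devices)

# maxsplit limits how many splits are made; the remainder stays intact.
first, rest = re.split(r"\s*[,;]\s*", devices, maxsplit=1)

# A capture group in the pattern keeps the delimiters in the result.
with_delims = re.split(r"([,;])", "a,b;c")

print(names)
print(first, "|", rest)
print(with_delims)
```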

re.sub

The re.sub() function is used for search and replace. It replaces every occurrence of the pattern with the specified replacement string and returns the result as a new string.

new_text = re.sub("(ROUTER|RTR)", "RT", "NYC-ROUTER01,NYC-ROUTER02,NYC-RTR03")
print("Replaced text:", new_text)
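The replacement string can also refer back to capture groups, and the number of replacements can be limited or counted. The following sketch reuses the device names from the example above; the group names site and num are chosen here for illustration.

```python
import re

devices = "NYC-ROUTER01,NYC-ROUTER02,NYC-RTR03"

# \g<name> in the replacement inserts the text captured by a named group.
renamed = re.sub(
    r"(?P<site>[A-Z]{3})-(?:ROUTER|RTR)(?P<num>\d+)",
    r"\g<site>-RT\g<num>",
    devices,
)

# count limits how many occurrences are replaced.
only_first = re.sub(r"ROUTER|RTR", "RT", devices, count=1)

# re.subn() returns the new string together with the number of replacements.
new_text, replacements = re.subn(r"ROUTER|RTR", "RT", devices)

print(renamed)
print(only_first)
print(new_text, replacements)
```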

Regex Python Challenge Lab

Leveraging the answers provided in the previous example, go through the following challenges:

  • Create a single string that:
    • Provides the uptime in f"Uptime: {days}:{hours}:{min}"
    • Provides the version in f"Version: {version}"
  • Count the number of times that cisco appears in the output
  • Count the number of times that Cisco appears in the output
  • Count the number of times that either Cisco or cisco appears in the output
  • Provide the total and kernel memory (e.g. 2078006K/3075K) as:
    • f"Total Memory: {total_memory}"
    • f"Kernel Memory: {kernel_memory}"

Note: Please use named capture groups so that the data is easier to find later.