Regular Expressions Basics

Introduction

  • Why this topic matters: Regular expressions (often abbreviated as regex or regexp) are a powerful tool for text processing and search-and-replace operations. They let you match patterns in strings, which makes it easier to manipulate and analyze text data.
  • What you'll learn: In this tutorial, we will cover the basics of regular expressions, including key terminology, practical examples, common issues, solutions, best practices, and key takeaways.

Core Concepts

  • Main explanation with examples: A regular expression is a sequence of characters that forms a search pattern. It can match strings of text according to specified criteria, such as finding all occurrences of a specific word or pattern within a larger body of text.
    ```python
    import re  # Built-in module for working with regular expressions

    # Simplified pattern for email-like strings; the dot is escaped so it matches a literal "."
    pattern = r'\w+@\w+\.\w{2,3}'
    text = "user1@example.com and user2@gmail.com are valid emails"

    matches = re.findall(pattern, text)  # Find all non-overlapping matches in the text
    print(matches)  # ['user1@example.com', 'user2@gmail.com']
    ```
  • Key terminology:
    • **Pattern**: A sequence of characters that represents a search pattern for matching strings.
    • **Literal characters**: Characters in the pattern that are matched exactly as they appear, such as `a`, `b`, or `1`.
    • **Metacharacters**: Special characters with a specific meaning within regular expressions, like `^`, `$`, `.`, and `*`.
    • **Grouping**: Groups are enclosed by parentheses (`()`) and can be used to capture substrings matched by the group for further processing or manipulation.
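The terminology above can be illustrated with a minimal sketch. The sample text and variable names are assumptions chosen for illustration only.

```python
import re

text = "cat, cot, cut, and coat"

# Literal characters: 'cat' matches exactly those three letters
literal_matches = re.findall(r'cat', text)
print(literal_matches)  # ['cat']

# Metacharacter '.': matches any single character except a newline
dot_matches = re.findall(r'c.t', text)
print(dot_matches)  # ['cat', 'cot', 'cut']

# Grouping: with one capture group, findall returns only the captured part
group_matches = re.findall(r'c(.)t', text)
print(group_matches)  # ['a', 'o', 'u']
```

Note that `coat` is never matched, because `c.t` allows exactly one character between `c` and `t`.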

Practical Examples

  • Real-world code examples:
    ```python
    import re

    # Find all phone numbers with area codes (e.g., 123-456-7890)
    pattern = r'\d{3}-\d{3}-\d{4}'
    text = "Call me at 123-456-7890 or 456-7890-123"
    matches = re.findall(pattern, text)
    print(matches)  # ['123-456-7890']

    # Remove all HTML tags from a string (e.g., <p>Hello World!</p>)
    pattern = r'<[^>]*>'  # Matches simple HTML tags such as <p> or </div>
    text = '<p>Hello World!</p> <div>This is a div</div>'
    clean_text = re.sub(pattern, '', text)  # Replace every match with an empty string
    print(clean_text)  # Hello World! This is a div
    ```
  • Step-by-step explanations: In these examples, we use the `re` module to find and manipulate patterns within strings. The `findall` function finds all non-overlapping matches of the pattern in the given string, while the `sub` function replaces all matches with a specified replacement string. In the phone example, `456-7890-123` is not matched because its digit groups are not in the 3-3-4 order the pattern requires.
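One detail of `findall` and `sub` is worth a short sketch: when the pattern contains capture groups, `findall` returns the groups rather than the full match, and `sub` can reuse the groups in its replacement. The sample text and variable names below are illustrative assumptions.

```python
import re

text = "Order 66 shipped on 2024-01-15; order 67 ships on 2024-02-01."

# No capture group: findall returns each full match as a string
full_dates = re.findall(r'\d{4}-\d{2}-\d{2}', text)
print(full_dates)  # ['2024-01-15', '2024-02-01']

# With capture groups: findall returns one tuple of groups per match
date_parts = re.findall(r'(\d{4})-(\d{2})-(\d{2})', text)
print(date_parts)  # [('2024', '01', '15'), ('2024', '02', '01')]

# sub can refer to captured groups in the replacement with \1, \2, \3
reordered = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\3/\2/\1', text)
print(reordered)  # Order 66 shipped on 15/01/2024; order 67 ships on 01/02/2024.
```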

Common Issues and Solutions

SyntaxError

What causes it: Writing a string literal that Python cannot parse, for example enclosing a pattern in single quotes (') when the pattern itself contains a single quote. Python accepts both single and double quotes for strings, but a quote of the same kind inside the literal ends the string early.

# Bad code example that triggers a SyntaxError
import re
pattern = r'\w+'s'

Error message (Python 3.10 and later; older versions report "EOL while scanning string literal"):

  File "example.py", line 3
    pattern = r'\w+'s'
                     ^
SyntaxError: unterminated string literal (detected at line 3)

Solution: Enclose the pattern in double quotes (") so the single quote inside it is treated as an ordinary character.

# Corrected code
import re
pattern = r"\w+'s"  # Double quotes let the pattern contain a single quote

Why it happens: Python reads a string literal up to the next unescaped quote of the same kind that opened it. Single and double quotes are otherwise interchangeable, and the choice has no effect on import statements.
How to prevent it: When a pattern contains one kind of quote, enclose it in the other kind, and keep the r prefix so that backslashes reach the regex engine unchanged.
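A related pitfall, sketched here under stated assumptions: a string that Python parses without complaint can still be an invalid regular expression. In that case the `re` module raises `re.error` when the pattern is compiled, not a SyntaxError. The helper name `is_valid_pattern` is hypothetical and used only for illustration.

```python
import re

def is_valid_pattern(pattern):
    """Return True if `pattern` compiles as a regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised for malformed patterns, e.g. an unclosed group
        return False

valid_result = is_valid_pattern(r'\w+@\w+')
invalid_result = is_valid_pattern(r'(\d{3}')  # Missing closing parenthesis

print(valid_result)    # True
print(invalid_result)  # False
```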

NameError

What causes it: Attempting to use a name that has not been defined in the current scope. With regular expressions, the most common case is calling re.findall without first importing the re module.

# Bad code example that triggers a NameError
pattern = r'\w+@\w+\.\w{2,3}'  # Defining the pattern
text = "user1@example.com and user2@gmail.com are valid emails"
matches = re.findall(pattern, text)  # The re module was never imported
print(matches)

Error message:

Traceback (most recent call last):
  File "example.py", line X, in <module>
    matches = re.findall(pattern, text)
NameError: name 're' is not defined

Solution: Import the re module before using it.

# Corrected code
import re
pattern = r'\w+@\w+\.\w{2,3}'  # Defining the pattern
text = "user1@example.com and user2@gmail.com are valid emails"
matches = re.findall(pattern, text)
print(matches)  # ['user1@example.com', 'user2@gmail.com']

Why it happens: NameError occurs when you use a name that Python cannot find in the current scope. Here pattern and text are both defined, but re is not, because the import statement is missing.
How to prevent it: Put import statements at the top of the file, and always define variables before using them.

Best Practices

  • Professional coding tips:
    • Use raw strings (prefixing a string with an 'r') when defining regular expression patterns to ensure that backslashes are treated literally and not as escape characters.
    • Test your regular expressions on smaller input strings before applying them to larger bodies of text, to avoid performance issues or potential errors.
    • When working with complex regular expressions, consider breaking the pattern into smaller, more manageable groups for easier understanding and maintenance.
  • Performance considerations: Regular expressions can be computationally expensive, especially on large amounts of data. If you run into performance issues, consider plain string methods (such as str.startswith, str.replace, or the in operator) for simple fixed-text tasks, and reserve regular expressions for cases that need real pattern matching.
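The first and third tips can be sketched as follows. This is a minimal illustration, not a definitive recipe. The sample text and variable names are assumptions, and the email pattern is the simplified one used earlier, not a full validator.

```python
import re

text = "cat concat cat."

# Without the r prefix, '\b' is a backspace character, not a word boundary
non_raw = re.findall('\bcat\b', text)
raw = re.findall(r'\bcat\b', text)
print(non_raw)  # []
print(raw)      # ['cat', 'cat']

# re.VERBOSE lets a pattern be split across lines with a comment on each part
email_pattern = re.compile(r"""
    \w+        # local part
    @          # separator
    \w+        # domain name
    \.         # literal dot
    \w{2,3}    # top-level domain
""", re.VERBOSE)

emails = email_pattern.findall("user1@example.com and user2@gmail.com")
print(emails)  # ['user1@example.com', 'user2@gmail.com']
```

In verbose mode, whitespace and `#` comments inside the pattern are ignored, which is what makes the line-by-line layout possible.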

Key Takeaways

  • Regular expressions (regex) allow you to match patterns in strings and perform search-and-replace operations.
  • Import the built-in re module for working with regular expressions in Python.
  • Use raw strings (prefixing a string with an 'r') when defining regular expression patterns.
  • Test your regular expressions on smaller input strings before applying them to larger bodies of text.
  • When dealing with complex patterns, break them down into smaller groups for easier understanding and maintenance.
  • Be aware that regular expressions can be computationally expensive, so consider using alternative methods if performance is a concern.
  • Familiarize yourself with common issues, solutions, best practices, and key terminology to effectively use regular expressions in your code.

Next steps for learning: Explore more advanced regular expression features like character classes ([...]), quantifiers (*, +, ?, {n}, {m,n}), lookahead and lookbehind assertions ((?=...) and (?<=...)), and named groups ((?P<name>...)) for even more powerful text manipulation capabilities.
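As a small preview of those features, here is a hedged sketch. The sample text and variable names are assumptions chosen for illustration.

```python
import re

text = "price: $42, discount: 15%, total: $35"

# Character class with a quantifier: one or more digits
numbers = re.findall(r'[0-9]+', text)
print(numbers)  # ['42', '15', '35']

# Lookbehind: digits preceded by a dollar sign (the sign is not part of the match)
dollar_amounts = re.findall(r'(?<=\$)\d+', text)
print(dollar_amounts)  # ['42', '35']

# Lookahead: digits followed by a percent sign
percentages = re.findall(r'\d+(?=%)', text)
print(percentages)  # ['15']

# Named groups: refer to captured parts by name instead of position
first = re.search(r'(?P<label>\w+): \$(?P<amount>\d+)', text)
print(first.group('label'), first.group('amount'))  # price 42
```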