C# regex


Regular expressions, commonly referred to as regex, are a powerful tool for text parsing, searching, and manipulation. In C#, the System.Text.RegularExpressions namespace provides the Regex class for working with them. This tutorial walks through the basics of regex in C# and the main methods and properties of the Regex class, with examples and practical use cases.

Introduction to Regex

Regular expressions are patterns used to match character combinations in strings. They can be used for a variety of text-processing tasks, such as validation, searching, and replacing text.

The Regex class in the System.Text.RegularExpressions namespace implements these operations. With it, you can often replace hand-written loops and conditional logic with a single pattern.

Why Use Regex?

  • Pattern Matching: Search for specific sequences of characters.
  • Data Extraction: Extract useful information, such as email addresses, phone numbers, etc.
  • Validation: Validate user inputs, such as phone numbers, dates, email addresses, etc.
  • Text Replacement: Find and replace patterns in text files or strings.

Regex Syntax

The syntax of regex involves a combination of characters that form a search pattern. Here are some common regex patterns:

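  • .: Matches any single character except a newline (unless RegexOptions.Singleline is set).
  • *: Matches zero or more occurrences of the preceding element.
  • +: Matches one or more occurrences of the preceding element.
  • [a-z]: Matches any one lowercase ASCII letter.
  • \d: Matches any decimal digit. In .NET this includes Unicode digits beyond 0-9 unless RegexOptions.ECMAScript is set; use [0-9] to match ASCII digits only.
  • ^: Matches the start of the string (or of each line with RegexOptions.Multiline).
  • $: Matches the end of the string, or just before a final newline (or the end of each line with RegexOptions.Multiline).
  • |: Alternation; matches either the expression on its left or the one on its right.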
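
The behavior of these metacharacters can be checked with the static Regex.IsMatch(input, pattern) method. The following is a small sketch; the class name SyntaxDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class SyntaxDemo
{
    static void Main()
    {
        // '.' matches any single character except a newline.
        Console.WriteLine(Regex.IsMatch("cat", "c.t"));    // True
        Console.WriteLine(Regex.IsMatch("c\nt", "c.t"));   // False

        // '*' allows zero occurrences, '+' requires at least one.
        Console.WriteLine(Regex.IsMatch("ct", "ca*t"));    // True
        Console.WriteLine(Regex.IsMatch("ct", "ca+t"));    // False

        // '^' and '$' force the pattern to cover the whole input.
        Console.WriteLine(Regex.IsMatch("abc123", @"^[a-z]+$"));  // False
        Console.WriteLine(Regex.IsMatch("abc", @"^[a-z]+$"));     // True

        // '|' matches either alternative.
        Console.WriteLine(Regex.IsMatch("I like dogs", "cat|dog"));  // True
    }
}
```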

Example Patterns

  • Email Pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
  • Phone Number Pattern: ^\d{3}-\d{3}-\d{4}$ (Matches a phone number in 123-456-7890 format)

Regex Class in C#: Methods and Properties

The Regex class in C# provides various methods to perform operations on text. Let’s explore these methods and properties.

Properties

Options: Gets the options that were passed to the Regex constructor.

Regex regex = new Regex(@"\d+", RegexOptions.IgnoreCase);
Console.WriteLine(regex.Options);  // Output: IgnoreCase

MatchTimeout: Gets the maximum time a single matching operation may run. If a match exceeds this interval, the operation throws a RegexMatchTimeoutException.

Regex regex = new Regex(@"\d+", RegexOptions.None, TimeSpan.FromSeconds(1));
Console.WriteLine(regex.MatchTimeout);  // Output: 00:00:01
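
When a timeout is set, the calling code usually needs to decide what a timed-out match means. One possible approach, sketched below, treats a timeout as "no match". The helper name SafeIsMatch and that policy are assumptions for illustration, not part of the Regex API.

```csharp
using System;
using System.Text.RegularExpressions;

static class TimeoutDemo
{
    // Hypothetical helper: returns false instead of throwing when the match times out.
    public static bool SafeIsMatch(string input, string pattern, TimeSpan timeout)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException)
        {
            // The engine gave up after 'timeout'; treat the input as not matching.
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(SafeIsMatch("12345", @"^\d+$", TimeSpan.FromSeconds(1)));  // True
        Console.WriteLine(SafeIsMatch("12a45", @"^\d+$", TimeSpan.FromSeconds(1)));  // False
    }
}
```

Setting a timeout matters most when the pattern or the input comes from an untrusted source, since some patterns can take a very long time on certain inputs.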

Methods

IsMatch(string): Checks if the regex matches a part of the given input string.

Regex regex = new Regex(@"\d{3}-\d{3}-\d{4}");
bool isMatch = regex.IsMatch("123-456-7890");
Console.WriteLine(isMatch);  // Output: True

Match(string): Searches an input string for a match.

Regex regex = new Regex(@"\d+");
Match match = regex.Match("My number is 42");
Console.WriteLine(match.Value);  // Output: 42
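
Match always returns a Match object, even when nothing was found, so it is worth checking the Success property before using the result. The sketch below also reads a named group; the group names area and rest and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchDemo
{
    static void Main()
    {
        // (?<name>...) defines a named capturing group.
        Regex regex = new Regex(@"(?<area>\d{3})-(?<rest>\d{3}-\d{4})");

        Match found = regex.Match("Call 123-456-7890 today");
        if (found.Success)
        {
            Console.WriteLine(found.Value);                  // 123-456-7890
            Console.WriteLine(found.Index);                  // 5
            Console.WriteLine(found.Groups["area"].Value);   // 123
        }

        // A failed match is not null; Success is false and Value is empty.
        Match missing = regex.Match("No number here");
        Console.WriteLine(missing.Success);                  // False
        Console.WriteLine(missing.Value == string.Empty);    // True
    }
}
```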

Matches(string): Returns all matches within an input string.

Regex regex = new Regex(@"\d+");
MatchCollection matches = regex.Matches("Numbers: 42, 56, 78");
foreach (Match match in matches)
{
    Console.WriteLine(match.Value);  // Prints 42, 56, and 78 on separate lines
}

Replace(string, string): Replaces all matches within an input string with a specified replacement.

Regex regex = new Regex(@"\d+");
string result = regex.Replace("I have 2 apples and 3 oranges", "many");
Console.WriteLine(result);  // Output: I have many apples and many oranges
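
The replacement does not have to be a fixed string. It can refer to captured groups with $1, $2, and so on, or it can be computed per match by a MatchEvaluator delegate. A short sketch of both follows; the class name and sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceDemo
{
    static void Main()
    {
        // Substitution: $1 and $2 refer to the first and second capturing groups.
        string swapped = Regex.Replace("Lovelace, Ada", @"(\w+), (\w+)", "$2 $1");
        Console.WriteLine(swapped);  // Ada Lovelace

        // MatchEvaluator: the lambda is called once per match and returns its replacement.
        string doubled = Regex.Replace(
            "I have 2 apples and 3 oranges",
            @"[0-9]+",
            m => (int.Parse(m.Value) * 2).ToString());
        Console.WriteLine(doubled);  // I have 4 apples and 6 oranges
    }
}
```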

Split(string): Splits an input string based on the regex pattern.

Regex regex = new Regex(@"\s+");
string[] words = regex.Split("This is a test");
foreach (string word in words)
{
    Console.WriteLine(word);  // Prints This, is, a, and test on separate lines
}

Examples of Regex Usage

Example 1: Validate a Phone Number

To validate a phone number in the format 123-456-7890, you can use the following code:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        Regex phoneRegex = new Regex(@"^\d{3}-\d{3}-\d{4}$");
        string phoneNumber = "123-456-7890";
        
        if (phoneRegex.IsMatch(phoneNumber))
        {
            Console.WriteLine("Valid phone number");
        }
        else
        {
            Console.WriteLine("Invalid phone number");
        }
    }
}

Explanation:

  • The pattern ^\d{3}-\d{3}-\d{4}$ requires three digits, a hyphen, three digits, a hyphen, and four digits, with nothing before or after.
  • IsMatch checks if the phone number matches this pattern.
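
One detail worth knowing for validation: in .NET, $ also matches just before a trailing newline, so the pattern above accepts an input that ends with "\n". If that matters, the \z anchor matches only at the very end of the string. The following sketch compares the two; the class name and inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    static void Main()
    {
        string withNewline = "123-456-7890\n";

        // '$' tolerates one trailing newline.
        Console.WriteLine(Regex.IsMatch(withNewline, @"^\d{3}-\d{3}-\d{4}$"));      // True

        // '\z' matches only at the absolute end of the input.
        Console.WriteLine(Regex.IsMatch(withNewline, @"^\d{3}-\d{3}-\d{4}\z"));     // False
        Console.WriteLine(Regex.IsMatch("123-456-7890", @"^\d{3}-\d{3}-\d{4}\z"));  // True
    }
}
```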

Example 2: Extract All Numbers from a String

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        Regex regex = new Regex(@"\d+");
        string input = "The order numbers are 123, 456, and 789.";
        
        MatchCollection matches = regex.Matches(input);
        
        foreach (Match match in matches)
        {
            Console.WriteLine($"Found number: {match.Value}");
        }
    }
}

Explanation:

  • The pattern \d+ matches one or more digits.
  • Matches returns all the matches, which are then printed.
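
If the numbers are needed as values rather than text, each match can be parsed as it is found. The sketch below uses a hypothetical helper, ExtractNumbers, and the pattern [0-9]+ so that only ASCII digits are passed to int.Parse. It assumes every number fits in an int; a very long run of digits would make int.Parse throw an OverflowException.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ExtractDemo
{
    // Hypothetical helper: returns every run of ASCII digits in 'input' as an int.
    public static List<int> ExtractNumbers(string input)
    {
        var numbers = new List<int>();
        foreach (Match match in Regex.Matches(input, @"[0-9]+"))
        {
            numbers.Add(int.Parse(match.Value));
        }
        return numbers;
    }

    static void Main()
    {
        List<int> numbers = ExtractNumbers("The order numbers are 123, 456, and 789.");
        Console.WriteLine(string.Join(", ", numbers));  // 123, 456, 789
    }
}
```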

Example 3: Replace All Digits with Asterisks

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        Regex regex = new Regex(@"\d");
        string input = "Password123!";
        string replaced = regex.Replace(input, "*");
        
        Console.WriteLine(replaced);  // Output: Password***!
    }
}

Explanation:

  • The pattern \d matches any digit.
  • Replace replaces all digits with *.

Example 4: Splitting a Sentence into Words

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        Regex regex = new Regex(@"\W+");
        string input = "Hello, world! Welcome to C#.";
        string[] words = regex.Split(input);
        
        foreach (string word in words)
        {
            Console.WriteLine(word);
        }
    }
}

Explanation:

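  • The pattern \W+ matches one or more non-word characters (e.g., spaces, punctuation).
  • Split breaks the sentence at each of those runs. Because # is a non-word character, "C#" comes back as "C", and because the input ends with "#.", the resulting array ends with an empty string.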
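
Splitting on \W+ leaves an empty string at the end of the array when the input ends with punctuation (and at the start when it begins with punctuation). One way to drop those entries is to filter the result, as in this sketch; the helper name SplitWords is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitDemo
{
    // Hypothetical helper: splits on runs of non-word characters and drops empty pieces.
    public static string[] SplitWords(string input)
    {
        return Regex.Split(input, @"\W+")
                    .Where(piece => piece.Length > 0)
                    .ToArray();
    }

    static void Main()
    {
        string[] words = SplitWords("Hello, world! Welcome to C#.");
        Console.WriteLine(string.Join("|", words));  // Hello|world|Welcome|to|C
    }
}
```

Note that "C#" still comes back as "C", because # is not a word character. Keeping it would need a different pattern.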

Real-World Example: Email Validation

A common use of regex is to validate email addresses. Let’s look at an example that demonstrates email validation.

Step 1: Define the Regex Pattern

The pattern for validating emails is ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$.

  • ^[a-zA-Z0-9._%+-]+: Matches the local part of the email.
  • @: Matches the @ symbol.
  • [a-zA-Z0-9.-]+: Matches the domain name.
  • \.[a-zA-Z]{2,}$: Matches the top-level domain (e.g., .com, .org).

Step 2: Implement Email Validation in C#

using System;
using System.Text.RegularExpressions;


class Program
{
    static void Main()
    {
        Regex emailRegex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");
        string email = "test@example.com";
        
        if (emailRegex.IsMatch(email))
        {
            Console.WriteLine("Valid email address");
        }
        else
        {
            Console.WriteLine("Invalid email address");
        }
    }
}

Explanation:

  • The IsMatch method checks if the email matches the specified pattern.
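  • The pattern checks only the basic shape of an address (local part, @, domain, top-level domain). It does not confirm that the address exists, and it rejects some valid addresses, such as those with internationalized domain names.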

Use Case: This type of validation is widely used in web forms, sign-up pages, and any system requiring user input verification to ensure data consistency.
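
To get a feel for what the pattern accepts and rejects, it helps to run it against a few inputs. The sketch below wraps the same pattern in a hypothetical helper, LooksLikeEmail; the one-second timeout and the null check are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmailDemo
{
    // Same pattern as above; created once and reused. The timeout is an assumption.
    private static readonly Regex EmailRegex = new Regex(
        @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
        RegexOptions.None,
        TimeSpan.FromSeconds(1));

    // Hypothetical helper: true if 'candidate' has the basic shape of an email address.
    public static bool LooksLikeEmail(string candidate)
    {
        return candidate != null && EmailRegex.IsMatch(candidate);
    }

    static void Main()
    {
        Console.WriteLine(LooksLikeEmail("test@example.com"));   // True
        Console.WriteLine(LooksLikeEmail("test@example"));       // False (no top-level domain)
        Console.WriteLine(LooksLikeEmail("test@@example.com"));  // False (two @ symbols)
        Console.WriteLine(LooksLikeEmail("a..b@example.com"));   // True  (pattern allows consecutive dots)
    }
}
```

The last case shows the limits of the pattern: consecutive dots are not valid in a standard unquoted local part, yet the character class lets them through.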

Key Takeaways

  • Regex is Powerful: Regex allows for efficient pattern matching, searching, and replacing text.
  • Regex Class Methods:
    • IsMatch: Check if a pattern matches a string.
    • Match and Matches: Extract specific information.
    • Replace: Replace matched text with a specified string.
    • Split: Split strings based on a pattern.
  • Real-World Utility: Regular expressions are ideal for validating email addresses, phone numbers, and for finding and replacing text.
  • Regex Patterns: Understanding regex syntax is crucial for defining effective search and replace patterns.

Summary

The Regex class in C# is a robust tool for text processing tasks such as searching, extracting, validating, and replacing text. By mastering the methods and properties provided by the Regex class, you can handle complex string operations easily, enhancing both the readability and maintainability of your code.

We covered the most commonly used methods (IsMatch, Match, Matches, Replace, Split) and properties (Options, MatchTimeout) of the Regex class, along with several practical examples to demonstrate their use. Whether you're validating user input like email addresses, extracting numbers, or splitting sentences into words, regex provides a versatile solution for a wide range of text-related tasks in C#.