What’s the problem with regular expressions?

Regular expressions can seem tricky to read and write, especially for beginners.

Let’s see if it’s just a misunderstanding.



Getting started

A regular expression, also called regex or regexp, is a search pattern. This search pattern lets you match a specific substring within a string.

There are several regex flavors and engines you can use to match text against your pattern. The most popular are probably POSIX and PCRE.

Depending on the engine you use, the same pattern may give different results, but this tutorial does not cover those differences.

PHP includes PCRE functions that apply your regex to a string. For example, preg_match() searches a subject string for a match to the given pattern. It returns 1 if the pattern matches, 0 if it does not, and false if an error occurred:

$chocolateString = "I want more chocolate in my chocolate.";
if (preg_match("/chocolate/", $chocolateString) === 1) {
    echo "I see some chocolate";
}

The “/” characters are delimiters that mark the start and end of your pattern. Be careful: regexes are case-sensitive by default:

$chocolateString = "I want more Chocolate in my Chocolate.";
if (preg_match("/chocolate/", $chocolateString) === 1) {
    // this won't match here
} else {
    echo "I don't see any chocolate";
}

Fortunately, you can use the i modifier to fix this :

$chocolateString = "I want more Chocolate in my Chocolate.";
if (preg_match("/chocolate/i", $chocolateString) === 1) {
    echo "I see some chocolate";
}
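
The delimiter does not have to be a slash. As a small sketch (the $url value and the patterns are made up for illustration), another delimiter such as # avoids escaping every / inside the pattern, and preg_quote() can escape arbitrary text before it is embedded in a pattern:

```php
<?php
// Hypothetical input, chosen only for illustration.
$url = "https://example.com/shop/chocolate";

// With "/" as the delimiter, every slash in the pattern must be escaped.
$withSlashes = preg_match("/\/shop\/chocolate/", $url);

// With "#" as the delimiter, the same pattern needs no escaping.
$withHashes = preg_match("#/shop/chocolate#", $url);

// preg_quote() escapes regex special characters in arbitrary text;
// the second argument names the delimiter so it gets escaped as well.
$needle  = "shop/chocolate";
$pattern = "#" . preg_quote($needle, "#") . "#i";
$quoted  = preg_match($pattern, $url);

var_dump($withSlashes, $withHashes, $quoted);
```

All three calls should report a match. Which delimiter to pick is mostly a readability choice.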



More advanced examples

You can use metacharacters to include alternatives and repetitions in your pattern:

$chocolateString = "I want more Chocolate in my Chocolate.";
if (preg_match('/(choco|late)/', $chocolateString) === 1) { 
    echo "I see some choco but it might be too late!";
}

The | meta character allows for alternative branches.
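
Repetitions are expressed with quantifiers. Here is a minimal sketch, with sample strings invented for this example: ? makes the preceding element optional, + means one or more, and {n,m} bounds the number of repetitions.

```php
<?php
// "?" makes the preceding character optional.
$optional = preg_match('/chocolates?/', "two chocolates");

// "+" means one or more repetitions of the preceding element.
$oneOrMore = preg_match('/cho+co/', "chooooco");

// "{2,3}" means between two and three repetitions of the group.
$bounded = preg_match('/^(choco){2,3}$/', "chocochoco");           // two repetitions
$tooMany = preg_match('/^(choco){2,3}$/', "chocochocochocochoco"); // four repetitions

var_dump($optional, $oneOrMore, $bounded, $tooMany);
```

The first three calls match. The last one does not, because the ^ and $ anchors force the whole string to fit within three repetitions.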

You can even define subpatterns :

<?php
$chocolateString = "I'm already late but I want more milk in my chocomilk.";
if (preg_match("/choco(late|milk)/i", $chocolateString) === 1) {
    echo "I see some milk or chocolate.";
}

Here the standalone “late” at the start of the sentence is ignored: the parentheses define a subpattern that must follow “choco”, so the pattern matches “chocolate” or “chocomilk”, but not “choco” or “late” alone. You can verify that with the following:

<?php
$chocolateString = "I'm already late but I want more milk in my chocomilk.";
$test = preg_match_all("/choco(late|milk)/", $chocolateString, $matches);
print_r($matches);

This would print:

Array
(
    [0] => Array
        (
            [0] => chocomilk
        )

    [1] => Array
        (
            [0] => milk
        )

)
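
To read that output: index 0 holds the full matches and index 1 holds what the first pair of parentheses captured. As a sketch, preg_match() accepts the same third argument, and a named group (the name flavor is an arbitrary choice for this example) can make the result easier to read:

```php
<?php
$chocolateString = "I'm already late but I want more milk in my chocomilk.";

// (?<flavor>...) is a named capture group; it is also available as index 1.
if (preg_match('/choco(?<flavor>late|milk)/', $chocolateString, $match) === 1) {
    echo $match[0], PHP_EOL;        // full match: chocomilk
    echo $match['flavor'], PHP_EOL; // captured subpattern: milk
}
```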



Regex can be bad

As you can see, you have to use a specific syntax to make your search pattern work. At the end of the day, it’s mostly a matter of practice and of picking the right modifiers and quantifiers.

But problems can happen. A badly written pattern can take much longer to process because it triggers a lot of unnecessary operations instead of aiming at what you need.

This is not micro-optimization! A bad regex can take thousands of milliseconds where the right pattern needs a few tens of milliseconds at most.
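
One well-known way a pattern goes bad is nested quantifiers, which can force the engine to try a huge number of ways to split the input before giving up. The sketch below is an illustration under assumptions, not a benchmark: the patterns and the input are invented, and the timings depend on the PHP version and on PCRE settings such as pcre.backtrack_limit and the JIT, so no numbers are shown.

```php
<?php
// 25 "a" characters followed by one "b": neither pattern can match.
$subject = str_repeat("a", 25) . "b";

// Nested quantifiers: many ways to split the run of "a" have to be tried.
$start      = hrtime(true);
$slowResult = preg_match('/^(a+)+$/', $subject);
$slowNs     = hrtime(true) - $start;
// preg_match() returns false on error, for example when a PCRE limit is hit.
$slowError  = preg_last_error();

// Same intent with a single quantifier: the match fails after one pass.
$start      = hrtime(true);
$fastResult = preg_match('/^a+$/', $subject);
$fastNs     = hrtime(true) - $start;

printf("nested: %d ns (error code %d)\n", $slowNs, $slowError);
printf("simple: %d ns\n", $fastNs);
```

Depending on the configuration, the nested version either reports no match after a long search or stops early with an error such as PREG_BACKTRACK_LIMIT_ERROR. Either way, the simpler pattern describes the same strings at a fraction of the cost.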



Use regex101 to test your pattern

Regex101 is one of the most popular online testing tools. It has an extra cool interface with great features such as save and share, debugger and code generator.

Besides, it will provide some useful explanations to help you understand why your pattern is not working.



Don’t use regex for everything

PHP has built-in filters for validation. For example, instead of trying to write a custom regex pattern to validate e-mail addresses, use the following filter :

if (filter_var($email, FILTER_VALIDATE_EMAIL)) { }

There are filters to validate IPs, integers, URLs and even regexp.
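
A short sketch of those filters, using input values invented for this example. filter_var() returns the filtered value on success and false on failure, so comparing with false explicitly is safer than relying on truthiness:

```php
<?php
$ip  = filter_var("192.168.1.10", FILTER_VALIDATE_IP);
$url = filter_var("https://example.com/choco", FILTER_VALIDATE_URL);
$bad = filter_var("not a url", FILTER_VALIDATE_URL);

// Options are passed in an array; here the integer must be between 1 and 100.
$quantity = filter_var("42", FILTER_VALIDATE_INT, [
    "options" => ["min_range" => 1, "max_range" => 100],
]);

// FILTER_VALIDATE_REGEXP validates against a pattern you provide.
$code = filter_var("choco-42", FILTER_VALIDATE_REGEXP, [
    "options" => ["regexp" => '/^choco-\d+$/'],
]);

// A valid "0" becomes int 0, which is falsy, hence the strict comparison.
$zero = filter_var("0", FILTER_VALIDATE_INT);
var_dump($zero !== false);
```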

If you only need to check that a string contains a specific substring, it’s probably a better idea to use something like strpos():

$target = 'choco';
$string = 'chocolate, chocolate, chocolate';
$pos    = strpos($string, $target);

if ($pos !== false) {
    echo "This has some choco.";
}
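
Two related functions, shown here as a sketch with a sample string made up for the example: str_contains() requires PHP 8.0 or later, and stripos() is the case-insensitive variant of strpos().

```php
<?php
$string = 'Chocolate, chocolate, chocolate';

// str_contains() returns a boolean, so there is no 0-versus-false pitfall.
$hasChoco = str_contains($string, 'choco');

// stripos() ignores case. It finds "Choco" at position 0 here,
// which is why the comparison must be !== false and not a truthiness check.
$pos = stripos($string, 'CHOCO');

var_dump($hasChoco, $pos !== false);
```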

Similarly, if you need to parse HTML or XML, using a parser is a much better idea than trying to match things with a regex pattern.
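
As a minimal sketch, assuming PHP’s DOM extension is available and using an HTML fragment invented for this example, DOMDocument can extract link targets without any pattern matching:

```php
<?php
// Invented HTML fragment for illustration.
$html = '<p>Buy <a href="/dark">dark</a> or <a class="x" href="/milk">milk</a> chocolate.</p>';

// Collect libxml warnings instead of printing them when markup is imperfect.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);

// Walk every <a> element and read its href attribute.
$links = [];
foreach ($document->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

The parser does not care about attribute order, quoting style or extra attributes, which are exactly the details that tend to break a hand-written regex.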



But learn regex anyway

There are some great PHP functions or libraries out there. Most of the time, you can get what you want without any regex.

However, you still need to be able to read regexes. Questions about them are not rare in interviews.

Besides, regexes are available across many languages, so statistically, you will have to deal with them.
