Defining optional (sub-) URI paths in Nginx locations using PCRE regular expressions


Nginx locations can be defined with PCRE regular expressions. This is nothing new; the feature has been around for many years.

Location regex for (file) URI suffixes

The most popular and widely mentioned use case is a location used for file extensions:

location ~* \.(png|ico|gif|jpg|jpeg|css|js)$ {
    ...
}

This case-insensitive (~*) regular expression location matches request paths ending in .png, .ico, .gif, .jpg, .jpeg, .css or .js.
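As an illustration of what such a location might contain, here is a minimal sketch that caches static files and skips access logging for them. The 30-day lifetime and the decision to disable logging are assumptions made for this example, not recommendations from this article.

```nginx
# Hypothetical example: handle static assets in one regex location.
location ~* \.(png|ico|gif|jpg|jpeg|css|js)$ {
    expires 30d;       # assumption: sets Expires and Cache-Control max-age to 30 days
    access_log off;    # assumption: do not log requests for these files
}
```

Because of ~*, a request for /img/LOGO.PNG lands in this location just like /img/logo.png. The query string is not part of the path the regex is tested against, so /style.css?v=2 matches as well.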

In PHP-FPM environments this is often used to define what should happen with requests to .php files:

location ~* \.php$ {
    try_files $uri =404;
    fastcgi_pass  fastcgi_backend;
    fastcgi_buffers 16 16k;
    fastcgi_buffer_size 32k;
    fastcgi_read_timeout 600s;
    fastcgi_connect_timeout 600s;

    fastcgi_index  index.php;
    fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
    include        fastcgi_params;
}

What about optional paths?

A much less documented use of regular expressions in locations is "optional" paths. For example, if you define a location /shop, you may want the same location to also match the different language versions: /de/shop, /en/shop, /fr/shop.

Of course you could do this by defining different locations for each language:

location /shop {
  proxy_pass http://upstream;
}
location /de/shop {
  proxy_pass http://upstream;
}
location /en/shop {
  proxy_pass http://upstream;
}
location /fr/shop {
  proxy_pass http://upstream;
}

Or you can use a single regular expression location and define an optional (sub-) path:

location ~ ^/(?:(?:de|fr|en)/)?shop/ {
  proxy_pass http://upstream;
}

Let's analyze this regex in detail:

  • ~ = this location uses a case-sensitive PCRE regular expression. ~* would be case insensitive.
  • ^/ = The requested URI starts with a forward slash followed by ...
  • (?:(?:de|fr|en)/)? = an optional language prefix. The inner group (?:de|fr|en) matches one of the values "de", "fr" or "en"; the outer group adds the forward slash that follows the language. The question mark and colon at the beginning of a group only mark it as non-capturing (no $1 variable is created for it). What makes the whole prefix optional is the question mark after the closing parenthesis of the outer group.
  • / inside the group = the forward slash does not need to be escaped with a backslash, because Nginx does not use slashes as regex delimiters. Writing \/ works too but is not required.
  • shop/ = the fixed part of the path, terminating with a forward slash

Now looking at this from top to bottom reveals:

  1. A simple /shop/ will match
  2. But also /fr/shop/ will match, as will /de/shop/ and /en/shop/
  3. And of course everything following the last forward slash matches the location, for example /en/shop/myproduct.php

Because the pattern ends in shop/, a request without the trailing slash, such as /shop or /de/shop, does not match. To cover that case as well, replace shop/ with shop(?:/|$).
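To check which paths such a location really catches, a throwaway server block that only returns a text per location can help. The following is a minimal sketch to be placed inside the http context; the port 8080 and the response texts are assumptions for this test. The pattern uses a trailing ? after the outer group so that the language prefix is explicitly optional.

```nginx
# Hypothetical test server: port and response texts are made up for this check.
server {
    listen 8080;
    default_type text/plain;

    # The language prefix is optional because of the ? after the outer group.
    location ~ ^/(?:(?:de|fr|en)/)?shop/ {
        return 200 "shop location\n";
    }

    # Every request not matched by the regex above ends up here.
    location / {
        return 200 "default location\n";
    }
}
```

With this in place, requests for /shop/, /fr/shop/ or /en/shop/myproduct.php should answer with "shop location", while /shop (no trailing slash) and /es/shop/ (language not in the list) should fall through to "default location".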
