How to match strings case-insensitively (uppercase and lowercase) with regular expressions in Bash

Published on September 4th 2020 - Listed in Coding Bash Linux


Bash is a great programming language for quickly writing scripts and automating tasks. But there is one thing Bash isn't particularly well suited for: string comparisons (I'd use Perl for this). With a little bit of Bash magic, however, this works well, too.

Basic string matching

There are string comparisons available using test. From the test man page:

       STRING1 = STRING2
              the strings are equal

       STRING1 != STRING2
              the strings are not equal

A basic comparison of two identical strings works as expected:

ckadm@mintp ~ $ var="linux"
ckadm@mintp ~ $ if [[ "$var" = "linux" ]]; then echo "yes"; else echo "no"; fi
yes

But what if one wants to know whether one string is only a part of another string, similar to a regular expression match?

ckadm@mintp ~ $ var="linu"
ckadm@mintp ~ $ if [[ "$var" = "linux" ]]; then echo "yes"; else echo "no"; fi
no

Because the variable $var only contains a part of the string ("linu"), the comparison fails. With a quoted right-hand side, the = operator only returns true if both strings are identical, so the condition prints "no" in this case.

When running Bash in trace mode (set -x, also known as xtrace), the comparison can be seen:

ckadm@mintp ~ $ var="linu"
ckadm@mintp ~ $ set -x
set -x
ckadm@mintp ~ $ if [[ "$var" = "linux" ]]; then echo "yes"; else echo "no"; fi
+ [[ linu = \l\i\n\u\x ]]
+ echo no
no

The trace output clearly shows that [[ linu = \l\i\n\u\x ]] will not match. The backslashes only indicate that the right-hand side was quoted and is therefore compared literally; they can be ignored when reading the output.
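The backslashes in the trace output hint at a detail of [[ ]]: the right-hand side of = is compared literally when it is quoted, but treated as a shell glob pattern when it is not. The following is a minimal sketch of that difference; the variable names quoted and unquoted are only chosen for illustration:

```shell
#!/usr/bin/env bash
var="linux"

# Quoted right-hand side: compared literally, so "lin*" is not equal to "linux".
if [[ "$var" = "lin*" ]]; then quoted="yes"; else quoted="no"; fi

# Unquoted right-hand side: treated as a glob pattern, so lin* matches "linux".
if [[ "$var" = lin* ]]; then unquoted="yes"; else unquoted="no"; fi

echo "quoted: $quoted, unquoted: $unquoted"
```

This should print "quoted: no, unquoted: yes". Glob patterns are not regular expressions, though, which is where the next section comes in.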

Regular expression string matching

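Bash, however, offers another operator inside double brackets: the regular expression match operator =~ (an equals sign followed by a tilde). With it, Bash checks whether the string on the left matches the regular expression on the right.
But use caution! The order of the operands matters: the left side is the string to be searched, the right side is the pattern. If the variable $var ("linu") is used on the left side, there will be no match, because the pattern "linux" does not occur in "linu":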

ckadm@mintp ~ $ var="linu"
ckadm@mintp ~ $ if [[ "$var" =~ "linux" ]]; then echo "yes"; else echo "no"; fi
no

But if the variable is on the right side of the comparison, its value serves as the pattern, and "linu" is found within "linux":

ckadm@mintp ~ $ var="linu"
ckadm@mintp ~ $ if [[ "linux" =~ "$var" ]]; then echo "yes"; else echo "no"; fi
yes

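Important note: The =~ operator only works in conditions with double brackets [[ condition ]], not with test or single brackets [ condition ]. Also, in Bash 3.2 and later, a quoted right-hand side is matched as a literal string; the pattern has to be left unquoted if regular expression characters should keep their special meaning.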
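The operand order and the effect of quoting can be tried out with a small script. This is only a sketch: regex_match is a hypothetical helper name, not a Bash builtin, and the behavior of quoted patterns assumes Bash 3.2 or later with default compatibility settings:

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints "yes" if the string ($1) matches the regex ($2).
regex_match() {
    local string=$1 pattern=$2
    # $pattern is left unquoted on purpose so it is interpreted as a regex.
    if [[ $string =~ $pattern ]]; then echo "yes"; else echo "no"; fi
}

regex_match "linux" "linu"      # yes: "linu" occurs inside "linux"
regex_match "linu"  "linux"     # no:  "linux" does not occur inside "linu"
regex_match "linux" '^l.*x$'    # yes: anchors and wildcards work when unquoted
regex_match "linux" '^inu'      # no:  the string does not start with "inu"

# Quoting the right-hand side removes the special meaning of regex characters.
if [[ "linux" =~ "l.n" ]]; then echo "yes"; else echo "no"; fi   # no:  literal "l.n"
if [[ "linux" =~ l.n ]]; then echo "yes"; else echo "no"; fi     # yes: "." matches "i"
```

Keeping the pattern in a variable and expanding it unquoted, as the helper does, avoids most quoting surprises with characters such as spaces or backslashes.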

Now to the next problem: Case insensitive matching!

Case insensitive regular expression string matching

The previous tests always used lowercase letters and string matching worked. But what if the variable contains a mix of lowercase and uppercase letters?

ckadm@mintp ~ $ var="LinuX"
ckadm@mintp ~ $ if [[ "linux" =~ "$var" ]]; then echo "yes"; else echo "no"; fi
no

Because of the uppercase L and X in "LinuX", the variable doesn't match "linux" anymore, even with regular expression matching.

To handle this, Bash needs to be told to switch into "nocasematch" mode. This can be done using the shopt command (a Bash-specific builtin to set or unset shell options, see Shopt on Cyberciti):

ckadm@mintp ~ $ var="LinuX"
ckadm@mintp ~ $ shopt -s nocasematch; if [[ "linux" =~ "$var" ]]; then echo "yes"; else echo "no"; fi
yes
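Because shopt -s nocasematch changes the behavior of the current shell until it is switched off again, it can be useful to confine it to a helper function. The following is a minimal sketch, assuming a hypothetical function name contains_nocase (not part of Bash), which enables the option for a single match and restores the previous state afterwards:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if needle ($2) occurs in haystack ($1),
# ignoring case. The caller's nocasematch setting is left as it was.
contains_nocase() {
    local haystack=$1 needle=$2 rc=1 was_set=0

    # Remember whether nocasematch was already enabled by the caller.
    shopt -q nocasematch && was_set=1

    shopt -s nocasematch
    # The quoted right-hand side is matched as a literal string.
    [[ $haystack =~ "$needle" ]] && rc=0

    # Only switch the option off again if it was off before.
    (( was_set )) || shopt -u nocasematch
    return "$rc"
}

var="LinuX"
if contains_nocase "linux" "$var"; then echo "yes"; else echo "no"; fi
```

With var="LinuX" this should print "yes", and a later shopt -q nocasematch should report the option as disabled again if it was disabled before the call.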

What about just a part of the full string?

ckadm@mintp ~ $ var="LiN"
ckadm@mintp ~ $ shopt -s nocasematch; if [[ "linux" =~ "$var" ]]; then echo "yes"; else echo "no"; fi
yes

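The condition now finally returns "yes" because the regular expression is matched case-insensitively. Keep in mind that nocasematch stays enabled for the rest of the shell session or script until it is disabled again with shopt -u nocasematch.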
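As an alternative that does not change any shell option, both strings can be converted to lowercase before they are compared. Note that this is a different technique from the nocasematch approach used in this article. The sketch assumes Bash 4.0 or later (for the ${var,,} expansion), treats the second argument as a literal string rather than a regular expression, and uses a hypothetical helper name, contains_lower:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if needle ($2) occurs in haystack ($1),
# ignoring case, without touching any shell option.
contains_lower() {
    # ${1,,} and ${2,,} expand to the lowercase versions of the arguments.
    local haystack=${1,,} needle=${2,,}
    # The quoted right-hand side is matched as a literal string.
    [[ $haystack =~ "$needle" ]]
}

var="LiN"
if contains_lower "linux" "$var"; then echo "yes"; else echo "no"; fi
```

With var="LiN" this should print "yes". Lowercasing a real regular expression could alter its meaning (for example in character classes), which is why the needle is handled as a plain string here.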

