Convert database and tables from utf8 to utf8mb4 and solve collation errors in Jira

Written by - 0 comments

Published on September 29th 2020 - last updated on October 3rd 2020 - Listed in Atlassian DB MySQL


After Jira's database was migrated from a MariaDB 10.3 to a MySQL 5.7 and Jira was updated from 8.5 to 8.12, issues started when creating a ticket with special characters (such as an emoticon) in it:

(SQL Exception while executing the following:INSERT INTO jiraaction (ID, issueid, AUTHOR, actiontype, actionlevel, rolelevel, actionbody, CREATED, UPDATEAUTHOR, UPDATED, actionnum) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) (Incorrect string value: '\xF0\x9F\x98\x8A \x0A...' for column 'actionbody' at row 1))

The reason for this is that Jira now supports UTF8MB4 character encoding in combination with MySQL 5.7 (see Connecting Jira applications to MySQL 5.7). In MySQL 5.6 this character encoding caused application issues and setting the Jira database to UTF8 was mandatory (see Connecting Jira applications to MySQL 5.6).

Jira and database / collation compatibility

As a general rule of thumb, the supported MySQL character sets and collations by Jira are (as of September 2020):

  • MySQL 8.0: utf8mb4_bin and utf8mb4_0900_ai_ci. The second is the new default collation in MySQL 8.0. MySQL 8.0 is supported since Jira 8.12.
  • MySQL 5.7: utf8mb4_bin.
  • MySQL 5.6: utf8_bin. MySQL 5.6 is not supported anymore in Jira 8.12 and later.

Additionally it depends on your Jira (and Confluence) version, whether or not it can communicate with a utf8mb4 MySQL database. The following list indicates the minimum release version to be able to use utf8mb4 from an application point of view:

  • Jira 8.x and later, running on MySQL 5.7 or later
  • Confluence 7.3 and later, running on MySQL 5.7.9 or later
Older releases, especially when using a MySQL 5.6 database, must continue to use utf8.

Changing the database to utf8mb4

Note: This basically applies to all databases, not only to Jira.

In this case we created a database dump, dropped the Jira database and re-created it with UTF8MB4 (as described here):

mysql> CREATE DATABASE Jira CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

Afterwards the database dump was imported into this new empty database.

Note: It would also have been possible to simply alter the database using:

mysql> ALTER DATABASE Jira DEFAULT CHARACTER SET = utf8mb4 DEFAULT COLLATE = utf8mb4_bin;

See article How to change the collation of a MySQL database for more information.

This is of course much faster than to export and import a database dump.

Error: Unsupported collation: utf8mb4_bin

However after a Jira restart, the instance health checks showed an error in the database collation check:

Your mysql database is currently using an unsupported collation: utf8mb4_bin. You should change this to a supported collation: utf8_bin

Jira health check database error: unsupported collation utf8mb4_bin

In order that Jira starts to use the newer UTF8MB4 character encoding, it needs to be told to use a MySQL 5.7 (using mysql57 as database type) now. This needs to be defined in Jira's dbconfig.xml:

root@jira:~# head /var/atlassian/application-data/jira/dbconfig.xml
<!--?xml version="1.0" encoding="UTF-8"?-->

<jira-database-config>
<name>defaultDS</name>
<delegator-name>default</delegator-name>
<database-type>mysql57</database-type>
<jdbc-datasource>
<url>jdbc:mysql://dbserver:3306/Jira?useUnicode=true&characterEncoding=UTF8&sessionVariables=default_storage_engine=InnoDB</url>
<driver-class>com.mysql.jdbc.Driver</driver-class>
<username>jirauser</username>
</jdbc-datasource>
</jira-database-config>

Note: The previous database-type was just 'mysql'.

But this alone did not solve it...

Error: 'utf8_bin' is unsupported by Jira

After database-type was changed in dbconfig.xml and Jira restarted, the next error was shown in the database health check:

The table collation: 'utf8_bin' is unsupported by Jira. The Database collation: 'utf8mb4_bin' is supported by Jira.

Jira health check database error: unsupported table collation utf8_bin

Important here is the mention of "table collation". This error is showing because although the database's encoding was set to utf8mb4, all the tables (imported from the dump) still have the old utf8 collation active.

Verifying table encoding and collations

According to Jira's health check, the table collations are still set to utf8_bin. This can be verified using the following SQL query (limited to 15 tables):

mysql> SELECT DISTINCT table_name,table_collation FROM information_schema.tables WHERE table_schema = "Jira" LIMIT 0,15;
+--------------------------------+-----------------+
| table_name                     | table_collation |
+--------------------------------+-----------------+
| AO_013613_ACTIVITY_SOURCE      | utf8_bin        |
| AO_013613_EXPENSE              | utf8_bin        |
| AO_013613_EXP_CATEGORY         | utf8_bin        |
| AO_013613_FAVORITES            | utf8_bin        |
| AO_013613_HD_SCHEME            | utf8_bin        |
| AO_013613_HD_SCHEME_DAY        | utf8_bin        |
| AO_013613_HD_SCHEME_MEMBER     | utf8_bin        |
| AO_013613_PERMISSION_GROUP     | utf8_bin        |
| AO_013613_PROJECT_CONFIG       | utf8_bin        |
| AO_013613_SAVED_REPORT         | utf8_bin        |
| AO_013613_SAVED_REPORT_V2      | utf8_bin        |
| AO_013613_WA_SL_VALUE          | utf8_bin        |
| AO_013613_WA_VALUE             | utf8_bin        |
| AO_013613_WL_SCHEME            | utf8_bin        |
| AO_013613_WL_SCHEME_DAY        | utf8_bin        |
+--------------------------------+-----------------+
15 rows in set (0.01 sec)

But that isn't enough. Not only the tables have a collation - the columns inside the tables (may) also have character encoding defined. The following SQL query helps to find all the column names and the relevant tables (here limited to 15 results):

mysql> SELECT DISTINCT column_name,table_name,character_set_name,collation_name FROM information_schema.columns WHERE table_schema = "Jira" AND character_set_name IS NOT NULL LIMIT 0,15;
+-------------+----------------------------+--------------------+----------------+
| column_name | table_name                 | character_set_name | collation_name |
+-------------+----------------------------+--------------------+----------------+
| SOURCE_ID   | AO_013613_ACTIVITY_SOURCE  | utf8               | utf8_bin       |
| SOURCE_TYPE | AO_013613_ACTIVITY_SOURCE  | utf8               | utf8_bin       |
| CREATED_BY  | AO_013613_EXPENSE          | utf8               | utf8_bin       |
| DESCRIPTION | AO_013613_EXPENSE          | utf8               | utf8_bin       |
| SCOPE_TYPE  | AO_013613_EXPENSE          | utf8               | utf8_bin       |
| NAME        | AO_013613_EXP_CATEGORY     | utf8               | utf8_bin       |
| ITEM        | AO_013613_FAVORITES        | utf8               | utf8_bin       |
| TYPE        | AO_013613_FAVORITES        | utf8               | utf8_bin       |
| USER        | AO_013613_FAVORITES        | utf8               | utf8_bin       |
| DESCRIPTION | AO_013613_HD_SCHEME        | utf8               | utf8_bin       |
| NAME        | AO_013613_HD_SCHEME        | utf8               | utf8_bin       |
| DESCRIPTION | AO_013613_HD_SCHEME_DAY    | utf8               | utf8_bin       |
| NAME        | AO_013613_HD_SCHEME_DAY    | utf8               | utf8_bin       |
| TYPE        | AO_013613_HD_SCHEME_DAY    | utf8               | utf8_bin       |
| USER_KEY    | AO_013613_HD_SCHEME_MEMBER | utf8               | utf8_bin       |
+-------------+----------------------------+--------------------+----------------+
15 rows in set (0.02 sec)

This means: Not only the tables need to be altered for the new UTF8MB4 encoding, but also the table columns shown above.

Note: The advantage here is that all existing tables and columns use utf8. If some of the encodings would still use latin1, this might cause migration problems (and should probably first be migrated from latin1 to utf8).

Time to backup!

Before continuing and starting the character conversion in the MySQL database, Jira should be stopped and a full database dump/backup should be created.

root@jira:~# mysqldump --routines --events --single-transaction --quick --max_allowed_packet=512M Jira | gzip > Jira.20200929.sql.gz

utf8 to utf8mb4 conversion

A couple of good examples on how to migrate from utf8 to utf8mb4 can be found in the responses of a DBA Stackexchange question. A good way was shared by Stackexchange user mrjingles87, but it needs to be slightly adjusted for Jira.

First create a file, we name it migrate-to-utf8mb4.sql, with the following content:

root@jira:~# cat migrate-to-utf8mb4.sql
use information_schema;
SELECT concat("ALTER DATABASE `",table_schema,"` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin;") as _sql
FROM `TABLES` where table_schema like "Jira" and TABLE_TYPE='BASE TABLE' group by table_schema;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name,"` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;") as _sql  
FROM `TABLES` where table_schema like "Jira" and TABLE_TYPE='BASE TABLE' group by table_schema, table_name;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name, "` CHANGE `",column_name,"` `",column_name,"` ",data_type,"(",character_maximum_length,") CHARACTER SET utf8mb4 COLLATE utf8mb4_bin",IF(is_nullable="YES"," NULL"," NOT NULL"),";") as _sql
FROM `COLUMNS` where table_schema like "Jira" and data_type in ('varchar','char');
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name, "` CHANGE `",column_name,"` `",column_name,"` ",data_type," CHARACTER SET utf8mb4 COLLATE utf8mb4_bin",IF(is_nullable="YES"," NULL"," NOT NULL"),";") as _sql
FROM `COLUMNS` where table_schema like "Jira" and data_type in ('text','tinytext','mediumtext','longtext');

Note: It's very important to use COLLATE utf8mb4_bin. The original collate of the Stackexchange article above uses utf8mb4_unicode_ci, which is not compatible with Jira.

Now run the sql file against the MySQL server:

root@jira:~# mysql < migrate-to-utf8mb4.sql | egrep "^ALTER" > finalalterquery.sql

This generates the ALTER TABLE queries for each table and each column (needing a conversion) and saves it in an additional file (finalalterquery.sql). The queries also keep existing data type limits, such as varchar character limits:

root@jira:~# grep varchar finalalterquery.sql |tail
ALTER TABLE `Jira`.`userpickerfiltergroup` CHANGE `groupname` `groupname` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`versioncontrol` CHANGE `vcsname` `vcsname` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`versioncontrol` CHANGE `vcsdescription` `vcsdescription` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`versioncontrol` CHANGE `vcstype` `vcstype` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`workflowscheme` CHANGE `NAME` `NAME` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`workflowschemeentity` CHANGE `WORKFLOW` `WORKFLOW` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`workflowschemeentity` CHANGE `issuetype` `issuetype` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`worklog` CHANGE `AUTHOR` `AUTHOR` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`worklog` CHANGE `grouplevel` `grouplevel` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;
ALTER TABLE `Jira`.`worklog` CHANGE `UPDATEAUTHOR` `UPDATEAUTHOR` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL;

The finalalterquery.sql file can now be run against the MySQL server. Before you do that, make sure that the application using this database (Jira in this case) is stopped. An ALTER TABLE will lock the table and the application might stop working correctly.

root@jira:~# mysql < finalalterquery.sql

This took 8 minutes to complete on a ~4GB Jira database.

Jira start and health checks are green

After Jira was started again (without any changes in dbconfig.xml), the instance health checks finally showed all checks passed.

Atlassian Jira or Confluence Hosting needed?

Looking for professional Atlassian Jira or Confluence server hosting in Switzerland? Head on over to Infiniroot!


Add a comment

Show form to leave a comment

Comments (newest first)

No comments yet.