MySQL and German Umlauts in a Unique Constraint

I recently came across some very odd behaviour in MySQL with German Umlauts (ö ä ü ß) and Unique Constraints. The problem is even documented as Bug #57860 on mysql.com. In short, MySQL (or rather utf8_unicode_ci) would suggest that foobär is the same as foobar. So the statement
INSERT INTO test (test) VALUES ('foobar'),('foobär');
where test is a column with a unique index/constraint would fail.

This behaviour might be desired in some languages, but particularly for the german language this behaviour is not optimal. I'm sure if you find this blog post, you came across the same problem so I do not need to come up with another example.

The solution is simple, though you should think twice before you use it:
Instead of utf8_unicode_ci, you could use utf8_general_ci or even the newer and more appropriate utf8_german2_ci (available starting with MySQL 5.6).

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.