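So apparently everyone but me knew that in MySQL, the utf8 character set is not true UTF-8, but a limited subset that stores at most 3 bytes per character, which rules out anything outside the Basic Multilingual Plane (emoji, for instance). Also, apparently utf8_general_ci takes shortcuts when comparing and sorting, is hopelessly defective, and should never be used.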

Hmmmph.

What you really want is utf8mb4 as the character set and utf8mb4_unicode_ci as the collation. This gives you true UTF-8 support (up to 4 bytes per character) and sorting based on the Unicode Collation Algorithm.
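
To see which text the old utf8 charset cannot hold, here is a minimal Python sketch. The helper name fits_in_legacy_utf8 and the sample strings are made up for illustration; the only thing it relies on is that MySQL's utf8 tops out at 3 bytes per character.

```python
def fits_in_legacy_utf8(text: str) -> bool:
    """Return True if every character needs at most 3 bytes in UTF-8.

    Hypothetical helper: this is the limit of MySQL's 3-byte utf8 charset.
    Characters that need 4 bytes (emoji, for example) require utf8mb4.
    """
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    # Illustrative samples only.
    for sample in ["plain ASCII", "café", "日本語", "emoji 😀"]:
        print(repr(sample), fits_in_legacy_utf8(sample))
```

Anything this reports as False is the kind of text that needs utf8mb4.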

Spent part of an afternoon updating all my code and converting my databases to use the correct charset. Should be good to go now.
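
For anyone doing the same conversion, a rough sketch of the statements involved, generated from Python. This is not the exact script I ran; the function name conversion_statements and the database and table names are placeholders, and it assumes the standard ALTER DATABASE and ALTER TABLE ... CONVERT TO CHARACTER SET statements.

```python
CHARSET = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci"


def conversion_statements(database, tables):
    """Build the SQL to switch a database and its tables to utf8mb4.

    Hypothetical helper: it only builds strings and does not connect to
    MySQL. Names are wrapped in backticks but not escaped, so pass
    trusted identifiers only.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {CHARSET} COLLATE {COLLATION};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{database}`.`{table}` "
            f"CONVERT TO CHARACTER SET {CHARSET} COLLATE {COLLATION};"
        )
    return statements


if __name__ == "__main__":
    # "blog", "posts" and "comments" are placeholder names.
    for statement in conversion_statements("blog", ["posts", "comments"]):
        print(statement)
```

Take a backup before running anything like this, and remember that the client connection has to use utf8mb4 as well, or 4-byte characters can still get mangled on the way in.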

Also updated the database-creation code in my LAMP Server Setup Guide.