Real-world String Comparison: How to handle Unicode sequences correctly

Research output: Contribution to journal › Article › peer-review

Abstract

In many languages a string comparison is a pitfall for beginners. With any Unicode string as input, a comparison often causes problems even for advanced users. The semantic equivalence of different characters in Unicode requires a normalization of the strings before comparing them. This article shows how to handle Unicode sequences correctly. The comparison of two strings for equality often raises questions concerning the difference between comparison by value, comparison of object references, strict equality, and loose equality. The most important aspect is semantic equivalence.
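
The abstract's central point, that semantically equivalent Unicode sequences must be normalized before they are compared, can be illustrated with a minimal Python sketch. This is not taken from the article; the helper name `equivalent` and the default choice of NFC are assumptions made here for illustration. It relies only on the standard-library function `unicodedata.normalize`.

```python
import unicodedata


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying the same Unicode normalization form.

    `equivalent` is a hypothetical helper; `form` may be "NFC", "NFD",
    "NFKC", or "NFKD".
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# A plain comparison works on code points, so the two spellings differ.
print(composed == decomposed)

# After canonical normalization both spellings compare equal.
print(equivalent(composed, decomposed))

# Compatibility characters such as the "ﬁ" ligature (U+FB01) match "fi"
# only under a compatibility form such as NFKC.
print(equivalent("\ufb01", "fi"), equivalent("\ufb01", "fi", form="NFKC"))
```

Which form is appropriate depends on the application: the canonical forms (NFC, NFD) preserve distinctions such as ligatures, while the compatibility forms (NFKC, NFKD) fold them away.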
Original language: English
Pages (from-to): 107-116
Journal: Queue
Volume: 19
Issue number: 3
Publication status: Published - Jul 2021
