Joe Honton
Jan 15, 2021

I'm a big fan of UTF-8, because it gives me the ability to do so many things. Your criticisms don't hit the mark for me. For example, the ternary-trie search (https://github.com/readwritetools/ternwords) fully implements word look-ahead across the entire Unicode character set, encoded as UTF-8.

For an in-the-wild demonstration, try clicking on the 🔎 button on https://readwritestack.com/components/search.blue

  • Search for words beginning with š (a letter with a diacritic that is missing from some legacy codepages, such as ISO 8859-1) and you'll discover šefik.
  • Search for words beginning with γ (a Greek letter) and you'll discover γεωγραφία.
  • Search for Hebrew words beginning with ט״ (remember that Hebrew is a right-to-left language) and you'll discover ט״ו and ט״ז.
  • Search for Japanese words beginning with 漢 (a character that takes three bytes in UTF-8) and you'll discover 漢字.
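
To make the idea concrete, here is a minimal sketch of prefix look-ahead over a ternary search trie. It is not the ternwords implementation: the class names, the method names, and the word list are all hypothetical, and I'm assuming that comparing one Unicode code point at a time (after NFC normalization) is good enough for a demonstration.

```python
import unicodedata


class Node:
    """One trie node holding a single Unicode code point."""
    __slots__ = ("ch", "lo", "eq", "hi", "is_word")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_word = False


class TernaryTrie:
    """Hypothetical minimal ternary search trie; not the ternwords API."""

    def __init__(self):
        self.root = None

    def insert(self, word):
        # Normalize so precomposed and decomposed spellings match.
        word = unicodedata.normalize("NFC", word)
        if word:
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, i):
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_word = True
        return node

    def starting_with(self, prefix):
        """Return every stored word that begins with prefix, in code point order."""
        prefix = unicodedata.normalize("NFC", prefix)
        if not prefix:
            return []
        node, i = self.root, 0
        while node is not None:
            ch = prefix[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(prefix):
                    break
                node = node.eq
        if node is None:
            return []
        results = [prefix] if node.is_word else []
        self._collect(node.eq, prefix, results)
        return results

    def _collect(self, node, path, results):
        # In-order walk: lower siblings, this node, its continuations, higher siblings.
        if node is None:
            return
        self._collect(node.lo, path, results)
        if node.is_word:
            results.append(path + node.ch)
        self._collect(node.eq, path + node.ch, results)
        self._collect(node.hi, path, results)


# Hypothetical word list taken from the examples in this comment.
WORDS = ["šefik", "γεωγραφία", "ט״ו", "ט״ז", "漢字"]
trie = TernaryTrie()
for w in WORDS:
    trie.insert(w)

print(trie.starting_with("š"))
print(trie.starting_with("ט״"))
print(trie.starting_with("漢"))
```

Nothing in the sketch depends on how many bytes a character occupies in UTF-8 or on the writing direction of the script, because Python strings are indexed by code point; a byte-oriented implementation would have to decode first or compare byte sequences instead.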


Written by Joe Honton

Princeps geographus, Read Write Tools
