I have a Unicode string in Python, and I would like to remove all the accents (diacritics).
I found an elegant way to do this in Java on the web:
- convert the Unicode string to its long (decomposed) normalized form, in which letters and diacritics are separate characters
- remove all the characters whose Unicode type is "diacritic" (that is, the combining marks).
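
To make the two steps concrete, here is a rough Python sketch of what I have in mind. It is only a sketch under a few assumptions: that the standard `unicodedata` module is enough, that the "long normalized form" corresponds to NFD, and that "diacritic" corresponds to the nonspacing mark category `Mn`. The function name `strip_accents` is just illustrative.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining diacritical marks removed (sketch)."""
    # Step 1: canonical decomposition (NFD) splits an accented letter into
    # its base letter followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop every character in the "Mn" (Mark, nonspacing) category.
    # Assumption: "Mn" is what "diacritic" means here.
    return "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )


print(strip_accents("caf\u00e9"))
```

Letters that have no canonical decomposition, such as "ø" or "ł", would pass through this unchanged, so it would not cover every case a hand-written mapping might.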
Do I need to install a library such as pyICU, or is this possible with just the Python standard library? And what about Python 3?
Important note: I would like to avoid code with an explicit mapping from accented characters to their non-accented counterparts.