Removing digits, symbols, and letters that the languages do not use.
Behdad Esfahbod says Assamese is the same as Bengali, so this just uses bn.orth.