Posted On: Apr 15, 2021
Characters that comprise text are represented as numbers to computers. Languages such as English, French, Russian, and Hebrew use character sets of fewer than 256 characters that can be encoded in a single byte. Languages with many characters (e.g., Japanese) require more numbers, which are represented by multi-byte encoding. pg_bigm allows you to create a 2-gram (bigram) index for faster full text search of text encoded with multi-byte characters.