
      Christian Hergert: Translating French

      news.movim.eu / PlanetGnome • 9:50 • 1 minute

    I have been spending more time learning French lately, and as often happens, that turned into a small side project. liblingua is not intended to be a big mainstream translation platform. It is a fun GLib/GObject library for experimenting with local machine translation from applications.

    The library uses Bergamot from Mozilla for the translation backend. Instead of sending text to a web service, liblingua resolves the language pair you ask for, downloads the required language model into the local user cache, and then performs translation locally.

    The high-level API is built around a few small objects: LinguaRegistry discovers available translation profiles, LinguaProfile represents a model that can be loaded or downloaded, LinguaProgress reports download progress, and LinguaTranslator performs the translation.

    All potentially blocking work is exposed as a DexFuture, so it fits naturally into libdex-based applications. If you are already using fibers, the code can stay linear and easy to read with dex_await_object().

    A Small Example

    Here is the basic shape of translating French into English:

    #include <libdex.h>
    #include <liblingua.h>

    /* Must run on a fiber: dex_await_object() suspends the calling
     * fiber until the future completes. */
    static void
    translate_example (void)
    {
      g_autoptr(LinguaProgress) progress = lingua_progress_new ();
      g_autoptr(LinguaRegistry) registry = NULL;
      g_autoptr(LinguaProfile) profile = NULL;
      g_autoptr(LinguaTranslator) translator = NULL;
      g_autoptr(LinguaTranslation) result = NULL;
      g_autoptr(GListModel) profiles = NULL;
      g_autoptr(GError) error = NULL;

      /* Each step short-circuits the chain on failure. */
      if ((registry = dex_await_object (lingua_registry_new (), &error)) &&
          (profiles = lingua_registry_resolve (registry, "fr", "en")) &&
          (profile = g_list_model_get_item (profiles, 0)) &&
          (translator = dex_await_object (lingua_profile_load (profile, progress), &error)) &&
          (result = dex_await_object (lingua_translator_translate (translator, "Bonjour"), &error)))
        g_print ("%s\n", lingua_translation_get_translation (result));
      else
        /* The resolve and get_item steps take no GError, so error may still be NULL. */
        g_printerr ("Error: %s\n", error ? error->message : "no fr -> en profile found");
    }
    

    The first time a model is needed, loading the profile may download it. After that, the model is reused from the local cache. That makes liblingua useful for little tools, demos, and desktop experiments where local translation is preferable to wiring everything through a remote service.
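
    The download-once behavior described above can be pictured with a toy in-memory cache. This is only an illustrative sketch in plain C, not how liblingua is implemented: ModelCache, model_cache_contains(), and model_cache_ensure() are hypothetical names, the "fr-en" style key is an assumption, and the download is simulated by recording the language pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_MODELS 8
#define MAX_PAIR_LEN 16

/* Hypothetical stand-in for the on-disk model cache. The real cache
 * lives in the user cache directory and is not modeled here. */
typedef struct
{
  char     pairs[MAX_CACHED_MODELS][MAX_PAIR_LEN]; /* e.g. "fr-en" */
  size_t   n_pairs;                                /* entries in use */
  unsigned n_downloads;                            /* simulated downloads */
} ModelCache;

/* Returns true if a model for @pair is already cached. */
static bool
model_cache_contains (const ModelCache *cache, const char *pair)
{
  assert (cache != NULL && pair != NULL);

  for (size_t i = 0; i < cache->n_pairs; i++)
    if (strcmp (cache->pairs[i], pair) == 0)
      return true;

  return false;
}

/* Makes sure a model for @pair is available, "downloading" it only on
 * a cache miss. Returns false if the cache is full or @pair is too long. */
static bool
model_cache_ensure (ModelCache *cache, const char *pair)
{
  if (model_cache_contains (cache, pair))
    return true; /* cache hit: nothing to download */

  if (cache->n_pairs == MAX_CACHED_MODELS || strlen (pair) >= MAX_PAIR_LEN)
    return false;

  /* Cache miss: simulate the download by recording the pair. */
  strcpy (cache->pairs[cache->n_pairs++], pair);
  cache->n_downloads++;
  return true;
}
```

    Calling model_cache_ensure() twice with "fr-en" on a zero-initialized ModelCache leaves n_downloads at 1, while asking for a different pair such as "de-en" raises it to 2.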

    In the future this is probably the type of thing we would want as a desktop service to avoid duplicating caches amongst Flatpak applications. It would also be extremely useful to do live translation in Camera and Image Preview apps. I played a bit with that using Tesseract for OCR and it worked better than expected.