andrenatal

André is a multi-awarded software engineer with over 20 years of experience in developing, architecting, managing, and maintaining large software projects, having worked in various segments of the economy, such as the internet, legal/government, television networks, oil & gas, hardware, finance, and mobile. He has knowledge of various technologies and platforms, such as distributed computing techniques, high-performance development for smartphones, software architecture, databases, network and communication protocols, the web, and especially speech and conversational applications and intelligent agents built with artificial intelligence and machine learning techniques.

Firefox Translations: A Game-Changer for Secure, Multilingual Web Browsing

I am overjoyed to share with you a milestone in the journey of a project close to my heart. If you’ve been following my work, you know that I’ve been passionately developing an add-on for Mozilla Firefox, known as Firefox Translations. Today, I am proud to announce that Firefox Translations is no longer just an add-on—it’s now a built-in feature in Mozilla Firefox.

What does this mean for you as a user? Essentially, this translates (pun intended) to a safer and more seamless browsing experience. With the integration of Firefox Translations, you can now navigate the multilingual web effortlessly, without the need to worry about your data being sent to external servers for translation. Your data stays on your device, where it belongs.

From the inception of this project, my vision was clear. I wanted to offer a translation tool that respects user privacy and provides an unparalleled browsing experience. Today, we are one step closer to realizing this vision. As you surf the web, Firefox Translations will quietly work in the background, bridging language barriers and helping you connect with the content you wish to access, all while keeping your data secure on your device.

I want to express my deepest gratitude to everyone who has supported this project. This is a shared victory for all of us who value privacy and accessibility on the web. The reception and feedback from each one of you have been instrumental in refining Firefox Translations and helping it evolve.

Going forward, we’re not resting on our laurels. We continue to be dedicated to enhancing Firefox Translations, ensuring it stays at the forefront of privacy-friendly tools. We’re excited about the journey ahead and can’t wait for you to experience the next iterations of Firefox Translations.

Thank you for being part of this journey. Here’s to a more secure, accessible, and multilingual web!

Firefox Translations, how we did it.

Introduction

In September 2020, Mozilla put me in charge of our responsibilities in Project Bergamot, a partnership between Mozilla and The University of Edinburgh, Univerzita Karlova, University of Tartu, and The University of Sheffield. Being a globetrotter myself, and after multiple years leading our speech programs, I accepted the opportunity to build such an extremely useful feature. I understand how important it is, especially because of its privacy-preserving aspects.

    The project was funded through grants from the European Union’s Horizon 2020 programme. Its main goal was to build a set of neural machine translation tools, with the ultimate objective of enabling edge-based, in-page translation of web pages in Firefox, which was accomplished in June 2022.

    The Firefox Translations web extension is available for installation on Firefox Add-ons, its code is published on GitHub, and our final report is available for download from the Horizon website. This article gives a summarized, high-level description of how we did it, highlighting the features and offering insights into the technology stack we used and how it can be used for further development.

Technology stack

Mozilla had three major responsibilities in the grant: to wrap the machine translation framework in a high-level API and make it usable by third-party applications (which we refer to in this article as the engine), to build a reproducible model training pipeline, and to develop the translation application in the form of a web extension.

Engine

The open-source machine translation framework chosen for the job is called Marian NMT, and the very first task was to build a high-level API around it so it could be imported as a shared library by the extension (and other applications). That’s where bergamot-translator comes in: its task is to expose high-level functions and route those calls to Marian.

Bergamot-translator, like Marian, is written in C++. Given that, we needed to decide how to integrate it with the translation application in the browser:

  • We could merge (or link) bergamot-translator (and consequently Marian) entirely into Firefox’s codebase tree (also known as gecko-dev) and build it natively. That option was quickly dismissed after conversations with our security and platform teams: it was impractical to maintain such a large third-party codebase as part of Firefox builds. Also, Marian itself had limitations preventing it from running on ARM devices. That option was a no-go.
  • We could build a separate native application in C++, sideload it via the extension, and make them communicate via native messaging. That sounded more approachable and posed fewer security risks (being a separate process), but it would still be impractical to both ship and maintain a second application (remember that Firefox is built for multiple platforms). Also, the ARM incompatibility would still be present. The friction of asking the user to download yet another application could hurt adoption, since it wouldn’t be as friendly as simply installing a Web Extension from the Mozilla Add-ons website. So we ruled that out too.
  • The third option was to compile bergamot-translator entirely to WebAssembly, import it directly into the Web Extension’s JavaScript stack, and get cross-platform compatibility for free. In that scenario, the engine would run in the WebExtension process, reducing the security risks, and we wouldn’t need to make any extreme modifications to the Firefox codebase. The WebAssembly module could be shipped along with the extension. That would of course come with some performance penalties that we would need to mitigate, but it was more feasible, especially given the short period of time we had available.

We wanted to build something self-contained, plug-and-play, and seamless for users to install. Options one and two required either too much friction or too much fiddling with Firefox internals, so having the engine as a WebAssembly artifact bundled in the extension and shipped on addons.mozilla.org as a standalone add-on seemed the best scenario. With that, I personally made the decision to go with the third option. We still needed to perform a number of optimizations, especially to speed up the engine’s matrix multiplication operations, and other adaptations to make this port fully compatible and operational.

That task was performed by my teammate Abhishek Aggarwal along with our SpiderMonkey team and the consortium of universities. That job has an entire article of its own, and I really recommend reading it to fully understand how we did it.

Training pipelines

The second responsibility Mozilla had was to build a comprehensive and reproducible training pipeline so we could have independence to train our own models.

What we found when we started working on it was a collection of scattered, undocumented scripts that prevented anyone other than the Marian maintainers from using them seamlessly. So we decided to develop a workflow using Snakemake, integrated with the Slurm workload manager, to make training jobs easier to scale and deploy on training clusters and to reduce the manual maintenance and overhead of running experiments.

Marian uses transformer models that are trained using traditional teacher-student distillation, and the training steps are similar to those of other popular deep-learning-based NMT architectures. It starts with a data acquisition step, which in our case consisted of open-source datasets. The data then goes through a series of cleaning techniques and is augmented using a common strategy called back-translation. With that dataset, a very large transformer model (the teacher) is trained and fine-tuned. The teacher is then used in the compression step to create a small student model through knowledge distillation in conjunction with quantization. The compressed student models can be up to 47 times smaller and 37 times faster than the teacher models, without significant quality penalties.
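To make the ordering of these stages concrete, here is a minimal sketch that models them as a dependency graph and resolves a run order with Python’s standard `graphlib`. The stage names are labels made up for this example from the description above; they are not the actual rule names in our pipeline, and the real workflow is written in Snakemake rather than plain Python.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names. Each stage maps to the set of stages it depends on.
PIPELINE = {
    "acquire_data": set(),
    "clean_data": {"acquire_data"},
    "back_translate": {"clean_data"},
    "train_teacher": {"clean_data", "back_translate"},
    "fine_tune_teacher": {"train_teacher"},
    "distill_student": {"fine_tune_teacher"},
    "quantize_student": {"distill_student"},
}


def execution_order(graph):
    """Return one valid order in which the stages can run."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINE)
print(" -> ".join(order))
```

Describing the stages as a graph instead of a fixed script is what lets a workflow engine schedule independent jobs on a cluster and rerun only the stages whose inputs changed.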

The entire code for the training pipeline is open source and available on our GitHub, along with the trained models, and a complete description of the process was published by my colleague Evgeny Pavlov in a dedicated Hacks post.

Extension 

With that settled, it was time to develop the extension. As mentioned above, we wanted something that was seamless to install, without the need to install and sideload any native applications. We also decided not to use any external modules or dependencies: since this was going to be a privileged system extension, we needed full control of the entire codebase, for both performance and security reasons, and we also didn’t want to end up in any sort of dependency hell.

  • Architecture and Design Pattern

We adopted a pattern that mixes the Model-View-Controller (MVC) pattern with a mediator, which is responsible for orchestrating the communication across the components. In the Web Extension world, a background script has access to the entire set of WebExtensions APIs and, in this specific case, to the experiments API, which is a specific set of Firefox APIs exposed only to privileged extensions. So we needed a mediator to dispatch and handle messages between the background script and the content scripts, which are injected into the page and are responsible for rendering views, loading the translation workers, manipulating the DOM, and rendering the translations, i.e. all the visual functionality.
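The mediator idea can be sketched in a few lines. This is a simplified illustration in Python, not the extension’s code (which is JavaScript); the `Mediator` class, the topic names, and the handlers are all hypothetical. The point it shows is that components only know the mediator and the message names, never each other.

```python
class Mediator:
    """Routes named messages between components that never reference each other."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, topic, handler):
        # Register a callable to be invoked for every message on this topic.
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        # Deliver the payload to each handler and collect their return values.
        return [handler(payload) for handler in self._handlers.get(topic, [])]


mediator = Mediator()
rendered = []

# Stand-in for the background script, the only side with privileged APIs.
mediator.subscribe("download_model", lambda pair: f"model for {pair} ready")
# Stand-in for a view living in the content script.
mediator.subscribe("render_translation", rendered.append)

status = mediator.publish("download_model", "pt-en")
mediator.publish("render_translation", "Hello")
print(status, rendered)
```

Because every interaction goes through one dispatch point, a new component can be added by subscribing to existing topics without touching the others.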

  • Engine integration

The WebAssembly port of Marian is shipped along with the extension, whose total size is only 3.1 megabytes. The engine is then loaded inside a Web Worker, which is injected into the page as a content script in order to perform the translations. An algorithm responsible for parsing the page traverses the DOM and enqueues HTML tags to the translation worker, which processes and translates them and submits them back to the parser, which then replaces them in place in the DOM.

  • Language Detection

We use the FastText engine to perform automatic language detection. Language detection is used to determine whether the translation interface should be summoned automatically to suggest a translation, and also to auto-detect the input language in the free-form translation popup. FastText has a WebAssembly port that’s shipped, along with its model, within the extension.
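The decision that follows language detection can be sketched as a small function. This is an illustration only: the function name, the confidence value, and the `0.5` threshold are assumptions made for this example, not the extension’s actual logic or FastText’s API.

```python
def should_offer_translation(detected, confidence, user_lang, supported, threshold=0.5):
    """Decide whether the translation interface should be suggested.

    `confidence` is assumed to be a score between 0 and 1 returned by the
    language detector; `threshold` is a hypothetical cut-off.
    """
    if confidence < threshold:
        return False  # detection too uncertain to act on
    if detected == user_lang:
        return False  # the page is already in the user's language
    return detected in supported  # only offer languages we have models for


SUPPORTED = {"pt", "es", "de"}
print(should_offer_translation("pt", 0.97, "en", SUPPORTED))
```

When the function returns false, nothing is shown automatically and the user can still summon the interface manually.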

  • Models Download

The models consist of NumPy arrays wrapped in a custom binary format in order to optimize loading, and each pair of language models has a compressed file size of around 40 megabytes. Naturally, we needed to download them on demand, which happens whenever a specific language pair is selected by the user, and they are then cached using Web Extension’s localstorage API. English is used as a pivot, so if, for example, a translation from Portuguese to Spanish is selected, a translation from Portuguese to English and then from English to Spanish is performed.
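Pivoting and on-demand loading can be sketched together. In this minimal example, `translation_steps` works out which model pairs a request needs, and `ModelCache` fetches each pair only once. Both names, the in-memory dictionary, and the fake download function are assumptions for illustration; the extension’s real storage and download code is different.

```python
PIVOT = "en"


def translation_steps(source, target, pivot=PIVOT):
    """Return the (from, to) model pairs needed to translate source into target."""
    if source == target:
        return []
    if pivot in (source, target):
        return [(source, target)]  # a direct model exists to or from the pivot
    return [(source, pivot), (pivot, target)]  # go through the pivot language


class ModelCache:
    """Fetches a model the first time a pair is requested, then reuses it."""

    def __init__(self, download):
        self._download = download  # callable that fetches the model for a pair
        self._models = {}

    def get(self, pair):
        if pair not in self._models:
            self._models[pair] = self._download(pair)
        return self._models[pair]


downloads = []


def fake_download(pair):
    # Stand-in for the network request; records what was fetched.
    downloads.append(pair)
    return f"model:{pair[0]}-{pair[1]}"


cache = ModelCache(fake_download)
for _ in range(2):  # the second pass is served entirely from the cache
    models = [cache.get(step) for step in translation_steps("pt", "es")]
print(models, downloads)
```

Using a pivot keeps the number of models linear in the number of languages, at the cost of two translation passes for pairs that do not involve English.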

  • DOM manipulation

As mentioned above, the DOM manipulation is performed entirely in the page’s content script by an algorithm that uses both the MutationObserver and TreeWalker native Web APIs to traverse the DOM and then enqueue the HTML tags, using batching techniques, into the translation worker, which then calls the engine’s high-level APIs.

The trick here is that Marian itself can’t translate well when HTML tags are involved. Bergamot-translator contains code to re-align the content of these HTML tags so that their style and attributes are preserved when they are re-rendered after being replaced in the DOM.

The visible viewport is also prioritized so that the user perceives the translations sooner, while hidden parts of the page are translated afterwards.
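The combination of batching and viewport prioritization can be sketched with a priority queue. This is a simplified model, not the extension’s algorithm: `TranslationQueue`, the two priority levels, and the batch size are assumptions made for the example, and the items are plain strings rather than DOM nodes.

```python
import heapq
from itertools import count


class TranslationQueue:
    """Priority queue: visible content first, then document order."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves insertion order

    def enqueue(self, text, in_viewport):
        priority = 0 if in_viewport else 1
        heapq.heappush(self._heap, (priority, next(self._order), text))

    def next_batch(self, max_items):
        # Pop up to max_items entries, highest priority first.
        batch = []
        while self._heap and len(batch) < max_items:
            batch.append(heapq.heappop(self._heap)[2])
        return batch


queue = TranslationQueue()
queue.enqueue("footer", in_viewport=False)
queue.enqueue("headline", in_viewport=True)
queue.enqueue("sidebar", in_viewport=False)
queue.enqueue("first paragraph", in_viewport=True)

first = queue.next_batch(2)
second = queue.next_batch(2)
print(first, second)
```

Sending content in batches keeps the number of round trips to the worker low, while the priority ensures the first batches are the ones the user is actually looking at.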

  • Localization

The extension was entirely localized by our localization community through Mozilla Pontoon, which is a tool used to facilitate the localization of Mozilla products by our global community. 

The extension had a total of almost 900 strings translated to 21 languages in record time thanks to our volunteers who did an amazing job in order to have the product properly localized to the languages we support.

  • User Interface

We had an original plan of utilizing a user interface presented in the form of a pageAction’s popup, but Firefox did not support programmatically opening a popup at the time, and we wanted to be able to suggest to the user when the page being visited could be translated.

Firefox already contained a translation interface designed and integrated into Gecko from previous attempts at developing such functionality. It supported not only auto-opening but also seamless integration with the browser, so we naturally decided to reuse it and extend it with our new set of functionality.

Features

Because our product performs translations locally, we were able to innovate and build unique features thanks to its privacy-preserving characteristics and the fact that we didn’t need to worry about incurring expensive cloud costs to host an online service. We were required to develop a set of features demanded by the grant, but we went ahead and added extra ones, which we’ll cover in this section.

  • Page translation

Translation of pages is offered automatically to the user when it is determined that the page’s language is supported by the extension. For the cases when the interface is not auto-displayed, the user can summon it by clicking the translations icon in the navigation bar. The source and target language can be selected and the translation then starts after the models are properly downloaded or loaded from the cache.

  • Dynamic pages

The extension observes pages for DOM changes and translates newly added nodes, so content that’s dynamically added to a page is always translated automatically.

  • Auto-translation while browsing

You also have the option of having pages translated automatically. That way you can browse websites in languages you can’t read and have them automatically translated to your language as you browse. This offers the unique experience of seamlessly transforming a website you can’t understand into one you can as you navigate through it.

  • Translation of forms

The extension offers translation of input forms, which allows you to fill in forms in the page’s language while writing in your own language. The feature can also translate the translated content back to your language, allowing you to check that it’s understandable and that no mistakes are being entered.
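The round trip behind form translation can be sketched as follows. The `translate` callable stands in for the engine, and the toy dictionary-based translator exists only so the example runs; neither reflects the extension’s real API.

```python
def translate_form_input(text, user_lang, page_lang, translate):
    """Translate typed text into the page language, then back for review."""
    outbound = translate(text, user_lang, page_lang)  # what goes into the form
    round_trip = translate(outbound, page_lang, user_lang)  # what the user reviews
    return outbound, round_trip


# Toy translator: a lookup table standing in for the real engine.
TOY_TABLE = {
    ("en", "pt", "good morning"): "bom dia",
    ("pt", "en", "bom dia"): "good morning",
}


def toy_translate(text, source, target):
    return TOY_TABLE[(source, target, text)]


outbound, round_trip = translate_form_input("good morning", "en", "pt", toy_translate)
print(outbound, "|", round_trip)
```

If the round-trip text no longer says what you meant, that is a hint to rephrase the input before submitting the form.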

  • Error highlighting

The extension can highlight translated sentences that the engine returns with a low confidence score. That gives you a better indication that something may be wrong with a translated sentence.
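A minimal sketch of the highlighting step, assuming the engine returns each translated sentence together with a score between 0 and 1. The function name, the `0.4` threshold, and the use of a `<mark>` element are assumptions made for this example, not the extension’s actual rendering code.

```python
from html import escape


def highlight_low_confidence(sentences, threshold=0.4):
    """Wrap sentences scoring below the threshold in a <mark> element.

    `sentences` is a list of (text, score) pairs; the score range and the
    threshold are hypothetical.
    """
    parts = []
    for text, score in sentences:
        safe = escape(text)  # never inject translated text as raw HTML
        parts.append(f"<mark>{safe}</mark>" if score < threshold else safe)
    return " ".join(parts)


translated = [("The cat sat.", 0.92), ("Colorless ideas sleep.", 0.21)]
html_output = highlight_low_confidence(translated)
print(html_output)
```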

  • Free form translation

The extension also offers free-form translations that are easily accessible in the browser. A popup containing a complete translation tool, also entirely local, is displayed by clicking the Firefox Translations browser action. It allows you to switch languages and also detects the language of the input text.

  • Translation of selected content

Sometimes you don’t want the entire page translated, just a portion of it. In that case, you can select a specific part of the page and use the context menu to have it automatically translated in the translation popup.

  • Android

Firefox Translations also has experimental support on the Nightly and Beta versions of Firefox for Android. By following these instructions you can have it installed on your phone. It currently supports page translations and free-form translations.

Other applications

The same set of tools used by the extension allowed us to experiment with different applications:

  • Translations Website

We built a translation website using the same local-based concepts: you choose the source and target languages, the engine and models are loaded in the page and you just start typing without requiring installing any extension. 

The entire site is static and hosted on GitHub Pages. Try switching off your internet connection after selecting the languages and loading the models, and then translate some content to see the magic.

  • Translations service

We encapsulated the translation engine in a Docker container hosting an HTTP service that exposes a REST API. If you need to have translations seamlessly integrated into a back-end solution at scale, you can start with this project. We import the native bergamot-translator binaries, so it runs at full performance.

  • Bergamot Translator

If you are interested in the low-level translation engine that powers all the products above, you can just use bergamot-translator. You can link it directly using a compatible C/C++ compiler, import the WebAssembly module, or use its Python bindings.

Conclusion

When we started the grant project we had a very clear set of requirements in terms of features, but as is common in academic research, we weren’t required to ship a working product to our end users: just running controlled experiments and producing qualitative and quantitative reports would suffice for the scope of the research.

I found that frustrating and didn’t want to settle for less, so from the very first day we pushed and envisioned having it publicly available to all our users in the Release channel. On June 2nd, 2022, we released Firefox Translations to the general audience on Mozilla Add-ons. It’s been a hit since then, with constant traction and steady growth, millions of translations offered so far, and thousands of articles published about it.

We were deeply satisfied that, in the end, we delivered the research funded by the Horizon 2020 programme, added value for our users through a much-needed and long-overdue Firefox capability, and built a set of tools for the tech community to build upon.

That proves running high quality machine learning technologies on the edge is a feasible reality even on very constrained platforms, and that it’s possible to deliver products of quality to users without requiring them to give up their privacy.

Mozilla releases local machine translation tools as part of Project Bergamot

In January of 2019, Mozilla joined the University of Edinburgh, Charles University, University of Sheffield and University of Tartu as part of a project funded by the European Union called Project Bergamot. The ultimate goal of this consortium was to build a set of neural machine translation tools that would enable Mozilla to develop a website translation add-on that operates locally, i.e. the engines, language models and in-page translation algorithms would need to reside and be executed entirely in the user’s computer, so none of the data would be sent to the cloud, making it entirely private.

In addition to that, two novel features needed to be introduced. The first was translation of forms, to allow users to input text in their own language that is dynamically translated on-the-fly to the page’s language. The second feature was quality estimation of the translations where low confidence translations should be automatically highlighted on the page, in order to notify the user of potential errors.

This set of requirements posed a number of technological challenges to the team: the translation engine was entirely written in programming languages that compile to native code. We needed a way to streamline the distribution of the project in order to avoid the overhead involved in providing builds compatible with all platforms supported by Firefox, which would be impracticable to scale and maintain. Also, the engine needed to perform fast enough on CPUs and not rely on GPUs, as is traditionally required by deep learning solutions.

Our solution to that was to develop a high-level API around the machine translation engine, port it to WebAssembly, and optimize the matrix multiplication operations to run efficiently on CPUs. That not only enabled us to develop the translations add-on but also allows any web page to integrate local machine translation, as on this website, which lets the user perform free-form translations without using the cloud.

The translations add-on is now available in the Firefox Add-On store for installation on Firefox Nightly, Beta and in General Release. We are looking for users’ feedback and in the add-on, you’ll see a button to fill out a survey that will help Project Bergamot collaborators understand which direction we should take the product. 

To empower the community to contribute new languages, we also developed a comprehensive training pipeline that allows enthusiasts to easily train new models, helping expand the add-on’s reach.

This work aligns with Mozilla’s commitment to keeping the web accessible to everyone regardless of their language while also building open-source projects of value to our community with a focus on privacy. Please join us and send suggestions — we need all of your voices to make this add-on truly accessible for all. 

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825303.  🇪🇺