
An Introduction to AI Translation

As AI translation evolves and makes our lives easier, many questions arise about its impact and future potential. This article gives the essentials you need to know.
Created on February 9|Last edited on August 14
Human languages are complex structures full of intricacies, rules, nuance, and exceptions. For a long time, linguists believed that no other living creature could analyze human language and sentence structures. The promise of natural language processing — and specifically AI translation — has challenged that belief, and many are curious about how it works.
In this article, we give a brief introduction to AI translation.

Let's get going!

What Is Translation In Artificial Intelligence?

Machine translation has evolved from rule-based machine translation (RBMT) through statistical machine translation (SMT) to today's neural machine translation (NMT). NMT uses an Artificial Neural Network (ANN), a model loosely inspired by the way brain cells connect, which lets it receive feedback, adapt, and improve its accuracy over time.
A prime example is SYSTRAN’s AI translator, which applies deep learning to train an ANN for translation. It is based on a structure called sequence-to-sequence: one recurrent neural network (RNN) encodes the source-language (L1) sentence, and a second RNN decodes that encoding to generate the target-language (L2) sentence.
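To make the sequence-to-sequence idea concrete, here is a minimal sketch of an encoder RNN and a decoder RNN in plain Python. It is not SYSTRAN's implementation: the weights are random and untrained, the hidden size of 4 is arbitrary, and token IDs are fed in as scaled numbers rather than learned embeddings, all of which are simplifying assumptions. It only illustrates the data flow, in which a sentence of any length is folded into one fixed-size state that the second network then unrolls.

```python
import math
import random

def rnn_step(x, h, w_in, w_rec):
    """One simple RNN step: new_h[i] = tanh(w_in[i] * x + sum_j w_rec[i][j] * h[j])."""
    return [
        math.tanh(w_in[i] * x + sum(w_rec[i][j] * h[j] for j in range(len(h))))
        for i in range(len(h))
    ]

def encode(token_ids, w_in, w_rec):
    """Fold a variable-length sequence into one fixed-size hidden state."""
    h = [0.0] * len(w_in)
    for tok in token_ids:
        # Crude scaling of the ID (assumption); real systems use learned embeddings.
        h = rnn_step(tok / 10.0, h, w_in, w_rec)
    return h

def decode(h, w_in, w_rec, w_out, steps):
    """Unroll a second RNN from the encoder state, emitting one score per step."""
    outputs, prev = [], 0.0
    for _ in range(steps):
        h = rnn_step(prev, h, w_in, w_rec)
        prev = sum(w * v for w, v in zip(w_out, h))  # feed the output back in
        outputs.append(prev)
    return outputs

HIDDEN = 4  # arbitrary hidden size for the sketch
rng = random.Random(0)

def random_vector(n):
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]

def random_matrix(rows, cols):
    return [random_vector(cols) for _ in range(rows)]

# Separate, untrained weights for the encoder and the decoder.
enc_w_in, enc_w_rec = random_vector(HIDDEN), random_matrix(HIDDEN, HIDDEN)
dec_w_in, dec_w_rec = random_vector(HIDDEN), random_matrix(HIDDEN, HIDDEN)
dec_w_out = random_vector(HIDDEN)

context = encode([3, 7, 1, 5], enc_w_in, enc_w_rec)
outputs = decode(context, dec_w_in, dec_w_rec, dec_w_out, steps=3)
print(len(context), len(outputs))
```

Because the weights are untrained, the output scores carry no meaning. Training is what would turn them into useful predictions of L2 tokens.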

Can AI Learn a Language?

AI can learn languages, in the sense that an NMT system learns to translate from examples. Training teaches the ANN how to translate, loosely like the way we learn with our brains. The neurons within the ANN are connected via weights called parameters, and as the network receives more data, it automatically corrects those parameters to reduce its errors, improving the final output.
To control the learning process, you set hyperparameters beforehand. One example of a hyperparameter is the learning rate, which determines how large a correction is applied to a parameter at each training step.
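The effect of the learning rate is easiest to see on a toy problem. The sketch below is not an NMT system: it fits a single parameter `w` to made-up data with plain gradient descent, and the data, step count, and learning-rate values are all illustrative assumptions.

```python
def train(learning_rate, steps=20):
    """Fit one parameter w so that w * x approximates y, using gradient descent."""
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data following y = 2x
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        # The learning rate scales how far the parameter moves on each correction.
        w -= learning_rate * grad
    return w

slow = train(learning_rate=0.01)     # small steps: still short of 2 after 20 updates
fast = train(learning_rate=0.1)      # larger steps: converges to roughly 2
diverged = train(learning_rate=0.3)  # steps too large: overshoots and blows up

print(slow, fast, diverged)
```

The same trade-off applies to the far larger networks used for translation: a rate that is too low makes training slow, while one that is too high prevents it from settling at all.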

How Does AI Translation Work?

AI translation works by using an ANN to turn a sequence of words into a sequence of numbers. First, the encoder RNN encodes the L1 sentence into numbers. Then, the neural translation model produces a sequence of output numbers, which the decoder RNN decodes into the L2 sentence.
Example of a simple English-to-Spanish translation process:
“I have a cat” is encoded into 251, 2134, 953, 4.
The output is 4356, 7654, 76, 5325.
The decoder RNN turns the output into “tengo un gato”.
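The three steps in the example above can be sketched in a few lines of Python. The numbers are the ones quoted in the example, but the word-to-ID assignments are assumptions, as is treating the fourth output number (5325) as an end-of-sentence marker. The "model" is a fixed lookup table standing in for a trained neural network.

```python
# Hypothetical vocabularies built from the IDs quoted in the example.
SOURCE_VOCAB = {"i": 251, "have": 2134, "a": 953, "cat": 4}
TARGET_VOCAB = {4356: "tengo", 7654: "un", 76: "gato", 5325: "<eos>"}

# Stand-in for the trained network: a fixed mapping from input IDs to output IDs.
TOY_MODEL = {(251, 2134, 953, 4): [4356, 7654, 76, 5325]}

def encode(sentence):
    """Turn an L1 sentence into a sequence of numbers."""
    return [SOURCE_VOCAB[word] for word in sentence.lower().split()]

def translate_ids(ids):
    """Produce the output numbers for a sequence of input numbers."""
    return TOY_MODEL[tuple(ids)]

def decode(ids):
    """Turn output numbers back into an L2 sentence, stopping at <eos>."""
    words = []
    for token_id in ids:
        token = TARGET_VOCAB[token_id]
        if token == "<eos>":
            break
        words.append(token)
    return " ".join(words)

input_ids = encode("I have a cat")
output_ids = translate_ids(input_ids)
print(input_ids, output_ids, decode(output_ids))
```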

Is NLP Used For Translation?

NLP is crucial to AI translation because it turns text into a numerical form that a machine can process. It does so in two steps: first, it preprocesses the data to extract relevant information from the text, and then it runs data analysis and classification using different algorithms.
In the first step, NLP uses techniques like stemming/lemmatization (finding a word’s base form), tokenization (turning characters, words, and sub-words into tokens), and Part of Speech (POS) tagging to identify verbs, adjectives, adverbs, etc. in the text.
In the second step, NLP uses different algorithms to analyze the preprocessed data, with rule-based systems and Machine Learning systems being two of the more popular approaches.
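As a rough illustration of the preprocessing step, the sketch below implements toy versions of tokenization, stemming, and POS tagging using only the Python standard library. The suffix list and the tiny tag lexicon are invented for this example. Production systems rely on far more sophisticated, usually trained, components.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

SUFFIXES = ("ing", "ed", "s")  # illustrative suffix list, not a real stemmer's rules

def stem(token):
    """Strip one common suffix, keeping at least three characters of the word."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# Hypothetical hand-written lexicon; unknown words are tagged "UNK".
POS_LEXICON = {
    "the": "DET", "cats": "NOUN", "quickly": "ADV",
    "chased": "VERB", "small": "ADJ", "mice": "NOUN",
}

def pos_tag(tokens):
    """Look up a part-of-speech tag for each token."""
    return [(token, POS_LEXICON.get(token, "UNK")) for token in tokens]

tokens = tokenize("The cats quickly chased small mice.")
stems = [stem(token) for token in tokens]
print(tokens)
print(stems)
print(pos_tag(tokens))
```

Note that this crude stemmer turns "chased" into "chas": stemming can produce strings that are not real words, which is one reason lemmatization (dictionary-based reduction to a base form) is often used instead.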

Is AI Translation Accurate?

Thanks to significant improvements, AI translation is faster than any human translator and can translate one source language (L1) into several target languages (L2s) at once. It is efficient for repetitive tasks, but mainly those that don’t require a high level of accuracy, because its quality still falls well short of a human translator's.
AI translators are effective at cross-format translation as well. A prime example is Google Translate, which is one of the best image-to-text translation AI tools. At the current pace of advancement, the quality is expected to improve dramatically.

Will AI Replace Translators?

Despite producing natural-sounding text, AI translation struggles with accuracy because engines can select words that are incorrect or not pertinent to the context, producing errors. The best solution is to use NMT to enhance human translation rather than to have AI translate instead of a human.
AI translation saves human translators time by handling repetitive, simple tasks, and human translators in turn improve the accuracy of the NMT output by post-editing the text. This is the workflow that computer-assisted translation (CAT) tools support.
Also, there are 3,000+ unwritten languages in the world, languages without a common written form, which currently cannot be translated without the presence of a human agent. Meta’s No Language Left Behind (NLLB) currently covers 200 languages. It will eventually be extended to cover those unwritten languages.

What Is The Best AI Translator?

One of the most advanced AI translators is OpenAI’s GPT-3 model, which rivals the best machine translation software. It’s a language prediction model that uses NLP and natural language generation to predict the most likely output, based on the vast amount of internet text it was trained on.
Another example is DeepL Translator, which uses ANNs with millions of targeted text data points (mostly major corporations) and billions of parameters to offer higher quality.
MemoQ is a well-known translation software provider whose Pro version incorporates translation memory and term base capabilities to offer quality output.
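To give a feel for what a translation memory does, here is a minimal sketch of fuzzy matching against previously translated segments. It is not how memoQ or any other product implements the feature: the stored segments, the `best_match` helper, and the 0.7 similarity threshold are all assumptions, and similarity is measured with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical memory of previously translated segments (English -> Spanish).
TRANSLATION_MEMORY = {
    "What rooms do you have available?": "¿Qué habitaciones tienes disponibles?",
    "I have a cat": "Tengo un gato",
}

def best_match(segment, memory, threshold=0.7):
    """Return (source, translation, score) for the closest stored segment, or None."""
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_source is None or best_score < threshold:
        return None
    return best_source, memory[best_source], best_score

# An exact hit can be reused as is; a fuzzy hit is a starting point for post-editing.
print(best_match("I have a cat", TRANSLATION_MEMORY))
print(best_match("I have a dog", TRANSLATION_MEMORY))
```

A fuzzy match such as the second one is only a suggestion: the translator still has to change "gato" before the segment is correct.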

OpenAI's GPT-3 Translation: An Example

Built with generative pre-training, GPT-3 is a model with 175 billion parameters that was trained on data sets drawn from sources such as Common Crawl and Wikipedia. To translate with this tool, you provide the prompt (the task description plus the text) and let GPT-3 generate the translation. Here’s an example:
Prompt
Translate this into 1. French, 2. Spanish and 3. Japanese: “What rooms do you have available?”
Sample Response
1. Quels sont les chambres que vous avez disponibles?
2. Qué habitaciones tienes disponibles?
3. どの部屋が利用可能ですか?

Because the prompt contains no example translations, this is called zero-shot translation. To improve the quality further, you can provide one example (one-shot) or several examples (few-shot) within the prompt. The zero-shot request above can be sent through the API as follows:
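```python
import os
import openai

# Uses the legacy Completions interface of the openai Python package (versions before 1.0).
openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
    model="text-davinci-003",
    # The trailing "1." cues the model to start the numbered list of translations.
    prompt="Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhat rooms do you have available?\n\n1.",
    temperature=0.3,  # a low temperature keeps the output close to the most likely wording
    max_tokens=100,   # upper bound on the length of the generated text
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
)

# Print the generated translations.
print(response["choices"][0]["text"])
```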
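The one-shot and few-shot variants mentioned above differ from the zero-shot request only in the prompt text. The helper below is one way to assemble such prompts. The `build_prompt` function and its "English: ... / Spanish: ..." layout are illustrative assumptions rather than a format prescribed by OpenAI, and the example pairs are made up.

```python
def build_prompt(text, target_language, examples=()):
    """Build a translation prompt; `examples` holds (source, translation) pairs."""
    lines = [f"Translate English into {target_language}.", ""]
    for source, translation in examples:
        # Each worked example shows the model the pattern it should continue.
        lines += [f"English: {source}", f"{target_language}: {translation}", ""]
    # The prompt ends with an open slot for the model to fill in.
    lines += [f"English: {text}", f"{target_language}:"]
    return "\n".join(lines)

question = "What rooms do you have available?"

zero_shot = build_prompt(question, "Spanish")
few_shot = build_prompt(
    question,
    "Spanish",
    examples=[("I have a cat", "Tengo un gato"), ("Good morning", "Buenos días")],
)

print(few_shot)
```

The resulting string can be passed as the `prompt` argument of the request shown earlier.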

Conclusion

While AI translation is still far from matching the quality of human translators, it is no longer unreasonable to imagine AI translating legal documents, papers, and even speeches with excellent quality.