Samsung Electronics showcases award-winning machine translation at WMT – Samsung Global Newsroom
At the Workshop on Machine Translation (WMT), one of the largest machine translation research events, Samsung Electronics joined researchers from around the world to discuss new and innovative ways to understand human language using machines and computer programs.
Samsung Research and Samsung R&D Institute Poland (SRPOL) participated in a competition in which scientific groups and laboratories compare the quality of their translation tools. Teams from around the world, ranging from widely known companies to university research groups, took part in the eight machine translation task competitions.
The Language Lab at Samsung Research Global AI Center participated in the biomedical translation task, which aims to evaluate sentence translation systems in the biomedical field. The task covered a total of 14 language pairs involving languages such as English, French, German and Spanish. The team won first prize for effective translation of two language pairs: English → Spanish and Spanish → English. This was a particularly impressive feat because biomedical text makes frequent use of field-specific terminology.
In domain-specific translation, one of the main factors that determines translation quality is how terminology is translated. The same word can be translated differently depending on the field, and technical terms are used less frequently than general terms, which makes them difficult for a model to learn. Given these limitations, the Language Lab at Samsung Research Global AI Center improved domain-specific translation performance by incorporating soft-constrained terminology translation, which provides target-language terminology constraints as a hint alongside the source sentences, so that the domain’s terminology is reflected as much as possible in the translation results. Currently, Samsung Research is conducting domain-specific translation research, including providing a patent translation service (Korean-English) on Samsung Research’s “SR Translate” translation service (https://translate.samsung.com).
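As a rough illustration of the idea, the sketch below appends target-language terms to a source sentence as hints before it would be passed to a translation model. It is a minimal sketch and not Samsung Research's implementation: the function name `add_terminology_hints`, the `<sep>` separator token, and the small English-Spanish glossary are all assumptions made for this example.

```python
import re


def add_terminology_hints(source, glossary, sep="<sep>"):
    """Append target-language terms for every glossary entry found in `source`.

    `glossary` maps source-language terms to their preferred target-language
    translations. The separator token is a hypothetical choice; a real system
    would use whatever special token its model was trained with.
    """
    hints = []
    for src_term, tgt_term in glossary.items():
        # Whole-word, case-insensitive match of the source term.
        pattern = r"\b" + re.escape(src_term) + r"\b"
        if re.search(pattern, source, flags=re.IGNORECASE):
            hints.append(tgt_term)
    if not hints:
        return source
    return source + "".join(f" {sep} {hint}" for hint in hints)


# Hypothetical biomedical glossary (English -> Spanish).
glossary = {
    "heart failure": "insuficiencia cardíaca",
    "stroke": "accidente cerebrovascular",
}

augmented = add_terminology_hints(
    "Patients with heart failure were enrolled.", glossary
)
print(augmented)
```

Because the terms are supplied as hints rather than enforced during decoding, the model remains free to inflect or reorder them, which is what makes the constraint "soft".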
SRPOL also participated in two general machine translation tasks, achieving high rankings by placing second for English → Russian and English → Croatian.
During the competition, WMT provides teams with only a limited number of corpora (collections of structured texts) to use for their translation models. The SRPOL team therefore attributed its success to improving the quality of the corpora through processes such as data preprocessing and filtering. Additionally, the team focused on optimizing its model architecture and AI training process.
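The article does not describe which preprocessing and filtering steps the team used. As a hedged sketch of what such corpus cleaning commonly looks like, the example below normalizes whitespace and drops sentence pairs that are empty, overly long, badly mismatched in length, or duplicated. The function name `filter_parallel_corpus` and the thresholds `max_len` and `max_ratio` are assumptions for illustration only.

```python
def filter_parallel_corpus(pairs, max_len=200, max_ratio=3.0):
    """Return cleaned (source, target) pairs from a parallel corpus.

    The thresholds are illustrative assumptions, not values from the article.
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        # Preprocessing: collapse runs of whitespace and trim the ends.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        # Drop pairs where either side is empty.
        if not src or not tgt:
            continue
        n_src, n_tgt = len(src.split()), len(tgt.split())
        # Drop pairs where either side is too long.
        if n_src > max_len or n_tgt > max_len:
            continue
        # Drop pairs whose lengths differ too much (likely misaligned).
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue
        # Drop exact duplicates (after normalization).
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept


raw = [
    ("Hello  world .", "Привет мир ."),
    ("Hello world .", "Привет мир ."),  # duplicate once whitespace is normalized
    ("", "Пусто"),  # empty source side
    ("Yes", "Да , конечно , я полностью согласен с вами"),  # 1 token vs 9 tokens
]
clean = filter_parallel_corpus(raw)
print(clean)
```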
Using the improved corpus, the machine translation team at SRPOL built a classifier based on BERT (Bidirectional Encoder Representations from Transformers), a pretrained language model. This classifier sorted millions of sentences from the corpus into different domains. As a result, SRPOL was able to create models not only for general translation but also for medical and legal translation.
SRPOL has a track record of strong results in machine translation, having won challenges at the International Workshop on Spoken Language Translation (IWSLT), one of the world’s oldest workshops on machine translation, for four consecutive years from 2017 to 2020.
Today more than ever, the goal of human-level language understanding seems to be within reach. As machine translation and language understanding gradually become part of our daily lives, Samsung will remain at the forefront of this technology, building tools that overcome language barriers and improve daily life.