Press Release

Twigfarm awarded KRW 8 billion ‘Data Construction Project for Machine Learning’ to build a 10-million-sentence corpus


Twigfarm (CEO Sunho Baek) has been awarded the ‘Data Construction Project for Machine Learning’, a project led by the Ministry of Science and ICT and the National Information Society Agency (NIA). Twigfarm will undertake two sub-categories: the ‘Korean-English Colloquial and Technical Science Translation Corpus’ and the ‘Broadcasting and Specialized Multilingual Translation Corpus’.

Established in 2016, Twigfarm researches a neural network-based language processing engine that includes data refining, data inspection, and translation technologies, all of which are critical to the project. The same technology is used in Twigfarm’s own products: ‘Gcon Studio’, a translation platform, and ‘heybunny’, an e-mail-based newsletter platform.

This project is centered on Twigfarm and Activo. Twigfarm focuses on R&D of a neural network-based translator and operates Gcon Studio, a translation platform. Activo focuses on professional business management. Professional translation companies such as Lexcode, Furmo DT, and Eqqui Korea are also involved. Professors and students of Hankuk University of Foreign Studies and Chung-Ang University, both of which have excellent translation and interpretation training programs, will translate the data. Throughout the project, the data will be thoroughly inspected by the Korean Standards Association.

Sunho Baek, CEO of Twigfarm, said, “Data preprocessing has been largely automated, but much of it is still manual work. Together with F&J, we plan to provide education and jobs to socially disadvantaged people and women with career breaks.” F&J is a social enterprise.

Twigfarm is actively engaged in the translation corpus building project. Because Twigfarm develops machine translation, a high-quality translation corpus is essential to advancing its neural network-based translator. The company intends to select and refine text data useful for training, construct data composed of original text paired with translated text, and build labeled data that can be used for machine learning.

CEO Sunho Baek said, “Due to the lack of data, it is difficult to research and develop neural network-based machine translators. Our goal is to improve the research environment for Korean machine translation.”
