Data for Machine Learning Gets a New Lease on Language

A recent press release reports, “Venga has released its third solution in its rapidly growing suite of products for Natural Language Processing (NLP) Data Collection. The new addition to the family is InVimage – a cloud-based solution for annotating text within images. With each annotation, we automatically capture the X and Y coordinates, OCR (Optical Character Recognition) the annotation and have the option to machine or manually translate the captured text. Through our Human-in-the-Loop step, both the OCR and translation text can be reviewed and edited. This ensures Venga’s clients receive the cleanest datasets possible for their training models. InVimage was built with scalability in mind and can handle hundreds of thousands of images daily. At the beginning of 2019, Venga released a completely redesigned version of our first solution InVtext, a solution that eliminates many of the quality issues that have plagued data set text translation. This was followed shortly by InVvoice that summer which simplified the management and translation of voice data.”

The release continues, “Venga started working on data collection projects back in 2017. Some of the larger data collection buyers were not getting the improvements expected in their models from other providers and wanted to test Venga in this field. Venga is the first to admit it wasn’t smooth sailing and suffered from delivery issues early on but learned very quickly and overcame many of the challenges that were causing models to stagnate in their development… The three data tools InVtext, InVvoice, and InVimage have been designed based on specific customer needs but are flexible enough to adapt to project-specific requirements.”
