
Daktilograf: Deep learning speech-to-text solution for Slavic languages

Nov 09, 2022
Om3ga Daktilograf prototype

The UNICEF Venture Fund is featuring members of our AI (Artificial Intelligence) & Data Science Cohort. In this interview, learn more about OM3GA, a Serbia-registered startup developing a deep learning, virtual speech-to-text solution integrated with a chatbot builder. The app currently supports Serbian, Bosnian, Croatian, and Montenegrin, with Russian in development.

 

Tell us about your startup and what you are building: 

Daktilograf, a product of OM3GA, is a machine-learning speech-to-text engine for Slavic languages that recognizes an unlimited number of words and can work both offline and online. Daktilograf is currently available in Serbian, Bosnian, Croatian, and Montenegrin, with Russian in development.

 

How would you describe your solution to a non-technical person? 

The online mode, together with the dictation and transcription modes, demonstrates Daktilograf’s potential as conversation-facilitating software, a learning aid, or an education assistant during online classes and in integrated classrooms. Offline mode can facilitate transcription on the move and in remote areas without an internet connection. Daktilograf is highly scalable because it installs lightly into local systems; that is, it doesn't require expensive servers or an online connection. Daktilograf can be used in several industries, such as media, education, in-car systems, and home automation.

 

What social issue/impact area does your solution aim to solve?

There are more than 350 million people worldwide who speak Slavic languages, but there is no speech-to-text software specifically created for them and accessible to everyone.  

Among Slavic speakers, there are 14 million people with hearing loss and over 2 million with vision loss.  

 

The top pain point for individuals with disabilities in schools is the inaccessibility of study material, especially in audio and video formats. Voice commands and captioned material made with existing speech-to-text technology can bridge that barrier, but not for students in less digitally competitive countries.

 

Speech-to-text programs have never been more needed than they are now, especially after several waves of COVID-19, during which everyone wore masks for long periods.

 

Speech-to-text solutions from major global companies are specialized in detecting English-speaking users and are designed to work only with an internet connection.  

 

Besides Slavic languages, Daktilograf can be trained to learn up to 56 languages, with accuracy that can be adjusted to users' needs.

 

Why use AI/Data Science? What’s promising about this technology? 

Our decision to use deep learning technology is closely tied to our vision: to make such advanced solutions accessible and affordable to everyone, so that they can fulfill their purpose.

"Our speech-to-text solution has the potential as a conversation-facilitating software, learning aid, or education assistant during online classes and in integrated classrooms. By becoming open source and opening datasets for speech-to-text training in Slavic languages to the public, we are giving an opportunity for others in need to build on top of our solution and develop similar products that will help the disabled community and the youth."

What is unique about your solution and how is it different from what currently exists? 

Schools in the Balkans traditionally do not have access to an internet connection, especially in rural areas. We believe that virtual assistants, subtitlers, and transcription should be accessible anywhere and not depend on the quality of the devices students use.  

 

There is a very light installation requirement for our software. This means that it can be implemented on a wide range of systems, from an individual old phone to a school's internal system. 

 

Why does being Open Source make your solution better? 

Open-source technology allows us to go back to our roots and make STT (speech-to-text) and TTS (text-to-speech) technology accessible to South Slavic-speaking people, giving others who develop user tools a basis for further development and encouraging voice commands to be part of every product.

 

It also allows the community to improve the product with knowledge and advice, and opens opportunities for community support in gathering and improving training materials, developing new languages, and using publicly available materials.

 

How did you come up with your solution and what inspired you to form your company? 

Daktilograf was founded ten years ago as a speech-to-text algorithm integrated into a prototype of a social network for children with special needs. After many updates, it became an independent speech-to-text engine for the Slavic languages.

The team behind this project has worked together for several years. Daktilograf's development wasn't easy since it was an ambitious project. From building and collecting datasets from scratch to our first contracts and clients, our business relations, as well as friendships, were battle-tested in a highly competitive market.  

 

Tell us more about your team. What makes your team diverse? 

Our team comes from diverse backgrounds: linguistics, education, software development, online security, advocacy, and journalism.  

OM3GA team: Amela Bicic (top left), Snjezana Gomilanovic and Amil Cengic (top right), Darko Vucurovic (bottom left), Denis Vacurovic (bottom right)

Why is diversity important for your startup? How does it add value? 

We believe that diversity generates more innovation daily. Different professional and personal backgrounds give us different perspectives on the same subject, and trust and friendship bring our ideas to success. Additionally, South Slavic languages are rich in dialects and variations, and our team is regionally and linguistically varied. We have team members from Montenegro, Serbia, Bosnia... This means that we’ve covered most of our dialects in our training process. 

 

What do you plan to do with UNICEF’s Venture Fund investment and how will you use that to leverage raising follow-on investment? 

With the UNICEF Venture Fund’s investment, we will expand our network and become more visible, and that validation and expansion of our portfolio will help us raise follow-on investment in developing solutions for the market. 

“The five-year-long journey from bootstrapping, first investments, and development led us to the support of UNICEF and a step closer to bringing our vision to fruition. UNICEF fund opens a door for us to connect to a larger group of researchers and potential investors and expand our development in order.”

What challenges are you currently facing in building your solution and/or startup?  

In more than five years of developing our product, we have found that building such solutions is expensive and demanding. Our biggest challenges were a lack of understanding of the technology in low-digitized countries and a lack of financial support.

 

How can others support you in working towards overcoming these challenges? 

We found that increasing the visibility of the solution leads to more financial support and a better understanding of machine learning technology. One advantage is that there is a broad spectrum of industries where Daktilograf can be used. However, we need more visibility to reach a specific audience of system integrators and developers who can build on top of our solution and speed up digital development. 

 

OM3GA company profile here.
