Conversational AI tech improving by the day, will have huge impact: Ravi Saraogi, co-founder, Uniphore


Ravi Saraogi co-founded Uniphore Software Systems to solve what he calls a "frontier problem for the world": bridging the gap between humans and machines through voice and AI.

Uniphore is a pioneer in conversational AI, with products used by more than 1,500 customers in over 20 countries. It is one of the world's largest AI-native companies and features in the Deloitte Technology Fast 500, a ranking of the fastest-growing technology and telecom companies in North America. Its products combine Generative AI, Knowledge AI, Emotion AI and workflow automation to help companies engage with their customers. The company's strength is its capacity to capture and analyse voice, video and text data; its products can track and monitor conversations by intent, sentiment, and emotional and tonal analysis, helping enterprises leverage data from conversations.

Uniphore's Emotion AI tracks not only one-on-one interactions but also how a large number of participants are feeling, as in a conference room. In addition to facial analysis, its tech analyses body language, sentiment, back-channelling and talk-overs, disposition and tone.

Saraogi spoke to indianexpress.com about the challenges and opportunities of conversational AI in an Indian context, how the industry is progressing, and where it could prove most impactful. Edited excerpts:

Venkatesh Kannaiah: Can you tell us broadly how conversational AI could be used to impact education and health outcomes in India?

Ravi Saraogi: You must understand that AI in its present form will have a much bigger impact on society than the earlier innovations of the internet, cloud and mobile combined. Whether that impact turns out to be positive or negative depends on the rules and regulations that our governments, and we as a society, come up with to monitor, audit and manage it.

Conversational AI is a layer on top of AI per se. Machines are becoming more and more capable as we feed them more AI-ready data. We humans need to interact with these machines, and they need to understand how we converse. A conversation has three parts: first, the input, whether voice or text; second, the emotion and meaning behind the interaction, which arises from the emphasis we put on various words; and third, human expression, which comes from video. If we are able to capture all three, we capture the full nuance of the conversation. Machines are now ready and able to understand conversation in all three aspects: language, voice and video.

We can now track tonal intonations, capture the intent of a customer call or a conversation, and gauge the overall emotion of the caller through voice analysis; if video is available, we can monitor facial expressions and come to a conclusion about the caller's sentiment or emotional state. Add to this the earlier data from the customer's interactions, and we have a full picture of the whole conversation.
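
To make that concrete, here is a minimal sketch of how the three signals described above (language, voice, video) might be fused per conversation turn. The data model, keyword list and thresholds are illustrative assumptions, not Uniphore's actual pipeline; a real system would use trained speech, language and vision models in place of these toy rules.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    transcript: str         # "language": output of speech-to-text
    pitch_variance: float   # "voice": crude stand-in for tonal analysis
    facial_expression: str  # "video": label from a hypothetical vision model

NEGATIVE_WORDS = {"refund", "complaint", "cancel", "angry", "broken"}

def analyse_turn(turn: Turn) -> dict:
    """Fuse text, voice and video signals into one assessment of a turn."""
    words = set(turn.transcript.lower().split())
    text_negative = bool(words & NEGATIVE_WORDS)       # intent from language
    voice_agitated = turn.pitch_variance > 0.7         # emotion from tone
    video_negative = turn.facial_expression in {"anger", "disgust"}
    # Escalate when at least two of the three signals agree.
    escalate = sum([text_negative, voice_agitated, video_negative]) >= 2
    return {"text_negative": text_negative,
            "voice_agitated": voice_agitated,
            "video_negative": video_negative,
            "recommend_escalation": escalate}

print(analyse_turn(Turn("I want to cancel and get a refund", 0.85, "anger")))
```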

Now, this is what Uniphore can do in the normal course for an enterprise customer. This kind of tech has huge opportunities in education and health, and in solving other social problems.

A simple application: during online classes, we can monitor students' behaviour, intent and real-time attentiveness, so the teacher can focus on those who are underperforming or unable to grasp a concept. We can equip the teacher with potential questions and responses based on the ongoing interaction, and even when no teacher is involved, an AI application can understand a student's question and answer it.

"English would be the language where most applications are being built first and since there is a global focus, conversational AI in English is fairly advanced," Saraogi says. (Express photo by Jithendra M)

Right now in telemedicine, a lot of machine translation is being used. When a patient asks a question in their native language, there can be real-time translation for the doctor, who then answers in their own language, and the answer is translated back for the patient. Assuming we have enormous amounts of data on the facial expressions and voice samples of patients in pain or under distress, we can load it into AI applications, and doctors will start to have a much better understanding; these would become crucial decision-support tools for doctors. What we need are enormous amounts of voice and video data as input. This is not a very futuristic scenario: the tech is progressing fast, and within the next three to five years there could be a large number of applications.
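
The translation relay itself is simple to picture. Below is a toy sketch of the round trip, with a dictionary standing in for a real machine-translation model; the phrases, language codes and `translate` helper are invented for illustration.

```python
# Toy "model": (text, direction) -> translation. A real system would call
# a trained machine-translation model here instead.
TOY_MT = {
    ("vayitru vali", "ta->en"): "stomach pain",
    ("take rest and drink fluids", "en->ta"): "oivu edungal, niraiya thanneer kudikkavum",
}

def translate(text: str, direction: str) -> str:
    return TOY_MT.get((text.lower(), direction), f"[no translation for: {text}]")

patient_question = "vayitru vali"                   # patient speaks Tamil
for_doctor = translate(patient_question, "ta->en")  # relay to the doctor
doctor_answer = "take rest and drink fluids"        # doctor replies in English
for_patient = translate(doctor_answer, "en->ta")    # relay back to the patient
print(for_doctor, "|", for_patient)
```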

Venkatesh Kannaiah: How relevant is conversational AI tech in an Indian language context?

Ravi Saraogi: English would be the language where most applications are being built first, and since there is a global focus, conversational AI in English is fairly advanced. We have captured multiple variations of speech, and we operate in all English-speaking markets. In Indian languages, the tech has matured for Hindi and Tamil, and in other major languages it is progressing fast.

Ravi Saraogi

With Indian languages, the challenge is that dialects change every 100 kilometres, so there are wide variations of expression even within the same dialect. When we get to emotional analysis of voice or facial expressions, we are a bit behind English, as voice samples from real-life situations come with wide variations: the voice itself may lack clarity, there may be background noise, and a multitude of other things. However, the more data we get and input, the faster the machines learn, and it gets better by the day. The science of speech is so complex that we can research it forever.

Venkatesh Kannaiah: Can you tell us about Emotion AI and how it could be used for impact? What are the likely scenarios for its use in India?

Ravi Saraogi: The model we work with summarises human emotion into six basic categories (fear, anger, joy, sadness, disgust and surprise) and 200 secondary emotions. We train our Emotion AI on such models to identify the raw emotion, and then we explore how deep that emotion is.
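
As a rough sketch of how such a taxonomy might be applied: resolve a fine-grained secondary emotion to its primary category, then attach a depth score. The mapping and threshold below are illustrative assumptions (the real model covers some 200 secondary emotions), not Uniphore's actual implementation.

```python
PRIMARY = {"fear", "anger", "joy", "sadness", "disgust", "surprise"}

# A tiny illustrative slice of a secondary-to-primary mapping.
SECONDARY_TO_PRIMARY = {
    "frustration": "anger",
    "irritation": "anger",
    "delight": "joy",
    "anxiety": "fear",
    "disappointment": "sadness",
}

def classify(secondary: str, intensity: float) -> dict:
    """Map a secondary emotion to its primary category with a depth label."""
    primary = SECONDARY_TO_PRIMARY.get(secondary)
    if primary is None:
        raise ValueError(f"unknown secondary emotion: {secondary}")
    assert primary in PRIMARY
    return {"primary": primary, "secondary": secondary,
            "depth": "deep" if intensity > 0.6 else "mild"}

print(classify("frustration", 0.8))  # {'primary': 'anger', ..., 'depth': 'deep'}
```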

Right now, banks and airlines are using some of the Emotion AI features. For them, there are two things to monitor. One is how agitated or angry a customer is about a particular issue, so that remedial action can be taken; the other is how the call centre agent is interacting with the customer. Were the tone, the content of the message and the communication right and appropriate for the occasion? This is what enterprises are looking for.

But in the Indian context, this tech has interesting applications. It could be used to monitor patient-doctor interactions at the first point of contact, to find out how responsive the doctor is; or in student-teacher interactions, to find out how interested the student is in acquiring knowledge and how responsive the teacher is to the student's questions. At the end of the day, the machine needs to understand human emotion in the right way for it to analyse and help us, and it is for us to teach it the same.

Venkatesh Kannaiah: Can you tell us about Knowledge AI and its implications in the Indian govtech context?

Ravi Saraogi: Governments are very good at building large platforms for citizens. If you look at the government as an enterprise, it is sitting on tons of data, be it Aadhaar data or education, healthcare, traffic and mapping data. It can set up a core processing infrastructure, align the whole stack of data it holds, enable Knowledge AI on top of it, make the data AI-ready, and then open it to universities and startups to build applications and large language models with generative AI capabilities. This will help create citizen-facing applications in a much more calibrated manner, say in traffic management, town planning, or connecting patients and doctors in an AI-enabled way. Such applications can help a wide range of stakeholders, such as farmers, students and patients.
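
The layering he describes (AI-ready public data, a knowledge layer on top, applications built against it) can be sketched very simply. The retrieval below is naive word overlap and the documents are invented; a real Knowledge AI layer would use embeddings and a generative model over actual government datasets.

```python
# Invented "AI-ready" public documents; stand-ins for a government dataset.
DOCUMENTS = [
    "Paddy procurement prices for the current season are listed district-wise.",
    "Heavy rainfall warning issued for coastal districts this week.",
    "Subsidised tractor rentals are available through the cooperative society.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (a toy knowledge layer)."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

# An application, e.g. a farmer helpline bot, queries the knowledge layer.
print(retrieve("what is the procurement price for paddy", DOCUMENTS))
```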

Venkatesh Kannaiah: Do you work with governments in India?

Ravi Saraogi: In partnership with IIT Madras, we have worked with the government of Tamil Nadu to build applications that provide farmers with information on agricultural practices, natural calamities, prices, the availability of agricultural equipment, and the cost of seeds and other agricultural inputs. This initiative has reached almost 40 lakh farmers. We have also worked with the Election Commission in Tamil Nadu to provide information to voters.

Venkatesh Kannaiah: What is right and what is wrong with the AI ecosystem in India?

Ravi Saraogi: Governments, both at the Centre and in the states, are forward-looking. India has come up with guidelines on the use of AI. Governments are investing a lot, looking at AI in a positive manner and with an open mind, and are hopeful it will help solve large social problems. There is also a focus on incubators and accelerators. So overall, the signal is positive for AI-based startups and entrepreneurs.
