We all know the situation. No matter whom you want to call, whether online shops, mobile phone providers, banks or insurance companies, you end up at a call centre. Before you are put through to a human employee, a computer asks annoying questions about why you are calling. Then you wait in the queue forever and finally reach an irritated call centre employee who has no sympathy for your bad mood and, in the end, cannot even help because he does not know the answer either. Yet the competence and professionalism of the service agents on telephone hotlines have a major influence on customer satisfaction. In the fifth and final part of our intelligent speech recognition series, we look at how this technology can improve the situation for customers and call centre staff alike.
“Currently, calls are assigned to agents at random,” says Dagmar Schuller, CEO and co-founder of the Munich-based start-up audEERING. “On the other end of the line, I have a customer who in 99% of cases isn’t calling to say thank you for something or to wish the call centre employee a nice day. Most of the time these are people who have just had an accident, whose cat has knocked over the Ming vase, whose window has been broken, or who are annoyed about a bill that may be wrong. That means they are already stressed and probably not in a great emotional state.”
If the agent in the call centre is equally stressed, problems are inevitable. To improve the quality of service, call centre agents must therefore receive individual training, not only in technical matters but above all in how they come across to customers. This training is based on recorded calls that are then evaluated. Speech, voice and emotion recognition are essential for that evaluation, and this is where intelligent speech recognition comes into play.
Less stress on both sides
“Our software is called Callyser, and it makes it possible to analyse calls after the fact and to create new key figures for the call centre,” explains Dagmar Schuller. At the moment, agents are usually evaluated only by the duration and number of their calls. “But that doesn’t really do the agent justice. An agent who perhaps has fewer conversations a day but can turn a negative customer into a positive one, simply because he listens well, articulates well or gives the customer a good feeling, is worth much more than one who has 20 more conversations but doesn’t manage that.”
The new software, which is currently being tested as a beta version at a large German and Swiss company, opens up opportunities to develop new types of key figures. It can determine “in what state the customer comes in and which agent is best equipped for this caller. In addition, there are many more ways to determine how the conversation is going. If I use the system in real time, I even have the possibility to intervene, i.e. to recognize early whether the conversation is slipping further into the negative and, if necessary, to bring in the supervisor, who takes over because he is more experienced.”
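The real-time intervention described here can be pictured with a small sketch. It is not how Callyser works internally, which the article does not describe. It assumes a hypothetical emotion model that emits one valence score per short stretch of speech (from -1.0 for very negative to 1.0 for very positive), and the window size and threshold are made-up illustration values.

```python
from collections import deque


class EscalationMonitor:
    """Flags a call for supervisor takeover when the mood keeps sliding.

    Assumption: `valence` values come from some external emotion model
    and lie between -1.0 (very negative) and 1.0 (very positive).
    """

    def __init__(self, window: int = 3, threshold: float = -0.4):
        self.scores = deque(maxlen=window)  # keeps only the latest scores
        self.threshold = threshold

    def update(self, valence: float) -> bool:
        """Add the newest score and report whether to alert the supervisor."""
        self.scores.append(valence)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough evidence yet
        mean = sum(self.scores) / len(self.scores)
        still_falling = self.scores[-1] < self.scores[0]
        return mean < self.threshold and still_falling


monitor = EscalationMonitor()
flags = [monitor.update(v) for v in [0.2, -0.1, -0.3, -0.6, -0.8]]
print(flags)  # [False, False, False, False, True]
```

Requiring both a low average and a falling trend is one possible design choice: it avoids alerting the supervisor for a customer who started out angry but is already calming down.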
The system could also suggest that the agent offer the customer a voucher to appease him. The agent would not have to justify himself to his superior later on, as the neutral system had advised him to do so. “One must also think of the agent who has such conversations constantly. At some point, after the seventh or eighth call, he may no longer hear how desperate the person on the other end of the line is. But it’s about making the person who calls happy and helping them to solve the problem,” explains Schuller, “because it’s much more costly to win a new customer than to keep an old one. And every customer is different. One is very emotional, the other less so. The ability to tune in to each customer and listen accordingly runs out at some point, after the fourth, fifth or sixth conversation. The call centre analyses we do show, for example, how long the conversation lasted, how often one party interrupted the other and how the entire emotional course developed. That gives the opportunity to intervene early, to make the customer happy and also to relieve the agent of stress.” This helps both sides, the customer and the agent.
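To make the three figures Schuller mentions more concrete (call duration, interruptions and emotional course), here is a minimal sketch. The `Segment` format, the speaker labels and the valence scale are assumptions for illustration, and an interruption is simply counted whenever one speaker starts before the other has finished.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    speaker: str    # assumed labels: "customer" or "agent"
    start: float    # seconds from the start of the call
    end: float
    valence: float  # assumed scale: -1.0 (negative) to 1.0 (positive)


def call_metrics(segments: list[Segment]) -> dict:
    ordered = sorted(segments, key=lambda s: s.start)
    duration = max(s.end for s in ordered) - ordered[0].start

    # Count who cut in while the other party was still speaking.
    interruptions: dict[str, int] = {}
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.speaker != prev.speaker and cur.start < prev.end:
            interruptions[cur.speaker] = interruptions.get(cur.speaker, 0) + 1

    # Emotional course of the customer, from first to last utterance.
    course = [s.valence for s in ordered if s.speaker == "customer"]
    shift = course[-1] - course[0] if course else 0.0
    return {"duration": duration, "interruptions": interruptions,
            "customer_course": course, "customer_shift": shift}


call = [
    Segment("customer", 0, 10, -0.7),
    Segment("agent", 8, 20, 0.1),    # agent starts before the customer ends
    Segment("customer", 20, 30, -0.2),
    Segment("agent", 30, 40, 0.3),
    Segment("customer", 40, 45, 0.4),
]
print(call_metrics(call))
```

A positive `customer_shift` would be one way to capture the agent who "can turn a negative customer into a positive one", something that duration and call counts alone do not show.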
Intelligent routing
A further step is conceivable: intelligent routing. Because the technology needs only two or three seconds to evaluate what a voice sounds like, it could be a great relief for callers and agents. “It not only hears how happy someone is, but also whether he is male or female, how old he is and where he comes from. The system then does not route the call randomly to any agent, but uses the analysis to select one who is of similar age, maybe even of the same sex, comes from the same area and therefore has the same accent.”
That is, if someone from Bavaria calls, he is not put through to an agent from Dresden, because a caller is much more likely to feel understood when talking to a “fellow countryman” than to someone who speaks Saxon, for example. And there’s more. “If I know that this is an agent who has already had seven stressful conversations today and I give him the eighth now, then he will probably be ill tomorrow. So I prefer to give it to someone who has only had two stressful conversations,” explains Dagmar Schuller. “All the intelligence of the machine in the background can be used to optimize both sides. This can help to reduce sickness and stress levels, get feedback and provide coaching for the agents, not just once every six months but continuously. Then both the client and the agent will feel better, and I’ve won a lot.”
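The routing idea can be sketched as a simple scoring function. This is only an illustration of the principle, not audEERING's method: the attribute names, the weights and the `negativity` scale are all assumptions, and a real system would first have to estimate the caller's attributes from the voice.

```python
from dataclasses import dataclass


@dataclass
class Caller:
    age: int
    sex: str
    region: str
    negativity: float  # assumed scale: 0.0 = calm, 1.0 = very upset


@dataclass
class Agent:
    name: str
    age: int
    sex: str
    region: str
    stressful_calls_today: int


def match_score(caller: Caller, agent: Agent, stress_weight: float = 0.5) -> float:
    """Higher is better. All weights are illustrative assumptions."""
    score = 2.0 if agent.region == caller.region else 0.0  # same accent
    score += 1.0 if agent.sex == caller.sex else 0.0
    score += max(0.0, 1.0 - abs(agent.age - caller.age) / 20.0)  # similar age
    # An upset caller costs more when the agent already had a hard day.
    score -= stress_weight * caller.negativity * agent.stressful_calls_today
    return score


def route(caller: Caller, agents: list[Agent]) -> Agent:
    """Pick the agent with the best score instead of a random one."""
    return max(agents, key=lambda agent: match_score(caller, agent))


agents = [
    Agent("Anna", 42, "f", "Bavaria", stressful_calls_today=7),
    Agent("Ben", 35, "m", "Bavaria", stressful_calls_today=2),
    Agent("Carl", 58, "m", "Saxony", stressful_calls_today=0),
]
upset = Caller(age=40, sex="f", region="Bavaria", negativity=0.9)
calm = Caller(age=40, sex="f", region="Bavaria", negativity=0.0)
print(route(upset, agents).name)  # Ben
print(route(calm, agents).name)   # Anna
```

In this example Anna is the closest match for both callers, but the upset caller goes to Ben because Anna has already handled seven stressful conversations today, which mirrors the trade-off Schuller describes.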
In the long term, there might even be no human agents in a call centre at all, because they do not pay off, especially for low-price products. “Then I can also give my bot the intelligence to actually recognize emotional states and, just like K.I.T.T., the car in the Knight Rider series, to answer intelligently and make sure that it has understood everything correctly,” predicts Schuller. This scenario is not a distant dream but foreseeable. “We’re not talking about ten years, we’re talking about three to five years. Technically, we’re ready to deliver it.”
With text-to-speech systems like Google WaveNet, speech synthesis already works quite well today, even if everything still sounds a bit choppy. But research continues. In so-called Wave-GANs (GAN stands for Generative Adversarial Network), two neural networks are connected to each other, “which play against each other: a discriminator and a generator. The discriminator must always distinguish between real and fake data and send its verdict back to the generator, which then produces new data again and again until the discriminator can no longer distinguish between real and fake data. This technology is now being used in speech synthesis to generate new speech in a speaker’s voice that no longer contains any recorded human speech.”
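The interplay between generator and discriminator can be shown with a deliberately tiny toy example. It is far removed from real speech synthesis: instead of audio, the "real data" are single numbers drawn around 4.0, and both "networks" are reduced to two parameters each. All names and values are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps: int = 500, lr: float = 0.05, batch: int = 16, seed: int = 0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * noise + b
    w, c = 0.0, 0.0  # discriminator: P(real) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: rate real samples higher, fakes lower.
        grad_w = grad_c = 0.0
        for xr, xf in zip(real, fake):
            d_real, d_fake = sigmoid(w * xr + c), sigmoid(w * xf + c)
            grad_w += -(1.0 - d_real) * xr + d_fake * xf
            grad_c += -(1.0 - d_real) + d_fake
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: use the discriminator's feedback to make
        # the fakes look more "real" to it.
        grad_a = grad_b = 0.0
        for z in noise:
            d_fake = sigmoid(w * (a * z + b) + c)
            dx = -(1.0 - d_fake) * w  # gradient of -log D(x) w.r.t. x
            grad_a += dx * z
            grad_b += dx
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
    return a, b, w, c


a, b, w, c = train_toy_gan()
print(f"generator offset b = {b:.2f} (the real data are centred on 4.0)")
```

The discriminator's judgement is the only signal the generator receives: it never sees the real data directly. Adversarial training of this kind tends to oscillate rather than settle neatly, so the printed offset should be read as a rough indication, not a guaranteed result.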
Dagmar Schuller believes that this kind of speech synthesis will reach us within the next three years. “This gives us the opportunity to add an emotional component so that you can actually communicate with the machine as if you were talking to a person who understands you.” For the human side, there is even a great advantage in communicating with a neutral machine, she says, because in a human-to-human conversation you always automatically try to find out what mood the other person is in. “With a machine, you don’t have that problem. You can say directly, ‘It annoys me that I’ve ended up with you. I have insane problems with your product. It’s supposed to be assembled in minutes and I’ve been trying to build my stupid closet for half a day already. Help me.’ The intelligent bot then says, ‘Don’t be so annoyed, I understand you, the same thing happened to me last time’, and the customer laughs and the mood is already better.”
Photos and graphics: audEERING, Pixabay (title image)