Apr 21, 2021
In-depth Report: IEEE Fellows Series – Dr. Xiaodong He: Using AI to Break down Communication Barriers
by Ella Kidron
“Before, on the research side, we used to think, ‘With technology, anything can be taken care of,’” explained Dr. Xiaodong He (pronounced “heh”, not “he”) in a conversation on Apr. 2. Dr. He is a vice president at JD.com and the deputy managing director of JD AI Research. An IEEE Fellow, he has decades of experience in most of the game-changing areas of AI. In joining JD, he moved from pure, cutting-edge research to extensive commercial applications alongside top-of-the-line research.
Dr. He specializes in using AI, specifically conversational AI – a collection of technologies including natural language processing (NLP), natural language understanding, speech recognition, and machine learning techniques necessary for continued improvements – to improve services. The technologies themselves enable AI such as chatbots to interact with people in a humanlike way. At JD, this is taken to the next level. The team, which used to be 10 people and has since expanded to 300, has built a highly complex and sophisticated AI chatbot system which can not only efficiently handle customer queries but also detect emotion and respond to consumers in an emotionally appropriate way.
In short, Dr. He and the team at JD are using AI to build more trust into the communication process, first in human-to-machine interactions and later in human-to-human ones. “If I had to summarize what we are doing in one sentence, we are using AI to help our client and their customer to feel closer,” he explained.
Dr. Xiaodong He
Research to commercial application
On-the-ground, firsthand observation has been instrumental in getting the product right. “We went to the frontlines to study from the top customer service professionals, and realized that before even solving a technical problem, the first thing they do is to understand where the customer is emotionally and get on the same page. Only after they have established emotional resonance do they attempt to address the technical elements of the customers’ query.” This is how the most effective and mutually beneficial conversations occur, and scenarios as layered and complex as everyday customer queries are not easy to create in a lab. “There are many scenarios which cannot be simulated virtually. You need real world applications in order to determine the right [technical] solution to the problem.”
The immediate real-world application scenario of serving over 471 million active customers (as of Dec 2020) from all regions across the country, with diverse backgrounds and experiences, has done wonders for JD’s conversational AI technology and helped it compete in world-leading AI competitions. “Through real-world application, we’ve been able to address some questions that the entire field, only looking at technology, didn’t consider before.”
Last year, JD’s AI Research team was ranked No. 1 at Stanford University’s QuAC (Question Answering in Context) competition, beating teams from a handful of leading companies around the world. JD has also submitted academic papers to top AI conferences hosted by the likes of the Association for the Advancement of Artificial Intelligence (AAAI) and the Association for Computational Linguistics (ACL). These papers contain discoveries that have been overlooked by entire industries, such as how to make a deep learning model manageable and explainable.
Dr. He and team at JD.com
The technology has also provided social value, and has even been lifesaving in some cases. The Fine-Grained Emotion Detection system that JD has developed ensures hypersensitivity to user queries. For example, last year a customer asked how many sleeping pills would be required to take one’s life. After noting the abnormality of the query, the system redirected it to JD’s “Life Channel” suicide hotline team. Composed of staff with years of front-line service, the team is trained in psychological counseling to soothe customers who express suicidal thoughts during communication with them, and to take appropriate action to preserve their safety. Early intervention and quick response have saved tens of lives.
Commercial application to reproducibility
Dr. He equates the advanced customer service system JD has built with a Ferrari. Its sophisticated emotion detection capability can not only understand the user but also respond in a tone that matches or acknowledges the user’s own. It is also fully customized to suit the unique needs of JD’s retail business, down to the most granular details.
The great technical challenge comes in transforming this into a product which is widely commercialized, in essence turning it into the equivalent of a Toyota Camry – a very effective and reliable tool that is also cost-effective and widely accessible. This is what Dr. He’s team has done.
There are three stages of development according to Dr. He: 1) technological breakthrough, 2) creating a product that businesses can use and 3) creating value for the end user. The first stage, technological breakthrough, involves breaking world records at the very frontiers of technology, where progress is often measured in gains of a single percentage point.
JD’s voice shopping, customer service chatbot and emotion detection capabilities have been broken up into modular APIs which businesses can use directly in different scenarios. “We fully understand that the end user is unlikely to do any additional development on top of the product we provide. It needs to be immediately usable. The technology is at the heart of guaranteeing efficiency, but we need to do more work on top of that,” explained Dr. He.
“[The end user] doesn’t need to know what voice detection is, or what our analysis model is in order to use the app effectively.” The final step is to turn this ability into a product where no knowledge of technology or what’s happening behind the scenes is necessary. “You need to create a closed loop where users who don’t understand technology and don’t even understand business can use it effectively – ‘I speak and you understand’ – that’s how you create real value.” This value translates into the three elements at the heart of retail and service – lowering costs, increasing efficiency and enhancing experience.
The technology requirement to achieve this is much higher. It’s a shift from going to the theoretical ends of the Earth to capture a 1% efficiency improvement to ensuring self-adaptation, robustness and diverse scenario application. Dr. He likens it to Elon Musk’s mission to commercialize and make accessible traveling by rocket to Mars. Researchers have two main motivations in technology development – technology for technology’s sake, or satisfying our internal curiosity about how far the envelope can be pushed, and creating value for society. “How you create value for society is simple – can society benefit from this product or not?” This involves reducing costs, scaling and ensuring reproducibility. When done right, it can transform industries ranging from municipal services to finance and more.
JD’s AI team recently announced the application and integration of its intelligent chatbot technology into the resident service hotline (12345 in China) for the city of Datong, a prefecture level city in Shanxi province.
Prior to the collaboration with JD, the hotline mainly relied on traditional customer service, resulting in a large number of missed calls and vague answers as service staff would rely on standard replies, impacting efficiency. Since applying JD’s technology, the number of daily calls has increased 31.7% with a pickup rate of 100%. The queries from Datong residents are first handled by the robot customer service and seamlessly transferred to human customer service as necessary. The service is also customized to recognize the local dialect, thanks to semantic analysis technology.
The application of JD’s technology has enabled a more service-oriented approach to local government services for residents while reducing costs. It’s also making it easier for all populations, such as seniors, to get their problems solved. Some seniors cannot use a computer or a phone or even type in Pinyin (the phonetic way to type Chinese) in order to write characters. “We want to provide high quality and attentive service to these people in a very natural way.”
According to Dr. He, it’s not about overcoming the so-called digital gap as much as breaking down communication barriers. With JD’s service, a senior resident in Datong who dials 12345 to report a broken traffic light at the end of her street can provide the address and arrange for someone to fix it within a single phone call, then receive a follow-up call a few days later to check whether it has been fixed.
An unlimited market
The application scenarios for such technology are seemingly endless. In the early 1940s, IBM’s president, Thomas J. Watson, said there was a world market for about five computers. At the time, computers were exorbitantly expensive and required a highly technical background in order to operate. Now, with smartphones, over 3.2 billion people carry the equivalent of a powerful computer in their pockets. High quality customer service can be looked at the same way.
“In the future, not only will products have agents (to market themselves to consumers) but people will too.” They won’t just have one agent though. Dr. He envisions a world of multiple smart agents, similar to the way people use multiple apps on their phones. “We will have agents to manage our health, finances, education, and more.” He joked: “We might even have an agent for communication with our bosses, and an agent to manage our agents.”
Dr. Xiaodong He at JD’s office in Beijing
Breaking down barriers to high quality service
It may sound space age, but such a future could be closer to becoming reality than one might think. There have been huge breakthroughs from a tech perspective in the past five years, according to Dr. He. In 2016, AlphaGo’s defeat of 18-time world champion of the game Go, Lee Sedol of South Korea, demonstrated the sheer might of technology. Just last year, GPT-3 (Generative Pre-trained Transformer 3), an AI developed by OpenAI that is better at creating content that has a language structure – human or machine language – than anything that has come before it, took the world by storm. The next five years will be about product and ability breakthroughs. “We are moving from ‘Art and Science’ to engineering,” explained Dr. He. Every time there’s a new breakthrough, a US$100 billion company is created. “We hope we’re the next one,” he said with a chuckle.
Breaking down the barrier to entry for high quality service is just the beginning. “It used to be that only upscale places could afford to provide such service. Once the technology progresses in reducing costs, we will see the demand for humanized customer service mushroom.”
It is an immense technical challenge poised to provide massive value. “At this scale, we are doing something that nobody has ever done before. It’s incredibly exciting.”