Aug 6, 2019
How JD.com uses Vitess to manage scaling databases for hyperscale
The following case study by the Cloud Native Computing Foundation originally appeared here: https://www.cncf.io/jdcom-case-study/
Challenge
China’s largest retailer, JD.com serves more than 300 million active customers with its e-commerce business. “A few years ago, it became apparent that as our data became more extensive, our MySQL databases became larger, resulting in declining performance and higher operations and maintenance costs,” says JD Retail Chief Architect Haifeng Liu. “We needed a solution that would enable us to easily and quickly scale MySQL, facilitate operation and maintenance, and reduce hardware and labor costs.”
Solution
JD chose Vitess for scalable management of large-scale database services and the support of online expansion of complex transactional data in MySQL. “We now run MySQL databases in containerized environments with Kubernetes and use Vitess for scalable cluster management and handling large volumes of complex transactional data,” says Liu.
Impact
With Vitess, JD has seen improvements in the scalability and elasticity of the database clusters. Increased resource utilization and efficiency and the automation of operation and maintenance functions have also led to reductions in labor and resource costs.
China’s largest retailer and the world’s third-largest Internet company by revenue, JD.com serves more than 300 million active customers with its e-commerce business.
The company also owns China’s largest e-commerce logistics infrastructure, which covers 99% of the country’s population and delivers more than 90% of orders same- or next-day.
“A few years ago, it became apparent that as our data became more extensive, our MySQL databases became larger, resulting in declining performance and higher operations and maintenance costs,” says JD Retail Chief Architect Haifeng Liu, who leads the Technological Infrastructure Department responsible for driving innovation in containerized infrastructure and developing the hyperscale, containerized, Kubernetes-based platform that powers all facets of JD’s business. “We needed a solution that would enable us to easily and quickly scale MySQL, facilitate operation and maintenance, and reduce hardware and labor costs.”
The company previously used JProxy, a Cobar-based database middleware system, for MySQL database management. After a thorough evaluation process, Liu says, “we eventually chose Vitess since it was the most suitable solution to address the biggest challenge we were facing: scalable management of large-scale database services and the support of online expansion of complex transactional data in MySQL. We now run MySQL databases in containerized environments with Kubernetes and use Vitess for scalable cluster management and handling large volumes of complex transactional data.”
Being a very early adopter of Vitess—and with one of the largest and most complex deployments of the technology, to boot—came with some challenges. “The re-sharding process was initially manual, the performance was poor, and the orchestrator could fail in large clusters with more than 5,000 instances,” says Liu.
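For readers unfamiliar with what re-sharding involves, here is a minimal sketch. It assumes range-based sharding, in which each shard owns a contiguous range of keyspace IDs, and uses a toy one-byte ID space. The `Shard`, `route`, and `split` names are hypothetical illustrations, not Vitess's API or JD's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """A shard owning keyspace IDs in the half-open range [start, end)."""
    name: str
    start: int
    end: int

    def contains(self, keyspace_id: int) -> bool:
        return self.start <= keyspace_id < self.end


def route(shards: list[Shard], keyspace_id: int) -> Shard:
    """Return the shard whose range covers keyspace_id."""
    for shard in shards:
        if shard.contains(keyspace_id):
            return shard
    raise ValueError(f"no shard covers keyspace id {keyspace_id:#x}")


def split(shard: Shard) -> tuple[Shard, Shard]:
    """Split one shard's range at its midpoint into two child shards."""
    mid = (shard.start + shard.end) // 2
    if mid == shard.start:
        raise ValueError("range too small to split")
    return (
        Shard(f"{shard.start:02x}-{mid:02x}", shard.start, mid),
        Shard(f"{mid:02x}-{shard.end:02x}", mid, shard.end),
    )


# Hypothetical one-byte keyspace ID space (0x00..0xff) with two shards.
shards = [Shard("00-80", 0x00, 0x80), Shard("80-100", 0x80, 0x100)]
before = route(shards, 0xC5).name

# Reshard: replace the upper shard with its two halves.
left, right = split(shards[1])
shards = [shards[0], left, right]
after = route(shards, 0xC5).name
```

The sketch covers only the routing change. In a live system the rows must also be copied to the new shards and traffic cut over without downtime, which is the part that is costly to do by hand.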
To make sure that Vitess would work at JD’s scale, Liu’s team made numerous improvements and changes, including bug fixes and new features. They have also developed performance optimization and automated management tools for the company’s JDOS Kubernetes platform. Among them:
- JTransfer, an online data synchronization and transmission tool that migrates data from JD’s traditional MySQL databases to the Vitess cluster in real time. All topology information in Vitess is stored in etcd.
- BinLake, a tool for real-time collection of binlogs from Vitess and traditional MySQL database services, which publishes the collected binlogs to the Message Queue (JMQ) service. Integrated with Vitess, “BinLake provides intelligent and highly available binlog collection services in clusters,” says Liu. “If there is resharding or failover in Vitess, BinLake will automatically adjust the database instance of the binlog.”
- Mole, a Vitess management system with a GUI console, which improves Vitess service management. “With Mole, we can easily create, reshard, monitor, and back up Vitess’s keyspaces,” says Liu.
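The source does not describe how BinLake is implemented. As a purely illustrative sketch of the behavior Liu describes, in which the binlog source is adjusted after a failover or resharding, the hypothetical registry below tracks which MySQL instance a collector should read from for each shard. The class, method names, and instance addresses are all assumptions; only the shard names follow the hexadecimal key-range style used by Vitess.

```python
class BinlogSourceRegistry:
    """Tracks which MySQL instance a collector reads binlog from, per shard."""

    def __init__(self) -> None:
        self._source: dict[str, str] = {}

    def set_primary(self, shard: str, instance: str) -> None:
        """Record (or replace, after a failover) the instance for a shard."""
        self._source[shard] = instance

    def reshard(self, old_shard: str, new_primaries: dict[str, str]) -> None:
        """Stop following old_shard and start following its replacement shards."""
        self._source.pop(old_shard, None)
        self._source.update(new_primaries)

    def sources(self) -> dict[str, str]:
        """Return a copy of the current shard-to-instance mapping."""
        return dict(self._source)


registry = BinlogSourceRegistry()
registry.set_primary("-80", "mysql-a:3306")
registry.set_primary("80-", "mysql-b:3306")

# Failover: shard "80-" gets a new primary, so the collector re-points.
registry.set_primary("80-", "mysql-c:3306")

# Resharding: "-80" is replaced by two shards with their own primaries.
registry.reshard("-80", {"-40": "mysql-d:3306", "40-80": "mysql-e:3306"})
current = registry.sources()
```

A real collector would also need to resume from the correct binlog position on the new instance. The sketch leaves that out.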
“Most of our improvements have been contributed back to the Vitess code base for other developers in the community to benefit from,” says Liu.
With Vitess, JD has seen improvements in the scalability and elasticity of the database clusters. Increased resource utilization and efficiency and the automation of operation and maintenance functions have also led to reductions in labor and resource costs.
Additionally, “Vitess helped the team grow their technical knowledge and strength in the areas of scalable management and elastic database,” says Liu. “The fact that Vitess is a CNCF project means we can significantly benefit from working with a large number of developers and end users in the most active and fast-growing open source community. With CNCF’s endorsement, Vitess can gain increasing awareness, attract more end users and bring together developers to the project. This is very beneficial for Vitess and its end users, which is important to us.”
Liu and his team are excited about the future of Vitess. In addition to migrating their systems to the newly released Vitess 3.0 (which Liu calls “a significant improvement”), they’re developing some common functions for the latest version and building a more complete operations and maintenance monitoring system.
For other organizations considering Vitess, Liu offers this advice: “The most value can be derived from Vitess when it is used in concert with Kubernetes. Before using Vitess, it is imperative to perform more testing and research to determine if Vitess is the right solution for your business, and to better understand what adjustments you might have to make to integrate it with your existing systems.”
For JD, Kubernetes, Vitess, and other cloud native technologies have been game-changers. “Modern service platforms need to be scalable and extremely efficient and agile,” says Liu. “Cloud native technology is well-suited to handle these ever-changing environments. It offers flexibility, efficiency, scalability, independence, continuous integration and delivery for software services, all of which further enhance the quality of software services and resource efficiency. Kubernetes has become the de facto standard and cloud native is a sure thing to bet on for the future.”