Recently, one of our talented engineers, Elliott Coleman, took a deep dive into PlanetScale and its capabilities for scaling MySQL databases. This article lays out all his findings and insights that may help bring our clients to the next level. These are the following key pieces of information he’s proud to share.
Should you use it?
PlanetScale is a hosted MySQL platform, designed to be suitable for applications of any size — “from idea to IPO”, with a pricing model that’s meant to reflect this. It’s constantly evolving, with new features implemented regularly — such as an HTTP API released during the time of writing this post!
Among other features, it has a git-like workflow through schema branching, perfect for collaboration within larger teams. Members of the team can make non-blocking changes in isolated branches before requesting reviews from other members of the team — a tried and tested workflow in the software development realm.
What actually is PlanetScale?
PlanetScale describes itself as a scalable serverless database platform built on top of the Vitess project. PlanetScale aims to bring the power of Vitess alongside additional functionality with an easy-to-use, deploy-in-10-seconds database platform.
Vitess is one of the 16 graduated projects from the Cloud Native Computing Foundation (CNCF) — which, fun fact, YLD is a Silver member of. Built as a side project from Google to scale YouTube, Vitess joined CNCF in 2018, where Vitess has since helped scale organisations such as Slack to handle billions of queries a day.
What can PlanetScale provide?
The key feature offered by PlanetScale is the ability to branch schema changes. This allows developers to change the schema in isolated branches, similar to the workflow prescribed in git. These branches can then be merged via a Deploy Request, where other developers of a team can review the schema changes, similar to a Pull Request.
As a result, teams can be more efficient when updating database schemas, and changes can be tracked and reverted with Version Control.
PlanetScale also provides a layer of abstraction between customers and Vitess, which like any large software deployment, can be challenging to set up and maintain in production — from software upgrades, configuration, and compatibility to hardware management. Instead, PlanetScale hides all this through a managed cloud-based platform accessible via an intuitive CLI.
What can the CLI do?
From my experience with PlanetScale, I’ve got the impression that it was built CLI first. It offers comprehensive functionality, assisting the possibility of complex build pipelines to be configured. I noticed a production bug in their UI, reported it, and it was fixed within 2 hours. However, the CLI provided me with an alternative interface to achieve the same goal.
The CLI allows you to branch, manage deployment requests, make schema modifications, and deploy those changes. Working locally, you can authenticate via the web browser via pscale login. Alternatively, CLI Service Tokens can be used where pscale login cannot be used — such as in CI/CD pipelines.
For local development, PlanetScale offers a CLI command to proxy secure connections to your PlanetScale database pscale connect
This allows you to connect via localhost, where the connection is forwarded onto PlanetScale. Assuming the proxy is configured to use the default MySQL port, your database will be accessible via 127.0.0.1:3306or mysql://root@127.0.0.1:3306/ That way, you can use any tool that can work with the MySQL Connector Protocol (i.e. TablePlus).
Building a simple pipeline
To understand the possibilities of the PlanetScale CLI, I decided to integrate it with AWS CodeBuild to automate database changes. PlanetScale has a great official tutorial describing the full process of creating an AWS build pipeline, which you can find here.
My idea of the pipeline was that a PlanetScale branch with a name corresponding to a code-base change would automatically get merged into production.
- The developer creates PR for a branch on GitHub
- The developer creates a branch on the PlanetScale database
- Once the PR is merged, the PlanetScale branch is automatically deployed to production
Since CodeBuild provides the Pull Request number as an environment variable CODEBUILD_WEBHOOK_TRIGGER=pr/[pr-number] during the PR trigger, this was used to correspond with the name of a PlanetScale branch.
The first step CodeBuild needed to do was to create a deploy request (DR) based on the pull request trigger. The database name is provided alongside the name of the branch — generated based on the webhook environment variable. Since this will be running remotely, the service token details are provided, and the result is formatted as JSON to be interpreted by further commands.
The next step is to retrieve info about the DR just created — such as its state and whether there are changes to be deployed. The DR number is extracted from the previous create DR step.
If the branch has schema changes, the DR can be deployed into the production branch.
Can it scale?
Yes, it’s in its name, after all! Because PlanetScale is built on top of Vitess, it harnesses the speed offered through MySQL connection pooling. Thousands of lightweight connections are pooled into RAM-intensive MySQL connections. With AWS RDS, connections are limited to 16 thousand, whereas PlanetScale is limited to 250 thousand connections, although they say it’s virtually unlimited. To achieve 16k connections with RDS, manual configuration is required to enable pooled connections, whereas PlanetScale provides this scale out-of-the-box.
What else can Vitess offer?
A key feature offered by Vitess is horizontal database sharding. This allows databases to span across multiple data servers. Each shard contains the same columns of data, but different rows, allowing server load to be spread across multiple machines. In the example below, the original table on the left has been split into two partitions, which could be stored in different data centres worldwide.
Vitess also offers high availability through the primary replica functionality. By duplicating the data into additional read-only data centres across the globe, data can be accessible in more locations with little need for configuration through PlanetScale.
How about Edge computing?
It’s tricky (if not impossible) to get native drivers to work on modern edge environments. Besides, it’s not even recommended to persist or pool connections on very elastic platforms (like Lambda or Workers). For those use cases, you can use PlanetScale’s new Fetch-compatible driver, which can run anywhere and interfaces with a plain HTTP API.
How can YLD help?
If you’re looking to get into the latest technology or want to join a collaborative team of great engineers and designers that work on impressive client projects, feel free to contact us at YLD at talent@yld.io for open jobs or hello@yld.io to discuss the prospect of a potential project.