Today, I resized a 40GB DigitalOcean droplet through their dashboard. I'm documenting some of the things I learned along the way.
A few months back we released Hyvor Talk v2, which was rewritten from scratch with a new database schema, and hence a new database. I chose to self-host the database rather than use DigitalOcean's managed database service, because our database gets high traffic and we often need to change MySQL configuration (such as InnoDB settings) that managed services do not expose. So, in our case, MySQL was installed on a droplet.
For this new database, I chose a 16GB Memory Optimized droplet. This was a bad decision. We received high CPU usage alerts more than 10 times in the last two months. This caused the system to slow down. One thing I learned is that a general-purpose database needs a good balance between CPU and RAM.
Before Hyvor Talk v2, we had used an 8GB Shared droplet, which worked great. Surprisingly, it was faster than this new 16GB Memory Optimized droplet. So, we decided that we had to either migrate to a new Shared droplet or resize the current one to a Shared droplet. We had three options.
Create a new droplet, install MySQL, and transfer data there.
Shut down the droplet, make a snapshot from it, and create a new droplet from the snapshot.
Shut down the droplet, and resize it.
1. Create a new droplet, install MySQL, and transfer data there.
I first thought of this process:
Set up the new droplet and install MySQL.
Stop the application so that no new data is written to the database.
Dump the data with mysqldump.
Import the dump into the new database.
Change the application configuration to use the new database.
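The dump and import steps above can be sketched roughly as follows. This is a minimal illustration, not the exact commands I ran: the database name, dump file name, and host are hypothetical, credentials are assumed to come from a client option file such as ~/.my.cnf, and the target database is assumed to already exist on the new server.

```python
import subprocess

# Hypothetical values for illustration only; replace with your own.
DATABASE = "app_db"
DUMP_FILE = "dump.sql"
NEW_HOST = "203.0.113.10"  # address of the new droplet (documentation range)


def dump_command(database: str, dump_file: str) -> list[str]:
    """Build the mysqldump invocation that writes `database` to `dump_file`."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent InnoDB snapshot without locking tables
        "--quick",               # stream rows instead of buffering whole tables
        f"--result-file={dump_file}",
        database,
    ]


def import_command(database: str, host: str) -> list[str]:
    """Build the mysql client invocation; the dump itself is fed on stdin."""
    return ["mysql", f"--host={host}", database]


def dump_and_import(database: str, dump_file: str, host: str) -> None:
    """Dump from the local server, then load the dump into the server at `host`."""
    subprocess.run(dump_command(database, dump_file), check=True)
    with open(dump_file, "rb") as dump:
        subprocess.run(import_command(database, host), stdin=dump, check=True)


# Usage (only after the application has been stopped):
# dump_and_import(DATABASE, DUMP_FILE, NEW_HOST)
```

Wrapping both steps in one function makes it easy to time the whole process on a test copy before attempting it on production.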
However, our database was around 40GB (data + indexes), and mysqldump took almost 15 minutes to complete. Then, I tried importing the dump into the new database. After 45 minutes, the import had only gotten as far as our pages table (which has 22 million rows). The next one was the pageviews table, which has more than 100 million rows, so the full import would probably have taken hours. We could not afford that much downtime. So, I thought of another plan: replication.
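To put a rough number on "hours", here is a back-of-the-envelope estimate. It assumes the import rate per row stays constant and that the first 45 minutes covered roughly the 22 million rows of the pages table; both are simplifications, since row sizes and indexes differ between tables.

```python
def estimate_minutes(rows_done: int, minutes_elapsed: float, rows_remaining: int) -> float:
    """Extrapolate the remaining import time from the observed rows-per-minute rate."""
    rows_per_minute = rows_done / minutes_elapsed
    return rows_remaining / rows_per_minute


# 22 million rows took about 45 minutes; pageviews has more than 100 million rows.
remaining = estimate_minutes(22_000_000, 45, 100_000_000)
print(f"{remaining:.0f} minutes (~{remaining / 60:.1f} hours)")
# prints "205 minutes (~3.4 hours)"
```

Even under these optimistic assumptions, the pageviews table alone would add more than three hours of downtime.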
I could set up MySQL replication. This was possible but hard, and it could cause a lot of downtime if something went wrong. I had no previous experience with MySQL replication, so even though it might have worked, I didn't go for it. There were some other problems too.
The current database had 85% of its disk used. To set up replication, I would have to turn on the binary log (binlog) in MySQL. Binlogs are files that record all changes made to the database, and for a busy 40GB database they could grow very large. I could attach a DigitalOcean volume (external block storage) to the droplet and store the binlogs there, but that is still more work.
Unlike importing from a full data export, if something goes wrong with replication, it is very hard to detect and fix.
I skipped my "replication" idea for a few days and checked other options.
2. Shut down the droplet, make a snapshot from it, and create a new droplet from the snapshot.
I created a test droplet with dummy data of 40GB. Then, I turned it off and started making a snapshot through the DO dashboard. Unfortunately, it took more than 40 minutes for the snapshot. So, this option won't work. We cannot afford a downtime of 40 minutes.
3. Shut down the droplet, and resize it.
This was actually the first option I considered. However, the DigitalOcean documentation says, "Allow for about one minute of downtime per GB of used disk space, though the actual time necessary is typically shorter." This was unfortunate. 40 minutes of downtime for 40GB of data is just not acceptable, and they do not specify how much "shorter" the required time can be. It was only because of this sentence that I checked the other two options explained earlier.
However, after I realized that those options were either not going to work or hard to do, I came back to this one. I created a new droplet of the same type (same RAM, CPU, and disk) as the current droplet and added dummy data to fill the disk up to 90% (close to 50GB). If the documentation was right, the resize should take about 50 minutes. I gave it a try and, surprisingly, the resize finished in less than 1 minute. I repeated the test on another new droplet to make sure the first result was not a "random" case. It was not: the second resize was quick too.
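Filling a test droplet's disk to a target percentage can be done in many ways. The following is one possible sketch, not the exact method I used: the function names and the dummy.bin file name are made up for illustration, and it writes random bytes so the file actually occupies space on disk.

```python
import os
import shutil


def bytes_to_reach(total: int, used: int, target_fraction: float) -> int:
    """Bytes that must be added so that used / total reaches target_fraction."""
    return max(0, int(total * target_fraction) - used)


def write_dummy_file(path: str, size: int, chunk_size: int = 64 * 1024 * 1024) -> None:
    """Write `size` bytes of random data to `path`, one chunk at a time."""
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n


def fill_disk(directory: str, target_fraction: float = 0.90) -> int:
    """Fill the filesystem holding `directory` up to target_fraction; return bytes written."""
    usage = shutil.disk_usage(directory)
    size = bytes_to_reach(usage.total, usage.used, target_fraction)
    if size:
        write_dummy_file(os.path.join(directory, "dummy.bin"), size)
    return size


# Usage on a throwaway test droplet only:
# fill_disk("/var/tmp", 0.90)
```

Run this only on a disposable droplet, because a nearly full disk can break other services running on the machine.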
So, today, I resized the production database, and the downtime was just 2 minutes (shutting down + resizing + turning it on again).
If you want to resize a droplet, resizing through the DigitalOcean dashboard seems to be the best option. Even though the documentation says to allow about 1 minute per GB, in my case it took less than 1 minute for 40GB. However, do not take this as a fact. Do your own testing before resizing a production droplet (such as a database) that cannot afford long downtime.
Create a new test droplet with the same resources (disk, RAM, CPU, and droplet type).
Fill it with dummy data. Alternatively, you can create this new droplet from a snapshot of the old droplet. But, you have to shut down the old droplet to take snapshots, and it takes too long to complete. So, filling it with dummy data is fine.
Then, resize your test droplet to the same size you are planning to resize your production droplet to.
If it takes a short time, you are good to resize the production droplet with a minimum amount of downtime.
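To time the test resize more precisely than with a stopwatch, a small probe script can be run from another machine. The sketch below is an illustration rather than a tool I used: it assumes the droplet exposes a TCP port you can reach (for example SSH on 22, or MySQL on 3306 if it accepts remote connections), and the IP address in the usage comment is a placeholder.

```python
import socket
import time
from typing import Callable


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_downtime(
    is_up: Callable[[], bool],
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Wait until the service goes down, then return the seconds until it is back up."""
    while is_up():
        sleep(interval)
    went_down = clock()
    while not is_up():
        sleep(interval)
    return clock() - went_down


# Usage: start this before shutting the droplet down, then resize and power it on.
# seconds = measure_downtime(lambda: port_is_open("203.0.113.10", 22))
# print(f"Downtime: {seconds:.0f} seconds")
```

The measured value covers shutdown, resize, and boot together, which is the number that matters for a production database.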
If you have any questions, feel free to comment below.