When I prepare the "What's new on the cloud..." series, I'm pretty sure that for Azure the most updates will go to the Azure SQL service. The main idea of the service is simple but if you analyze it more deeply, you'll find some concepts that might not be the easiest to understand at first.
Data Engineering Design Patterns
Looking for a book that defines and solves most common data engineering problems? I'm currently writing
one on that topic and the first chapters are already available in π
Early Release on the O'Reilly platform
I also help solve your data engineering problems π contact@waitingforcode.com π©
First and foremost, why Azure SQL can be confusing? After all, it's supposed to provide a managed version of the RDBMS, isn't it? Yes, but it has multiple flavors that can be a bit troubling if you've just started to discover the service:
- SQL Database which is a managed service including serverless compute.
- SQL Managed Instance which is a fully managed SQL Server instance.
- SQL Server on Azure VMs which is an ideal solution for the lift-and-shift SQL Server migrations from on-premise to the cloud.
Unfortunately, that's not all! The service has some other interesting and frequently updated parts.
Managed instance
The Managed Instance is a fully managed SQL Server. It's probably enough to identify it as a solution in Azure SQL but not enough to understand it fully. There are some other concepts to know:
- vCore. The name comes from virtual core and represents a logical CPU. It's also the definition of the hardware used by the database instance.
- Tiers. The service offers General and Business tiers. The former is adapted for the classical workloads, whereas the latter was designed to handle low I/O latencies and reduce the maintenance impact. To be more precise, the Business tier ensures this better performance and reliability by using SSD-based storage, Always-On availability groups, and an in-memory OLTP.
- Always up-to-date... Unlike a self-managed SQL Server, the service benefits from automatic updates that might introduce some compatibility differences with the on-premise instances in terms of obsolete, retired or replaced features.
- ... but different. Additionally, there are also some more technical differences, such as network connectivity (TCP-only for MI), Azure Data Factory runtime for SSIS packages, or the lack of physical paths support (e.g. BULK INSERT working only with Azure blobs).
Serverless Azure SQL
Serverless is a special tier available in Azure SQL Database where you control the throughput with min/max vCore number and an auto-paused delay. As for the Managed Instance, let's see these and several other points more in detail:
- The auto-pause delay defines how long the database should be up without any activity. This feature can be disabled if you want to avoid the warm-up cost.
- Warm-up. Although the Serverless offering is great for unpredictable workloads, it suffers from a warm-up period and can introduce a significant delay for latency-sensitive applications.
- Minute-based scaling and second-based billing. As for the Managed Instance, the main compute configuration also relies on the vCores. When the database hasn't enough capacity to meet the current traffic demand, the service usually autoscales within minutes.
Flexible sever
Flexible Server is a special deployment mode for MySQL and PostgreSQL databases. Even though Azure SQL has a Single Server deployment mode for MySQL and PostgreSQL, the Flexible Server mode is recommended for all new projects. Why?
- Costs. You can control the cluster lifecycle with the stop/start feature and therefore, lower the costs.
- More granular control and flexibility over the configuration settings. Flexible Server supports single-zone or multiple-zones High Availability mode, whereas the Single Server can only work with a single-zone. It also enables some interesting features by default, like PgBouncer for a more efficient connections pooling.
- Zone colocation. In the new features compared with the Single Server, the Elastic Server mode supports colocating the database instance with the client applications.
Hyperscale
The last Azure SQL component to present is Hyperscale. It's also vCore-based but has some significant differences compared to the already presented modes:
- Scaling. Hyperscale can go far beyond the scaling limits of the Critical and Business service tiers in Azure SQL Database.
- Big Data. Even though the article presents a service apparently dedicated to classical and transactional workloads, Hyperscale can be considered as one of the possible Big Data databases. It offers up to 100TB storage, rapid scale up and scale out actions (horizontal + vertical), and more efficient transaction log throughput.
- OLTP. Although the previous point may invoke the analytics purpose, Hyperscale still performs better for OLTP and hybrid transaction and analytical (HTAP) workloads.
I only presented here some high level concepts of the Azure SQL with the goal to organize a bit better my understanding of the service. I know, certainly the service itself is not the first database option you think while designing a data architecture. But as you can see, it offers some interesting methods for the transactional workloads that can also scale.