
AWS Redshift: Serverless vs Provisioned Cluster

An overview of Amazon Redshift Serverless and how it compares with a provisioned cluster.

5 min read · Jul 27, 2023
Photo by Choong Deng Xiang on Unsplash

AWS introduced Amazon Redshift Serverless in 2022. With it, you don’t have to worry about setting up, configuring, or managing clusters, or even tuning the data warehouse. Redshift Serverless automatically provisions and scales the underlying infrastructure for you.

This post assumes you have prior knowledge of AWS Redshift. Without it, the content here will be hard to follow. If you want to learn AWS Redshift first, you can enroll in the Udemy course "AWS Redshift - A Comprehensive Guide".

Redshift Serverless brings several benefits to organizations and individuals. They can focus on their core need, which is deriving insights from data, and they pay only for what they consume, which helps them manage costs better.

The different provisioning style of Serverless does not affect Redshift features or performance. You continue to run analytics use cases with high performance, rich SQL features, and integration with other services, just as you would on a Redshift cluster you provision yourself.

AWS advocates Serverless as the first preference. It recommends the provisioned cluster option only if you need fine-grained control over the Redshift platform.

Let’s look at how Serverless differs from a provisioned cluster, from a holistic perspective.

Managing Redshift

Workgroup and Namespaces

Fig: AWS Redshift: Isolating Workloads using Workgroup, Namespace

In a provisioned cluster, you manage the workloads directly. You manage the compute nodes and the leader node as part of the cluster. With Serverless, you create workgroups and namespaces to isolate workloads and manage resources.

The workgroup ensures you have the required infrastructure in place. Here you specify your compute needs in RPUs (Redshift Processing Units). You also define the network and security configuration, including the VPC it belongs to, the subnets, and the security groups.

Each workgroup must be mapped to a namespace, which provides a mechanism to group your database objects. A namespace can hold tables, schemas, data backups, and other database resources.
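To make the workgroup and namespace split concrete, here is a minimal Python sketch. It assumes the boto3 `redshift-serverless` client and its `create_namespace` and `create_workgroup` operations; the resource names, subnet IDs, and security group ID are hypothetical placeholders, and `DryRunClient` is an illustrative stand-in so the example runs without AWS credentials.

```python
from __future__ import annotations

from typing import Any


def namespace_params(name: str, db_name: str = "dev") -> dict[str, Any]:
    """Keyword arguments for create_namespace (groups the database objects)."""
    return {"namespaceName": name, "dbName": db_name}


def workgroup_params(
    name: str,
    namespace: str,
    base_rpus: int,
    subnet_ids: list[str],
    security_group_ids: list[str],
) -> dict[str, Any]:
    """Keyword arguments for create_workgroup (compute plus network settings)."""
    return {
        "workgroupName": name,
        "namespaceName": namespace,  # each workgroup maps to one namespace
        "baseCapacity": base_rpus,   # compute, expressed in RPUs
        "subnetIds": subnet_ids,
        "securityGroupIds": security_group_ids,
        "publiclyAccessible": False,
    }


def create_isolated_workload(client: Any, ns: dict[str, Any], wg: dict[str, Any]) -> None:
    """Create the namespace first, then the workgroup that points at it."""
    client.create_namespace(**ns)
    client.create_workgroup(**wg)


class DryRunClient:
    """Hypothetical stand-in for boto3.client("redshift-serverless"); records calls only."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def create_namespace(self, **kwargs: Any) -> None:
        self.calls.append(("create_namespace", kwargs))

    def create_workgroup(self, **kwargs: Any) -> None:
        self.calls.append(("create_workgroup", kwargs))


# All identifiers below are made up for illustration.
ns = namespace_params("analytics-ns")
wg = workgroup_params(
    "analytics-wg", "analytics-ns", 32,
    ["subnet-aaaa", "subnet-bbbb", "subnet-cccc"], ["sg-1111"],
)
client = DryRunClient()
create_isolated_workload(client, ns, wg)
```

With credentials configured, swapping `DryRunClient()` for `boto3.client("redshift-serverless")` should issue the real requests.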

Node Types

In a provisioned cluster, you decide on the node type and node size, based on your data size, query workloads and cost. When you work with Serverless, you don’t choose nodes.

Redshift Serverless provisions the infrastructure for you automatically. It monitors your query-processing load and scales the required infrastructure up or down.

Optionally, you can set the base and maximum capacity, both measured in Redshift Processing Units (RPUs). More RPUs give you more performance, but the cost also rises with the number of RPU-hours consumed.

Concurrency Scaling and Workload Management

With a provisioned cluster, you can enable concurrency scaling. This helps you deal with workload spikes that are infrequent or unpredictable.

As discussed earlier, Serverless does this for you automatically, scaling with the workload, so there is no need to enable concurrency scaling. Optionally, you can define the maximum capacity, in RPUs, to control costs.

Resizing

In a provisioned cluster, you can resize the cluster, adding or removing nodes to optimize performance or cost.

With Serverless, resizing does not apply. However, you can change the base capacity to meet more specific performance or cost requirements.
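As a rough sketch of what changing capacity looks like in practice, the helper below builds the arguments for the boto3 `redshift-serverless` `update_workgroup` operation. The `baseCapacity` parameter is the base RPU setting; `maxCapacity` is assumed to be available in your SDK version, and the workgroup name is a hypothetical placeholder.

```python
from __future__ import annotations

from typing import Any, Optional


def capacity_update(workgroup: str, base_rpus: int, max_rpus: Optional[int] = None) -> dict[str, Any]:
    """Build update_workgroup arguments for a new base (and optional maximum) capacity."""
    if base_rpus <= 0:
        raise ValueError("base capacity must be a positive number of RPUs")
    if max_rpus is not None and max_rpus < base_rpus:
        raise ValueError("maximum capacity cannot be lower than base capacity")
    params: dict[str, Any] = {"workgroupName": workgroup, "baseCapacity": base_rpus}
    if max_rpus is not None:
        # Assumption: the SDK in use exposes maxCapacity on update_workgroup.
        params["maxCapacity"] = max_rpus
    return params


# Hypothetical workgroup name; with credentials configured this could be sent as
#   boto3.client("redshift-serverless").update_workgroup(**params)
params = capacity_update("analytics-wg", base_rpus=64, max_rpus=256)
```

Validating the pair locally is a design choice to catch an inverted base and maximum before the request is sent; the service applies its own limits on top of this.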

Pausing and resuming

You can pause a provisioned cluster when you have no workloads to run and resume it when you need it again. This helps you optimize costs.

With Redshift Serverless, you pay only when queries run, so there is no need to pause or resume the data warehouse.
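For the provisioned case, pausing and resuming can be scripted. The sketch below assumes the boto3 `redshift` client and its `describe_clusters`, `pause_cluster`, and `resume_cluster` operations; the cluster identifier is hypothetical, and `DryRunRedshift` is an illustrative stub so the example runs without an AWS account.

```python
from typing import Any


def align_cluster_state(client: Any, cluster_id: str, should_run: bool) -> str:
    """Pause or resume a provisioned cluster so it matches the desired state."""
    described = client.describe_clusters(ClusterIdentifier=cluster_id)
    status = described["Clusters"][0]["ClusterStatus"]
    if should_run and status == "paused":
        client.resume_cluster(ClusterIdentifier=cluster_id)
        return "resumed"
    if not should_run and status == "available":
        client.pause_cluster(ClusterIdentifier=cluster_id)
        return "paused"
    # Already in the desired state, or mid-transition: leave it alone.
    return "unchanged"


class DryRunRedshift:
    """Hypothetical stand-in for boto3.client("redshift"); tracks status in memory."""

    def __init__(self, status: str) -> None:
        self.status = status

    def describe_clusters(self, ClusterIdentifier: str) -> dict:
        return {"Clusters": [{"ClusterStatus": self.status}]}

    def pause_cluster(self, ClusterIdentifier: str) -> None:
        self.status = "paused"

    def resume_cluster(self, ClusterIdentifier: str) -> None:
        self.status = "available"


stub = DryRunRedshift("available")
evening = align_cluster_state(stub, "demo-cluster", should_run=False)
repeat = align_cluster_state(stub, "demo-cluster", should_run=False)
morning = align_cluster_state(stub, "demo-cluster", should_run=True)
```

Checking the status first avoids asking the service to pause a cluster that is already paused. None of this scheduling is needed with Serverless.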

Maintenance window

With a provisioned cluster, you specify a maintenance window during which software patching can occur, for instance a weekly slot starting at 12:00 AM. Typically, you choose a recurring time when traffic is low.

With Serverless, there is no maintenance window. Updates are applied seamlessly behind the scenes, with no compromise on the availability of your data warehouse.

Provisioned clusters support switching between the current and trailing tracks, whereas Serverless has no concept of a track. Versions and updates are managed implicitly.

Monitoring & Logging

Query monitoring

Query monitoring provides a time-based view of the queries that have run. On provisioned clusters, it does not show all of the data available in the system tables.

Query monitoring in Serverless requires users to connect to the database and use the system tables, so query monitoring and the system tables stay in sync. When you use query monitoring, the queries against the system tables run as the database user mapped to your IAM user.
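As an illustration of reading query history from the system tables, the sketch below builds a query against the `SYS_QUERY_HISTORY` monitoring view and submits it through the Redshift Data API (`execute_statement` on the boto3 `redshift-data` client). The workgroup and database names are hypothetical, and `DryRunDataApi` is a stub so the example runs offline.

```python
from typing import Any


def slow_queries_sql(hours: int = 24, limit: int = 10) -> str:
    """SQL listing the slowest recent queries from SYS_QUERY_HISTORY."""
    hours, limit = int(hours), int(limit)  # force integers before interpolating
    return (
        "SELECT query_id, status, start_time, elapsed_time, query_text "
        "FROM sys_query_history "
        f"WHERE start_time >= DATEADD(hour, -{hours}, GETDATE()) "
        "ORDER BY elapsed_time DESC "
        f"LIMIT {limit}"
    )


def submit(client: Any, workgroup: str, database: str, sql: str) -> str:
    """Send the statement to a Serverless workgroup and return the statement id."""
    response = client.execute_statement(WorkgroupName=workgroup, Database=database, Sql=sql)
    return response["Id"]


class DryRunDataApi:
    """Hypothetical stand-in for boto3.client("redshift-data"); records the request."""

    def __init__(self) -> None:
        self.last_request: dict = {}

    def execute_statement(self, **kwargs: Any) -> dict:
        self.last_request = kwargs
        return {"Id": "dry-run-statement"}


api = DryRunDataApi()
sql = slow_queries_sql(hours=6, limit=5)
statement_id = submit(api, "analytics-wg", "dev", sql)
```

The Data API is asynchronous, so with a real client the rows would be fetched afterwards with `get_statement_result` once the statement has finished.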

Audit Logging

Audit logs provide valuable information about data-warehouse connections and user activities.

For a provisioned cluster, audit logs are typically delivered to Amazon S3 (cloud storage). With Redshift Serverless, CloudWatch is the destination for audit logs. S3-based audit log delivery is not supported with Serverless.
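A small sketch of how audit log export might be switched on for a Serverless namespace. It assumes the `logExports` parameter of the boto3 `redshift-serverless` `update_namespace` operation and its three log types; the namespace name is a hypothetical placeholder.

```python
from typing import Any, Dict, Iterable

# Log types accepted by the logExports parameter.
VALID_LOG_EXPORTS = {"userlog", "connectionlog", "useractivitylog"}


def audit_log_params(namespace: str, log_types: Iterable[str]) -> Dict[str, Any]:
    """Build update_namespace arguments that enable the given audit logs."""
    requested = set(log_types)
    unknown = requested - VALID_LOG_EXPORTS
    if unknown:
        raise ValueError(f"unsupported log types: {sorted(unknown)}")
    return {"namespaceName": namespace, "logExports": sorted(requested)}


# Hypothetical namespace; with credentials configured this could be sent as
#   boto3.client("redshift-serverless").update_namespace(**log_params)
log_params = audit_log_params("analytics-ns", ["userlog", "connectionlog"])
```

Once enabled, the selected logs should appear as CloudWatch log groups rather than as files in an S3 bucket.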

Event notifications

In a provisioned cluster, you create event subscriptions to manage notifications.

Redshift Serverless uses Amazon EventBridge to manage event notifications. This gives you more timely notifications about changes in your data warehouse.

Pricing & Billing

Compute-resource billing

With a provisioned cluster, billing occurs per second. With Serverless, you pay for the workloads you run, measured in RPU-hours (the number of RPUs multiplied by the hours they are in use) and metered on a per-second basis.
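The RPU-hour arithmetic can be sketched as follows. The price per RPU-hour used here is a made-up placeholder, not a published rate, and the 60-second minimum charge per workload is treated as an assumption; check the current pricing page for your region.

```python
def serverless_compute_cost(
    rpus: int,
    seconds: float,
    price_per_rpu_hour: float,
    min_seconds: float = 60.0,  # assumed minimum billed duration
) -> float:
    """Cost of one workload: RPUs x billed hours x price per RPU-hour."""
    billed_seconds = max(seconds, min_seconds)
    rpu_hours = rpus * billed_seconds / 3600
    return rpu_hours * price_per_rpu_hour


# Hypothetical price of 0.50 per RPU-hour, chosen only to keep the numbers round.
# 32 RPUs for 90 seconds = 32 * 90 / 3600 = 0.8 RPU-hours.
cost_90s = serverless_compute_cost(32, 90, 0.50)
# A 10-second query is billed as 60 seconds under the assumed minimum.
cost_10s = serverless_compute_cost(32, 10, 0.50)
```

Because metering is per second, a workgroup that sits idle accrues no compute charge, which is where the savings over an always-on provisioned cluster come from.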

Storage billing

If the cluster is RA3-based, Redshift managed storage comes into play. Otherwise, the storage cost is calculated from the GB consumed per month.

Security

Encryption

With a provisioned cluster, encryption is optional, whereas with Serverless, data is always encrypted.

The data can be encrypted with AWS managed or customer managed keys. With Serverless, the AWS managed keys are used by default.

Requirement for credentials on sign in

Access to Redshift requires sign-in credentials from a user associated with an IAM role. The IAM role has specific permissions attached for a provisioned data warehouse. Once authenticated, the user can connect directly to the database, to the Redshift console, and to query editor v2.

With Amazon Redshift Serverless, you don’t have to enter credentials every time you connect.

Recommendation & References

Redshift Serverless provides an easy-to-manage, cloud-native data warehouse platform. It reduces operational overhead and lets you focus on your workloads and data insights.

Serverless also helps you manage costs better, as you pay only for the RPUs consumed.


Written by agentred
