Sometimes people could neglect the security aspect of their IT resources. You might put more emphasis on the implementation and performance since they can have a more visible impact immediately. However, you shouldn’t forget that security is one of the pillars that support the success of any IT project. That is especially true when we talk about the security of the data warehouse where massive amounts of data reside. In this article, we will discuss the best practices for Amazon Redshift database security management.
The article divides the Redshift database security recommendations into three major categories. First, it will cover the macro-level security that talks about environmental security topics. Second, we discuss the micro-level security within Redshift. Finally, we will have a look at monitoring and logging that can be used to block threats.
Macro-Level Redshift Database Security
Let’s remind ourselves of the fact that Redshift runs within the AWS ecosystem. Before enhancing security within Redshift, it is important to ensure a secure environment where Redshift can safely thrive.
Create a New Redshift Admin and IAM users
When you create your first AWS account, the default account is given unlimited access and control for entire AWS resources. Also, the default account has permission to terminate your account, which will wipe out your entire AWS infrastructure and data within. This root user or superuser can bypass all permission restrictions. Furthermore, superusers retain all privileges regardless of GRANT and REVOKE commands.
Thus, it is not a good practice to keep using a superuser for frequent daily tasks. Instead, create a new user that has the root permissions limited to Redshift and the relevant resources. For other users, it is a best practice to use IAM to control their permissions securely instead of giving them secret and access keys. These keys need to be used only for applications and other programmatic usages not for users.
Tighten Inbound Fraffic to Redshift Using Security Group
By default, when you first provision a Redshift instance, it will be attached with a default cluster security group. The default security group doesn’t contain any rule, which blocks all inbound traffic. When you start adding inbound IP address ranges, be specific and restrictive. Investigate which IP ranges will need to be allowed. For example, you might want to allow traffic only from your company’s employees, clients, and other related systems.
Use VPC for Secure Redshift Network Environment
Another best practice for a secure network for Redshift is adding another layer. In AWS, you can build a virtual fence by using AWS VPC (Virtual Private Cloud). With VPC, you can define your topology including gateways, routing tables, and public and private subnets. This allows you to create a private and secured environment for Redshift instances.
Encrypt Load Data in S3 Bucket
One of the more powerful Redshift features is allowing users to load big data from S3 bucket directly into Redshift storage. Also, you can unload data from Redshift into S3. When you export and import data in files, it is a good practice to encrypt them at rest in S3. An easy way to encrypt your file data is by server-side encryption. You can encrypt your data at rest in S3 either by AWS API or the management console.
Micro-Level Redshift Database Security
Once we harden the security outside Redshift, we also need to ensure security within the Redshift database.
Consider AWS Security Guidelines
AWS shares the core security guidelines for every user. No matter how advanced the security practices you want to apply, you will want to remember these fundamentals concepts:
- Redshift, by default, grants privileges only to the object owner.
- You can grant explicit privileges by assigning the privileges to a user account.
- You can grant implicit privileges by assigning the privileges to the group of the user. When a new user becomes a member of the group, the new user automatically is given the same privileges.
- Groups are collections of users that can be collectively assigned privileges. This way, you can manage privileges more easily for the users within a group.
- Schemas consist of collections of database tables and other database objects. Schemas are similar to file system directories, except that schemas cannot be nested. Users can be granted access to a single schema or to multiple schemas.
Grant Restrictive Privileges to Groups and Users
Before your Redshift users grow too fast, it is a good practice to design groups and users and define permissions as planned. Giving unnecessary DML and DDL permissions to users can lead to unexpected incidents. When there is a change made to your data in Redshift by mistake, that may not be recoverable. Even if there is a way to restore, for example, by using a snapshot image, you might end up losing some data that was generated during the snapshot schedule interval. To prevent that, you will want to list groups and users and then give privileges.
You can create groups by the following commands:
create group salesusers; create group financeusers; create group marketingusers; |
Creating users in each group looks like this:
create user salesusers_person1 password ‘salespassword1’ in group salesusers;
create user financeusers_person1 password ‘financepassword1’ in group financeusers; |
You can grant schema usage permission to a group as follows:
grant usage on schema sales to group salesusers; |
Implement View to Encapsulate and Limit Scope
You can be conservative regarding table security. If you want to hide raw tables and the relations among them, you can consider using View. Using View is similar to the concept of encapsulation in object-oriented programming. You hide the underlying logic and surface the outcome for security. Using Redshift View can be a secure practice if your clients want to write queries by themselves to extract data as needed. Also, you can consider using View (or Materialized Views for performance) for dashboard and analytics.
Furthermore, when you use View, you can efficiently implement logical separation. The following View create statement explains that:
CREATE OR REPLACE VIEW monthly_performance_fashion AS SELECT sales_sum_qty, sales_sum_amount, sales_store_code FROM sales_performance WHERE team = ‘FASHION’; |
By forcing the where condition, although the underlying table stores the sales data of all teams, users will be able to see the records of the fashion team. This helps to apply secure data isolation to meet the needs of different groups.
Consider Column-Level Access Control
When you grant select privilege to different groups for a single table, you may want to have more fine control at column-level. In Redshift, you can grant column-level access for different users and groups. Using the grant command below, you can give either SELECT or UPDATE privilege for specific columns to specific users or groups. This is a good way to protect your data at fine level.
GRANT { { SELECT | UPDATE } ( column_name [, …] ) [, …] | ALL [ PRIVILEGES ] ( column_name [,…] ) } ON { [ TABLE ] table_name [, …] } TO { username | GROUP group_name | PUBLIC } [, …] |
Security Monitoring
After you apply secure practices to your Redshift resources, the last important task is to keep your eyes on what’s happening around Redshift. By implementing this last security piece, you can have a complete and secure Redshift environment on all fronts.
Set and Get Alerts from CloudWatch
In AWS CloudWatch, you can set certain thresholds with metrics and trigger an alert to you. This allows you to take swift actions before an incident leads to a security vulnerability, and to focus on what matters the most. The Amazon Redshift metrics lets you monitor various aspects of Redshift resources and queries. Some of them include:
- CommitQueueLength
- CPUUtilization
- DatabaseConnections
- HealthStatus
- NetworkReceiveThroughput
- NetworkTransmitThroughput
- PercentageDiskSpaceUsed
- QueriesCompletedPerSecond
These are some of the metrics you can use to monitor Redshift. Using them, you can identify abnormal activities or potential threats to your data warehouse and take swift actions.
Monitor A Person’s Activities with CloudTrail
You can use CloudWatch to monitor hardware resources and query behaviors. CloudTrail gives you an ability to track the activities of AWS users. Using it, you can monitor API requests to Redshift, IP addresses that the requests made, who made the requests and when, and more details. This helps you to follow suspicious behavior and take preemptive measures.
Conclusion
This article discussed best Redshift security practices. Since Redshift is part of AWS cloud service, users need to first set up a secure environment. Within Redshift, there are also security measures you can take by limiting privileges and implementing Views. Lastly, with the use of CloudWatch and CloudTrail, you can detect suspicious activities and prevent them from growing into security vulnerabilities.
We built Intergrate.io as a powerful Redshift analytics platform to help you keep an eye on your AWS environment. Sign up today for a free trial and experience Intergrate.io for yourself.