WHAT
From a security POV, Storage Integration is a very useful feature provided by Snowflake. At it’s core, Storage Integration –
- Is a first class object i.e. – It’s an object that resides independently. It isn’t tied to a database, schema or a table.
- Holds connection credentials to an S3 bucket (or its Azure/GCP equivalent)
- Defines which buckets to connect to and which not to
Note – Only an Account Admin is allowed to create and give permissions to other roles within Snowflake using Storage Integrations.
HOW
How to create a Storage Integration?
Syntax
CREATE [ OR REPLACE ] STORAGE INTEGRATION [IF NOT EXISTS]
<name>
TYPE = EXTERNAL_STAGE
cloudProviderParams
ENABLED = { TRUE | FALSE }
STORAGE_ALLOWED_LOCATIONS = ('<cloud>://<bucket>/<path>/', '<cloud>://<bucket>/<path>/')
[ STORAGE_BLOCKED_LOCATIONS = ('<cloud>://<bucket>/<path>/', '<cloud>://<bucket>/<path>/') ]
[ COMMENT = '<string_literal>' ]
Example
CREATE OR REPLACE STORAGE INTEGRATION helloworld_storage_int
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = S3
ENABLED = TRUE
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::001234567890:role/snowflake_role'
STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/');
WHY
A Storage Integration object locks away your ARN (or Tenant id for Azure) into an account level object. Think of it as an abstraction that developers can use without knowing the connection details. As a result, external stages can be built using the Storage Integration object without having to enter your access key every time.
Wait…There’s More
You may be wondering how the Storage Integration object establishes the trust and the connection with the external cloud provider. This is where you leverage the power of the cloud ecosystem’s security.
Here’s how your policy JSON should look like at a minimum.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:GetObjectVersion"
],
"Resource": "arn:aws:s3:::mybucket/*"
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::mybucket"
}
]
}
When you create the role, ensure you provide your own account ID and check the box requiring an external ID. Enter a dummy value initially
Once you create the Storage Integration object using the newly created role, use the below command to get the object’s ARN and External ID –
desc integration helloworld_storage_int;
The STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID are what you need to use to establish the Trust Relationship with AWS.
Head back to AWS IAM, click on the newly created role and then on Trusted Relationships. Click on Edit Trust Relationships and paste your object’s ARN and External Id like below –
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "<Object ARN>"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "<External Id>"
}
}
}
]
}
And that’s it! Your Storage Integration object is configured to access your S3 bucket.
Github Code – https://github.com/DataSherlock/Snowflake/blob/main/storage_integration_pipe.sql
[…] What, How, Why : Storage Integration Object in Snowflake […]