
Step-by-step guide to creating a catalog in Databricks Unity Catalog on AWS

In this walkthrough, I’ll explain how to create a new catalog in Databricks Unity Catalog using the Databricks on AWS UI.

Oct. 24, 2025 · 10 min read · by Dilorom Abdullah
In a follow-up article, we’ll go over how to complete the same setup through Terraform for automation.
Here’s what we’ll cover:
  • What is Unity Catalog?
  • Create an S3 bucket
  • Create an IAM role
  • Create a storage credential
  • Create an external location
  • Create a catalog
  • Manage ownership, assignments, and access control for the catalog
  • Create a schema and a table within the catalog
  • Summary

What is Unity Catalog?

Unity Catalog is a cornerstone of data governance in Databricks, offering a centralized framework for managing permissions, access policies, and lineage across multiple workspaces and data personas, whether they’re engineers, analysts, or data scientists.
Instead of applying permissions separately in each workspace or dataset, Unity Catalog lets you define fine-grained access control at the catalog, schema, and table levels. It also includes built-in support for auditing and compliance tracking.
This approach simplifies security administration, improves transparency, and enables easier data discovery, all while supporting multi-cloud environments.
In essence, if you’re building a scalable and compliant data platform, Unity Catalog isn’t optional. It’s a foundational component.

Create an S3 bucket

When working with Databricks Unity Catalog on AWS, you’ll need an S3 bucket to serve as the catalog’s storage layer. This bucket will hold two types of information:
  1. Metadata — details about the catalog, its schemas, and objects (like tables or functions).
  2. Data — the actual stored content generated by or written to those objects.
When creating schemas under a catalog, you can assign them to different S3 locations if required, offering flexibility for managing separate data domains or environments.
To create your S3 bucket, log into your AWS console and create it manually, or request assistance from your Cloud or DevOps team if you don’t have the required permissions.
Your bucket path will look something like s3://my-bucket-name. For example: s3://databricks-development
If your team doesn’t yet have a naming convention, establish one early on. Consistent naming makes it much easier to track, audit, and manage buckets across teams and environments later.
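If you prefer to script this step rather than click through the console, here is a minimal boto3 sketch. The bucket name and region are assumptions for illustration; substitute your own values and follow your naming convention.
import boto3
# Assumed example values; replace with your own bucket name and region.
BUCKET_NAME = "databricks-development"
REGION = "us-east-1"
s3 = boto3.client("s3", region_name=REGION)
# us-east-1 must not receive a CreateBucketConfiguration;
# every other region requires a LocationConstraint.
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET_NAME)
else:
    s3.create_bucket(
        Bucket=BUCKET_NAME,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )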

Create an IAM role

Next, you’ll need an IAM role that allows Databricks to access the S3 bucket you just set up. If you don’t have IAM privileges, ask your Cloud, DevOps, or Infrastructure team to create the role for you.
Once the role is created, make sure to note down its ARN (Amazon Resource Name). You’ll need it shortly. It will look something like this:
arn:aws:iam::<account-id>:role/databricks-development
Replace <account-id> and the role name with the values specific to your AWS account.
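If your team scripts IAM, the role creation might look like the boto3 sketch below. The role name is an assumed example, and the trust policy is a placeholder: Databricks generates the real principal and external ID when you create the storage credential, as covered in the next section.
import json
import boto3
iam = boto3.client("iam")
# Placeholder trust policy; Databricks supplies the actual principal and
# external ID during the storage credential step below.
assume_role_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "<databricks-principal-placeholder>"},
        "Action": "sts:AssumeRole",
    }],
}
role = iam.create_role(
    RoleName="databricks-development",  # assumed example name
    AssumeRolePolicyDocument=json.dumps(assume_role_policy),
    Description="Allows Databricks Unity Catalog to access the S3 bucket",
)
print(role["Role"]["Arn"])  # note this ARN for the storage credential step
# You would also attach a policy granting s3:GetObject, s3:PutObject,
# and s3:ListBucket on your bucket to this role.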

Create a storage credential

After setting up your S3 bucket, the next step is to create a storage credential that authorizes Databricks to access it securely.
To do this, go to your Databricks workspace. For Unity Catalog objects (such as credentials, external locations, and catalogs), the specific workspace you use doesn’t matter, as Unity Catalog operates as a centralized metadata service.

Steps:

  1. Open the Catalog section from the left sidebar in your Databricks workspace.
  2. Click the plus (+) icon next to Catalog.
  3. A dropdown menu will appear with options like:
    • Create a catalog
    • Create an external location
    • Create a credential
These are the core objects we’ll work with for this setup.
Note: If you don’t see these options, it likely means you aren’t a metastore admin.
  • If you’re an account admin, assign yourself as a metastore admin.
  • Otherwise, ask your account admin to add you to the metastore admins group.
  4. Click Create a credential to begin.

Configure the storage credential

At the top of the form, select Storage Credential, and under Credential Type, choose AWS IAM Role.
  • Enter a name for your credential.
  • In the IAM Role ARN field, paste the ARN of the IAM role you created earlier.
  • Optionally, add a short description under Comment to document the credential’s purpose.
Expand Advanced Options if needed — here, you can choose to make the credential read-only.
  • Enabling this will allow Databricks to only read from the S3 bucket.
  • Leave it unchecked if you want Databricks to both read and write data.
Once configured, click Create in the bottom-left corner.

Post-creation validation

After creation, Databricks will display an External ID and Trust Policy. If your IAM role doesn’t already include this trust relationship, copy it and have your Cloud or DevOps team update the IAM role accordingly.
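If IAM changes are scripted on your side, the trust policy update can be applied with boto3 as in this sketch. The principal and external ID values below are placeholders for whatever Databricks displayed, not real values.
import json
import boto3
iam = boto3.client("iam")
# Paste the trust policy Databricks displayed; these values are placeholders.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "<databricks-principal-from-ui>"},
        "Action": "sts:AssumeRole",
        "Condition": {"StringEquals": {"sts:ExternalId": "<external-id-from-ui>"}},
    }],
}
iam.update_assume_role_policy(
    RoleName="databricks-development",  # assumed example name
    PolicyDocument=json.dumps(trust_policy),
)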
Click Done to finish. You’ll then land on the Credential Details page.
To validate the credential:
  • Click Validate Configuration (top right corner).
  • If all checks pass, your credential is ready for use.
By default, the person who created the credential becomes its owner, but ownership of critical Unity Catalog resources should always belong to admin groups rather than individuals.
To reassign ownership:
  • On the Credential Details page, click the pencil icon next to Owner.
  • Set ownership to groups like account_admins or metastore_admins.
Next, go to the Permissions tab to:
  • Grant ALL PRIVILEGES and MANAGE to admin groups.
  • Assign additional permissions to users or groups as required.
Finally, under the Workspaces tab, assign which workspaces can use the credential, either all or a subset, depending on your governance model.
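These ownership and permission changes can also be made in SQL from a notebook. A minimal sketch, assuming a credential named databricks_development and the admin groups shown (all names illustrative):
# Run in a Databricks notebook, where `spark` is the ambient SparkSession.
# Credential and group names are assumed examples.
spark.sql("ALTER STORAGE CREDENTIAL databricks_development OWNER TO `metastore_admins`")
spark.sql("GRANT ALL PRIVILEGES ON STORAGE CREDENTIAL databricks_development TO `account_admins`")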

Create an external location

With a storage credential in place, the next step is to define an external location in Unity Catalog.
  1. In your Databricks workspace, open the Catalog tab.
  2. Click the plus (+) button next to Catalog.
  3. Select Create an external location from the dropdown menu.
  4. You’ll be offered two options: AWS Quickstart or Manual setup.
If you have AWS admin access, you can use the Quickstart option for a streamlined process. However, since many data engineers don’t, this guide follows the Manual path.
Select Manual and click Next.
Now fill out the required fields:
  5. Enter a name for the external location.
  6. Choose S3 as the storage type.
  7. Enter your S3 bucket path, for example:
s3://databricks-development
  8. Under Storage Credential, select the credential you just created.
  9. (Optional) Add a description for clarity.
You can explore Advanced Options for more granular settings, though we’ll skip those here.
When done, click Create to finalize.
For managing ownership, workspace access, and permissions, follow the same process as described earlier for storage credentials. The workflow and UI are virtually identical.
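The external location can also be defined in SQL. A minimal sketch, with assumed names for the location and credential:
# Assumed names; substitute your own location name, bucket, and credential.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS development_location
    URL 's3://databricks-development'
    WITH (STORAGE CREDENTIAL databricks_development)
    COMMENT 'External location for the development catalog'
""")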

Create a catalog

Now that the foundational components are ready, you can create your catalog.
  1. In Databricks, go to Catalog in the sidebar.
  2. Click the plus (+) next to Catalog.
  3. Select Create a catalog.
  4. Enter a name for your catalog.
  5. Choose Standard as the catalog type.
  6. For Storage Location, pick the external location you just created.
  7. (Optional) Define a subdirectory or path under that storage location.
Tip: Defining a sub-path can be especially useful if you plan to manage multiple catalogs using the same S3 bucket.
For instance, if your development environment has one shared bucket, you might create separate paths like team_a_sandbox or team_b_sandbox to organize data. These logical folders make maintenance, cleanup, and debugging much easier.
Once everything’s set, click Create at the bottom-left corner to finalize the catalog.
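The SQL equivalent is a single statement. A sketch, with an assumed catalog name and sub-path:
# Assumed catalog name and sub-path; adjust to your naming convention.
spark.sql("""
    CREATE CATALOG IF NOT EXISTS development
    MANAGED LOCATION 's3://databricks-development/team_a_sandbox'
    COMMENT 'Development catalog for team A'
""")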

Manage ownership, assignment, and access controls

With the catalog created, you’ll want to configure ownership, workspace assignments, and permissions to ensure proper governance.
  1. Set group ownership
    • The creator becomes the default owner.
    • Reassign ownership to an admin group (account_admins or metastore_admins) by clicking the pencil icon next to Owner.
  2. Add tags
    • Use tags for classification, discoverability, or governance labeling.
  3. Assign workspaces
    • Navigate to the Workspaces tab and assign which workspaces can access the catalog.
    • You can choose between read-only or read and write access.
  4. Grant permissions
    • Under the Permissions tab, assign privileges like USE CATALOG, SELECT, CREATE, or ALL PRIVILEGES to relevant users, groups, or service principals.
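The same governance steps can be expressed in SQL. A sketch, assuming a catalog named development and illustrative group names:
# Assumed catalog and group names.
spark.sql("ALTER CATALOG development OWNER TO `account_admins`")
spark.sql("GRANT USE CATALOG, CREATE SCHEMA ON CATALOG development TO `data_engineers`")
spark.sql("GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG development TO `analysts`")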

Create a schema and a table in the catalog

Now, let’s create a schema inside your catalog.
  1. Go to your catalog’s page in Databricks.
  2. Click Create schema on the right side.
  3. In the pop-up, enter a name and choose a storage location from the dropdown.
If you’ve already assigned your external storage to the catalog (for example, development), select that location. You can also specify a different storage location if needed, for instance to isolate data for specific teams or projects.
This flexibility is helpful when you want to separate storage across domains, teams, or environments (e.g., staging vs production).
Another key feature is the ability to define a sub-path within the S3 bucket. Although S3 doesn’t use a real directory structure, you can emulate one by using prefixes, effectively creating an organizational hierarchy.
For instance:
team_a/project_x/
This helps group related data for easier maintenance and auditing later.
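In SQL, creating such a schema looks like this sketch. The names and sub-path are assumed examples; MANAGED LOCATION is optional and only needed when the schema should live at its own path.
# Assumed names and path.
spark.sql("""
    CREATE SCHEMA IF NOT EXISTS development.demo_schema
    MANAGED LOCATION 's3://databricks-development/team_a/project_x'
    COMMENT 'Demo schema for project X'
""")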
Once your schema is created, you can add a table inside it. Use the fully qualified naming convention: development.demo_schema.your_table
You can create it using SQL:
CREATE TABLE development.demo_schema.your_table AS 
SELECT * FROM some_source_table;
Or through PySpark, where df is an existing DataFrame:
df.write.saveAsTable("development.demo_schema.your_table")

Summary

In this post, we covered the entire process of setting up a new Databricks Unity Catalog on AWS through the UI.
Starting from creating an S3 bucket and IAM role, we moved through configuring a storage credential and external location, then created a catalog. We also reviewed best practices for managing ownership, assigning permissions, and linking workspaces.
Finally, we built a schema and a table within that catalog, exploring how to logically organize object storage using sub-paths for better manageability.
This guide provides a solid, hands-on foundation for establishing Unity Catalog in a governed, secure, and scalable way.