
Overview

This guide walks you through connecting your Amazon S3 bucket to Permutive so you can import data for audience building and activation. You’ll learn how to structure your S3 bucket, configure the required permissions, and create a connection in the Permutive dashboard.
Prerequisites:
  • An AWS account with access to the S3 bucket you want to connect
  • Permission to modify the S3 bucket policy
  • Your data organized in the required directory structure (see below)

Key Concepts

Before setting up your connection, familiarize yourself with these terms:
  • Bucket Root: The root name of the bucket, without the s3:// prefix and without any trailing slashes or prefixes
  • S3 Prefix: A path within your S3 bucket where tables are stored
  • Schema: A group of tables, represented by an S3 prefix location
  • Table: A single table within Permutive, represented by a prefix under the schema prefix
  • Data file: The files containing your data (CSV or Parquet format)
  • Hive Partition: An S3 prefix in Hive partition format (e.g., date=2025-01-01 or region=EU)

Step 1: Set Up Your Bucket Structure

Permutive uses the concept of a Schema (containing multiple Tables) to organize your data. Since S3 doesn’t have native schema or table concepts, you’ll need to structure your bucket in a specific way.

Schema Directory Structure

Organize your bucket so that each table is a directory under your schema prefix:
s3://<bucket_name>/<prefix>/<table_1>/
s3://<bucket_name>/<prefix>/<table_2>/
s3://<bucket_name>/<prefix>/<table_n>/
When you provide the prefix to Permutive, every directory under that prefix is treated as a table.
You can have multiple prefixes representing different schemas, each with multiple tables. Each schema prefix requires a separate Connection in Permutive.
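To illustrate how this layout maps to tables, here is a small sketch of the discovery rule described above: every first-level directory under the schema prefix becomes one table. The function and key names are illustrative, not Permutive's actual implementation.

```python
# Illustrative sketch: each first-level "directory" under the schema
# prefix is treated as one table; loose files at the prefix root are not.

def discover_tables(object_keys, schema_prefix):
    """Return sorted table names found directly under schema_prefix."""
    tables = set()
    for key in object_keys:
        if not key.startswith(schema_prefix):
            continue
        remainder = key[len(schema_prefix):].lstrip("/")
        if "/" in remainder:  # the key lives inside a subdirectory -> a table
            tables.add(remainder.split("/", 1)[0])
    return sorted(tables)

keys = [
    "data/audiences/table_1/part-000.parquet",
    "data/audiences/table_1/part-001.parquet",
    "data/audiences/table_2/part-000.parquet",
    "data/audiences/README.txt",  # loose file at the prefix root: ignored
]
print(discover_tables(keys, "data/audiences/"))  # ['table_1', 'table_2']
```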

Table Directory Structure

Within each table directory, you can organize your data files in one of two ways:
  • Non-partitioned: data files sit directly under the table prefix
  • Partitioned: data files sit under Hive-partition prefixes (e.g., <table>/date=2025-01-01/) beneath the table prefix

Data Format Recommendations

We support the following file types:
  • .parquet (Parquet, recommended)
  • .csv (uncompressed CSV)
  • .gz (gzipped CSV)
For CSV files, especially large datasets, gzipping is highly recommended to reduce storage costs and speed up processing.
All tables under a schema must use the same data format (either all CSV or all Parquet).
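As a quick sketch of the gzipped-CSV recommendation, the snippet below writes rows to a .gz data file using only the Python standard library; the file name and rows are placeholders.

```python
# Sketch: producing a gzipped CSV data file before uploading it to S3.
import csv
import gzip
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "audience.csv.gz")  # placeholder name
rows = [["user_id", "segment"], ["u123", "sports"], ["u456", "news"]]

# gzip.open in text mode lets csv.writer write straight into the .gz file
with gzip.open(path, "wt", newline="") as f:
    csv.writer(f).writerows(rows)

# Round-trip check: the file decompresses back to the same rows
with gzip.open(path, "rt", newline="") as f:
    recovered = list(csv.reader(f))
print(recovered)
```

The resulting file can then be uploaded to the table prefix, for example with `aws s3 cp audience.csv.gz s3://<bucket_name>/<prefix>/<table_1>/`.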

Step 2: Configure Bucket Permissions

Permutive needs permission to read data from your S3 bucket. You’ll add an S3 Bucket Policy that grants Permutive read-only access.
1. Start Creating the Connection

In the Permutive dashboard, go to Connectivity > Catalog and select Amazon S3. Begin entering your connection details (covered in Step 3). Once you enter your bucket name, Permutive will generate a bucket policy for you.
2. Copy the Generated Policy

Copy the S3 Bucket Policy displayed in the Permutive dashboard. It will look similar to this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<PermutiveAWSAccountId>:root"
      },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<YourBucketName>",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalArn": "arn:aws:iam::<PermutiveAWSAccountId>:role/<PermutiveCustomerSpecificRole>"
        }
      }
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<PermutiveAWSAccountId>:root"
      },
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::<YourBucketName>",
        "arn:aws:s3:::<YourBucketName>/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:PrincipalArn": "arn:aws:iam::<PermutiveAWSAccountId>:role/<PermutiveCustomerSpecificRole>"
        }
      }
    }
  ]
}
This policy grants Permutive the following permissions:
  • s3:ListBucket — List the contents of your bucket
  • s3:GetObject — Read objects from your bucket
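For illustration, the policy's shape can be reconstructed in code; the account ID, role name, and bucket name below are placeholders, and in practice you should paste the exact policy generated by the Permutive dashboard.

```python
# Sketch of the generated bucket policy, with placeholder values.
import json

permutive_account = "123456789012"        # placeholder account ID
role = "PermutiveCustomerSpecificRole"    # placeholder role name
bucket = "my-bucket"                      # placeholder bucket name

def statement(action, resource):
    """One Allow statement, scoped to the Permutive customer role."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{permutive_account}:root"},
        "Action": action,
        "Resource": resource,
        "Condition": {
            "StringEquals": {
                "aws:PrincipalArn": f"arn:aws:iam::{permutive_account}:role/{role}"
            }
        },
    }

policy = {
    "Version": "2012-10-17",
    "Statement": [
        # ListBucket applies to the bucket ARN; GetObject to the objects
        statement("s3:ListBucket", f"arn:aws:s3:::{bucket}"),
        statement("s3:GetObject", f"arn:aws:s3:::{bucket}/*"),
    ],
}
policy_json = json.dumps(policy, indent=2)
```

If you prefer applying the policy programmatically rather than via the console, the same JSON can be passed to `aws s3api put-bucket-policy` or boto3's `put_bucket_policy`.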
3. Add the Policy to Your Bucket

  1. Open the AWS Console and navigate to your S3 bucket
  2. Go to the Permissions tab
  3. Click Edit on the Bucket Policy section
  4. Paste the policy generated from the Permutive dashboard
  5. Save your changes
If you’ve already added the policy to your bucket and want to use a new location within the same bucket, you don’t need to re-add the policy.

Step 3: Create the Connection

1. Select Amazon S3 from the Catalog

In the Permutive dashboard, go to Connectivity > Catalog and select Amazon S3.
2. Enter Your Connection Details

Fill in the following fields:
  • Name: A descriptive name for your connection in Permutive
  • AWS Bucket Region: The region where your bucket is located (only supported regions are shown)
  • AWS Bucket Name: The bucket name without any prefixes or suffixes (e.g., for s3://my-bucket/*, enter my-bucket)
  • AWS Bucket Schema Prefix: The prefix path to your schema location, without a leading slash (e.g., data/audiences/)
  • Data Format: Choose Parquet (recommended) or CSV
  • Data Partitioning: Select whether all tables are partitioned or no tables are partitioned
Data Partitioning Behavior:
  • If set to “All tables are partitioned” — non-partitioned tables will be ignored
  • If set to “No tables are partitioned” — partition prefixes will be ignored and treated as regular directories
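The Hive partition format mentioned above can be sketched with a small parser that pulls column=value pairs out of an S3 key; the function and key are illustrative only, not Permutive's actual logic.

```python
# Sketch: extracting Hive-style partition columns (column=value) from an
# S3 object key under a table prefix.

def partition_columns(key, table_prefix):
    """Return {column: value} pairs from Hive-partitioned path segments."""
    remainder = key[len(table_prefix):].lstrip("/")
    parts = {}
    for segment in remainder.split("/")[:-1]:  # the last segment is the file
        if "=" in segment:
            column, value = segment.split("=", 1)
            parts[column] = value
    return parts

key = "data/audiences/table_1/date=2025-01-01/region=EU/part-000.parquet"
print(partition_columns(key, "data/audiences/table_1/"))
# {'date': '2025-01-01', 'region': 'EU'}
```

With "No tables are partitioned" selected, a prefix like date=2025-01-01 would be walked like any other directory rather than surfaced as a column.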
3. Add the Bucket Policy

Before completing the connection, ensure you’ve added the generated bucket policy to your S3 bucket (see Step 2).
4. Create the Connection

Click Save to create the connection. It will appear on your Connections page with a “Processing” status while Permutive validates access. Once validated, the status changes to “Active”.

Step 4: Create an Import

Once your connection is active, you can create imports to bring data into Permutive.
1. Navigate to Imports

Go to Connectivity > Imports and click Create Import.
2. Configure the Import

  1. Select Amazon S3 as the source type
  2. Select your S3 connection
  3. The schema prefix will be pre-selected (there’s only one per connection)
  4. Choose from the list of discovered tables
  5. Continue with the standard import configuration
For more details on configuring imports, see Imports.

Troubleshooting

If your connection remains in “Processing” status or fails:
  • Verify the bucket policy has been correctly applied
  • Check that the bucket name and region are correct
  • Ensure the schema prefix exists and contains table directories
Solution: Double-check your AWS bucket policy in the S3 console and verify the bucket name matches exactly what you entered in Permutive.
If you don’t see expected tables after creating the connection:
  • Verify your directory structure matches the required format
  • Check that data files exist under each table directory
  • Ensure the data format setting matches your actual file format
Solution: Review your S3 bucket structure and ensure each table is a direct subdirectory of the schema prefix. After making changes, run a schema resync in Permutive to refresh the available tables.
If partition columns aren’t appearing in your data:
  • Verify “All tables are partitioned” is selected in Data Partitioning
  • Check that partition directories use the correct Hive format (column=value)
Solution: Update your connection settings or restructure your partition directories.

Next Steps