
AWS S3

  • Direction: Bidirectional
  • Environment: Web, iOS, Android, CTV, API Direct
  • Capability: Connectivity, Routing
  • SDK Required: No
  • Product(s) Required: Core Platform, Routing

AWS S3 allows publishers to securely store and manage large volumes of advertising and audience data in the cloud.

Overview

The AWS S3 integration enables publishers to leverage Permutive’s bi-directional data capabilities with their S3 storage. The integration operates in two modes:

Routing (Destination): Export first-party event data from Permutive to S3 buckets. Permutive offers two distinct routing modes:
  • S3 Streaming: near real-time streaming, ideal for low-latency data pipelines and analytics
  • S3 Batch: daily scheduled exports, suitable for data warehouse ingestion and batch processing workflows
Both routing modes support exporting event data, identity data (aliases), and segment metadata to customer-owned S3 buckets with Hive-style partitioning and compression. Read more in the Routing documentation. The Routing capability requires the Routing package in addition to Core Platform; contact your Customer Success Manager to enable Routing.

Connectivity (Source): Import audience data from your S3 storage into Permutive for cohort building and activation across your publisher inventory.

Environment Compatibility

  • Web: Yes
  • iOS: Yes
  • Android: Yes
  • CTV: Yes
  • API Direct: Yes

Prerequisites

For Routing (exporting data to S3):
  • AWS account with permissions to create S3 buckets and IAM users/policies
  • S3 bucket created in the appropriate AWS region
  • IAM user with programmatic access credentials (Access Key ID and Secret Access Key)
  • Ability to configure S3 bucket policies with specific permissions
  • Secure method to share AWS credentials with Permutive (1Password or GPG encryption recommended)
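
If you prefer to script the AWS side of these prerequisites, the following is a minimal sketch using boto3. The bucket name and region are placeholders, not values from this integration; adjust them to your environment.

import boto3

# Placeholder values; replace with your own bucket name and region.
BUCKET = "my-permutive-exports"
REGION = "eu-west-1"

s3 = boto3.client("s3", region_name=REGION)

# Create the bucket in the chosen region (us-east-1 omits the location constraint).
s3.create_bucket(
    Bucket=BUCKET,
    CreateBucketConfiguration={"LocationConstraint": REGION},
)

# Block all public access on the export bucket.
s3.put_public_access_block(
    Bucket=BUCKET,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)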

Setup

S3 Streaming routing exports your first-party event data to an S3 bucket in near real-time as GZIP-compressed JSONL files with Hive-style partitioning. Data arrives with approximately 5-minute latency, making it ideal for low-latency data pipelines and ingestion into AWS services such as Athena, Redshift, or EMR.

Setup requires coordination with Permutive support. You will need to prepare your AWS environment (an S3 bucket and an IAM user with programmatic access) and then share your bucket details and credentials with the Permutive team.

Prerequisites

  • AWS account with permissions to create S3 buckets and IAM users/policies
  • An S3 bucket in a region-specific location (e.g., us-east-1, eu-west-1) with public access blocked
  • A dedicated IAM user with s3:List*, s3:Get*, s3:Delete*, and s3:Put* permissions on the bucket
  • A secure method to share AWS credentials with Permutive (1Password or GPG encryption recommended)
For complete setup steps, see Setting up S3 Streaming Routing.
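
As an illustration of the IAM permissions listed above, the sketch below attaches an inline policy to a dedicated IAM user with boto3. The user name, policy name, and bucket name are placeholders; the exact policy to apply is covered in the setup guide.

import json
import boto3

BUCKET = "my-permutive-exports"    # placeholder
IAM_USER = "permutive-s3-routing"  # placeholder

# Inline policy granting the List/Get/Put/Delete permissions listed above,
# scoped to the export bucket and its objects.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:List*", "s3:Get*", "s3:Put*", "s3:Delete*"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
        }
    ],
}

iam = boto3.client("iam")
iam.put_user_policy(
    UserName=IAM_USER,
    PolicyName="permutive-s3-routing-access",
    PolicyDocument=json.dumps(policy),
)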

What Happens After Setup

Once routing is active:
  1. Files stream to S3 in near real-time with approximately 5-minute latency
  2. Hive-style partitions are created automatically by hour
  3. Event data is written as GZIP-compressed JSONL files
  4. File naming follows the pattern {timestamp}-{hash}-{worker_id}.jsonl.gz
See the Streaming Schema section below for detailed schema information.
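
For example, a downstream consumer can list one hourly partition and read the GZIP-compressed JSONL files with boto3. The bucket name and prefix below are placeholders; the partition layout follows the Hive-style pattern described above.

import gzip
import json
import boto3

BUCKET = "my-permutive-exports"  # placeholder
# One hourly partition, following the Hive-style layout created by the integration.
PREFIX = "permutive/type=events/year=2026/month=01/day=15/hour=14/"

s3 = boto3.client("s3")
events = []

# List every .jsonl.gz file in the partition and parse its newline-delimited JSON.
# (For large partitions, use a paginator instead of a single list call.)
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
for obj in resp.get("Contents", []):
    body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
    for line in gzip.decompress(body).decode("utf-8").splitlines():
        if line:
            events.append(json.loads(line))

print(f"Loaded {len(events)} events from {PREFIX}")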

Data Types

Streaming Schema

Events in S3 Streaming are exported in newline-delimited JSON format with the following structure:
  • time (string, required): Unix timestamp in milliseconds, as a string
  • event_id (string, required): Unique identifier for this event
  • user_id (string, required): Permutive user ID
  • event_name (string, required): Name of the event (e.g., Pageview, slotclicked)
  • organization_id (string, required): Organization identifier
  • project_id (string, required): Workspace/project identifier
  • session_id (string, optional): Session identifier
  • view_id (string, optional): Page view identifier
  • source_url (string, optional): Source URL
  • segments (array[integer]): Array of segment IDs the user belongs to
  • properties (object): Custom event properties as key-value pairs

Example Event

{
  "time": "1665851625945",
  "event_id": "c0b8266d-3c4d-43d6-8855-6f42d657adda",
  "user_id": "87bcd76b-5eb6-4c46-afa8-017d1e7148ca",
  "event_name": "slotclicked",
  "organization_id": "be668577-07f5-444d-98e0-222b990951b1",
  "project_id": "72f6d4b5-1e85-4c79-b4f9-da2dd1f3be6d",
  "session_id": "4a96de87-f8b1-4240-a1a8-7b9c6cff569a",
  "view_id": "16f2af62-f38d-44d1-bcea-ba5b4da39be2",
  "source_url": null,
  "segments": [],
  "properties": {
    "campaign_id": 2387641642,
    "line_item_id": 4792767025
  }
}
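
A minimal sketch for working with this schema: the time field is a Unix timestamp in milliseconds stored as a string, so consumers typically convert it to a timezone-aware datetime before use.

from datetime import datetime, timezone

def event_time(event: dict) -> datetime:
    # "time" is a Unix timestamp in milliseconds, serialised as a string.
    return datetime.fromtimestamp(int(event["time"]) / 1000, tz=timezone.utc)

# Using the example event above:
# event_time({"time": "1665851625945"}) -> 2022-10-15 16:33:45.945000+00:00
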
Identity synchronization events contain cross-device and identity mapping data.
  • time (string, required): Unix timestamp in milliseconds, as a string
  • user_id (string, required): Permutive user ID
  • organization_id (string, required): Organization identifier
  • project_id (string, required): Workspace/project identifier
  • aliases (array, required): Array of alias objects, each containing:
    • id: The alias identifier value
    • tag: The alias type (e.g., email_sha256, device_id)

Example Sync Alias

{
  "time": "1665663771749",
  "user_id": "b5653712-26ee-41a8-8b30-c128092df93b",
  "organization_id": "be668577-07f5-444d-98e0-222b990951b1",
  "project_id": "be668577-07f5-444d-98e0-222b990951b1",
  "aliases": [
    {"id": "a1b2c3d4e5f6...", "tag": "email_sha256"},
    {"id": "device_12345", "tag": "device_id"}
  ]
}
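
As an illustration, sync alias records can be folded into an identity lookup keyed by Permutive user ID. This is a minimal sketch, assuming the records have already been parsed from the JSONL files into dictionaries.

from collections import defaultdict

def build_identity_map(sync_aliases):
    # Map each Permutive user_id to its known external identifiers, grouped by alias tag,
    # e.g. {"b5653712-...": {"email_sha256": ["a1b2..."], "device_id": ["device_12345"]}}
    identity_map = defaultdict(lambda: defaultdict(list))
    for record in sync_aliases:
        for alias in record["aliases"]:
            identity_map[record["user_id"]][alias["tag"]].append(alias["id"])
    return identity_map
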
Segment metadata snapshots containing segment definitions. These files are NOT date-partitioned.
  • id (string, required): Segment UUID
  • code (integer, required): Segment number/ID used in the segments array of events
  • name (string, required): Human-readable segment name
  • tags (array[string]): Array of tags associated with the segment
  • metadata (object): Additional segment metadata
  • workspace (string, required): Workspace identifier
  • ancestors (array[string]): Array of ancestor workspace/organization IDs
  • workspaceState (string): State of the workspace (e.g., “Active”, “Deleted”)
  • deleted (boolean): Whether the segment has been deleted

Example Segment

{
  "id": "5289b895-4ee7-44f8-81a6-1899142ed2d2",
  "code": 1057,
  "name": "High Value Users",
  "tags": [],
  "metadata": {},
  "workspace": "45582cb9-bb5c-4eb4-9c0d-7a2cebf4eeb1",
  "ancestors": ["45582cb9-bb5c-4eb4-9c0d-7a2cebf4eeb1", "be668577-07f5-444d-98e0-222b990951b1"],
  "workspaceState": "Active",
  "deleted": false
}
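
Because events reference segments by their numeric code, a common pattern is to join the segment snapshot against the segments array of each event. A minimal sketch, assuming events and segments are lists of parsed dictionaries:

def resolve_segments(events, segments):
    # Build a lookup from the numeric segment code to its human-readable name.
    names_by_code = {s["code"]: s["name"] for s in segments if not s.get("deleted")}
    # Annotate each event with the names of the segments it references.
    for event in events:
        event["segment_names"] = [
            names_by_code.get(code, f"unknown({code})")
            for code in event.get("segments", [])
        ]
    return events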

Batch Schema

Batch exports create separate tables for each event type (e.g., pageview_events, videoview_events). All event tables share a common structure:
  • time (timestamp): Timestamp for when the event was received by Permutive (in UTC)
  • event_id (string): Unique identifier for each individual event
  • user_id (string): Identifier unique to a particular user
  • session_id (string): Identifier unique to a user’s session. Sessions last 30 minutes unless a user stays on site
  • view_id (string): Identifier unique to a particular page or screen view
  • workspace_id (string): Identifier for the workspace which the event originated from
  • segments (array[integer]): A list of all segment IDs the user was in when the event fired
  • cohorts (array[string]): A list of all cohort codes the user was in when the event fired
  • properties (object): Event-specific properties as a nested object. Structure varies by event type.

Example Pageview Event

{
  "time": "2026-01-15T14:30:00Z",
  "event_id": "c0b8266d-3c4d-43d6-8855-6f42d657adda",
  "user_id": "87bcd76b-5eb6-4c46-afa8-017d1e7148ca",
  "session_id": "4a96de87-f8b1-4240-a1a8-7b9c6cff569a",
  "view_id": "16f2af62-f38d-44d1-bcea-ba5b4da39be2",
  "workspace_id": "72f6d4b5-1e85-4c79-b4f9-da2dd1f3be6d",
  "segments": [123, 456],
  "cohorts": ["abc123", "def456"],
  "properties": {
    "client": {
      "domain": "example.com",
      "type": "web",
      "url": "https://example.com/article",
      "referrer": "https://google.com",
      "title": "Example Article",
      "user_agent": "Mozilla/5.0..."
    }
  }
}
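
Because batch event properties are nested objects that vary by event type, analysts often flatten them before analysis. A minimal sketch using pandas (an assumption; any JSON-flattening approach works):

import pandas as pd

def events_to_dataframe(events):
    # Flatten nested properties (e.g. properties.client.domain) into dotted column names;
    # list-valued fields such as segments and cohorts are kept as-is.
    return pd.json_normalize(events)

# df = events_to_dataframe(parsed_events)
# df["properties.client.domain"].value_counts()
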
Identity data and alias mappings for cross-device tracking.
  • time (timestamp, required): Timestamp when the alias was captured
  • event_type (string): Type of alias event
  • permutive_id (string, required): Permutive user identifier
  • id (string, required): External identity value (e.g., hashed email, device ID)
  • tag (string, required): Identity tag or namespace (e.g., email_sha256, device_id)
  • workspace_id (string): Workspace identifier

Example Alias

{
  "time": "2026-01-15T14:30:00Z",
  "event_type": "alias_sync",
  "permutive_id": "87bcd76b-5eb6-4c46-afa8-017d1e7148ca",
  "id": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2",
  "tag": "email_sha256",
  "workspace_id": "72f6d4b5-1e85-4c79-b4f9-da2dd1f3be6d"
}
Domain-level metadata. This is a snapshot table that is fully replaced with each export.
  • name (string, required): Domain name
  • workspace_id (string): Workspace identifier

Example Domain

{
  "name": "example.com",
  "workspace_id": "72f6d4b5-1e85-4c79-b4f9-da2dd1f3be6d"
}
Segment definitions and metadata. This is a snapshot table that is fully replaced with each export.
  • number (integer, required): Segment ID number
  • name (string, required): Segment name
  • tags (array[string]): Array of tags associated with the segment
  • metadata (string): JSON string containing additional segment metadata
  • workspace_id (string): Workspace identifier

Example Segment Metadata

{
  "number": 123,
  "name": "High Value Users",
  "tags": ["advertising", "premium"],
  "metadata": "{\"description\": \"Users with high engagement\"}",
  "workspace_id": "72f6d4b5-1e85-4c79-b4f9-da2dd1f3be6d"
}
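
Note that, unlike the streaming segment snapshot, the batch metadata field is a JSON string rather than an object, so it needs an extra decoding step. A minimal sketch:

import json

def parse_segment_metadata(segment: dict) -> dict:
    # "metadata" is stored as a JSON string and may be empty.
    raw = segment.get("metadata") or "{}"
    return json.loads(raw)

# parse_segment_metadata({"metadata": "{\"description\": \"Users with high engagement\"}"})
# -> {"description": "Users with high engagement"}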

File Formats and Compression

S3 Streaming exports use the following file format:
  • Format: Newline-delimited JSON (.jsonl)
  • Compression: GZIP (.gz)
  • File Extension: .jsonl.gz
  • Character Encoding: UTF-8

For S3 Batch exports, two file formats are available.

JSON Format

  • Format: Newline-delimited JSON
  • Compression: GZIP
  • File Extension: .json.gz
  • Character Encoding: UTF-8

Parquet Format

  • Format: Apache Parquet columnar format
  • Compression: Snappy
  • File Extension: .snappy.parquet
  • Schema: Derived from BigQuery table structure
Parquet format is recommended for data warehouse ingestion and analytics workloads due to better compression and query performance.
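
For example, a Snappy-compressed Parquet file can be read with pandas and pyarrow (assumed to be installed). The bucket name and object key below are placeholders following the batch path layout.

import io
import boto3
import pandas as pd

BUCKET = "my-permutive-exports"  # placeholder
KEY = "data/pageview_events/year=2026/month=1/day=15/part-000.snappy.parquet"  # placeholder

s3 = boto3.client("s3")
body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()

# pandas uses pyarrow to decode the Snappy-compressed Parquet file.
df = pd.read_parquet(io.BytesIO(body))
print(df.dtypes)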

Troubleshooting

Symptom: Files are not appearing in the S3 bucket, or logs show permission errors.
Solution:
  1. Verify the IAM user has all required permissions:
    • s3:PutObject
    • s3:GetObject
    • s3:DeleteObject
    • s3:ListBucket
  2. Check that the bucket policy includes the correct bucket ARN:
    "Resource": [
      "arn:aws:s3:::YOUR_BUCKET_NAME/*",
      "arn:aws:s3:::YOUR_BUCKET_NAME"
    ]
    
  3. Verify the bucket-owner-full-control ACL condition is correctly configured
  4. Ensure the IAM user credentials (Access Key ID and Secret Access Key) are current and not expired
If you recently rotated AWS credentials, contact Permutive support at [email protected] to update the stored credentials.
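
To rule out permission problems on your side, you can run a quick smoke test with the same IAM credentials that were shared with Permutive. This is a sketch, not part of the official setup; the bucket name and keys are placeholders, and the bucket-owner-full-control ACL mirrors the condition mentioned above.

import boto3

BUCKET = "my-permutive-exports"  # placeholder

# Use the same access key pair that was shared with Permutive.
s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",   # placeholder
    aws_secret_access_key="...",   # placeholder
)

# Write, list, and delete a test object, mirroring the permissions the integration needs.
s3.put_object(
    Bucket=BUCKET,
    Key="permutive/_permission_check.txt",
    Body=b"ok",
    ACL="bucket-owner-full-control",
)
s3.list_objects_v2(Bucket=BUCKET, Prefix="permutive/")
s3.delete_object(Bucket=BUCKET, Key="permutive/_permission_check.txt")
print("put/list/delete succeeded")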
Symptom: No files appear in the S3 bucket after setup, or files have stopped appearing.
Solution:
  1. Verify the Permutive SDK is properly deployed and events are being collected (check Event Inspector in the Dashboard)
  2. Low-traffic sites may see longer delays between files due to batch size thresholds
  3. Verify the bucket region matches the configured region:
    • Region must be specific (e.g., eu-central-1, not just EU)
  4. Verify bucket path structure is correct:
    s3://{bucket}/{prefix}/type=events/year=YYYY/month=MM/day=DD/hour=HH/
    
  5. If issues persist, contact Permutive support at [email protected] with your integration details
Symptom: Daily batch exports are missing or delayed.
Solution:
  1. Batch exports run on 24-hour cycles. Check if sufficient time has passed since the last export window.
  2. Verify the Permutive SDK is properly deployed and events are being collected
  3. Contact Permutive support at [email protected] to check batch export job logs and status
Symptom: Files appear in unexpected locations or with the wrong folder structure.
Solution:
  1. Verify the bucketPrefix configuration:
    • Should NOT include leading / unless intentional
    • Should NOT include bucket name
    • Example: permutive/ not /permutive/ or s3://bucket/permutive/
  2. For Streaming, data uses Hive-style partitioning:
    • type=events/year=2026/month=01/day=15/hour=14/
    • This is expected behavior and cannot be customized
  3. For Batch, data is organized by table name:
    • data/{table_name}/year=2026/month=1/day=15/
    • This is expected behavior and cannot be customized
Symptom: AWS returns validation errors when applying the bucket policy.
Solution:
  1. Ensure the bucket policy JSON is valid:
    • Check for missing commas, brackets, or quotes
    • Use AWS Policy Generator or an online JSON validator
  2. Verify ARN format is correct:
    • Bucket ARN: arn:aws:s3:::BUCKET_NAME
    • Object ARN: arn:aws:s3:::BUCKET_NAME/*
    • Note the three colons ::: before bucket name
  3. Confirm the StringEquals condition is correctly formatted:
    "Condition": {
      "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
    }
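    
To see how the ARNs and the ACL condition from this checklist fit together, the sketch below assembles a skeleton bucket policy and applies it with boto3. The bucket name is a placeholder and the principal ARN is hypothetical; the actual policy document and the principal to allow come from the setup guide and Permutive support.

import json
import boto3

BUCKET = "my-permutive-exports"  # placeholder
# Hypothetical principal; use the IAM user/role ARN specified in the setup guide.
PERMUTIVE_PRINCIPAL = "arn:aws:iam::111122223333:user/permutive-s3-routing"

# Skeleton policy assembling the object/bucket ARNs and the ACL condition from the checklist above.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowObjectWritesWithOwnerFullControl",
            "Effect": "Allow",
            "Principal": {"AWS": PERMUTIVE_PRINCIPAL},
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
            "Condition": {
                "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
            },
        },
        {
            "Sid": "AllowReadListDelete",
            "Effect": "Allow",
            "Principal": {"AWS": PERMUTIVE_PRINCIPAL},
            "Action": ["s3:GetObject", "s3:DeleteObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
        },
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))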
    
Symptom: Some event types or fields are not appearing in the exported data.
Solution:
  1. Verify events are being collected in Permutive:
    • Check Event Inspector in the Dashboard to confirm events are tracked
    • Use browser developer console to verify SDK is firing events
  2. Check event schema matches expected structure:
    • Events must include required fields: event_id, user_id, event_name, etc.
    • Custom properties are in the properties object
  3. Schema changes may require integration reconfiguration:
    • Contact Permutive support if you’ve made significant schema changes
Symptom: Errors related to KMS encryption occur when writing to S3.
Solution:
  1. If using customer-managed KMS keys, verify the Permutive IAM user has KMS permissions:
    {
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:GenerateDataKey"
      ],
      "Resource": "arn:aws:kms:REGION:ACCOUNT_ID:key/KEY_ID"
    }
    
  2. Confirm the KMS key policy allows the Permutive IAM user to use the key
  3. Verify the S3 bucket’s default encryption settings are compatible
AWS-managed S3 encryption (SSE-S3) is supported by default. Customer-managed KMS keys require additional configuration. Contact Permutive support for KMS requirements.
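
To check which encryption mode the bucket currently uses, you can inspect its default encryption configuration; AES256 indicates SSE-S3, while aws:kms indicates a customer-managed or AWS-managed KMS key. A minimal sketch with boto3; the bucket name is a placeholder.

import boto3

BUCKET = "my-permutive-exports"  # placeholder

s3 = boto3.client("s3")
config = s3.get_bucket_encryption(Bucket=BUCKET)

for rule in config["ServerSideEncryptionConfiguration"]["Rules"]:
    default = rule["ApplyServerSideEncryptionByDefault"]
    # "AES256" means SSE-S3 (supported by default); "aws:kms" means a KMS key is in use.
    print(default["SSEAlgorithm"], default.get("KMSMasterKeyID", ""))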

Changelog

No changes listed yet. For detailed changelog information, visit our Changelog.