The AWS S3 integration enables publishers to leverage Permutive’s bi-directional data capabilities with their S3 storage. This integration operates in two modes:
Routing (Destination): Export first-party event data from Permutive to S3 buckets. Permutive offers two distinct routing modes:
S3 Streaming: Near real-time streaming, ideal for low-latency data pipelines and analytics
S3 Batch: Daily scheduled exports, suitable for data warehouse ingestion and batch processing workflows
Routing capability requires the Routing package in addition to Core Platform. Contact your Customer Success Manager to enable Routing.
Connectivity (Source): Import audience data from your S3 storage into Permutive for cohort building and activation across your publisher inventory.
Both Routing modes support exporting event data, identity data (aliases), and segment metadata to customer-owned S3 buckets with Hive-style partitioning and compression.
S3 Streaming routing exports your first-party event data to an S3 bucket in near real-time as GZIP-compressed JSONL files with Hive-style partitioning. Data arrives with approximately 5-minute latency, making it ideal for low-latency data pipelines and ingestion into AWS services such as Athena, Redshift, or EMR.
Setup requires coordination with Permutive support. You will need to prepare your AWS environment (S3 bucket, IAM user with programmatic access) and then share your bucket details and credentials with the Permutive team.
Files stream to S3 in near real-time with approximately 5-minute latency
Hive-style partitions are created automatically by hour
Event data is written as GZIP-compressed JSONL files
File naming follows the pattern {timestamp}-{hash}-{worker_id}.jsonl.gz
See the Streaming Schema section below for detailed schema information.
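As a rough illustration of consuming these files, the sketch below parses Hive-style `key=value` path segments from an object key and decodes a GZIP-compressed JSONL payload. It assumes the object bytes have already been downloaded. The function names are illustrative, and the partition key names and event fields used in the usage note are made up; they are not Permutive's actual layout or schema.

```python
import gzip
import json
from typing import Any, Dict, Iterator


def parse_hive_partitions(object_key: str) -> Dict[str, str]:
    """Collect key=value path segments from an S3 object key."""
    partitions: Dict[str, str] = {}
    for segment in object_key.split("/")[:-1]:  # skip the file name itself
        if "=" in segment:
            name, _, value = segment.partition("=")
            partitions[name] = value
    return partitions


def iter_jsonl_gz(payload: bytes) -> Iterator[Dict[str, Any]]:
    """Yield one decoded JSON value per non-empty line of a .jsonl.gz payload."""
    text = gzip.decompress(payload).decode("utf-8")
    for line in text.split("\n"):
        if line.strip():
            yield json.loads(line)
```

For a hypothetical key such as `events/year=2024/month=01/day=15/hour=09/example.jsonl.gz`, `parse_hive_partitions` returns the four partition values as strings, and `iter_jsonl_gz` can then be applied to the downloaded object body.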
S3 Batch routing exports your first-party event data to an S3 bucket on a scheduled 24-hour cycle. You can choose between JSON (GZIP compressed) and Parquet (Snappy compressed) format based on your data processing needs. Batch exports are suitable for data warehouse ingestion and batch processing workflows.
Setup requires coordination with Permutive support. You will need to prepare an S3 bucket, choose your preferred data format, and apply a Permutive-provided bucket policy.
Understanding of your data format requirements (JSON vs Parquet)
Organization-Level Scope: S3 Batch routing operates at the organization level and exports data for all workspaces within your organization, unlike S3 Streaming routing, which is workspace-specific.
Permutive uses the concept of a Schema (containing multiple Tables) to organize your data. Structure your bucket so that each table is a directory under your schema prefix:
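An illustrative layout is sketched in the comment below, followed by a small helper that groups object keys by the table directory they fall under. This can be a quick way to sanity-check a bucket listing before creating an import. The schema prefix `my_schema`, the table names, and the file names are hypothetical placeholders, not Permutive requirements.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical layout (all names are placeholders):
#   my_schema/
#     users/file-1
#     users/file-2
#     purchases/file-1


def group_keys_by_table(keys: Iterable[str], schema_prefix: str) -> Dict[str, List[str]]:
    """Map each table directory under schema_prefix to the object keys it contains."""
    prefix = schema_prefix.strip("/") + "/"
    tables: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        if not key.startswith(prefix):
            continue  # key lives outside the schema prefix
        remainder = key[len(prefix):]
        table, sep, rest = remainder.partition("/")
        if sep and rest:  # ignore files sitting directly under the schema prefix
            tables[table].append(key)
    return dict(tables)
```

Keys that sit directly under the schema prefix, or outside it entirely, are ignored, so any file missing from the result is a hint that it is not inside a table directory.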
In the Permutive dashboard, go to Connectivity > Catalog and select Amazon S3. Begin entering your connection details. Once you enter your bucket name, Permutive will generate a bucket policy for you.
Step 2: Apply the Bucket Policy
Copy the generated S3 Bucket Policy from the Permutive dashboard:
Open the AWS S3 Console and navigate to your bucket
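If you manage buckets with scripts rather than the console, the same step can be sketched with boto3's `put_bucket_policy` call. This assumes boto3 is installed and that AWS credentials with permission to set bucket policies are configured. The helper name and bucket name are placeholders, and the policy text must be the one copied from the Permutive dashboard, not written by hand.

```python
import json


def apply_bucket_policy(s3_client, bucket: str, policy_json: str) -> None:
    """Validate the copied policy text and attach it to the bucket.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    policy = json.loads(policy_json)  # raises ValueError if the paste is not valid JSON
    if not isinstance(policy, dict) or "Statement" not in policy:
        raise ValueError("policy document has no Statement element")
    # Note: put_bucket_policy replaces any policy already attached to the bucket.
    s3_client.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

Because `put_bucket_policy` overwrites the existing bucket policy, a bucket that already has a policy would need the Permutive statements merged into it first rather than applied as a replacement.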
Once your connection is active, go to Connectivity > Imports and click Create Import, then select your S3 connection.
For the complete setup guide with detailed instructions, see Connecting to Amazon S3.
Confirm the KMS key policy allows the Permutive IAM user to use the key
Verify the S3 bucket’s default encryption settings are compatible
AWS-managed S3 encryption (SSE-S3) is supported by default. Customer-managed KMS keys require additional configuration. Contact Permutive support for KMS requirements.
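To check which default encryption a bucket uses before contacting support, one option is to inspect the response of boto3's `get_bucket_encryption`. The helper below works on the response dictionary only; its name and return labels are illustrative, and it makes no claim about what additional KMS configuration Permutive requires.

```python
from typing import Any, Dict


def classify_default_encryption(response: Dict[str, Any]) -> str:
    """Return 'SSE-S3', 'SSE-KMS', or 'unknown' for a get_bucket_encryption response."""
    rules = response.get("ServerSideEncryptionConfiguration", {}).get("Rules", [])
    for rule in rules:
        default = rule.get("ApplyServerSideEncryptionByDefault", {})
        algorithm = default.get("SSEAlgorithm")
        if algorithm == "AES256":
            return "SSE-S3"
        if algorithm in ("aws:kms", "aws:kms:dsse"):
            return "SSE-KMS"
    return "unknown"
```

With boto3 available, this could be called as `classify_default_encryption(boto3.client("s3").get_bucket_encryption(Bucket="example-bucket"))`, where the bucket name is a placeholder. An `SSE-KMS` result does not by itself say whether the key is AWS-managed or customer-managed, so the key referenced in the rule still needs to be checked.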