AWS S3

| Property | Value |
|---|---|
| Direction | Bidirectional |
| Environment | Web, iOS, Android, CTV, API Direct |
| Capability | Connectivity, Routing |
| SDK Required | No |
| Product(s) Required | Core Platform, Routing |
AWS S3 allows publishers to securely store and manage large volumes of advertising and audience data in the cloud.
Overview
The AWS S3 integration enables publishers to leverage Permutive’s bi-directional data capabilities with their S3 storage. This integration operates in two modes:

Routing (Destination): Export first-party event data from Permutive to S3 buckets. Permutive offers two distinct routing modes:

- S3 Streaming: Near real-time streaming, ideal for low-latency data pipelines and analytics
- S3 Batch: Daily scheduled exports, suitable for data warehouse ingestion and batch processing workflows
Routing capability requires the Routing package in addition to Core Platform. Contact your Customer Success Manager to enable Routing.
Environment Compatibility
| Environment | Supported | Notes |
|---|---|---|
| Web | Yes | — |
| iOS | Yes | — |
| Android | Yes | — |
| CTV | Yes | — |
| API Direct | Yes | — |
Prerequisites
For Routing (exporting data to S3):

- AWS account with permissions to create S3 buckets and IAM users/policies
- S3 bucket created in the appropriate AWS region
- IAM user with programmatic access credentials (Access Key ID and Secret Access Key)
- Ability to configure S3 bucket policies with specific permissions
- Secure method to share AWS credentials with Permutive (1Password or GPG encryption recommended)
Setup
- Routing Streaming Setup
- Routing Batch Setup
- Connectivity Setup
S3 Streaming routing exports your first-party event data to an S3 bucket in near real-time as GZIP-compressed JSONL files with Hive-style partitioning. Data arrives with approximately 5-minute latency, making it ideal for low-latency data pipelines and ingestion into AWS services such as Athena, Redshift, or EMR.

Setup requires coordination with Permutive support. You will need to prepare your AWS environment (S3 bucket, IAM user with programmatic access) and then share your bucket details and credentials with the Permutive team.
Prerequisites
- AWS account with permissions to create S3 buckets and IAM users/policies
- An S3 bucket in a region-specific location (e.g., `us-east-1`, `eu-west-1`) with public access blocked
- A dedicated IAM user with `s3:List*`, `s3:Get*`, `s3:Delete*`, and `s3:Put*` permissions on the bucket
- A secure method to share AWS credentials with Permutive (1Password or GPG encryption recommended)
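As a starting point, the permissions above can be granted with an IAM identity policy along the following lines. This is a sketch only: the `Sid` and bucket name are placeholders, and your security team may prefer to narrow the wildcard actions to the specific operations Permutive needs.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PermutiveRoutingAccess",
      "Effect": "Allow",
      "Action": ["s3:List*", "s3:Get*", "s3:Put*", "s3:Delete*"],
      "Resource": [
        "arn:aws:s3:::YOUR_BUCKET_NAME",
        "arn:aws:s3:::YOUR_BUCKET_NAME/*"
      ]
    }
  ]
}
```

Note that bucket-level actions (such as `s3:ListBucket`) apply to the bucket ARN, while object-level actions (such as `s3:PutObject`) apply to the `/*` object ARN, which is why both resources are listed.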
What Happens After Setup
Once routing is active:

- Files stream to S3 in near real-time with approximately 5-minute latency
- Hive-style partitions are created automatically by hour
- Event data is written as GZIP-compressed JSONL files
- File naming follows the pattern `{timestamp}-{hash}-{worker_id}.jsonl.gz`
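Combining the hourly partition layout with the file-naming pattern, a single exported object key might look like the sketch below. The timestamp, hash, and worker ID values are illustrative placeholders:

```
type=events/year=2026/month=01/day=15/hour=14/1736950000000-a1b2c3d4-0.jsonl.gz
```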
Data Types
Streaming Schema
Event Data (events)
Events in S3 Streaming are exported in newline-delimited JSON format with the following structure:

- Unix timestamp in milliseconds as a string
- Unique identifier for this event
- Permutive user ID
- Name of the event (e.g., `Pageview`, `slotclicked`)
- Organization identifier
- Workspace/project identifier
- Session identifier (optional)
- Page view identifier (optional)
- Source URL (optional)
- Array of segment IDs the user belongs to
- Custom event properties as key-value pairs
Example Event
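As an illustration of the structure described above, a streamed event might look like the following sketch. All field names (`time`, `event_id`, `name`, and so on) are assumptions inferred from the field descriptions, not the authoritative schema, and all values are placeholders:

```json
{
  "time": "1736950000000",
  "event_id": "00000000-0000-0000-0000-000000000001",
  "user_id": "00000000-0000-0000-0000-0000000000aa",
  "name": "Pageview",
  "organization_id": "example-org",
  "project_id": "example-project",
  "session_id": "00000000-0000-0000-0000-0000000000bb",
  "view_id": "00000000-0000-0000-0000-0000000000cc",
  "url": "https://www.example.com/article",
  "segments": [12, 34],
  "properties": {
    "client": { "type": "web" }
  }
}
```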
Identity Sync Data (sync_aliases)
Identity synchronization events contain cross-device and identity mapping data.

- Unix timestamp in milliseconds as a string
- Permutive user ID
- Organization identifier
- Workspace/project identifier
- Array of alias objects, each containing:
  - `id`: The alias identifier value
  - `tag`: The alias type (e.g., `email_sha256`, `device_id`)
Example Sync Alias
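A sync alias record might look like the following sketch. Field names outside the documented `id`/`tag` alias pair are assumptions inferred from the descriptions above, and all values are placeholders:

```json
{
  "time": "1736950000000",
  "user_id": "00000000-0000-0000-0000-0000000000aa",
  "organization_id": "example-org",
  "project_id": "example-project",
  "aliases": [
    { "id": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", "tag": "email_sha256" },
    { "id": "AEBE52E7-03EE-455A-B3C4-E57283966239", "tag": "device_id" }
  ]
}
```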
Segment Metadata (segment)
Segment metadata snapshots containing segment definitions. These files are NOT date-partitioned.

- Segment UUID
- Segment number/ID used in the segments array of events
- Human-readable segment name
- Array of tags associated with the segment
- Additional segment metadata
- Workspace identifier
- Array of ancestor workspace/organization IDs
- State of the workspace (e.g., “Active”, “Deleted”)
- Whether the segment has been deleted
Example Segment
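A segment metadata record might look like the following sketch. All field names here are assumptions inferred from the field descriptions above, and all values are placeholders:

```json
{
  "id": "00000000-0000-0000-0000-0000000000dd",
  "number": 12,
  "name": "Sports Enthusiasts",
  "tags": ["sports"],
  "metadata": {},
  "workspace_id": "example-workspace",
  "ancestor_ids": ["example-org"],
  "workspace_state": "Active",
  "deleted": false
}
```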
Batch Schema
Event Tables (e.g., pageview_events)
Batch exports create separate tables for each event type (e.g., `pageview_events`, `videoview_events`). All event tables share a common structure:

- Timestamp for when the event was received by Permutive (in UTC)
- Unique identifier for each individual event
- Identifier unique to a particular user
- Identifier unique to a user’s session. Sessions last 30 minutes unless a user stays on site
- Identifier unique to a particular page or screen view
- Identifier for the workspace which the event originated from
- A list of all segment IDs the user was in when the event fired
- A list of all cohort codes the user was in when the event fired
- Event-specific properties as a nested object. Structure varies by event type.
Example Pageview Event
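A batch pageview row might look like the following sketch. All field names are assumptions inferred from the common structure described above, and all values are placeholders:

```json
{
  "time": "2026-01-15T14:05:00Z",
  "event_id": "00000000-0000-0000-0000-000000000001",
  "user_id": "00000000-0000-0000-0000-0000000000aa",
  "session_id": "00000000-0000-0000-0000-0000000000bb",
  "view_id": "00000000-0000-0000-0000-0000000000cc",
  "workspace_id": "example-workspace",
  "segments": [12, 34],
  "cohorts": ["example-cohort"],
  "properties": {
    "client": { "url": "https://www.example.com/article" }
  }
}
```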
Aliases Table (aliases)
Identity data and alias mappings for cross-device tracking.

- Timestamp when the alias was captured
- Type of alias event
- Permutive user identifier
- External identity value (e.g., hashed email, device ID)
- Identity tag or namespace (e.g., `email_sha256`, `device_id`)
- Workspace identifier
Example Alias
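An aliases row might look like the following sketch. All field names are assumptions inferred from the field descriptions above, and all values are placeholders:

```json
{
  "time": "2026-01-15T14:05:00Z",
  "event": "alias_captured",
  "user_id": "00000000-0000-0000-0000-0000000000aa",
  "id": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "tag": "email_sha256",
  "workspace_id": "example-workspace"
}
```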
Domains Table (domains)
Segment Metadata Table (segment_metadata)
Segment definitions and metadata. This is a snapshot table that is fully replaced with each export.
- Segment ID number
- Segment name
- Array of tags associated with the segment
- JSON string containing additional segment metadata
- Workspace identifier
Example Segment Metadata
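A segment metadata row might look like the following sketch. All field names are assumptions inferred from the field descriptions above, and all values are placeholders; note the metadata field is described as a JSON string, not a nested object:

```json
{
  "number": 12,
  "name": "Sports Enthusiasts",
  "tags": ["sports"],
  "metadata": "{\"description\": \"Users interested in sports content\"}",
  "workspace_id": "example-workspace"
}
```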
File Formats and Compression
Streaming Format
- Format: Newline-delimited JSON (`.jsonl`)
- Compression: GZIP (`.gz`)
- File Extension: `.jsonl.gz`
- Character Encoding: UTF-8
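Files in this format can be consumed with standard-library tooling. The following is a minimal sketch in Python; the function name and file path are placeholders, not part of any Permutive SDK:

```python
import gzip
import json

def read_streaming_export(path):
    """Yield one event dict per line of a GZIP-compressed JSONL export file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip any blank lines defensively
                yield json.loads(line)

# Example usage (object key is illustrative):
# for event in read_streaming_export("1736950000000-a1b2c3d4-0.jsonl.gz"):
#     process(event)
```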
Batch Formats
JSON Format
- Format: Newline-delimited JSON
- Compression: GZIP
- File Extension: `.json.gz`
- Character Encoding: UTF-8
Parquet Format
- Format: Apache Parquet columnar format
- Compression: Snappy
- File Extension: `.snappy.parquet`
- Schema: Derived from BigQuery table structure
Parquet format is recommended for data warehouse ingestion and analytics workloads due to better compression and query performance.
Troubleshooting
Permission Denied Errors
Symptom: Files are not appearing in the S3 bucket, or logs show permission errors.

Solution:

- Verify the IAM user has all required permissions: `s3:PutObject`, `s3:GetObject`, `s3:DeleteObject`, `s3:ListBucket`
- Check that the bucket policy includes the correct bucket ARN
- Verify the `bucket-owner-full-control` ACL condition is correctly configured
- Ensure the IAM user credentials (Access Key ID and Secret Access Key) are current and not expired
Data Not Appearing in S3 (Streaming)
Symptom: No files appearing in the S3 bucket after setup, or files stopped appearing.

Solution:

- Verify the Permutive SDK is properly deployed and events are being collected (check Event Inspector in the Dashboard)
- Low-traffic sites may see longer delays between files due to batch size thresholds
- Verify the bucket region matches the configured region; the region must be specific (e.g., `eu-central-1`, not just `EU`)
- Verify the bucket path structure is correct
- If issues persist, contact Permutive support at [email protected] with your integration details
Data Not Appearing in S3 (Batch)
Symptom: Daily batch exports are missing or delayed.

Solution:

- Batch exports run on 24-hour cycles. Check whether sufficient time has passed since the last export window.
- Verify the Permutive SDK is properly deployed and events are being collected
- Contact Permutive support at [email protected] to check batch export job logs and status
Incorrect Bucket Path Structure
Symptom: Files appearing in unexpected locations or wrong folder structure.

Solution:

- Verify the `bucketPrefix` configuration:
  - It should NOT include a leading `/` unless intentional
  - It should NOT include the bucket name
  - Example: `permutive/`, not `/permutive/` or `s3://bucket/permutive/`
- For Streaming, data uses Hive-style partitioning: `type=events/year=2026/month=01/day=15/hour=14/`. This is expected behavior and cannot be customized.
- For Batch, data is organized by table name: `data/{table_name}/year=2026/month=1/day=15/`. This is expected behavior and cannot be customized.
Bucket Policy Validation Errors
Symptom: AWS returns validation errors when applying the bucket policy.

Solution:

- Ensure the bucket policy JSON is valid:
  - Check for missing commas, brackets, or quotes
  - Use the AWS Policy Generator or an online JSON validator
- Verify the ARN format is correct:
  - Bucket ARN: `arn:aws:s3:::BUCKET_NAME`
  - Object ARN: `arn:aws:s3:::BUCKET_NAME/*`
  - Note the three colons `:::` before the bucket name
- Confirm the `StringEquals` condition is correctly formatted
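For reference, a bucket policy statement enforcing the `bucket-owner-full-control` ACL via a `StringEquals` condition typically has the following shape. This is a sketch only: the `Sid`, bucket name, and principal ARN are placeholders, and the actual principal to allow will be provided by Permutive during setup.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RequireBucketOwnerFullControl",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::ACCOUNT_ID:user/PERMUTIVE_IAM_USER" },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::BUCKET_NAME/*",
      "Condition": {
        "StringEquals": { "s3:x-amz-acl": "bucket-owner-full-control" }
      }
    }
  ]
}
```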
Missing Event Types or Fields
Symptom: Some event types or fields are not appearing in exported data.

Solution:

- Verify events are being collected in Permutive:
  - Check Event Inspector in the Dashboard to confirm events are tracked
  - Use the browser developer console to verify the SDK is firing events
- Check that the event schema matches the expected structure:
  - Events must include required fields: `event_id`, `user_id`, `event_name`, etc.
  - Custom properties live in the `properties` object
- Schema changes may require integration reconfiguration:
  - Contact Permutive support if you have made significant schema changes
KMS Encryption Issues
Symptom: Errors related to KMS encryption when writing to S3.

Solution:

- If using customer-managed KMS keys, verify the Permutive IAM user has KMS permissions
- Confirm the KMS key policy allows the Permutive IAM user to use the key
- Verify the S3 bucket’s default encryption settings are compatible
AWS-managed S3 encryption (SSE-S3) is supported by default. Customer-managed KMS keys require additional configuration. Contact Permutive support for KMS requirements.
Changelog
No changes listed yet. For detailed changelog information, visit our Changelog.