AWS S3 Connector Setup Guide

The AWS S3 Connector lets you ingest conversation recordings and chat transcripts from a configured S3 folder into Quality AI Express on a customizable schedule. This enables you to use Quality AI with third-party Contact Center as a Service (CCaaS) solutions.

Quick Start (5 Minutes)

1. Enable Quality AI Express in the platform settings.
2. Upload a test.csv file containing sample data to your S3 folder.
3. Configure the S3 connector with your bucket credentials and folder paths.
4. Run the validation tests to verify connectivity.
5. Set a processing schedule and monitor ingestion in the logs.

Critical Requirements

Requirement	Details
Stereo Audio	Single file with agent (left) + customer (right) channels
Mono Audio	Two separate files — one agent-only, one customer-only
Timestamps	ISO 8601 format with UTC timezone (YYYY-MM-DDTHH:MM:SSZ)
Agent Emails	Must exactly match platform user accounts
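
The timestamp requirement can be checked before upload. The following is a minimal sketch, not part of the connector: the function names `parse_utc_timestamp` and `is_valid_window` are hypothetical, and the check only covers the YYYY-MM-DDTHH:MM:SSZ shape and the start/end ordering described in this guide.

```python
import re
from datetime import datetime, timezone

# Strict shape check: strptime alone also accepts unpadded values such as "2025-4-1".
_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def parse_utc_timestamp(value: str) -> datetime:
    """Parse a YYYY-MM-DDTHH:MM:SSZ string into an aware UTC datetime."""
    if not _SHAPE.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DDTHH:MM:SSZ, got {value!r}")
    # strptime rejects impossible dates and times (month 13, hour 25, and so on).
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_valid_window(start: str, end: str) -> bool:
    """True when both timestamps are well formed and the end is after the start."""
    try:
        return parse_utc_timestamp(end) > parse_utc_timestamp(start)
    except ValueError:
        return False


print(is_valid_window("2025-04-10T14:30:00Z", "2025-04-10T14:32:45Z"))
```

A value with a space instead of `T`, or with an offset such as `+00:00` instead of `Z`, is rejected by this sketch.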

Prerequisites

AWS Environment

S3 bucket created in your preferred region
IAM user/role with read-only S3 permissions
Planned bucket folder structure (unified vs. separate paths)
Test audio/chat files prepared for validation

Platform Requirements

Quality AI Express enabled in settings
All agents onboarded with correct email addresses
Service queues configured and ready for mapping
User has Integrations & Extensions permissions

Data Validation

Audio files in WAV or MP3 format (maximum 50 MB each)
Mono recordings split into separate agent/customer files
All timestamps in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ)
All recording URLs accessible via HTTPS
CSV files contain all required metadata fields
test.csv file created with sample data
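
The audio format and size rules in this checklist can be screened locally before files are uploaded. This is a rough sketch under stated assumptions: `audio_file_problems` is a hypothetical helper, and the 50 MB limit is interpreted here as 50 × 1024 × 1024 bytes, which the guide does not specify.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".wav", ".mp3"}
MAX_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" read as 50 MiB


def audio_file_problems(name: str, size_bytes: int) -> list[str]:
    """Return human-readable problems for one audio file (empty list if none)."""
    problems = []
    if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"{name}: unsupported format (expected WAV or MP3)")
    if size_bytes > MAX_BYTES:
        problems.append(f"{name}: {size_bytes} bytes exceeds the 50 MB limit")
    return problems


print(audio_file_problems("conv-123456.wav", 12_000_000))
```

For files on disk, the size can be obtained with `Path(name).stat().st_size`.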

Supported Recording Types

Type	Format	Files per Conversation	Channel	Analytics
Stereo Voice	WAV/MP3	1	Left=Agent, Right=Customer	Full Analytics
Mono Voice	WAV/MP3	2 (separate agent/customer)	N/A	Enhanced Analytics
Voice Transcripts	JSON	1	Pre-transcribed audio	Text Analytics
Chat Scripts	JSON	1	Message-level attribution	Full Text Analytics

Field Differences by Recording Type

Field	Stereo Voice	Mono Voice	Voice Transcripts	Chat Scripts
URL field	`recordingUrl`	`agentRecordings` + `customerRecordings`	`transcriptPath`	`transcriptPath`
Channel field	`agentChannel`, `customerChannel`	—	—	—
Provider field	`asrProvider`	`asrProvider`	`asrProvider`	—

Authentication Methods

Method	Description
Access Keys	Simple setup, suitable for single integrations
IAM Roles	Enterprise-grade security, recommended for production

Required IAM Permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
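
If you manage several buckets, the same policy document can be generated per bucket rather than edited by hand. A minimal sketch, assuming a hypothetical helper named `read_only_policy`; it only builds the JSON shown above and does not call AWS.

```python
import json


def read_only_policy(bucket_name: str) -> dict:
    """Build the read-only policy from this guide for a given bucket name."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # Bucket ARN for listing, object ARN pattern for reads.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


print(json.dumps(read_only_policy("your-bucket-name"), indent=2))
```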

Mono Recording Requirements

Mono recordings require two separate audio files: one for the agent and one for the customer. A single mixed mono file is not supported, because combining both speakers in one channel significantly reduces transcription accuracy.

Supported (two clean mono files):

conv-123456-agent.wav (agent audio only)
conv-123456-customer.wav (customer audio only)

Not supported (single mixed mono file):

conv-123456-mixed.wav (both speakers mixed)
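
Before upload, a folder listing can be scanned for conversations that lack one of the two mono files. The sketch below assumes the `<conversationId>-agent` / `<conversationId>-customer` naming seen in the examples above; the connector itself takes file locations from the CSV, so this naming is only an illustration, and `find_unpaired_mono` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumed naming: <conversationId>-agent.<ext> and <conversationId>-customer.<ext>
_MONO_NAME = re.compile(
    r"^(?P<conv>.+)-(?P<role>agent|customer)\.(?:wav|mp3)$", re.IGNORECASE
)


def find_unpaired_mono(filenames):
    """Return (conversation IDs missing a file, names that fit neither role)."""
    roles = defaultdict(set)
    unrecognized = []
    for name in filenames:
        match = _MONO_NAME.match(name)
        if match:
            roles[match.group("conv")].add(match.group("role").lower())
        else:
            unrecognized.append(name)
    incomplete = sorted(
        conv for conv, seen in roles.items() if seen != {"agent", "customer"}
    )
    return incomplete, unrecognized


files = [
    "conv-123456-agent.wav",
    "conv-123456-customer.wav",
    "conv-777-agent.wav",
    "conv-888-mixed.wav",
]
print(find_unpaired_mono(files))
```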

Data Flow Architecture

CSV Metadata Formats

Stereo Voice Recordings

Configuration: recordingType = stereo, channelType = voice

Field	Required	Type	Example	Notes
`conversationId`	Required	String	conv-123456	Unique ID, max 50 chars
`agentEmail`	Required	String	john.smith@company.com	Must exist in platform
`conversationStartTime`	Required	String	2025-04-10T14:30:00Z	ISO 8601 UTC
`conversationEndTime`	Required	String	2025-04-10T14:32:45Z	Must be after start time
`channelType`	Required	String	voice	Always `voice` for audio
`recordingType`	Required	String	stereo	Always `stereo` for this format
`recordingUrl`	Required	String	https://s3.amazonaws.com/bucket/conv-123456.wav	HTTPS URL
`queueId`	Required	String	support-tier1	Must exist in queue mapping
`agentChannel`	Required	Integer	0	Agent audio channel (0=left, 1=right)
`customerChannel`	Required	Integer	1	Customer audio channel (0=left, 1=right)
`language`	Optional	String	en	ISO 639-1, defaults to `en`
`asrProvider`	Optional	String	microsoft	Transcription (ASR) provider
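
A stereo test.csv can be produced with a few lines of script. This is a sketch only: the column order below simply follows the table and is an assumption (the guide does not say whether order matters), and `stereo_csv` is a hypothetical helper. The sample values are the examples from the table.

```python
import csv
import io

STEREO_FIELDS = [
    "conversationId", "agentEmail", "conversationStartTime",
    "conversationEndTime", "channelType", "recordingType", "recordingUrl",
    "queueId", "agentChannel", "customerChannel", "language", "asrProvider",
]


def stereo_csv(rows) -> str:
    """Render stereo metadata rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=STEREO_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = {
    "conversationId": "conv-123456",
    "agentEmail": "john.smith@company.com",
    "conversationStartTime": "2025-04-10T14:30:00Z",
    "conversationEndTime": "2025-04-10T14:32:45Z",
    "channelType": "voice",
    "recordingType": "stereo",
    "recordingUrl": "https://s3.amazonaws.com/bucket/conv-123456.wav",
    "queueId": "support-tier1",
    "agentChannel": 0,
    "customerChannel": 1,
    "language": "en",
    "asrProvider": "microsoft",
}
print(stereo_csv([sample]))
```

Writing the returned text to a file named test.csv gives a starting point for the validation step.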

Mono Voice Recordings

Configuration: recordingType = mono, channelType = voice

Mono recordings require two separate CSV entries and two audio files per conversation.

Field	Required	Type	Example	Notes
`conversationId`	Required	String	conv-123456	Same ID for both entries
`agentEmail`	Required	String	john.smith@company.com	Must exist in platform
`conversationStartTime`	Required	String	2025-04-10T14:30:00Z	ISO 8601 UTC
`conversationEndTime`	Required	String	2025-04-10T14:32:45Z	Must be after start time
`channelType`	Required	String	voice	Always `voice` for audio
`recordingType`	Required	String	mono	Always `mono` for this format
`agentRecordings`	Required	String	https://s3.amazonaws.com/bucket/conv-123456-agent.wav	URL to agent recording
`customerRecordings`	Required	String	https://s3.amazonaws.com/bucket/conv-123456-customer.wav	URL to customer recording
`queueId`	Required	String	support-tier1	Must exist in queue mapping
`agentId`	Optional	String	agent-789	Internal agent identifier
`language`	Optional	String	en	ISO 639-1, defaults to `en`
`asrProvider`	Optional	String	microsoft	Transcription provider

Voice Transcripts (Pre-transcribed Audio)

Configuration: recordingType = transcription, channelType = voice

Use when you have already transcribed audio files and want to skip speech-to-text processing.

Field	Required	Type	Example	Notes
`conversationId`	Required	String	conv-123456	Unique ID, max 50 chars
`agentEmail`	Required	String	john.smith@company.com	Must exist in platform
`conversationStartTime`	Required	String	2025-04-10T14:30:00Z	ISO 8601 UTC
`conversationEndTime`	Required	String	2025-04-10T14:32:45Z	Must be after start time
`channelType`	Required	String	voice	Always `voice` for audio transcripts
`recordingType`	Required	String	transcription	Always `transcription`
`transcriptPath`	Required	String	transcripts/voice-123.json	Path to JSON transcript file
`queueId`	Required	String	support-tier1	Must exist in queue mapping
`language`	Optional	String	en	ISO 639-1, defaults to `en`
`asrProvider`	Optional	String	microsoft	Original transcription (ASR) provider

Chat Scripts (Live Chat Interactions)

Configuration: recordingType = transcription, channelType = chat

Use for live chat interactions from web chat, messaging platforms, or chat-based customer service.

Field	Required	Type	Example	Notes
`conversationId`	Required	String	conv-123456	Unique ID, max 50 chars
`agentEmail`	Required	String	john.smith@company.com	Must exist in platform
`conversationStartTime`	Required	String	2025-04-10T14:30:00Z	ISO 8601 UTC
`conversationEndTime`	Required	String	2025-04-10T14:45:00Z	Must be after start time
`channelType`	Required	String	chat	Always `chat` for text interactions
`recordingType`	Required	String	transcription	Always `transcription` for chat
`transcriptPath`	Required	String	transcripts/chat-123.json	Path to JSON transcript file
`queueId`	Required	String	support-tier1	Must exist in queue mapping
`language`	Optional	String	en-US	Defaults to `en` if not specified

Chat scripts include interactions from web chat, WhatsApp, Facebook Messenger, and similar platforms. For conversations that involve transfers, use the queueId of the queue where the conversation ended and the agentEmail of the agent who ended it.
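
The required fields differ by recording type, so a pre-upload check needs to branch on the channelType and recordingType pair. The sketch below encodes the Required columns of the four CSV tables in this section; `missing_fields` is a hypothetical helper, and it checks presence only, not values.

```python
COMMON_FIELDS = {
    "conversationId", "agentEmail", "conversationStartTime",
    "conversationEndTime", "channelType", "recordingType", "queueId",
}

# Extra required fields keyed by (channelType, recordingType).
EXTRA_FIELDS = {
    ("voice", "stereo"): {"recordingUrl", "agentChannel", "customerChannel"},
    ("voice", "mono"): {"agentRecordings", "customerRecordings"},
    ("voice", "transcription"): {"transcriptPath"},
    ("chat", "transcription"): {"transcriptPath"},
}


def _is_blank(value) -> bool:
    return value is None or str(value).strip() == ""


def missing_fields(row: dict) -> list[str]:
    """Return the required fields that are absent or empty in one CSV row."""
    key = (row.get("channelType"), row.get("recordingType"))
    if key not in EXTRA_FIELDS:
        return [f"unsupported combination: channelType={key[0]!r}, recordingType={key[1]!r}"]
    required = COMMON_FIELDS | EXTRA_FIELDS[key]
    return sorted(name for name in required if _is_blank(row.get(name)))


row = {"conversationId": "conv-123456", "channelType": "chat", "recordingType": "transcription"}
print(missing_fields(row))
```

Rows read with `csv.DictReader` can be passed to this function directly, since empty cells arrive as empty strings.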

JSON Transcript Schemas

Voice Transcript Format

{
  "recognizedPhrases": [
    {
      "channel": 0,
      "offsetInTicks": 140000000.0,
      "nBest": [
        {
          "lexical": "yes one four three four two six",
          "words": [
            {
              "word": "yes",
              "offsetInTicks": 140000000.0,
              "durationInTicks": 3200000.0,
              "confidence": 0.51653963
            }
          ]
        }
      ]
    }
  ]
}

Required fields: channel, offsetInTicks, nBest[].lexical, nBest[].words[].word, nBest[].words[].offsetInTicks, nBest[].words[].durationInTicks, nBest[].words[].confidence
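
To sanity-check a transcript file, it can help to convert the tick offsets to seconds and list the phrases in order. This sketch assumes one tick is 100 nanoseconds (10,000,000 ticks per second), the convention used by Azure Speech batch transcription output, which this schema resembles; the guide itself does not state the unit. `phrase_timeline` is a hypothetical helper.

```python
import json

TICKS_PER_SECOND = 10_000_000  # assumption: 1 tick = 100 ns


def phrase_timeline(transcript: dict):
    """Return (start_seconds, channel, text) for each phrase, ordered by start."""
    timeline = []
    for phrase in transcript["recognizedPhrases"]:
        best = phrase["nBest"][0]  # first candidate in the nBest list
        start = phrase["offsetInTicks"] / TICKS_PER_SECOND
        timeline.append((start, phrase["channel"], best["lexical"]))
    return sorted(timeline)


raw = """
{"recognizedPhrases": [{"channel": 0, "offsetInTicks": 140000000.0,
  "nBest": [{"lexical": "yes one four three four two six",
             "words": [{"word": "yes", "offsetInTicks": 140000000.0,
                        "durationInTicks": 3200000.0, "confidence": 0.51653963}]}]}]}
"""
print(phrase_timeline(json.loads(raw)))
```

Under that assumption, the sample phrase starts 14 seconds into the recording.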

Chat Transcript Format

{
  "1": {
    "type": "AGENT",
    "text": "Good afternoon, how can I help you today?",
    "timestamp": 1749562206000,
    "userId": "john.smith@company.com"
  },
  "2": {
    "type": "USER",
    "text": "I need help with my account balance.",
    "timestamp": 1749562253142,
    "userId": "customer_12345"
  }
}

Required fields:

Field	Values	Description
`type`	`AGENT`, `USER`, or `SYSTEM`	Participant role
`text`	String	Message content
`timestamp`	Integer	Unix timestamp in milliseconds
`userId`	String	Participant identifier
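
A chat transcript can be checked against the required fields before it is referenced from the CSV. The following is a minimal sketch with a hypothetical helper, `chat_transcript_problems`; it assumes the top-level object maps message keys to message objects, as in the example above.

```python
ALLOWED_TYPES = {"AGENT", "USER", "SYSTEM"}


def chat_transcript_problems(transcript: dict) -> list[str]:
    """Return one message per violated rule; an empty list means no problems found."""
    problems = []
    for key, message in transcript.items():
        if message.get("type") not in ALLOWED_TYPES:
            problems.append(f"message {key}: type must be AGENT, USER, or SYSTEM")
        if not isinstance(message.get("text"), str):
            problems.append(f"message {key}: text must be a string")
        timestamp = message.get("timestamp")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(timestamp, int) or isinstance(timestamp, bool):
            problems.append(f"message {key}: timestamp must be an integer (ms)")
        if not isinstance(message.get("userId"), str):
            problems.append(f"message {key}: userId must be a string")
    return problems


sample_chat = {
    "1": {"type": "AGENT", "text": "Good afternoon, how can I help you today?",
          "timestamp": 1749562206000, "userId": "john.smith@company.com"},
    "2": {"type": "USER", "text": "I need help with my account balance.",
          "timestamp": 1749562253142, "userId": "customer_12345"},
}
print(chat_transcript_problems(sample_chat))
```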

Step-by-Step Configuration

Step 1: Prepare S3 Environment

Option 1: Unified Path Structure

Option 2: Separate Paths

Validation checklist (data preparation):

All audio files are accessible via HTTPS URLs
CSV files contain required fields with correct headers
Mono recordings have separate agent/customer files
test.csv exists in each configured folder
File sizes are under 50 MB each

Step 2: Platform Configuration

1. Navigate to Quality AI > Configure > Connectors.
2. Select + Add Connector > Amazon S3 > Connect.
3. Configure Basic Settings:
   - Name: a descriptive connector name
   - AWS Region: your S3 bucket region
   - Auth Type: access keys or IAM role
4. Configure authentication:
   - Access Keys: enter the Access Key and Secret Key.
   - IAM Role: enter the IAM Role ARN.
5. Configure folder paths:
   - Unified Path: a single path for voice and chat, for example s3://your-bucket/conversations/
   - Separate Paths: one path each for voice and chat:
     - Voice: s3://your-bucket/voice-interactions/
     - Chat: s3://your-bucket/chat-interactions/
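
When checking folder paths outside the platform (for example, in an upload script), it is useful to split an s3:// path into its bucket and key prefix. A small sketch, assuming a hypothetical helper named `split_s3_path`; the trailing-slash normalization is a design choice of this example, not a connector requirement.

```python
from urllib.parse import urlparse


def split_s3_path(path: str) -> tuple[str, str]:
    """Split 's3://bucket/prefix/' into (bucket, prefix)."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// path: {path!r}")
    prefix = parsed.path.lstrip("/")
    # A trailing slash keeps "voice" from also matching "voice-interactions".
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return parsed.netloc, prefix


print(split_s3_path("s3://your-bucket/voice-interactions/"))
```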

Validation checklist (connection setup):

Select the Test tab in the connector configuration.
Verify all checks pass:

Check	Expected Result
Authentication	Connected successfully
File Path Access	S3 bucket accessible
File Format	CSV format validated
Metadata Validation	Required fields confirmed

If a check fails:

Check	Resolution
Authentication	Verify credentials and IAM permissions
File Path Access	Check bucket name, region, folder paths, and URL accessibility
File Format / Metadata Validation	Ensure `test.csv` exists with the correct structure; verify column headers and timestamp formats

Step 3: Queue Mapping and Scheduling

1. Navigate to the Queue tab and map the CSV queueId values to Quality AI Express queues. The mapped values must match the CSV strings exactly.
2. Navigate to the Schedule tab:
   - Interval: choose the frequency (minutes, hours, or days)
   - Start Time: set the initial run time (UTC)
3. Select Save to activate.
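
Because queue mapping relies on exact string matches, differences in case or stray whitespace are a likely source of unmapped conversations. The sketch below lists CSV queueId values that have no exact counterpart in a mapping; `unmapped_queue_ids` and the mapping contents are hypothetical.

```python
def unmapped_queue_ids(csv_queue_ids, queue_mapping):
    """Return CSV queueId values with no exact match in the mapping, sorted."""
    return sorted({qid for qid in csv_queue_ids if qid not in queue_mapping})


# Hypothetical mapping from CSV queueId to a platform queue name.
mapping = {"support-tier1": "Tier 1 Support"}
seen_in_csv = ["support-tier1", "Support-Tier1", "support-tier1 "]
print(unmapped_queue_ids(seen_in_csv, mapping))
```

In this example the capitalized value and the value with a trailing space are both reported, since neither matches `support-tier1` exactly.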

Final validation checklist:

Queue mappings saved and validated
Processing schedule configured and active
First ingestion job appears in the Log tab
No error messages in processing logs

Success indicators:

Conversations appear in Quality AI Express dashboards
Analytics data populates for ingested interactions

Troubleshooting

Authentication Issues

Problem	Symptoms	Solution
Invalid Credentials	Authentication failed error	Verify access key/secret key; check IAM role ARN format; ensure credentials have not expired
Permission Denied	Access denied to S3 bucket	Add S3 read permissions to the IAM user/role; verify bucket policy; check that bucket region matches configuration

Data Processing Issues

Problem	Symptoms	Solution
Timestamp Errors	Invalid timestamp format	Use ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ); include UTC timezone; verify end time is after start time
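
Timestamp errors often come from exporting local times without converting them to UTC. A minimal sketch of the conversion, assuming the source system provides timezone-aware datetimes; `to_required_format` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def to_required_format(moment: datetime) -> str:
    """Convert an aware datetime to YYYY-MM-DDTHH:MM:SSZ in UTC."""
    if moment.tzinfo is None:
        # A naive datetime has no offset, so converting it would be a guess.
        raise ValueError("naive datetime: attach a timezone before converting")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 10:30 at UTC-04:00 is 14:30 UTC.
local = datetime(2025, 4, 10, 10, 30, 0, tzinfo=timezone(timedelta(hours=-4)))
print(to_required_format(local))
```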

Performance Expectations

Processing takes approximately 3–5 minutes per conversation, depending on conversation duration, ASR transcription latency (voice), and LLM response latency.
