The AWS S3 Connector lets you ingest conversation recordings and chat transcripts from a configured S3 folder into Quality AI Express on a customizable schedule. This enables you to use Quality AI with third-party Contact Center as a Service (CCaaS) solutions.

Quick Start (5 Minutes)

  1. Enable Quality AI Express in platform settings.
  2. Upload a test.csv file containing sample data to your S3 folder.
  3. Configure the S3 connector with bucket credentials and paths.
  4. Run validation tests to verify connectivity.
  5. Set a processing schedule and monitor via logs.

Critical Requirements

| Requirement | Details |
| --- | --- |
| Stereo Audio | Single file with agent (left) + customer (right) channels |
| Mono Audio | Two separate files: one agent-only, one customer-only |
| Timestamps | ISO 8601 format with UTC timezone (YYYY-MM-DDTHH:MM:SSZ) |
| Agent Emails | Must exactly match platform user accounts |

Prerequisites

AWS Environment

  • S3 bucket created in your preferred region
  • IAM user/role with read-only S3 permissions
  • Planned bucket folder structure (unified vs. separate paths)
  • Test audio/chat files prepared for validation

Platform Requirements

  • Quality AI Express enabled in settings
  • All agents onboarded with correct email addresses
  • Service queues configured and ready for mapping
  • User has Integrations & Extensions permissions

Data Validation

  • Audio files in WAV or MP3 format (maximum 50 MB each)
  • Mono recordings split into separate agent/customer files
  • All timestamps in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ)
  • All recording URLs accessible via HTTPS
  • CSV files contain all required metadata fields
  • test.csv file created with sample data
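
The file-level checks above can be sketched as a small pre-upload test. This is a minimal illustration, not the connector's own validation: the function name `audio_file_problems` is hypothetical, and "50 MB" is read here as 50 × 1024 × 1024 bytes, which is an assumption.

```python
from urllib.parse import urlparse

MAX_AUDIO_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" interpreted as 50 MiB
ALLOWED_EXTENSIONS = (".wav", ".mp3")


def audio_file_problems(url: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file passes these checks."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("URL is not HTTPS")
    if not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append("file is not WAV or MP3")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("file is larger than 50 MB")
    return problems


print(audio_file_problems("https://s3.amazonaws.com/bucket/conv-123456.wav", 12_000_000))
print(audio_file_problems("http://example.com/conv-123456.ogg", 60 * 1024 * 1024))
```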

Supported Recording Types

| Type | Format | Files per Conversation | Channel | Analytics |
| --- | --- | --- | --- | --- |
| Stereo Voice | WAV/MP3 | 1 | Left=Agent, Right=Customer | Full Analytics |
| Mono Voice | WAV/MP3 | 2 (separate agent/customer) | N/A | Enhanced Analytics |
| Voice Transcripts | JSON | 1 | Pre-transcribed audio | Text Analytics |
| Chat Scripts | JSON | 1 | Message-level attribution | Full Text Analytics |

Field Differences by Recording Type

| Field | Stereo Voice | Mono Voice | Voice Transcripts | Chat Scripts |
| --- | --- | --- | --- | --- |
| URL field | recordingUrl | agentRecordings + customerRecordings | transcriptUrl | chatScriptUrl |
| Channel field | agentChannel, customerChannel | N/A | N/A | N/A |
| Provider field | asrProvider | asrProvider | asrProvider | N/A |

Authentication Methods

| Method | Description |
| --- | --- |
| Access Keys | Simple setup, suitable for single integrations |
| IAM Roles | Enterprise-grade security, recommended for production |
Required IAM Permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
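
If you manage several buckets, the same policy can be generated rather than hand-edited. The sketch below only builds the JSON document shown above for a given bucket name; the helper name `read_only_policy` is hypothetical, and attaching the policy to a user or role is left to your usual AWS tooling.

```python
import json


def read_only_policy(bucket_name: str) -> dict:
    """Build the read-only policy document shown above for one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # s3:ListBucket applies to the bucket ARN itself;
                # s3:GetObject applies to the objects under it (the /* ARN).
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


print(json.dumps(read_only_policy("your-bucket-name"), indent=2))
```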

Mono Recording Requirements

Mono recordings require two separate audio files: one for the agent and one for the customer. A single mixed mono file is not supported, because combining both speakers in one channel significantly reduces transcription accuracy.
Supported (two clean mono files):
  • conv-123456-agent.wav (agent audio only)
  • conv-123456-customer.wav (customer audio only)
Not supported (single mixed mono file):
  • conv-123456-mixed.wav (both speakers mixed)
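
Before uploading, it can help to confirm that every mono conversation has both files. The sketch below assumes the `<conversationId>-agent` / `<conversationId>-customer` naming used in the examples above; that naming is illustrative, not a documented requirement, and the function name is hypothetical.

```python
def unpaired_conversations(filenames: list[str]) -> dict[str, set[str]]:
    """Map each conversation ID to the roles ('agent', 'customer') it is missing."""
    seen: dict[str, set[str]] = {}
    for name in filenames:
        stem = name.rsplit(".", 1)[0]            # drop the file extension
        conv_id, _, role = stem.rpartition("-")  # split on the last hyphen
        if role in ("agent", "customer"):        # other suffixes (e.g. "mixed") are ignored
            seen.setdefault(conv_id, set()).add(role)
    return {
        conv_id: {"agent", "customer"} - roles
        for conv_id, roles in seen.items()
        if roles != {"agent", "customer"}
    }


files = ["conv-123456-agent.wav", "conv-123456-customer.wav", "conv-7-agent.wav"]
print(unpaired_conversations(files))
```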

Data Flow Architecture

[Diagram: data flow architecture]

CSV Metadata Formats

Stereo Voice Recordings

Configuration: recordingType = stereo, channelType = voice
| Field | Required | Type | Example | Notes |
| --- | --- | --- | --- | --- |
| conversationId | Required | String | conv-123456 | Unique ID, max 50 chars |
| agentEmail | Required | String | john.smith@company.com | Must exist in platform |
| conversationStartTime | Required | String | 2025-04-10T14:30:00Z | ISO 8601 UTC |
| conversationEndTime | Required | String | 2025-04-10T14:32:45Z | Must be after start time |
| channelType | Required | String | voice | Always voice for audio |
| recordingType | Required | String | stereo | Always stereo for this format |
| recordingUrl | Required | String | https://s3.amazonaws.com/bucket/conv-123456.wav | HTTPS URL |
| queueId | Required | String | support-tier1 | Must exist in queue mapping |
| agentChannel | Required | Integer | 0 | Agent audio channel (0=left, 1=right) |
| customerChannel | Required | Integer | 1 | Customer audio channel (0=left, 1=right) |
| language | Optional | String | en | ISO 639-1, defaults to en |
| asrProvider | Optional | String | microsoft | Audio service provider |
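
As an illustration, a one-row stereo CSV with the fields above could be produced as follows. This is a sketch using the example values from the table; the column order is an assumption, since only the field names are specified here.

```python
import csv
import io

# Field names from the stereo table; the order is an assumption.
STEREO_FIELDS = [
    "conversationId", "agentEmail", "conversationStartTime", "conversationEndTime",
    "channelType", "recordingType", "recordingUrl", "queueId",
    "agentChannel", "customerChannel", "language", "asrProvider",
]

row = {
    "conversationId": "conv-123456",
    "agentEmail": "john.smith@company.com",
    "conversationStartTime": "2025-04-10T14:30:00Z",
    "conversationEndTime": "2025-04-10T14:32:45Z",
    "channelType": "voice",
    "recordingType": "stereo",
    "recordingUrl": "https://s3.amazonaws.com/bucket/conv-123456.wav",
    "queueId": "support-tier1",
    "agentChannel": 0,      # 0 = left channel
    "customerChannel": 1,   # 1 = right channel
    "language": "en",
    "asrProvider": "microsoft",
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=STEREO_FIELDS)
writer.writeheader()
writer.writerow(row)
stereo_csv = buffer.getvalue()
print(stereo_csv)
```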

Mono Voice Recordings

Configuration: recordingType = mono, channelType = voice
Mono recordings require two separate CSV entries and two audio files per conversation.
| Field | Required | Type | Example | Notes |
| --- | --- | --- | --- | --- |
| conversationId | Required | String | conv-123456 | Same ID for both entries |
| agentEmail | Required | String | john.smith@company.com | Must exist in platform |
| conversationStartTime | Required | String | 2025-04-10T14:30:00Z | ISO 8601 UTC |
| conversationEndTime | Required | String | 2025-04-10T14:32:45Z | Must be after start time |
| channelType | Required | String | voice | Always voice for audio |
| recordingType | Required | String | mono | Always mono for this format |
| agentRecordings | Required | String | https://s3.amazonaws.com/bucket/conv-123456-agent.wav | URL to agent recording |
| customerRecordings | Required | String | https://s3.amazonaws.com/bucket/conv-123456-customer.wav | URL to customer recording |
| queueId | Required | String | support-tier1 | Must exist in queue mapping |
| agentId | Optional | String | agent-789 | Internal agent identifier |
| language | Optional | String | en | ISO 639-1, defaults to en |
| asrProvider | Optional | String | microsoft | Transcription provider |

Voice Transcripts (Pre-transcribed Audio)

Configuration: recordingType = transcription, channelType = voice
Use when you have already transcribed audio files and want to skip speech-to-text processing.
| Field | Required | Type | Example | Notes |
| --- | --- | --- | --- | --- |
| conversationId | Required | String | conv-123456 | Unique ID, max 50 chars |
| agentEmail | Required | String | john.smith@company.com | Must exist in platform |
| conversationStartTime | Required | String | 2025-04-10T14:30:00Z | ISO 8601 UTC |
| conversationEndTime | Required | String | 2025-04-10T14:32:45Z | Must be after start time |
| channelType | Required | String | voice | Always voice for audio transcripts |
| recordingType | Required | String | transcription | Always transcription |
| transcriptPath | Required | String | transcripts/voice-123.json | Path to JSON transcript file |
| queueId | Required | String | support-tier1 | Must exist in queue mapping |
| language | Optional | String | en | ISO 639-1, defaults to en |
| asrProvider | Optional | String | microsoft | Original audio service provider |

Chat Scripts (Live Chat Interactions)

Configuration: recordingType = transcription, channelType = chat
Use for live chat interactions from web chat, messaging platforms, or chat-based customer service.
| Field | Required | Type | Example | Notes |
| --- | --- | --- | --- | --- |
| conversationId | Required | String | conv-123456 | Unique ID, max 50 chars |
| agentEmail | Required | String | john.smith@company.com | Must exist in platform |
| conversationStartTime | Required | String | 2025-04-10T14:30:00Z | ISO 8601 format |
| conversationEndTime | Required | String | 2025-04-10T14:45:00Z | Must be after start time |
| channelType | Required | String | chat | Always chat for text interactions |
| recordingType | Required | String | transcription | Always transcription for chat |
| transcriptPath | Required | String | transcripts/chat-123.json | Path to JSON transcript file |
| queueId | Required | String | support-tier1 | Must exist in queue mapping |
| language | Optional | String | en-US | Defaults to en if not specified |
Chat scripts include interactions from web chat, WhatsApp, Facebook Messenger, and similar platforms. For conversations involving transfers, use the queueId where the conversation ended and the agentEmail of the agent who terminated it.
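
The required fields in the four CSV tables above can be checked per row before upload. The following is a minimal sketch keyed on channelType and recordingType; the names `REQUIRED_BY_TYPE` and `missing_fields` are hypothetical, and the check covers only field presence, not value formats.

```python
# Required fields shared by all four CSV formats described above.
COMMON = [
    "conversationId", "agentEmail", "conversationStartTime",
    "conversationEndTime", "channelType", "recordingType", "queueId",
]

# (channelType, recordingType) -> required fields, taken from the tables above.
REQUIRED_BY_TYPE = {
    ("voice", "stereo"): COMMON + ["recordingUrl", "agentChannel", "customerChannel"],
    ("voice", "mono"): COMMON + ["agentRecordings", "customerRecordings"],
    ("voice", "transcription"): COMMON + ["transcriptPath"],
    ("chat", "transcription"): COMMON + ["transcriptPath"],
}


def missing_fields(row: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or blank in one CSV row."""
    key = (row.get("channelType", ""), row.get("recordingType", ""))
    if key not in REQUIRED_BY_TYPE:
        raise ValueError(f"unsupported channelType/recordingType: {key}")
    return [name for name in REQUIRED_BY_TYPE[key] if not row.get(name, "").strip()]


chat_row = {
    "conversationId": "conv-123456",
    "agentEmail": "john.smith@company.com",
    "conversationStartTime": "2025-04-10T14:30:00Z",
    "conversationEndTime": "2025-04-10T14:45:00Z",
    "channelType": "chat",
    "recordingType": "transcription",
    "queueId": "support-tier1",
}
print(missing_fields(chat_row))
```

Rows read with `csv.DictReader` arrive as string values, which is why the check treats whitespace-only cells as missing.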

JSON Transcript Schemas

Voice Transcript Format

{
  "recognizedPhrases": [
    {
      "channel": 0,
      "offsetInTicks": 140000000.0,
      "nBest": [
        {
          "lexical": "yes one four three four two six",
          "words": [
            {
              "word": "yes",
              "offsetInTicks": 140000000.0,
              "durationInTicks": 3200000.0,
              "confidence": 0.51653963
            }
          ]
        }
      ]
    }
  ]
}
Required fields: channel, offsetInTicks, nBest[].lexical, nBest[].words[].word, nBest[].words[].offsetInTicks, nBest[].words[].durationInTicks, nBest[].words[].confidence
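
To sanity-check a voice transcript, you can list each phrase with its channel and start time. The sketch below assumes that one tick equals 100 nanoseconds (as in Microsoft's batch transcription output, which this schema resembles) and that the first `nBest` entry is the preferred hypothesis; neither is stated above, and the function name is hypothetical.

```python
TICKS_PER_SECOND = 10_000_000  # assumption: 1 tick = 100 ns


def phrase_summaries(transcript: dict) -> list[tuple[int, float, str]]:
    """Return (channel, start_seconds, text) for each recognized phrase."""
    summaries = []
    for phrase in transcript["recognizedPhrases"]:
        best = phrase["nBest"][0]  # assumption: first entry is the top hypothesis
        start_seconds = phrase["offsetInTicks"] / TICKS_PER_SECOND
        summaries.append((phrase["channel"], start_seconds, best["lexical"]))
    return summaries


sample = {
    "recognizedPhrases": [
        {
            "channel": 0,
            "offsetInTicks": 140000000.0,
            "nBest": [{"lexical": "yes one four three four two six", "words": []}],
        }
    ]
}
print(phrase_summaries(sample))
```

Under the 100 ns assumption, the example phrase starts 14 seconds into the recording.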

Chat Transcript Format

{
  "1": {
    "type": "AGENT",
    "text": "Good afternoon, how can I help you today?",
    "timestamp": 1749562206000,
    "userId": "john.smith@company.com"
  },
  "2": {
    "type": "USER",
    "text": "I need help with my account balance.",
    "timestamp": 1749562253142,
    "userId": "customer_12345"
  }
}
Required fields:
| Field | Values | Description |
| --- | --- | --- |
| type | AGENT, USER, or SYSTEM | Participant role |
| text | String | Message content |
| timestamp | Integer | Unix timestamp in milliseconds |
| userId | String | Participant identifier |
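
A chat transcript can be checked against the required fields above before upload. This is a minimal sketch with hypothetical function names; it also converts the millisecond timestamps to the ISO 8601 UTC form used in the CSV metadata, which makes it easier to compare them with conversationStartTime and conversationEndTime.

```python
from datetime import datetime, timezone

ALLOWED_TYPES = {"AGENT", "USER", "SYSTEM"}
REQUIRED_KEYS = {"type", "text", "timestamp", "userId"}


def chat_problems(transcript: dict) -> list[str]:
    """Return a list of problems found in a chat transcript; empty means it passes."""
    problems = []
    for key, message in transcript.items():
        missing = REQUIRED_KEYS - message.keys()
        if missing:
            problems.append(f"message {key}: missing {sorted(missing)}")
            continue
        if message["type"] not in ALLOWED_TYPES:
            problems.append(f"message {key}: unknown type {message['type']!r}")
        if not isinstance(message["timestamp"], int):
            problems.append(f"message {key}: timestamp is not an integer")
    return problems


def to_iso_utc(timestamp_ms: int) -> str:
    """Convert a Unix timestamp in milliseconds to YYYY-MM-DDTHH:MM:SSZ."""
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_iso_utc(1749562206000))  # first message in the example above
```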

Step-by-Step Configuration

Step 1: Prepare S3 Environment

Option 1: Unified Path Structure
[Diagram: unified path structure]
Option 2: Separate Paths
[Diagram: separate path structure]
Validation checklist (data preparation):
  • All audio files are accessible via HTTPS URLs
  • CSV files contain required fields with correct headers
  • Mono recordings have separate agent/customer files
  • test.csv exists in each configured folder
  • File sizes are under 50 MB each

Step 2: Platform Configuration

  1. Navigate to Quality AI > Configure > Connectors.
  2. Select + Add Connector > Amazon S3 > Connect.
  3. Configure Basic Settings:
    • Name — descriptive connector name
    • AWS Region — your S3 bucket region
    • Auth Type — access keys or IAM role
  4. Configure authentication:
    • Access Keys — enter Access Key and Secret Key.
    • IAM Role — enter the IAM Role ARN.
  5. Configure folder paths:
    • Unified Path — single path for voice and chat: s3://your-bucket/conversations/
    • Separate Paths — voice and chat paths separately:
      • Voice: s3://your-bucket/voice-interactions/
      • Chat: s3://your-bucket/chat-interactions/
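
The folder paths above combine a bucket name and a key prefix. If you need those parts separately (for example, to list the folder with your own tooling before configuring the connector), a path can be split as in this sketch; the helper name is hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/prefix/' into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


print(split_s3_uri("s3://your-bucket/conversations/"))
print(split_s3_uri("s3://your-bucket/voice-interactions/"))
```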
Validation checklist (connection setup):
  1. Select the Test tab in the connector configuration.
  2. Verify all checks pass:
| Check | Expected Result |
| --- | --- |
| Authentication | Connected successfully |
| File Path Access | S3 bucket accessible |
| File Format | CSV format validated |
| Metadata Validation | Required fields confirmed |
If a check fails:
| Check | Resolution |
| --- | --- |
| Authentication | Verify credentials and IAM permissions |
| File Access | Check bucket name, region, folder paths, and URL accessibility |
| Format/Metadata | Ensure test.csv exists with correct structure; verify column headers and timestamp formats |

Step 3: Queue Mapping and Scheduling

  1. Navigate to the Queue tab and map CSV queueId values to Quality AI Express queues. Ensure exact string matches.
  2. Navigate to the Schedule tab:
    • Interval — choose frequency (minutes/hours/days)
    • Start Time — set initial run time (UTC timezone)
  3. Select Save to activate.
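
Because queue mapping relies on exact string matches, differences in case or stray whitespace leave a queueId unmapped. The sketch below flags such values; the mapping dictionary and queue names are hypothetical stand-ins for whatever you enter on the Queue tab.

```python
def unmapped_queue_ids(csv_queue_ids: list[str], queue_mapping: dict[str, str]) -> list[str]:
    """Return CSV queueId values with no exact (case- and whitespace-sensitive) match."""
    return sorted({queue_id for queue_id in csv_queue_ids if queue_id not in queue_mapping})


# Hypothetical mapping: CSV queueId -> Quality AI Express queue name.
mapping = {"support-tier1": "Tier 1 Support", "billing": "Billing"}
print(unmapped_queue_ids(["support-tier1", "Support-Tier1", "billing "], mapping))
```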
Final validation checklist:
  • Queue mappings saved and validated
  • Processing schedule configured and active
  • First ingestion job appears in the Log tab
  • No error messages in processing logs
Success indicators:
  • Conversations appear in Quality AI Express dashboards
  • Analytics data populates for ingested interactions

Troubleshooting

Authentication Issues

| Problem | Symptoms | Solution |
| --- | --- | --- |
| Invalid Credentials | Authentication failed error | Verify access key/secret key; check IAM role ARN format; ensure credentials have not expired |
| Permission Denied | Access denied to S3 bucket | Add S3 read permissions to the IAM user/role; verify bucket policy; check that bucket region matches configuration |

Data Processing Issues

| Problem | Symptoms | Solution |
| --- | --- | --- |
| Timestamp Errors | Invalid timestamp format | Use ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ); include UTC timezone; verify end time is after start time |
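
Timestamp problems can be caught before upload with a strict check of the YYYY-MM-DDTHH:MM:SSZ form and of the start/end ordering. This is a minimal sketch with hypothetical function names; it deliberately rejects offsets such as +02:00, fractional seconds, and a missing Z, which may be stricter than the connector itself.

```python
import re
from datetime import datetime

TIMESTAMP_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def parse_utc(value: str) -> datetime:
    """Parse a YYYY-MM-DDTHH:MM:SSZ timestamp, raising ValueError otherwise."""
    if not TIMESTAMP_PATTERN.fullmatch(value):
        raise ValueError(f"not in YYYY-MM-DDTHH:MM:SSZ form: {value!r}")
    # strptime also rejects impossible dates such as month 13.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def timestamps_ok(start: str, end: str) -> bool:
    """True when both timestamps are well formed and end is after start."""
    try:
        return parse_utc(end) > parse_utc(start)
    except ValueError:
        return False


print(timestamps_ok("2025-04-10T14:30:00Z", "2025-04-10T14:32:45Z"))
print(timestamps_ok("2025-04-10 14:30:00", "2025-04-10T14:32:45Z"))
```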

Performance Expectations

Processing takes approximately 3–5 minutes per conversation, depending on conversation duration, ASR transcription latency (voice), and LLM response latency.