Modulate Platform API Documentation

Overview

The Modulate Platform API provides enterprise-grade speech analysis with AI-powered transcription, emotion detection, and demographic insights. It is built for scale, with asynchronous job processing, and optimized for accuracy.

🎯 High Accuracy

State-of-the-art ML models for accurate transcription and analysis

⚡ Fast Processing

Optimized pipeline with 15-second chunking for parallel processing

💰 Pay-per-Feature

Flexible pricing - only pay for the features you use

Supported Formats

Input: MP3, WAV, Opus (up to 5MB, ~5 minutes)
Processing: Automatic conversion to optimized Opus chunks
Languages: English (multi-language support coming soon)
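
The limits above can be checked on the client before any upload is attempted. The following is a minimal sketch, assuming that "5MB" means 5 × 1024 × 1024 bytes of raw audio (the documentation does not say whether the limit applies before or after base64 encoding). `validate_audio` is a hypothetical helper name, not part of the API or the SDK.

```python
from pathlib import Path

# Values taken from the limits listed above; the exact byte count behind
# "5MB" is an assumption made for this sketch.
SUPPORTED_EXTENSIONS = {"mp3", "wav", "opus"}
MAX_FILE_BYTES = 5 * 1024 * 1024


def validate_audio(filename: str, size_bytes: int) -> str:
    """Return the lowercase file extension, or raise ValueError if the file looks unsupported."""
    extension = Path(filename).suffix.lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported format: {extension!r}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"File is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return extension
```

The returned extension can be passed directly as the `file_extension` field of the submit request.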

Authentication

All API requests require authentication headers. Get your credentials from the dashboard.


accountuuid: 12345678-1234-1234-1234-123456789abc
apikey: abcdef12-3456-7890-abcd-ef1234567890
Content-Type: application/json
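
To keep credentials out of source code, the headers above can be assembled from environment variables. This is a sketch only: the variable names `MODULATE_ACCOUNT_UUID` and `MODULATE_API_KEY` and the helper `build_headers` are assumptions made for illustration, not names defined by the platform.

```python
import os


def build_headers(environ=os.environ) -> dict:
    """Build the three request headers from environment variables.

    MODULATE_ACCOUNT_UUID and MODULATE_API_KEY are hypothetical variable
    names chosen for this sketch.
    """
    try:
        account_uuid = environ["MODULATE_ACCOUNT_UUID"]
        api_key = environ["MODULATE_API_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None
    return {
        "accountuuid": account_uuid,
        "apikey": api_key,
        "Content-Type": "application/json",
    }
```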

Quick Start

Get started in under 5 minutes with a simple transcription request:

1. Encode your audio

Convert your audio file to base64 before submission.

2. Submit job

Send a POST request to /api_service with your encoded audio.

3. Poll for results

Check the status with GET /api_service/job_status/{job_id} until your transcript is ready.

# 1. Submit audio for transcription
curl -X POST "https://cloud-processing-api.modulate.ai/api_service" \
  -H "accountuuid: 12345678-1234-1234-1234-123456789abc" \
  -H "apikey: abcdef12-3456-7890-abcd-ef1234567890" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_base64": "UklGRiQAAABXQVZFZm10...",
    "file_extension": "wav",
    "flags": {}
  }'

# Response: {"job_id": "abc-123", "status": "processing"}

# 2. Check status (repeat until completed)
curl -X GET "https://cloud-processing-api.modulate.ai/api_service/job_status/abc-123" \
  -H "accountuuid: 12345678-1234-1234-1234-123456789abc" \
  -H "apikey: abcdef12-3456-7890-abcd-ef1234567890"

API Endpoints

POST Submit Audio for Processing

POST /api_service

Request Body

{
  "audio_base64": "base64-encoded-audio-data",
  "file_extension": "mp3|wav|opus",
  "flags": {
    "emotion_annotation": true,    // Optional: +1 API call
    "demographics": true          // Optional: +1 API call
  },
  "filename": "my-audio-file.mp3"  // Optional
}

Response

{
  "job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
  "status": "processing",
  "api_calls_charged": 3,
  "estimated_processing_time": "30-60 seconds",
  "message": "Audio uploaded successfully. Use job_id to check status."
}

Error Response

{
  "error": "Invalid file format. Supported: mp3, wav, opus",
  "code": "INVALID_FORMAT",
  "details": {
    "file_extension": "mp4",
    "supported_formats": ["mp3", "wav", "opus"]
  }
}

GET Get Job Status & Results

GET /api_service/job_status/{job_id}

Parameters

Parameter	Type	Description
job_id	string	Job identifier from submit response
include_chunks	boolean	Include detailed chunk results (optional)

Response

Processing

{
  "job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
  "status": "processing",
  "progress": {
    "total_chunks": 12,
    "completed_chunks": 8,
    "percentage": 67
  },
  "estimated_completion": "30 seconds"
}

Completed

{
  "job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
  "status": "completed",
  "results": {
    "full_transcription": "Hello world, this is a test audio file...",
    "chunks": [
      {
        "chunk_id": "chunk_001",
        "start_time_seconds": 0,
        "duration_seconds": 15,
        "transcription": "Hello world, this is a test",
        "emotion": "0.95",      // If emotion_annotation enabled
        "shouting": "0.02",     // If emotion_annotation enabled
        "age": "adult",         // If demographics enabled
        "gender": "male"        // If demographics enabled
      }
    ]
  },
  "processing_time_seconds": 23.4,
  "api_calls_charged": 3
}
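
The per-chunk fields in the completed response can be post-processed on the client. The sketch below picks out chunks with a high `emotion` value. It assumes that `emotion` is a numeric score encoded as a string, as in the sample above, and that higher values mean a stronger signal. The documentation does not define the scale, so the 0.5 threshold and the helper name `flag_chunks` are illustrative only.

```python
def flag_chunks(result: dict, threshold: float = 0.5) -> list:
    """Return (start_time_seconds, transcription) for chunks whose emotion score exceeds threshold."""
    flagged = []
    for chunk in result.get("results", {}).get("chunks", []):
        score = chunk.get("emotion")
        if score is None:  # field is absent when emotion_annotation was not enabled
            continue
        # The sample response encodes scores as strings, so convert before comparing.
        if float(score) > threshold:
            flagged.append((chunk["start_time_seconds"], chunk["transcription"]))
    return flagged
```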

Features & Pricing

🎙️ Transcription

High-accuracy speech-to-text

Always Included

1 API call per audio file

😡 Emotion Analysis

Anger detection & shouting identification

Optional

+1 API call if enabled

👥 Demographics

Age group & gender classification

Optional

+1 API call if enabled

Feature Flags

{
  "flags": {
    "emotion_annotation": true,    // Enables anger + shouting detection
    "demographics": true           // Enables age + gender classification
  }
}

// API call calculation:
// Base transcription: 1 call
// + emotion_annotation: +1 call
// + demographics: +1 call
// Total: 3 API calls
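
The calculation above can be written as a small function for estimating cost before submitting a job. This is a sketch of the arithmetic as described in this section. `api_calls_for` is a hypothetical helper, and the authoritative figure is the `api_calls_charged` field returned by the API.

```python
# Optional flags that each add one API call, per the pricing described above.
BILLABLE_FLAGS = ("emotion_annotation", "demographics")


def api_calls_for(flags: dict) -> int:
    """Base transcription costs 1 call; each enabled optional flag adds 1."""
    return 1 + sum(1 for name in BILLABLE_FLAGS if flags.get(name) is True)
```

For example, `api_calls_for({})` gives 1, and enabling both flags gives 3.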

Python SDK

For the best developer experience, use our official Python SDK:

# Download SDK from dashboard or copy the AudioProcessingClient class

SDK Usage

from audio_processing_sdk import AudioProcessingClient, AudioFeature

# Initialize client
client = AudioProcessingClient(
    account_uuid="12345678-1234-1234-1234-123456789abc",
    api_key="abcdef12-3456-7890-abcd-ef1234567890"
)

# Submit audio with features
job_id = client.submit_audio(
    "my-audio.mp3",
    features=[
        AudioFeature.TRANSCRIPTION,    # Always included (1 API call)
        AudioFeature.EMOTION,          # +1 API call
        AudioFeature.DEMOGRAPHICS      # +1 API call
    ]
)

# Wait for completion with progress callback
def show_progress(result):
    progress = result.progress
    print(f"Progress: {progress['percentage']}%")

result = client.wait_for_completion(
    job_id,
    progress_callback=show_progress
)

# Access results
print(f"Transcription: {result.full_transcription}")
print(f"API calls used: {result.api_calls_consumed}")

for chunk in result.chunks:
    print(f"Chunk {chunk.start_time_seconds}s: {chunk.transcription}")
    if chunk.emotion:
        print(f"  Emotion: {chunk.emotion}")

Code Examples

Python

import base64
import requests
import time

# Configuration
BASE_URL = "https://cloud-processing-api.modulate.ai"
ACCOUNT_UUID = "12345678-1234-1234-1234-123456789abc"
API_KEY = "abcdef12-3456-7890-abcd-ef1234567890"

headers = {
    "accountuuid": ACCOUNT_UUID,
    "apikey": API_KEY,
    "Content-Type": "application/json"
}

# Read and encode audio file
with open("audio.mp3", "rb") as f:
    audio_data = base64.b64encode(f.read()).decode()

# Submit for processing
payload = {
    "audio_base64": audio_data,
    "file_extension": "mp3",
    "flags": {
        "emotion_annotation": True,
        "demographics": True
    }
}

response = requests.post(f"{BASE_URL}/api_service",
                        json=payload, headers=headers)
response.raise_for_status()  # Stop on 4xx/5xx instead of failing later on a missing job_id
job_data = response.json()
job_id = job_data["job_id"]

print(f"Job submitted: {job_id}")

# Poll for completion
while True:
    status_response = requests.get(
        f"{BASE_URL}/api_service/job_status/{job_id}",
        headers=headers
    )
    status_response.raise_for_status()  # e.g. 404 if the job ID is unknown
    result = status_response.json()

    if result["status"] == "completed":
        print("Processing complete!")
        print(f"Transcription: {result['results']['full_transcription']}")
        break
    elif result["status"] == "failed":
        print("Processing failed!")
        break
    else:
        progress = result.get("progress", {})
        print(f"Progress: {progress.get('percentage', 0)}%")
        time.sleep(10)

Bash

#!/bin/bash

# Configuration
BASE_URL="https://cloud-processing-api.modulate.ai" 
ACCOUNT_UUID="12345678-1234-1234-1234-123456789abc"
API_KEY="abcdef12-3456-7890-abcd-ef1234567890"

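# Encode audio file (strip newlines: GNU base64 wraps its output at 76 characters)
AUDIO_B64=$(base64 < audio.mp3 | tr -d '\n')

# Submit job. The JSON body is piped to curl on stdin because a multi-megabyte
# payload can exceed the operating system's command-line argument size limit.
RESPONSE=$(printf '{"audio_base64": "%s", "file_extension": "mp3", "flags": {"emotion_annotation": true, "demographics": true}}' "$AUDIO_B64" |
  curl -s -X POST "$BASE_URL/api_service" \
    -H "accountuuid: $ACCOUNT_UUID" \
    -H "apikey: $API_KEY" \
    -H "Content-Type: application/json" \
    --data-binary @-)

JOB_ID=$(echo "$RESPONSE" | jq -r '.job_id')
echo "Job ID: $JOB_ID"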

# Poll for completion
while true; do
  STATUS=$(curl -s -X GET "$BASE_URL/api_service/job_status/$JOB_ID" \
    -H "accountuuid: $ACCOUNT_UUID" \
    -H "apikey: $API_KEY")
  
  STATUS_VALUE=$(echo "$STATUS" | jq -r '.status')

  if [ "$STATUS_VALUE" = "completed" ]; then
    echo "Processing complete!"
    echo "$STATUS" | jq '.results.full_transcription'
    break
  elif [ "$STATUS_VALUE" = "failed" ]; then
    echo "Processing failed!"
    break
  else
    PERCENTAGE=$(echo "$STATUS" | jq -r '.progress.percentage // 0')
    echo "Progress: $PERCENTAGE%"
    sleep 10
  fi
done

JavaScript

const BASE_URL = "https://cloud-processing-api.modulate.ai";
const ACCOUNT_UUID = "12345678-1234-1234-1234-123456789abc";
const API_KEY = "abcdef12-3456-7890-abcd-ef1234567890";

const headers = {
    'accountuuid': ACCOUNT_UUID,
    'apikey': API_KEY,
    'Content-Type': 'application/json'
};

async function processAudio(audioFile) {
    // Convert file to base64
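    const reader = new FileReader();
    const audioBase64 = await new Promise((resolve, reject) => {
        // reader.result is a data URL ("data:<mime>;base64,<data>"); keep only the data part
        reader.onload = () => resolve(reader.result.split(',')[1]);
        // Reject on read failure so the caller does not wait forever
        reader.onerror = () => reject(reader.error);
        reader.readAsDataURL(audioFile);
    });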

    // Submit job
    const submitResponse = await fetch(`${BASE_URL}/api_service`, {
        method: 'POST',
        headers: headers,
        body: JSON.stringify({
            audio_base64: audioBase64,
            file_extension: audioFile.name.split('.').pop(),
            flags: {
                emotion_annotation: true,
                demographics: true
            }
        })
    });

    const submitData = await submitResponse.json();
    const jobId = submitData.job_id;
    console.log(`Job submitted: ${jobId}`);

    // Poll for completion
    while (true) {
        const statusResponse = await fetch(
            `${BASE_URL}/api_service/job_status/${jobId}`,
            { headers }
        );
        const result = await statusResponse.json();

        if (result.status === 'completed') {
            console.log('Processing complete!');
            console.log('Transcription:', result.results.full_transcription);
            return result;
        } else if (result.status === 'failed') {
            throw new Error('Processing failed');
        } else {
            const progress = result.progress?.percentage || 0;
            console.log(`Progress: ${progress}%`);
            await new Promise(resolve => setTimeout(resolve, 10000));
        }
    }
}

// Usage with file input
document.getElementById('audioFile').addEventListener('change', async (e) => {
    const file = e.target.files[0];
    if (file) {
        try {
            const result = await processAudio(file);
            console.log('Results:', result);
        } catch (error) {
            console.error('Error:', error);
        }
    }
});

Error Handling

HTTP Status Codes

Code	Description	Common Causes
200	Success	Job status retrieved successfully
202	Accepted	Audio submitted for processing
400	Bad Request	Invalid file format, malformed JSON
401	Unauthorized	Invalid account UUID or API key
404	Not Found	Job ID not found
413	Payload Too Large	File exceeds 5MB limit
429	Too Many Requests	Quota exceeded or rate limited
500	Internal Server Error	Processing error, contact support

Error Response Format

{
  "error": "Human-readable error message",
  "code": "ERROR_CODE_CONSTANT",
  "details": {
    "field": "additional context",
    "suggestion": "how to fix the issue"
  },
  "timestamp": "2025-06-15T16:50:55Z"
}
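
The status codes and error body above can be folded into a single check that callers run on every response. The sketch below is one possible shape, not the SDK's behavior. `ModulateAPIError` and `check_response` are hypothetical names, and treating 429 and 5xx as retryable is an assumption based on common HTTP practice.

```python
class ModulateAPIError(Exception):
    """Hypothetical exception type wrapping the documented error body."""

    def __init__(self, status_code: int, body: dict):
        self.status_code = status_code
        self.code = body.get("code", "UNKNOWN")
        self.details = body.get("details", {})
        super().__init__(f"{status_code} {self.code}: {body.get('error', 'no message')}")

    @property
    def retryable(self) -> bool:
        # Assumption: 429 and 5xx are transient; other 4xx codes need a corrected request.
        return self.status_code == 429 or self.status_code >= 500


def check_response(status_code: int, body: dict) -> dict:
    """Return the body for 200/202, otherwise raise ModulateAPIError."""
    if status_code in (200, 202):
        return body
    raise ModulateAPIError(status_code, body)
```

With the `requests` library this would be called as `check_response(response.status_code, response.json())`.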

Rate Limits & Quotas

📁 File Limits

Max file size: 5MB
Max duration: ~5 minutes
Supported formats: MP3, WAV, Opus
Language: English only

⚡ API Quotas

Monthly quota: Shown on dashboard
Concurrent jobs: 10 per account
Request rate: 60 requests/minute
Billing: Post-pay monthly
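
A client can stay under the request-rate quota by spacing its own calls. The following is a minimal sketch, assuming the 60 requests/minute figure applies to all requests from one account and that even spacing is acceptable. `MinIntervalLimiter` is a hypothetical helper. It does not track the concurrent-job limit.

```python
import time


class MinIntervalLimiter:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request may be sent; return the seconds slept."""
        now = self.clock()
        delay = max(0.0, self.next_allowed - now)
        if delay:
            self.sleep(delay)
        # Schedule the following slot one interval after this one.
        self.next_allowed = max(now, self.next_allowed) + self.interval
        return delay
```

Calling `limiter.wait()` immediately before each HTTP request is enough. The first call returns without sleeping.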

💡 Pro Tips

Use only the features you need to minimize API calls
Implement exponential backoff for status polling
Cache results to avoid reprocessing the same audio
Monitor your quota usage on the dashboard
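
The backoff tip above can be sketched as a generator of poll delays. The starting delay, growth factor, and cap below are illustrative defaults, not values recommended by the platform, and `backoff_delays` is a hypothetical helper name.

```python
def backoff_delays(initial=2.0, factor=2.0, cap=30.0):
    """Yield poll delays in seconds: initial, initial * factor, ... never exceeding cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)
```

A polling loop would take one delay per status check, for example `for delay in backoff_delays():`, check the job status, and call `time.sleep(delay)` while the job is still processing.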

Need Help?

Visit your dashboard for account management and usage tracking.

For technical support or feature requests, contact our team.
