transcribe.audio

Transcribe an audio file to text. POST { url, language?, diarize? }; the audio is fetched from your URL. Supported formats: wav, mp3, m4a/aac, ogg/opus, flac, webm. Each call accepts up to 15 MB and 15 minutes of audio, so split longer recordings and call once per segment. Returns the full punctuated transcript, overall confidence, audio duration, the detected or specified language (BCP-47, e.g. "en", "es"), word-level timestamps with confidences, and, with diarize=true, speaker-segmented utterances (who said what, when). High-accuracy speech recognition for meeting notes, podcast processing, voicemail handling, and media monitoring.
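Because each call accepts at most 15 minutes of audio, a longer recording has to be cut into segments before it is submitted. The following is a minimal sketch of the planning step only, assuming the total duration is already known. `plan_segments` and `estimate_cost` are hypothetical helper names, not part of any 2s.io SDK; the actual audio cutting is left to a tool of your choice, and the 15 MB size cap is not checked here. The cost figure uses the listed price of $0.0675 USDC per call.

```python
from decimal import Decimal

MAX_SEGMENT_SECONDS = 15 * 60            # per-call duration cap from the description
PRICE_PER_CALL_USDC = Decimal("0.0675")  # listed flat price per call


def plan_segments(total_seconds, max_seconds=MAX_SEGMENT_SECONDS):
    """Return (start, end) offsets in seconds, each span at most max_seconds long."""
    total_seconds = float(total_seconds)
    max_seconds = float(max_seconds)
    if total_seconds <= 0 or max_seconds <= 0:
        raise ValueError("durations must be positive")
    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


def estimate_cost(segment_count):
    """One call per segment, each billed at the flat per-call price."""
    return PRICE_PER_CALL_USDC * segment_count


segments = plan_segments(40 * 60)    # a 40-minute recording
print(segments)                      # [(0.0, 900.0), (900.0, 1800.0), (1800.0, 2400.0)]
print(estimate_cost(len(segments)))  # 0.2025
```

A 40-minute recording therefore needs three calls. Splitting at fixed offsets can cut a word in half, so in practice you may prefer to cut at silences and keep each piece under both caps.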

price
$0.0675 USDC per call
method
POST /api/transcribe/audio
payment
x402 v2 · USDC on Base (EIP-3009) or Solana (SPL transfer)
auth
None. Sign the payment, retry with PAYMENT-SIGNATURE.
tier
Tier 2 — LLM/render upstream included
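The auth and payment rows imply a two-step flow: call once unpaid, read the payment requirements from the 402 response, sign, then retry with a PAYMENT-SIGNATURE header whose value is a base64-encoded JSON payload. The sketch below illustrates only the header encoding step. The payload fields shown are placeholders invented for illustration; the real structure is defined by the x402 protocol and the 402 response, and the signing itself is not shown.

```python
import base64
import json


def encode_payment_header(payload: dict) -> str:
    """Serialize a payment payload as compact JSON, then base64-encode it."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_payment_header(value: str) -> dict:
    """Inverse of encode_payment_header, handy for debugging a header value."""
    return json.loads(base64.b64decode(value))


# Hypothetical payload: these field names are assumptions, not the x402 schema.
payload = {"network": "base", "signature": "0x..."}

headers = {
    "Content-Type": "application/json",
    "PAYMENT-SIGNATURE": encode_payment_header(payload),
}

# Round-trip check: decoding the header value yields the original payload.
assert decode_payment_header(headers["PAYMENT-SIGNATURE"]) == payload
```

The SDKs and the runner script listed under Code samples handle this loop for you, so hand-rolling the header is mainly useful for debugging.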

Parameters

Name            Type     Description
url (required)  string   Public audio URL. Caps: 15 MB, 15 minutes. Max 2048 chars; must be a URI.
language        string   Must match ^[a-z]{2,3}(-[A-Za-z0-9]{2,8})?$
diarize         boolean
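The constraints in the parameter table can be checked on the client before a paid call is made. This is a minimal sketch under stated assumptions: `build_request_body` is a hypothetical helper, and treating "public audio URL" as "absolute http(s) URL" is an assumption rather than something the table states. The length limit and the language pattern are copied from the table.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Constraints copied from the parameter table.
LANGUAGE_PATTERN = re.compile(r"^[a-z]{2,3}(-[A-Za-z0-9]{2,8})?$")
MAX_URL_CHARS = 2048


def build_request_body(url: str, language: Optional[str] = None,
                       diarize: Optional[bool] = None) -> dict:
    """Validate the documented constraints and return the JSON body to POST."""
    if len(url) > MAX_URL_CHARS:
        raise ValueError("url exceeds 2048 characters")
    parsed = urlparse(url)
    # Assumption: a public audio URL is an absolute http(s) URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be an absolute http(s) URL")
    body = {"url": url}
    if language is not None:
        if LANGUAGE_PATTERN.fullmatch(language) is None:
            raise ValueError(f"language {language!r} does not match the documented pattern")
        body["language"] = language
    if diarize is not None:
        body["diarize"] = bool(diarize)
    return body


print(build_request_body("https://example.com/audio.mp3", language="en", diarize=True))
```

With this check, a value such as "example" is rejected locally, while "en" and "pt-BR" pass.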

Code samples

cURL (bash)
# 1. Probe the endpoint with no auth — receive 402 with PaymentRequirements
curl -sS -X POST 'https://2s.io/api/transcribe/audio' \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com/audio.mp3","language":"en","diarize":false}'

# 2. Sign the EIP-3009 transferWithAuthorization for the advertised price +
#    payTo from the 402 envelope, then retry with PAYMENT-SIGNATURE:
curl -sS -X POST 'https://2s.io/api/transcribe/audio' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: <base64-json-payload>' \
  -d '{"url":"https://example.com/audio.mp3","language":"en","diarize":false}'

# Or just use the canonical runner — it handles the whole loop:
#   EVM_PRIVATE_KEY=0x... node --env-file=.env.local \
#     --experimental-strip-types scripts/x402-pay.ts \
#     'https://2s.io/api/transcribe/audio'

TypeScript / Node, @2sio/sdk (typescript)
import { TwoS } from '@2sio/sdk'

const client = new TwoS({
  privateKey: process.env.EVM_PRIVATE_KEY as `0x${string}`,
})

const result = await client.transcribe.audio({
  url: 'https://example.com/audio.mp3',
  language: 'en',
  diarize: false,
})

console.log('endpoint:', result.endpoint)
console.log('cost:', result.costUsd, 'USDC')
console.log('tx:', result.settlement?.txHash)
console.log('data:', result.data)

Python, 2sio (python)
import os
from twosio import TwoS

client = TwoS(private_key=os.environ["EVM_PRIVATE_KEY"])

result = client.transcribe.audio(url="https://example.com/audio.mp3", language="en", diarize=False)

print("endpoint:", result.endpoint)
print("cost:", result.cost_usd, "USDC")
print("tx:", (result.settlement or {}).get("tx_hash"))
print("data:", result.data)

MCP: Claude Desktop / AgentKit / any MCP host (json)
// 1. Add @2sio/mcp to your MCP host config (Claude Desktop example below).
//    EVM_PRIVATE_KEY funds x402 payments per call.

// claude_desktop_config.json
{
  "mcpServers": {
    "2sio": {
      "command": "npx",
      "args": ["-y", "@2sio/mcp"],
      "env": { "EVM_PRIVATE_KEY": "0x..." }
    }
  }
}

// 2. Once the server is running, agents call this tool via standard MCP:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "transcribe.audio",
    "arguments": {
      "url": "https://example.com/audio.mp3",
      "language": "en",
      "diarize": false
    }
  }
}

Response

Field            Type    Description
transcript       string  Full punctuated transcript.
confidence       number
durationSeconds  number
language         string
wordCount        number
words            array   Word-level timestamps (seconds).
utterances       array   Speaker turns; populated when diarize=true.
source           object

Example response data (json)
{
  "transcript": "example",
  "confidence": 1,
  "durationSeconds": 1,
  "language": "en",
  "wordCount": 1,
  "words": [
    {
      "word": "example",
      "start": 1,
      "end": 1,
      "confidence": 1,
      "speaker": 1
    }
  ],
  "utterances": [
    {
      "speaker": 1,
      "start": 1,
      "end": 1,
      "text": "example"
    }
  ],
  "source": {
    "provider": "example",
    "url": "example"
  }
}
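
When diarize=true, the utterances array can be turned into a readable speaker-by-speaker transcript. The sketch below assumes the response data has the shape of the example above and that utterance start and end values are in seconds, as documented for words. `format_speaker_turns` is a hypothetical helper, and the sample values are made up for illustration.

```python
from typing import List


def format_speaker_turns(data: dict) -> List[str]:
    """Render utterances as '[start-end] Speaker N: text' lines.

    Falls back to the plain transcript when no utterances are present,
    which is assumed to be the case when diarize was not requested.
    """
    utterances = data.get("utterances") or []
    if not utterances:
        return [data.get("transcript", "")]
    return [
        f"[{u['start']:.2f}-{u['end']:.2f}] Speaker {u['speaker']}: {u['text']}"
        for u in sorted(utterances, key=lambda u: u["start"])
    ]


# Made-up sample shaped like the example response; values are illustrative only.
sample = {
    "transcript": "Hello there. Hi.",
    "utterances": [
        {"speaker": 0, "start": 0.0, "end": 1.2, "text": "Hello there."},
        {"speaker": 1, "start": 1.5, "end": 2.0, "text": "Hi."},
    ],
}

for line in format_speaker_turns(sample):
    print(line)
```

With either SDK sample above, the dictionary passed in would be the `data` field of the result.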

Discovery

2s.io is x402-native. Every call is paid per request in USDC from a funded wallet: an EVM wallet on Base or, per the payment details above, an SPL transfer on Solana. There is no signup, no API keys, and no monthly fees. Source code: github.com/2s-io/sdk.