Tags: text, PolyAI/minds14, speech, intent-detection, multilingual, asr, banking, audio, nlu, benchmark

MInDS-14: Multilingual Spoken Intent Detection (14 Languages, e-Banking)

Category: Text
Records: 16,336 rows
Format: PARQUET
Update Frequency: One-time snapshot
Collection Method: auto_imported_huggingface_federated
PII: None detected
File Size: ~1080.75 MB
Downloads: 0

About this data

Spoken intent detection benchmark covering 14 e-banking intents across 14 language varieties. Each row pairs an audio clip with its transcription and an English translation, stored in Parquet format. Suited to spoken language understanding evaluation and multilingual ASR/NLU fine-tuning.

Schema

Name                  | Type                             | Description
path                  | VARCHAR                          | File path to the audio clip, including language code and intent category
audio                 | STRUCT(bytes BLOB, path VARCHAR) | WAV audio (8 kHz mono) stored as raw bytes, plus the clip's file path
transcription         | VARCHAR                          | Spoken utterance transcribed in the original language
english_transcription | VARCHAR                          | English translation of the spoken utterance
intent_class          | BIGINT                           | Intent label index, 0-13 (BALANCE, TRANSFER, PAYMENT, etc.)
lang_id               | BIGINT                           | Language identifier for one of the 14 supported language varieties
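
As a rough illustration of working with the audio column, the sketch below decodes the WAV bytes from one audio struct using only the Python standard library. It assumes the struct arrives as a mapping with bytes and path keys (as the STRUCT type suggests) and that the bytes are PCM WAV data, which the wave module can read. Loading the Parquet file itself is left out. The helper names describe_wav and make_silent_wav are hypothetical, and the silent clip is a stand-in, not dataset content.

```python
import io
import wave


def describe_wav(audio):
    """Summarize one `audio` struct: a mapping with `bytes` (WAV data) and `path`."""
    # wave.open accepts any file-like object, so wrap the raw bytes in BytesIO.
    with wave.open(io.BytesIO(audio["bytes"]), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "path": audio["path"],
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def make_silent_wav(seconds=1.0, rate=8000):
    """Build a stand-in 16-bit mono WAV so the sketch runs without the dataset."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample (16-bit PCM); an assumption for the stand-in only
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


# Hypothetical row value; real rows come from the Parquet file.
example = {"bytes": make_silent_wav(), "path": "example.wav"}
info = describe_wav(example)
print(info)
```

Running describe_wav over a handful of real rows is a quick way to check that the clips match the 8 kHz mono description before feeding them to an ASR or intent model.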

Free

Open dataset

Quality: No ratings
Seller: DataBazaar

For AI Agents

Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "MInDS-14: Multilingual Spoken " })
// Found: ababb125-952a-408e-9b06-37ec8e890e96
get_download_url({ dataset_id: "ababb125-952a-408e-9b06-37ec8e890e96" })  // free — no API key needed

Via REST API
# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/ababb125-952a-408e-9b06-37ec8e890e96/download-url
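
The same request can be sketched in Python with the standard library. This is a minimal sketch, assuming the endpoint behaves as the curl command above implies. The response format is not documented on this page, so the body is returned as raw text rather than parsed. The function names are hypothetical.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "ababb125-952a-408e-9b06-37ec8e890e96"


def download_url_endpoint(dataset_id, base=API_BASE):
    """Build the endpoint path used by the curl command above."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id, timeout=30):
    """GET the endpoint and return the raw response body as text.

    The response shape is an unknown here, so nothing is parsed or assumed.
    """
    url = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Only the endpoint is printed here; no network request is made at import time.
print(download_url_endpoint(DATASET_ID))
```

Calling fetch_download_url_response(DATASET_ID) performs the actual request; inspect the returned text before deciding how to extract the download link.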