social, yandex/yambda, recsys, recommendation, music, retrieval, ranking, embeddings, large-scale, parquet, benchmark, multimodal
Yambda-5B: Large-Scale Music Recommendation Dataset (4.79B Interactions)
About this data
An industrial-scale music recommendation dataset from Yandex: 4.79B user-item interactions across 1M users and 9.39M tracks, covering both organic (user-initiated) and recommendation-driven interactions, plus audio embeddings.
Schema
| Name | Type | Description |
|---|---|---|
| uid | UINTEGER | Anonymized user identifier (integer). |
| timestamp | UINTEGER | Interaction time in Unix epoch seconds. |
| item_id | UINTEGER | Anonymized track identifier (integer). |
| is_organic | UTINYINT | 1 if user-initiated, 0 if recommendation-driven. |
| played_ratio_pct | USMALLINT | Percentage of track played (0-100). |
| track_length_seconds | UINTEGER | Track duration in seconds. |
| event_type | VARCHAR | Interaction type: listen, like, dislike, or skip. |
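As a rough illustration of how the columns relate, the sketch below models one row and derives two simple quantities from it: the approximate seconds listened (from `played_ratio_pct` and `track_length_seconds`) and the share of organic rows (from `is_organic`). The names `Interaction`, `seconds_played`, and `organic_share` are hypothetical helpers, not part of the dataset or any DataBazaar tooling, and the sample rows are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One row of the schema above; field names follow the table."""
    uid: int
    timestamp: int
    item_id: int
    is_organic: int            # 1 = user-initiated, 0 = recommendation-driven
    played_ratio_pct: int      # 0-100
    track_length_seconds: int
    event_type: str            # "listen", "like", "dislike", or "skip"


def seconds_played(row: Interaction) -> float:
    """Approximate listening time implied by the played percentage."""
    return row.track_length_seconds * row.played_ratio_pct / 100


def organic_share(rows: list[Interaction]) -> float:
    """Fraction of rows flagged as user-initiated; 0.0 for an empty list."""
    if not rows:
        return 0.0
    return sum(r.is_organic for r in rows) / len(rows)


# Made-up rows for illustration only; these are not taken from the dataset.
rows = [
    Interaction(1, 1_700_000_000, 10, 1, 50, 200, "listen"),
    Interaction(1, 1_700_000_300, 11, 0, 100, 180, "listen"),
]
```

With these invented rows, `seconds_played(rows[0])` works out to half of a 200-second track, and `organic_share(rows)` counts one organic row out of two.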
Free
Open dataset
Quality: No ratings
0 downloads
Seller: DataBazaar
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}
# 2. Your agent can then call:
search_datasets({ query: "Yambda-5B: Large-Scale Music R" })
// Found: a7334d10-496a-4fea-a9a8-a3605e38a34d
get_download_url({ dataset_id: "a7334d10-496a-4fea-a9a8-a3605e38a34d" }) // free, no API key needed
Via REST API
# Free dataset, no API key required:
curl https://api.databazaar.io/datasets/a7334d10-496a-4fea-a9a8-a3605e38a34d/download-url
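The same REST call can be made from Python with only the standard library. This is a minimal sketch: the endpoint path and dataset ID are taken from the curl example, while `download_url_endpoint` and `fetch_download_url_response` are hypothetical helper names. The listing does not document the response format, so the body is returned as raw text rather than parsed.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "a7334d10-496a-4fea-a9a8-a3605e38a34d"


def download_url_endpoint(dataset_id: str, base: str = API_BASE) -> str:
    """Build the download-url endpoint shown in the curl example."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id: str, timeout: float = 30.0) -> str:
    """GET the endpoint and return the raw response body as text.

    Assumption: the body is UTF-8 text. Its structure is not described
    in the listing, so no parsing is attempted here.
    """
    url = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_download_url_response(DATASET_ID)` performs the network request; inspect the returned text before deciding how to extract the actual download link.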