images · imageomics/TreeOfLife-10M · biology, images, taxonomy, clip, multimodal, species, computer-vision, zero-shot, evolutionary-biology, imbalanced
TreeOfLife-10M: 10M Biological Organism Images with Taxonomic Labels
About this data
TreeOfLife-10M is described as the largest ML-ready dataset of biological organism images: more than 10M images covering 454K taxa, each paired with taxonomic labels. It aggregates iNat21, BIOSCAN-1M, and the Encyclopedia of Life, and has been used to train BioCLIP and similar vision-language models.
Schema
| Name | Type | Description |
|---|---|---|
| jpg | STRUCT(bytes BLOB, path VARCHAR) | JPEG/PNG image binary data with file path reference in WebDataset format |
| __key__ | VARCHAR | Unique identifier for the record within the WebDataset shard |
| __url__ | VARCHAR | Source URL or location identifier for the image in the originating dataset |
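As a rough illustration of how a row with this schema might be consumed, the sketch below takes a record shaped like the table above (a `jpg` struct with `bytes` and `path`, plus `__key__` and `__url__`) and inspects the leading magic bytes to tell JPEG from PNG, since the column description mentions both. The helper names `sniff_image_format` and `summarize_record`, the dict representation of a row, and the sample values are all assumptions made for illustration; they are not part of any dataset tooling.

```python
from typing import Optional

# Standard file signatures (magic bytes) for the two formats the schema mentions.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return "jpeg" or "png" based on the leading bytes, else None."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None


def summarize_record(record: dict) -> dict:
    """Flatten one row (hypothetical dict form of the schema) into a small summary."""
    image = record["jpg"]
    return {
        "key": record["__key__"],
        "url": record["__url__"],
        "path": image["path"],
        "format": sniff_image_format(image["bytes"]),
        "size_bytes": len(image["bytes"]),
    }


# Made-up record for illustration only; real keys, paths, and URLs will differ.
example = {
    "jpg": {"bytes": PNG_MAGIC + b"\x00" * 8, "path": "example.jpg"},
    "__key__": "example-key",
    "__url__": "example-shard.tar",
}
summary = summarize_record(example)
print(summary["format"], summary["size_bytes"])
```

Checking the bytes rather than trusting the `jpg` column name or the file extension is a defensive choice: it costs a few comparisons per row and avoids handing a PNG to a JPEG-only decoder.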
Free
Open dataset
Seller: DataBazaar
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}
# 2. Your agent can then call:
search_datasets({ query: "TreeOfLife-10M: 10M Biological" })
// Found: 8755ec7b-6629-44d1-ad8b-71eecac11545
get_download_url({ dataset_id: "8755ec7b-6629-44d1-ad8b-71eecac11545" }) // free, no API key needed
Via REST API
# Free dataset, no API key required:
curl https://api.databazaar.io/datasets/8755ec7b-6629-44d1-ad8b-71eecac11545/download-url
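The same request can be sketched in Python using only the standard library. The URL pattern is taken from the curl command above. The shape of the response is not documented here, so this sketch returns the raw body as text rather than assuming particular JSON fields. The function names are illustrative, not part of any DataBazaar client.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"  # host taken from the curl example
DATASET_ID = "8755ec7b-6629-44d1-ad8b-71eecac11545"


def download_url_endpoint(dataset_id: str, base: str = API_BASE) -> str:
    """Build the download-url endpoint for a dataset ID."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id: str, timeout: float = 30.0) -> str:
    """GET the endpoint and return the response body as text.

    The response format is unknown here, so no parsing is attempted.
    """
    endpoint = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the endpoint is purely local; no request is sent on this line.
print(download_url_endpoint(DATASET_ID))

# The call below performs a real network request, so it is left commented out:
# print(fetch_download_url_response(DATASET_ID))
```

Keeping URL construction separate from the network call makes the endpoint easy to check without connectivity and leaves room to add response parsing once the actual format is known.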