Yeast Protein Localization (UCI/OpenML)
About this data
Classic multi-class classification benchmark for predicting the cellular localization site of yeast proteins from 8 sequence-derived numeric features. It contains 1,484 instances across 10 imbalanced classes.
Schema
| Name | Type | Description |
|---|---|---|
| mcg | DOUBLE | Signal sequence recognition score from McGeoch's method, range [0–1]. |
| gvh | DOUBLE | Signal sequence recognition score from von Heijne's method, range [0–1]. |
| alm | DOUBLE | ALOM membrane-spanning region prediction score, range [0–1]. |
| mit | DOUBLE | Discriminant analysis score for mitochondrial targeting in N-terminal region, range [0–1]. |
| erl | DOUBLE | Binary indicator of HDEL ER retention signal presence; takes two distinct values (stored as 0.5 or 1 in the original UCI file). |
| pox | DOUBLE | Peroxisomal targeting signal score in C-terminus, range [0–1]. |
| vac | DOUBLE | Discriminant analysis score for vacuolar/extracellular targeting, range [0–1]. |
| nuc | DOUBLE | Discriminant analysis score for nuclear localization signals, range [0–1]. |
| class_protein_localization | VARCHAR | Cellular localization site: CYT, NUC, MIT, ME3, ME2, ME1, EXC, VAC, POX, or ERL. |
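The schema above can be checked mechanically once the file is downloaded. The following is a minimal sketch using only the Python standard library. It assumes the download is a CSV with a header row matching the column names in the table, which this page does not confirm. The function name `parse_yeast_csv` is hypothetical and the two sample rows are made up for illustration.

```python
import csv
import io

# Column names and class labels as listed in the schema table.
FEATURES = ["mcg", "gvh", "alm", "mit", "erl", "pox", "vac", "nuc"]
TARGET = "class_protein_localization"
CLASSES = {"CYT", "NUC", "MIT", "ME3", "ME2", "ME1", "EXC", "VAC", "POX", "ERL"}


def parse_yeast_csv(text):
    """Parse CSV text into (feature_rows, labels), validating against the schema."""
    reader = csv.DictReader(io.StringIO(text))
    missing = set(FEATURES + [TARGET]) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    X, y = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        values = [float(row[name]) for name in FEATURES]
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError(f"line {line_no}: feature outside [0, 1]")
        label = row[TARGET].strip()
        if label not in CLASSES:
            raise ValueError(f"line {line_no}: unknown class {label!r}")
        X.append(values)
        y.append(label)
    return X, y


# Two made-up rows, for illustration only (not taken from the dataset).
sample = (
    "mcg,gvh,alm,mit,erl,pox,vac,nuc,class_protein_localization\n"
    "0.50,0.60,0.40,0.10,0.5,0.0,0.50,0.20,MIT\n"
    "0.40,0.70,0.50,0.20,0.5,0.0,0.50,0.30,CYT\n"
)
X, y = parse_yeast_csv(sample)
print(len(X), len(X[0]), y)
```

For the real file, the same function could be called on the contents of the downloaded CSV. Counting the labels in `y` with `collections.Counter` is a quick way to inspect the class distribution before choosing an evaluation metric.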
Free
Seller: DataBazaar
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}
# 2. Your agent can then call:
search_datasets({ query: "Yeast Protein Localization (UC" })
// Found: 2ab35349-0b96-4826-aadf-5d4f29eb2510
get_download_url({ dataset_id: "2ab35349-0b96-4826-aadf-5d4f29eb2510" }) // free, no API key needed
Via REST API
# Free dataset, no API key required:
curl https://api.databazaar.io/datasets/2ab35349-0b96-4826-aadf-5d4f29eb2510/download-url
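The same request can be made from Python. This is a rough sketch using only the standard library. The helper names `download_url_endpoint` and `fetch_download_url_response` are hypothetical, and because this page does not document the response format, the body is returned as raw text (assumed to be UTF-8) without guessing at field names.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "2ab35349-0b96-4826-aadf-5d4f29eb2510"


def download_url_endpoint(dataset_id, base=API_BASE):
    """Build the download-url endpoint shown in the curl example."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id, timeout=30):
    """GET the endpoint and return the raw response body as text.

    The response schema is not documented on this page, so the body is
    returned unparsed. UTF-8 decoding is an assumption.
    """
    url = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


print(download_url_endpoint(DATASET_ID))
# To perform the request (needs network access):
# print(fetch_download_url_response(DATASET_ID))
```

Keeping URL construction separate from the network call makes the endpoint easy to check without contacting the server.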