textnvidia/Nemotron-Math-v2mathematicsreasoningllm-trainingdistillationlong-contexttool-usenvidiafine-tuningsynthetic-datacot
Nemotron-Math-v2: Mathematical Reasoning Trajectories (NVIDIA)
About this data
NVIDIA's 347K math problems with 7M model-generated reasoning trajectories for distilling mathematical reasoning. Long-context, tool-use, multi-mode supervision. CC-BY-4.0/CC-BY-SA-4.0.
Schema
| Name | Type | Description |
|---|---|---|
| uuid | VARCHAR | Unique identifier (UUID v4 format) for the record. |
| expected_answer | VARCHAR | Final numerical or symbolic answer to the mathematical problem. |
| problem | VARCHAR | Mathematical problem statement in LaTeX or plain text format. |
| original_expected_answer | VARCHAR | Initial expected answer before any corrections or majority voting. |
| changed_answer_to_majority | BOOLEAN | Boolean flag indicating if answer was updated to match majority model consensus. |
| data_source | VARCHAR | Origin dataset or problem collection (e.g., aops, competition, textbook). |
| messages | STRUCT("role" VARCHAR, "content" VARCHAR, reasoning_content VARCHAR, tool_calls STRUCT(id VARCHAR, "type" VARCHAR, "function" STRUCT("name" VARCHAR, arguments VARCHAR))[], tool_call_id VARCHAR, "name" VARCHAR)[] | Array of conversational turns with role, content, reasoning traces, and tool invocations. |
| used_in | VARCHAR[] | Array of dataset splits or benchmarks this record appears in. |
| metadata | STRUCT(reason_low_with_tool STRUCT(count BIGINT, pass BIGINT, accuracy DOUBLE), reason_low_no_tool STRUCT(count BIGINT, pass BIGINT, accuracy DOUBLE), reason_medium_with_tool STRUCT(count BIGINT, pass BIGINT, accuracy DOUBLE), reason_medium_no_tool STRUCT(count BIGINT, pass BIGINT, accuracy DOUBLE), reason_high_with_tool STRUCT(count BIGINT, pass BIGINT, accuracy DOUBLE), reason_high_no_tool STRUCT(count BIGINT, pass BIGINT, accuracy DOUBLE)) | Nested counts and accuracy metrics by reasoning difficulty level and tool availability. |
| license | VARCHAR | License identifier governing dataset usage (e.g., CC-BY-4.0, CC-BY-SA-4.0). |
| tools | STRUCT("type" VARCHAR, "function" STRUCT("name" VARCHAR, description VARCHAR, parameters STRUCT("type" VARCHAR, properties STRUCT(code STRUCT("type" VARCHAR, description VARCHAR)), required VARCHAR[])))[] | Array of available function definitions with parameters for tool-use trajectories. |
| url | VARCHAR | Source URL or reference link for the problem. |
| user_name | VARCHAR | Username or author identifier of problem contributor or source. |
| user_url | VARCHAR | Profile or homepage URL of the user who contributed the problem. |
Sample Data
Preview a sample of the data before downloading.
Free
Open dataset
Quality: No ratings
0 downloads
Seller: DataBazaar
Agent? No sign-up needed →
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
"mcpServers": {
"databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
}
}
# 2. Your agent can then call:
search_datasets({ query: "Nemotron-Math-v2: Mathematical" })
// Found: 17600013-74d6-45d5-b618-c2f8b0afe9c8
get_download_url({ dataset_id: "17600013-74d6-45d5-b618-c2f8b0afe9c8" }) // free — no API key neededVia REST API
# Free dataset — no API key required: curl https://api.databazaar.io/datasets/17600013-74d6-45d5-b618-c2f8b0afe9c8/download-url