SQuAD 1.1 — Stanford Question Answering Dataset

Creator: DataBazaar
Keywords: rajpurkar/squad, question-answering, reading-comprehension, extractive-qa, nlp, benchmark, wikipedia, english, rag-eval

About this data

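More than 100,000 crowdsourced question-answer pairs written against Wikipedia passages, where each answer is a span of text in the passage. The canonical extractive QA benchmark for reading comprehension, also used for RAG evaluation and fine-tuning.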

Schema

Name	Type	Description
id	VARCHAR	Unique alphanumeric identifier for the question-answer pair.
title	VARCHAR	Wikipedia article title from which the passage was sourced.
context	VARCHAR	Full text passage from Wikipedia containing the answer.
question	VARCHAR	Natural language question crowdsourced by annotators.
answers	STRUCT("text" VARCHAR[], answer_start INTEGER[])	Parallel arrays: text holds the acceptable answer strings, and answer_start holds the character offset at which each one begins within the context.
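
The schema above can be read as a nested record. Here is a minimal Python sketch of how one row might be represented, and how the answer_start offsets line up with context. The type names (Answers, SquadRow), the helper answer_spans, and the example row are all hypothetical and made up for illustration. The sketch also assumes the offsets count characters the way Python string indexing does.

```python
from typing import TypedDict


class Answers(TypedDict):
    # Parallel lists: text[i] begins at character answer_start[i] of the context.
    text: list[str]
    answer_start: list[int]


class SquadRow(TypedDict):
    id: str
    title: str
    context: str
    question: str
    answers: Answers


def answer_spans(row: SquadRow) -> list[tuple[int, int, str]]:
    """Return (start, end, text) per answer, checking each span against the context."""
    context = row["context"]
    spans = []
    for text, start in zip(row["answers"]["text"], row["answers"]["answer_start"]):
        end = start + len(text)
        if context[start:end] != text:
            raise ValueError(f"answer {text!r} not found at offset {start}")
        spans.append((start, end, text))
    return spans


# Made-up row for illustration only; it is not taken from the dataset.
example_row: SquadRow = {
    "id": "example-0001",
    "title": "Example_Article",
    "context": "The river runs north for forty miles before reaching the sea.",
    "question": "How far does the river run before reaching the sea?",
    "answers": {"text": ["forty miles"], "answer_start": [25]},
}

spans = answer_spans(example_row)
```

A check like this is a cheap sanity test after loading the data, since a mismatch usually means the context was re-encoded or trimmed somewhere along the way.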

Free

Open dataset

Quality: No ratings

1 download

Seller: DataBazaar

AI agents can access this dataset without signing up.

For AI Agents

Via MCP Server

# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "SQuAD 1.1 — Stanford Question " })
// Found: 23152e83-2af2-4c0a-9f85-e53f5f9a83ae
get_download_url({ dataset_id: "23152e83-2af2-4c0a-9f85-e53f5f9a83ae" })  // free — no API key needed

Via REST API

# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/23152e83-2af2-4c0a-9f85-e53f5f9a83ae/download-url
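
The same request could be made from Python using only the standard library. This is a rough sketch rather than an official client: the function names are hypothetical, and because the listing does not document the response format, the body is returned as raw text instead of being parsed.

```python
from urllib.request import urlopen

API_BASE = "https://api.databazaar.io"
DATASET_ID = "23152e83-2af2-4c0a-9f85-e53f5f9a83ae"


def download_url_endpoint(dataset_id: str, base: str = API_BASE) -> str:
    """Build the download-url endpoint shown in the curl example."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id: str, timeout: float = 30.0) -> str:
    """GET the endpoint and return the raw body (response format is an unknown here)."""
    with urlopen(download_url_endpoint(dataset_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (performs a network request, so it is left commented out):
# print(fetch_download_url_response(DATASET_ID))
```

Inspect the returned text once before writing any parsing logic around it, since the shape of the payload is not described on this page.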