About & Methodology

How this data is collected, what it covers, and what it doesn't

Important: Data Coverage Is Not Uniform

PolicyDhara's dataset is not a complete or balanced record of Indian policymaking. The number of policies shown for any government era reflects what our sources have digitally available, not the actual volume of policymaking during that period.

40 policies tracked pre-2014 (2%)

2,005 policies tracked 2014 onwards (98%)

This imbalance exists because most government digital portals (PIB, India Code, ministry websites) have significantly better archives for recent years. Older UPA-era and pre-UPA policies are underrepresented — not because fewer policies were made, but because fewer are available in machine-readable formats online.

Do not interpret higher counts in recent eras as evidence of more policy activity. India has always been an active legislating democracy. The difference is in digital data availability, not governance output.
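For readers who want to check the percentages quoted above, the arithmetic can be reproduced in a few lines. This is only an illustrative sketch: the two counts come from this page, while the `share` helper is a hypothetical name and not part of the PolicyDhara codebase.

```python
# Counts taken from the figures quoted in this section.
pre_2014 = 40
from_2014 = 2005

total = pre_2014 + from_2014


def share(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


print(f"pre-2014: {share(pre_2014, total)}%")
print(f"2014 onwards: {share(from_2014, total)}%")
```

The shares describe what has been collected, not what was enacted, so they will shift as older records are added.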

What is PolicyDhara?

PolicyDhara is an open-source platform that tracks Indian development policies across sectors, states, and time. It aggregates legislation, government schemes, budget announcements, notifications, and policy research from official sources into a single searchable database.

The name "Dhara" (meaning "flow" or "stream" in Hindi/Sanskrit) reflects the continuous flow of policy actions that shape India's development trajectory. PolicyDhara is a project by ImpactMojo, which provides free, highest-quality development sector know-how.

Data Sources

PolicyDhara collects from the following official and quasi-official sources:

Press Information Bureau (PIB): Central government press releases and policy announcements

India Code (Legislative Department): Acts of Parliament and legislative texts

PRS Legislative Research: Bill analyses, legislative briefs, and parliamentary data

NITI Aayog: Policy papers, reports, and development research

Reserve Bank of India (RBI): Monetary policy, financial regulation, and economic data

Ministry of Finance / Budget Division: Union Budget documents and fiscal policy

State PIB Offices: State-level policy announcements (coverage varies by state)

Gazette of India / eGazette: Official notifications and statutory orders

Known Biases & Limitations

Users should be aware of the following when interpreting this data:

Recency bias in digital archives

Digital archives are far more complete for 2014-present than for earlier periods. Government websites frequently restructure, and older content is often lost or inaccessible. The PIB digital archive before ~2012 is particularly sparse. This means UPA-era policy counts are artificially low compared to NDA-era counts.

Central government overrepresentation

State-level policy coverage is limited to states with accessible digital portals. Many state governments publish in regional languages or in formats that are harder to aggregate. Central government actions are overrepresented relative to state actions.

Announcement vs. implementation gap

Most entries capture policy announcements and official statements. What was actually implemented, funded, or achieved may differ significantly from what was announced. Our impact annotations are best-effort estimates, not rigorous evaluations.

Automated sector classification

Policies are classified into sectors using keyword matching. Some policies may be miscategorized or missing from relevant sectors. Cross-cutting policies (e.g., Digital India affects multiple sectors) may not appear in all relevant categories.
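To make this limitation concrete, here is a minimal sketch of keyword-based sector classification. The `SECTOR_KEYWORDS` table, the `classify` function, and the sample titles are all hypothetical illustrations; the actual rules are in the GitHub repository and may work differently.

```python
# Hypothetical keyword lists; the real rules live in the project's repository.
SECTOR_KEYWORDS = {
    "health": ["hospital", "vaccine", "health"],
    "education": ["school", "university", "education"],
    "digital": ["digital", "broadband", "internet"],
}


def classify(title):
    """Return every sector whose keywords appear in the title (case-insensitive)."""
    text = title.lower()
    return sorted(
        sector
        for sector, keywords in SECTOR_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


# A cross-cutting title can match more than one sector.
print(classify("Digital classrooms for rural schools"))

# A relevant title with no listed keyword matches nothing at all.
print(classify("Telemedicine rollout in districts"))
```

In this sketch the second title is clearly health-related but is assigned to no sector, because none of the listed keywords occurs in it. That is the kind of miss described above.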

English language bias

The pipeline primarily processes English-language sources. Policies published only in Hindi or regional languages may be underrepresented or missing entirely.

PIB as primary source

The PIB is by far the largest source in the dataset. PIB releases tend to present government actions positively, so policies that were withdrawn, failed, or proved controversial may receive less coverage than successful initiatives.

Collection Methodology

A Python-based pipeline runs every 6 hours via GitHub Actions. It scrapes and parses content from official government websites, classifies policies by sector and type using keyword matching, deduplicates entries, and stores results as JSON files. The Astro static site then builds pages from this data.
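The deduplicate-and-store step might look roughly like the sketch below. It assumes each scraped entry is a dictionary with `title` and `date` fields and that duplicates are detected by hashing a normalised title together with the date. Those field names, the `dedup_key` and `deduplicate` helpers, and the placeholder entries are assumptions made for illustration, not the pipeline's actual logic.

```python
import hashlib
import json


def dedup_key(entry):
    """Hash a normalised title plus date so trivially different copies collide."""
    normalised = " ".join(entry["title"].lower().split())
    return hashlib.sha256(f"{normalised}|{entry['date']}".encode("utf-8")).hexdigest()


def deduplicate(entries):
    """Keep the first entry seen for each key, preserving input order."""
    seen = set()
    unique = []
    for entry in entries:
        key = dedup_key(entry)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


# Placeholder entries, not real policy records.
scraped = [
    {"title": "Example Scheme Launched", "date": "2020-01-01", "source": "PIB"},
    {"title": "example  scheme launched", "date": "2020-01-01", "source": "eGazette"},
    {"title": "Example Scheme Launched", "date": "2021-06-15", "source": "PIB"},
]

unique = deduplicate(scraped)

# Serialise to JSON text, the storage format the static site build reads.
serialised = json.dumps(unique, indent=2, ensure_ascii=False)
print(len(unique))
```

Keying on title and date together is a deliberate trade-off in this sketch: it merges the same announcement picked up from two sources, but keeps a same-titled policy announced on a different date as a separate entry.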

The pipeline is fully open-source and auditable. Anyone can inspect the scraping logic, classification rules, and raw data on the GitHub repository.

Help Improve Coverage

If you notice missing policies, incorrect classifications, or have access to historical policy data — especially for UPA-era or pre-2004 policies — contributions are welcome:

Submit issues or pull requests on the GitHub repository
Suggest new data sources, especially for state-level and historical policies
Help digitise older policy documents that exist only in PDF or physical form
Report misclassifications or incorrect dates

Subscribe & Contribute

Get updates via the RSS feed. Download data from the API & Data Export page. Contributions and bug reports are welcome on GitHub.

Built with Astro, Python, and GitHub Actions. By ImpactMojo — free, highest quality development sector know-how. This project is non-partisan and does not receive funding from any political party or government body.