About & Methodology
How this data is collected, what it covers, and what it doesn't
Important: Data Coverage Is Not Uniform
PolicyDhara's dataset is not a complete or balanced record of Indian policymaking. The number of policies shown for any government era reflects what our sources have digitally available, not the actual volume of policymaking during that period.
This imbalance exists because most government digital portals (the Press Information Bureau (PIB), India Code, ministry websites) have significantly better archives for recent years. Older UPA-era and pre-UPA policies are underrepresented, not because fewer policies were made, but because fewer are available online in machine-readable formats.
Do not interpret higher counts in recent eras as evidence of more policy activity. India has always been an active legislating democracy. The difference is in digital data availability, not governance output.
What is PolicyDhara?
PolicyDhara is an open-source platform that tracks Indian development policies across sectors, states, and time. It aggregates legislation, government schemes, budget announcements, notifications, and policy research from official sources into a single searchable database.
The name "Dhara" (meaning "flow" or "stream" in Hindi/Sanskrit) reflects the continuous flow of policy actions that shape India's development trajectory. PolicyDhara is a project by ImpactMojo, which provides free, highest-quality development sector know-how.
Data Sources
PolicyDhara collects data from official and quasi-official sources, including the PIB, India Code, and ministry websites.
Known Biases & Limitations
Users should be aware of the following when interpreting this data:
Digital archives are far more complete for 2014 to the present than for earlier periods. Government websites are frequently restructured, and older content is often lost or inaccessible. The PIB digital archive before ~2012 is particularly sparse. As a result, UPA-era policy counts are artificially low compared to NDA-era counts.
State-level policy coverage is limited to states with accessible digital portals. Many state governments publish in regional languages or in formats that are harder to aggregate. Central government actions are overrepresented relative to state actions.
Most entries capture policy announcements and official statements. Whether a policy was actually implemented, funded, or had its intended impact may differ significantly from the announcement. Our impact annotations are best-effort estimates, not rigorous evaluations.
Policies are classified into sectors using keyword matching. Some policies may be miscategorized or missing from relevant sectors. Cross-cutting policies (e.g., Digital India, which affects multiple sectors) may not appear in all relevant categories.
The pipeline primarily processes English-language sources. Policies published only in Hindi or regional languages may be underrepresented or missing entirely.
The PIB is by far the largest source in the dataset. PIB releases tend to present government actions positively. Policies that were withdrawn, failed, or proved controversial may receive less PIB coverage than successful initiatives.
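To make the keyword-matching limitation above concrete, here is a minimal sketch of how such a classifier might behave. The keyword lists, the `SECTOR_KEYWORDS` name, and the `classify_sectors` function are hypothetical illustrations, not the project's actual rules, which can be inspected in the repository.

```python
# Hypothetical keyword lists for illustration only; the real
# classification rules live in the project's repository.
SECTOR_KEYWORDS = {
    "health": ["health", "hospital", "vaccine"],
    "education": ["school", "education", "scholarship"],
    "digital": ["digital", "broadband", "e-governance"],
}


def classify_sectors(title: str) -> list[str]:
    """Return every sector with at least one keyword in the title.

    Matching is a case-insensitive substring test, so a title is only
    tagged with sectors whose keywords literally appear in it.
    """
    text = title.lower()
    return [
        sector
        for sector, keywords in SECTOR_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


# A cross-cutting title is tagged only with sectors whose keywords occur.
print(classify_sectors("Digital India programme expanded to rural schools"))

# A health-related title with none of the listed keywords gets no sector.
print(classify_sectors("Telemedicine rollout for remote districts"))
```

In this sketch the second title is clearly health-related but matches no keyword, so it would be missing from the health sector. That is the kind of gap the limitation describes.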
Collection Methodology
A Python-based pipeline runs every 6 hours via GitHub Actions. It scrapes and parses content from official government websites, classifies policies by sector and type using keyword matching, deduplicates entries, and stores results as JSON files. The Astro static site then builds pages from this data.
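The deduplication and JSON storage steps described above could look roughly like the following sketch. It assumes each scraped entry is a dictionary with `title`, `date`, and `source` fields and that duplicates are detected by normalised title plus date. Both the field names and the matching rule are assumptions made for illustration, and the sample entries are invented, not real records.

```python
import json
import tempfile
from pathlib import Path


def dedupe_key(entry: dict) -> tuple[str, str]:
    """Build a key from the lowercased, whitespace-normalised title and the date."""
    title = " ".join(entry["title"].lower().split())
    return (title, entry["date"])


def deduplicate(entries: list[dict]) -> list[dict]:
    """Keep the first entry seen for each key, preserving input order."""
    seen = set()
    unique = []
    for entry in entries:
        key = dedupe_key(entry)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


def save_json(entries: list[dict], path: Path) -> None:
    """Write entries as UTF-8 JSON, keeping non-ASCII text readable."""
    path.write_text(
        json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8"
    )


# Invented sample entries; the second repeats the first with different spacing.
scraped = [
    {"title": "Sample Rural Housing Scheme", "date": "2021-01-01", "source": "PIB"},
    {"title": "sample  rural housing scheme ", "date": "2021-01-01", "source": "ministry"},
    {"title": "Sample Skills Notification", "date": "2021-02-01", "source": "India Code"},
]

unique = deduplicate(scraped)
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "policies.json"
    save_json(unique, out_path)
    reloaded = json.loads(out_path.read_text(encoding="utf-8"))

print(len(scraped), "scraped ->", len(reloaded), "stored")
```

Keeping the first occurrence means the source listed earliest wins when two portals publish the same item. A real pipeline might instead merge the sources or prefer a particular one.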
The pipeline is fully open-source and auditable. Anyone can inspect the scraping logic, classification rules, and raw data on the GitHub repository.
Help Improve Coverage
If you notice missing policies or incorrect classifications, or have access to historical policy data (especially for UPA-era or pre-2004 policies), contributions are welcome:
- Submit issues or pull requests on the GitHub repository
- Suggest new data sources, especially for state-level and historical policies
- Help digitise older policy documents that exist only in PDF or physical form
- Report misclassifications or incorrect dates
Subscribe & Contribute
Get updates via the RSS feed. Download data from the API & Data Export page. Contributions and bug reports are welcome on GitHub.
Built with Astro, Python, and GitHub Actions. By ImpactMojo — free, highest quality development sector know-how. This project is non-partisan and does not receive funding from any political party or government body.