Artificial Intelligence (AI) is transforming how we live and work — writing emails, generating art, and answering questions in seconds. But beyond these high-tech breakthroughs, a quieter shift is underway. One that isn’t about creating something new but about unlocking what already exists.
Across India, decades of public data, from parliamentary debates and census records to ministry reports, sit scattered in PDFs, scanned documents, and hard-to-navigate archives. Technically public, yet practically out of reach for most people.
For journalists chasing facts, researchers building studies, or citizens trying to understand policy, accessing this information has long meant hours and sometimes days of manual searching.
This is the gap that Factly, a Hyderabad-based research and data journalism organisation, has been working to bridge since 2016. And today, with its AI-powered tools Dataful and Tagore AI, it is rethinking what access to public data can actually look like.
A vision rooted in transparency
Factly’s story begins with one man’s conviction that transparency is the foundation of democracy.
Rakesh Dubbudu, 42, an engineer-turned-transparency-campaigner, has been on this journey for years. Following his engineering education, he became deeply involved with India’s Right to Information (RTI) movement, working tirelessly to hold government systems accountable. This work exposed him to both the potential and the frustrations of public data.
“I was part of the larger Right to Information movement,” Rakesh tells The Better India. “That’s where I realised the immense power of information and how inaccessible it really was for most people.”
Public data may exist in abundance, but without access and usability, it remains out of reach for those who need it most.
Even when data was technically available, it was often fragmented, inconsistently formatted, or buried in scanned documents. For citizens, researchers, and even journalists, this made meaningful use difficult.
So in 2016, Rakesh started Factly with a simple goal: make public data understandable and usable.
What began as explainers and fact-checking gradually evolved into something deeper. Behind the scenes, the team — now 30+ people across research, data, and technology — started building a structured ecosystem of public data.
Parliamentary records, ministry reports, economic indicators — cleaned, standardised, and stitched together over time.
“As technology evolved, we saw an opportunity to do more,” Rakesh says. “We realised we could make public data significantly easier for people to access and use.”
This realisation led to the creation of Tagore AI and Dataful, two AI-driven tools designed to simplify how people interact with India’s public records.
From scattered information to real-world use
The impact of this work becomes most visible not in the tools themselves — but in who is using them.
Today, Dataful alone has over 40,000 users. And they’re not just data journalists. Its users span several categories:
- Government officials are signing up with official email IDs to download datasets, a sign that even internal stakeholders are looking for simpler ways to access public data
- PhD scholars and academic faculty are using datasets on agriculture, elections, and sustainability reporting
- NGOs and social sector organisations are working with CSR, health, and development indicators
- Consultancies and fintech companies are analysing vehicle registrations, payment data, and economic trends
In other words, the same data that once sat scattered across portals is now being used across academia, policy, social impact, and industry.
And then there are the stories it helps tell.
From journalists to policymakers, structured data is enabling sharper decisions, deeper research, and stronger stories.
Newsrooms like The Quint and The Hindu have used Dataful to report on unemployment, crimes against women, student suicides, and Indian students abroad — stories that depend on patterns over time, not just isolated data points.
For journalists like Abhishek Anand, Senior Correspondent at The Quint, that structure has been transformative.
“Before Dataful, we had to go to individual government websites for everything,” he says. “Finding one data point in a 600-page report was exhausting.”
During election coverage, this challenge became even more pronounced.
“Dataful helped us cut through that noise,” Abhishek explains. “Instead of reacting to claims, we could show what the data actually says over time.”
For Shashidhar KG, Managing Editor at MediaNama, Dataful offered something equally valuable: breadth.
“As a tech policy publication, we track certain regulators like RBI or TRAI very closely,” he explains. “But other ministries don’t publish data as neatly or periodically.”
Dataful’s consolidated datasets helped MediaNama expand beyond their usual regulatory beats — allowing reporters to quickly add credible data points to stories on cybercrime, digital payments, or consumer protection.
“It evens the field,” Shashidhar says. “You’re no longer limited to the few regulators who are organised with data.”
Where Dataful organises complexity, Tagore AI simplifies discovery — together, turning information into insight.
Rather than requiring newsrooms to invest heavily in data journalism, Dataful makes it easier for reporters to strengthen their stories using verified data, context, and long-term trends.
“At the end of the day, data helps you make stronger arguments,” he says. “It helps you see patterns, not just incidents.”
For both journalists, the biggest shift wasn’t just speed — it was focus.
“When you’re not spending most of your time looking for data,” Abhishek reflects, “you can spend it thinking more deeply about the story.”
In the policy space, datasets have been used by institutions like the Ministry of Environment, Forest and Climate Change and the Wildlife Institute of India — for instance, analysing railway data to study wildlife collisions.
And in academia, the data has been cited in research, including studies on crop diversity.
Two tools, two different problems
Factly’s approach works because it tackles two related but distinct gaps.
Dataful focuses on structured datasets. It takes messy, inconsistent government data and turns it into clean, analysis-ready formats across 50+ sectors. Tagore AI focuses on discovery.
Instead of searching through multiple portals, users can ask questions in plain language and get answers based on official records — including parliamentary questions and answers since 1952 and PIB releases since 1947.
So if Dataful helps you work with data, Tagore AI helps you find meaning in information.
“By grounding Tagore AI in government data, we ensure the information is accurate and trustworthy,” Rakesh explains. “It’s better to say, ‘this data doesn’t exist,’ than to risk misinformation.”
Together, they reduce the biggest friction point: usability.
Why existing platforms don’t fully solve this
There’s no shortage of data platforms today — from government portals to global aggregators.
But most of them solve only one part of the problem. Platforms like the Open Government Data portal or Google Dataset Search help users find data. Others like Our World in Data provide curated insights.
What they don’t do consistently is make data analysis-ready.
“Government datasets are often fragmented, inconsistent across years, and lack time-series continuity. Users still have to download, clean, and standardise everything themselves,” adds Rakesh.
That’s the gap Dataful fills, by doing that work upfront.
And while those platforms don’t typically help with navigating official documents, Tagore AI allows users to trace how the government has responded to issues over decades — without manually sifting through PDFs.
The shift is subtle, but important: from accessing data to actually using it.
The invisible work behind the tools
Making this possible required more than just AI. A large part of Factly’s work is still deeply manual.
Government data comes with challenges — inconsistent formats, changing definitions, missing years, and non-machine-readable files. To deal with this, the team combines domain expertise with internal tools.
Researchers identify sources across ministries and reports. Data analysts clean and standardise them. Engineers build systems for validation and automation.
Behind every clean dataset lies hours of invisible effort — cleaning, verifying, and stitching together fragmented records.
Some datasets are updated automatically — like daily AQI data extracted from PDFs. Others, like PIB releases, are updated multiple times a day.
And when data isn’t publicly available, the team files RTI requests to obtain it.
This mix of manual effort and automation is what makes the system work at scale.
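For readers curious what the cleaning and standardising step might look like in practice, here is a minimal Python sketch. It is an illustration only, not Factly's actual pipeline: the sample rows, the alias tables, and the function names `standardise` and `missing_years` are all hypothetical. It shows three of the problems mentioned above: inconsistent column names, inconsistent spellings, and missing years.

```python
# Hypothetical raw rows, as they might appear in reports from different years.
RAW_ROWS = [
    {"State": "Orissa", "Year": "2010-11", "Value": "1,200"},
    {"state_name": "ODISHA", "year": "2011", "value": "1350"},
    {"State/UT": "Odisha ", "Year": "2013", "Value": "1,500"},
]

# Hypothetical lookup tables mapping variant headers and names to one standard form.
COLUMN_ALIASES = {
    "state": "state",
    "state_name": "state",
    "state/ut": "state",
    "year": "year",
    "value": "value",
}
STATE_ALIASES = {"orissa": "Odisha", "odisha": "Odisha"}


def standardise(row):
    """Return one row with canonical column names, state names, and numeric types."""
    clean = {}
    for key, val in row.items():
        canonical = COLUMN_ALIASES[key.strip().lower()]
        clean[canonical] = val.strip()
    return {
        "state": STATE_ALIASES[clean["state"].lower()],
        # "2010-11" is treated as 2010, the first year of the range.
        "year": int(clean["year"][:4]),
        # Remove thousands separators before converting to a number.
        "value": int(clean["value"].replace(",", "")),
    }


def missing_years(rows):
    """List the years absent between the earliest and latest year present."""
    years = sorted(r["year"] for r in rows)
    present = set(years)
    return [y for y in range(years[0], years[-1] + 1) if y not in present]


rows = [standardise(r) for r in RAW_ROWS]
gaps = missing_years(rows)
print(rows)
print("Missing years:", gaps)
```

In this toy example the gap check flags 2012, the kind of hole that a team would then need to fill from another source or mark explicitly as unavailable.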
Keeping AI grounded in reality
Working with AI brings its own risks — especially around accuracy.
Factly’s solution is simple: limit what the AI can do. Both Dataful and Tagore AI rely only on official government data. The system does not pull from the open internet or generate speculative answers.
Instead, it responds only using structured databases and indexed documents within the platform.
“This significantly reduces the risk of misinformation or hallucinations. The trade-off is that if the data doesn’t exist, the tool may not provide an answer,” says Rakesh.
But that’s intentional, because in this case, reliability matters more than completeness.
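The idea of answering only from indexed records, and declining otherwise, can be shown with a toy Python sketch. This is an assumption-laden illustration, not how Tagore AI is built: the two indexed records, the keyword-overlap scoring, the `min_overlap` threshold, and the `answer` function are all hypothetical stand-ins for a real retrieval system.

```python
import re

# Hypothetical indexed records; a real system would index official documents.
INDEX = [
    {
        "source": "Parliament Q&A (hypothetical)",
        "text": "Reply on air quality monitoring stations in cities",
    },
    {
        "source": "PIB release (hypothetical)",
        "text": "Statement on digital payments growth",
    },
]

NO_ANSWER = "No matching official record found."
STOPWORDS = {"the", "on", "in", "of", "a", "what", "is", "about", "and", "to"}


def tokens(text):
    """Lowercase the text and return its set of words, minus common stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def answer(question, min_overlap=2):
    """Return the best-matching indexed record, or decline if nothing matches well."""
    q = tokens(question)
    best, best_score = None, 0
    for doc in INDEX:
        score = len(q & tokens(doc["text"]))
        if score > best_score:
            best, best_score = doc, score
    # Decline rather than guess when the index has no sufficiently close record.
    if best is None or best_score < min_overlap:
        return NO_ANSWER
    return f'{best["text"]} [source: {best["source"]}]'


print(answer("What about digital payments?"))
print(answer("cricket scores today"))
```

The first question matches an indexed record and is returned with its source; the second has no match, so the sketch declines instead of producing an answer from outside the index.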
Making public data actually public
At its core, Factly’s work is less about AI — and more about access.
“People often confuse availability with accessibility,” Rakesh says. “Just because data exists doesn’t mean it’s usable.”
By turning scattered records into searchable tools and structured datasets, platforms like Dataful and Tagore AI are changing how people interact with information.
They’re helping journalists tell stronger stories, researchers build better studies, and even policymakers make more informed decisions.
Not by adding more data, but by making existing data finally usable.
All images courtesy Rakesh Dubbudu




