India is positioning itself to take the lead in artificial intelligence (AI) by addressing critical data accessibility challenges such as privacy and confidentiality concerns, lack of clarity around data ownership, and lack of interoperability between AI systems and data, said a report on Thursday.

More importantly, challenges around generating large volumes of synthetic data should be resolved to facilitate easier data access and foster AI innovation in India, said a recent EY whitepaper titled ‘Enabling AI development in India through data access’.

It outlines key recommendations for the public and private sectors, emphasising the need to develop institutions and mechanisms that incentivise the sharing of proprietary data, establish a robust data framework, set up a data marketplace, and invest in the development of data interoperability standards to drive AI innovation.

Reflecting on the findings, Rajnish Gupta, Partner, Tax and Economic Policy Group, EY India, said, “Access to proprietary datasets will be a key differentiating factor between countries that emerge as winners and those that are unable to leverage the AI opportunity.”

The government’s proactive role in establishing data frameworks, along with the private sector’s participation in data sharing and standardisation, will be pivotal in creating a robust AI ecosystem, he said.

Some of the key thematic pillars emerging from government initiatives to improve data accessibility for AI development include the establishment of a specialised agency for managing vast amounts of data and overseeing the development and management of the data ecosystem, the report highlighted.

It also recommended laws and regulations that ensure adequate privacy safeguards for personal data. The processing and use of personal data should be done with the consent of the user.

The report further recommended frameworks, regulations, and rules to promote the sharing of proprietary data through marketplaces and data exchanges, including setting up interoperability standards, incentivising private participation in data marketplaces, and clarifying issues such as data ownership.

The government has access to vast amounts of data that can be used as training datasets. Digitising these records and publishing them as open data can enable the development of AI models in local languages, it added.