European Commission seeks proposals on synthetic data for digital finance ‘data hub’

European Commission: Mairead McGuinness has described the data initiative as a “good example of how, as regulators, we can also be innovators” | Credit: Jai79 (Pixabay)

The European Commission is seeking viewpoints from the private sector on the types of data held by supervisory authorities that companies would likely find useful to test new fintech solutions and ‘train artificial intelligence (AI) models’.

It has issued a call for proposals on the matter as it builds a ‘Data Hub’ – described as a ‘space where financial companies can access supervisory data for training and testing purposes’ – within the European Union (EU) Digital Finance Platform: a Commission-hosted website launched two years ago to help private-sector innovators to ‘scale up’ fintech solutions.

The overarching ambition is to promote data-sharing to foster innovation across Europe’s financial sector.

The Commission states in its Data Hub call for proposals, which is open until 26 April, that the Data Hub will be built using synthetic data – data that is artificially generated – to ensure compliance with confidentiality requirements.

‘Synthetic data offers a way for national supervisors to participate in the project without having to make the real data they hold accessible to any third party,’ the Commission explains. ‘The Data Hub will therefore host synthetic datasets that have been created by participating supervisors based on the real data they hold.’

Specific ‘use-cases’ needed

The Commission states that responses it receives to the call for proposals ‘will be valuable to define which datasets will be made accessible on the platform’. But it advises that companies feeding in their views should ‘set out in detail a specific use-case and a solution to be tested’; and ‘should describe the datasets that would be needed, and the variables to be prioritised in the synthesisation process’.

The Commission also states that ‘datasets must be limited to non-personal data, as well as to datasets that national supervisors in the financial sector hold’. It adds that ‘due to the nature of the project and of synthetic data, the datasets available on the Data Hub will be very granular and unique, rather than particularly recent or frequently updated’.

Companies responding to the call need to be from an EU or European Economic Area (EEA) country.

The Data Hub section of the Digital Finance Platform describes its creation as a ‘key novelty’ being added ‘in the second phase of the project’. Its contents will be available to companies, academics and researchers.

The Data Hub initiative also aligns with the ‘European Data Strategy’, under which the 27-member EU aims to boost the development of ‘trustworthy data-sharing systems’ through four broad sets of measures. One of these is to facilitate the re-use of certain public sector data that cannot be made available as open data.

Supervisory agencies will also benefit

The EU Digital Finance Platform’s second phase was launched during an online event on 21 March.

During the event the EU’s financial services commissioner Mairead McGuinness described the Data Hub as a “good example of how, as regulators, we can also be innovators”.

“Data, information, is more and more important in finance,” she said. “And access to high-quality, relevant data has become increasingly vital for financial companies to thrive.”

As well as companies with fintech solutions, she said that supervisory agencies themselves would also benefit from the Data Hub “because it will allow them to learn more about the technologies that innovative financial firms are using.”

The Commission’s Directorate-General for Financial Stability, Financial Services and Capital Markets Union (DG Fisma) issued an earlier call for proposals on the Data Hub in March 2023.

UK FCA’s synthetic data report

The UK’s Financial Conduct Authority (FCA) published a 45-page overview last month of the opportunities and risks of using synthetic data to contribute to ‘beneficial and responsible’ innovation in financial services.

‘Synthetic data is one of many privacy-enhancing technologies that can expand and support data sharing,’ the FCA states about the ‘Report: Using Synthetic Data in Financial Services’. ‘While it has the potential to address important financial services public policy issues, such as financial crime and fraud, there are still open questions that are being researched.’

The FCA set up a Synthetic Data Expert Group (SDEG) as a sub-group of its Innovation Advisory Group (IAG) just over 12 months ago. Scheduled to run until November this year, the SDEG brings together 21 experts from financial services, the public sector, data and technology vendors, and consumer groups to discuss the use of synthetic data in financial markets. Last month’s report was collectively authored by SDEG members and FCA staff.

The FCA has undertaken a number of synthetic data-related initiatives, starting with a ‘DataSprint’ in June 2020. After that, in partnership with the City of London Corporation, it launched two digital sandbox pilots. These initiatives granted participants access to synthetic data.

The authority then launched its permanent digital sandbox in August 2023, including access to more than 200 synthetic, public or anonymised datasets.