Creating real-world data trusts… in conversation with Mark Surman.
The Data Trusts Initiative’s core mission is to move data trusts from theory to practice – creating real-world data trusts that serve different communities. Shortly after the launch of our call for pilot projects, Sylvie Delacroix sat down with Mozilla Executive Director Mark Surman for our next ‘in conversation with…’ blog. Mozilla’s Data Futures Lab complements the work of the Data Trusts Initiative, undertaking research and providing funding related to a wide variety of data stewardship approaches.
A big focus of the Data Trusts Initiative’s work for the last year has been on the issues that need to be addressed if we’re to create real-world data trusts. How has Mozilla been grappling with these issues?
Mark: Mozilla’s vision is for an internet that puts people first, empowering individuals to shape their own experiences, so it was natural that we took an early interest in data trusts. We see a lot of organisations and projects trying to implement new types of data stewardship. Our Data Futures Lab – which provides funding, scaffolding for collaboration, and convening around emerging ideas – has been providing a space to experiment with different approaches and explore what type of stewardship works in different situations.
Sylvie: Given Mozilla’s mission, it seems like it would be a natural home for data trust pilots – particularly for pilots that explore how to set up a generalist data trust. Has the Foundation been considering its role in this regard?
Mark: If we look at the projects that we’ve supported in the Data Futures Lab – and the wider evolution of research and practice around data stewardship – I think we’re at a point now where we can see a set of big challenges that need to be addressed before an organisation would invest in setting up a data trust. The first is: where do you start? We need to define the practical starting points for setting up a trust – there needs to be a clear issue to tackle or something for the trust to do; it has to provide something that people understand and want; and you need to convince people that the trust is worth investing in.
What might be the routes to tackling that challenge?
Mark: There are different ways of approaching this. When we’re talking about data trusts, we usually start from the assumption that we’d be taking an existing user base and transforming their relationship with Mozilla. You can imagine that being relevant to something like Mozilla Rally – Rally is a powerful research platform that allows users to donate their data to studies and, in turn, allows researchers to observe patterns in how platforms and people behave on the internet. It’s a research service that doesn’t exist today, and that lots of different groups might want access to – so there is a clear value proposition. It also generates a data stewardship need, and you can see how a platform like Rally might want to integrate a trust-like intermediary to manage access to its data resources.
There might be other ways of getting started too. If you could describe in detail what it is you want to build – with enough confidence that you can build it – could you crowd-fund a data trust’s development? It would certainly be a way of testing if you have something that people want. Either way, the starting point needs to be the use case, the ‘ask’ of the user, and the proposed user benefits. Once you have that, you can start capturing people’s imaginations with the possibilities offered by data trusts.
Sylvie: We’ve seen some potential use cases emerge through the COVID-19 pandemic. A lot of children have started using online platforms to submit homework, for example, and the companies running those platforms can build profiles about students based on this data. That is personal data that you have the right to access. If you can combine that profile with the information that schools hold about students, then you have the possibility of gaining valuable insights into the different types of education interventions that can benefit different children. You can imagine a data trustee working on behalf of a collection of schools to access these profiles and generate knowledge that you would not have otherwise, with the aim of using these insights to improve educational outcomes.
Mark: The question is about how to get started. A metaphor I use a lot for the current state of debate and exploration related to data trusts is the open source movement.
With data trusts, we're at an early moment, much like the one the open source community was in during the 1990s. At the time, people were creating and sharing open code but there were no agreed-upon norms for licensing. The community came together to create the Open Source Initiative and the Open Source Definition – essentially a consensus statement on the ten things that qualify something as an open source license. This was coupled with a trademark on ‘open source’, which meant that anyone wanting to call their license open source had to comply with those characteristics. This lightweight certification process helped the community maintain the user benefits that open source had been delivering.
As we move forward with things like data trusts, we'll need this kind of consensus building on the definitions, the licenses and the use cases. We’ll need to experiment wildly – and also agree on some definitions and ground rules. Things like the Data Trusts Initiative and the Data Futures Lab are signs that we are moving in this direction.
Using that parallel, what might be the ‘ten things’ that data trusts must deliver?
Mark: It is interesting to think about what the core constitutional rules of data trusts might be. There’s been a lot of discussion in recent years about the legal models involved; now we need more focus on the user needs.
Sylvie: At the Data Trusts Initiative, we’ve talked about bottom-up empowerment as being central to data trusts – they must provide a route for individuals and communities to take the reins of their data. Their other characteristics include facilitating collectivisation and providing institutional safeguards. There’s lots still to be worked out under those headings, which we’re hoping to learn more about through our pilot project scheme.
Mark: At the Foundation, we’ve been doing a lot of work on the Data Governance Act and equivalent frameworks around the world. One way of building up the ‘ten things’ might be to look at what is covered by – or missing from – these frameworks, in terms of data stewardship. What do we think should be in place, but hasn’t yet been picked up by regulation or practice? This might be helpful in getting more specific about what changes to data management practices we want data trusts to be pushing for.
What other challenges do you see data stewardship projects trying to work through?
Mark: Even if you can define the user’s needs and capture the imagination of decision-makers, there are still a lot of organisational issues to work through. A lot of those issues are very pragmatic. For example, we need to think about what incentives developers or project leads have to implement a data trust. People are usually going to look for something fast and easy to implement – an off-the-shelf data access agreement, for example. Right now, a data trust isn’t something that is easy to pick up. We need to think about what tools or support we can create in this regard – what are the workable models that initiatives can pick up and run with.
Sylvie: We also need to think carefully about the ways in which the term ‘data trust’ might be misused. It should be something that requires careful time and engagement to create, to make sure it is really empowering people. There are lots of emerging data intermediaries that are not focused on empowerment or protecting individuals against vulnerabilities.
Given that landscape, how much progress do you think we’ve made in recent years in moving towards real-world data trusts?
Mark: We’re in a position now where we have a lot of different projects each tackling a different aspect of how to operationalise data trusts. In the Data Futures Lab, for example, we have some projects looking in detail at how we can use subject access requests to gain access to data, and how to make submitting these requests easier. That will generate insights that make it easier for the next wave of projects. Right now, no single project is going to have all the answers. But we’re seeing the seeds of the answers growing across the landscape.
Author: Jess Montgomery (2021)