Data collaboratives: Advancing open access for social good

Brigid van Wanrooy reports from a data sharing masterclass at the Society 4.0 Forum in Melbourne, Australia this week.

Data collaboratives empower the community and society to generate collective insights and collective outcomes. They also advance the Open Access movement: both aim to make data available to the public and to apply it for social good.

Stefaan Verhulst (Editor-in-Chief of the new open access journal Data & Policy and Chief Research and Development Officer at the GovLab, NYU) held a masterclass at Society 4.0 in Melbourne on the methodology, tips and tricks for setting up an effective data collaborative. It was timely, following Open Access Week. There was keen interest and goodwill in the room. Attendees came from a range of backgrounds and organisations, including government departments, local authorities and community sector organisations. It seems that there is real potential to create data collaboratives to improve community outcomes.

Stefaan Verhulst (Data & Policy; GovLab) at the Society 4.0 Forum

Demand, insight and action

Often a data project will start with the dataset or stop with the insight. Data collaboratives need to take a ‘professional’ approach, ensuring that the project will intentionally produce real-world outcomes.

It is critical that the data collaborative starts with the problem to be addressed. From this point, the questions to be answered by the data analysis can be clearly articulated. It’s also important to consider, from the outset, what action can be taken and who will be responsible for that action.

This is good advice that applies not only to data collaboratives but to any research project intended to have an impact on the ‘real world’. In my time as a researcher, I was often told by ‘clients’ what the methodology should be or which statistic was needed, without consideration of the question that needed to be answered. This can be symptomatic of policy-based evidence, that is, finding evidence to support a decision that has already been made. It can also be symptomatic of a lack of understanding of what constitutes a rigorous and effective research process.

Data collaboratives can generate impact in various ways, including improving governance, empowering citizens through greater transparency and open access to data, creating economic opportunity through community intelligence, and solving public problems.

Along with defining the problem and the research questions, it is also essential to consider what type of action will be needed or, at the very least, who will be required to act on the insight generated. Without this, it’s likely that the data collaborative will stop at the insight and have no impact.

Seven steps to building a data collaborative

The GovLab has translated these observations and their experience into a seven-step methodology to build an effective data collaborative.

Step 1: Identify the demand — what is the problem you are trying to solve and is it a priority? What are the research questions?

Step 2: Access the required data by cataloguing and auditing known datasets and establishing data stewards: designated individuals who have the mandate to work with the organisations that hold and use the data. Data stewards have a key, and difficult, role in data collaboratives: they ensure that there is sign-off from privacy and legal. They could be the data owner or custodian, but in my experience those people are already busy overseeing the datasets and ensuring they fulfil their original purpose.

Step 3: Establish partnerships and governance arrangements to ensure roles and responsibilities are clear. Not every stakeholder needs to be involved in every step along the way.

Step 4: Design and operationalise the trusted data collaboration.

Step 5: Analyse the data in a responsible and ethical way. Four main types of insight can be generated: situational analysis, cause-and-effect analysis, predictive analysis and impact assessment.

Step 6: Deliver, communicate and act on the insights.

Step 7: Assess the impact to learn what works and keep iterating. Data collaboratives shouldn’t be a ‘one-off’ — they need to be sustained to keep realising the benefits for social good.

How it works in practice

After Stefaan provided this step-by-step overview, we broke up into groups to workshop setting up a data collaborative for each outcome on an imagined city council’s strategic plan. Despite Stefaan’s advice, our group jumped straight to the datasets we wanted to use. This was all too familiar to me. The excitement and endless possibilities of exploring a dataset outweighed the difficult task of defining the problem we were trying to address.

With some further direction, the group agreed on the questions and then jumped to some high-level solutions. “Job done!” pronounced one participant. Having been involved in data sharing projects in the past, I knew that this couldn’t be further from the truth.

We needed a data steward who could navigate the complexities of gaining buy-in and access to the data. There was also a need for someone — perhaps the data steward — to support the group to clearly articulate the problem and questions, as well as continue to connect the policy problem with the data analysis, to achieve real impact.

Open Access to data and evidence is critical to ensure that there is more evidence-informed decision-making that results in improved social outcomes. As data and evidence become more openly accessible, it’s important to have stewards to guide us and ensure they are used ethically for maximum social impact.

Brigid van Wanrooy is the Director of APO, Australia and New Zealand’s largest digital repository of public policy grey literature. APO champions open access and evidence-informed policy to improve social outcomes. Brigid has worked across academia and government in a range of research and policy roles. She is an Associate Professor at the Social Innovation Institute at Swinburne University of Technology.

Editor’s note: This article was originally published by the Data & Policy blog and is reproduced with permission.