Today we are releasing the 2021 data updates from the Integrated Data Infrastructure (IDI) in OHI Data Navigator. This year we have also included data from the NZ Health Survey. With this data refresh, there is some important information we need to share with you about what the data is telling us and why.
As you may know, the majority of the OHI Data Navigator datasets pull from the Integrated Data Infrastructure, or IDI.
Because the IDI relies on data collected by multiple government sources, elements such as policies, procedures, and definitions sometimes change. This can impact what the data is telling us. We found some significant changes when we updated the Data Navigator with the 2021 IDI data, and we want to make sure you know what’s happened and why.
The Integrated Data Infrastructure is a very large research database that holds anonymised data about people in Aotearoa, their interactions with government agencies and their responses to the Census and official surveys. Access to the IDI is restricted to authorised researchers who are working to improve outcomes for Aotearoa.
When we completed our initial scan of the data, we found that what the data was telling us was significantly different from when we first launched in 2021.
So different, in fact, that we had to ask ourselves whether or not the data was reliable (TLDR – it is).
In summary, we trust this data because after a lengthy investigation into the why and how and what, we have concluded that the data is more robust and gives us a clearer picture of the experiences of young people.
In our investigation we found two key things:
- There were changes to the public sector analytical definitions; and
- There were changes in the underlying IDI Data, including StatsNZ data handling processes.
Changes to the public sector analytical definitions.
Whilst there are undoubtedly other changes that will affect many people analysing the data in the IDI, for us here at OHI Data Navigator there are three main definition changes that drive the overall change in our calculations of young people experiencing exclusion and disadvantage in Aotearoa.
These are: Qualifications, School interventions, and Police contact records.
The new definitions use different calculations to determine an individual’s highest qualification. This has resulted in a shift showing a higher overall attainment of qualifications. That’s cool! But what it does mean is our overall score of disadvantage and exclusion is lower. One data point affecting this could be that in the 2021 data 18% of 19-24 year olds received a “zero” education score, rather than the “medium” score assigned under the previous calculations and definitions.
The previous definition of school interventions, including truancy and stand-down events, is no longer in use and there is no replacement definition at this stage. These definitions contribute to our calculations of exclusion and disadvantage in the Education and Employment component of the Data Navigator.
Police Contact Records
There have been some changes in the definitions of Police contact and also in where the information comes from. Whilst this change has only minor implications for the Justice component of our calculation of exclusion and disadvantage, the data appears incomplete from 2018 onward.
In our 2021 release, using the previous data definitions, we were seeing disadvantage and exclusion getting worse for young people, as shown below.
However, with this data refresh, the numbers seem to be getting better or plateauing.
Whilst that seems great for young people, the numbers still highlight a serious problem:
1 in 5 young people in Aotearoa are experiencing exclusion and disadvantage
Changes in underlying IDI Data and StatsNZ data handling processes
In our investigation we reviewed the IDI documentation and metadata and met with StatsNZ. There were three key things we found from this part of our investigation.
- Government agencies supply their data in different ways. Some agencies provide a completely fresh set of data for each refresh, and StatsNZ erases the previous data set and adds the new one in. The data isn’t lost, but it means that any new processes or rules don’t just apply to the new data; they apply to historical data as well.
- For the Data Navigator, this might mean that statistics we reported on previously no longer exist or they shift significantly enough that we have to investigate further to understand why.
- The second thing is the way the data is validated by StatsNZ. Although StatsNZ does some overall data validation, it just isn’t possible for them to examine every possible impact of data supply changes, because there are so many ways the data is being used. StatsNZ and the research community aren’t always told when data collections change either, so again we had to ask, so that we could build confidence in the data.
- We also found that there are several agencies going through system and process changes. This means that we can expect to see similar changes in future updates to the OHI Data Navigator.
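For readers who work with refreshed datasets themselves, here is a minimal sketch of the kind of comparison described above: checking whether previously reported statistics have disappeared or shifted enough to need investigation. It is an illustration only, not the process used for the Data Navigator. The function name, the 10% threshold, and the example figures are all assumptions made up for this sketch.

```python
def flag_shifted_statistics(previous, refreshed, threshold=0.10):
    """Compare statistics from two refreshes and list those needing review.

    previous / refreshed: dicts mapping a statistic name to its value.
    threshold: hypothetical relative-change cut-off (10% by default).
    """
    flagged = {}
    for name, old_value in previous.items():
        # A statistic reported before may no longer exist after a refresh.
        if name not in refreshed:
            flagged[name] = "missing after refresh"
            continue
        new_value = refreshed[name]
        # Avoid dividing by zero when the old value was zero.
        if old_value == 0:
            if new_value != 0:
                flagged[name] = "changed from zero"
            continue
        change = abs(new_value - old_value) / abs(old_value)
        if change > threshold:
            flagged[name] = f"shifted by {change:.0%}"
    return flagged


# Invented figures, for illustration only.
previous = {"stat_a": 0.20, "stat_b": 0.50, "stat_c": 0.10}
refreshed = {"stat_a": 0.25, "stat_b": 0.51}
flagged = flag_shifted_statistics(previous, refreshed)
```

Anything that comes back flagged is a prompt to go and ask why, not a sign that the data is wrong.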
The great news is, we did a lot of learning from this investigation which has led us to develop checks and balances for the next data refresh in 2022. We’ll share more about this in another post at a later date.
In the meantime, this data is more robust and reliable and whilst over time we anticipate more changes, we will adapt and respond to ensure the data you are using can be trusted to reflect the experiences of young people more accurately in our communities.