Chapter 1.2: Data challenges faced by policy makers
This chapter reviews the main challenges in using big and open data for policy making and provides some advice on how best to address them.
Challenges in using big data
The very features that define big data (volume, velocity, variety, veracity) are also the source of many of the challenges governments face when trying to use it for policy making.
Storage requirements: Big data requires a lot of storage space, which puts pressure on owners to constantly scale their infrastructure.
Real-time response: Big data, especially data from IoT devices, is generated at high speed. Owners are therefore increasingly challenged to respond in real time to achieve the best outcomes.
Integration issues: Big data comes from many different sources, such as magnetic loops, ANPR cameras, social media streams and databases. The resulting pool of data can be a hotch-potch of formats, ranging from text and images to video files and tables. Combining all this data meaningfully can be difficult due to compatibility issues.
Problems with talent acquisition: As mentioned in Chapter 0.3, generating insights from big data requires a certain level of expertise. This has driven up demand for big data experts, and their salaries have increased dramatically as a result. While this may not be a problem for capital cities, smaller municipalities may not be able to afford a full-time big data analytics department because of the high labour costs involved.
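To make the integration issue above more concrete, the following minimal Python sketch maps two differently formatted traffic feeds onto one common record layout. It is an illustration only: the sample payloads, field names (sensor_id, vehicle_count, camera, seen_at, plates_read) and the common schema are hypothetical, and real feeds from magnetic loops or ANPR cameras will differ.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical sample payloads; real feeds will use other field names and units.
LOOP_CSV = """sensor_id,timestamp,vehicle_count
loop-12,2020-03-01T08:00:00+00:00,42
loop-12,2020-03-01T08:15:00+00:00,57
"""

ANPR_JSON = """[
  {"camera": "anpr-3", "seen_at": 1583049600, "plates_read": 40},
  {"camera": "anpr-3", "seen_at": 1583050500, "plates_read": 61}
]"""


def from_loop_csv(text):
    """Map CSV rows (ISO 8601 timestamps) to the common record layout."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "source": row["sensor_id"],
            "time": datetime.fromisoformat(row["timestamp"]),
            "vehicles": int(row["vehicle_count"]),
        }


def from_anpr_json(text):
    """Map JSON items (Unix epoch seconds) to the common record layout."""
    for item in json.loads(text):
        yield {
            "source": item["camera"],
            "time": datetime.fromtimestamp(item["seen_at"], tz=timezone.utc),
            "vehicles": item["plates_read"],
        }


def harmonise(*streams):
    """Merge all streams into one list ordered by time, then by source."""
    records = [record for stream in streams for record in stream]
    return sorted(records, key=lambda r: (r["time"], r["source"]))


records = harmonise(from_loop_csv(LOOP_CSV), from_anpr_json(ANPR_JSON))
```

Each source gets its own small adapter, so adding a new feed means writing one more function rather than changing the analysis that consumes the merged records.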
Challenges in using open data
The growth of data portals in Europe and internationally is certainly laudable. However, the fact that governments and private companies are making some data available doesn't mean the data is ready to use immediately.
Format variability: As with big data, open datasets can vary in format, which raises all sorts of integration issues.
Missing communication: Only a few governments see communication to developers and technical entrepreneurs as an integral part of their open data strategy. By not engaging properly with these audiences, governments miss an opportunity to create wider incentives for open data re-use.
Limited practice of data standardisation: Looking at the private sector, it appears that a structured approach to open data remains limited to tech giants (Amazon, Google, Facebook, Microsoft etc.) who have standard rules on data and APIs embedded in their product design. The vast majority of tech companies have yet to show such a level of commitment to openness and re-use.
Figure 2: Challenges inhibiting the use of big and open data
Overcoming the data challenges
PoliVisu pilot cities have faced similar challenges as they tried to leverage open and big data to improve policy making. Key lessons learned from direct experience of addressing these challenges are codified in the following recommendations:
Data storage: Technologies such as hyperconverged infrastructure, which virtualises all elements of the conventional ‘hardware-defined’ systems, can help organisations scale their infrastructure. Other technologies like compression, deduplication and tiering can reduce the amount of space and costs associated with big data storage.
Real-time analytics: To achieve the speed needed to process big data, it is worth investing in the new generation of ETL (extract, transform, load) and analytics tools as these can dramatically reduce the time it takes to generate reports.
Interoperability: Ensuring interoperability isn’t just a technical matter. It also requires 1) the existence of appropriate legal frameworks, policies and strategies (legal interoperability); 2) the selection of appropriate encoding formats to balance scalability and performance (semantic interoperability); and 3) reorganisation of internal processes in a way that transcends departmental silos (organisational interoperability).
Data skills: There are many things organisations can do to address talent shortages, for example: 1) increase budgets for big data, as well as recruitment and retention efforts; 2) offer more training opportunities to current staff members; and 3) purchase analytics solutions with self-service and/or machine learning capabilities.
Open data stakeholders: An open data strategy is not just about opening up data but also about ensuring that the right stakeholders have access to it and have all the supporting tools they need (e.g. APIs, documentation, analytics) for reuse. Governments and public bodies must consider groups other than the general public who might benefit most from access to their information, e.g. developers, SMEs and public sector staff.
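As a small illustration of the data storage recommendation above, the following Python sketch combines two of the techniques it names, deduplication and compression. It is a toy, in-memory example under stated assumptions: the DedupStore class and the file names are hypothetical, and production storage systems implement these techniques at block or file-system level rather than in application code.

```python
import hashlib
import zlib


class DedupStore:
    """Toy content-addressed store: identical payloads are kept once, compressed."""

    def __init__(self):
        self._blobs = {}  # content digest -> compressed bytes
        self._index = {}  # logical name -> content digest

    def put(self, name, payload):
        """Store payload under name; reuse the existing blob if content matches."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self._blobs:
            # Compress only the first copy of any given content.
            self._blobs[digest] = zlib.compress(payload, 9)
        self._index[name] = digest
        return digest

    def get(self, name):
        """Return the original, uncompressed payload for name."""
        return zlib.decompress(self._blobs[self._index[name]])

    def stored_bytes(self):
        """Total bytes physically held after deduplication and compression."""
        return sum(len(blob) for blob in self._blobs.values())


store = DedupStore()
payload = b"sensor_id,count\n" + b"loop-12,42\n" * 1000
first = store.put("monday.csv", payload)
second = store.put("monday_copy.csv", payload)  # duplicate content, no new blob
```

Here the second file adds only an index entry, and the repetitive payload compresses to far fewer bytes than the original. How much space is saved in practice depends entirely on how repetitive the data is.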
Video 4: PoliVisu webinar on data challenges