In 2020 Statistics Poland launched a new project called “Spatial Statistical Data in the Information System of the State” (PDS). The project was implemented by 31 October 2022 under the Operational Programme Digital Poland, Priority Axis II “E-administration and open government”, Measure 2.1 “High availability and quality of public e-services”.
Project value
PLN 34,722,048.00 (including co-financing from the EU: PLN 29,385,269.22).
Purpose of the PDS project
The main purpose of the PDS (Spatial Statistical Data) project was to expand the scope and availability of statistical information and geostatistical analysis methods that use public statistics resources. This main objective comprised the following specific objectives:
- improvement of the availability of e-services,
- provision of new functionalities and expansion of those offered by existing services,
- development of infrastructure with elements necessary to provide e-services at a high level of maturity.
This objective was achieved by providing new services of the Geostatistics Portal and expanding existing ones, and by incorporating the output information produced on an ongoing basis under the Statistical Research Program of Public Statistics and other public statistics projects, including projects implemented under grant agreements. The services created as part of the PDS project present data held by public statistics and the results of geostatistical analyses that are critical for the functioning of the state, local governments and local communities. These data are presented in a convenient graphical form (mainly maps), which speeds up decision-making processes. An additional goal was to create solutions supporting the presentation of the results of current statistical research.
Benefits of the PDS project for users
Until 2022, public statistics presented data on maps using the Geostatistics Portal, which offered only two services. In the PDS project, these services were expanded and three new services, important for external users, were developed, as well as one necessary internal service that improved the functioning of the entire system.
The idea of creating a new system as part of the PDS project emerged from direct contact with users of public statistics data. During the conference showcasing the Geostatistics Portal project – Phase II, users reported a number of needs, including access to individual data, the ability to link those data with their own data, and the ability to perform more advanced analyses than those available in the Geostatistics Portal. This demand was raised many times during meetings (including training sessions) with users of statistical information provided by Statistics Poland, including representatives of public and private units and institutions who want to perform more precise analyses related to, for example, the spatial location of investments.
User needs were considered in light of the functional limitations of the Geostatistics Portal at the time and of growing decision-making needs. Attention was drawn to the need for solutions that, apart from geostatistical analysis, would make it possible to determine the characteristics of the spatial distribution of a given phenomenon in a given area. Users often reported the need for “some method” of controlled access to individual statistical data, i.e. the data that offer the widest range of possibilities and the highest accuracy of analyses. It was emphasised that an overly high level of aggregation of statistical data is in some cases not precise enough, and that a good solution could be to adjust the mesh size of the kilometre grid so that phenomena in densely populated urban areas can be analysed more accurately (i.e. with cells smaller than 1 km). It was also pointed out that it is necessary to ensure the widest possible use of statistical data together with the user’s own data, and to enable users to independently conduct advanced statistical analyses, especially spatial ones.
The main idea of the PDS project was to expand the existing Geostatistics Portal with new functionalities that make it possible, in particular, to better meet the needs of users (citizens, entrepreneurs and public administration) and to improve the functioning of the public statistics service.
The system built in the PDS project ultimately took over the full functionality of the Geostatistics Portal and significantly expanded it. This involved not only the functionality itself but also the migration of that system’s data.
As part of the PDS project, the functionalities of previously available services were expanded and modernised. These include preparing statistical analyses in any division of space (e.g. one defined in the application by the user, downloaded from external spatial services, or based on a “dynamic” space division grid), combining statistical data with the user’s own data, and geocoding user objects used for geostatistical analyses.
New services have also been created. They enable exploratory analyses of spatial data using statistical information, geostatistical modelling analyses, and the so-called enrichment of the user’s own content with geostatistical information and analyses.
These services allow users to perform advanced statistical spatial analyses on data collected in public statistics surveys. Additionally, the user can combine their own data with statistical data and create unique analyses based on them.
The tasks carried out as part of the PDS project were aimed at improving and expanding the functionality of the services provided under the Geostatistics Portal project – Phase II, and at building new, advanced spatial services expected by users. The project’s products are the following public e-services:
US-01 – Service of access from computer devices to the result statistical information collected in the Portal with the possibility of performing advanced spatial analyses and to data and metadata of the spatial information infrastructure
The US-01 service is the basic tool for visualising statistical data on maps. It is basic, but at the same time extensive, and offers many data visualisation methods. In a few simple steps, users can create a visualisation tailored to their needs. To do this, they perform the following steps:
- selection of data (data provided by statistics, e.g. LDB (Local Data Bank) or own data),
- selection of area (analysis for the whole of Poland or e.g. a selected voivodship),
- selection of visualisation method (e.g. a choropleth map or one of many types of cartodiagrams),
- selection of method parameters (adapting the visualisation to the user’s needs).
The user can save the presentation prepared in this way in the repository and share it externally via a link or social media.
The previous Geostatistics Portal offered a wide range of cartographic presentation methods: the cartogram (choropleth map) and a number of basic pie and bar cartodiagrams. In the system built in the PDS project, the range of methods was further expanded with advanced cartodiagrams.
The new system has expanded the possibilities of data visualisation on kilometre grids by introducing dynamic space division grids, which are created in a way that protects statistical confidentiality. Aggregating point data (e.g. population) into statistical grids is a very popular data visualisation method among users.
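The idea of aggregating point data into a grid with an adjustable mesh size can be illustrated with a minimal sketch. The description above does not say how the system protects statistical confidentiality, so the threshold-based suppression used here (hiding cells with fewer than `min_count` points) is only an assumption, and the function name `aggregate_to_grid` and its parameters are hypothetical.

```python
from collections import Counter
from math import floor


def aggregate_to_grid(points, cell_size_m, min_count=3):
    """Count points per square grid cell and suppress sparse cells.

    points: iterable of (x, y) coordinates in a projected system, in metres
            (an assumption made for this sketch).
    cell_size_m: side length of a grid cell in metres.
    min_count: cells with fewer points are reported as None. This threshold
               rule is an assumed confidentiality rule, not the system's own.
    Returns a dict mapping (column, row) to a count or None.
    """
    counts = Counter(
        (floor(x / cell_size_m), floor(y / cell_size_m)) for x, y in points
    )
    return {cell: (n if n >= min_count else None) for cell, n in counts.items()}


# Made-up sample points (e.g. address points) in metres.
points = [(120.0, 80.0), (130.0, 90.0), (480.0, 20.0), (510.0, 30.0), (740.0, 760.0)]

coarse = aggregate_to_grid(points, cell_size_m=1000, min_count=3)  # 1 km grid
fine = aggregate_to_grid(points, cell_size_m=500, min_count=2)     # 500 m grid
print(coarse)
print(fine)
```

A finer grid shows more spatial detail, but more of its cells fall below the threshold and are hidden, which is the trade-off a dynamic grid has to balance.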
An important functionality for users is the ability to edit the parameters of a visualisation after it has been created, as well as simple visualisation of the variability of statistical data over time. Public statistics collects a huge amount of data every year, which is why the new system allows for convenient visualisation of time series on maps (e.g. using a “slider”).
US-02 – Service of access from mobile devices to the result statistical information collected on the Portal and to its visualisation on maps
The US-02 service is a presentation on a mobile device of those functionalities of the US-01 service that can be visualised on a smaller screen. The desktop and mobile applications use the same spatial analysis repository.
The mobile application for Android and iOS devices allows users to browse map applications and presents thematic data of Statistics Poland using cartographic presentation methods. It displays basic statistics for territorial units and allows users to search for objects.
The GUS Geo mobile application for devices with iOS and Android systems has been made available and can be downloaded from the App Store and Google Play store.
US-03 – Service enabling the use of exploratory analyses of spatial data using statistical information provided by the Portal
The US-03 service enables users to perform exploratory analyses of spatial data using statistical information provided by the Geostatistics Portal.
The tools provided give users the opportunity to examine the spatial distribution of the analysed variables and to determine spatial connections, interdependencies, and the occurrence of clusters.
The statistical methods that can be used include central tendency statistics, dispersion statistics, measures of asymmetry and concentration, and analyses of correlation between variables. For example, by using measures of central tendency, the user can obtain information about the most typical or representative value for the studied population. Moreover, by applying these measures to spatial data, the user can determine the point within the radius of which there is the greatest concentration of units with the desired characteristics. Such statistics may be used, for example, to determine the optimal location for a new investment.
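One common way to apply central tendency and dispersion measures to spatial data is the weighted mean centre together with the standard distance around it. The sketch below is an illustration under that assumption. The text does not state which spatial measures the service actually implements, and the function names and sample data are hypothetical.

```python
from math import sqrt


def weighted_mean_centre(units):
    """Return the weighted mean centre (x, y) of units given as (x, y, weight)."""
    total = sum(w for _, _, w in units)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x_bar = sum(x * w for x, _, w in units) / total
    y_bar = sum(y * w for _, y, w in units) / total
    return x_bar, y_bar


def standard_distance(units):
    """Return the weighted standard distance of units around their mean centre."""
    cx, cy = weighted_mean_centre(units)
    total = sum(w for _, _, w in units)
    variance = sum(w * ((x - cx) ** 2 + (y - cy) ** 2) for x, y, w in units) / total
    return sqrt(variance)


# Made-up units: coordinates in km, weight = number of potential customers.
units = [(0.0, 0.0, 100), (10.0, 0.0, 300), (0.0, 10.0, 100)]

centre = weighted_mean_centre(units)
radius = standard_distance(units)
print(centre, radius)
```

The mean centre can serve as a first candidate location, and the standard distance indicates how tightly the units are concentrated around it.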
The service also offers comparative analysis methods based on grouping objects into relatively homogeneous classes (cluster analysis), including several of the most common grouping methods. Furthermore, the system makes it possible to examine the occurrence of spatial autocorrelation, and thus to identify spatial clusters with similar values of the observed variables. All data analysis methods available in the service allow users’ data to be combined with statistical data.
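Spatial autocorrelation is often measured with the global Moran's I statistic. The sketch below shows that statistic as an illustration only: the text does not name the measures available in the service, and the function name, the neighbourhood matrix and the sample values are assumptions.

```python
def morans_i(values, weights):
    """Compute global Moran's I.

    values: list of observed values, one per spatial unit.
    weights: square matrix (list of lists) where weights[i][j] > 0 means that
             units i and j are neighbours; the diagonal is expected to be zero.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)   # sum of all spatial weights
    denom = sum(d * d for d in dev)         # sum of squared deviations
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for empty weights or constant values")
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * (num / denom)


# Four units arranged in a line; each unit neighbours the adjacent ones.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend = morans_i([1, 2, 3, 4], chain)        # similar values sit next to each other
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours always differ
print(trend, alternating)
```

Positive values indicate that similar values cluster in space, while negative values indicate that neighbouring units tend to differ.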
US-04 – Service enabling the execution of analyses in the field of geostatistical modelling
The purpose of another new service, US-04, is to enable users to generalise (estimate) the results of research on a phenomenon, carried out on a selected random sample, to other surveyed units or to the whole population of these units. For this purpose, functionalities are made available to users in the following areas:
- model building (a process including data analysis and the selection of previously defined or construction of new probabilistic models),
- application of a probabilistic model allowing conclusions (estimation) about the value of the explained variable based on the results of a random sample study (i.e. based on the collected data) and the adopted probability distribution.
The service includes selected econometric models, as well as statistics and tests used to verify the quality of these models. Additionally, users can use several spatial interpolation methods, which allow them to estimate the value of a continuous variable at points for which they do not have data. The geostatistical modelling service allows users to use their own data and to combine them with statistical data.
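Inverse distance weighting (IDW) is one widely used spatial interpolation method and can serve as a simple illustration of estimating a continuous variable at a point without data. Whether IDW is among the methods offered by the service is not stated, so the sketch below is an assumption, and the function name `idw` and the sample measurements are hypothetical.

```python
from math import hypot


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) by inverse distance weighting.

    samples: list of (sx, sy, value) measurements at known locations.
    power: distance exponent; higher values give nearby samples more influence.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the query point coincides with a sample
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Made-up measurements of a continuous variable at two known points.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]

midpoint = idw(samples, 5.0, 0.0)   # equally far from both samples
at_sample = idw(samples, 0.0, 0.0)  # exactly at the first sample
print(midpoint, at_sample)
```

At a point equally distant from both samples the estimate is their average, and at a sample location the measured value is returned unchanged.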
US-05 – Service supporting the enrichment of the user’s own content with geostatistical information and analyses provided by the Portal
Thanks to the service, users can access documents semantically related to the analytical work they are currently conducting. This allows for more complex analyses and helps avoid errors resulting from an incorrectly prepared analytical model. For example, if an analyst performs spatial analyses of population for a selected city, they should automatically receive a list of related documents, which will allow them to check the results of similar analyses and thus decide whether to publish their own results or verify them again.
Thanks to mechanisms of automatic content analysis (i.e. text mining), combined with the metadata in the new system that describe the available geostatistical analyses, the service offers users the ability to supplement their own text-based content with graphic elements (including maps) that visualise the geostatistical analyses.
Thanks to this functionality, users can find documents that may interest them and that would not be shown if the search were based only on simple keywords. Additionally, advanced users who work with programming languages can search via the API.
The Geostatistics Portal offers the following modules:
- Resource index – used to browse pre-made data visualisations on maps,
- Map portal – CSO data – enables the preparation of visualisations of CSO data,
- Geocoding – allows for attaching coordinates or boundaries of administrative divisions to tabular data,
- Analysis Studio – supports the user in preparing visualisations and analyses and sharing them with others,
- Map Studio – helps to prepare cartographic presentations of the user’s own data,
- Composition Studio – allows users to share their own data visualisations using network mapping services,
- Resource Manager – used to store, edit, share and manage data.
Innovativeness of the PDS project
To sum up, the services created as part of the PDS project support users in decision-making processes related to statistical and spatial information and enable the practical use of spatial and data mining analyses in commercial activities, in the work of central and local government administration, and in scientific activities. The functionalities created as part of the project enable analyses to be carried out on spatial data sets and public statistics resources. This makes it possible to obtain previously unavailable results and to optimise business processes, and it significantly enriches the country’s information system. Additionally, due to their innovative and unique nature, the project products may be used in scientific research to analyse the relationships between phenomena from various research fields. The PDS project services are a unique solution that has no equivalent among the solutions and systems available on the market.