Conceptual Approach to Data Delivery

The aim of the ESPON Database Portal is to receive, structure and display all data from ESPON Service Providers. To do so, the Project Team developed a dual approach to data: it selects and structures the most meaningful results and, at the same time, collects all other kinds of data that a project may have produced. The two delivery channels are referred to as Main Data and Other Data.

Main Data gather the most original and policy-relevant indicators (“key indicators”), the indicators used to calculate them (“background indicators”) and their metadata, including relations between indicators (various kinds of groupings), indicator sources and any preliminary processing applied to data during compilation. Main Data correspond to the queryable content of the Database Portal. Indicators to be included in Main Data are selected based on a dialogue between the Service Provider, ESPON EGTC (project expert) and the ESPON 2020 Database Portal (Project Team). These data and related metadata are provided by the Service Providers through the ESPON 2020 Data and Metadata Upload System (https://database.espon.eu/damf/). Main Data are mostly based upon indicators provided in ‘standard nomenclatures’ (e.g. NUTS, Functional Urban Area).

The Other Data section gathers all indicators produced (compiled or calculated) by the ESPON Service Provider in the course of their project, excluding data already provided as part of Main Data. These indicators may be delivered in the form of statistical or geographic data. Statistical indicators are collected through Excel templates adapted from the M4D project. Other Data are gathered as individual files or batches of files, each described briefly in the database system.

Main Data and Other Data gather ‘indicators’. An ‘indicator’ designates a measure with a unique definition, which may cover the whole or part of the ESPON Space and which may or may not be associated with a standard nomenclature (e.g. “population density”, “population potential”, “female employment rate”). Indicators under Main Data are the basic items provided in the list shown in the Search & Download section of the User Interface.

Main Data: concepts and structure

A. Datasets and indicators

Main Data is the major ESPON 2020 Database Portal entry point for the final user. Main Data is based on the following concepts:

  • A ‘dataset’ gathers one or more indicators in a thematically or statistically coherent whole. Indicators in a dataset pertain to the same theme and/or are related to one another by genetic relations. Datasets are only a vehicle for data collection; they are not visible as such to the final user of the ESPON database.
  • A ‘key indicator’ is considered by a Service Provider to be one of the main outputs of the project. As an output of different types of data processing, it may have multiple parent indicators. All key indicators use a standard nomenclature.
  • A ‘background indicator’ is an indicator which was used in the process of creating a key indicator (either original compiled data or a significant step in the calculation process). A standard background indicator uses a standard nomenclature and is delivered through the same workflow as key indicators. A non-standard background indicator is not presented in a standard nomenclature and must therefore be delivered through a dedicated sub-channel (e.g. grid data, flow data).
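
The three concepts above can be sketched as a small data model. This is only an illustration, not the Portal's actual schema: the class names, field names and the delivery_route helper are assumptions introduced here. It encodes two rules stated in the list: every key indicator uses a standard nomenclature, and a non-standard background indicator goes through a dedicated sub-channel.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    KEY = "key"
    BACKGROUND = "background"


@dataclass
class Indicator:
    name: str
    kind: Kind
    # Standard nomenclature such as "NUTS"; None marks a non-standard indicator.
    nomenclature: Optional[str] = None

    def __post_init__(self) -> None:
        # Rule from the text: all key indicators use a standard nomenclature.
        if self.kind is Kind.KEY and self.nomenclature is None:
            raise ValueError(f"key indicator {self.name!r} needs a standard nomenclature")

    @property
    def is_standard(self) -> bool:
        return self.nomenclature is not None


@dataclass
class Dataset:
    # A dataset only groups indicators for delivery; it is not shown to end users.
    name: str
    indicators: List[Indicator] = field(default_factory=list)


def delivery_route(indicator: Indicator) -> str:
    """Return a hypothetical label for the route an indicator is delivered through."""
    return "standard workflow" if indicator.is_standard else "dedicated sub-channel"


# Illustrative content only; these indicators are not taken from a real delivery.
density = Indicator("population density", Kind.KEY, "NUTS")
grid = Indicator("population grid", Kind.BACKGROUND)
dataset = Dataset("demography", [density, grid])
routes = {ind.name: delivery_route(ind) for ind in dataset.indicators}
```

Raising an error at construction time is one possible design choice; a real upload system might instead report the problem during validation of the delivered files.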

Metadata associated with datasets and indicators are listed and described in the Manual “How to Deliver My Data” (online document).

B. Indicator groupings (indicator groups, indicator genealogy, dimensional indicators)

The originality of the Main Data under the ESPON 2020 Database Portal is the extent to which it identifies relations between indicators. Indicators may be linked on two bases:

  1. Structural relations (‘multi’ indicators / ‘dimension’ indicators / ‘class’ indicators). A multi indicator is an indicator that has one or more dimensions and/or classes. A Dimension is a particular way to break down a multi indicator according to a criterion. A Class is an individual instance under the criterion taken as dimension. For instance, “Total population” (Multi indicator) can be broken down by “Age group” (Dimension), which corresponds to specific age intervals (Classes), e.g. 0 to 4, 5 to 9, 10 to 14.
  2. Genetic relations (‘indicator genealogy’). A genealogy relation can be established between one or several indicators (the ‘parents’) and another indicator (the ‘child’) when the parents were used in the process of calculating the child. A methodology that describes the calculation step is associated with each genealogy relation. Genealogy relations are declared at the level of indicators. For an indicator with multiple dimensions and multiple classes, genealogy is defined at the level of the ‘multi-indicator’. The genealogy then applies to all ‘dimensions’ and ‘classes’.
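
The two kinds of relations can be illustrated with a minimal sketch. The class names, the label format and the inherited_parents helper are assumptions made for this example, and the parent indicator and methodology text are invented. The sketch shows a multi indicator broken down by one dimension into classes, and a genealogy relation declared once at the multi-indicator level that then applies to every class.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MultiIndicator:
    name: str
    # Maps each dimension (e.g. "Age group") to its classes (e.g. "0 to 4").
    dimensions: Dict[str, List[str]] = field(default_factory=dict)

    def members(self) -> List[str]:
        """Return the multi indicator itself followed by one label per class."""
        labels = [self.name]
        for dimension, classes in self.dimensions.items():
            labels.extend(f"{self.name} | {dimension} | {cls}" for cls in classes)
        return labels


@dataclass
class GenealogyRelation:
    parents: List[str]   # indicators used in the calculation
    child: str           # indicator that was calculated
    methodology: str     # description of the calculation step


def inherited_parents(indicator: MultiIndicator,
                      relations: List[GenealogyRelation]) -> Dict[str, List[str]]:
    """Apply genealogy declared on the multi indicator to all of its classes."""
    parents = [p for rel in relations if rel.child == indicator.name for p in rel.parents]
    return {label: list(parents) for label in indicator.members()}


population = MultiIndicator(
    "Total population",
    {"Age group": ["0 to 4", "5 to 9", "10 to 14"]},
)
# Hypothetical relation: the parent name and methodology are invented.
relation = GenealogyRelation(
    parents=["Population by municipality"],
    child="Total population",
    methodology="aggregation of municipal counts",
)
lineage = inherited_parents(population, [relation])
```

Here lineage holds four entries (the multi indicator and its three age classes), each pointing to the same parent, which mirrors the rule that genealogy is defined once and inherited by all dimensions and classes.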

These relations are at the core of the ‘cumulative’ and ‘networking’ approach to indicators. This approach is cumulative as new indicators may be related to existing ones (e.g. when an ESPON project refers to indicators already available in the ESPON Database Portal). This approach is ‘network’-oriented as it allows the final user of the ESPON 2020 Database Portal to navigate between indicators on the basis of structural or genetic relations.
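
Navigation along these relations can be pictured as a walk over a graph whose nodes are indicators and whose edges are structural or genetic relations. The following is a minimal sketch under that assumption; the edge list, the function names and the indicator names are illustrative and do not describe how the Portal stores or traverses relations.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Each edge is (indicator, relation type, indicator); all entries are illustrative.
EDGES: List[Tuple[str, str, str]] = [
    ("Total population", "structural", "Total population by age group"),
    ("Total population", "genetic", "Population density"),
    ("Area", "genetic", "Population density"),
]


def build_graph(edges: List[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map so a user can move in both directions."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for left, _relation, right in edges:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def reachable(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """Return every indicator that can be reached from start by following relations."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    seen.discard(start)
    return seen


graph = build_graph(EDGES)
from_area = reachable(graph, "Area")
```

The cumulative aspect corresponds to appending edges: a later project that reuses an existing indicator adds a relation to the same graph rather than creating an isolated copy.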

C. Sources and preliminary processing

Each individual data record (value of a standard indicator for an individual region) is associated with a ‘Source’ and may be assigned a ‘Preliminary Processing’.

A ‘Source’ is a set of information that makes it possible to locate the external data used in the process of compiling or calculating a data record. It includes a name, a URL and a description.

A ‘Preliminary Processing’ is a set of information that identifies the calculation procedures that were applied to external data before compilation. It may be an aggregation or disaggregation, an estimation, an overlay procedure between geographic layers, etc.

‘Sources’ and ‘Preliminary Processing’ ensure the systematic reference to external data.
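
As a rough sketch, a data record with its provenance could be represented as follows. The three Source fields (name, URL, description) come from the definition above; the fields of PreliminaryProcessing and DataRecord, the provenance helper and all example values (region code, URL, figure) are placeholders assumed for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Source:
    name: str
    url: str
    description: str


@dataclass(frozen=True)
class PreliminaryProcessing:
    # Assumed fields: the text only lists examples of processing types.
    kind: str          # e.g. "aggregation", "disaggregation", "estimation", "overlay"
    description: str


@dataclass(frozen=True)
class DataRecord:
    indicator: str
    region: str
    value: float
    source: Source                                      # every record has a Source
    processing: Optional[PreliminaryProcessing] = None  # assigned only when relevant


def provenance(record: DataRecord) -> str:
    """Summarise where a record comes from and how it was prepared."""
    text = f"{record.source.name} ({record.source.url})"
    if record.processing is not None:
        text += f", after {record.processing.kind}"
    return text


# Placeholder values only; none of them come from a real delivery.
source = Source("Example statistical office", "https://example.org/population", "Annual counts")
raw = DataRecord("Total population", "XX001", 1000.0, source)
estimated = DataRecord(
    "Total population", "XX002", 2000.0, source,
    PreliminaryProcessing("estimation", "missing year interpolated"),
)
```

Making the source a required field and the processing an optional one reflects the wording above: each record is associated with a Source and may be assigned a Preliminary Processing.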

Other Data: concept and structure

Data to be included as ‘Other Data’ may be statistical data (which may or may not use standard spatial nomenclatures), geographical data or any other relevant data (e.g. survey results). Metadata associated with each indicator and file delivered as Other Data are less extensive; these are described in the online documentation in the section “How to Deliver My Data”. In contrast with Main Data, no relation is identified between indicators provided under Other Data.

Once a Project delivery is completed, the content of Main Data and Other Data is merged in a ‘Project Archive’ that one may download from the ESPON 2020 Database Portal. The Project Archive gathers the whole data content produced by an ESPON Project.