Database major concepts
The ESPON 2020 Database Portal is meant to be as inclusive as possible for territorial data resulting from Projects. It is built as a framework open enough to accommodate most data and regulated enough to allow for a tiered provision of data to end-users. In this framework, the concept of indicator is a cornerstone of the data delivery.
An indicator is a measure, a combination of measures, or a qualification of entities which provides an idea of what territories, or relations between territories, are like. Indicators may be of any nature (stocks, ratios, typologies, rankings) and may apply to any sort of object (territorial entities, grid cells, flows, etc.).
Depending on their function and role regarding project results, four indicator statuses can be identified (see Table 2 for a synthetic view):
(1) base indicators are basic territorial indicators provided by ESPON EGTC to the public. These are accessible from the ESPON Database Portal and are considered as reference for data processing in ESPON Projects.
(2) a key indicator is a measure, combination of measures or qualification of entities which (a) aims at covering the entire ESPON space, (b) is based on a standard ESPON nomenclature (e.g. NUTS, FUA, MUA), (c) has been selected by a Project as most policy-relevant and innovative among all their data output, (d) is associated with detailed metadata.
(3) a background indicator is a measure, combination of measures or qualification of entities which was used in the process of calculating a key indicator. It (a) aims at covering the whole ESPON space or a transnational cooperation area, (b) may or may not be based on a standard ESPON nomenclature, (c) is associated with detailed metadata.
(4) other indicators are measures, combinations of measures or qualifications of entities which were used, compiled or calculated by the project but were not selected as part of the Main Data delivery. These are subject to lighter metadata.
Table 2. Synthetic view of indicator statuses
- ESPON Data & Map update
- ESPON Database Portal
- most innovative and policy-relevant
- well documented metadata
- parent of at least one key indicator
- any indicator used, compiled or calculated by ESPON Projects
- ESPON space or TCA
- ESPON space or TCA
- in standard nomenclature (e.g. NUTS, FUA, MUA)
- In Search & Download interface
- In Project Archive
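The criteria for the three project-delivered statuses (key, background, other) can be read as a simple decision rule. The following is a minimal sketch of that reading, not an ESPON tool: the class `Candidate`, its field names and the function `highest_status` are all hypothetical and merely mirror criteria (a) to (d) from the definitions. Base indicators are left out because they are provided by ESPON EGTC rather than by projects.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    KEY = "key"
    BACKGROUND = "background"
    OTHER = "other"


@dataclass
class Candidate:
    # Hypothetical field names mirroring the criteria in the text.
    covers_espon_space: bool       # aims at covering the entire ESPON space
    covers_tca: bool               # aims at covering a transnational cooperation area
    standard_nomenclature: bool    # e.g. NUTS, FUA, MUA
    selected_by_project: bool      # chosen as most policy-relevant and innovative
    detailed_metadata: bool
    used_for_key_indicator: bool   # entered the calculation of a key indicator


def highest_status(c: Candidate) -> Status:
    """Return the highest project-delivered status the candidate qualifies for."""
    if (c.covers_espon_space and c.standard_nomenclature
            and c.selected_by_project and c.detailed_metadata):
        return Status.KEY
    if (c.used_for_key_indicator
            and (c.covers_espon_space or c.covers_tca)
            and c.detailed_metadata):
        return Status.BACKGROUND
    # Anything else used, compiled or calculated by the project.
    return Status.OTHER
```

An indicator that meets the key criteria may of course also act as a parent of another key indicator; the sketch only reports the highest status it is eligible for.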
Main Data concepts: datasets, structural types, genealogies, sources and preliminary processing
A dataset is a collection of one or more indicators, including at least one key indicator (or a few inter-related key indicators), together with their background indicators, the relations between these indicators and their related metadata.
Each dataset is checked, validated and integrated in the database separately. However, a dataset is meant to be a vehicle for data integration. It is not reflected as such in the Search and download interface.
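As an illustration of what a separate check of one dataset could look like, the sketch below verifies two properties stated in the definition: a dataset holds at least one key indicator, and its relations only link indicators that belong to it. The `Dataset` class, its fields and `check_dataset` are hypothetical names; the actual validation procedure is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Indicator name -> status ("key" or "background"); names are illustrative.
    indicators: dict[str, str] = field(default_factory=dict)
    # (parent, child) pairs describing the relations between indicators.
    relations: list[tuple[str, str]] = field(default_factory=list)


def check_dataset(ds: Dataset) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if "key" not in ds.indicators.values():
        problems.append("dataset contains no key indicator")
    for parent, child in ds.relations:
        for name in (parent, child):
            if name not in ds.indicators:
                problems.append(f"relation refers to unknown indicator: {name}")
    return problems
```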
Indicators of the Main Data channel (key, background) can either be simple or part of a multi-dimensional structure.
A Multi indicator is an indicator whose data values may be broken down or further specified along various dimensions and classes. For instance, the indicator total population can be broken down by age group or by gender, while the Corine land cover classification can be broken down into 6 classes at level 1, 15 classes at level 2 and 45 classes at level 3 (Table 3).
A Dimension of a given Multi indicator expresses the kind of statistical categories (or classes) into which the Multi indicator can be broken down (e.g. Population by age group). By definition, a Multi indicator must be associated with at least one Dimension. A Dimension does not have any data attached to it.
A Class is a particular instance of a given Dimension within the Multi indicator (e.g. Population by age group 0-5 years old). At least two Classes should be associated with any Dimension. Every Class must have data attached to it.
A Single indicator is an indicator which is not multi-dimensional and is not meant to be broken down along several categories.
The distinction between Multi and Single indicators is relative to a project. An indicator considered as Single can be transformed into a Multi indicator if a relevant set of classes is used to break it down.
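The rules for Multi indicators, Dimensions and Classes can be expressed as a small data structure with a validation step. This is a sketch under assumptions: the class and function names are hypothetical, the territorial unit code `XX001` and the values are made up for illustration, and the Portal's real data model may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str
    # Class label -> {territorial unit code: value}.
    # The Dimension itself carries no data; only its Classes do.
    classes: dict[str, dict[str, float]] = field(default_factory=dict)


@dataclass
class MultiIndicator:
    name: str
    dimensions: list[Dimension] = field(default_factory=list)


def validate(ind: MultiIndicator) -> list[str]:
    """Check the three structural rules; return the list of violations."""
    problems = []
    if not ind.dimensions:
        problems.append(f"{ind.name}: needs at least one Dimension")
    for dim in ind.dimensions:
        if len(dim.classes) < 2:
            problems.append(f"{dim.name}: needs at least two Classes")
        for label, values in dim.classes.items():
            if not values:
                problems.append(f"{dim.name} / {label}: Class has no data")
    return problems


# Illustrative example (unit code and values are invented).
population = MultiIndicator(
    "Total population",
    [Dimension("Population by age group", {
        "0-5 years old": {"XX001": 1200.0},
        "6-14 years old": {"XX001": 2100.0},
    })],
)
```

A Single indicator would simply be stored without any `Dimension`; turning it into a Multi indicator amounts to attaching a Dimension with at least two populated Classes.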
The relations between a key indicator and its background indicators in a given dataset are referred to as the indicator genealogy. Key indicators, as the main data output of the project, have to be complemented (when possible) with their background indicators. This makes it possible to describe the calculation procedures which led from the background indicators to the key indicator. Background indicators may themselves be complemented by their own background indicators. In a genealogy relation where one or several indicators are used in the calculation of another, the former are referred to as parents, while the latter is referred to as the child.
Hence, a project that has selected a ratio as a key indicator shall also deliver the count data it is based on as background data. For instance, to deliver the indicator average number of employees per enterprise at NUTS3, the project should also deliver the total number of employees at NUTS3 and the total number of enterprises at NUTS3, and provide a brief description of the methodology used to calculate the ratio. As another example, if a project has produced a typology, it shall also deliver all background indicators that have been used for this purpose (Figure 3).
Describing the genealogies of key indicators will give the end-user of the database a clearer picture of the calculation processes and access to the raw data used by the project. It will also allow the end-user to navigate in the Database Portal from one indicator to another.
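The ratio example can be sketched as follows: two count indicators act as parents of the key indicator, and the genealogy is stored as a child-to-parents mapping that can be walked back to the raw data. All identifiers, unit codes and values below are illustrative assumptions, not names used by the Database Portal.

```python
# Background (parent) indicators: counts per territorial unit (invented values).
total_employees = {"XX001": 500.0, "XX002": 120.0}
total_enterprises = {"XX001": 50.0, "XX002": 40.0}

# Key (child) indicator derived from its two parents.
avg_employees_per_enterprise = {
    unit: total_employees[unit] / total_enterprises[unit]
    for unit in total_employees
}

# Genealogy: child -> list of parents. Parents may have parents of their own.
genealogy = {
    "avg_employees_per_enterprise": ["total_employees", "total_enterprises"],
}


def ancestors(indicator: str, genealogy: dict[str, list[str]]) -> set[str]:
    """Collect every direct or indirect parent of an indicator."""
    found: set[str] = set()
    stack = list(genealogy.get(indicator, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(genealogy.get(parent, []))
    return found
```

Storing only the direct parents keeps each relation easy to document, while the traversal recovers the full chain an end-user would follow from a key indicator down to its raw inputs.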
Sources and Preliminary Processing
All data sources must be carefully documented in the database. Sources include all publications to which a Project has resorted when compiling a given indicator. Preliminary processing refers to all necessary adjustments applied to data before it was used by the project (aggregation, disaggregation, estimation, etc.).
Because Sources and Preliminary Processing may be specific to a territorial entity or group of territorial entities for a given indicator and a given year, each data cell is flagged with a Source and may, if necessary, be flagged with a Preliminary Processing. These flags are gathered in a separate column on the right-hand side of their respective data column. Information on how to flag data is provided in a later section.
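The layout described above, with a flag column immediately to the right of each data column, can be sketched as below. The flag codes (`S1`, `P1`), the `;` separator, the `_flags` column suffix and the function name are assumptions made for illustration only; the actual flagging syntax is defined elsewhere in the documentation.

```python
def build_rows(units, data_columns, flags):
    """Interleave each data column with its flag column.

    data_columns: {column name: {unit: value}}
    flags:        {column name: {unit: (source, processing or None)}}
    """
    header = ["unit"]
    for col in data_columns:
        header += [col, f"{col}_flags"]
    rows = [header]
    for unit in units:
        row = [unit]
        for col in data_columns:
            source, processing = flags[col][unit]
            # A Source flag is always present; Preliminary Processing is optional.
            flag = source if processing is None else f"{source};{processing}"
            row += [data_columns[col].get(unit, ""), flag]
        rows.append(row)
    return rows


# Illustrative input (unit codes, values and flag codes are invented).
rows = build_rows(
    units=["XX001", "XX002"],
    data_columns={"pop_2015": {"XX001": 1000, "XX002": 2000}},
    flags={"pop_2015": {"XX001": ("S1", None), "XX002": ("S2", "P1")}},
)
```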
A Standard nomenclature is a particular set of spatial entities which is defined as a reference for data collection in the ESPON context. Standard nomenclatures include:
- NUTS (Nomenclature of Territorial Units for Statistics) which may be defined for four levels (NUTS0, NUTS1, NUTS2, NUTS3) and several reference years (1999, 2003, 2006, 2010, 2012, 2016)
- Cities (provided by the Urban Audit) with two reference years (2014, 2018)
- Functional urban areas (FUAs provided by the Urban Audit) with two reference years (2014, 2018)
- Larger Urban Zones (LUZ)
- Local Administrative Units (LAU)
- Metropolitan Areas (MET)
- Metropolitan Regions (MR)
- Morphological Urban Areas (MUA)
- Urban Morphological Zones (UMZ)
Key indicators shall be delivered in one of these standard nomenclatures to qualify as key. This allows the Portal to provide visualisation tools (charts, maps, etc.) for these highly relevant indicators.
A complete list of standard nomenclatures, as well as the corresponding lists of entities, has been compiled in a Dictionary-of-spatial-units available in the ESPON 2020 Database Documentation.
New standard nomenclatures may be added to the database system based on newly identified needs. This is subject to approval by ESPON EGTC.
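A delivery check for this requirement could look like the sketch below. The nomenclature names, NUTS levels and reference years are taken from the list above; the short codes used as keys, the constant names and the function `is_standard` are hypothetical, and no reference years are assumed for nomenclatures where the list gives none.

```python
# Short codes are illustrative labels for the nomenclatures listed above.
STANDARD_NOMENCLATURES = {
    "NUTS", "Cities", "FUA", "LUZ", "LAU", "MET", "MR", "MUA", "UMZ",
}
NUTS_LEVELS = {0, 1, 2, 3}
# Reference years are only stated for these three nomenclatures.
REFERENCE_YEARS = {
    "NUTS": {1999, 2003, 2006, 2010, 2012, 2016},
    "Cities": {2014, 2018},
    "FUA": {2014, 2018},
}


def is_standard(nomenclature, level=None, year=None):
    """Return True if the delivery uses a listed standard nomenclature."""
    if nomenclature not in STANDARD_NOMENCLATURES:
        return False
    if nomenclature == "NUTS" and level is not None and level not in NUTS_LEVELS:
        return False
    known_years = REFERENCE_YEARS.get(nomenclature)
    if year is not None and known_years is not None and year not in known_years:
        return False
    return True
```

Keeping the allowed values in plain sets makes it straightforward to extend the check when a new standard nomenclature is approved.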