Preparing for data delivery

Before starting to edit metadata in the Upload interface, the Project needs to prepare for the delivery.

Identification of main stakeholders

As already highlighted, data delivery is a three-stakeholder process. These stakeholders must be clearly identified from the start.

Table 1 Stakeholders and their role in the delivery process

Stakeholder	ESPON EGTC	Database Team	Service Provider
Role in the process	Project Data Approver	Project Data Reviewer	Project Data Manager

For the delivery process to run smoothly, the Project shall identify, among the project partners, a single contact point in charge of data delivery (the project data manager). This manager is responsible for interacting with the Database team and ESPON EGTC on issues related to data delivery, and for uploading metadata and data through the Upload interface.

At the same time, the Project is assigned a single database reviewer who is in charge of guiding the manager through the delivery process, checking uploaded metadata and data to ensure their high quality, and validating the delivery.

ESPON EGTC identifies one or several approvers who are in charge of making sure that the delivery is exhaustive and is carried out with diligence. This role is usually played by project experts or persons in charge of data management at ESPON EGTC.

Database major concepts

The ESPON 2020 Database Portal is meant to be as inclusive as possible for territorial data resulting from Projects. It is built as a framework open enough to accommodate most data and regulated enough to allow for a tiered provision of data to end-users. In this framework, the concept of indicator is a cornerstone of the data delivery.

Indicator

An ‘indicator’ is a measure, a combination of measures, or a qualification of entities which provides an idea of what territories or relations between territories are like. Indicators may be of any nature (stocks, ratios, typologies, rankings) and apply to any sort of object (territorial entities, grid cells, flows, etc.).

Depending on their function and role regarding project results, four indicator statuses can be identified (see Table 2 for a synthetic view):

(1) base indicators are basic territorial indicators provided by ESPON EGTC to the public. These are accessible from the ESPON Database Portal and serve as the reference for data processing in ESPON Projects.

(2) a key indicator is a measure, combination of measures or qualification of entities which (a) aims at covering the entire ESPON space, (b) is based on a ‘standard’ ESPON nomenclature (e.g. NUTS, FUA, MUA), (c) has been selected by a Project as most policy-relevant and innovative among all their data output, (d) is associated with detailed metadata.

(3) a background indicator is a measure, combination of measures or qualification of entities which was used in the process of calculating a key indicator. It (a) aims at covering the whole ESPON space or a transnational cooperation area, (b) may or may not be based on a ‘standard’ ESPON nomenclature, (c) is associated with detailed metadata.

(4) other indicators are measures, combinations of measures or qualifications of entities which were used, compiled or calculated by the project but were not selected as part of the Main Data delivery. These are subject to lighter metadata.

Table 2 ESPON Database ‘indicator statuses’, their characteristics and relations to delivery channels

Delivery channel	Base Data	Main Data	Main Data	Other Data
Indicator status	base	key	background	other
Provider	ESPON Data & Map update ESPON Database Portal	ESPON Projects	ESPON Projects	ESPON Projects
Characteristics	reference data; regularly updated	most innovative and policy-relevant; well-documented metadata	parent of at least one key indicator; well-documented metadata	any indicator used, compiled or calculated by ESPON Projects
Spatial extent	ESPON space or TCA	ESPON space or TCA	ESPON space or TCA	no constraint
Nomenclature (standard: e.g. NUTS, FUA, MUA)	standard only	standard only	standard or non-standard	standard or non-standard
Approved data delivery format		.csv	.csv; .shp, .tiff; other	.csv; .shp, .tiff; other
Available in Search & Download interface	x	x	x	
Available in Project Archive		x	x	x
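The rules summarised in Table 2 can be sketched as a small consistency check. This is only an illustration, not part of the Upload interface: the `Indicator` class, its field names and the `check_status_rules` function are assumptions made for this example, and the nomenclature set simply repeats the standard nomenclatures named in this manual.

```python
from dataclasses import dataclass

# Standard nomenclatures as named in this manual (the authoritative list is
# the Dictionary-of-spatial-units).
STANDARD_NOMENCLATURES = {
    "NUTS", "Cities", "FUA", "LUZ", "LAU", "MET", "MR", "MUA", "UMZ",
}


@dataclass
class Indicator:
    """Hypothetical record describing one indicator to be delivered."""
    code: str
    status: str                      # "base", "key", "background" or "other"
    nomenclature: str                # e.g. "NUTS" or a project-specific name
    file_format: str                 # e.g. ".csv", ".shp", ".tiff"
    has_detailed_metadata: bool = True


def check_status_rules(ind: Indicator) -> list[str]:
    """Return the Table 2 rules that the indicator violates (empty if none)."""
    if ind.status not in {"base", "key", "background", "other"}:
        return [f"unknown status: {ind.status}"]
    problems = []
    if ind.status in {"base", "key"} and ind.nomenclature not in STANDARD_NOMENCLATURES:
        problems.append("base and key indicators require a standard nomenclature")
    if ind.status == "key" and ind.file_format != ".csv":
        problems.append("key indicators are delivered as .csv")
    if ind.status in {"key", "background"} and not ind.has_detailed_metadata:
        problems.append("Main Data indicators require detailed metadata")
    return problems


# Example with an invented indicator code.
candidate = Indicator("IND00001", "key", "project grid", ".shp")
for problem in check_status_rules(candidate):
    print(problem)
```

A candidate key indicator that fails the nomenclature check could still be delivered as a background or other indicator, since those statuses accept non-standard nomenclatures.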

‘Main Data’ concepts: datasets, structural types, genealogies, sources and preliminary processing

Dataset

A ‘dataset’ is a collection of one or more indicators, including at least one key indicator (or a few inter-related key indicators), together with their background indicators, the relations between these indicators, and their related metadata.

Each dataset is checked, validated and integrated into the database separately. A dataset is, however, only a vehicle for data integration: it is not reflected as such in the Search & Download interface.

Structural types

Indicators of the Main Data channel (key, background) can either be simple or part of a multi-dimensional structure.

A ‘Multi’ indicator is an indicator whose data values might be broken down or further specified in various dimensions and classes. For instance, the indicator ‘total population’ can be broken down by age groups or by gender, ‘Corine land cover classification’ can be broken down at level 1 by 6 classes, at level 2 by 15 classes and at level 3 by 45 classes (Table 3).

Table 3 Examples of Multi-indicators

A ‘Dimension’ of a given ‘Multi’ indicator expresses the kind of statistical categories (or ‘classes’) into which the Multi-indicator can be broken down (e.g. “Population by age group”). By definition, a ‘Multi’ indicator must be associated with at least one ‘Dimension’. A ‘Dimension’ does not have any data attached to it.

A ‘Class’ is a particular instance of a given ‘Dimension’ within the multi-indicator (e.g. Population by age group, 0 to 5 years old). At least two classes should be associated with any ‘Dimension’. Every ‘Class’ must have data attached to it.

A ‘Single’ indicator is an indicator which is not multi-dimensional and is not meant to be broken down along several categories.

The distinction between ‘Multi’ and ‘Single’ indicators is relative to a project. An indicator considered as a ‘Single’ can be transformed into a ‘Multi’ indicator if a relevant set of classes is used to break it down.
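The relation between a ‘Multi’ indicator, its dimensions and its classes can be pictured as a nested data structure. The sketch below is an illustration only: the class names, the `validate` method and the sample values are assumptions made for this example and do not describe how the database stores indicators.

```python
from dataclasses import dataclass, field


@dataclass
class IndicatorClass:
    """One class of a dimension; this is where the data values live."""
    label: str
    values: dict[str, float] = field(default_factory=dict)  # unit code -> value


@dataclass
class Dimension:
    """A dimension only groups classes; it carries no data itself."""
    label: str
    classes: list[IndicatorClass] = field(default_factory=list)


@dataclass
class MultiIndicator:
    name: str
    dimensions: list[Dimension] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Check the three structural rules stated in the text."""
        errors = []
        if not self.dimensions:
            errors.append(f"'{self.name}' needs at least one dimension")
        for dim in self.dimensions:
            if len(dim.classes) < 2:
                errors.append(f"dimension '{dim.label}' needs at least two classes")
            for cls in dim.classes:
                if not cls.values:
                    errors.append(f"class '{cls.label}' has no data attached")
        return errors


# Illustrative values for an invented territorial unit "XX001".
population = MultiIndicator(
    "total population",
    [Dimension("Population by age group", [
        IndicatorClass("0 to 5 years old", {"XX001": 120.0}),
        IndicatorClass("6 years old and over", {"XX001": 880.0}),
    ])],
)
print(population.validate())
```

In this picture a ‘Single’ indicator is simply an indicator with values and no dimensions, which is why adding a relevant set of classes turns it into a ‘Multi’ indicator.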

Genealogies

The relations between a key indicator and its background indicators in a given dataset are referred to as ‘indicator genealogy’. Key indicators, as the main data output of the project, have to be complemented (when possible) with their background indicators. This makes it possible to describe the calculation procedures which led from the background indicators to the key indicator. Background indicators may themselves be complemented by their own background indicators. For a given genealogy relation between one or several indicators used in the calculation of another, the former are referred to as ‘parents’, while the latter is referred to as the ‘child’.

Figure 3 A case example of genealogy relation

Hence a project which has selected a key indicator that is a ratio shall also deliver the count data it is based on as background data. For instance, to deliver the indicator ‘average number of employees per enterprise’ at NUTS3, the project should also deliver the ‘total number of employees’ at NUTS3 and the ‘total number of enterprises’ at NUTS3, and provide a brief description of the methodology used to calculate the ratio. As another example, if a project has produced a typology, it shall also deliver all background indicators that have been used for this purpose (Figure 3).

Describing key indicator genealogies gives the end-user of the database a clearer picture of the calculation processes and access to the raw data used by the project. It also allows the end-user to navigate in the Database Portal from one indicator to another.
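A genealogy can be thought of as a mapping from each child indicator to its parents. The following sketch uses the ratio example from the text; the function names, the dictionary layout and the unit codes are assumptions for illustration and not the Portal's internal model.

```python
# Child indicator -> its parent indicators (names follow the example above).
GENEALOGY = {
    "average number of employees per enterprise": [
        "total number of employees",
        "total number of enterprises",
    ],
}


def ancestors(indicator: str, genealogy: dict[str, list[str]]) -> list[str]:
    """Return all parents, grandparents, etc. of an indicator, nearest first."""
    found: list[str] = []
    queue = list(genealogy.get(indicator, []))
    while queue:
        parent = queue.pop(0)
        if parent not in found:
            found.append(parent)
            # Background indicators may have their own background indicators.
            queue.extend(genealogy.get(parent, []))
    return found


def ratio(numerator: dict[str, float], denominator: dict[str, float]) -> dict[str, float]:
    """Compute a child ratio for units present in both parents (non-zero denominator)."""
    return {
        unit: numerator[unit] / denominator[unit]
        for unit in numerator
        if unit in denominator and denominator[unit] != 0
    }


# Invented unit codes and values, for illustration only.
employees = {"XX001": 100.0, "XX002": 50.0}
enterprises = {"XX001": 20.0, "XX002": 0.0}
print(ancestors("average number of employees per enterprise", GENEALOGY))
print(ratio(employees, enterprises))
```

Delivering both parents alongside the ratio is what lets an end-user reproduce the child values, or recompute them for a different grouping of units.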

Sources and Preliminary Processing

All data sources must be carefully documented in the database. ‘Sources’ includes all publications on which a Project has drawn when compiling a given indicator. ‘Preliminary processing’ refers to all necessary adjustments applied to data before it was used by the project (aggregation, disaggregation, estimation, etc.).

Because ‘Sources’ and ‘Preliminary processing’ may be specific to a territorial entity or group of territorial entities for a given indicator and a given year, each data cell is flagged with a Source and may be flagged with a Preliminary Processing, if necessary. These flags are all gathered in a separate column on the right-hand side of their respective data column. Information on how to ‘flag’ data is provided in a later section.
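To make the ‘flag column to the right of each data column’ idea concrete, here is a minimal sketch of reading such a layout. The column names, the `S…`/`P…` flag codes and the `|` separator are invented for this illustration; the actual flagging convention is the one described in the dedicated section of this manual.

```python
import csv
import io

# Hypothetical layout: every data column is followed by its flag column.
SAMPLE = """unit,pop_2015,pop_2015_flag,pop_2016,pop_2016_flag
XX001,1000,S1,1010,S1
XX002,2000,S2|P1,2020,S2
"""


def read_flagged(text: str) -> list[dict]:
    """Pair every data cell with the flags found in the column to its right."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    cells = []
    for row in body:
        unit = row[0]
        # Walk the columns two at a time: (data column, flag column).
        for i in range(1, len(header) - 1, 2):
            flags = row[i + 1].split("|") if row[i + 1] else []
            cells.append({
                "unit": unit,
                "column": header[i],
                "value": float(row[i]),
                "source": next((f for f in flags if f.startswith("S")), None),
                "processing": next((f for f in flags if f.startswith("P")), None),
            })
    return cells


def cells_missing_source(cells: list[dict]) -> list[dict]:
    """A Source flag is expected on every cell; a processing flag is optional."""
    return [c for c in cells if c["source"] is None]


cells = read_flagged(SAMPLE)
print(len(cells), "cells,", len(cells_missing_source(cells)), "without a source")
```

Keeping the flags per cell rather than per column is what allows one territorial unit to cite a different source, or a different adjustment, from its neighbours in the same year.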

Standard nomenclatures

A ‘Standard nomenclature’ is a particular set of spatial entities which is defined as a ‘reference’ for data collection in the ESPON context. Standard nomenclatures include:

- NUTS (Nomenclature of Territorial Units for Statistics), which may be defined for four levels (NUTS0, NUTS1, NUTS2, NUTS3) and several reference years (1999, 2003, 2006, 2010, 2012, 2016)
- Cities (provided by the Urban Audit) with two reference years (2014, 2018)
- Functional urban areas (FUAs, provided by the Urban Audit) with two reference years (2014, 2018)
- Larger Urban Zones (LUZ)
- Local Administrative Units (LAU)
- Metropolitan Areas (MET)
- Metropolitan Regions (MR)
- Morphological Urban Areas (MUA)
- Urban Morphological Zones (UMZ)

Indicators shall be delivered in one of these standard nomenclatures to qualify as ‘key’. This allows the Portal to provide visualisation tools (charts, maps, etc.) for these highly relevant indicators.

A complete list of standard nomenclatures, as well as the corresponding lists of entities, has been compiled in a Dictionary-of-spatial-units available in the ESPON 2020 Database Documentation.

New standard nomenclatures may be added to the database system based on newly identified needs. This is subject to approval by ESPON EGTC.

Defining the scope and structure of the delivery

This preparatory work should be performed by the Project early in the delivery process (weeks 2 to 4). An overview table has been set up by ESPON EGTC to guide the manager in the development of an exhaustive and structured delivery. The document is also used by ESPON EGTC for the delivery of maps and figures as a separate process. The template of this document can be downloaded from the ESPON Database online manual.

Filling in and validating the overview table for data delivery is an 8-step process (Figure 4).

(Step 1) The approver (ESPON EGTC) provides a list of maps and figures (as in the draft final report).

(Step 2) The approver identifies which maps and figures (from the list) should have data associated with them.

(Step 3) For each map or figure flagged by the approver, the manager identifies one or several indicators. If a map or figure is associated with several indicators, the manager should duplicate the lines, in order to have only one indicator per line. The manager then:

- indicates in which channel the indicator will be delivered (‘Main’ vs ‘Other’)
- creates an 8-digit code for each indicator
- indicates which years shall be covered

If support is needed, the reviewer may step in and provide advice on appropriate indicators to deliver in relation to certain maps or figures. To do so, the reviewer may ask the manager to compile an extract of all maps and figures presented in the final reports as image files.
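The line duplication described in Step 3 can be sketched as follows. The field names (`figure`, `indicators`, `code`, `channel`, `years`) are assumptions and do not reproduce the columns of the official overview table template. The code check only verifies the length of the code, since the manual does not detail its composition here.

```python
def one_indicator_per_line(rows: list[dict]) -> list[dict]:
    """Duplicate each map/figure line so that every line carries one indicator."""
    expanded = []
    for row in rows:
        for indicator in row["indicators"]:
            # Copy the map/figure information, then add the indicator's own fields.
            line = {k: v for k, v in row.items() if k != "indicators"}
            line.update(indicator)
            expanded.append(line)
    return expanded


def check_line(line: dict) -> list[str]:
    """Check that the three items the manager must provide are filled in."""
    problems = []
    if line.get("channel") not in {"Main", "Other"}:
        problems.append("channel must be 'Main' or 'Other'")
    if len(line.get("code", "")) != 8:
        problems.append("indicator code must have 8 characters")
    if not line.get("years"):
        problems.append("at least one year must be listed")
    return problems


# Invented example: one map associated with two indicators.
rows = [{
    "figure": "Map 1",
    "indicators": [
        {"code": "ABCD0001", "channel": "Main", "years": [2015, 2016]},
        {"code": "ABCD0002", "channel": "Other", "years": [2016]},
    ],
}]
for line in one_indicator_per_line(rows):
    print(line["figure"], line["code"], check_line(line))
```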

(Step 4) The reviewer reviews the table and provides suggestions to enhance the delivery. These are discussed until an agreement is reached on the content of the table between the manager, the reviewer and approver.

(Step 5) The Database team transfers this list of indicators to the ‘List indicators’ spreadsheet. On the basis of a dialogue between the reviewer and the manager, indicators under ‘Main Data’ are complemented, if possible, with their background indicators (together referred to as ‘genealogies’) and gathered in datasets. Indicators under ‘Other Data’ are grouped in files for delivery.

(Step 6) The reviewer sends the resulting overview table to the approver. The approver reviews the table and checks that the indicators identified for delivery exhaustively cover the original data produced in the course of the project. The approver validates the overview table by sending an e-mail to both the manager and the reviewer. The resulting list of indicators is considered binding for the upcoming delivery.

Then the delivery process itself is implemented in the Upload interface.

(Step 7) At the end of the process, the reviewer indicates whether each of the listed indicators has been delivered. The overview table is then sent back to the approver.

(Step 8) The approver checks whether all indicators were delivered as foreseen in Step 2.
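The final check in Steps 7 and 8 amounts to comparing the binding list of indicators with what was actually delivered. A minimal sketch, assuming both lists are available as sets of indicator codes (the codes below are invented):

```python
def delivery_report(binding: set[str], delivered: set[str]) -> dict[str, list[str]]:
    """Compare the validated list of indicators with the delivered ones."""
    return {
        "missing": sorted(binding - delivered),      # agreed but not delivered
        "unexpected": sorted(delivered - binding),   # delivered but never agreed
    }


binding_list = {"ABCD0001", "ABCD0002", "ABCD0003"}
delivered_list = {"ABCD0001", "ABCD0003", "ABCD0004"}
print(delivery_report(binding_list, delivered_list))
```

An empty ‘missing’ list is what the approver is looking for; any ‘unexpected’ entry suggests the overview table was not updated during the delivery.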

Reviewer’s guideline

The approver (ESPON EGTC) initiates the process and sends the first draft of the overview table to the project data manager. However, if needed, a template of the overview table is available in the ‘reviewer toolbox’. The reviewer shall be proactive in this preparatory process to make sure that the overview table is produced in due time.

Approver’s guideline

The approver is key in the preparatory process. He/she initiates the process by sending the first draft of the overview table to the project data manager. The template of the overview table is provided in the approver toolbox. Additional indications on how to fill it in are presented in the “Instruction” spreadsheet.

Preparing for data delivery during the project

In order to ease the delivery of data, the Project is invited, during its implementation, to:

- Carefully store metadata (definition and scope of each indicator)

- Keep track of multiple data sources

- Compile indicators as displayed on maps in the reports together with indicators used in the calculation process.

You may find more answers to your questions on the preparation of data in the FAQ section of this manual.
