Preparing for data delivery


Before starting to edit metadata in the Upload interface, the Project needs to prepare for the delivery.

 Identification of main stakeholders

As already highlighted, data delivery is a three-stakeholder process. These stakeholders must be clearly identified from the start.

Table 1 Stakeholders and their role in the delivery process

Stakeholder | Role in the process
ESPON EGTC | Project Data Approver
Database Team | Project Data Reviewer
Service Provider | Project Data Manager

For the delivery process to run smoothly, the Project shall identify, among the project partners, a single contact point in charge of data delivery (the project data manager). The manager is responsible for interacting with the Database Team and ESPON EGTC on issues related to data delivery, and for uploading metadata and data through the Upload interface.

At the same time, the Project is assigned a single database reviewer, who is in charge of guiding the manager through the delivery process, checking uploaded metadata and data to ensure their high quality, and validating the delivery.

ESPON EGTC identifies one or several approvers, who are in charge of making sure that the delivery is exhaustive and is carried out with diligence. This role is usually played by project experts or persons in charge of data management at ESPON EGTC.

Database major concepts

The ESPON 2020 Database Portal is meant to be as inclusive as possible for territorial data resulting from Projects. It is built as a framework open enough to accommodate most data and regulated enough to allow for a tiered provision of data to end-users. In this framework, the concept of indicator is a cornerstone of the data delivery.

 Indicator

An indicator is a measure, a combination of measures, or a qualification of entities which provides an idea of what territories or relations between territories are like. Indicators may be of any nature (stock, ratios, typologies, ranking) and apply to any sort of objects (territorial entities, grid cells, flows, etc.).

Depending on their function and role regarding project results, four indicator statuses can be identified (see Table 2 for a synthetic view):

(1) base indicators are basic territorial indicators provided by ESPON EGTC to the public. They are accessible from the ESPON Database Portal and are considered the reference for data processing in ESPON Projects.

(2) a key indicator is a measure, combination of measures or qualification of entities which (a) aims at covering the entire ESPON space, (b) is based on a standard ESPON nomenclature (e.g. NUTS, FUA, MUA), (c) has been selected by a Project as most policy-relevant and innovative among all their data output, (d) is associated with detailed metadata.

(3) a background indicator is a measure, combination of measures or qualification of entities which was used in the process of calculating a key indicator. It (a) aims at covering the whole ESPON space or a transnational cooperation area, (b) may or may not be based on standard ESPON nomenclature, (c) is associated with detailed metadata.

(4) other indicators are measures, combinations of measures or qualifications of entities which were used, compiled or calculated by the project but were not selected as part of the Main Data delivery. These are subject to lighter metadata.
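
The four statuses can be read as a small decision rule. The sketch below is illustrative only: the Indicator fields and the status function are hypothetical names, not part of the ESPON Database, and it assumes that base indicators are recognised by their provider (ESPON EGTC) rather than by their content.

```python
from dataclasses import dataclass

# Examples of standard nomenclatures quoted in the text; not the complete list.
STANDARD_NOMENCLATURES = {"NUTS", "FUA", "MUA"}


@dataclass
class Indicator:
    """Hypothetical description of an indicator, for illustration only."""
    name: str
    provided_by_espon_egtc: bool  # reference data published by ESPON EGTC
    spatial_extent: str           # "ESPON space", "TCA" or anything else
    nomenclature: str             # e.g. "NUTS", or a project-specific one
    selected_as_key: bool         # chosen by the Project as most policy-relevant
    parent_of_key: bool           # used in the calculation of a key indicator
    detailed_metadata: bool


def status(ind: Indicator) -> str:
    """Return 'base', 'key', 'background' or 'other' following the definitions above."""
    if ind.provided_by_espon_egtc:
        return "base"
    if (ind.selected_as_key
            and ind.spatial_extent == "ESPON space"
            and ind.nomenclature in STANDARD_NOMENCLATURES
            and ind.detailed_metadata):
        return "key"
    if (ind.parent_of_key
            and ind.spatial_extent in ("ESPON space", "TCA")
            and ind.detailed_metadata):
        return "background"
    return "other"


ratio = Indicator("employees per enterprise", False, "ESPON space", "NUTS", True, False, True)
count = Indicator("total employees", False, "TCA", "project grid", False, True, True)
print(status(ratio), status(count))  # prints: key background
```

An indicator that fails the key or background conditions falls through to "other", which mirrors the lighter requirements of the Other Data channel.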


Table 2 ESPON Database indicator statuses, their characteristics and relations to delivery channels


Channel | Base Data | Main Data | Main Data | Other Data
Indicator status | base | key | background | other
Provider | ESPON Data & Map update; ESPON Database Portal | ESPON Projects | ESPON Projects | ESPON Projects
Characteristics | reference data; regularly updated | most innovative and policy-relevant; well-documented metadata | parent of at least one key indicator; well-documented metadata | any indicator used, compiled or calculated by ESPON Projects
Spatial extent | ESPON space or TCA | ESPON space or TCA | ESPON space or TCA | no constraint
In standard nomenclature (e.g. NUTS, FUA, MUA) | standard only | standard only | standard or non-standard | standard or non-standard
Approved data delivery format | | .csv | .csv; .shp, .tiff; other | .csv; .shp, .tiff; other
Location: in Search & Download interface | x | x | x |
Location: in Project Archive | | x | x | x


Main Data concepts: datasets, structural types, genealogies, sources and preliminary processing

Dataset

A dataset is a collection of one or more indicators (including at least one key indicator), their background indicators, the relations between these indicators, and their related metadata.

Each dataset is checked, validated and integrated in the database separately. However, a dataset is only a vehicle for data integration: it is not reflected as such in the Search & Download interface.

Structural types

Indicators of the Main Data channel (key, background) can either be simple or part of a multi-dimensional structure.

A Multi indicator is an indicator whose data values can be broken down or further specified along various dimensions and classes. For instance, the indicator total population can be broken down by age group or by gender; the Corine land cover classification can be broken down at level 1 into 6 classes, at level 2 into 15 classes and at level 3 into 45 classes (Table 3).


Table 3 Examples of Multi-indicators

A Dimension of a given Multi indicator expresses the kind of statistical categories (or classes) into which the Multi indicator can be broken down (e.g. Population by age group). By definition, a Multi indicator must be associated with at least one Dimension. A Dimension does not have any data attached to it.

A Class is a particular instance of a given Dimension within the Multi indicator (e.g. Population by age group: 0-5 years old). At least two Classes should be associated with any Dimension. Every Class must have data attached to it.

A Single indicator is an indicator which is not multi-dimensional and is not meant to be broken down along several categories.

The distinction between Multi and Single indicators is relative to a project. An indicator considered as a Single indicator can be transformed into a Multi indicator if a relevant set of classes is used to break it down.
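
The rules on Dimensions and Classes can be checked mechanically. The following is a minimal sketch, assuming a hypothetical in-memory representation (the class names, field names, unit codes and values are invented for illustration and are not the Upload interface's own format).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndicatorClass:
    label: str
    values: Dict[str, float] = field(default_factory=dict)  # unit code -> value


@dataclass
class Dimension:
    label: str  # a Dimension carries no data of its own
    classes: List[IndicatorClass] = field(default_factory=list)


@dataclass
class MultiIndicator:
    name: str
    dimensions: List[Dimension] = field(default_factory=list)


def validate(multi: MultiIndicator) -> List[str]:
    """Return the list of rule violations; an empty list means the structure is valid."""
    problems = []
    if not multi.dimensions:
        problems.append(f"{multi.name}: at least one Dimension is required")
    for dim in multi.dimensions:
        if len(dim.classes) < 2:
            problems.append(f"{dim.label}: at least two Classes are required")
        for cls in dim.classes:
            if not cls.values:
                problems.append(f"{cls.label}: a Class must have data attached")
    return problems


# Invented unit codes and values, for illustration only.
population = MultiIndicator("Total population", [
    Dimension("Population by age group", [
        IndicatorClass("0-5 years old", {"XX001": 1200.0}),
        IndicatorClass("6 years old and over", {"XX001": 18800.0}),
    ]),
])
print(validate(population))  # prints []
```

Data is attached only to Classes, which reflects the statement that a Dimension does not hold any data itself.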

Genealogies

The relations between a key indicator and its background indicators in a given dataset are referred to as the indicator genealogy. Key indicators, as the main data output of the project, have to be complemented (when possible) with their background indicators. This makes it possible to describe the calculation procedures which led from the background indicators to the key indicator. Background indicators may themselves be complemented by their own background indicators. In a given genealogy relation, where one or several indicators are used in the calculation of another, the former are referred to as parents and the latter as the child.



Figure 3 A case example of genealogy relation


Hence, a project which has selected a key indicator that is a ratio shall also deliver the count data it is based on as background data. For instance, to deliver the indicator average number of employees per enterprise at NUTS3, the project should also deliver the total number of employees at NUTS3 and the total number of enterprises at NUTS3, and provide a brief description of the methodology used to calculate the ratio. As another example, if a project has produced a typology, it shall also deliver all background indicators that have been used for this purpose (Figure 3).

Describing key indicator genealogies gives the end-user of the database a clearer picture of the calculation processes and access to the raw data used by the project. It also allows the end-user to navigate in the Database Portal from one indicator to another.
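
As an illustration of the employees-per-enterprise example, a genealogy can be pictured as a mapping from each child indicator to its parents. This is a sketch under stated assumptions: the indicator names, unit codes and counts are invented, and the functions are hypothetical helpers rather than anything offered by the Database Portal.

```python
from collections import deque
from typing import Dict, List

# child indicator -> its parents (background indicators); names are hypothetical
GENEALOGY: Dict[str, List[str]] = {
    "avg_employees_per_enterprise": ["total_employees", "total_enterprises"],
}


def background_indicators(child: str, genealogy: Dict[str, List[str]]) -> List[str]:
    """List all direct and indirect parents of `child`, nearest first."""
    found: List[str] = []
    queue = deque(genealogy.get(child, []))
    while queue:
        parent = queue.popleft()
        if parent not in found:
            found.append(parent)
            # Parents may themselves have parents.
            queue.extend(genealogy.get(parent, []))
    return found


def ratio(numerator: Dict[str, float], denominator: Dict[str, float]) -> Dict[str, float]:
    """Divide two count indicators unit by unit, skipping missing or zero denominators."""
    return {
        unit: numerator[unit] / denominator[unit]
        for unit in numerator
        if denominator.get(unit)
    }


# Invented NUTS3-like codes and counts, for illustration only.
employees = {"XX001": 1200.0, "XX002": 450.0}
enterprises = {"XX001": 100.0, "XX002": 90.0}

print(background_indicators("avg_employees_per_enterprise", GENEALOGY))
# prints ['total_employees', 'total_enterprises']
print(ratio(employees, enterprises))
# prints {'XX001': 12.0, 'XX002': 5.0}
```

Delivering the two count indicators alongside the ratio is what lets an end-user reproduce the calculation from the raw data.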

Sources and Preliminary Processing

All data sources must be carefully documented in the database. Sources include all publications to which a Project has resorted when compiling a given indicator. Preliminary processing refers to all necessary adjustments applied to data before it was used by the project (aggregation, disaggregation, estimation, etc.).

Because Sources and Preliminary Processing may be specific to a territorial entity or group of territorial entities for a given indicator and a given year, each data cell is flagged with a Source and may, if necessary, be flagged with a Preliminary Processing. These flags are all gathered in a separate column on the right-hand side of their respective data column. Information on how to flag data is provided in a later section.
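
To visualise the layout described above, the sketch below interleaves each data column with a flag column on its right. Everything specific in it is an assumption made for illustration: the column names, the flag codes (S1, P1), the ";" separator and the values are invented, and the actual flagging rules are those given in the later section of this manual.

```python
import csv
import io

years = ["2015", "2016"]

# unit -> year -> (value, source flag, preliminary processing flag or None)
# All codes and values are invented for illustration.
data = {
    "XX001": {"2015": (10.5, "S1", None), "2016": (11.0, "S1", "P1")},
    "XX002": {"2015": (7.2, "S2", "P1"), "2016": (7.4, "S2", None)},
}


def build_rows(data, years):
    """Build a table where each data column is followed by its flag column."""
    header = ["unit"]
    for year in years:
        header += [year, f"{year}_flags"]
    rows = [header]
    for unit, per_year in data.items():
        row = [unit]
        for year in years:
            value, source, processing = per_year[year]
            # A Source flag is always present; a Preliminary Processing flag is optional.
            flags = source if processing is None else f"{source};{processing}"
            row += [value, flags]
        rows.append(row)
    return rows


rows = build_rows(data, years)
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Keeping the flags next to the values they qualify makes it possible to document a different Source for each territorial entity and year.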

Standard nomenclatures

A Standard nomenclature is a particular set of spatial entities which is defined as a reference for data collection in the ESPON context. Standard nomenclatures include:

  • NUTS (Nomenclature of Territorial Units for Statistics) which may be defined for four levels (NUTS0, NUTS1, NUTS2, NUTS3) and several reference years (1999, 2003, 2006, 2010, 2012, 2016)
  • Cities (provided by the Urban Audit) with two reference years (2014, 2018)
  • Functional urban areas (FUAs, provided by the Urban Audit) with two reference years (2014, 2018)
  • Larger Urban Zones (LUZ)
  • Local Administrative Units (LAU)
  • Metropolitan Areas (MET)
  • Metropolitan Regions (MR)
  • Morphological Urban Areas (MUA)
  • Urban Morphological Zones (UMZ)

Key indicators shall be delivered in one of these standard nomenclatures to qualify as key. This allows the Portal to provide visualisation tools (charts, maps, etc.) for these highly relevant indicators.

A complete list of standard nomenclatures, as well as the corresponding lists of entities, has been compiled in a Dictionary-of-spatial-units available in the ESPON 2020 Database Documentation.

New standard nomenclatures may be added in the database system based on new identified needs. This is subject to approval by ESPON EGTC.

Defining the scope and structure of the delivery

This preparatory work should be performed by the Project early in the delivery process (weeks 2 to 4). An overview table has been set up by ESPON EGTC to guide the manager in the development of an exhaustive and structured delivery. The document is also used by ESPON EGTC for the delivery of maps and figures as a separate process. The template of this document can be downloaded from the ESPON Database online manual.

Filling in and validating the overview table for data delivery is an 8-step process (Figure 4).

(Step 1) The approver (ESPON EGTC) provides a list of maps and figures (as in the draft final report).

(Step 2) The approver identifies which maps and figures (from the list) should have data associated with them.

(Step 3) For each map or figure identified by the approver, the manager identifies one or several indicators. If a map or figure is associated with several indicators, the manager should duplicate the line, so that each line holds only one indicator. The manager then:

  • indicates in which channel the indicator will be delivered (Main vs Other)
  • creates an 8-digit code for each indicator
  • indicates which years shall be covered

If support is needed, the reviewer may step in and provide advice on appropriate indicators to deliver in relation to certain maps or figures. To do so, the reviewer may ask the manager to compile an extract of all maps and figures presented in the final reports as image files.
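
The "one indicator per line" rule of Step 3 can be sketched as follows. This is only an illustration: the dictionary keys, the figure labels and the example codes are hypothetical, and since the composition of the 8-digit code is not specified here, the sketch only checks its length.

```python
from typing import Dict, List


def one_indicator_per_line(figures: List[Dict]) -> List[Dict]:
    """Duplicate each map/figure line so that every line holds a single indicator."""
    lines = []
    for figure in figures:
        for ind in figure["indicators"]:
            if ind["channel"] not in ("Main", "Other"):
                raise ValueError(f"unknown channel: {ind['channel']}")
            if len(ind["code"]) != 8:
                raise ValueError(f"code must have 8 characters: {ind['code']}")
            lines.append({
                "figure": figure["figure"],
                "indicator": ind["name"],
                "code": ind["code"],
                "channel": ind["channel"],
                "years": ind["years"],
            })
    return lines


# Hypothetical content, for illustration only.
figures = [
    {"figure": "Map 1", "indicators": [
        {"name": "Employees per enterprise", "code": "10000001",
         "channel": "Main", "years": [2015, 2016]},
        {"name": "Total employees", "code": "10000002",
         "channel": "Main", "years": [2015, 2016]},
    ]},
    {"figure": "Figure 2", "indicators": [
        {"name": "Survey results", "code": "10000003",
         "channel": "Other", "years": [2016]},
    ]},
]

for line in one_indicator_per_line(figures):
    print(line["figure"], line["code"], line["channel"], line["indicator"])
```

Map 1 appears on two lines in the output because it is associated with two indicators.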

(Step 4) The reviewer reviews the table and provides suggestions to enhance the delivery. These are discussed until the manager, the reviewer and the approver reach an agreement on the content of the table.

(Step 5) The DB team transfers this list of indicators to the List indicators spreadsheet. On the basis of a dialogue between the reviewer and the manager, indicators under Main Data are complemented, if possible, with their background indicators (together referred to as genealogies) and gathered in datasets. Indicators under Other Data are grouped in files for delivery.

(Step 6) The reviewer sends the resulting overview table to the approver. The approver reviews the table and checks that the indicators identified for delivery exhaustively cover the original data produced in the course of the project. The approver validates the overview table by sending an e-mail to both the manager and the reviewer. The resulting list of indicators is considered binding for the upcoming delivery.

Then the delivery process itself is implemented in the Upload interface.

(Step 7) At the end of the process, the reviewer indicates whether each of the listed indicators has been delivered. The overview table is then sent back to the approver.

(Step 8) The approver checks whether all indicators were delivered as foreseen in Step 2.
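
The final check of Steps 7 and 8 amounts to comparing the binding list of indicators with what the reviewer has marked as delivered. A minimal sketch, assuming hypothetical indicator codes and a simple code-to-boolean mapping filled in by the reviewer:

```python
from typing import Dict, List


def undelivered(binding_codes: List[str], delivered: Dict[str, bool]) -> List[str]:
    """Return the codes from the binding list that are not marked as delivered."""
    return [code for code in binding_codes if not delivered.get(code, False)]


# Hypothetical codes, for illustration only.
binding = ["10000001", "10000002", "10000003"]
marked_by_reviewer = {"10000001": True, "10000002": False}

print(undelivered(binding, marked_by_reviewer))
# prints ['10000002', '10000003']
```

An indicator that the reviewer has not marked at all is treated as not delivered, so an incomplete overview table cannot pass the check silently.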

Reviewer's guideline

The approver (ESPON EGTC) initiates the process and sends the first draft of the overview table to the project data manager. However, if needed, a template of the overview table is available in the reviewer toolbox. The reviewer shall be proactive in this preparatory process to make sure that the overview table is produced in due time.  

Approver's guideline

The approver is key in the preparatory process. He/she initiates the process by sending the first draft of the overview table to the project data manager. The template of the overview table is provided in the approver toolbox. Additional indications on how to fill it in are presented in the Instruction spreadsheet.

Preparing for data delivery during the project

To ease the delivery of data, the Project is invited, during its implementation, to:

-       Carefully store metadata (definition and scope of each indicator)

-       Keep track of multiple data sources

-       Compile indicators as displayed on maps in the reports together with indicators used in the calculation process.

You may find more answers to your questions on the preparation of data in the FAQ section of this manual.