ESPON 2020 Core Data Updating Strategy


The Updating module is another application in the ESPON Data Portal that allows a user to insert the detected virtual changes into the final database. That module is fully integrated in the ESPON Database and it takes profit of other applications that have been developed in another working packages. All the source code can be found in the official git repository in a module called Updating[1] . Two user expert roles can be derived from Updating module and Updating strategy: one role to run server scripts and another role to update indicators in a local machine with Excel Updating Suite.

Required python packages:

  • python == 3.5.2
  • django == 1.11.22
  • psycopg2 == 2.7.1
  • pandas == 0.22
  • xlwings == 0.15.1
  • xlsxwriter == 1.0.5
  • xlrd == 1.1.0.
  • jinja2 == 2.10.1


Updating tables

To make the Updating module work properly, five tables under the updating namespace in the ESPON database are needed: tdta_metadata, tdta_eurostat, tdta_missing_pattern, tdta_nulls y tdta_report. The three first tables are the most important because they store data that is relevant for a correct working of the tool; last two tables are optional

  • tdta_metadata
    • This table contains information that is relevant to a given ESPON indicator, basically metadata and a unique identifier to group multi indicators
  • tdta_eurostat
    • This table contains information about Eurostat table names as well as parameters that are needed to extract the counterpart ESPON indicators from those datasets
  • tdta_missing_pattern
    • This table contains information related to missing patterns, i.e. a missing matrix as well several parameters to automatize the updating procedure in a local workstation
  • tdta_report
    • This table contains a detailed report about the updates for a given indicator, providing the number of rows and affected tables in the ESPON database
  • tdta_nulls
    • This table contains information about detected nulls for a given indicator

Updating command scripts

The Updating module also contains seven Django commands, namely python scripts, to be launched by an expert user in the ESPON server via line commands. The name of each script reflects the order of execution. These scripts are located in a folder named management and the name of each script reflects the order of execution. Moreover, there are other scripts useful in different stages of the updating procedure. The location folder and the main purpose of different scripts are detailed in Table 1.

Table 1: Command scripts (in blue) and auxiliary functions (in green) employed in Updating module

File name




This script will populate the tdta_metadata table


This script will download the Eurostat tables from Eurostat services and store those tsv files in a private folder in the server


This script will populate the tdta_eurostat table


This script will populate the tdta_nulls table


This script will populate the tdta_missing_pattern table


This script will build the Excel Updating Suite as a zip package


This script will update several tables: tdta_data_item, trel_data_item_source, trel_data_item_method and tdta_metadata


This file contains a class to get data from ESPON database in a homogeneous way, irrespective from their nature indicator (single, multi).


This file contains a dictionary of useful parameters to get the counterpart values from Eurostat tables


A collection of utilities in a module for several purposes: transform both ESPON and Eurostat data to pandas-like structures, helpers functions useful to another functions in different modules


A collection of functions to transform NUTS units in a hierarchical way, specially to transverse ESPON time series and generate relationships between rows

Table 2: Parametrized templates to generate at runtime the macros for the Excel Updating Suite

File name




Parametrized script to launch detected virtual changes into ESPON Excel sheets


Parametrized script to start a new manual updating session and to make differences between ESPON time series


Parametrized script to launch an Outlier detection method on initial/modified ESPON data


Parametrized script to save changes into a local database file


Parametrized script to compute spatial matrix


Parametrized script to compute thematic matrix

Updating folders

To plug this module into the Database Portal an accessible folder in the server with writing/reading permissions is needed. The files generated by this application will be stored in this folder, in a nested structure with paths relative to root directory,  to be further retrieved on demand by the application.

  • eurostat
    • The location of the folder in which datasets from Eurostat web services are downloaded. Not empty
  • macros
    • The location of the folder in which macros are generated for creating the Excel Updating Suite. A nested subfolder structure is generated. Initially empty.
  • eus
    • An initially empty folder but EUS suite is generated in a subfolder called eus, as well as xlsx files are stored in another subfolder called xslx to make the zip automatically. A nested subfolder structure is generated to get zip files per country basis or per all countries
  • db_updates
    • The location of the folder in which user uploads database dumps in sqlite3 format
  • db_reports
    • The location of the folder in which a report is generated in text format is generated when user launch updates files against the ESPON database. Initially empty

How to use the Updating module

After connecting to ESPON server, an expert user will be able to launch the command scripts. A list of all ESPON Base indicators with their unique identifiers (ind_id) as stored in he ESPON database can be found in Table 4.

The execution procedure will be as follow:

Run script

python 1st_insert_metadata –indicator_id <ind_id>

This script will read metadata from ESPON 2020 Database and will populate the majority of all fields in the tdta_metadata table. This is achieved through a BaseIndicator class (in Updating/ that populates all fields except for number_of_null

Run script

python 2nd_download_eurostat –indicator_id <ind_id>

This script will start the downloading process of tsv files from Eurostat downloading services[1]:

  1. gets the name of the Eurostat dataset as read from a EUROSTAT_EQUIVALENCES dictionary key (eurostat_file) in script
  2. checks the lastest version of downloaded tsv file (by reading the metadata.txt file with latest date). If current date (Y-M-Dformat), tsv file is not downloaded; otherwise, is downloaded
  3. if downloaded, moves those tsv files to eurostat directory with a created directory according to some categories (DEMOGRAPHY, ECONOMY, EDUCATION, EMPLOYMENT, ENERGY, ITS, R&D, SOCIETY)


Run script

python 3rd_insert_eurostat –indicator_id <ind_id>

This script will populate the tdta_eurostat table that will be used in the Updating application to get data from Eurostat and compare with pair-wise ESPON indicator so as to get the so-called virtual changes (later with script)

Run script (optional)

python 4th_insert_nulls --indicator_id <ind_id> --country_code <ctr_code>

This script will populate the tdta_nulls table that will be useful to get an idea of existing nulls in Base Indicators. Moreover, a field in tdta_metadata called number_nulls is populated after this scripts is run.

 Run 5th_insert_mp script

python 5th_insert_mp --indicator_id <ind_id> --country_code <ctr_code> This script will populate the tdta_missing_pattern table that will be useful to get the missing pattern for a given ESPON indicator and a given country. This information will be very valuable later, when script is run. One completed, there is no need to re-run until a new bunch update of Base Indicator is injected in ESPON database.  These five first scripts have only to be launched once and this is the only requirement to start with the updating procedure.

 Run script

 python 6th_create_EUS --indicator_id <ind_id> --country_code <ctr_code> --create_zip True

This script will create the so-called Excel Updating Suite that consists of a zip file with an Excel Dashboard and a bunch of files/macros, gathering information from Updating database (previous steps). The location of created zips is eus/zips folder, with either subpath to country-basis folder (for instance, inds/145/By_Country/) if country_code parameter, or subpath to all-country-basis folder (for instance, inds/145/All_countries/).

These macros are generated from parametrised python templates that are located in the macros/templates folder of the  Updating module. See Table 2

Once expert user has built the EUS zip files for all countries for a given indicator in the server, an accessible point to be downloaded will be ready. With all Excels files and macros in his local machine, the updating procedure controlled by an expert user can be start in a local workstation following the workflow as specified in the Table 3.

When this updating procedure is finished, a local sqlite3 file with all virtual changes for all the countries should be generated (see step 12, Table 3) and the expert user should upload that file to a folder (db_updates) in the ESPON server.

Run script

python 7th_importer2ESPONDB --indicator_id <ind_id>

This script will read the uploaded sqlite3 file and all virtual changes will be integrated in the final ESPON database. A txt report is generated and stored in database containing relevant information about data to be inserted in ESPON database.

Table 3. Sequential steps in the Updating workflow to updating a base indicator





Preparation of the data for the update (creation of the Excel Updating Suite)

After selecting the indicator to be updated, a process is run on the server to generate an Excel Updating Suite for each country, containing, in different sheets, the current indicator figures, the virtual changes derived from the comparison with Eurostat data, and a specific sheet with macro buttons to launch the following steps. Those Excel sheets are downloaded from the server within a ZIP file.


Detection of potential changes from Eurostat data, which are compliant with the spatial and thematic constraints of the specific indicator, when estimated data are involved.

After unzipping the Excel sheets, the process starts with one of the countries. In order to detect the potential and compliant changes, the corresponding macro should be run from the Excel Updating Suite, i.e. by pressing the Launch button


Creation of spatial and thematic validation matrices for the whole time-series.

In order to create the matrices, the corresponding buttons (Spatial matrix and Thematic matrix) in the Excel Updating Suite should be pressed.


Creation of supportive charts (line plots) of the indicator time-series.

Optional. In case of need, the expert in charge of the update can generate automatic line plots of the data, to have a graphical overview of the indicator time-series by pressing the Charts button.


Review of the resulting matrices to decide eventually if potential changes should be integrated or rejected.

The matrices provide the deviation of potential changes with regard to the spatial and thematic rules. It is up to the expert in charge of the update, to decide whether a specific deviation is acceptable or not.


Manual editing of potential changes to be rejected or modified.

All the potential changes not edited (or removed) in the corresponding Excel sheet will be eventually actual changes in the database. If some changes are considered as unacceptable, should be removed from the corresponding Excel sheet at this stage.


Detection of potential outliers in ESPON time series.

In case of need, the expert in charge of the update can launch two algorithms to detect potential outliers in the ESPON time series by pressing the Outliersbutton.


Save potential changes to the local DB.

By means of the Save button, the potential changes that remain in the Excel sheets are stored in a local database.


Introduction of additional figures from other sources and save the changes to the local DB.

There is the possibility to add manually potential changes coming from other sources (including estimated values). This is done in this step by pressing the corresponding buttons (Start manual session and Detect manual changes). If manual changes have been introduced, the potential changes need to be saved to the local DB again.


Repeat processes 1 to 9 for all the countries covered by the specific indicator.


Prepare a single file for the specific indicator including all the countries.


Upload the file to the ESPON DB Server and actually update the indicator.

Table 4. ESPON Base Indicators and unique indicator identifier as stored in ESPON database





Population, total, by broad age (0-14, 14-65, >65) groups and gender   


Life expectancy, by age, gender   


Population, total, by broad age groups (by 5s) and gender


Old age dependence ratio


Gender imbalances         


Ageing index      


Deaths by age groups and gender (NUTS2)


Deaths by age groups and gender (NUTS3) 


Live Births         


Total Population


Total Deaths


Natural Change of population


Total Net Migration          



Employment rates by sex & broad age groups (%)


Employment by sex & broad age groups (1000)


Employment by age, economic activity (Econ. sections C, F, I)


Unemployment by sex & age (1000)


Unemployment by sex & age (%)   


Employment by age, economic activity (Econ. sections A, F and K)


Long term unemployment



Early leavers from education and training (18-24 years) by sex


Participation rate in education and training (age group 25 to 64 years)


Population aged 25-64, 30-34 by educational attainment level, sex (%)          



Disposable income of private households


Gross value added at basic prices, million EURO


Households with broandband access (percentage of households)


Gross domestic product (GDP) at current market prices by Million PPS


Gross domestic product (GDP) at current market prices by Million Euro


Gross domestic product (GDP) at current market prices by Purchasing Power Standard per inhabitant



Share of energy from renewable sources (in gross final energy consumption)


Greenhouse gas emissions by source sector (EEA)   


Primary energy consumption


Energy intensity of the economy



People at risk of poverty or social exclusion (% of total population)


Severe material deprivation rate by NUTS2 regions (% of population)


People living in households with very low work intensity (population aged 0 to 59 years)         


At-risk-of-poverty rate by NUTS2 regions (% of population)



Human resources in science and technology


Patent applications to the EPO by priority year by NUTS 3 regions per million inhabitants


Total intramural R & D expenditure (GERD) by sectors of performance (Percentage of GDP)


Total R&D personnel and researchers by sectors of performance, sex (Percentage of active population)



Individuals who used the internet for interaction with public authorities           


Individuals who have never used a computer (% of population)


Individuals who ordered goods or services over the Internet for private use (% of population)


In this Page

On this section