Decision making: data driven vs. expert consultation

To measure is to know? Transport data can show ‘what’, but experts are essential to show ‘why’. New forms of transport data have enormous potential, but the author argues that we need to update our theories and models to take full advantage of them and avoid misuse. We need to be able to answer ‘what to’ questions and not only ‘what if’ questions.

01. Introduction

Conventional decision making in the transport sector has traditionally relied on domain experts and on dedicated tools and models, always with raw data as the basis for the analysis. The recurring problem was the lack of data, or the difficulty of collecting it. Recent technological advances have significantly increased the quantity (and quality) of available data, especially where they have been accompanied by a behavioural change in which individuals actively participate in creating and collecting data (Waze and OpenStreetMap are examples of user participation in the creation of large and reliable datasets). As a result, the amount of data that can be generated and collected today is enormous, and its quality is high.

When public authorities face concrete problems, they usually approach the solution with a “what to” type of question (what should be done?). Experts respond by defining a set of scenarios and using tools (models) to answer a set of “what if” type of questions (what happens if a given scenario is implemented?).
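The difference between the two types of question can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy congestion formula, the demand figure and the candidate capacities are hypothetical and do not come from any model described in this note. A “what if” analysis evaluates scenarios chosen by the expert, while a “what to” analysis searches for the action that meets a stated goal.

```python
def travel_time(capacity, demand=3000.0, free_flow=10.0):
    """Toy congestion model (assumption): minutes needed to cross a corridor."""
    return free_flow * (1.0 + (demand / capacity) ** 2)


def what_if(scenarios):
    """Evaluate each expert-defined scenario: 'what if capacity were X?'"""
    return {name: travel_time(capacity) for name, capacity in scenarios.items()}


def what_to(target_time, candidate_capacities):
    """Answer 'what to do?': smallest capacity that meets the target, or None."""
    for capacity in sorted(candidate_capacities):
        if travel_time(capacity) <= target_time:
            return capacity
    return None


# Hypothetical scenarios, capacities in vehicles per hour.
scenarios = {"do nothing": 2000.0, "add a lane": 3000.0, "add two lanes": 4000.0}
for name, minutes in what_if(scenarios).items():
    print(f"what if we {name}? -> {minutes:.1f} min")

print("what to do for a 16 min target? -> capacity",
      what_to(16.0, scenarios.values()))
```

In this sketch the “what to” answer is still obtained by running the “what if” model repeatedly, which is essentially what the conventional process does by hand.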

02. Conventional decision making process

The transportation models currently in use were initially developed in the early 1950s, when the lack of technical capabilities for data collection made it necessary to develop methodologies for estimating various variables (e.g. travel time based on traffic flow using the BPR function). Nowadays, however, new data sources and big data can provide information of higher value that was previously not available to support decision making. An interesting real-world example is to create a park without paths and let people define them by using the park. The data collected in this case is the set of tracks left by the people, which after a few weeks forms a clear set of routes within the park, defined by and for the users. Once the “data collection period” ends, the paths can be built in the park following this user-driven design.
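As an illustration of the kind of estimation mentioned above, the BPR (Bureau of Public Roads) function computes link travel time from traffic flow as t = t0 · (1 + α · (v/c)^β). The sketch below uses the commonly quoted default parameters α = 0.15 and β = 4; the link values (free-flow time, capacity, volumes) are hypothetical.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Estimate link travel time from traffic flow with the BPR function.

    free_flow_time: travel time on the empty link (any time unit)
    volume, capacity: traffic flow and link capacity in the same unit (e.g. veh/h)
    alpha, beta: calibration parameters (commonly quoted defaults)
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


# Hypothetical link: 10 minutes at free flow, capacity of 2000 veh/h.
for volume in (0, 1000, 2000, 3000):
    minutes = bpr_travel_time(10.0, volume, 2000.0)
    print(f"{volume:>5} veh/h -> {minutes:.2f} min")
```

With these inputs the estimated travel time stays close to 10 minutes at low flows and rises to 11.5 minutes when the flow equals the capacity, which is why such functions were a practical substitute for travel times that could not be measured directly.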

Indeed, computer science has advanced immensely in recent years. Continuous technological progress and a societal shift towards information sharing and gamification have resulted in large datasets (big data) being collected and, to some extent, made openly available (open data), and these datasets grow exponentially.

In the transport domain, big data includes significant pieces of information that, if correctly processed, can provide a better understanding of both mobility needs and traffic status, significantly enriching the quantity and quality of the data collected until now. This data is mainly disaggregated, encompassing information on individual actors of the mobility spectrum (i.e. travellers/drivers and vehicles) in real time, at a very low cost compared with conventional (mostly aggregated) traffic data collection methodologies. The counterpart to the higher quality of disaggregated data is that the increased granularity has significantly increased the data filtering and processing needs.

Big data can be included in the transportation modelling framework to significantly improve, or even reformulate, the conventional and widely used transportation models and tools, if not to define and develop entirely new ones. Conventional modelling was mostly based on isolated traffic flow measurements with heterogeneous coverage, whereas currently available datasets, such as Floating Car Data and Bluetooth device detections, allow for more homogeneous coverage.

The new models will rely heavily on activity-based and/or agent-based modelling techniques, which have been developed extensively in recent years and can provide high-quality results, but which require large and detailed datasets. The comparison below highlights the most important issues of conventional transportation modelling and the new capabilities enabled by big data.

4-step transportation modelling

Since the mathematical framework was already defined and delimited, the major problem was how to collect the required data, due to:

  • Time constraints and limited resources.
  • Computation limitations.
  • High dependence on transport modellers’ expertise.
  • IT capabilities constantly a step behind those of transport modellers.

The era of Data

Nowadays the problem is not how to collect data, but how to select the right datasets, how to clean the data and how to process it.

  • New keyword: Big Data.
  • New specialization: Data Scientists.
  • Less theory, more application.
  • Transport modelling is now a step behind IT capabilities.

A non-exhaustive list of new datasets with significant potential for the transport sector follows:

  • Floating Car Data collected by professional fleets and individuals.
  • Bluetooth device detections.
  • GSM-related data.
  • Social media data.

Mobility and activity patterns, which are the main fuel of any transportation model, can be inferred from these datasets. In Thessaloniki, the Hellenic Institute of Transport (HIT) of the Centre for Research and Technology Hellas (CERTH) collects most of them (only the GSM data is not available) and uses them to understand mobility and activity patterns and to provide mobility services. A few figures on how the data is used in Thessaloniki are presented below:

Floating Car Data

This data is collected from a fleet of 1,200 taxis and is used to calculate Origin-Destination matrices for taxi users and to estimate traffic status in real time.
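A minimal sketch of how an Origin-Destination matrix can be derived from such trips is shown below. It assumes that each taxi trip has already been reduced to an (origin zone, destination zone) pair; the zone names and trips are hypothetical, and the sketch is not a description of the processing chain actually used in Thessaloniki.

```python
from collections import Counter


def od_matrix(trips):
    """Count trips per (origin_zone, destination_zone) pair."""
    return Counter((origin, destination) for origin, destination in trips)


# Hypothetical taxi trips, already map-matched to zones (assumption).
trips = [
    ("centre", "airport"),
    ("centre", "port"),
    ("centre", "airport"),
    ("port", "centre"),
]

matrix = od_matrix(trips)
zones = sorted({zone for trip in trips for zone in trip})

# Print the matrix row by row; pairs with no trips count as zero.
print("origin \\ destination:", zones)
for origin in zones:
    print(origin, [matrix[(origin, destination)] for destination in zones])
```

In practice the harder steps come before this count: detecting trip start and end points from raw GPS traces and assigning them to zones.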

Bluetooth device detectors

The network of Bluetooth device detectors is composed of 43 units that detect Bluetooth-enabled devices. The detections are used to calculate Origin-Destination matrices and travel times in real time.
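The travel time calculation can be sketched as follows: a device seen first at an upstream detector and later at a downstream detector yields one travel time sample. The record layout, detector names and timestamps below are hypothetical, and real deployments need additional filtering (for example of devices that stop along the way), which is omitted here.

```python
from statistics import median


def travel_times(detections, upstream, downstream):
    """Travel time samples (seconds) for devices seen at both detectors.

    detections: iterable of (device_id, detector_id, timestamp_seconds).
    Only the first detection of each device at each detector is used.
    """
    first_seen = {}
    for device, detector, timestamp in detections:
        key = (device, detector)
        if key not in first_seen or timestamp < first_seen[key]:
            first_seen[key] = timestamp

    samples = []
    for (device, detector), t_up in first_seen.items():
        if detector != upstream:
            continue
        t_down = first_seen.get((device, downstream))
        # Keep only devices that reached the downstream detector afterwards.
        if t_down is not None and t_down > t_up:
            samples.append(t_down - t_up)
    return samples


# Hypothetical detections; device identifiers are assumed to be anonymised.
detections = [
    ("a", "D1", 0), ("a", "D2", 120),
    ("b", "D1", 30), ("b", "D2", 180),
    ("c", "D1", 60),                    # never reaches D2
    ("e", "D2", 10), ("e", "D1", 200),  # travelling in the opposite direction
]

samples = travel_times(detections, "D1", "D2")
print("samples:", sorted(samples), "median:", median(samples))
```

Using the median rather than the mean is a design choice that limits the influence of outliers, such as a device that stopped between the two detectors.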

Social media data

Data from social media networks (Facebook and Twitter) is collected in Thessaloniki to infer the activity patterns of the population.


Big Data Europe

It is worth mentioning that Thessaloniki is the pilot location for the Big Data Europe demonstration related to the Societal Challenge of the transport sector, which will allow for an accurate, multi-source prediction of the traffic status in the city.

03. Conclusions

Transportation modelling is a powerful tool for supporting decision making and should be adapted to new data sources and processing capabilities, which will make it possible to answer “what to” type of questions and not only “what if” type of questions.

The era of Data in which we now live provides both the data and the processing capabilities, but the theoretical framework should be adapted and updated in order to take advantage of these two assets while avoiding data misuse.

Finally, the participation of an expert remains indispensable, as data can show “what” but cannot always explain “why”.


I would like to thank my colleagues, with whom I conducted most of the research briefly discussed in this note: Evangelos Mitsakis, Panagiotis Tzenos, Iraklis Stamos, Manolis Chaniotakis, John Toumpalidis, Georgia Aifandopoulou.

Josep Maria Salanova Grau

Dr. Salanova graduated from the Polytechnic School of the University of Catalonia (U.P.C.), Department of Civil Engineering, in 2007. In 2008-2009 he obtained an MSc in Design, Organization and Management of Transportation Systems from the Aristotle University of Thessaloniki. From 2010 to 2013 he conducted his PhD research at the Polytechnic School of the University of Catalonia (U.P.C.), with a dissertation entitled “Modelling of taxicab fleets in urban environment”. He is currently finalizing the Data Science Specialization at Johns Hopkins University. In 2007 he worked for CENIT (Centre for Innovation in Transport) in Barcelona. He has worked at the Hellenic Institute of Transport since 2008, where he leads the “Data collection and processing, algorithm design, and use of specialized transport software packages” laboratory.

  • Salanova J. M., Estrada M., Aifadopoulou G., Mitsakis E. (2011). A review of the modeling of taxi services. Procedia – Social and Behavioral Sciences (ISSN: 18770428), Vol 20, pp 150-161.
  • Gonzalez Feliu J., Morana J., Salanova J. M., Tai-yu Ma (2013). Design and scenario assessment for collaborative logistics and freight transport systems. International Journal of Transport Economics Vol. XL, No. 2, pp 207-240.
  • Mitsakis E., Salanova J. M., Chrysohoou E., Stamos I., Aifadopoulou G., (2014). Multi-criteria route choice in road networks. International Journal of Information and Decision Sciences (IJIDS), Vol. 7, No. 1, 2015 pp. 3-17.
  • Mitsakis E., Chrysohoou E., Salanova J. M., Iordanopoulos P., Aifadopoulou G. (2017). The sensor location problem: methodological approach and application. Transport 31(4) pp. 1-7.
  • Salanova J. M., Maciejewski M., Bischoff J., Estrada M., Tzenos P., Stamos I. (2017). Use of probe data generated by taxis. Big Data for regional science. Routledge Advances in Regional Economics, Science and Policy. Taylor & Francis Group.
  • Salanova J. M., Mitsakis E., Stamos I. (2014). Big urban probe data for the provision of advanced traveler information services and traffic management schemes. Big Data and Urban Informatics Chicago, Illinois, 11-12 August 2014.
  • Salanova Grau J. M., Chaniotakis E., Mitsakis E., Aifantopoulou G., Big data for transportation analysis and trip generation. Big data: a new opportunity for urban transport and mobility policies, 10-11 March 2016, Seville, Spain.
  • Salanova J. M., Chaniotakis E., Mitsakis E., Aifandopoulou G., Bischoff J., Mobile data for transportation. Mobile Data, Geography, LBS, 29/06 – 01/07 2016 Tartu, Estonia.
  • Thinking highways article (
