Automatic Travel Mode Prediction in a National Travel Survey


Goal: To show the feasibility of automatic travel mode prediction using smartphone location data in a national travel survey.

Data collection: In the fall of 2018, 1,902 respondents were randomly sampled from the Dutch population to participate in a smartphone-based travel study. A purpose-built app collected location data and generated a diary of stops and trips, and respondents could label each trip with the travel mode they used. Of the respondents, 517 completed data collection for at least 7 days. In total, 18,414 trips were collected, of which 5,641 were labelled.

Method: Every trip consists of a sequence of chronologically ordered GPS points. From these points, trip-level features such as average speed were engineered. Context-location data, such as the locations of public transport stops, was then added, and further features were calculated, such as how many train stations were passed during a trip. In addition, the data was enriched with respondent-level characteristics available through Dutch registries. In total, 127 features were engineered. A random forest algorithm was then used to predict travel modes from these features. The transport modes distinguished are walking, bike, e-bike, car, bus, metro, tram, scooter, train, and erroneously recorded trips. This last category is unique to this research but inherent to app-based studies.

Results: When trips are treated as independent events, the correct transport mode is predicted for 62% of trips. Taking into account how often respondents used a certain transport mode increases the accuracy to 70%. Collapsing similar transport modes, such as bikes and e-bikes, also improves the accuracy. However, not all transport modes can be classified equally accurately.
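
The trip-level feature engineering described under Method can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input layout (timestamp, latitude, longitude), the use of great-circle distance, and the 100-metre radius for counting a station as "passed" are all assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def trip_features(points, stations, radius_m=100.0):
    """Compute illustrative trip-level features.

    points   -- list of (timestamp, lat, lon) tuples for one trip
    stations -- list of (lat, lon) tuples, e.g. train stations
    radius_m -- assumed distance within which a station counts as passed
    """
    if len(points) < 2:
        return {"distance_m": 0.0, "duration_s": 0.0,
                "avg_speed_kmh": 0.0, "stations_passed": 0}

    # Ensure chronological order before summing segment lengths.
    points = sorted(points, key=lambda p: p[0])
    distance = sum(
        haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(points, points[1:])
    )
    duration = (points[-1][0] - points[0][0]).total_seconds()
    avg_speed_kmh = distance / duration * 3.6 if duration > 0 else 0.0

    # A station is "passed" if any GPS point lies within radius_m of it.
    stations_passed = sum(
        1
        for s_lat, s_lon in stations
        if any(haversine_m(lat, lon, s_lat, s_lon) <= radius_m for _, lat, lon in points)
    )
    return {"distance_m": distance, "duration_s": duration,
            "avg_speed_kmh": avg_speed_kmh, "stations_passed": stations_passed}


# Synthetic example: three points one minute apart along the equator.
start = datetime(2018, 11, 1, 8, 0, 0)
example_points = [
    (start, 0.0, 0.00),
    (start + timedelta(seconds=60), 0.0, 0.01),
    (start + timedelta(seconds=120), 0.0, 0.02),
]
example_stations = [(0.0, 0.01), (1.0, 1.0)]  # made-up station locations
example = trip_features(example_points, example_stations)
print(example)
```

Features of this kind, computed per trip, would form the columns of the table that a classifier such as a random forest is trained on, with the respondent-provided labels as the target.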

CBS Discussion Paper