Collecting effective data is a fundamental step in developing transport networks and in related research. Social media have emerged as a source of data for traffic analyses. In this study, we demonstrate that the function of a city influences the utility of social media data in travel demand models by generating models for eight US cities with different functions.
Data from Twitter and Foursquare, together with socio-demographic information, serve as independent variables in Origin-Destination trip regression models built with Random Forest regression. Model performance with and without social media data is compared via 10-fold cross-validation. The results indicate that model accuracy improved for all eight cities when independent variables based on social media data were included. The improvement was largest in metropolitan areas, followed by rural and tourist areas. Based on this finding, we conclude that city function influences the utility of social media data in travel demand models. We also build models by trip purpose and transport mode to explore other factors that may affect the effectiveness of applying social media data in transport research.
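The comparison described above (Random Forest regression with and without social media predictors, scored by 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the study's code: it assumes scikit-learn, uses synthetic data in place of the Twitter, Foursquare, and socio-demographic variables, and the names `socio`, `social`, `trips`, and `cv_r2`, the sample size, and the choice of R² as the score are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
n_pairs = 300  # hypothetical number of origin-destination pairs

# Synthetic stand-ins for socio-demographic predictors (assumption, not real data).
socio = rng.normal(size=(n_pairs, 3))
# Synthetic stand-ins for social media predictors, e.g. tweet or check-in counts.
social = rng.normal(size=(n_pairs, 2))

# Synthetic trip volumes that depend on both feature groups plus noise.
trips = (2.0 * socio[:, 0] + 1.0 * socio[:, 1]
         + 1.5 * social[:, 0]
         + rng.normal(scale=0.5, size=n_pairs))


def cv_r2(X, y, seed=0):
    """Mean R^2 of a Random Forest regressor over 10 shuffled folds."""
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    folds = KFold(n_splits=10, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=folds, scoring="r2").mean()


# Baseline: socio-demographic predictors only.
r2_baseline = cv_r2(socio, trips)
# Extended: socio-demographic plus social media predictors.
r2_with_social = cv_r2(np.hstack([socio, social]), trips)

print(f"mean R^2 without social media features: {r2_baseline:.3f}")
print(f"mean R^2 with social media features:    {r2_with_social:.3f}")
```

Because the synthetic trip volumes are constructed to depend on one of the social media columns, the extended model scores higher here by design; the sketch shows only the shape of the comparison and says nothing about the size of the effect in any real city.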