Abstract:
The ubiquitous availability of wearable sensors is driving the Internet of Things and is also making an impact on sport science and precision medicine. While human activity recognition from smartphone data or other types of inertial measurement units (IMUs) has evolved into one of the most prominent everyday examples of machine learning, the underlying time series feature engineering is still time-consuming. This inhibits the development of IMU-based machine learning applications in sport science and precision medicine. This contribution discusses the automation of time series feature engineering on the basis of the FRESH algorithm (FeatuRe Extraction based on Scalable Hypothesis tests), which is used to identify statistically significant features from synchronized IMU sensors (IMeasureU Ltd, NZ). By identifying time series characteristics at an early stage of the data science process, our approach closes feedback loops with domain experts and fosters the development of domain-specific features. The automated time series feature engineering process for human activity recognition is discussed on the basis of the Python package tsfresh, which implements the application programming interface of standard machine learning libraries such as scikit-learn and has been adopted by more than 2600 data scientists since its publication in October 2016.
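To make the idea behind FRESH concrete, the following is a minimal, standard-library-only sketch of its three steps: extract candidate features per time series window, test each feature for association with the activity label, and keep those that survive a multiple-testing correction. It is not the tsfresh implementation. The helper names (window_features, permutation_p_value, benjamini_yekutieli), the synthetic "rest" versus "walk" signals, the permutation test, and the 5% false discovery rate are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def window_features(window):
    """Map one window of sensor samples to a few simple named features (hypothetical set)."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "max": max(window),
        "abs_energy": sum(x * x for x in window),
    }


def permutation_p_value(values, labels, rng, n_perm=2000):
    """Permutation test on the absolute difference of class means (binary labels)."""
    def mean_diff(lbls):
        a = [v for v, l in zip(values, lbls) if l == 1]
        b = [v for v, l in zip(values, lbls) if l == 0]
        return abs(statistics.fmean(a) - statistics.fmean(b))

    observed = mean_diff(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_diff(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def benjamini_yekutieli(p_values, fdr_level=0.05):
    """Return the feature names kept by the Benjamini-Yekutieli step-up procedure."""
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    cutoff = 0
    for k, (_, p) in enumerate(ranked, start=1):
        if p <= k * fdr_level / (m * c_m):
            cutoff = k
    return [name for name, _ in ranked[:cutoff]]


# Synthetic single-axis signals (assumed): label 0 is noise only ("rest"),
# label 1 is a sinusoid plus the same noise ("walk").
rng = random.Random(0)
windows, labels = [], []
for label in (0, 1):
    for _ in range(20):
        amplitude = 1.0 if label == 1 else 0.0
        window = [
            amplitude * math.sin(2 * math.pi * t / 10) + rng.gauss(0, 0.1)
            for t in range(50)
        ]
        windows.append(window)
        labels.append(label)

# Step 1: feature extraction; step 2: one p-value per feature; step 3: selection.
feature_rows = [window_features(w) for w in windows]
p_values = {
    name: permutation_p_value([row[name] for row in feature_rows], labels, rng)
    for name in feature_rows[0]
}
selected = benjamini_yekutieli(p_values)

print(p_values)
print(sorted(selected))
```

In this toy setting the amplitude-related features separate the two classes, whereas the window mean carries little information because both signals are centred on zero. A production pipeline would extract far more features and choose a hypothesis test suited to each feature and target type.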