Time Series Feature Extraction: Human Activity Recognition
Extracted 42 statistical features from 6-channel sensor data for human activity recognition. Used bootstrap resampling to estimate confidence intervals for feature standard deviations on the UCI AReM dataset, which covers 7 activity types.
Problem
Wearable sensors and IoT devices generate continuous multivariate time-series data for activity recognition, but raw sensor streams are noisy and high-dimensional. Distinguishing human activities from these streams requires systematic feature extraction that reduces each recording to a small set of meaningful statistics, with applications in healthcare monitoring, fitness tracking, and elderly care systems.
Approach
Analyzed the UCI AReM (Activity Recognition system based on Multisensor data fusion) dataset, which contains 7 human activities (bending1, bending2, cycling, lying, sitting, standing, and walking), each recorded on 6 sensor channels (avg_rss12, var_rss12, avg_rss13, var_rss13, avg_rss23, var_rss23). Implemented time-domain feature extraction that computes 7 statistics per channel: minimum, maximum, mean, median, standard deviation, first quartile (Q1), and third quartile (Q3), for a total of 42 features (7 statistics × 6 channels). Applied bootstrap resampling to estimate 90% confidence intervals for the standard deviation of each feature, which supported statistical validation and feature importance ranking. Split the data into training and test sets, using 2 test files for the bending activities and 3 test files for the other activities. Conducted feature selection analysis to identify the 3 most discriminative features for activity classification.
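The extraction and bootstrap steps described above can be sketched as follows. This is a minimal illustration, not the project's actual notebook code: the function names (extract_features, bootstrap_std_ci), the percentile bootstrap, the 1000 resamples, and the synthetic random data standing in for the AReM files are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Channel names as listed in the write-up.
CHANNELS = ["avg_rss12", "var_rss12", "avg_rss13",
            "var_rss13", "avg_rss23", "var_rss23"]


def extract_features(series: pd.DataFrame) -> pd.Series:
    """Reduce one recording to 7 statistics per channel (42 values)."""
    data = series[CHANNELS]
    stats = {
        "min": data.min(),
        "max": data.max(),
        "mean": data.mean(),
        "median": data.median(),
        "std": data.std(),            # sample standard deviation (ddof=1)
        "q1": data.quantile(0.25),
        "q3": data.quantile(0.75),
    }
    return pd.Series({
        f"{stat}_{ch}": values[ch]
        for ch in CHANNELS
        for stat, values in stats.items()
    })


def bootstrap_std_ci(values, n_resamples=1000, level=0.90, seed=0):
    """Percentile-bootstrap confidence interval for the standard deviation.

    Hypothetical helper: resamples `values` with replacement, computes the
    sample standard deviation of each resample, and returns the central
    `level` interval of those estimates.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = len(values)
    idx = rng.integers(0, n, size=(n_resamples, n))  # resample indices
    stds = values[idx].std(axis=1, ddof=1)
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(stds, [alpha, 1.0 - alpha])
    return lo, hi


# Synthetic stand-in data (NOT the AReM recordings): 20 recordings,
# each with an arbitrary 480 time steps across the 6 channels.
rng = np.random.default_rng(42)
instances = [
    pd.DataFrame(rng.normal(size=(480, 6)), columns=CHANNELS)
    for _ in range(20)
]

# One row per recording, one column per feature.
features = pd.DataFrame([extract_features(df) for df in instances])
print(features.shape)

# 90% bootstrap interval for the standard deviation of one feature.
lo, hi = bootstrap_std_ci(features["mean_avg_rss12"])
print(f"90% CI for std of mean_avg_rss12: [{lo:.4f}, {hi:.4f}]")
```

With real data, each entry of instances would be one loaded recording file. Repeating bootstrap_std_ci over all 42 columns gives one interval per feature, which can then be compared when ranking features.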
Impact
Extracted 42 statistical time-domain features from multivariate sensor data and validated them with bootstrap confidence intervals. Identified the features that best distinguish human activities, providing a foundation for classification models. Demonstrated a systematic approach to time-series feature engineering for wearable sensor applications that can be extended to frequency-domain analysis and ML classifiers such as Random Forest and SVM.
Key Metrics
Technologies
Links
My Role
Sole developer. Preprocessed multivariate sensor data from 7 activity categories; implemented the time-domain feature extraction pipeline that computes 42 statistical features; applied bootstrap resampling for confidence interval estimation; conducted feature importance analysis using standard deviation distributions; performed the train/test data split; identified the top 3 discriminative features; and documented the methodology and statistical validation in a Jupyter Notebook.
Team Size: 1 person