SafeRunRL: Safe Reinforcement Learning for Adaptive Treadmill Control
Simulation-based safe RL controller using wearable physiological signals, uncertainty-aware state estimation, and action shielding.
SafeRunRL is a research-oriented engineering prototype for adaptive treadmill control from wearable physiological signals. The goal is not to claim a deployed medical safety system, but to build a realistic safe-RL project that connects wearable sensing, uncertainty-aware state estimation, human-in-the-loop control, and constrained sequential decision making.
The system estimates a user’s exercise state from photoplethysmography (PPG), accelerometer/inertial measurement unit (ACC/IMU) data, heart-rate trend, signal quality, activity intensity, and user profile, then adjusts treadmill speed and incline under safety constraints. The policy is trained and evaluated in a personalized cardiovascular simulator before any hardware-level control is considered.
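To make the simulator idea concrete, here is a minimal sketch of a heart-rate response model, assuming a first-order lag toward a workload-dependent steady state. The names `UserProfile`, `steady_state_hr`, `step_hr`, and `simulate`, and every parameter value, are hypothetical illustrations rather than the project's actual model; fatigue, recovery, and sensor noise are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Hypothetical per-user parameters (illustrative values, not physiological fits)."""
    hr_rest: float = 60.0       # resting heart rate, bpm
    hr_max: float = 190.0       # ceiling on simulated heart rate, bpm
    gain_speed: float = 9.0     # steady-state bpm added per km/h
    gain_incline: float = 2.5   # steady-state bpm added per percent grade
    tau_s: float = 40.0         # response time constant, seconds


def steady_state_hr(p: UserProfile, speed_kmh: float, incline_pct: float) -> float:
    """Heart rate the user would settle at under a constant workload (assumed linear)."""
    demand = p.hr_rest + p.gain_speed * speed_kmh + p.gain_incline * incline_pct
    return min(demand, p.hr_max)


def step_hr(p: UserProfile, hr: float, speed_kmh: float,
            incline_pct: float, dt_s: float = 1.0) -> float:
    """One Euler step of dHR/dt = (HR_ss - HR) / tau."""
    target = steady_state_hr(p, speed_kmh, incline_pct)
    return hr + (dt_s / p.tau_s) * (target - hr)


def simulate(p: UserProfile, speed_kmh: float, incline_pct: float, seconds: int) -> list:
    """Run a constant-workload episode starting from rest; returns the HR trace."""
    hr = p.hr_rest
    trace = []
    for _ in range(seconds):
        hr = step_hr(p, hr, speed_kmh, incline_pct)
        trace.append(hr)
    return trace
```

Varying the fields of `UserProfile` across simulated users is one simple way to obtain the "personalized" behavior described above; a richer model would add fatigue drift and measurement noise on top of this core.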
Core algorithmic themes:
- Wearable state estimation. Convert noisy PPG/ACC streams into heart-rate trend, recovery proxy, signal-quality index (SQI), out-of-distribution (OOD) score, motion intensity, and user-state features.
- Human digital twin. Simulate individual cardiovascular response to speed, incline, fatigue, recovery, and sensor noise so that policies can be trained without unsafe real-world exploration.
- Constrained MDP formulation. Separate reward objectives (target heart-rate tracking, comfort) from safety costs (heart-rate overshoot, unsafe acceleration, load changes under low signal quality, and triggered stop conditions).
- Safe RL with action shield. Compare rule-based, PID/MPC, PPO, and Lagrangian safe PPO controllers while projecting all actions through a hard safety supervisor.
- Stress-test evaluation. Evaluate time-in-zone, overshoot, action smoothness, safety violation rate, OOD behavior, and worst-case performance across simulated user profiles.
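The action shield in the list above can be sketched as a small function that sits between any controller and the treadmill. This is a minimal illustration under stated assumptions: the policy proposes a change in speed and incline per control step, and `ShieldLimits`, `shield`, the state keys, and all threshold values are hypothetical names and numbers, not the project's actual supervisor.

```python
from dataclasses import dataclass


@dataclass
class ShieldLimits:
    """Hypothetical limits; real values would come from a safety review."""
    max_speed_kmh: float = 12.0
    max_incline_pct: float = 10.0
    max_speed_step: float = 0.5     # largest speed change per control step, km/h
    max_incline_step: float = 1.0   # largest incline change per control step, percent
    min_sqi: float = 0.6            # below this signal quality, never increase load
    hr_stop_bpm: float = 180.0      # at or above this heart rate, force a ramp-down


def _clip(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def shield(action, state, limits: ShieldLimits):
    """Project a proposed (d_speed, d_incline) onto the allowed set.

    `state` is a dict with keys "speed", "incline", "hr", and "sqi".
    Returns the (speed, incline) setpoint that is actually sent onward.
    """
    d_speed, d_incline = action

    # 1. Rate limits: bound how fast the belt may change (unsafe acceleration).
    d_speed = _clip(d_speed, -limits.max_speed_step, limits.max_speed_step)
    d_incline = _clip(d_incline, -limits.max_incline_step, limits.max_incline_step)

    # 2. Low signal quality: the heart-rate estimate is untrusted, so only
    #    holding or decreasing the load is allowed.
    if state["sqi"] < limits.min_sqi:
        d_speed = min(d_speed, 0.0)
        d_incline = min(d_incline, 0.0)

    # 3. Stop condition: override the policy and ramp down at the maximum rate.
    if state["hr"] >= limits.hr_stop_bpm:
        d_speed = -limits.max_speed_step
        d_incline = -limits.max_incline_step

    # 4. Absolute bounds on the resulting setpoint.
    new_speed = _clip(state["speed"] + d_speed, 0.0, limits.max_speed_kmh)
    new_incline = _clip(state["incline"] + d_incline, 0.0, limits.max_incline_pct)
    return new_speed, new_incline
```

Because the shield depends only on the proposed action and the current state, the same function can wrap the rule-based, PID/MPC, PPO, and Lagrangian PPO controllers, which keeps the comparison between them fair.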
This project is designed to demonstrate practical reinforcement learning judgment: the RL policy is useful only when it is paired with baselines, uncertainty estimates, safety constraints, and interpretable fallback logic.
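Three of the evaluation metrics named earlier (time-in-zone, overshoot, and action smoothness) are simple enough to sketch directly. The function names and the exact definitions below are assumptions chosen for illustration; the project may define these metrics differently.

```python
def time_in_zone(hr_trace, zone_lo, zone_hi):
    """Fraction of samples with heart rate inside [zone_lo, zone_hi]."""
    if not hr_trace:
        return 0.0
    inside = sum(1 for hr in hr_trace if zone_lo <= hr <= zone_hi)
    return inside / len(hr_trace)


def max_overshoot(hr_trace, zone_hi):
    """Largest excursion above the zone's upper bound, in bpm (0 if never exceeded)."""
    if not hr_trace:
        return 0.0
    return max(0.0, max(hr - zone_hi for hr in hr_trace))


def action_smoothness(setpoints):
    """Mean absolute change between consecutive setpoints (lower is smoother)."""
    if len(setpoints) < 2:
        return 0.0
    steps = [abs(b - a) for a, b in zip(setpoints, setpoints[1:])]
    return sum(steps) / len(steps)
```

Computing these per simulated user profile and reporting the worst profile alongside the mean is one way to surface the worst-case behavior that an average would hide.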