Analyzing and grouping typical patient trajectories is crucial to understanding patients' health states, estimating prognosis, and determining optimal treatment. The increasing availability of electronic health records (EHRs) creates an opportunity to support clinicians' decisions with machine learning solutions. We propose the Multi-scale Health-state Variational Auto-Encoder (MHealthVAE) to learn medically informative patient representations and enable meaningful subgroup detection from sparse EHRs. We derive a novel training objective that better captures health information and temporal trends in patient embeddings, and we introduce new performance metrics to evaluate the clinical relevance of patient clustering results.