Scaling is caring: building data processing pipelines for scalable deployment of machine learning models in healthcare
A key challenge we face at Pacmed is quickly calibrating and deploying our clinical decision support tools in different hospitals, where data formats may vary greatly. Using Intensive Care Units as a case study, I’ll delve into our scalable Python pipeline, which leverages the split-apply-combine approach in pandas to perform complex feature engineering and automatic quality checks on large time-varying datasets such as vital signs. I’ll show how we use the resulting flexible and interpretable dataframes to quickly (re)train our models to predict mortality, discharge, and medical complications.
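To give a flavour of the split-apply-combine idea mentioned above, here is a minimal sketch. It is not Pacmed's actual pipeline: the column names (`admission_id`, `signal`, `value`), the plausibility limits in `VALID_RANGE`, and the toy measurements are all assumptions made for illustration. It flags implausible vital-sign values as a simple quality check, then splits the remaining rows by admission and signal, applies summary statistics, and combines the results into one row of features per admission.

```python
import pandas as pd

# Hypothetical long-format vital signs: one row per measurement.
vitals = pd.DataFrame(
    {
        "admission_id": [1, 1, 1, 1, 2, 2, 2],
        "signal": ["heart_rate", "heart_rate", "heart_rate", "spo2",
                   "heart_rate", "heart_rate", "spo2"],
        "value": [80.0, 95.0, 400.0, 97.0, 60.0, 64.0, 92.0],
    }
)

# Hypothetical plausibility limits (low, high) per signal.
VALID_RANGE = {"heart_rate": (20.0, 300.0), "spo2": (50.0, 100.0)}

# Quality check: look up the limits for each row's signal and flag
# values outside them. Signals without limits map to NaN and are
# therefore flagged as invalid too.
low = vitals["signal"].map({name: lim[0] for name, lim in VALID_RANGE.items()})
high = vitals["signal"].map({name: lim[1] for name, lim in VALID_RANGE.items()})
checked = vitals.assign(is_valid=vitals["value"].between(low, high))

# Number of rejected measurements per admission, kept for reporting.
rejected = (~checked["is_valid"]).groupby(checked["admission_id"]).sum()

# Split by (admission, signal), apply summary statistics, and combine
# into a wide frame with one row per admission.
features = (
    checked[checked["is_valid"]]
    .groupby(["admission_id", "signal"])["value"]
    .agg(["mean", "min", "max", "count"])
    .unstack("signal")
)

# Flatten the (statistic, signal) column pairs into readable names
# such as "heart_rate_mean".
features.columns = [f"{signal}_{stat}" for stat, signal in features.columns]

print(features)
print(rejected)
```

Because every feature column is named after the signal and the statistic it came from, the resulting dataframe stays easy to inspect, and supporting a new hospital's data would mostly mean mapping it into the same long format before this step.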
Michele is a data scientist and machine learning engineer at Pacmed, where he builds decision support tools for medical applications. His current work mostly focuses on software for Intensive Care Units, tackling challenges such as predicting mortality, eligibility for discharge, readmission, and bed capacity. He also likes to work on improving scalability, interpretability, and speed of implementation of models in production. He's a passionate Pythonista, with a background in Biomedical Engineering and Robotics, as well as experience in applied deep learning research.