Senior Manager, Biomarker Data Scientist at Bristol Myers Squibb, Princeton, NJ, USA | Dec 2023 - present
Global Biometrics & Data Sciences | Cell Therapy and Oncology data science
Data Scientist at GyanData Pvt. Ltd., Chennai, India | Jul 2018 - Jun 2019
ML for Sports Analytics with ESPN-cricinfo: Used ESPN's historical ball-by-ball dataset covering cricket matches from the past 15 years to build a machine learning tool for predicting match scores, quantifying impactful match events, and generating 'smart statistics' for players
Built modules in Python for performing optimal balls-allocation between bowlers and batsmen, estimating wicket probability at a given state, and estimating match-win probabilities by factoring in both historical and current data
Combined the modules into an interactive match-simulation tool that quantifies impactful events. ESPN-cricinfo has used this tool worldwide since the Indian Premier League 2019 and ICC World Cup 2019
Anomaly Detection and Prediction: Built an L1 trend-extraction routine in Python with a built-in hyperparameter-estimation module for piecewise-linear trend extraction on general time-series signals; the core of the algorithm uses CVXOPT for optimization
Implemented a fuzzy variant of C-means clustering on the estimated linear trends to identify sub-optimal or anomalous operating regimes based on a pre-defined optimality criterion
Performed subspace angle comparisons between principal vectors to assess cluster separations and derive process insights
Integrated all three modules into a Python package and shipped it to the end user with Sphinx-generated documentation
Manager, Technology at CleanMax Solar, Mumbai, India | Jul 2017 - Jun 2018
Headed an IoT-based remote monitoring system used for managing 400+ rooftop solar plants with a combined capacity of 100+ MW; implemented outlier detection algorithms and performed root-cause analysis of failures to maximize generation
Developed predictive machine learning models for identifying sub-optimal inverter performance using operational plant data and demonstrated reduced downtime at pilot sites through statistical hypothesis testing
Internships
Talent Development Academy Intern, Eli Lilly and Company | May - Aug 2023
Synthetic Molecule Design and Development (SMDD)
Pharmaceutical information extraction combining domain knowledge and natural language processing
Structure-to-property prediction for peptides using interpretable machine learning models
Pharmacometrics Intern, Novartis | Jun - Aug 2020
Developed bootstrapping and autocovariate search modules for nlmixr, an open-source R package developed at Novartis for PK/PD modeling
Implemented stepwise covariate modeling (SCM) and LASSO-based covariate search algorithms for improving the predictive ability of models used for studying drug effects in human trials
Developed code that is now part of 3 published CRAN packages: nlmixr2est, nlmixr2plot, and nlmixr2extra
Summer Research Intern, ASEA Brown Boveri (ABB) | May - Aug 2015
Implemented a novel segment identification algorithm in MATLAB to identify 'good regions' in historical databases
Compared an iterative autoregressive exogenous (ARX) algorithm against the existing system identification algorithm at ABB; proposed changes to make the algorithm more robust to high-noise conditions
Proposed unification of segment identification and iterative ARX algorithms for use in ABB's model identification toolbox
Teaching
Spring 2023: Systemic Risk Management, School of Professional Studies, Columbia University
Fall 2022: AI in Chemical Engineering, Columbia University
Summer 2022: AI in Biochemical and Chemical Engineering, Technical University of Denmark (DTU)
Fall 2021: AI in Chemical Engineering, Columbia University
Fall 2019: Math Methods in Chemical Engineering, Columbia University
Spring 2017: Introduction to Statistical Hypothesis Testing, IIT Madras
Fall 2016: Applied Time Series Analysis, IIT Madras