ChurnGuard ML
A predictive analytics engine that identifies at-risk clients with high precision, enabling proactive retention strategies that resulted in a 65% reduction in client churn.
Overview
Developed an end-to-end machine learning pipeline to address rising attrition rates. The system processes historical usage patterns, support ticket frequency, and contract telemetry to generate a daily 'Risk Score' for every client, allowing the success team to intervene before a cancellation occurs.
Problem
The company was reacting to churn only after a cancellation notice was filed. There was no quantitative method to identify 'silent' churn—users who were still paying but had stopped using the product—leading to a consistent month-over-month loss in MRR.
Constraints
- Model must achieve high recall to ensure no at-risk clients are missed
- Requires explainable AI (XAI) so sales reps understand 'why' a client is at risk
- Must integrate with existing CRM (Salesforce) via automated API triggers
- Predictions must be updated every 24 hours based on the latest telemetry
Approach
Engineered a Random Forest classification model using Scikit-learn, trained on two years of anonymized user behavior data. I implemented custom feature engineering to capture 'velocity' metrics (e.g., the rate of decline in login frequency) rather than just static totals. To solve the 'black box' problem, I used SHAP values to provide reps with the top three reasons for every high-risk score.
Key Decisions
SHAP (SHapley Additive exPlanations) Integration
A probability score alone isn't actionable for a CSM. By calculating SHAP values, the CRM dashboard displays specific triggers like 'Decreased API usage' or 'Unresolved high-priority tickets,' giving the team a specific script for their outreach.
Alternatives considered and rejected:
- Standard logistic regression (better interpretability, but lower predictive power)
- Deep learning / neural networks (high accuracy, but effectively impossible to explain to non-technical staff)
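Given a row of SHAP values for one client (assumed precomputed, e.g. by a tree explainer over the trained model), surfacing the "top three reasons" reduces to ranking contributions. The feature names and values below are hypothetical:

```python
import numpy as np

# Hypothetical feature names; in production these come from the training pipeline.
feature_names = [
    "login_velocity_30d", "api_usage_trend", "open_p1_tickets",
    "seats_active_pct", "days_to_renewal",
]

# One client's SHAP values (assumed precomputed). Positive values push the
# risk score up; negative values push it down.
shap_row = np.array([0.18, 0.11, 0.07, -0.05, 0.02])

# Rank by contribution toward churn risk and keep the top three drivers.
top_idx = np.argsort(shap_row)[::-1][:3]
reasons = [(feature_names[i], round(float(shap_row[i]), 2)) for i in top_idx]
print(reasons)
# → [('login_velocity_30d', 0.18), ('api_usage_trend', 0.11), ('open_p1_tickets', 0.07)]
```

These (feature, contribution) pairs are what get mapped to human-readable triggers like 'Decreased API usage' in the CRM dashboard.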
SMOTE for Class Imbalance
Since churned clients represented only 5% of the total dataset, the model was originally biased toward 'non-churn.' I used Synthetic Minority Over-sampling Technique (SMOTE) to balance the training set, significantly improving the model's sensitivity to at-risk behavior.
Alternatives considered and rejected:
- Random undersampling (discarded too much valuable majority-class data)
- Adjusting class weights (less effective than synthetic generation for this dataset's density)
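To illustrate what SMOTE actually does, here is a simplified NumPy sketch of its core idea: synthesize new minority-class points by interpolating between a real minority sample and one of its nearest minority-class neighbors. This is a pedagogical sketch, not the library implementation used in the project:

```python
import numpy as np

def smote_sketch(X_min, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority-class neighbors (Euclidean distance)."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(len(X_min))
        x = X_min[i]
        # k nearest neighbors within the minority class (index 0 is x itself).
        dists = np.linalg.norm(X_min - x, axis=1)
        neighbors = np.argsort(dists)[1 : k + 1]
        x_nn = X_min[rng.choice(neighbors)]
        gap = rng.random()  # random point on the segment between x and x_nn
        synthetic.append(x + gap * (x_nn - x))
    return np.array(synthetic)

# Minority ('churned') class was ~5% of the data; oversample it for training only.
rng = np.random.default_rng(1)
X_churn = rng.normal(loc=2.0, size=(50, 4))
X_new = smote_sketch(X_churn, n_synthetic=200)
print(X_new.shape)  # → (200, 4)
```

Note that oversampling must happen inside the training split only; applying it before the train/test split leaks synthetic copies of test-set neighbors into training.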
Tech Stack
- Python (Scikit-learn / Pandas)
- SQL (BigQuery)
- SHAP
- Airflow
- FastAPI
- Docker
Result & Impact
- Churn Reduction: 65% decrease in annual attrition
- Model Accuracy: 92% precision / 88% recall
- Revenue Saved: estimated $1.2M in retained ARR
The project shifted the entire company culture from reactive to proactive. The 'Risk Score' became a primary KPI for the Customer Success team, and the automated alerts allowed them to save accounts that would have otherwise been lost to competitors.
Learnings
- Feature engineering (how data is prepared) is more impactful than model tuning for behavioral prediction.
- Explainability is the key to stakeholder adoption; if the sales team doesn't trust the 'why,' they won't use the tool.
- Data leakage is a significant risk in churn modeling; you must be extremely careful not to include features from 'the future' relative to the prediction point.
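The data-leakage point above is concrete in code: every feature must be built only from events strictly before the prediction date. A minimal pandas sketch with hypothetical data:

```python
import pandas as pd

events = pd.DataFrame({
    "client_id": [1, 1, 1, 2, 2],
    "event_date": pd.to_datetime(
        ["2024-01-05", "2024-02-20", "2024-03-10", "2024-01-15", "2024-03-01"]
    ),
    "logins": [40, 22, 3, 55, 60],
})

# Predicting as of March 1st: only events strictly before the cutoff may
# become features, otherwise the model 'sees the future'.
prediction_date = pd.Timestamp("2024-03-01")
history = events[events["event_date"] < prediction_date]

features = history.groupby("client_id")["logins"].sum()
print(features.to_dict())  # → {1: 62, 2: 55}
```

Client 1's March events (which may already reflect the churn being predicted) are excluded from the feature window.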
Additional Context
The most critical part of the project was the Feature Velocity logic. I realized that a client who has 100 logins a month might look healthy, but if they had 500 logins the previous month, they are actually a high churn risk. By calculating the percentage change in activity over 7, 30, and 90-day windows, the model was able to catch the “downward trend” long before the user actually churned.
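The velocity logic described above can be sketched with pandas rolling windows. The data and window arithmetic here are illustrative, assuming one row of login counts per client per day:

```python
import pandas as pd

# Daily login counts for one client: healthy, then declining (hypothetical data).
logins = pd.Series(
    [20] * 30 + [12] * 30 + [4] * 30,
    index=pd.date_range("2024-01-01", periods=90, freq="D"),
)

def window_velocity(series, days):
    """Percentage change of total activity in the trailing window
    versus the window immediately before it."""
    total = series.rolling(f"{days}D").sum()
    return total / total.shift(days) - 1

velocity_7d = window_velocity(logins, 7)
velocity_30d = window_velocity(logins, 30)

# Raw totals still look 'alive', but the 30-day velocity is sharply
# negative — the downward trend the model keys on.
print(round(float(velocity_30d.iloc[-1]), 2))  # → -0.67
```

The same computation over 7-, 30-, and 90-day windows yields the three velocity features per activity metric.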
The deployment was handled via a FastAPI wrapper inside a Docker container, orchestrated by Airflow. Every morning at 4:00 AM, the pipeline pulls the latest data from BigQuery, runs the inference, and pushes the updated scores and SHAP explanations directly into the CRM.
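The daily run reduces to three ordered steps: pull, score, push. The sketch below shows that control flow with hypothetical stand-in functions; the real integrations (BigQuery client, FastAPI inference endpoint, Salesforce API) and the 4:00 AM scheduling live in the Airflow DAG:

```python
from datetime import datetime

def pull_latest_telemetry():
    # Stand-in for the BigQuery pull.
    return [{"client_id": 1, "login_velocity_30d": -0.6}]

def score_clients(rows):
    # Stand-in for the FastAPI inference call; values are illustrative.
    return [{**r, "risk_score": 0.87, "top_reasons": ["login_velocity_30d"]} for r in rows]

def push_to_crm(scored):
    # Stand-in for the Salesforce API triggers; returns rows pushed.
    return len(scored)

def run_daily_pipeline(run_date):
    rows = pull_latest_telemetry()
    scored = score_clients(rows)
    pushed = push_to_crm(scored)
    print(f"{run_date:%Y-%m-%d}: pushed {pushed} risk score(s) to CRM")
    return pushed

run_daily_pipeline(datetime(2024, 3, 30))
```

Keeping each step a pure function makes the pipeline easy to wrap as individual Airflow tasks and to retry independently on failure.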