Click above to see real-time migration metrics, data quality improvements, and enrollment analytics
Built for Talstack (a Nigerian EdTech startup) to migrate 5,000+ student records from messy Google Sheets into HubSpot.
The Problem
Talstack had student data spread across 12 different Google Sheets:
- Duplicate contacts (same student enrolled multiple times)
- Inconsistent phone formats (some with country codes, some without)
- Missing or invalid email addresses
- No clear enrollment status tracking
Result: The sales team was wasting hours manually cleaning data instead of enrolling students.
The Solution
Python script that:
- ✅ Removes duplicate students (keeps most recent enrollment)
- ✅ Standardizes Nigerian phone numbers to E.164 format (+234…)
- ✅ Validates email addresses
- ✅ Maps enrollment statuses to HubSpot deal stages
- ✅ Exports clean CSV ready for HubSpot import
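The phone and email steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `clean_and_migrate.py`: the function names `normalize_ng_phone` and `is_valid_email` are hypothetical, the phone rule assumes mobile numbers only (10 national digits starting with 7, 8, or 9), and the email pattern is a deliberately loose shape check rather than full RFC validation.

```python
import re
from typing import Optional

# Loose shape check (assumption): one "@", no whitespace, a dot in the domain.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_ng_phone(raw: Optional[str]) -> Optional[str]:
    """Return a Nigerian mobile number in E.164 form (+234XXXXXXXXXX), or None."""
    digits = re.sub(r"\D", "", raw or "")  # drop spaces, dashes, "+", brackets
    if digits.startswith("234"):
        national = digits[3:]  # country code already present
    elif digits.startswith("0"):
        national = digits[1:]  # local format with trunk prefix, e.g. 0803...
    else:
        national = digits
    # Handle "+234 0803 ..." where the trunk zero was kept after the country code.
    if len(national) == 11 and national.startswith("0"):
        national = national[1:]
    # Assumption: mobile numbers only, 10 digits starting with 7, 8, or 9.
    if re.fullmatch(r"[789]\d{9}", national):
        return "+234" + national
    return None


def is_valid_email(value: Optional[str]) -> bool:
    """Return True if the value looks like an email address."""
    return bool(EMAIL_RE.match((value or "").strip()))
```

Returning `None` for numbers that cannot be normalized makes it easy to flag those rows for manual review instead of importing a bad value into HubSpot.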
Results
- Before: 5,247 messy records across 12 sheets
- After: 4,891 clean, unique contacts in HubSpot
- Time saved: 40+ hours of manual data entry
- Impact: Sales team can now track the enrollment pipeline in real time
How to Use
# Install dependencies
pip install pandas
# Run migration
python clean_and_migrate.py
Tech Stack
- Python 3 + Pandas for data cleaning
- Regex for phone/email validation
- HubSpot CSV import format (no API integration needed, which kept the migration fast)
- Chart.js for interactive dashboard visualizations
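As a rough sketch of how the Pandas side of this stack might fit together, the snippet below deduplicates by email (keeping the most recent enrollment), maps statuses to deal stages, and produces CSV text. The column names (`email`, `status`, `enrolled_at`), the `STAGE_MAP` values, and the `clean_enrollments` helper are all assumptions made for illustration; the real sheet headers and HubSpot stage names may differ.

```python
import pandas as pd

# Hypothetical mapping: the real status labels and HubSpot deal stages
# are not listed in this README.
STAGE_MAP = {
    "applied": "New Lead",
    "enrolled": "Enrolled",
    "paid": "Closed Won",
}


def clean_enrollments(df: pd.DataFrame) -> pd.DataFrame:
    """Deduplicate by email, keep the latest enrollment, add a deal stage."""
    df = df.dropna(subset=["email"]).copy()
    # Normalize emails so "Ada@Example.com" and "ada@example.com " match.
    df["email"] = df["email"].str.strip().str.lower()
    df["enrolled_at"] = pd.to_datetime(df["enrolled_at"], errors="coerce")
    # Sort oldest first (unparseable dates at the top) and keep the last
    # row per email, which is the most recent enrollment.
    df = (
        df.sort_values("enrolled_at", na_position="first")
        .drop_duplicates(subset="email", keep="last")
        .reset_index(drop=True)
    )
    # Unknown statuses get a placeholder stage instead of being dropped.
    df["deal_stage"] = (
        df["status"].str.strip().str.lower().map(STAGE_MAP).fillna("Needs Review")
    )
    return df


raw = pd.DataFrame(
    {
        "email": ["Ada@Example.com", "ada@example.com ", "bola@example.com"],
        "status": ["applied", "enrolled", "Paid"],
        "enrolled_at": ["2024-01-05", "2024-03-10", "2024-02-01"],
    }
)
clean = clean_enrollments(raw)

# With no path, to_csv returns the CSV as a string; pass a filename
# such as "hubspot_import.csv" to write a file instead.
csv_text = clean.to_csv(index=False)
```

Sorting with `na_position="first"` matters here: without it, rows with missing dates would sort last and be treated as the most recent enrollment.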