Mikhail Dubov

Data Scientist & Software Engineer

Work Experience

Revolut

Data Scientist • Revolut Business

London • Jul 2020 – present

  • Built a family of ML models for customer LTV; productionised them in Python & Airflow for offline predictions (marketing optimisation) and in Java for online predictions (KYC tickets)
  • Designed and prototyped an ML-based receipt OCR system for expense management
  • Built a production-grade library for an offline feature store (Python / SQLAlchemy)
  • Built a system for collection and statistical analysis of customer NPS (Python / Looker)
  • Built a range of ETL pipelines in Airflow to improve business data accessibility
  • Designed and analysed A/B tests for a range of marketing communications and app features

Smarkets

Quantitative Analyst • Trading

London • Oct 2018 – Jul 2020

Smarkets is one of Europe's leading sports trading platforms.

  • Worked on five trading strategies for market making and prop trading on Smarkets and other platforms. Owned one of these strategies from initial design and implementation through to maintenance and ops (3.7% ROI, annualised Sharpe ratio of 8.5)
  • Built a semi-supervised ML system for classifying the trading flow
  • Built a decision tree-based system for automating hedging decisions
  • Developed an internal Python library for quant data loading
  • Extended and optimised the P&L analytics system (Python / SQLAlchemy / BigQuery)

Data Scientist • Marketing

London • Dec 2016 – Sep 2018

  • Built a customer churn prediction model that improved retention by 25%
  • Implemented ETL pipelines and ML models for a customer LTV system
  • Built an internal time series analysis and anomaly detection system based on FB Prophet
  • Worked on A/B test design and analysis of marketing experiments

Mirantis

Software Engineer • OpenStack Performance

Moscow • Aug 2013 – Jun 2015

OpenStack is an open-source cloud computing platform used by CERN, Baidu, and others.

  • Served as a core developer of OpenStack Rally (benchmarking tool) and was its #2 contributor, with 100+ commits and 900+ code reviews
  • Implemented core components in Rally: benchmark launchers, data processing tools, API
  • Mentored the open-source community and created extensive Rally docs
  • Optimised the node listing algorithm in OpenStack Nova (resource manager), achieving a 5x performance improvement

Internships & part-time

Google

Software Engineering Intern • gTech

London • Jun 2015 – Sep 2015

  • Built a client app for an internal time series forecasting platform in Java
  • Added support for ML-based time series analysis (Google Prediction API)

Education

Higher School of Economics

M. Sc. in Data Science
B. Sc. in Software Engineering

Moscow • Sep 2010 – Oct 2016

Université Paris-Est Marne-la-Vallée

M. Sc. in Computer Science

Paris • Oct 2015 – Sep 2016

Skills

  • Languages: Python, Java, SQL; previous experience with C#, F#, R
  • Libraries: Pandas, Scikit-learn, FB Prophet, NumPy, SciPy, spaCy
  • Software engineering: Git, Docker, Kubernetes, Pytest, JUnit
  • Data engineering: Airflow, Luigi, Postgres, Redshift, BigQuery, Exasol
  • Analytics: Looker / LookML, Periscope, Metabase

Languages

  • English (fluent, IELTS 8.5/9.0)
  • German (fluent, DSD C1)
  • French (intermediate)
  • Russian (native)