I'm Samuel Huerga
Lead Data Scientist
- Phone +34 622 255 655
- E-mail samuelhuerga87@gmail.com
- Location Madrid (Spain)
Data Scientist, R enthusiast and Shiny developer.
Always up to date on new trends and resources, I enjoy discovering insights from data.
R teacher, covering data manipulation, modeling and visualization with the tidyverse ecosystem.
Work Experience
Leading a team of 2 people and providing core analytics to the Marketing, Content and some Product squads
A/B test design, analysis and presentation of insights on the impact of new functionalities on retention and funnel metrics
Support in providing data metrics for new products and features
Technical lead of 5 data scientists, in charge of analytical development in the Growth and Marketing department, covering:
Algorithms to calculate LifeTime Value of new customers.
User Scoring models used for segmentation based on usage of application
Media Mix Models to attribute organic subscriptions to Marketing drivers and uplift of paid channels.
Statistical support for A/B tests and price elasticity experiments
Defining OKRs, designing roadmaps, planning sprints and aligning with stakeholders
New analytical solutions based on:
NLP (Natural Language Processing) algorithms, providing sentiment analysis, topic discovery and topic classification.
Deep Learning algorithms to extract and classify arousal in speech audio recorded during interviews.
Development of Shiny web applications with more than a hundred users.
Administration of servers from scratch, setting up all needed components.
Mathematical models focused on prediction and optimization:
Econometric models, linear and logistic regression, loglinear models, SVM, neural networks, decision trees, random forests, ARIMA models, clustering, segmentation, association rules and heuristic optimization algorithms.
Shiny developer of big web tools with dozens of concurrent users.
Data analysis from multiple sources: Big Data platforms (Cloudera), geospatial, web pages (scraping), relational databases and flat files.
Automation of recurring analyses, ensuring reproducibility from data extraction to results presentation through automatic reports and Shiny web tools.
Exploratory data analysis of large numbers of features, visualizing and summarising the information to provide business-oriented insights.
Teaching assistant to new employees through DataCamp.
Experience in the tidyverse ecosystem (dplyr, ggplot2), R Markdown, Highcharts and Shiny.
Data warehouse loading and optimization of ETL processes with Unix, PL/SQL and Business Objects.
Data extraction for Reporting and optimization processes.
Development and implementation in a Big Data environment with Hive and Sqoop.
Research on pilot projects and technical support for data visualization tools such as QlikView and PowerPivot.
Technologies
Scripting
Proficient in R
Python
dbt
Unix
Data querying
Tidyverse (R) advocate
SQL
Databases
Redshift
Snowflake
PostgreSQL
Oracle
DB2
SQLite
Big Data
AWS
Spark
Keras
Reporting
Looker
Amplitude
R Markdown
Visualization
Highcharts
ggplot2
Version control
GitHub
GitLab