
SKILLS

Programming Languages

Python: Expert (deep expertise and mastery)
R: Expert (deep expertise and mastery)
Java: Intermediate (good working knowledge)
C++: Beginner (basic understanding)
Rust: Beginner (basic understanding)
Scala: Beginner (basic understanding)

Machine Learning & AI

PyTorch: Expert (deep expertise and mastery)
TensorFlow: Expert (deep expertise and mastery)
Scikit-Learn: Expert (deep expertise and mastery)
Transformers: Expert (deep expertise and mastery)
MLflow: Advanced (strong knowledge and experience)
Spark ML: Advanced (strong knowledge and experience)
DVC: Beginner (basic understanding)

Cloud Platforms

AWS: Beginner (basic understanding)
Azure: Advanced (strong knowledge and experience)
GCP: Advanced (strong knowledge and experience)

Languages

Spanish: Expert (deep expertise and mastery)
English: Advanced (strong knowledge and experience)
French: Beginner (basic understanding)

Tools & Frameworks

Docker: Expert (deep expertise and mastery)
Git: Expert (deep expertise and mastery)
SQL: Advanced (strong knowledge and experience)
NoSQL (ElasticSearch, MongoDB): Proficient (solid understanding and capability)

WORK

DATA SCIENTIST. Machine Learning, AI and LLM specialist

Simplekyc, February 2022 - Present

My primary role involves designing, training, and optimizing machine learning and deep learning algorithms, with a focus on applying Natural Language Processing (NLP), computer vision, and Large Language Models (LLMs) to real-world applications such as document understanding and information completion. I primarily use Python and libraries like PyTorch, Transformers, FlairNLP, and Scikit-learn to build these models. Every project is dockerized, we manage our models' lifecycle with MLflow, and we annotate our data with Label Studio. We use GCP and Azure resources for compute-intensive workloads, model training, and data storage.

  • Develop Machine Learning and Deep Learning models
  • Specialize in LLMs, RAG, agents, and advanced prompt engineering techniques
  • Expertise in computer vision, NLP, and machine learning models using PyTorch, Transformers, Scikit-Learn, FlairNLP, SpaCy, MLflow, DVC, and Docker
  • Cloud platforms: Azure and GCP
  • Lead Annotation Team using Label Studio

DATA SCIENTIST. Machine Learning, AI and LLM specialist

Redactame, March 2023 - October 2023

This client required an API capable of answering questions accurately about lengthy documents, designed to act like a teacher helping the user understand a specific subject. With limited resources for data handling and storage, I built the API using Flask, Docker, OpenAI, and Scikit-learn, and deployed it on Azure App Service. It extracts text from documents using OCR for image-based pages or a text size ratio method for text-based documents.

  • Developed a Python API using Flask, SQLAlchemy, Docker, LLMs, and RAG
  • Implemented text classification and a recommendation system
  • Cloud platforms: Azure

DATA SCIENTIST. Machine Learning and AI specialist

CGI, October 2021 - January 2022

At CGI, I worked on two projects. The first was related to computer vision in the energy sector, and the second to natural language processing and machine learning.

In the first project, I built several neural network architectures for image segmentation using TensorFlow and PyTorch. Our objective was to segment the cells of solar modules so they could be located for later analysis. I built three architectures: the classic U-Net, DeepLabv3, and ResUNet. After comparing their metrics and inference time, U-Net proved to be the best model for our dataset. We also used a TensorFlow Object Detection model to locate the solar modules, and then fed the detected modules to the U-Net model to locate the cells.

In the second project, I designed a machine learning training system for a classification problem and collaborated with the NLP team to develop a sentiment analysis model that synthesized information from various media sources. This model's output was combined with other data sources to predict potential debt defaults using Scikit-learn and XGBoost models.

  • Developed Machine Learning and Deep Learning models for object detection and segmentation with TensorFlow, PyTorch, Scikit-Learn, MLflow, and DVC, using Docker
  • Computer vision and NLP
  • Cloud platforms: AWS and Azure
  • Annotation tools: Label Studio

DATA SCIENTIST. Machine Learning and AI specialist

Vicomtech, Computer Vision Department, February 2021 - September 2021

My first role in a scientific setting involved applying my academic knowledge to real-world data challenges. I collaborated with a Greek partner to develop federated learning systems, which allow models to be trained on client data while addressing legal and data sensitivity concerns. We replicated the client's computational architecture, with one server acting as an orchestrator and three smaller servers acting as data hosts. Our projects focused on recognizing individuals in images, predicting ages, extracting text, and locating images, contributing to a Europol initiative to combat child abuse and illegal border crossings. My involvement in federated learning informed my second master's thesis, which provided an overview of the approach and tested it by training an image classification model across several machines with different weight merging algorithms.

  • Researched cutting-edge technologies in Federated Learning, Deep Learning, and Machine Learning
  • Specialized in computer vision and NLP using TensorFlow, PyTorch, Scikit-Learn, MLflow, Kubeflow, and DVC
  • Docker expertise
  • Managed Ubuntu servers for compute-intensive workloads
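
As an illustration of what a weight merging algorithm does in the orchestrator, here is a minimal sketch of weighted federated averaging (FedAvg), where each data host's parameters are averaged in proportion to its local dataset size. It is not the code used in the project or the thesis: the function name `federated_average`, the dictionary-of-lists weight format, and the three example hosts are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical weight format: parameter name -> flat list of values
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Merge client models by averaging each parameter, weighted by local dataset size."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    merged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        merged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return merged


# Three hypothetical data hosts reporting to one orchestrator
clients = [
    {"layer": [1.0, 2.0]},
    {"layer": [3.0, 4.0]},
    {"layer": [5.0, 6.0]},
]
sizes = [100, 100, 200]  # assumed number of local training samples per host
global_weights = federated_average(clients, sizes)
```

In a full training loop, the orchestrator would send `global_weights` back to the data hosts for the next round. Other merging rules change only how the per-host contributions are combined.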

DATA ENGINEER

Fujitsu, July 2019 - January 2021

In my first professional role, I learned fundamental skills in Python, Docker, and SQL, which have significantly benefited my current work. I was involved in a web maintenance project, creating data views for new features and resolving issues with existing ones. I used PySpark and SQLAlchemy to develop ETLs, updating client databases as needed.

  • ETLs with Python and Docker using PySpark, SQLAlchemy
  • Databases: Oracle, ElasticSearch, SQL, PL/SQL
  • Web Maintenance: Java, JavaScript, HTML, CSS

EDUCATION

Master's in Data Science and Big Data

University of Seville, October 2020 - December 2021

Completing this degree while working was a challenge. I had to work hard to balance the master's classes, my job, graded assignments, and my social life.

The professors taught a wide range of technologies, primarily based on Python and R. I learned Python modules such as Pandas, NumPy, BeautifulSoup, nltk, PySpark (SQL and ML), Scikit-learn, Matplotlib, and TensorFlow on both CPU and GPU, along with RAPIDS (cuML, cuDF, cuGraph). In R, we studied and used many modules, the most notable being Shiny, RMarkdown, cluster, ggplot, Hidden Markov Models (HMM), adabag, xgboost, randomForest, ROCR, rpart, caret, mlbench, and e1071 for support vector machines, the Naive Bayes classifier, and generalized k-nearest neighbors. Other modules included H2O, foreign, PCAmixdata, leaps, GA (genetic algorithms), lattice, latticeExtra, spdep (Spatial Dependence), GSTAT (Spatial and Spatio-Temporal Geostatistical Modelling, Prediction, and Simulation), stats, TSA, tseries, forecast, and lmtest. The master's program also introduced technologies like Pentaho, SQL (MySQL), and NoSQL (MongoDB).

The evaluation focused on practical work and on turning technically complex studies into outputs that non-technical audiences could understand. We included visual information and key annotations for the data, training, and results of the models used to solve various problems. This master's program marked my first encounter with Deep Learning, which has since become my primary area of interest.

Master's in Big Data and Business Intelligence

Inesem Business School, January 2019 - January 2020

This online master's program allowed me to juggle my last year of my bachelor's degree in Mathematics, my first job, and the exams of the different subjects.

It gave me an initial understanding of statistics applied to real-world data, an introduction to Big Data technologies and tools, an overview of Business Intelligence and its role in key business decision-making, web analytics, and various methods to extract crucial information.

The master's program helped me understand essential information for business decision-making, and I had my first contact with tools like Python and its main data management modules, NoSQL with MongoDB, SQL with MySQL, and web analytics with Google Analytics, as well as different types of dashboards and technologies to build them.

Bachelor's degree in Mathematics

University of Seville, October 2014 - July 2019

In university, I had my first contact with abstract science. Although I always enjoyed math, it was here that I discovered my passion for Mathematics—a field grounded in foundational axioms and definitions that allow for proving conjectures. I had the privilege of learning from an exceptional team of professors. I studied and understood topology, algebra, calculus, statistics, and their advanced concepts, including multivariate calculus, geometry, numerical series, linear and non-linear models, Markov models, Fourier analysis, functional analysis, abstract algebra, logic, differential equations, numerical analysis, and even basic physics.

The challenges of studying these subjects taught me the right way to learn, the importance of being patient and consistent in my work, and the need to take time to fully address problems, visualize all aspects, and plan a strategic approach to solve them. This degree ignited my passion for science and strengthened my curiosity and research spirit.