About Me

Data Scientist @ Wellhub
Focus: LLMs, Agents, Fine-tuning, RAG, Model Evaluation, Deployment
I am a Data Scientist at Wellhub, specializing in the research, development, and deployment of solutions based on Large Language Models (LLMs). My current focus areas include building autonomous AI agents, implementing fine-tuning strategies to adapt foundation models, designing retrieval-augmented generation (RAG) pipelines, and developing robust model evaluation frameworks.
My passion lies in bridging the gap between cutting-edge research and practical, production-ready AI systems. This includes deep work on evaluation methodologies for LLMs, mitigating hallucinations, ensuring model alignment and robustness, and optimizing serving and deployment pipelines for real-world scalability and reliability.
Through my work, I aim to contribute to the responsible advancement of Generative AI, combining technical excellence with a strong emphasis on safety, explainability, and user trust.
What I'm Up To
Currently Working On
I am currently focused on building scalable LLM-based agents, refining fine-tuning pipelines for domain-specific optimization, improving retrieval-augmented generation (RAG) systems, and designing rigorous evaluation strategies for LLM outputs. I draw inspiration from thought leaders like Sebastian Raschka, Eugene Yan, and Andrej Karpathy, as well as hands-on experimentation with open-source tools and models.
Reading List
I'm currently studying "Hands-On Large Language Models", exploring techniques for building, fine-tuning, and evaluating LLMs for production environments. On the personal side, I'm reading "Four Thousand Weeks" by Oliver Burkeman, reflecting on productivity, priorities, and living intentionally in a fast-moving world.
Personal
Outside of work, I maintain a strong passion for football and have recently started learning chess, a pursuit that exercises strategic thinking, patience, and long-term planning, skills I find directly applicable to my AI research.
Main Tools
- Python
- NumPy
- Pandas
- PyTorch
- PyTorch Lightning
- Jupyter
- Git
- Docker
- Kubernetes
- vLLM
- Triton Inference Server
- Ray Serve
- LangChain
- Pydantic AI
- CrewAI
- Semantic Kernel
- AutoGen
- Letta
- DeepEval
- RAGAS
- HELM (Holistic Evaluation of Language Models)
- Evals (OpenAI)
- LlamaIndex
- Milvus
- Weaviate
- Elasticsearch
- Vespa
- Airflow
- Prefect
- Prometheus
- Grafana
- Weights & Biases
- OpenTelemetry
- Streamlit
- Gradio
- Quarto
- AWS
- Google Cloud
- Azure
- Databricks
- Apache Spark
- Looker
- MLflow
- Model Context Protocol (MCP)
Posts & Projects

Exploring OpenELM: The Intersection of Open Source and High Efficiency in AI
My analysis of "OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework", showcasing how Apple is pushing the boundaries of AI efficiency and accessibility.

Exploring the Differential Transformer: A Step Forward in Language Modeling
My exploration of the Differential Transformer, in which Microsoft Research advances language modeling with a novel differential attention mechanism that reduces attention noise, improving accuracy and efficiency on long-context tasks and paving the way for more robust AI research and applications.
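
To make the idea concrete, here is a minimal, simplified sketch of differential attention in PyTorch: a single head, no causal mask or normalization, and a plain learnable scalar in place of the paper's reparameterized lambda. The class name and dimensions are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentialAttention(nn.Module):
    """Simplified single-head sketch: two attention maps are computed from
    separate query/key projections and subtracted, so attention mass that
    appears in both maps (the "noise") tends to cancel out."""
    def __init__(self, d_model, d_head, lambda_init=0.8):
        super().__init__()
        # Two sets of query/key projections, one shared value projection.
        self.q_proj = nn.Linear(d_model, 2 * d_head, bias=False)
        self.k_proj = nn.Linear(d_model, 2 * d_head, bias=False)
        self.v_proj = nn.Linear(d_model, d_head, bias=False)
        # The paper reparameterizes lambda; a plain learnable scalar is used here.
        self.lmbda = nn.Parameter(torch.tensor(lambda_init))
        self.d_head = d_head

    def forward(self, x):
        q1, q2 = self.q_proj(x).chunk(2, dim=-1)
        k1, k2 = self.k_proj(x).chunk(2, dim=-1)
        v = self.v_proj(x)
        scale = self.d_head ** -0.5
        a1 = F.softmax(q1 @ k1.transpose(-2, -1) * scale, dim=-1)
        a2 = F.softmax(q2 @ k2.transpose(-2, -1) * scale, dim=-1)
        # Differential attention: subtract the second map, scaled by lambda.
        return (a1 - self.lmbda * a2) @ v

x = torch.randn(2, 16, 512)                     # (batch, seq_len, d_model)
attn = DifferentialAttention(d_model=512, d_head=64)
print(attn(x).shape)                            # torch.Size([2, 16, 64])
```

Because both attention maps are computed from the same tokens, spurious attention that shows up in both tends to cancel in the subtraction, which is the intuition behind the reduced attention noise.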

RAG with LlamaIndex, Elasticsearch and Llama3
Implements question answering with Retrieval-Augmented Generation (RAG), using Elasticsearch as the vector database.
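
As a rough illustration of the pipeline's shape (not the project's actual code), the sketch below wires LlamaIndex to Elasticsearch and a local Llama 3 served via Ollama. It assumes llama-index 0.10+ style imports with the Elasticsearch, Ollama, and HuggingFace-embedding integrations installed; the data directory, index name, and model ids are placeholders.

```python
# pip install llama-index llama-index-vector-stores-elasticsearch \
#             llama-index-llms-ollama llama-index-embeddings-huggingface
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, Settings
from llama_index.vector_stores.elasticsearch import ElasticsearchStore
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Generation via a locally served Llama 3; embeddings from a small local model.
Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Elasticsearch acts as the vector store backing the index.
vector_store = ElasticsearchStore(index_name="rag_demo", es_url="http://localhost:9200")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Ingest: load local documents (placeholder path), embed, and index them.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Query: retrieve relevant chunks and let the LLM answer grounded in them.
query_engine = index.as_query_engine()
print(query_engine.query("What does the documentation say about deployment?"))
```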

LoRA and DoRA Implementation from Scratch 🚀
An implementation of LoRA and DoRA from scratch using PyTorch.
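
The core of a from-scratch LoRA layer is small; the sketch below is a minimal PyTorch version with illustrative names, not the repository's code. It covers only the LoRA part; DoRA additionally decomposes the frozen weight into magnitude and direction before applying the low-rank update.

```python
import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    """Low-rank update x @ A @ B, scaled by alpha / rank."""
    def __init__(self, in_dim, out_dim, rank=8, alpha=16):
        super().__init__()
        self.A = nn.Parameter(torch.randn(in_dim, rank) * (1 / rank ** 0.5))
        self.B = nn.Parameter(torch.zeros(rank, out_dim))  # zero-init: no change at start
        self.scaling = alpha / rank

    def forward(self, x):
        return (x @ self.A @ self.B) * self.scaling

class LinearWithLoRA(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank branch."""
    def __init__(self, linear: nn.Linear, rank=8, alpha=16):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(linear.in_features, linear.out_features, rank, alpha)

    def forward(self, x):
        return self.linear(x) + self.lora(x)

# Usage: freeze the base weights and train only the LoRA parameters.
base = nn.Linear(768, 768)
for p in base.parameters():
    p.requires_grad = False
layer = LinearWithLoRA(base, rank=8, alpha=16)
out = layer(torch.randn(2, 10, 768))
```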

Large Language Models with MLX 🚀
A Python-based project that runs Large Language Model (LLM) applications on Apple Silicon in real time, thanks to Apple MLX.
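
For context, running a model with mlx-lm typically takes only a few lines; the snippet below is a minimal sketch, and the quantized model id is just an example from the mlx-community hub, not necessarily the one used in the project.

```python
# pip install mlx-lm  (Apple Silicon only)
from mlx_lm import load, generate

# Load a 4-bit quantized model from the mlx-community hub (example id).
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

# Generate text locally on the Apple Silicon GPU.
prompt = "Explain retrieval-augmented generation in two sentences."
response = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(response)
```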