I build and write about production-grade AI systems.
I'm Vitor Sousa, a Senior Data Scientist at Wellhub on the GenAI & Engagement team, where I build contextual bandit systems for personalized nudges, reinforcement learning pipelines, LLM-powered engagement workflows, and ML infrastructure on Kubeflow and Kafka. Previously, at Farfetch, I built recommendation and size-prediction systems serving 4M+ customers across 190 countries: deep learning from scratch, learning-to-rank, and a paper published at ACM RecSys. This site goes beyond the day job. It's where I dig into research interests, build things from scratch to understand them deeply, and write about the ideas I'm most curious about.
11 articles · 3 projects
Selected writing
See also: Foundations →
GDPO: Multi-Reward RL Done Right
When GRPO meets multiple rewards, advantages collapse. GDPO fixes this by normalizing each reward independently before combining. Learn why this matters for tool calling, math reasoning, and any multi-objective LLM alignment.
GRPO: Eliminating the Value Network
Group Relative Policy Optimization replaces PPO's learned value function with a simple insight: sample multiple outputs and use their relative rewards as advantages. 33% memory savings, simpler implementation, and the algorithm powering DeepSeek-R1.
PPO for Language Models: The RLHF Workhorse
Deep dive into Proximal Policy Optimization—the algorithm behind most LLM alignment. Understand trust regions, the clipped objective, GAE, and why PPO's four-model architecture creates problems at scale.
Selected projects
RAG System with LlamaIndex, Elasticsearch & Llama3
A deep dive into building a local-first retrieval-augmented generation system for document Q&A.
Elasticsearch · LlamaIndex · Llama3 · RAG · Vector Search
LoRA and DoRA Implementation
I implemented LoRA and DoRA from scratch in PyTorch to understand the methods end to end.
llms · peft · pytorch
Large Language Models with MLX
I explored chat tooling on Apple Silicon using MLX to understand the runtime and packaging story.
llms · mistral · llama2