I build and share production-grade AI systems — from research to rollout.

Documenting the projects, experiments, and lessons that help turn research ideas into reliable applications.

Portrait of Vitor Sousa

About

Hello, I'm Vitor Sousa

I'm a Data Scientist & AI Engineer building production-grade AI systems at Wellhub. My work spans the full machine learning lifecycle — from research and experimentation to deploying robust applications at scale. I specialise in large language models and intelligent agents, bringing cutting-edge ML research (LLMs, fine-tuning, RAG, reinforcement learning) into real-world use.

I created this site to share what I'm working on. Here you'll find the projects I've built and the articles (or "essays") I've written about machine learning, data science, and software engineering. The aim is to document experiments, insights, and lessons learned — bridging research and practice — rather than to craft a glossy self-promotional page.

Current Work

Right now I'm focused on a few active projects that connect research ideas with production requirements.

  • LLM agents

    Building production-ready agents with tools, memory, and planning using frameworks like LangGraph and CrewAI.

  • Fine-tuning & alignment

    Designing efficient LoRA/QLoRA pipelines plus alignment techniques such as DPO and RLHF to shape model behaviour.

  • RAG systems

    Optimising retrieval-augmented generation with smart chunking, hybrid search, and reranking for sharper answers.

  • Personalisation

    Developing contextual bandit algorithms for adaptive content personalisation and recommendations.

  • Evaluation frameworks

    Combining automated metrics, human review, and A/B tests to monitor model performance in production.

Research Interests

I'm continuously exploring new ideas in AI. A few areas currently on my mind:

  • Prompt optimisation

    Automating prompt engineering (for example with DSPy) to systematically improve how we instruct LLMs.

  • RLHF & alignment

    Using reinforcement learning from human feedback to align models with human preferences and safety guardrails.

  • Multi-agent systems

    Exploring how multiple AI agents coordinate, collaborate, and reason together on complex tasks.

  • Hallucination mitigation

    Reducing model hallucinations with retrieval, fact-checking, and verification loops.

  • Efficient serving

    Designing low-latency, cost-effective serving architectures for deploying AI at scale.

I stay close to research from OpenAI, DeepMind, Anthropic, and voices I admire like Eugene Yan, Sebastian Raschka, Andrej Karpathy, and Chip Huyen.

Reading & Learning

I keep a steady rotation of books, papers, and hobbies to broaden my perspective.

  • Hands-On Large Language Models

    Working through practical patterns for shipping LLM applications.

  • Reinforcement Learning (Sutton & Barto)

    Revisiting the fundamentals of reinforcement learning theory.

  • Agent research papers

    Diving into recent publications on agent architectures and advanced ML systems.

  • "Four Thousand Weeks"

    Taking a non-technical pause to think about time, focus, and sustainable pace.

  • Strategic hobbies

    Learning chess and playing football — both sharpen strategic thinking and keep the work balanced.

Tech Stack

My day-to-day toolkit spans languages, frameworks, and infrastructure for training, evaluating, and shipping ML systems.

Writing

Beyond the Vibe Check: A Systematic Approach to LLM Evaluation

Drafting a practical playbook for building trustworthy LLM evaluation pipelines that go beyond surface-level vibes.

Projects