Bits, Bytes & Life

Exploring LLMs, edge AI, building projects, reading papers… and the joys of life outside the lab. Some stories come straight from my wife’s pen, polished by me.

Disclaimer: This is just a collection of notes – some parts were drafted with a bit of AI help and then cleaned up. Much of it is mainly for my own reference, so things might be rough or incomplete in places.

RL Notes: Hugging Face RL Course
HFRL Unit 1 Summary: Reinforcement learning is a method where an agent learns by interacting with its environment, using trial and error and feedback from rewards.
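
A minimal sketch of the agent–environment loop that summary describes, assuming the gymnasium library and its CartPole-v1 task; the “agent” here just samples random actions, so it illustrates the interaction and reward feedback, not the learning itself.

```python
# Agent-environment loop: the agent acts, the environment returns an
# observation and a reward, and the episode resets when it ends.
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()       # random policy stands in for the agent
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward                   # reward is the feedback signal
    if terminated or truncated:              # episode over: start a new one
        observation, info = env.reset()

env.close()
print(f"Reward collected over the rollout: {total_reward}")
```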
Attention-Free Transformer: Escaping the Quadratic Bottleneck
How AFT rethinks attention to achieve linear complexity while preserving performance. The Transformer architecture revolutionized AI, but its self-attention mechanism carries a fundamental limitation: $\mathcal{O}(n^2)$ complexity in the sequence length, which makes processing long sequences computationally prohibitive. The Attention-Free Transformer (AFT) emerges as an elegant solution that maintains strong performance while achieving linear complexity.
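
To make the complexity contrast concrete, here is a toy numpy sketch (my own simplification, not the paper’s code) comparing standard dot-product attention, which materializes an $n \times n$ score matrix, with the AFT-simple variant that drops the learned position biases and pools keys and values once for all positions.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def standard_attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # (n, n) score matrix -> O(n^2) time and memory
    return softmax(scores, axis=-1) @ V

def aft_simple(Q, K, V):
    weights = softmax(K, axis=0)            # normalize each key dimension over positions
    context = (weights * V).sum(axis=0)     # single pooled (d,) summary -> O(n * d)
    return 1.0 / (1.0 + np.exp(-Q)) * context  # sigmoid-gated queries reuse the same summary

n, d = 6, 4
Q, K, V = np.random.default_rng(0).normal(size=(3, n, d))
print(standard_attention(Q, K, V).shape, aft_simple(Q, K, V).shape)  # (6, 4) (6, 4)
```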
Demystifying Transformers: Attention, Multi-Head Magic, and the Math Behind the Revolution
From single-head to multi-head attention: understanding the architectural breakthrough that changed AI forever. The Transformer architecture, introduced in the seminal “Attention Is All You Need” paper, revolutionized natural language processing by replacing recurrent networks with a purely attention-based approach. At its heart lies the self-attention mechanism: a powerful way for models to understand relationships between all words in a sequence simultaneously.
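
As a quick reference for the math behind that excerpt, here is a hedged numpy sketch of scaled dot-product attention, $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^\top / \sqrt{d_k})V$, plus a toy multi-head split; the shapes are illustrative, and a real layer also has learned projections $W_Q, W_K, W_V, W_O$.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ np.swapaxes(K, -1, -2) / np.sqrt(d_k)  # every query scored against every key
    return softmax(scores) @ V                           # weighted sum of values

def multi_head_self_attention(X, num_heads):
    n, d_model = X.shape
    d_head = d_model // num_heads
    heads = X.reshape(n, num_heads, d_head).transpose(1, 0, 2)  # split model dim into heads
    out = scaled_dot_product_attention(heads, heads, heads)     # self-attention per head
    return out.transpose(1, 0, 2).reshape(n, d_model)           # concatenate heads back

X = np.random.default_rng(1).normal(size=(5, 8))
print(multi_head_self_attention(X, num_heads=2).shape)  # (5, 8)
```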
Demystifying RNNs: A Deep Dive into Dimensions and Parameters
Understanding what really happens inside Recurrent Neural Networks. When learning about Recurrent Neural Networks (RNNs), many tutorials focus on the high-level concept of “memory” but gloss over the practical details of how they actually work. As someone who struggled with these details, I want to share the insights that finally made RNNs click for me.
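
To pin down the “dimensions and parameters” part, here is a minimal sketch of a single vanilla (Elman) RNN step, $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$, with illustrative sizes of my own choosing rather than the post’s example.

```python
import numpy as np

input_size, hidden_size = 10, 32

W_xh = np.zeros((hidden_size, input_size))   # projects the current input into hidden space
W_hh = np.zeros((hidden_size, hidden_size))  # carries the previous hidden state forward
b_h  = np.zeros(hidden_size)                 # hidden bias

def rnn_step(x_t, h_prev):
    """One time step: combine the new input with the carried 'memory'."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

n_params = W_xh.size + W_hh.size + b_h.size
print(n_params)  # 10*32 + 32*32 + 32 = 1376 parameters, independent of sequence length
```

The parameter count depends only on the input and hidden sizes, not on how long the sequence is, since the same three weight tensors are reused at every time step.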