Josh Gregory

Transformers and Attention

Machine learning
Artificial intelligence
Thesis
Notes on the transformer architecture and the attention mechanism
Author

Josh Gregory

Published

May 4, 2024

This is the first post in a series in which I synthesize my notes on the transformer architecture, both as a reference for my future self and in case they are helpful to others. The transformer was first introduced in the 2017 paper “Attention Is All You Need” by Vaswani et al., a team of researchers mostly from Google Brain and Google Research.

These notes are based on the excellent YouTube videos by Grant Sanderson of 3Blue1Brown. The full playlist is available on the 3Blue1Brown YouTube channel.

Overview

Citation

BibTeX citation:
@online{gregory2024,
  author = {Gregory, Josh},
  title = {Transformers and {Attention}},
  date = {2024-05-04},
  url = {https://joshgregory42.github.io/posts/2022-10-24-my-blog-post/},
  langid = {en}
}
For attribution, please cite this work as:
Gregory, Josh. 2024. “Transformers and Attention.” May 4, 2024. https://joshgregory42.github.io/posts/2022-10-24-my-blog-post/.
© 2025 Josh Gregory