COSCUP 2025

Let's build a Transformer: JAX Source code explained from scratch
2025-08-10, RB105

The Transformer architecture can be used for a range of NLP and CV tasks. Transformer models are typically pre-trained on large datasets to generate text and images. Building a transformer from scratch lets us customize its components for our own application. The goal of this talk is to demonstrate how a transformer model can be implemented in JAX. To do so, we will build a general-purpose transformer entirely from scratch, all in JAX.


Target Audience:

This talk is aimed at developers who have intermediate knowledge of Python and transformers and who would like to learn what JAX offers for building a transformer and how to use it. The talk also explores how Flax NNX can help implement the model architecture.

Difficulty:

Advanced

John is a Senior AI Engineer, currently focused on developing NLP applications.

He is deeply motivated by challenges and enjoys breaking with conventional ways of thinking and doing. Drawing on prior experience in software engineering, he combines the latest AI technology with engineering practice to turn challenges into practical solutions.
