2024-08-04, 11:20–11:50 (Asia/Taipei), TR211
Meta Llama 3 is a state-of-the-art open-source large language model, and its pretrained models support a broad range of use cases. Llama 3 brings several key improvements over Llama 2. The goal of this talk is to help developers unlock the power of Llama 3. To do so, we're going to deliver a step-by-step walkthrough of the source code and concepts behind Llama 3.
For this talk, we're going to take a bottom-up approach and start from the tokenizer. Here, we'll talk about how Llama 3 handles tokenization and encoding/decoding of text using the Tiktoken tokenizer.
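As a taste of what the walkthrough covers, here is a minimal sketch of the encode/decode round trip with the tiktoken library. Note that Llama 3's actual Tokenizer class wraps tiktoken with its own BPE ranks file and special tokens; the readily available "cl100k_base" encoding is used here purely for illustration.

```python
# Minimal sketch: text -> token ids -> text with tiktoken.
# Llama 3 loads its own BPE ranks and special tokens; "cl100k_base"
# is only a stand-in so this snippet runs out of the box.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Hello, Llama 3!"
token_ids = enc.encode(text)      # text -> list of integer token ids
decoded = enc.decode(token_ids)   # token ids -> text

print(token_ids)
print(decoded == text)            # True: the round trip is lossless
```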
And then we'll move on to the model, where we'll get a better understanding of how Llama 3 was built.
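To preview the kind of building block the talk examines, below is a simplified PyTorch sketch of one Llama-style transformer block: pre-normalization with RMSNorm, self-attention, and a SwiGLU feed-forward, each wrapped in a residual connection. The real Llama 3 model also uses rotary position embeddings, grouped-query attention, and a KV cache, which are omitted here; all class names and dimensions are illustrative.

```python
# Simplified sketch of a Llama-style transformer block (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Normalize by the root mean square over the last dimension.
        norm = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return norm * self.weight

class FeedForward(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection

    def forward(self, x):
        # SwiGLU: silu(gate) * up, then project back down.
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

class TransformerBlock(nn.Module):
    def __init__(self, dim: int = 512, n_heads: int = 8):
        super().__init__()
        self.attn_norm = RMSNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.ffn_norm = RMSNorm(dim)
        self.ffn = FeedForward(dim, 4 * dim)

    def forward(self, x):
        h = self.attn_norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                     # residual around attention
        x = x + self.ffn(self.ffn_norm(x))   # residual around feed-forward
        return x

# Quick shape check: (batch, sequence length, model dimension)
x = torch.randn(2, 16, 512)
print(TransformerBlock()(x).shape)  # torch.Size([2, 16, 512])
```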
Finally, we'll tie all of these pieces together at the top level and talk a bit about how Llama 3 generates text sequences from provided prompts.
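For a flavor of that top-level step, here is a minimal sketch of an autoregressive generation loop: starting from prompt token ids, repeatedly run the model, sample the next token from the softmax of the last position's logits, and append it. The `model` here is assumed to be any callable returning logits of shape (batch, seq_len, vocab_size); Llama 3's actual generate() additionally handles a KV cache, top-p sampling, and stop tokens.

```python
# Minimal sketch of autoregressive text generation (assumptions noted above).
import torch

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=32, temperature=0.8):
    tokens = prompt_ids.clone()                     # (batch, prompt_len)
    for _ in range(max_new_tokens):
        logits = model(tokens)                      # (batch, seq_len, vocab)
        next_logits = logits[:, -1, :] / temperature
        probs = torch.softmax(next_logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)  # (batch, 1)
        tokens = torch.cat([tokens, next_token], dim=1)
    return tokens
```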
This talk should help developers get a concrete idea of how the building blocks of Llama 3 can be implemented with PyTorch. The code and concepts introduced are potentially transferable to implementing other generative language models.
John is a Senior Software Developer at CMoney, currently focusing on developing core modules for the engineering team.
He is deeply motivated by challenges and excited by breaking conventional ways of thinking and doing. With prior experience in Machine Learning research, he works on combining the latest AI technology with engineering to build fun and creative applications.