How to train NanoGPT with Deep Lake streamable dataloader

In this project lesson, we walk through a training example of Andrej Karpathy’s NanoGPT. It is the simplest, fastest repository for training and fine-tuning medium-sized GPTs: the code is just a ~300-line boilerplate training loop and a ~300-line GPT model definition that reproduces GPT-2. While NanoGPT was designed to be run locally, we build on it and overcome the speed constraints of local training by replacing the local data loader with Deep Lake’s streamable data loader.
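To make the idea concrete, here is a minimal sketch of what the swap looks like: streaming batches from a Deep Lake dataset and shaping them into the (x, y) next-token pairs that NanoGPT’s training loop expects. The dataset path and the tensor name "tokens" are illustrative assumptions, not the exact ones used in this lesson.

```python
# A minimal sketch of replacing NanoGPT's local get_batch with a Deep Lake
# streamable dataloader. The dataset path and the "tokens" tensor name are
# assumptions for illustration only.
import torch
import deeplake

block_size = 1024
batch_size = 12
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stream the dataset from Deep Lake storage instead of reading local .bin files.
ds = deeplake.load("hub://activeloop/openwebtext-train")  # hypothetical path

# ds.pytorch() wraps the dataset in a PyTorch-compatible dataloader that
# fetches samples over the network as the training loop requests them.
loader = ds.pytorch(
    num_workers=4,
    batch_size=batch_size,
    shuffle=True,
)

def to_xy(batch):
    """Build (x, y) next-token prediction pairs from a streamed batch,
    mirroring NanoGPT's get_batch contract."""
    tokens = batch["tokens"].long()            # assumed tensor of token ids
    tokens = tokens[:, : block_size + 1]       # assumes samples are long enough
    x = tokens[:, :-1].contiguous().to(device, non_blocking=True)
    y = tokens[:, 1:].contiguous().to(device, non_blocking=True)
    return x, y

for batch in loader:
    x, y = to_xy(batch)
    # logits, loss = model(x, y)  # plug into NanoGPT's training loop here
    break
```

The rest of NanoGPT’s training loop stays unchanged; only the data-fetching step is swapped for the streamed batches.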