The input pipeline of our training process is the more complex part of the entire transformer build. It consists of us taking our raw OSCAR training data, transforming it, and preparing it for Masked-Language Modeling (MLM). Finally, we load our data into a DataLoader ready for training!
Top comments (0)