How to use pre-trained models in your next business project

May 24, 2020 admin 0 Comments

Most of the new deep learning models being released, especially in NLP, are very, very large: They have parameters ranging from hundreds of millions to tens of billions. Given good enough architecture, the larger the model, the more learning capacity it has. Thus, these new models have huge learning capacity and are trained on very, very large datasets. Because of that, they learn the entire distribution of the datasets they are trained on. One can say that they encode compressed knowledge of these datasets. This allows these models to be used for very interesting applications—the most common one being transfer learning. Transfer learning is fine-tuning pre-trained models on…