Tensorflow(@CVision)
News from the fields of deep learning and artificial intelligence
New deep learning papers and findings
Machine vision and image processing

TensorFlow, Keras, Deep Learning, Computer Vision

Course website:
http://class.vision

👨‍💻👩‍💻Course support:
@classvision_support
New Google Brain Optimizer Reduces BERT Pre-Training Time From Days to Minutes

Pre-training time of the BERT language model cut from three days to 76 minutes with a new optimizer!

Google Brain researchers have proposed LAMB (Layer-wise Adaptive Moments optimizer for Batch training), a new optimizer that reduces pre-training time for Google's NLP model BERT (Bidirectional Encoder Representations from Transformers) from three days to just 76 minutes.
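
The key idea is to add a layer-wise trust ratio on top of Adam-style moment estimates, which keeps training stable at the very large batch sizes used for BERT pre-training. Below is a minimal NumPy sketch of the per-layer update rule from the paper (not the authors' TPU implementation); the hyper-parameter values are illustrative only.

```python
# Minimal sketch of one LAMB step for a single layer's weights (per the paper).
import numpy as np

def lamb_update(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                eps=1e-6, weight_decay=0.01):
    """One LAMB step for weights w with gradient g at step t (1-indexed)."""
    # Adam-style first and second moment estimates with bias correction.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Element-wise update direction, plus decoupled weight decay.
    update = m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w

    # Layer-wise trust ratio: rescale the step by ||w|| / ||update||.
    w_norm, u_norm = np.linalg.norm(w), np.linalg.norm(update)
    trust_ratio = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

    w = w - lr * trust_ratio * update
    return w, m, v
```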

Paper: https://arxiv.org/abs/1904.00962
Blog post: https://medium.com/syncedreview/new-google-brain-optimizer-reduces-bert-pre-training-time-from-days-to-minutes-b454e54eda1d

#BERT #language_model #optimizer
Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT

Knowledge Distillation — Transferring generalization capabilities

Knowledge distillation (sometimes also referred to as teacher-student learning) is a compression technique in which a small model is trained to reproduce the behavior of a larger model (or an ensemble of models). It was introduced by Bucila et al. and generalized by Hinton et al. a few years later.
Another way to understand distillation is that it prevents the model from being too confident in its predictions (similar to label smoothing).

We want to compress a large language model (like BERT) using distillation. For the distillation loss we use the Kullback-Leibler divergence between the teacher's and the student's output distributions: its gradients with respect to the student distribution are the same as those of the cross-entropy against the teacher's soft targets, so the two optimizations are equivalent.
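
A minimal PyTorch sketch of such a distillation loss (not the exact DistilBERT training code): the teacher's and student's logits are softened with a temperature T, matched with a KL-divergence term, and combined with the usual cross-entropy on the hard labels; T, alpha and the function name are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # KL divergence between temperature-softened teacher and student distributions;
    # the T*T factor keeps gradient magnitudes comparable as T changes.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy on the hard labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```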

Blog post: https://medium.com/huggingface/distilbert-8cf3380435b5

Code: https://github.com/huggingface/pytorch-transformers/tree/master/examples/distillation

#language_model #BERT
Fast-Bert

This library will help you build and deploy BERT-based models within minutes:

Fast-Bert is a deep learning library that lets developers and data scientists train and deploy BERT- and XLNet-based models for natural language processing tasks, starting with text classification.

FastBert is built on the solid foundations of the excellent Hugging Face BERT PyTorch library and, inspired by fast.ai, strives to make cutting-edge deep learning accessible to the broad community of machine learning practitioners.

With FastBert, you will be able to:

Train (more precisely, fine-tune) BERT, RoBERTa and XLNet text classification models on your custom dataset.

Tune model hyper-parameters such as epochs, learning rate, batch size, optimiser schedule and more.

Save and deploy the trained model for inference (including on AWS SageMaker).

Fast-Bert supports both multi-class and multi-label text classification for these models and, in due course, will support other NLU tasks such as Named Entity Recognition, Question Answering and custom-corpus fine-tuning. A rough usage sketch follows below.
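
A rough fine-tuning sketch based on the library's README-style API (the data paths, CSV and column names below are hypothetical, and argument names may differ between versions; check the repo for the exact, current signatures):

```python
import logging
import torch
from fast_bert.data_cls import BertDataBunch
from fast_bert.learner_cls import BertLearner
from fast_bert.metrics import accuracy

logger = logging.getLogger()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Wrap the training/validation CSVs; text_col/label_col name the columns in them.
databunch = BertDataBunch(
    "data/", "data/",
    tokenizer="bert-base-uncased",
    train_file="train.csv", val_file="val.csv", label_file="labels.csv",
    text_col="text", label_col="label",
    batch_size_per_gpu=16, max_seq_length=128,
    multi_label=False, model_type="bert",
)

# Build a learner from a pretrained checkpoint and fine-tune it.
learner = BertLearner.from_pretrained_model(
    databunch,
    pretrained_path="bert-base-uncased",
    metrics=[{"name": "accuracy", "function": accuracy}],
    device=device, logger=logger, output_dir="output/",
)
learner.fit(epochs=3, lr=6e-5)
learner.save_model()  # the saved model can then be served for inference
```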

Blog post: https://medium.com/huggingface/introducing-fastbert-a-simple-deep-learning-library-for-bert-models-89ff763ad384

Code: https://github.com/kaushaltrivedi/fast-bert

#language_model #BERT
The code and weights of the 1.5-billion-parameter GPT-2 language model have been released...

OpenAI announced the final staged release of its 1.5-billion-parameter language model GPT-2, along with all associated code and model weights.
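
For reference, a quick sketch of loading the released 1.5B checkpoint through the Hugging Face transformers package (where it is published under the model name "gpt2-xl"); the prompt and sampling settings are illustrative, and this is not OpenAI's own release code.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
model.eval()

# Encode a prompt and sample a short continuation.
input_ids = tokenizer.encode("Deep learning is", return_tensors="pt")
with torch.no_grad():
    output = model.generate(input_ids, max_length=40, do_sample=True, top_k=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```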

Announcement:
https://twitter.com/OpenAI/status/1191764001434173440
Blog post:
https://medium.com/syncedreview/openai-releases-1-5-billion-parameter-gpt-2-model-c34e97da56c0

#language_model #gpt2 #nlp #openai