The bond market is a safe, reliable investment to some and a dull market to others, yet it is undeniably the market that makes the headlines in any significant economic development. Though often indecipherable, the bond market is packed with important indicators about the economy, and its yield curve is arguably a simple model for forecasting a recession. It is debatable because the yield curve's predictive record is hard to ignore, even as shifts in the economic landscape may have changed what it signals.
The yield curve was originally called the “fear gauge” by…
“Every human being has a ‘good heart’ and another ‘evil heart’!
Optimization of both is required to excel; it is a non-trivial quest.”
Follow the link to the entire series by clicking here: The complete guide to NLP with fastai
See also The Annotated Transformer from Harvard NLP.
Nvidia AI researcher Chip Huyen wrote a great post, Top 8 trends from ICLR 2019, in which one of the trends is that RNNs are losing their luster with researchers.
There’s a good reason for this: RNNs can be a pain, since parallelization can be tricky and they can be difficult to debug. …
This is exciting because the progress over this journey of learning NLP can be summarized as below:
So let’s get started…
We were using RNNs as part of our language model in the previous lesson. Today, we will dive into more details of what RNNs are and how they work. We will do this using the problem of trying to predict the English word version of numbers.
Let’s predict what should come next in this sequence:
eight thousand one , eight thousand two , eight thousand three , eight thousand four , eight thousand five , eight thousand six …
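The word sequence above can be generated programmatically. The sketch below is a hypothetical helper (the lesson uses a pre-generated dataset, not this code) that converts an integer in 1–9999 to its English word form and builds the token stream the model is asked to continue:

```python
# Hypothetical helper, for illustration only: convert 1-9999 to English
# words, then join a range of numbers into the training token sequence.
ONES = "one two three four five six seven eight nine".split()
TEENS = ("ten eleven twelve thirteen fourteen fifteen "
         "sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

def num_to_words(n):
    parts = []
    if n >= 1000:
        parts += [ONES[n // 1000 - 1], "thousand"]
        n %= 1000
    if n >= 100:
        parts += [ONES[n // 100 - 1], "hundred"]
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10 - 2])
        n %= 10
    if 10 <= n <= 19:
        parts.append(TEENS[n - 10])
        n = 0
    if n >= 1:
        parts.append(ONES[n - 1])
    return " ".join(parts)

# reproduces the sequence shown above
print(" , ".join(num_to_words(i) for i in range(8001, 8007)))
```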
In [6]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import pandas as pd
import re
Jeremy Howard’s lecture is the basis for this review, which follows a three-part plan:
* regex workflow
* SVD
* transfer learning
regex is used every day in NLP work, and it is essential for machine learning practitioners to develop a working knowledge of it. Since we've already done deep dives into SVD and into transfer learning, we'll focus on the regex part of this review.
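As a small illustration of the kind of regex workflow that comes up in this sort of NLP work (this example is mine, not the lecture's): IMDb review files are named `<id>_<rating>.txt`, so a regex with capture groups can pull the star rating out of each filename.

```python
import re

# Illustrative only: extract the star rating from IMDb-style filenames,
# which follow the pattern "<review id>_<rating>.txt".
fnames = ["0_9.txt", "10007_10.txt", "173_7.txt"]
pattern = re.compile(r"(\d+)_(\d+)\.txt")
ratings = [int(pattern.match(f).group(2)) for f in fnames]
print(ratings)  # [9, 10, 7]
```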
regex exercise
To illustrate the power of regex and familiarize us with the…
The BLEU metric was introduced in this article as a way to evaluate the performance of translation models. It's based on the precision of the n-grams in your prediction compared to your target. Let's see an example. Imagine you have the target sentence
the cat is walking in the garden
and your model gives you the following output
the cat is running in the fields
We are going to compute the precision, which is the…
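A minimal sketch of the unigram case for the sentence pair above: clipped unigram precision, the building block of BLEU (the full score also combines 2- to 4-gram precisions and a brevity penalty).

```python
from collections import Counter

# Clipped unigram precision for the example sentence pair above.
target = "the cat is walking in the garden".split()
pred = "the cat is running in the fields".split()

tgt_counts = Counter(target)
# clip each predicted word's count by how often it appears in the target
matches = sum(min(c, tgt_counts[w]) for w, c in Counter(pred).items())
precision = matches / len(pred)
print(matches, len(pred))  # 5 of the 7 predicted unigrams match
```

Here "walking"/"garden" vs. "running"/"fields" are the only mismatches, and the clipping stops a prediction from scoring twice for repeating a target word more often than the target does.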
Constructing a Language Model and a Sentiment Classifier for IMDB movie reviews
Transfer learning has been widely used with great success in computer vision for several years, but only in the last year or so has it been successfully applied to NLP, beginning with ULMFit (which we will use here), which was then built upon by BERT and GPT-2.
As Sebastian Ruder wrote in The Gradient last summer, NLP’s ImageNet moment has arrived.
We will first build a language model for IMDB movie reviews…
We are going to create an IMDb language model starting with the pre-trained weights from the wikitext-103 language model.
Now let’s grab the full IMDb dataset for what follows.
In [4]:
path = untar_data(URLs.IMDB)
path.ls()
Out[4]:
[WindowsPath('C:/Users/cross-entropy/.fastai/data/imdb/data_clas.pkl'),
 ....
 WindowsPath('C:/Users/cross-entropy/.fastai/data/imdb/vocab_lm.pkl')]
In [12]:
(path/'train').ls()
Out[12]:
[WindowsPath('C:/Users/cross-entropy/.fastai/data/imdb/train/labeledBow.feat'),
 ...
 WindowsPath('C:/Users/cross-entropy/.fastai/data/imdb/train/unsupBow.feat')]
The reviews are in a training and test set following an ImageNet structure. The only difference is that there is an unsup folder in train that contains the unlabelled data.
IMDb Sentiment Classifier
We’ll now use transfer learning to create a classifier, again starting from the pretrained weights of the wikitext-103 language model. We'll also need the IMDb language model encoder that we saved previously.
databunch
Using fastai’s flexible API, we will now create a different kind of databunch object, one that is suitable for a classifier rather than for a language model (as we did in 2A). This time we'll keep the labels for the IMDb movie reviews data.
Add the try-except wrapper workaround for…