Textmodel integration #250
base: master
Conversation
I think you've re-added some files from TextModels.jl that we don't need, could you remove those? 🙂
Sure! Will clean up in the next commit.
julia> batches = FastText.load_batchseq(data, task)
julia> batches[1][1]
92-element Vector{Vector{Int64}}:
[25000, 25000, 25000, 25000, 25000, 25000, 25000, 25000]
[633779, 633779, 633779, 633779, 633779, 633779, 633779, 633779]
[2731, 34, 315, 354, 2087, 2209, 70, 1307]
[44047, 435, 633779, 633779, 6589, 633779, 633779, 205]
⋮
[0, 0, 0, 0, 0, 213, 0, 0]
[0, 0, 0, 0, 0, 25, 0, 0]
[0, 0, 0, 0, 0, 1778, 0, 0]
julia> batches[1][2]
8-element Vector{Int64}:
1
1
1
1
1
0
1
1
Next:
Should the vocab CSV files be checked in? I would've assumed they would be artifacts or DataDeps as well.
Porting the TextModels code is going to be a big task, but for now the priority is just to make things work e2e :)
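For reference, registering the vocab as a DataDep could look roughly like this (a sketch only; the name, URL, and checksum below are placeholders, not values from this PR):

using DataDeps

# Hypothetical registration; DataDeps fetches the file lazily on first use.
register(DataDep(
    "imdb_vocab",
    "Vocabulary file for the IMDB text classification recipe.",
    "https://example.com/imdb_vocab.csv",   # placeholder URL
    "<sha256 checksum>",                    # placeholder checksum
))

# Later: vocab_path = joinpath(datadep"imdb_vocab", "imdb_vocab.csv")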
FastText/src/models/pretrain_lm.jl
Outdated
Chain(
    de,
    VarDrop(in_drop_prob),
    AWD_LSTM(embedding_size, hid_lstm_sz, hid_drop_prob; init = (dims...) -> init_weights(1/hid_lstm_sz, dims...)),
Likewise, I think this repeated lambda should be pulled out into a helper function of some sort.
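For illustration, the helper could be as small as this (the name scaled_init is made up here, not something from the PR):

# One initializer closure instead of repeating the lambda at every call site.
scaled_init(hid_sz) = (dims...) -> init_weights(1 / hid_sz, dims...)

# AWD_LSTM(embedding_size, hid_lstm_sz, hid_drop_prob; init = scaled_init(hid_lstm_sz))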
FastText/src/models/custom_layers.jl
Outdated
function asgd_step!(iter::Integer, layer::AWD_LSTM)
    if (iter >= layer.T) & (layer.T > 0)
        p = get_trainable_params([layer])
        avg_fact = 1/max(iter - layer.T + 1, 1)
        if avg_fact != 1
            layer.accum = layer.accum .+ p
            for (ps, accum) in zip(p, layer.accum)
                ps .= avg_fact*accum
            end
        else
            layer.accum = deepcopy(p)   # Accumulator for ASGD
        end
    end
    return
end
@lorenzoh I don't believe we have any utils for this in Flux or FastAI, right? Haven't kept up with the list of callbacks.
I don’t think so. This also seems very model-specific so it’s maybe best to keep it as part of the model.
Co-authored-by: Brian Chen <[email protected]>
Minor changes as suggested by @ToucheSir
julia> data, blocks = load(datarecipes()["imdb"])
((mapobs(loadfile, ObsView(::MLDatasets.FileDataset{typeof(identity), String}, ::Vector{Int64})), mapobs(parentname, ObsView(::MLDatasets.FileDataset{typeof(identity), String}, ::Vector{Int64}))), (Paragraph(), Label{String}(["neg", "pos"])))
julia> task = TextClassificationSingle(blocks, data)
SupervisedTask(Paragraph -> Label{String})
julia> model = FastAI.taskmodel(task, FastText.LanguageModel(false, task))
#90 (generic function with 1 method)
julia> batches = FastText.load_batchseq(data, task)
WARNING: both Losses and NNlib export "ctc_loss"; uses of it in module Flux must be qualified
6250-element Vector{Tuple{Vector{Vector{Int64}}, Flux.OneHotArray{UInt32, 2, 1, 2, Vector{UInt32}}}}:
([[35, 35, 35, 35], [3, 3, 3, 9], [40, 18025, 15, 14], [224, 10, 3541, 3040], [737, 34, 24, 505], [49, 7, 809, 3], [4, 4, 221, 3836], [1927, 104, 4,
3], [7, 16, 629, 28440], [6, 351, 7, 17] … [2, 2, 2, 44], [2, 2, 2, 3], [2, 2, 2, 9839], [2, 2, 2, 17], [2, 2, 2, 1041], [2, 2, 2, 27], [2, 2, 2, 3], [2, 2, 2, 3836], [2, 2, 2, 3], [2, 2, 2, 28440]], [0 0 1 1; 1 1 0 0])
julia> using FluxTraining
julia> td, vd = splitobs(batches, at=0.9)
julia> using Flux
julia> learner = Learner(model, Flux.Losses.logitcrossentropy, callbacks=[Metrics(accuracy)]; data=(td, vd))
Learner()
julia> fit!(learner, 1)
Epoch 1 TrainingPhase(): 0%|█ | ETA: 4 days, 3:35:31 |
Change `AWD_LSTM` to `WeightDroppedLSTM`
Just reviewing the weight dropped LSTM for now
h::S
c::S
These are now replaced by `state0` and stored in `Recur`.
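For context, the suggested pattern mirrors how Flux's own LSTM is set up (the field layout below is illustrative; the actual cell is defined in this PR):

struct WeightDroppedLSTMCell{A,V,S}
    Wi::A
    Wh::A
    b::V
    state0::S   # initial (h, c); replaces the mutable h::S / c::S fields
end

# The running hidden state then lives in the Recur wrapper, as for Flux.LSTM:
Flux.Recur(cell::WeightDroppedLSTMCell) = Flux.Recur(cell, cell.state0)
WeightDroppedLSTM(args...; kwargs...) = Flux.Recur(WeightDroppedLSTMCell(args...; kwargs...))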
This comment is still outstanding.
More changes from the call
true,
(Flux.zeros32(out, 1), Flux.zeros32(out, 1))
)
cell.b[gate(out, 2)] .= 1
Why does it only mask part of the bias @Chandu-4444?
Since it is ported from TextModels.jl I'm not completely sure of this. Even Flux.jl has this line.
Would be helpful if someone could explain the reason.
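For what it's worth: assuming the port keeps Flux's gate ordering (input, forget, cell, output), gate(out, 2) selects the forget-gate slice of the stacked bias, and initializing it to 1 is the standard forget-gate bias trick so the LSTM retains memory early in training; Flux.jl's LSTMCell constructor has the same line. A tiny sketch of what the indexing does (illustrative values only):

using Flux

# The four gate parameters are stacked along dim 1, so the bias has length 4*out.
out = 3
b = zeros(Float32, 4out)
b[Flux.gate(out, 2)] .= 1   # Flux.gate(3, 2) == 4:6, i.e. the forget-gate slice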
testmode!(m::DroppedEmbeddings, mode=true) =
    (m.active = (isnothing(mode) || mode == :auto) ? nothing : !mode; m)

function reset_masks!(de::DroppedEmbeddings)
Overload `Flux.reset!` here.
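One way to do that while keeping the existing logic (a sketch; it simply forwards to the current function rather than guessing at the struct's fields):

import Flux

# Hook the mask regeneration into Flux's standard reset! API so callers can
# call Flux.reset!(model) uniformly instead of a package-specific entry point.
Flux.reset!(de::DroppedEmbeddings) = reset_masks!(de)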
Bumping this comment
Co-authored-by: Kyle Daruwalla <[email protected]>
FastText/src/models/custom_layers.jl
Outdated
mutable struct VarDrop{F}
    p::F
    mask
    reset::Bool
    active::Bool
end

VarDrop(p::Float64=0.0) = VarDrop(p, Array{Float32, 2}(UndefInitializer(), 0, 0), true, true)

function (vd::VarDrop)(x)
    vd.active || return x
    if vd.reset
        vd.mask = drop_mask(x, vd.p)
        vd.reset = false
    end
    return (x .* vd.mask)
end
As promised, here's the proposal from our call today:
mutable struct VarDropCell{F}
    p::F
    active::Union{Bool, Nothing} # matches other norm layers
end
@functor VarDropCell

VarDropCell(p::Real=0.0) = VarDropCell(p, nothing)

function (vd::VarDropCell)((has_mask, mask), x)
    if Flux._isactive(vd)
        mask = has_mask ? mask : drop_mask(x, vd.p)
        return (true, mask), x .* mask
    elseif !has_mask
        return (has_mask, mask), x
    else
        error("Mask set but layer is in test mode. Call `reset!` to clear the mask.")
    end
end

# The single-element array keeps Recur happy.
# Limitation: typeof(p) must == typeof(<inputs>)
VarDrop(p::Real) = Recur(VarDropCell(p), ones(typeof(p), 1, 1))
Plus bits to make `testmode!`, `_isactive`, and `reset!` work properly.
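For illustration only (not part of the proposal above), those pieces could look roughly like this, assuming the cell also stores its initial "no mask yet" state in a state0 field so the generic Flux.reset!(::Recur) works unchanged:

using Flux

# Same convention as Dropout/BatchNorm: `nothing` means "decide from context".
Flux.testmode!(vd::VarDropCell, mode = true) =
    (vd.active = (isnothing(mode) || mode == :auto) ? nothing : !mode; vd)

# Flux._isactive(vd) then resolves `nothing` against the surrounding
# train/test context, exactly as it does for the built-in norm layers.

# With a state0 field on the cell, Flux.reset! on the Recur wrapper falls back
# to the generic method that restores r.state = r.cell.state0, which clears
# the cached mask.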
Co-authored-by: Kyle Daruwalla <[email protected]>
at `DroppedEmbeddings`.
The changes have been merged to #258.
`TextClassifier`.
Try this
h::S
c::S
This comment is still outstanding.
Co-authored-by: Kyle Daruwalla <[email protected]>
FastText/src/models/custom_layers.jl
Outdated
function asgd_step!(iter::Integer, layer::AWD_LSTM)
    if (iter >= layer.T) & (layer.T > 0)
        p = get_trainable_params([layer])
        avg_fact = 1/max(iter - layer.T + 1, 1)
        if avg_fact != 1
            layer.accum = layer.accum .+ p
            for (ps, accum) in zip(p, layer.accum)
                ps .= avg_fact*accum
            end
        else
            layer.accum = deepcopy(p)   # Accumulator for ASGD
        end
    end
    return
end
I don’t think so. This also seems very model-specific so it’s maybe best to keep it as part of the model.
Some of the issues that you posted on Zulip should be resolved by this
FastText/src/models/custom_layers.jl
Outdated
maskWi = Flux.dropout_mask(Flux.rng_from_array(), layer.cell.Wi, layer.cell.p)
maskWh = Flux.dropout_mask(Flux.rng_from_array(), layer.cell.Wh, layer.cell.p)
Suggested change:
maskWi = Flux.dropout_mask(Flux.rng_from_array(layer.cell.Wi), layer.cell.Wi, layer.cell.p)
maskWh = Flux.dropout_mask(Flux.rng_from_array(layer.cell.Wh), layer.cell.Wh, layer.cell.p)
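(If I'm reading Flux's API right, the array argument lets rng_from_array pick an RNG matching the array's device, e.g. CUDA's default RNG for GPU arrays, so the dropout mask ends up on the same device as the weights.)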
testmode!(m::DroppedEmbeddings, mode=true) =
    (m.active = (isnothing(mode) || mode == :auto) ? nothing : !mode; m)

function reset_masks!(de::DroppedEmbeddings)
Bumping this comment
Co-authored-by: Kyle Daruwalla <[email protected]>
at `Flux.rng_from_array`
Can do the following: