Chapter 10, Page 313 #81

@absognety

Description
Hi there, in Chapter 10 (Creating Text Embedding Models) of Part III, in the section "Fine-Tuning an Embedding Model" on page 313, I think there is a typo in this sentence:

After training our cross-encoder, we use the remaining 400,000 sentence pairs (from
our original dataset of 50,000 sentence pairs) as our silver dataset (step 2):

After taking a subset of 10,000 sentence pairs as the gold dataset, there would be 40,000 pairs (not 400,000) remaining from the original dataset of 50,000 sentence pairs.
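The arithmetic can be sketched as follows (the pair contents are placeholders, and variable names like `gold` and `silver` are my own, not the book's):

```python
# Hypothetical stand-in for the original dataset of 50,000 sentence pairs.
pairs = [(f"sentence_a_{i}", f"sentence_b_{i}") for i in range(50_000)]

# Step 1: take a 10,000-pair subset as the labeled "gold" dataset.
gold = pairs[:10_000]

# Step 2: the remainder becomes the "silver" dataset to be
# pseudo-labeled by the trained cross-encoder.
silver = pairs[10_000:]

print(len(gold), len(silver))  # 10000 40000
```

So the remaining silver set is 40,000 pairs, which is why "400,000" reads as a typo.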

This sentence appears after the following code sample (shown here with the `sentence_transformers` import it needs to run):

from sentence_transformers import CrossEncoder

# Train a cross-encoder on the gold dataset
cross_encoder = CrossEncoder("bert-base-uncased", num_labels=2)
cross_encoder.fit(
    train_dataloader=gold_dataloader,
    epochs=1,
    show_progress_bar=True,
    warmup_steps=100,
    use_amp=False
)

Thanks!
