Hi there, in Chapter 10 - Creating Text Embedding Models (Part III), in the section "Fine-Tuning an Embedding Model" on page 313, I think there is a typo in this sentence:
"After training our cross-encoder, we use the remaining 400,000 sentence pairs (from our original dataset of 50,000 sentence pairs) as our silver dataset (step 2):"
After taking the subset of 10,000 sentence pairs, there would be 40,000 pairs remaining from the original dataset of 50,000 sentence pairs, so I believe "400,000" should read "40,000".
The sentence appears right after this code sample:
from sentence_transformers.cross_encoder import CrossEncoder

# Train a cross-encoder on the gold dataset
# (`gold_dataloader` is the DataLoader over the 10,000-pair gold subset built earlier)
cross_encoder = CrossEncoder("bert-base-uncased", num_labels=2)
cross_encoder.fit(
    train_dataloader=gold_dataloader,
    epochs=1,
    show_progress_bar=True,
    warmup_steps=100,
    use_amp=False
)
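For reference, the step that sentence describes (step 2: labeling the remaining pairs with the trained cross-encoder to build the silver dataset) would look roughly like the sketch below. The name silver_pairs and the argmax over the softmax scores are my assumptions for illustration, not the book's exact code:

# Rough sketch of step 2 (not the book's exact code): label the remaining
# ~40,000 sentence pairs with the trained cross-encoder to build the silver dataset.
# `silver_pairs` is assumed to be a list of (sentence1, sentence2) tuples left over
# after the 10,000-pair gold subset was taken from the original 50,000 pairs.
import numpy as np

scores = cross_encoder.predict(silver_pairs, apply_softmax=True)  # shape: (n_pairs, 2)
silver_labels = np.argmax(scores, axis=1)  # 1 = similar, 0 = dissimilar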
Thanks!