System Info
python: 3.11.9
transformers: 4.43.3
torch: 2.4.0+cu121
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
When using `pipeline` to generate text with Llama 3.1 8B, the output is different every time for the same prompt, even though I have fixed all the random seeds. If I set `do_sample=False`, the output is the same each time.
I understand that `do_sample` performs top-p (or top-k) sampling and therefore introduces randomness, but since I have fixed the seed, shouldn't the outputs be the same?
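For context, here is a minimal check (plain torch, no transformers) suggesting that a single `torch.manual_seed` fixes the *sequence* of draws rather than each individual draw, which might be why the five loop iterations below differ from one another:

```python
import torch

torch.manual_seed(1)
print(torch.rand(3))  # first draw
print(torch.rand(3))  # second draw: different values, the global RNG state advanced

torch.manual_seed(1)
print(torch.rand(3))  # re-seeding reproduces the first draw exactly
```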
Below is the script to reproduce:
```python
import os, random
import numpy as np
import torch
import transformers

cache_dir = "some_dir"
model_size = "8B"
model_id = f"meta-llama/Meta-Llama-3.1-{model_size}-Instruct"

def seed_everything(seed=1):
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # torch.use_deterministic_algorithms(True)

seed_everything(1)

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16, "cache_dir": cache_dir},
    device_map="auto",
    # do_sample=False
)

message = [
    {"role": "system", "content": "You are a helpful assistant that generate random sentences."},
    {"role": "user", "content": "please generate a random sentence."},
]

for _ in range(5):
    outputs = pipeline(
        message,
        max_new_tokens=2048,
    )
    print(outputs[0]["generated_text"][-1]["content"])
```
Expected behavior
With the random seed fixed and the same prompt, each generation should give the same result.
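If the advancing RNG state is indeed the cause, one possible workaround (a sketch, assuming `transformers.set_seed` covers all the relevant generators) is to re-seed before every call, which should make all five iterations print the same output:

```python
from transformers import set_seed

for _ in range(5):
    set_seed(1)  # re-seed the Python, NumPy, and torch RNGs before each generation
    outputs = pipeline(message, max_new_tokens=2048)
    print(outputs[0]["generated_text"][-1]["content"])
```

That said, it is unclear to me whether `pipeline` is *supposed* to re-seed between calls, hence this report.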