$ torchrun --nproc_per_node 1 test2_text_completion.py \
    --tokenizer_path tokenizer.model --max_seq_len 128 --max_batch_size 4 --ckpt_dir llama-2-7b/ \
    --prompt "Hello everyone, I'm LLAMA-2"
...
Loaded in 4.32 seconds
Hello everyone, I'm LLAMA-2 000. I'm a new member here. I'm from China. I'm a girl, I'm 16. I like animals, especially lamas. I like to play games, especially the games on the internet. I like to make friends. I hope I can make