Llama3 in Google Colab

Published
Gaudhiwaa Hendrasto

This is the link to the Llama3 model: https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct. Make sure that you already have access to the model: on the model page, your access request status should show as accepted.

Run this script, and then enter your Hugging Face token when prompted.

!huggingface-cli login
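If you prefer not to use the CLI prompt, the `huggingface_hub` library also offers a `login()` function you can call from a notebook cell. A minimal sketch (the token itself is left out; paste your own from your Hugging Face settings page):

```python
from huggingface_hub import login

# Calling login(token=...) authenticates this session without the CLI prompt.
# Uncomment the line below and paste your own token:
# login(token="hf_...")
```

Either approach stores the credential for the rest of the Colab session, so the pipeline in the next step can download the gated model.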

Run this script, customizing the messages as you like.

# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Do you know elon musk?"},
]
pipe = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct", max_new_tokens=50)
pipe(messages)

This is the result.

Output:
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
[{'generated_text': [{'role': 'user', 'content': 'Do you know elon musk?'},
   {'role': 'assistant',
    'content': 'Yes, I do know Elon Musk. He is a South African-born entrepreneur, inventor, and business magnate. Born on June 28, 1971, Musk is best known for his ambitious goals in revolutionizing various industries, including:\n\n1'}]}]
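The pipeline returns the whole conversation, so the assistant's reply is the last message inside `generated_text`. A minimal sketch of pulling just that text out, using a hard-coded copy of the output above for illustration:

```python
# Sample of the structure the pipeline returns: a list with one item per
# input, each holding the full conversation under 'generated_text'.
result = [{'generated_text': [
    {'role': 'user', 'content': 'Do you know elon musk?'},
    {'role': 'assistant', 'content': 'Yes, I do know Elon Musk. ...'},
]}]

# The assistant's reply is the last message in the conversation.
reply = result[0]['generated_text'][-1]['content']
print(reply)
```

In a real run you would replace the hard-coded `result` with the return value of `pipe(messages)`.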

All of the scripts are available on Google Colab.

Written by Gaudhiwaa Hendrasto