Monday #1335

Status: Open. Wants to merge 132 commits into base `system`.
Commits (132)
c7c0a30
Remove unused Any import
kit1980 Jul 18, 2023
89092cd
added type hint in example code
tomorrmato Jul 19, 2023
3805003
Update download.sh to not use hardcoded bash path
vinnymeller Jul 20, 2023
7c0a08e
Update LICENSE
sagindykovsl Jul 21, 2023
d3b26d0
Update MODEL_CARD.md
sagindykovsl Jul 21, 2023
f5af1e5
Update README.md
sagindykovsl Jul 21, 2023
3cd7ef6
Remove linkshim workaround from README
Daniel15 Jul 21, 2023
99e19d4
Update README.md
eltociear Jul 22, 2023
a511b0d
Merge pull request #626 from facebookresearch/system
ruanslv Aug 4, 2023
82ce861
updates
ruanslv Aug 7, 2023
14dcd8e
compute token logprobs after completed token is sampled
huy-ha Aug 9, 2023
9eb31e5
fix line separators in download.sh for wsl2
MarcoSteinke Aug 9, 2023
14441f1
still return log probs when no completion required
huy-ha Aug 9, 2023
008385a
Update UPDATES.md
jspisak Aug 11, 2023
ea9f33d
Merge pull request #664 from jspisak/main-2
ruanslv Aug 11, 2023
c25b02d
fix max_batch_size for chat example
yanxiyue Aug 22, 2023
a668741
Merge pull request #703 from yanxiyue/main
jspisak Aug 26, 2023
e36eaa3
Merge pull request #512 from SulimanSagindykov/patch-1
jspisak Aug 26, 2023
1a24068
Merge pull request #511 from SulimanSagindykov/patch-2
jspisak Aug 26, 2023
cb8f042
add docstrings
rajveer43 Aug 28, 2023
7bcee80
update comments in model.py
rajveer43 Aug 28, 2023
8cd608c
added remanjg docs
rajveer43 Aug 28, 2023
a102a59
Update download.sh to resume download of partially downloaded files
samuelselvan Aug 29, 2023
ce27a98
Update README.md
NinoRisteski Aug 31, 2023
6a9bec0
Merge pull request #742 from NinoRisteski/patch-1
vubui Aug 31, 2023
49f0749
Merge pull request #510 from SulimanSagindykov/patch-3
luccabb Aug 31, 2023
1df1f88
Merge pull request #738 from facebookresearch/download-sh-continue
vubui Aug 31, 2023
b5ecd3a
Merge pull request #358 from kit1980/patch-1
vubui Aug 31, 2023
0137f15
Merge pull request #729 from rajveer43/rajveer-patch-1
paksha Aug 31, 2023
ec6a3fd
Merge pull request #650 from MarcoSteinke/main
paksha Aug 31, 2023
4c370db
Merge pull request #492 from eltociear/patch-1
paksha Aug 31, 2023
8992dea
Merge pull request #390 from tomorrmato/add_example_type_hint
paksha Aug 31, 2023
4649acd
use 'md5' instead of 'md5sum' if Applie Silicon
godpeny Aug 28, 2023
6aba873
Merge pull request #490 from Daniel15/patch-1
leonwan23 Sep 1, 2023
a255a05
Merge pull request #727 from godpeny/feat/dl_script
ghk Sep 1, 2023
7565eb6
make download.sh executable (#695)
dangbert Sep 1, 2023
1446089
making a small change to avoid a confusion
Sep 2, 2023
346627f
Merge branch 'facebookresearch:main' into fix
Sep 3, 2023
eb07062
Fix download.sh shebang for NixOS
jheidbrink Sep 3, 2023
8432e48
Update model.py
Sep 3, 2023
8580eb9
Update model.py
Sep 3, 2023
4e24858
Merge pull request #754 from JaredLevi18/fix
jspisak Sep 3, 2023
7706271
Merge pull request #451 from vinnymeller/download-sh-more-flexible
lmarcon Sep 5, 2023
dd6dbbf
Create FAQ.md
jspisak Sep 7, 2023
5827703
Update README.md
jspisak Sep 7, 2023
c769dfd
Merge pull request #647 from huy-ha/main
bashnick Sep 7, 2023
bfbbf1d
Update FAQ.md
sekyondaMeta Sep 8, 2023
646e6d6
Update FAQ.md
sekyondaMeta Sep 8, 2023
797f929
Update FAQ.md
sekyondaMeta Sep 8, 2023
fb624f4
Update FAQ.md
sekyondaMeta Sep 8, 2023
bb2f693
Update FAQ.md
jspisak Sep 8, 2023
7350119
Update FAQ.md
jspisak Sep 9, 2023
2db73a5
Merge pull request #769 from sekyondaMeta/FAQ-updates
jspisak Sep 9, 2023
6c2f236
Update README.md
sekyondaMeta Sep 9, 2023
d06e1e1
Update README.md
sekyondaMeta Sep 9, 2023
001b672
Update README.md
sekyondaMeta Sep 9, 2023
f2e6eac
Update README.md
sekyondaMeta Sep 9, 2023
ac19393
Update README.md
sekyondaMeta Sep 9, 2023
1bc5221
Merge pull request #755 from jheidbrink/main
jspisak Sep 10, 2023
c9c493f
add seed
Sep 11, 2023
46646b8
Merge pull request #775 from sekyondaMeta/readmeUpdate
sekyondaMeta Sep 11, 2023
d7e2e37
Update FAQ.md
jspisak Sep 14, 2023
7173899
Merge pull request #779 from javier-m/add-seed
jspisak Sep 15, 2023
4869110
Update FAQ.md
jspisak Sep 16, 2023
d58f9ae
Update FAQ.md
jspisak Sep 16, 2023
9f0e393
Update README.md
jspisak Sep 17, 2023
a5e37ce
Update MODEL_CARD.md
jspisak Sep 20, 2023
5c10818
Update FAQ.md
jspisak Sep 20, 2023
843e41f
Merge pull request #814 from facebookresearch/jspisak-patch-2
jspisak Sep 21, 2023
b00a461
Merge pull request #813 from facebookresearch/jspisak-patch-1
jspisak Sep 21, 2023
4660bd3
Add "--continue" flag to wget for model binary in order to resume dow…
kierenAW Sep 23, 2023
f29c9a8
Update README.md
jspisak Sep 26, 2023
5e13e29
Merge pull request #829 from facebookresearch/jspisak-patch-3
jspisak Sep 26, 2023
7e1b864
Merge pull request #822 from kierenAW/main
samuelselvan Sep 29, 2023
98851c3
Update FAQ.md
sekyondaMeta Oct 11, 2023
5d9bb58
Update FAQ.md
sekyondaMeta Oct 11, 2023
0da077c
Update FAQ.md
jspisak Oct 11, 2023
556949f
Merge pull request #851 from sekyondaMeta/FAQ-updates
jspisak Oct 11, 2023
f9ddb1d
change "Content Length" to "Context Length MODEL_CARD.md
yonashub Oct 15, 2023
6b8cff0
Merge pull request #859 from yonashub/patch-1
jspisak Oct 15, 2023
0cc2987
Update issue templates
subramen Oct 16, 2023
1c95a19
Merge pull request #860 from facebookresearch/add-issue-template
jspisak Oct 16, 2023
06faf3a
Add FAQs
subramen Oct 18, 2023
786af96
Update README.md
jspisak Nov 2, 2023
3f750f4
Merge pull request #890 from facebookresearch/jspisak-patch-4
jspisak Nov 2, 2023
664ddc8
Delete FAQ.md
jspisak Nov 2, 2023
b5cd38a
Merge pull request #891 from facebookresearch/jspisak-patch-5
jspisak Nov 2, 2023
7909dee
Correct "bug," typo to "bug", in README.md
JacobHelwig Nov 2, 2023
54d4463
Merge pull request #897 from JacobHelwig/main
jspisak Nov 2, 2023
e9077bd
Fix key-value caching for seqlen != 1
flu0r1ne Nov 3, 2023
9cd8d50
Update issue templates
subramen Nov 8, 2023
dccf644
fix faq link
subramen Nov 8, 2023
94b055f
Update README.md
jspisak Nov 10, 2023
4835a30
Merge pull request #916 from facebookresearch/jspisak-patch-6
jspisak Nov 10, 2023
6b3154b
Update transformer mask comment
flu0r1ne Nov 13, 2023
cd0719d
Correct KV comment seqlen -> seqlen + cache_len
flu0r1ne Nov 13, 2023
ef351e9
Merge pull request #900 from flu0r1ne/main
ruanslv Nov 14, 2023
53b227b
Update README.md
ryanhankins Feb 21, 2024
3f61918
Merge pull request #1033 from ryanhankins/patch-1
jspisak Feb 23, 2024
c28bdb5
Updating contributor guide
fbnav Feb 28, 2024
6796a91
Merge pull request #1046 from facebookresearch/update-contributing_guide
jspisak Feb 28, 2024
acdb925
Update README.md
ShorthillsAI Mar 1, 2024
a0a4da8
Merge pull request #1053 from shorthills-ai/main
jspisak Mar 1, 2024
11ebe80
Update README.md
ShorthillsAI Mar 6, 2024
9a001c7
Merge pull request #1058 from shorthills-ai/main
jspisak Mar 6, 2024
0b46616
change LLaMA to Llama in README
jeffxtang Mar 13, 2024
2f58b8d
Merge pull request #1063 from jeffxtang/LLaMA_lowercase
jspisak Mar 13, 2024
826ad11
Update README.md
jspisak Mar 20, 2024
52afd48
Merge pull request #1076 from meta-llama/jspisak-patch-7
jspisak Mar 20, 2024
1e83758
update the code to use the module's __call__
mst272 Mar 21, 2024
54c22c0
Merge pull request #1077 from mst272/main
subramen Mar 21, 2024
1f9a8d7
Update MODEL_CARD.md
MattGurney Mar 23, 2024
fd73089
Update README.md
osanseviero Apr 8, 2024
04b200c
Merge pull request #1091 from osanseviero/patch-1
samuelselvan Apr 9, 2024
b8348da
Merge pull request #1079 from MattGurney/fix-model-card
samuelselvan Apr 9, 2024
893ff97
README: LLama 2 is no longer the latest version
dandv May 14, 2024
be327c4
Merge pull request #1124 from dandv/patch-1
jspisak May 14, 2024
c0098be
Update download.sh
hyungupark May 15, 2024
12b676b
Update download.sh
samuelselvan Jul 23, 2024
66bc730
Update download.sh
samuelselvan Jul 23, 2024
227d378
Merge pull request #1125 from hyungupark/patch-1
samuelselvan Jul 23, 2024
8fac8be
Update README.md
jspisak Jul 23, 2024
ff2e4fd
Update bug_report.md
giandalia1 Jan 24, 2025
689c7f2
Update README.md
amitsangani Jan 26, 2025
bf159f3
Create django.yml
giandalia1 Jan 29, 2025
da130a0
Update example_text_completion.py
giandalia1 Mar 28, 2025
43e90e8
Merge branch 'meta-llama:main' into main
giandalia1 Apr 1, 2025
77e6153
Update README.md
giandalia1 Apr 6, 2025
e0e4816
Merge pull request #1 from giandalia1/giandalia1-patch-1
giandalia1 Apr 6, 2025
f2306eb
Update README.md
giandalia1 Apr 28, 2025
a689377
Update tokenizer.py
giandalia1 Aug 2, 2025
6448dd5
Update generation.py
giandalia1 Aug 2, 2025
Files changed
38 changes: 38 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,38 @@
---
name: Bug report
about: Create a report to help us reproduce and fix the issue
title: ''
labels: ''
assignees: ''

---

**Before submitting a bug, please make sure the issue hasn't been already addressed by searching through the [FAQs](https://ai.meta.com/llama/faq/) and [existing/past issues](https://github.com/facebookresearch/llama/issues)**

## Describe the bug
<Please provide a clear and concise description of what the bug is. If relevant, please include a _minimal_ (least lines of code necessary) _reproducible_ (running this will give us the same result as you get) code snippet. Make sure to include the relevant imports.>

### Minimal reproducible example
<Remember to wrap the code in ```` ```triple-quotes blocks``` ````>

```python
# sample code to repro the bug
```

### Output
<Remember to wrap the output in ```` ```triple-quotes blocks``` ````>

```
<paste stacktrace and other outputs here>
```

## Runtime Environment
- Model: [eg: `llama-2-7b-chat`]
- Using via huggingface?: [yes/no]
- OS: [eg. Linux/Ubuntu, Windows]
- GPU VRAM:
- Number of GPUs:
- GPU Make: [eg: Nvidia, AMD, Intel]

**Additional context**
Add any other context about the problem or environment here.
30 changes: 30 additions & 0 deletions .github/workflows/django.yml
@@ -0,0 +1,30 @@
name: Django CI

on:
push:
branches: [ "main" ]
pull_request:
branches: [ "main" ]

jobs:
build:

runs-on: ubuntu-latest
strategy:
max-parallel: 4
matrix:
python-version: [3.7, 3.8, 3.9]

steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install Dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
- name: Run Tests
run: |
python manage.py test
8 changes: 7 additions & 1 deletion CONTRIBUTING.md
@@ -3,7 +3,9 @@ We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.
We welcome your pull requests.

### For requests regarding bug-fixes or improvements to the core model:

1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
@@ -12,6 +14,10 @@ We actively welcome your pull requests.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

### For requests regarding new feature support, adding additional platform support and model use cases, please contribute to the [llama-recipes repo](https://github.com/facebookresearch/llama-recipes).
<br><br>


## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Meta's open source projects.
2 changes: 1 addition & 1 deletion LICENSE
@@ -104,7 +104,7 @@ owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity
(including a cross-claim or counterclaim in a lawsuit) alleging that the Llama
Materials or Llama 2 outputs or results, or any portion of any of the foregoing,
constitutes infringement of intellectual property or other rights owned or licensable
constitutes an infringement of intellectual property or other rights owned or licensable
by you, then any licenses granted to you under this Agreement shall terminate as of
the date such litigation or claim is filed or instituted. You will indemnify and hold
harmless Meta from and against any claim by any third party arising out of or related
10 changes: 6 additions & 4 deletions MODEL_CARD.md
@@ -10,9 +10,9 @@ Meta developed and released the Llama 2 family of large language models (LLMs),

**Output** Models generate text only.

**Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety.
**Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

||Training Data|Params|Content Length|GQA|Tokens|LR|
||Training Data|Params|Context Length|GQA|Tokens|LR|
|---|---|---|---|---|---|---|
Llama 2|*A new mix of publicly available online data*|7B|4k|&#10007;|2.0T|3.0 x 10<sup>-4</sup>
Llama 2|*A new mix of publicly available online data*|13B|4k|&#10007;|2.0T|3.0 x 10<sup>-4</sup>
@@ -33,7 +33,9 @@ Llama 2|*A new mix of publicly available online data*|70B|4k|&#10004;|2.0T|1.5 x
# **Intended Use**
**Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

**Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Llama 2.
**Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 2 Community License. Use in languages other than English**.

**Note: Developers may fine-tune Llama 2 models for languages beyond English provided they comply with the Llama 2 Community License and the Acceptable Use Policy.

# **Hardware and Software**
**Training Factors** We used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.
@@ -69,7 +71,7 @@ For all the evaluations, we use our internal evaluations library.
|Llama 2|13B|24.5|66.9|55.4|65.8|28.7|54.8|39.4|39.1|
|Llama 2|70B|**37.5**|**71.9**|**63.6**|**69.4**|**35.2**|**68.9**|**51.2**|**54.2**|

**Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP. *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA. We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *MATH:* We report the average of the GSM8K (8 shot) and MATH (4 shot) benchmarks at top 1.
**Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP. *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA. We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *MATH:* We report the average of the GSM8K (8 shot) and MATH (4 shot) benchmarks at the top 1.

|||TruthfulQA|Toxigen|
|---|---|---|---|
82 changes: 57 additions & 25 deletions README.md
@@ -1,44 +1,74 @@
# Llama 2
## **Note of deprecation**

We are unlocking the power of large language models. Our latest version of Llama is now accessible to individuals, creators, researchers and businesses of all sizes so that they can experiment, innovate and scale their ideas responsibly.
Thank you for developing with Llama models. As part of the Llama 3.1 release, we’ve consolidated GitHub repos and added some additional repos as we’ve expanded Llama’s functionality into being an e2e Llama Stack. Please use the following repos going forward:
- [llama-models](https://github.com/meta-llama/llama-models) - Central repo for the foundation models including basic utilities, model cards, license and use policies
- [PurpleLlama](https://github.com/meta-llama/PurpleLlama) - Key component of Llama Stack focusing on safety risks and inference time mitigations
- [llama-toolchain](https://github.com/meta-llama/llama-toolchain) - Model development (inference/fine-tuning/safety shields/synthetic data generation) interfaces and canonical implementations
- [llama-agentic-system](https://github.com/meta-llama/llama-agentic-system) - E2E standalone Llama Stack system, along with opinionated underlying interface, that enables creation of agentic applications
- [llama-cookbook](https://github.com/meta-llama/llama-recipes) - Community driven scripts and integrations

This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters.
If you have any questions, please feel free to file an issue on any of the above repos and we will do our best to respond in a timely manner.

This repository is intended as a minimal example to load [Llama 2](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/) models and run inference. For more detailed examples leveraging HuggingFace, see [llama-recipes](https://github.com/facebookresearch/llama-recipes/).
Thank you!

## System Prompt Update

### Observed Issue
We received feedback from the community on our prompt template and we are providing an update to reduce the false refusal rates seen. False refusals occur when the model incorrectly refuses to answer a question that it should, for example due to overly broad instructions to be cautious in how it provides responses.
# (Deprecated) Llama 2

### Updated approach
Based on evaluation and analysis, we recommend the removal of the system prompt as the default setting. Pull request [#626](https://github.com/facebookresearch/llama/pull/626) removes the system prompt as the default option, but still provides an example to help enable experimentation for those using it.
We are unlocking the power of large language models. Llama 2 is now accessible to individuals, creators, researchers, and businesses of all sizes so that they can experiment, innovate, and scale their ideas responsibly.

## Download
This release includes model weights and starting code for pre-trained and fine-tuned Llama language models — ranging from 7B to 70B parameters.

This repository is intended as a minimal example to load [Llama 2](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/) models and run inference. For more detailed examples leveraging Hugging Face, see [llama-cookbook](https://github.com/facebookresearch/llama-recipes/).

⚠️ **7/18: We're aware of people encountering a number of download issues today. Anyone still encountering issues should remove all local files, re-clone the repository, and [request a new download link](https://ai.meta.com/resources/models-and-libraries/llama-downloads/). It's critical to do all of these in case you have local corrupt files. When you receive the email, copy *only* the link text - it should begin with https://download.llamameta.net and not with https://l.facebook.com, which will give errors.**
## Updates post-launch

See [UPDATES.md](UPDATES.md). Also for a running list of frequently asked questions, see [here](https://ai.meta.com/llama/faq/).

## Download

In order to download the model weights and tokenizer, please visit the [Meta AI website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and accept our License.
In order to download the model weights and tokenizer, please visit the [Meta website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and accept our License.

Once your request is approved, you will receive a signed URL over email. Then run the download.sh script, passing the URL provided when prompted to start the download. Make sure that you copy the URL text itself, **do not use the 'Copy link address' option** when you right click the URL. If the copied URL text starts with: https://download.llamameta.net, you copied it correctly. If the copied URL text starts with: https://l.facebook.com, you copied it the wrong way.
Once your request is approved, you will receive a signed URL over email. Then run the download.sh script, passing the URL provided when prompted to start the download.

Pre-requisites: make sure you have `wget` and `md5sum` installed. Then to run the script: `./download.sh`.
Pre-requisites: Make sure you have `wget` and `md5sum` installed. Then run the script: `./download.sh`.

Keep in mind that the links expire after 24 hours and a certain amount of downloads. If you start seeing errors such as `403: Forbidden`, you can always re-request a link.

### Access on Hugging Face
### Access to Hugging Face

We are also providing downloads on [Hugging Face](https://huggingface.co/meta-llama). You must first request a download from the Meta AI website using the same email address as your Hugging Face account. After doing so, you can request access to any of the models on Hugging Face and within 1-2 days your account will be granted access to all versions.
We are also providing downloads on [Hugging Face](https://huggingface.co/meta-llama). You can request access to the models by acknowledging the license and filling in the form in the model card of a repo. After doing so, you should get access to all the Llama models of a version (Code Llama, Llama 2, or Llama Guard) within 1 hour.

## Setup
## Quick Start

In a conda env with PyTorch / CUDA available, clone the repo and run in the top-level directory:
You can follow the steps below to quickly get up and running with Llama 2 models. These steps will let you run quick inference locally. For more examples, see the [Llama 2 cookbook repository](https://github.com/facebookresearch/llama-recipes).

1. In a conda env with PyTorch / CUDA available clone and download this repository.

2. In the top-level directory run:
```bash
pip install -e .
```
3. Visit the [Meta website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and register to download the model/s.

4. Once registered, you will get an email with a URL to download the models. You will need this URL when you run the download.sh script.

5. Once you get the email, navigate to your downloaded llama repository and run the download.sh script.
- Make sure to grant execution permissions to the download.sh script
- During this process, you will be prompted to enter the URL from the email.
- Do not use the “Copy Link” option but rather make sure to manually copy the link from the email.

6. Once the model/s you want have been downloaded, you can run the model locally using the command below:
```bash
torchrun --nproc_per_node 1 example_chat_completion.py \
--ckpt_dir llama-2-7b-chat/ \
--tokenizer_path tokenizer.model \
--max_seq_len 512 --max_batch_size 6
```
**Note**
- Replace `llama-2-7b-chat/` with the path to your checkpoint directory and `tokenizer.model` with the path to your tokenizer model.
- The `--nproc_per_node` should be set to the [MP](#inference) value for the model you are using.
- Adjust the `max_seq_len` and `max_batch_size` parameters as needed.
- This example runs the [example_chat_completion.py](example_chat_completion.py) found in this repository but you can change that to a different .py file.

## Inference

@@ -56,7 +86,7 @@ All models support sequence length up to 4096 tokens, but we pre-allocate the ca

These models are not finetuned for chat or Q&A. They should be prompted so that the expected answer is the natural continuation of the prompt.

See `example_text_completion.py` for some examples. To illustrate, see command below to run it with the llama-2-7b model (`nproc_per_node` needs to be set to the `MP` value):
See `example_text_completion.py` for some examples. To illustrate, see the command below to run it with the llama-2-7b model (`nproc_per_node` needs to be set to the `MP` value):

```
torchrun --nproc_per_node 1 example_text_completion.py \
```
@@ -70,23 +100,23 @@
The fine-tuned models were trained for dialogue applications. To get the expected features and performance for them, a specific formatting defined in [`chat_completion`](https://github.com/facebookresearch/llama/blob/main/llama/generation.py#L212)
needs to be followed, including the `INST` and `<<SYS>>` tags, `BOS` and `EOS` tokens, and the whitespaces and breaklines in between (we recommend calling `strip()` on inputs to avoid double-spaces).
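The single-turn case of this template can be sketched as follows. This is an illustrative helper, not the repository's actual `chat_completion` function: the `BOS`/`EOS` special tokens (normally added by the tokenizer) and multi-turn handling are omitted, and the name `format_turn` is our own.

```python
from typing import Optional

# Tag constants as described above (assumed from the template description).
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def format_turn(user_msg: str, system_msg: Optional[str] = None) -> str:
    """Wrap a single user turn in the Llama 2 chat template (sketch only)."""
    content = user_msg.strip()  # strip() avoids double-spaces around the tags
    if system_msg is not None:
        # The system prompt is folded into the first user turn.
        content = B_SYS + system_msg + E_SYS + content
    return f"{B_INST} {content} {E_INST}"

print(format_turn("What is the capital of France?", "Answer briefly."))
```

The whitespace inside the `[INST] … [/INST]` wrapper matters, which is why the text recommends `strip()` on inputs.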

You can also deploy additional classifiers for filtering out inputs and outputs that are deemed unsafe. See the llama-recipes repo for [an example](https://github.com/facebookresearch/llama-recipes/blob/main/inference/inference.py) of how to add a safety checker to the inputs and outputs of your inference code.
You can also deploy additional classifiers for filtering out inputs and outputs that are deemed unsafe. See the llama-cookbook repo for [an example](https://github.com/facebookresearch/llama-recipes/blob/main/examples/inference.py) of how to add a safety checker to the inputs and outputs of your inference code.

Examples using llama-2-7b-chat:

```
torchrun --nproc_per_node 1 example_chat_completion.py \
--ckpt_dir llama-2-7b-chat/ \
--tokenizer_path tokenizer.model \
--max_seq_len 512 --max_batch_size 4
--max_seq_len 512 --max_batch_size 6
```

Llama 2 is a new technology that carries potential risks with use. Testing conducted to date has not — and could not — cover all scenarios.
In order to help developers address these risks, we have created the [Responsible Use Guide](Responsible-Use-Guide.pdf). More details can be found in our research paper as well.

## Issues

Please report any software “bug,” or other problems with the models through one of the following means:
Please report any software “bug”, or other problems with the models through one of the following means:
- Reporting issues with the model: [github.com/facebookresearch/llama](http://github.com/facebookresearch/llama)
- Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback)
- Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info)
@@ -106,5 +136,7 @@ See the [LICENSE](LICENSE) file, as well as our accompanying [Acceptable Use Pol
2. [Llama 2 technical overview](https://ai.meta.com/resources/models-and-libraries/llama)
3. [Open Innovation AI Research Community](https://ai.meta.com/llama/open-innovation-ai-research-community/)

## Original LLaMA
For common questions, the FAQ can be found [here](https://ai.meta.com/llama/faq/) which will be kept up to date over time as new questions arise.

## Original Llama
The repo for the original llama release is in the [`llama_v1`](https://github.com/facebookresearch/llama/tree/llama_v1) branch.