
Commit e7c6fc6

docs: add cpu benchmark (#1366)
* cpu benchmark
* try to fix formatting
* cleanup
* cleanup

---------

Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
1 parent aa57bd8 commit e7c6fc6


1 file changed

+15
-1
lines changed


docs/source/non_cuda_backends.mdx

Lines changed: 15 additions & 1 deletion
@@ -24,4 +24,18 @@ Thank you for your support!
 
 ### Intel
 
-### AMD
+The following performance data is collected from Intel 4th Gen Xeon (SPR) platform. The tables show speed-up and memory compared with different data types of [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf).
+
+#### Inference (CPU)
+
+| Data Type | BF16 | INT8 | NF4 | FP4 |
+|---|---|---|---|---|
+| Speed-Up (vs BF16) | 1.0x | 0.6x | 2.3x | 0.03x |
+| Memory (GB) | 13.1 | 7.6 | 5.0 | 4.6 |
+
+#### Fine-Tuning (CPU)
+
+| Data Type | AMP BF16 | INT8 | NF4 | FP4 |
+|---|---|---|---|---|
+| Speed-Up (vs AMP BF16) | 1.0x | 0.38x | 0.07x | 0.07x |
+| Memory (GB) | 40 | 9 | 6.6 | 6.6 |
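
To make the trade-off in the two tables easier to read, the memory figures can be turned into reductions relative to the baseline column. The sketch below is only an illustration: the numbers are copied from the tables in this diff, while the dictionary layout and the helper name `memory_saving` are hypothetical and not part of the commit or of bitsandbytes.

```python
# Figures copied from the benchmark tables added in this commit
# (Llama-2-7b-chat-hf on an Intel 4th Gen Xeon platform).
# The data layout and helper below are illustrative assumptions.
inference = {
    "BF16": {"speedup": 1.0, "memory_gb": 13.1},
    "INT8": {"speedup": 0.6, "memory_gb": 7.6},
    "NF4": {"speedup": 2.3, "memory_gb": 5.0},
    "FP4": {"speedup": 0.03, "memory_gb": 4.6},
}

fine_tuning = {
    "AMP BF16": {"speedup": 1.0, "memory_gb": 40.0},
    "INT8": {"speedup": 0.38, "memory_gb": 9.0},
    "NF4": {"speedup": 0.07, "memory_gb": 6.6},
    "FP4": {"speedup": 0.07, "memory_gb": 6.6},
}


def memory_saving(table, baseline):
    """Return each data type's fractional memory reduction versus `baseline`."""
    base = table[baseline]["memory_gb"]
    return {name: 1.0 - row["memory_gb"] / base for name, row in table.items()}


def report(title, table, baseline):
    """Print speed-up and memory reduction for every data type in `table`."""
    savings = memory_saving(table, baseline)
    print(title)
    for name, row in table.items():
        print(f"  {name:>8}: {row['speedup']}x speed, "
              f"{savings[name]:.1%} less memory than {baseline}")


report("Inference (CPU)", inference, "BF16")
report("Fine-Tuning (CPU)", fine_tuning, "AMP BF16")
```

Read this way, NF4 inference uses roughly 62% less memory than BF16 while running 2.3x faster, whereas for fine-tuning every quantized type saves memory at a substantial cost in speed.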
