vllm vs Transformers inference speed table
Kei Kamata
Created on September 27 | Last edited on September 27
L4_24GB_vllm
runs.summary["L4_24GB_vllm_table"]
| model size | precision | max_model_len | tensor parallel | run | (inference) | (init) | avg score |
|---|---|---|---|---|---|---|---|
| 8B | | 2048 | 8 | 6857.371 | 6822.747 | 34.624 | 0.4853 |
| 8B | | 2048 | 4 | 7242.26 | 7209.679 | 32.581 | 0.4859 |
| 8B | | 2048 | 2 | 8770.189 | 8721.984 | 48.205 | 0.4874 |
| 8B | | 2048 | 4 | 7042.752 | 7010.338 | 32.414 | 0.4879 |
| 8B | | 2048 | 1 | 19450.007 | 19417.978 | 32.029 | 0.4865 |
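The timings above can be cross-checked with a short script. This is only a sketch under stated assumptions: it assumes the precision column is blank (so the third value in each row is the tensor parallel size), that "run" is the total of "(inference)" and "(init)", and that the tensor-parallel-1 row is a fair baseline. The names `rows`, `totals_consistent`, and `speedups` are illustrative and do not come from the report.

```python
# Rows copied from the L4_24GB_vllm table above.
# Assumed column order: (tensor_parallel, run, inference, init).
rows = [
    (8, 6857.371, 6822.747, 34.624),
    (4, 7242.26, 7209.679, 32.581),
    (2, 8770.189, 8721.984, 48.205),
    (4, 7042.752, 7010.338, 32.414),
    (1, 19450.007, 19417.978, 32.029),
]


def totals_consistent(rows, tol=1e-6):
    """Return True if run == inference + init (within tol) for every row."""
    return all(abs(run - (inference + init)) <= tol
               for _, run, inference, init in rows)


def speedups(rows):
    """Return (tensor_parallel, speedup) pairs relative to the tensor_parallel=1 run."""
    baseline = next(run for tp, run, _, _ in rows if tp == 1)
    return [(tp, baseline / run) for tp, run, _, _ in rows]


if __name__ == "__main__":
    print("run == inference + init for all rows:", totals_consistent(rows))
    for tp, ratio in speedups(rows):
        print(f"tensor parallel {tp}: {ratio:.2f}x vs. tensor parallel 1")
```

Under these assumptions, the ratios show how total run time changes with tensor parallel size, while the avg score column stays nearly constant across all five rows.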
Run: warm-brook-386
L4_24GB_transformers
Run: warm-brook-386