model-latency-benchmarking Workspace – Weights & Biases

BaselineHF:v35

Name: BaselineHF (37 versions)
Last updated: 10 months ago
Storage size: 578 B (76.0 kB from all versions)

Path                     Value
model_name               Qwen/Qwen2.5-0.5B-Instruct-GGUF
device                   cpu
llm_model                <llama_cpp.llama.Llama object at 0x7a4344df3af0>
tokenizer                null
use_torch_compile        true
torch_dtype              torch.bfloat16
set_threads_and_interop  false
thread_count             6
max_new_tokens           1
use_llamacpp             true
predict                  BaselineHF.predict:v13
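
For readers who want to reproduce or compare this configuration, the scalar values listed above can be captured in a small Python structure. This is a minimal sketch, not the BaselineHF implementation: the class name BaselineHFConfig and the helper as_log_dict are hypothetical, torch_dtype is kept as a plain string so the snippet does not need torch installed, and the llm_model and predict entries are left out because they refer to a live object and a versioned op rather than plain values.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class BaselineHFConfig:
    """Hypothetical container mirroring the scalar fields shown on the page."""

    model_name: str = "Qwen/Qwen2.5-0.5B-Instruct-GGUF"
    device: str = "cpu"
    tokenizer: Optional[str] = None       # shown as null on the page
    use_torch_compile: bool = True
    torch_dtype: str = "torch.bfloat16"   # string stand-in, avoids importing torch
    set_threads_and_interop: bool = False
    thread_count: int = 6
    max_new_tokens: int = 1
    use_llamacpp: bool = True


def as_log_dict(cfg: BaselineHFConfig) -> dict:
    """Return the configuration as a plain dict, e.g. for logging next to a run."""
    return asdict(cfg)


if __name__ == "__main__":
    cfg = BaselineHFConfig()
    # Print one "key: value" line per field, in declaration order.
    for key, value in as_log_dict(cfg).items():
        print(f"{key}: {value}")
```

Freezing the dataclass keeps a recorded configuration from being changed by accident after a benchmark run has started; a changed setting would be expressed as a new instance instead.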