Object: LiteLLMModel
Object
model_name
system_prompt
temp
max_tokens
top_p
max_retries
predict
Category
User
deepseek-r1
These are trick questions designed to expose weaknesses in reasoning, not just knowledge. Focus on the literal details, not the expected scenario. These are often variations of riddles with subtle, easy-for-humans flaws.
Rephrase: Carefully rewrite the question, noting any seemingly unimportant details that could change the meaning. Consider regional language differences.
Facts: List the explicit factual statements from the question.
Analyze: Evaluate physics (under real world conditions not ...
0.4
20,480
0.95
3
Model
deepseek-r1-70b
These are trick questions designed to expose weaknesses in reasoning, not just knowledge. Focus on the literal details, not the expected scenario. These are often variations of riddles with subtle, easy-for-humans flaws.
Rephrase: Carefully rewrite the question, noting any seemingly unimportant details that could change the meaning. Consider regional language differences.
Facts: List the explicit factual statements from the question.
Analyze: Evaluate physics (under real world conditions not ...
0.4
20,480
0.7
3
Model
deepseek-r1-70b
These are trick questions designed to expose weaknesses in reasoning, not just knowledge. Focus on the literal details, not the expected scenario. These are often variations of riddles with subtle, easy-for-humans flaws.
Rephrase: Carefully rewrite the question, noting any seemingly unimportant details that could change the meaning. Consider regional language differences.
Facts: List the explicit factual statements from the question.
Analyze: Evaluate physics (under real world conditions not ...
0.4
2,048
0.95
3
Model
deepseek-r1-70b
These are trick questions designed to expose weaknesses in reasoning, not just knowledge. Focus on the literal details, not the expected scenario. These are often variations of riddles with subtle, easy-for-humans flaws.
Rephrase: Carefully rewrite the question, noting any seemingly unimportant details that could change the meaning. Consider regional language differences.
Facts: List the explicit factual statements from the question.
Analyze: Evaluate physics (under real world conditions not ...
0.4
2,048
0.95
3
Model
deepseek-r1-70b
These are trick questions designed to expose weaknesses in reasoning, not just knowledge. Focus on the literal details, not the expected scenario. These are often variations of riddles with subtle, easy-for-humans flaws.
Rephrase: Carefully rewrite the question, noting any seemingly unimportant details that could change the meaning. Consider regional language differences.
Facts: List the explicit factual statements from the question.
Analyze: Evaluate physics (under real world conditions not ...
0.4
2,048
0.95
3
Model
deepseek-r1-70b
These are trick questions designed to expose weaknesses in reasoning, not just knowledge. Focus on the literal details, not the expected scenario. These are often variations of riddles with subtle, easy-for-humans flaws.
Rephrase: Carefully rewrite the question, noting any seemingly unimportant details that could change the meaning. Consider regional language differences.
Facts: List the explicit factual statements from the question.
Analyze: Evaluate physics (under real world conditions not ...
0
2,048
0.95
3
Model
deepseek-r1
You are an expert at reasoning and you always pick the most realistic answer. Think step by step and output your reasoning followed by your final answer using the following format: Final Answer: X where X is one of the letters A, B, C, D, E, or F.
0.7
2,048
0.95
3
Model
deepseek-r1-70b
You are an expert at reasoning and you always pick the most realistic answer. Think step by step and output your reasoning followed by your final answer using the following format: Final Answer: X where X is one of the letters A, B, C, D, E, or F.
0.7
2,048
0.95
3
Model
deepseek-r1-70b
You are an expert at reasoning and you always pick the most realistic answer. Think step by step and output your reasoning followed by your final answer using the following format: Final Answer: X where X is one of the letters A, B, C, D, E, or F.
0
2,048
0.95
3
Model
Total Rows: 9