Traces
All Ops
inputs
output
Trace
Feedback
Status
instruction
max_results
model
paper
query
self
N/A
N/A
N/A
N/A
N/A
Determine how I would best incorporate these benchmarks for my customer support RAG system. What evaluations would work best specifically for me?
N/A
N/A
N/A
N/A
N/A
8
N/A
N/A
(ti:agentic OR ti:constituitional OR ab:agentic OR ab:"language model" OR ab:"large language model") AND (ti:advanc* OR ab:advanc* OR ab:recent OR ab:latest) AND (cat:cs.CL OR cat:cs.AI OR cat:cs.LG)
N/A
["weave:///a-sh0ts/arxiv-chain-of-density-summarization/object/ArxivPaper:NfN0BAgsWaqE9oUD3xrgu06cY8837uftJK85G0EWH38","weave:///a-sh0ts/arxiv-chain-of-density-summarization/object/ArxivPaper:XzfVvaLuhSDw7a0YOq05q4nVIl6hUtglFB8ksn9KE2s","weave:///a-sh0ts/arxiv-chain-of-density-summarization/object/ArxivPaper:sbYpVBzUXMSRz0UUMUrdesllfgYMaaUnqzq50YYnWwA","weave:///a-sh0ts/arxiv-chain-of-density-summarization/object/ArxivPaper:oSTJVC4KYvHZugtq9lNCXaXlBYvTqR74fGOlGQTCcRY","weave:///a-sh0ts/arxiv-cha...
Answer the following question: What are the latest advancements in Agentic LLMs?
N/A
claude-3-sonnet-20240229
N/A
N/A
N/A
["(ti:agentic OR ti:constituitional OR ab:agentic OR ab:\"language model\" OR ab:\"large language model\") AND (ti:advanc* OR ab:advanc* OR ab:recent OR ab:latest) AND (cat:cs.CL OR cat:cs.AI OR cat:cs.LG)",8]