
Creating a customer support chatbot using Claude 3, Llamaindex and W&B Weave

In this article, we'll go over how to create a customer support chatbot using the Claude 3 API, LlamaIndex, and W&B Weave.
Created on June 26|Last edited on July 31
In this article, we will cover how to create a simple customer support chatbot for queries related to W&B Weave, a toolkit for developing AI-powered applications built by Weights & Biases. We will use the Claude 3 family of models by Anthropic as our LLM, the official wandb/weave GitHub repository as the data source, and W&B Weave itself for LLM tracing!

🧠 Anthropic's Claude 3 model

Figure 1: Comparison of the different Claude 3 variants (Haiku, Sonnet, and Opus) in terms of benchmark score and cost. Source
The Claude family of models from Anthropic excels at reasoning, coding, multilingual tasks, long-context handling, honesty, and image processing. Claude 3 offers significant improvements over the legacy models Claude Instant 1.2, Claude 2.0, and Claude 2.1.
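For reference, the three variants correspond to the following model-name strings in the Anthropic API (snapshot names current at the time of writing; newer snapshots may supersede them):

```python
# Claude 3 model identifiers accepted by the Anthropic API
# (snapshot names current as of early 2024).
CLAUDE_3_MODELS = {
    "haiku": "claude-3-haiku-20240307",    # fastest and cheapest
    "sonnet": "claude-3-sonnet-20240229",  # balance of speed and capability
    "opus": "claude-3-opus-20240229",      # most capable, highest cost
}

print(CLAUDE_3_MODELS["haiku"])
```

We'll use Haiku below, since a documentation chatbot benefits more from low latency and cost than from maximum capability.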
For this article, we will use the wrapper provided by LlamaIndex in the Python package llama-index-llms-anthropic.
from llama_index.core import Settings
from llama_index.llms.anthropic import Anthropic

## Assumes the ANTHROPIC_API_KEY environment variable is set
model = Anthropic(model="claude-3-haiku-20240307")
Settings.llm = model

📚 Gathering data for our chatbot

To develop a customer support chatbot for this tutorial, I thought a fun use case would be a system that answers questions about W&B Weave. There are a couple of data sources to consider here, for instance the hosted W&B Weave documentation and API reference. Since websites can be tedious to scrape, a better source is the official wandb/weave GitHub repository.
Thus, we will clone the repository and recursively search for markdown files in the docs/ directory. This can be achieved with the SimpleDirectoryReader from LlamaIndex.
from llama_index.core import SimpleDirectoryReader

required_exts = [".md"]

reader = SimpleDirectoryReader(
    input_dir="/content/weave/docs",
    required_exts=required_exts,
    recursive=True,
)

docs = reader.load_data()
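Under the hood, this recursive search is essentially a glob for `*.md` files over the directory tree. A minimal standard-library sketch of the same idea (using a throwaway temporary directory rather than the actual cloned repo):

```python
from pathlib import Path
import tempfile

# Build a tiny stand-in for the cloned docs/ tree.
root = Path(tempfile.mkdtemp())
(root / "guides").mkdir()
(root / "intro.md").write_text("# Intro")
(root / "guides" / "tracing.md").write_text("# Tracing")
(root / "guides" / "logo.png").write_bytes(b"")  # non-markdown, should be skipped

# Recursively collect only .md files, mirroring
# required_exts=[".md"] with recursive=True.
markdown_files = sorted(p.relative_to(root) for p in root.rglob("*.md"))
print(markdown_files)
```

SimpleDirectoryReader additionally parses each matched file into a LlamaIndex Document, with file metadata attached.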

🤗 Generating embeddings for our documents

We can use a local model to generate embeddings for the vector index, using an off-the-shelf model from the Hugging Face 🤗 Hub.
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
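bge-small-en-v1.5 maps each text chunk to a 384-dimensional vector, and retrieval later ranks chunks by cosine similarity to the query embedding. A toy illustration of that comparison with hand-made low-dimensional vectors (not real embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made 3-d vectors standing in for real 384-d embeddings.
query = np.array([1.0, 0.0, 1.0])
doc_similar = np.array([0.9, 0.1, 0.8])
doc_unrelated = np.array([0.0, 1.0, 0.0])

# The chunk whose embedding points in a similar direction scores higher.
print(cosine_similarity(query, doc_similar), cosine_similarity(query, doc_unrelated))
```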

🗂️ Creating the vector index

We now have all the essential ingredients to build a simple RAG application with LlamaIndex. All that remains is to create a vector index; for this example, we will use the VectorStoreIndex class.
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(docs)
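Behind the scenes, from_documents splits each document into smaller nodes, embeds each node, and stores the resulting vectors. LlamaIndex's default splitter is token-based and sentence-aware; as a rough sketch of the idea, here is a naive character-level chunker with overlap between consecutive chunks:

```python
def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5) -> list[str]:
    """Naive fixed-size chunking with overlap, so that context spanning a
    chunk boundary is still visible in the next chunk."""
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("W&B Weave is a toolkit for developing AI applications.")
print(chunks)
```

Overlap is a common trade-off: larger overlap reduces the chance of splitting a relevant sentence across chunks, at the cost of storing and embedding more text.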

💾 Saving the index as an Artifact

Since computing embeddings and building a vector index is an expensive and time-consuming process, we can save the index as a Weights & Biases Artifact. The LlamaIndex Weights & Biases callback employs Weights & Biases Prompts, a suite of LLMOps tools built for the development of LLM-powered applications. We will use this handler to persist our index as an artifact.
from llama_index.callbacks.wandb import WandbCallbackHandler

## Instantiate the callback
wandb_callback = WandbCallbackHandler(
    run_args={"project": "chatbot-claude3-llamaindex-weave"}
)

## Save the Index as an artifact
wandb_callback.persist_index(index, index_name="claude3-index")

⬇️ Load the index Artifact into memory

Now that we have uploaded our index as an artifact, we can use the Weights & Biases callback to download it and load the index from storage.
from llama_index.core import load_index_from_storage

## Download Artifact into a storage context
storage_context = wandb_callback.load_storage_context(
    artifact_url="sauravmaheshkar/chatbot-claude3-llamaindex-weave/claude3-index:v0"
)

## Load the index and initialize a query engine
index = load_index_from_storage(storage_context)

🎙️ Querying our chatbot

Now all that we have to do is create a query engine using the index, and ask questions. For instance,
query_engine = index.as_query_engine()
response = query_engine.query("What Python version does Weave require?")
print(response)

👋 Conclusion

In this article, you read through a brief tutorial on creating a customer support chatbot using the Claude 3 API, LlamaIndex, and W&B Weave.
To see the full suite of Weights & Biases features, please check out this short 5-minute guide. Also check out other reports on Fully Connected covering LlamaIndex tutorials.