
Creating a customer support chatbot using Claude 3, Llamaindex and W&B Weave

In this article, we'll go over how to create a customer support chatbot using the Claude 3 API, LlamaIndex, and W&B Weave.
Created on June 26|Last edited on July 31
In this article, we will cover how to create a simple customer support chatbot for queries related to W&B Weave, a toolkit for developing AI-powered applications built by Weights & Biases. We will use the Claude 3 family of models by Anthropic as our LLM, the official wandb/weave GitHub repository as the data source, and W&B Weave itself for LLM tracing!

🧠 Anthropic's Claude 3 model

Figure 1: Comparison of the different Claude 3 variants (Haiku, Sonnet, and Opus) in terms of benchmark score and cost. Source
The Claude family of models from Anthropic excels at reasoning, coding, multilingual tasks, long-context handling, honesty, and image processing. Claude 3 offers significant improvements over the legacy models Claude Instant 1.2, Claude 2.0, and Claude 2.1.
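For reference, the three variants correspond to the following model-name strings in the Anthropic API (snapshot names current at the time of writing; newer snapshots may supersede them):

```python
# Claude 3 model identifiers accepted by the Anthropic API
# (snapshot names current as of early 2024).
CLAUDE_3_MODELS = {
    "haiku": "claude-3-haiku-20240307",    # fastest and cheapest
    "sonnet": "claude-3-sonnet-20240229",  # balance of speed and capability
    "opus": "claude-3-opus-20240229",      # most capable, highest cost
}

print(CLAUDE_3_MODELS["haiku"])
```

We'll use Haiku below, since a documentation chatbot benefits more from low latency and cost than from maximum capability.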
For this article, we will use the wrapper provided by LlamaIndex in the Python package llama-index-llms-anthropic.
from llama_index.core import Settings
from llama_index.llms.anthropic import Anthropic

## Assumes the ANTHROPIC_API_KEY environment variable is set
model = Anthropic(model="claude-3-haiku-20240307")
Settings.llm = model

📚 Gathering data for our chatbot

To develop a customer support chatbot for this tutorial, I thought a fun use case would be a system that answers questions about W&B Weave. There are a couple of data sources to consider here, for instance the hosted W&B Weave documentation and API reference. Since websites can be tedious to scrape, a better source is the official wandb/weave GitHub repository.
Thus, we will clone the repository and recursively search for markdown files in the docs/ directory. This can be achieved with the SimpleDirectoryReader from LlamaIndex.
from llama_index.core import SimpleDirectoryReader

required_exts = [".md"]

reader = SimpleDirectoryReader(
    input_dir="/content/weave/docs",
    required_exts=required_exts,
    recursive=True,
)

docs = reader.load_data()
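Under the hood, this recursive search is essentially a glob for `*.md` files over the directory tree. A minimal standard-library sketch of the same idea (using a throwaway temporary directory rather than the actual cloned repo):

```python
from pathlib import Path
import tempfile

# Build a tiny stand-in for the cloned docs/ tree.
root = Path(tempfile.mkdtemp())
(root / "guides").mkdir()
(root / "intro.md").write_text("# Intro")
(root / "guides" / "tracing.md").write_text("# Tracing")
(root / "guides" / "logo.png").write_bytes(b"")  # non-markdown, should be skipped

# Recursively collect only .md files, mirroring
# required_exts=[".md"] with recursive=True.
markdown_files = sorted(p.relative_to(root) for p in root.rglob("*.md"))
print(markdown_files)
```

SimpleDirectoryReader additionally parses each matched file into a LlamaIndex Document, with file metadata attached.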

🤗 Generating embeddings for our documents

We can use a local model to generate embeddings for the vector index, using an off-the-shelf model from the Hugging Face 🤗 Hub.
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
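bge-small-en-v1.5 maps each text chunk to a 384-dimensional vector, and retrieval later ranks chunks by cosine similarity to the query embedding. A toy illustration of that comparison with hand-made low-dimensional vectors (not real embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made 3-d vectors standing in for real 384-d embeddings.
query = np.array([1.0, 0.0, 1.0])
doc_similar = np.array([0.9, 0.1, 0.8])
doc_unrelated = np.array([0.0, 1.0, 0.0])

# The chunk whose embedding points in a similar direction scores higher.
print(cosine_similarity(query, doc_similar), cosine_similarity(query, doc_unrelated))
```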

🗂️ Creating the vector index

We now have all the essential ingredients to build a simple RAG application with LlamaIndex. All that remains is to create a vector index; for this example, we will use the VectorStoreIndex class.
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(docs)
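Behind the scenes, from_documents splits each document into smaller nodes, embeds each node, and stores the resulting vectors. LlamaIndex's default splitter is token-based and sentence-aware; as a rough sketch of the idea, here is a naive character-level chunker with overlap between consecutive chunks:

```python
def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5) -> list[str]:
    """Naive fixed-size chunking with overlap, so that context spanning a
    chunk boundary is still visible in the next chunk."""
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("W&B Weave is a toolkit for developing AI applications.")
print(chunks)
```

Overlap is a common trade-off: larger overlap reduces the chance of splitting a relevant sentence across chunks, at the cost of storing and embedding more text.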

💾 Saving the index as an Artifact

Since computing embeddings and building a vector index is an expensive and time-consuming process, we can save the index as a Weights & Biases Artifact. The LlamaIndex Weights & Biases callback employs Weights & Biases Prompts, a suite of LLMOps tools built for the development of LLM-powered applications. We will use this handler to persist our index as an artifact.
from llama_index.callbacks.wandb import WandbCallbackHandler

## Instantiate the callback
wandb_callback = WandbCallbackHandler(
    run_args={"project": "chatbot-claude3-llamaindex-weave"}
)

## Save the Index as an artifact
wandb_callback.persist_index(index, index_name="claude3-index")

⬇️ Load the index Artifact into memory

Now that we have uploaded our index as an artifact, we can use the Weights & Biases callback to download it and load the index from storage.
from llama_index.core import load_index_from_storage

## Download Artifact into a storage context
storage_context = wandb_callback.load_storage_context(
    artifact_url="sauravmaheshkar/chatbot-claude3-llamaindex-weave/claude3-index:v0"
)

## Load the index and initialize a query engine
index = load_index_from_storage(storage_context)

🎙️ Querying our chatbot

Now all that we have to do is create a query engine using the index, and ask questions. For instance,
query_engine = index.as_query_engine()
response = query_engine.query("What Python version does Weave require?")
print(response)

👋 Conclusion

In this article, you read through a brief tutorial on creating a customer support chatbot using the Claude 3 API, LlamaIndex, and W&B Weave.
To see the full suite of Weights & Biases features, please check out this short 5-minute guide. Also check out other reports on Fully Connected covering LlamaIndex tutorials.