
Training a chatbot on personal data with LlamaIndex and W&B

In this article, we'll go over how to create a chatbot on personal data using LlamaIndex and local models, with a Weights & Biases integration.

In this article, we'll walk through creating a simple chatbot on private data using local models (both the LLM and the embedding model) with LlamaIndex and Weights & Biases.



Example data

In this example, let's create a system we can use to ask questions about the original Jurassic Park screenplay 🦖 (available online at jurassicpost). LlamaIndex has modules that make it easy to process and use the PDF as a dataset:
from llama_index.core import Document
from llama_index.readers.file import PyMuPDFReader

# Load the screenplay PDF, keeping per-page metadata
documents = PyMuPDFReader().load(file_path='/content/JurassicPark-Final.pdf', metadata=True)

# Combine all pages into a single Document so the screenplay is treated as one text
doc_text = "\n\n".join([d.get_content() for d in documents])
docs = [Document(text=doc_text)]

Configuring LlamaIndex to use local models

Since we want to use local models both for creating the embeddings and for conversing with the model, we need to update LlamaIndex's global settings and change the embed_model and llm. We can use the BAAI/bge-base-en-v1.5 embedding model as follows:
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
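As a quick optional sanity check (not part of the original walkthrough), we can embed a short string and confirm the local model produces vectors of the expected size:
# Optional check: embed a short string with the local embedding model
embedding = Settings.embed_model.get_text_embedding("Welcome to Jurassic Park")
print(len(embedding))  # bge-base models produce 768-dimensional vectors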
Similarly, we need to modify the llm setting as well (otherwise it defaults to OpenAI). Let's use the Writer/camel-5b-hf model:
import torch
from llama_index.core import Settings
from llama_index.core import PromptTemplate
from llama_index.llms.huggingface import HuggingFaceLLM

# camel-5b is an instruction-tuned model, so wrap queries in its instruction format
query_wrapper_prompt = PromptTemplate(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{query_str}\n\n### Response:"
)

Settings.llm = HuggingFaceLLM(
    context_window=2048,
    max_new_tokens=256,
    generate_kwargs={"do_sample": False},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="Writer/camel-5b-hf",
    model_name="Writer/camel-5b-hf",
    device_map="auto",
    tokenizer_kwargs={"max_length": 2048},
    model_kwargs={"torch_dtype": torch.float16},
)
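
Since we want to log everything to Weights & Biases, the snippets below assume a WandbCallbackHandler has already been created and attached to LlamaIndex's callback manager. A minimal sketch of that setup (the project name here is an assumption, inferred from the artifact URL further down) might look like:
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from llama_index.callbacks.wandb import WandbCallbackHandler

# Assumed W&B run configuration; swap in your own project (and entity)
run_args = dict(project="llamaindex-local-models")

wandb_callback = WandbCallbackHandler(run_args=run_args)
Settings.callback_manager = CallbackManager([wandb_callback])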

Creating and using an index

Creating and using a vector index is comparatively straightforward:
from llama_index.core import VectorStoreIndex

# Build a vector index over the combined screenplay document
index = VectorStoreIndex.from_documents(docs)

# We can log the index as a W&B artifact!
wandb_callback.persist_index(index, index_name="camel-5b-hf-index")
The resulting vector index is available as a public artifact here. To download it directly, use the following snippet:
from llama_index.core import load_index_from_storage
from llama_index.callbacks.wandb import WandbCallbackHandler

wandb_callback = WandbCallbackHandler(run_args=run_args)

# Fetch the index artifact from W&B and restore a storage context from it
storage_context = wandb_callback.load_storage_context(
    artifact_url="sauravmaheshkar/llamaindex-local-models/camel-5b-hf-index:v0"
)

# Load the index from the downloaded storage context and initialize a query engine
index = load_index_from_storage(
    storage_context,
)
To query the model with any user-provided question, we can simply turn the index into a query engine:
query_engine = index.as_query_engine()
response = query_engine.query("Are Velociraptors pack hunters?")
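To see the generated answer and which parts of the screenplay it was grounded in, we can print the response object (a small illustrative addition on top of the snippet above):
# Print the generated answer
print(response.response)

# Inspect the retrieved source chunks and their similarity scores
for source in response.source_nodes:
    print(source.score, source.node.get_content()[:100])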


Conclusion

In this article, we walked through a brief overview of querying a model on private data with locally generated embeddings, and saw how to use Weights & Biases to log and store the various artifacts.
To see the full suite of W&B features, please check out this short 5-minute guide. If you want more reports covering the math and "from-scratch" code implementations, let us know in the comments down below or on our forum ✨!
Check out these other reports on Fully Connected covering other LLM-related topics.