A demonstration of using LangServe to create an API from an LCEL RAG chain!
Getting started is as easy as 2 steps:
pip install langchain-cli[all]
langchain app new /YOUR/PATH/HERE
With that, you should see a directory with the following structure:
Using the Notebook found here, we created and then saved a FAISS-backed VectorStore containing information from the LangServe repository.
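For reference, the index-building step looks roughly like the sketch below; the loader, chunk sizes, and endpoint URL are illustrative assumptions rather than the notebook's exact settings.

import os

from langchain_community.document_loaders import GitLoader
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Embed with the same model that will be used at query time
embeddings_model = HuggingFaceInferenceAPIEmbeddings(
    api_key=os.environ["HF_TOKEN"],
    api_url="<<YOUR URL HERE>>",
)

# Pull the LangServe repository, chunk its files, and persist a FAISS index
docs = GitLoader(clone_url="https://github.com/langchain-ai/langserve", repo_path="./langserve_repo").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)
FAISS.from_documents(chunks, embeddings_model).save_local("langserve_index")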
Within the server.py found here, we can create our chain.
We leverage our pre-created index and two Hugging Face Inference Endpoints (hosting Mistral-7B-Instruct-v0.1 and WhereIsAI/UAE-Large-V1 embeddings, respectively) through LangChain, and then create a simple RAG chain using LCEL.
import os

from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_community.llms import HuggingFaceEndpoint
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

# LLM served from a Hugging Face Inference Endpoint (Mistral-7B-Instruct-v0.1)
hf_llm = HuggingFaceEndpoint(
    endpoint_url="<<YOUR URL HERE>>",
    huggingfacehub_api_token=os.environ["HF_TOKEN"],
    task="text-generation",
)

# Embeddings served from a Hugging Face Inference Endpoint (WhereIsAI/UAE-Large-V1)
embeddings_model = HuggingFaceInferenceAPIEmbeddings(
    api_key=os.environ["HF_TOKEN"],
    api_url="<<YOUR URL HERE>>",
)

# Load the pre-built FAISS index and expose it as a retriever
faiss_index = FAISS.load_local("langserve_index", embeddings_model)
retriever = faiss_index.as_retriever()

prompt_template = """\
Use the provided context to answer the user's question. If you don't know the answer, say you don't know.
Context:
{context}
Question:
{question}"""

rag_prompt = ChatPromptTemplate.from_template(prompt_template)

# Fetch context with the retriever and pass the question through unchanged
entry_point_chain = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)

# Pipe the retrieved context and question into the prompt, the LLM, and a string parser
rag_chain = entry_point_chain | rag_prompt | hf_llm | StrOutputParser()
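With the chain assembled, it can be sanity-checked in a REPL before serving it; the question below is just an example:

# Quick local test of the chain (example question)
print(rag_chain.invoke("What is LangServe?"))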
Now we can map our chain to its own custom route using:
add_routes(app, rag_chain, path="/rag")
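Note that add_routes attaches the chain to the FastAPI app object the template generates; a sketch of the surrounding server.py scaffolding (title, description, and port are illustrative) looks like this:

from fastapi import FastAPI
from langserve import add_routes

app = FastAPI(
    title="LangServe RAG Demo",
    version="1.0",
    description="RAG over the LangServe repository, served with LangServe",
)

# Expose /rag/invoke, /rag/batch, /rag/stream, and /rag/playground
add_routes(app, rag_chain, path="/rag")

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)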
All that's left to do is run:
langchain serve
Head on over to localhost:8000/rag/playground to start experimenting with your chain!
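Once the server is running, the same route can also be called programmatically; a minimal client sketch (assuming the default port):

from langserve import RemoteRunnable

# Call the served chain like any other LCEL runnable
remote_chain = RemoteRunnable("http://localhost:8000/rag")
print(remote_chain.invoke("How do I deploy a LangServe app?"))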