# Integrating LangChain, Knowledge Graphs, and Retrieval-Augmented Generation for Hotel and Attraction Recommendations
## Table of Contents
- Introduction
- Features
- Technologies Used
- Architecture
- Installation
- Usage
- Evaluation
- Contributing
- License
- Contact
- Acknowledgements
## Introduction

Welcome to A Travel Agent LLM, an advanced conversational travel assistant designed to help users find hotels, attractions, and transportation options through natural language queries. By integrating a Large Language Model (LLM), LangChain, a custom Knowledge Graph (KG), and Retrieval-Augmented Generation (RAG), this system provides intelligent and contextually relevant recommendations to enhance user travel planning experiences.
## Features

- Intent Classification: Distinguishes between hotel-related and non-hotel-related user queries.
- Hotel Recommendations: Utilizes rule-based filtering on a comprehensive hotel dataset to provide accurate hotel suggestions.
- Attraction and Transportation Suggestions: Employs semantic embedding-based retrieval from a custom Knowledge Graph to recommend attractions and transport options.
- Conversational Memory: Maintains context for follow-up queries, ensuring a seamless conversational experience.
- Retrieval-Augmented Generation: Enhances response generation by integrating retrieved data and contextual information.
- Scalable Knowledge Graph: Initially based on New York City, the Knowledge Graph can be extended to include additional destinations.
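As a rough illustration of the intent-classification step, the sketch below uses a keyword heuristic as a stand-in for the LLM classifier (the actual system prompts GPT-3.5; the function name and keyword list here are assumptions, not the project's code):

```python
def classify_intent(query: str) -> str:
    """Toy stand-in for the LLM intent classifier:
    label a query 'hotel' or 'non-hotel' by keyword matching."""
    hotel_keywords = {"hotel", "room", "stay", "accommodation", "check-in"}
    words = set(query.lower().split())
    return "hotel" if words & hotel_keywords else "non-hotel"

print(classify_intent("Find me a 4-star hotel near Times Square"))   # hotel
print(classify_intent("How do I get to the Statue of Liberty?"))     # non-hotel
```

In the real system this routing decision determines whether the query goes to the rule-based hotel filter or to the Knowledge Graph retriever.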
## Technologies Used

- Language Model: OpenAI GPT-3.5
- Framework: LangChain
- Knowledge Graph: Custom-built using NetworkX
- Vector Store: FAISS
- Data Sources: Kaggle Hotel Dataset
- Programming Language: Python
- Visualization: TikZ
- Documentation: LaTeX
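The Knowledge Graph is built with NetworkX; a minimal sketch of how attractions and transport options might be connected (the node names, attributes, and `relation` labels are illustrative, not the project's actual schema):

```python
import networkx as nx

# Build a tiny attraction/transport graph for New York City.
kg = nx.Graph()
kg.add_node("Central Park", type="attraction", borough="Manhattan")
kg.add_node("Metropolitan Museum of Art", type="attraction", borough="Manhattan")
kg.add_node("Subway Line 4/5/6", type="transport")
kg.add_edge("Subway Line 4/5/6", "Central Park", relation="serves")
kg.add_edge("Subway Line 4/5/6", "Metropolitan Museum of Art", relation="serves")

# Find transport options that reach a given attraction.
options = [n for n in kg.neighbors("Central Park")
           if kg.nodes[n]["type"] == "transport"]
print(options)  # ['Subway Line 4/5/6']
```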
## Architecture

The system architecture integrates multiple components to deliver a robust travel assistant:
- User Interface: Accepts natural language queries from users.
- Intent Classification: Utilizes an LLM to categorize queries as hotel-related or non-hotel-related.
- Data Retrieval:
  - Hotel Queries: Applies rule-based filters on the hotel CSV dataset to find relevant hotels.
  - Non-Hotel Queries: Performs semantic search on the Knowledge Graph using embeddings, with FAISS for similarity search.
- Response Generation: Combines retrieved data and context to generate coherent responses using the LLM.
- Conversational Memory: Maintains context for handling follow-up queries seamlessly.
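The semantic-retrieval branch can be sketched as follows. A bag-of-words vector with cosine similarity in NumPy stands in for the real OpenAI embeddings and FAISS index; the function names and toy facts are assumptions for illustration:

```python
import numpy as np

KG_FACTS = [
    "Central Park is a large public park in Manhattan",
    "The subway is the fastest way to get around New York City",
    "The Empire State Building offers an observation deck",
]

def embed(text: str, vocab: list[str]) -> np.ndarray:
    """Toy bag-of-words embedding over a shared vocabulary."""
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def retrieve(query: str, facts: list[str]) -> str:
    """Return the fact most similar to the query (cosine similarity),
    standing in for a FAISS nearest-neighbour lookup."""
    vocab = sorted({w for f in facts + [query] for w in f.lower().split()})
    q = embed(query, vocab)
    def sim(f: str) -> float:
        v = embed(f, vocab)
        return float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q) + 1e-9))
    return max(facts, key=sim)

print(retrieve("How do I get around the city?", KG_FACTS))
```

The retrieved fact is then passed to the LLM along with the conversation history, which is the RAG step that grounds the generated answer.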
## Installation

Follow these steps to set up the project locally:
- Hotel Dataset: Download the hotel dataset from Kaggle and place the CSV file in the `data/` directory.
- Knowledge Graph: Ensure the Knowledge Graph data is available in the `data/kg/` directory. You may need to preprocess or extend the KG based on your requirements.
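A typical local setup might look like the following (the repository URL is a placeholder, and the dependency list is an assumption based on the technologies above; check the project's requirements file for the authoritative list):

```shell
# Clone the repository and enter it (URL is a placeholder).
git clone <repository-url>
cd <repository-directory>

# Create an isolated environment and install the likely core dependencies.
python -m venv .venv
source .venv/bin/activate
pip install langchain openai networkx faiss-cpu pandas
```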
## Usage

Run the main application to start the travel assistant.
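Assuming the entry point is a script at the repository root (the filename `main.py` is a guess; adjust to the actual entry point), set your OpenAI key and start the assistant:

```shell
export OPENAI_API_KEY="sk-..."   # your OpenAI API key
python main.py
```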
## Evaluation

The system was evaluated on several performance metrics:
| Metric | With RAG | Without RAG |
|---|---|---|
| Hotels Provided | 3,994 | Numerous (exact number not provided) |
| 4-Star Hotels | 490 | Numerous |
| 5-Star Hotels | 89 | Numerous |
| Correct Contact Info | Yes | No (incorrect or made-up) |
| Info on Getting There | Yes | Yes |
| Response Cost | High | Medium |
| Response Time | Medium | Low |
| Statistic | Value |
|---|---|
| Run Count | 267 |
| Total Tokens | 80,001 / $1.02 |
| Median Tokens | 221 |
| Error Rate | 2% |
| % Streaming | 0% |
| Latency | P50: 0.68 s, P99: 10.13 s |
Analysis:
- The integration of RAG significantly improved the accuracy of responses, especially for non-hotel queries.
- The system maintained a low error rate and acceptable latency, ensuring a reliable user experience.
- Response costs are higher with RAG integration, which is a trade-off for improved accuracy and relevance.
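The per-run averages follow directly from the run statistics above (the totals come from the table; the arithmetic is a quick derived check, not a reported metric):

```python
run_count = 267
total_tokens = 80_001
total_cost_usd = 1.02

avg_tokens_per_run = total_tokens / run_count   # ~299.6 tokens
avg_cost_per_run = total_cost_usd / run_count   # ~$0.0038

print(f"{avg_tokens_per_run:.1f} tokens/run, ${avg_cost_per_run:.4f}/run")
```

That the mean (~300 tokens) exceeds the median (221) suggests a right-skewed distribution, i.e. a small number of long responses drive much of the token cost.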
## Contributing

Contributions are welcome:
- Fork the repository
- Create a feature branch and commit your changes
- Open a Pull Request
Please ensure your code follows the project's coding standards and includes relevant tests.
## License

This project is licensed under the Creative Commons CC BY 4.0 license.
## Contact

Shi Qiu
The George Washington University
Email: [email protected]
For any inquiries or feedback, please reach out via email or open an issue on GitHub.