[BUG] Unable to Chat - Option never appears #463

Open
meltedhead opened this issue Nov 4, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@meltedhead

Description

I can connect to a local Ollama model for embeddings etc. I can then upload documents and they are indexed successfully, but whenever I try to chat, the chat section never appears. Whether I select a specific document to chat with or search all, I still don't see the option to chat. Any ideas?
The screenshots below show everything set up:
Screenshot 2024-11-04 160156
Screenshot 2024-11-04 155954
Screenshot 2024-11-04 155909
Screenshot 2024-11-04 155848

Reproduction steps

I run the following:

# optional (setup env)
conda create -n kotaemon python=3.10
conda activate kotaemon

# clone this repo
git clone https://github.com/Cinnamon/kotaemon
cd kotaemon

pip install -e "libs/kotaemon[all]"
pip install -e "libs/ktem"
Install and unzip PDF_JS_DIST

I have my .env file as below:

# this is an example .env file, use it to create your own .env file and place it in the root of the project

# settings for OpenAI
#OPENAI_API_BASE=https://api.openai.com/v1
#OPENAI_API_KEY=<YOUR_OPENAI_KEY>
#OPENAI_CHAT_MODEL=gpt-3.5-turbo
#OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002

# settings for Azure OpenAI
#AZURE_OPENAI_ENDPOINT=
#AZURE_OPENAI_API_KEY=
#OPENAI_API_VERSION=2024-02-15-preview
#AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-35-turbo
#AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=text-embedding-ada-002

# settings for Cohere
COHERE_API_KEY=<COHERE_API_KEY>

# settings for local models
LOCAL_MODEL=llama3.1:8b
LOCAL_MODEL_EMBEDDINGS=nomic-embed-text
LOCAL_EMBEDDING_MODEL_DIM = 768
LOCAL_EMBEDDING_MODEL_MAX_TOKENS = 8192

# settings for GraphRAG
GRAPHRAG_API_KEY=openai_key
GRAPHRAG_LLM_MODEL=gpt-4o-mini
GRAPHRAG_EMBEDDING_MODEL=text-embedding-3-small

# set to true if you want to use customized GraphRAG config file
USE_CUSTOMIZED_GRAPHRAG_SETTING=false

# settings for Azure DI
AZURE_DI_ENDPOINT=
AZURE_DI_CREDENTIAL=

# settings for Adobe API
# get free credential at https://acrobatservices.adobe.com/dc-integration-creation-app-cdn/main.html?api=pdf-extract-api
# also install pip install "pdfservices-sdk@git+https://github.com/niallcm/pdfservices-python-sdk.git@bump-and-unfreeze-requirements"
PDF_SERVICES_CLIENT_ID=
PDF_SERVICES_CLIENT_SECRET=

# settings for PDF.js
PDFJS_VERSION_DIST="pdfjs-4.0.379-dist"
Then I start the app. I can test the LLM and embeddings and everything is working. I can upload files and they are indexed, but I can't seem to chat. The chat option just never appears.
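
A quick sanity check on the .env above may be worthwhile: `COHERE_API_KEY` still holds the template placeholder `<COHERE_API_KEY>`, and two keys are written with spaces around `=`. The sketch below flags both patterns. The helper name `find_env_issues` is hypothetical and not part of kotaemon, and whether either pattern has anything to do with the missing chat panel is an assumption that has not been verified; how spaces are treated depends on the .env parser in use.

```python
import re

# Values that look like an unfilled template slot, e.g. <COHERE_API_KEY>.
PLACEHOLDER = re.compile(r"^<[A-Za-z0-9_]+>$")


def find_env_issues(text):
    """Return (line_number, key, problem) tuples for suspicious .env entries."""
    issues = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key != key.rstrip() or value != value.lstrip():
            issues.append((number, key.strip(), "spaces around '='"))
        if PLACEHOLDER.match(value.strip().strip('"')):
            issues.append((number, key.strip(), "placeholder value"))
    return issues


# A few lines copied from the .env in this report.
sample = """\
COHERE_API_KEY=<COHERE_API_KEY>
LOCAL_MODEL=llama3.1:8b
LOCAL_EMBEDDING_MODEL_DIM = 768
PDFJS_VERSION_DIST="pdfjs-4.0.379-dist"
"""

for issue in find_env_issues(sample):
    print(issue)
```

On the sample lines this reports the Cohere placeholder and the spaced `LOCAL_EMBEDDING_MODEL_DIM` entry; passing the full contents of the real .env file to `find_env_issues` works the same way.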

Screenshots

![DESCRIPTION](LINK.png)

Logs

No response

Browsers

Chrome

OS

Linux

Additional information

I am running this in a Google Cloud Workstation.

@meltedhead meltedhead added the bug Something isn't working label Nov 4, 2024
@meltedhead
Author

Which versions of graphrag and future should be installed? Could this be the cause? When I try to install the latest versions, it causes lots of issues.

@meltedhead
Author

I have tried installing with the run_linux.sh script and I get the same issue. The only problem I can see is the warning below; when I try to install graphrag and future, I end up with lots of library conflicts and the app doesn't start. I keep retrying with various versions of both and can't seem to resolve it.

******************************************************
Launching Kotaemon in your browser, please wait...
******************************************************

[nltk_data] Downloading package punkt_tab to
[nltk_data]     /home/user/kotaemon/install_dir/env/lib/python3.10/sit
[nltk_data]     e-packages/llama_index/core/_static/nltk_cache...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
GraphRAG dependencies not installed. Try `pip install graphrag future` to install. GraphRAG retriever pipeline will not work properly.
Nano-GraphRAG dependencies not installed. Try `pip install nano-graphrag` to install. Nano-GraphRAG retriever pipeline will not work properly.
User "admin" created successfully
Setting up quick upload event
Running on local URL:  http://127.0.0.1:7860
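
To confirm which optional packages the warnings above refer to are actually visible to the interpreter, a small standard-library check can help. This is a sketch only: the import names `graphrag`, `future`, and `nano_graphrag` are assumed from the package names in the warnings, and `missing_modules` is a hypothetical helper. It needs to be run with the same Python environment the app uses (the log paths point at `install_dir/env`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages named in the startup warnings.
optional = ["graphrag", "future", "nano_graphrag"]
print("missing:", missing_modules(optional))
```

An empty list means all three can be located; it does not prove the installed versions are compatible with each other.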

@meltedhead
Author

I have managed to get futures 3.0.5 and graphrag 0.1.1 installed, but there is still no chat when I launch. When I run pip check I get:

 pip check
ipykernel 6.29.5 requires pyzmq, which is not installed.
jupyter-client 8.6.3 requires pyzmq, which is not installed.
gradio 4.39.0 has requirement aiofiles<24.0,>=22.0, but you have aiofiles 24.1.0.

I then uninstall aiofiles and try to reinstall it, but I get:

 pip install "aiofiles<24.0"     
Collecting aiofiles<24.0
  Downloading aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Downloading aiofiles-23.2.1-py3-none-any.whl (15 kB)
Installing collected packages: aiofiles
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
graphrag 0.1.1 requires aiofiles<25.0.0,>=24.1.0, but you have aiofiles 23.2.1 which is incompatible.
Successfully installed aiofiles-23.2.1

Is this the cause? How can I get around this?
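
The two pip messages above describe ranges that do not overlap: gradio 4.39.0 wants aiofiles >=22.0,<24.0 while graphrag 0.1.1 wants aiofiles >=24.1.0,<25.0.0, so no single aiofiles version can satisfy both, and one of the two packages would have to change version instead. A minimal sketch of that check follows. The helpers `parse` and `allowed` are hypothetical and only handle plain dotted numeric versions, not the full version-specifier rules pip applies.

```python
def parse(version, width=3):
    """Turn '24.1.0' into (24, 1, 0), padding short versions with zeros."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def allowed(version, lower, upper):
    """True if lower <= version < upper (inclusive lower, exclusive upper)."""
    return parse(lower) <= parse(version) < parse(upper)


# Ranges copied from the pip output above.
GRADIO = ("22.0", "24.0")        # gradio 4.39.0: aiofiles>=22.0,<24.0
GRAPHRAG = ("24.1.0", "25.0.0")  # graphrag 0.1.1: aiofiles>=24.1.0,<25.0.0

for candidate in ("23.2.1", "24.1.0"):
    print(candidate, allowed(candidate, *GRADIO), allowed(candidate, *GRAPHRAG))

# Two ranges overlap only if the larger lower bound is below the smaller upper bound.
overlap = max(parse(GRADIO[0]), parse(GRAPHRAG[0])) < min(parse(GRADIO[1]), parse(GRAPHRAG[1]))
print("overlap:", overlap)  # prints "overlap: False"
```

Each candidate version passes one range and fails the other, which matches the back-and-forth seen in the pip output.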

@meltedhead
Author

Despite the pip issues above, I can now see the chat input, but I still have the compatibility issues. The chat answers are totally irrelevant, though, as the LLM can't access my files.
Screenshot 2024-11-05 120101

@meltedhead
Author

Various errors are showing in the log. When I try to upload files for GraphRAG I get the errors below.

Traceback (most recent call last):
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/kotaemon/libs/kotaemon/kotaemon/indices/vectorindex.py", line 203, in query_vectorstore
    vs_docs = self.doc_store.get(vs_ids)
  File "/home/user/kotaemon/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 109, in get
    .to_list()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lancedb/query.py", line 303, in to_list
    return self.to_arrow().to_pylist()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lancedb/query.py", line 760, in to_arrow
    return ds.to_table(
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lance/dataset.py", line 435, in to_table
    ).to_table()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lance/dataset.py", line 2202, in to_table
    return self.to_reader().read_all()
  File "pyarrow/ipc.pxi", line 757, in pyarrow.lib.RecordBatchReader.read_all
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
OSError: Io error: Execution error: External error: Execution error: ExecNode(Take): thread panicked: task 10 panicked
Got 0 from vectorstore
Got 0 from docstore
Cohere API key not found. Skipping rerankings.
Got raw 0 retrieved documents
thumbnail docs 0 non-thumbnail docs 0 raw-thumbnail docs 0
retrieval step took 1.051011562347412
Got 0 retrieved documents
len (original) 0
Got 0 images
Trying LLM streaming
Got 0 cited docs
User-id: 1, can see public conversations: True
User-id: 1, can see public conversations: True
No row was found when one was required
/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning:

The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include:  or set allow_custom_value=True.

User-id: 1, can see public conversations: True
Session reasoning type simple
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x7ee7d83fcb80>, FSPath=PosixPath('/home/user/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x7ee7d83fc970>, get_extra_table=True, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7a61d0>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7a6650>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7a57b0>), mmr=True, rerankers=[CohereReranking(cohere_api_key='<COHERE_API_KEY>', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x7ee83468a380>, FSPath=<theflow.base.unset_ object at 0x7ee83468a380>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x7ee83468a380>, VS=<theflow.base.unset_ object at 0x7ee83468a380>, file_ids=[], user_id=<theflow.base.unset_ object at 0x7ee83468a380>)]
searching in doc_ids []
Got 0 retrieved documents
len (original) 0
Got 0 images
Trying LLM streaming
Got 0 cited docs
User-id: 1, can see public conversations: True
User-id: 1, can see public conversations: True
No row was found when one was required
User-id: 1, can see public conversations: True
Session reasoning type simple
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x7ee7d83fcb80>, FSPath=PosixPath('/home/user/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x7ee7d83fc970>, get_extra_table=True, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7d4040>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7d4a00>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7d4af0>), mmr=True, rerankers=[CohereReranking(cohere_api_key='<COHERE_API_KEY>', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x7ee83468a380>, FSPath=<theflow.base.unset_ object at 0x7ee83468a380>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x7ee83468a380>, VS=<theflow.base.unset_ object at 0x7ee83468a380>, file_ids=[], user_id=<theflow.base.unset_ object at 0x7ee83468a380>)]
searching in doc_ids ['0a1908f7-2b7b-4504-8558-e5f30e1a09f7', '0c28dda8-b4ed-42ad-8fac-91e4ddba8af7', 'a3d6d5be-3833-4100-9342-0827c97b75c7']
retrieval_kwargs: dict_keys(['do_extend', 'scope', 'filters', 'mode', 'mmr_threshold'])
Number of requested results 100 is greater than number of elements in index 74, updating n_results = 74
thread 'lance_background_thread' panicked at /home/runner/work/lance/lance/rust/lance-encoding/src/decoder.rs:686:29:
Expected a list column
Exception in thread Thread-5 (query_vectorstore):
Traceback (most recent call last):
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/kotaemon/libs/kotaemon/kotaemon/indices/vectorindex.py", line 203, in query_vectorstore
    vs_docs = self.doc_store.get(vs_ids)
  File "/home/user/kotaemon/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 109, in get
    .to_list()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lancedb/query.py", line 303, in to_list
    return self.to_arrow().to_pylist()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lancedb/query.py", line 760, in to_arrow
    return ds.to_table(
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lance/dataset.py", line 435, in to_table
    ).to_table()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lance/dataset.py", line 2202, in to_table
    return self.to_reader().read_all()
  File "pyarrow/ipc.pxi", line 757, in pyarrow.lib.RecordBatchReader.read_all
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
OSError: Io error: Execution error: External error: Execution error: ExecNode(Take): thread panicked: task 37 panicked
Got 0 from vectorstore
Got 0 from docstore
Cohere API key not found. Skipping rerankings.
Got raw 0 retrieved documents
thumbnail docs 0 non-thumbnail docs 0 raw-thumbnail docs 0
retrieval step took 0.4497072696685791
Got 0 retrieved documents
len (original) 0
Got 0 images
Trying LLM streaming
Got 0 cited docs
Session reasoning type simple
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x7ee7d83fcb80>, FSPath=PosixPath('/home/user/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x7ee7d83fc970>, get_extra_table=True, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7d6470>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7d6590>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x7ee7cb7d6680>), mmr=True, rerankers=[CohereReranking(cohere_api_key='<COHERE_API_KEY>', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x7ee83468a380>, FSPath=<theflow.base.unset_ object at 0x7ee83468a380>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x7ee83468a380>, VS=<theflow.base.unset_ object at 0x7ee83468a380>, file_ids=[], user_id=<theflow.base.unset_ object at 0x7ee83468a380>)]
searching in doc_ids ['0a1908f7-2b7b-4504-8558-e5f30e1a09f7', '0c28dda8-b4ed-42ad-8fac-91e4ddba8af7', 'a3d6d5be-3833-4100-9342-0827c97b75c7']
User-id: 1, can see public conversations: True
retrieval_kwargs: dict_keys(['do_extend', 'scope', 'filters', 'mode', 'mmr_threshold'])
Number of requested results 100 is greater than number of elements in index 74, updating n_results = 74
thread 'lance_background_thread' panicked at /home/runner/work/lance/lance/rust/lance-encoding/src/decoder.rs:686:29:
Expected a list column
Exception in thread Thread-7 (query_vectorstore):
Traceback (most recent call last):
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/kotaemon/libs/kotaemon/kotaemon/indices/vectorindex.py", line 203, in query_vectorstore
    vs_docs = self.doc_store.get(vs_ids)
  File "/home/user/kotaemon/libs/kotaemon/kotaemon/storages/docstores/lancedb.py", line 109, in get
    .to_list()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lancedb/query.py", line 303, in to_list
    return self.to_arrow().to_pylist()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lancedb/query.py", line 760, in to_arrow
    return ds.to_table(
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lance/dataset.py", line 435, in to_table
    ).to_table()
  File "/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/lance/dataset.py", line 2202, in to_table
    return self.to_reader().read_all()
  File "pyarrow/ipc.pxi", line 757, in pyarrow.lib.RecordBatchReader.read_all
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
OSError: Io error: Execution error: External error: Execution error: ExecNode(Take): thread panicked: task 64 panicked
Got 0 from vectorstore
Got 0 from docstore
Cohere API key not found. Skipping rerankings.
Got raw 0 retrieved documents
thumbnail docs 0 non-thumbnail docs 0 raw-thumbnail docs 0
retrieval step took 0.3767068386077881
Got 0 retrieved documents
len (original) 0
Got 0 images
Trying LLM streaming
Got 0 cited docs
use_quick_index_mode False
reader_mode default
Using reader <kotaemon.loaders.excel_loader.PandasExcelReader object at 0x7ee7cb7b73a0>
/home/user/kotaemon/libs/kotaemon/kotaemon/loaders/excel_loader.py:87: FutureWarning:

Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

Got 0 page thumbnails
Adding documents to doc store
[2024-11-05T12:05:12Z WARN  lance::dataset] No existing dataset at /home/user/kotaemon/ktem_app_data/user_data/docstore/index_2.lance, it will be created
indexing step took 0.3589944839477539
Using reader <kotaemon.loaders.pdf_loader.PDFThumbnailReader object at 0x7ee7cb702050>
/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/pypdf/_crypt_providers/_cryptography.py:32: CryptographyDeprecationWarning:

ARC4 has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.ARC4 and will be removed from this module in 48.0.0.

Page numbers: 31
Got 31 page thumbnails
Adding documents to doc store
indexing step took 0.8728067874908447
Using reader <kotaemon.loaders.pdf_loader.PDFThumbnailReader object at 0x7ee7cb702050>
Page numbers: 202
Got 202 page thumbnails
Adding documents to doc store
indexing step took 2.9457244873046875
Initializing project at /home/user/kotaemon/ktem_app_data/user_data/files/graphrag/744998a0-91f0-484a-a8d6-24cd005cacc2

/home/user/kotaemon/install_dir/env/lib/python3.10/site-packages/numpy/core/fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
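
The log above repeats one pattern several times: a `lance_background_thread` panic ("Expected a list column") during `doc_store.get`, followed by "Got 0 from vectorstore" and "Got 0 from docstore". That appears consistent with the chat receiving no document context, though the root cause is not established here. A rough sketch for condensing such a log is shown below; `summarize` is a hypothetical helper and its regular expressions are written only against the line shapes visible in this report.

```python
import re
from collections import Counter

# Matches lines like: thread 'name' panicked at /path/file.rs:686:29:
PANIC = re.compile(r"thread '([^']+)' panicked at (\S+?):(\d+):(\d+):")
# Matches lines like: Got 0 from vectorstore
RETRIEVED = re.compile(r"^Got (\d+) from (vectorstore|docstore)$")


def summarize(log_text):
    """Count panics by thread and source location, and collect retrieval counts."""
    panics = Counter()
    retrieved = []
    for line in log_text.splitlines():
        m = PANIC.search(line)
        if m:
            panics[f"{m.group(1)} @ {m.group(2)}:{m.group(3)}"] += 1
            continue
        m = RETRIEVED.match(line.strip())
        if m:
            retrieved.append((m.group(2), int(m.group(1))))
    return panics, retrieved


# Four lines copied from the log in this report.
sample = """\
thread 'lance_background_thread' panicked at /home/runner/work/lance/lance/rust/lance-encoding/src/decoder.rs:686:29:
Expected a list column
Got 0 from vectorstore
Got 0 from docstore
"""

panics, retrieved = summarize(sample)
print(dict(panics))
print(retrieved)
```

Running it over the full log would show how many times the same panic location recurs and whether any retrieval ever returned a non-zero count.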

@meltedhead
Author

I feel like this might be something simple. Any help would be much appreciated, as I am wasting a lot of time trying to fix this.
