Is there a plugin that can read documents in text-generation-webui? #1933
8 comments · 7 replies
-
https://github.com/sebaxzero/LangChain_PDFChat_Oobabooga. I hope @oobabooga looks into integrating this as an extension.
-
I've been looking into ways to use RAG locally with various models, and I'm a bit confused about Superbooga/v2. So you can't upload and vectorise PDFs with it? What about epubs? TXT docs? Also, does it always add the chats to the vectorDB, or only what we tell it to add? The vectorDB would pretty soon get filled with garbage if it automatically puts the whole/every chat into the vector store. Is it possible to specify a directory for the vector store, so we can have different vector DBs for different purposes, with different embeddings in each?
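On the last question: if you drive the embedding database yourself rather than going through the extension, separate vector stores per purpose are just separate on-disk directories. A minimal sketch with ChromaDB (the directory, collection name, and sample text are placeholders; whether superbooga itself can be pointed at such a directory is a separate question):

```python
# Minimal sketch: one persistent ChromaDB directory per purpose, so each
# project keeps its own embeddings. Paths and names below are placeholders.
import chromadb

def get_collection(db_dir: str, name: str):
    # Each db_dir is an independent vector store on disk.
    client = chromadb.PersistentClient(path=db_dir)
    return client.get_or_create_collection(name)

manuals = get_collection("vectordb/manuals", "manuals")
manuals.add(
    ids=["doc-1"],
    documents=["Example chunk of text extracted from a manual."],
)

hits = manuals.query(query_texts=["how do I install it?"], n_results=1)
print(hits["documents"])
```

Each directory keeps its own embeddings, so a "chat" store and a "documents" store never get mixed together.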
-
I'm also looking for PDF support like in GPT4All. Superbooga only supports text; I can load files, but for PDFs I get an error.
-
I had issues with this as well; pypdf, docx2txt, and maybe RTF support would be awesome.
…On Mon, Dec 18, 2023 at 9:16 AM kalle ***@***.***> wrote:
I'm also looking for PDF support like in GPT4All. superbooga and superboogav2 only support text; I can load files, but for PDFs I get an error:
Output generated in 16.64 seconds (34.50 tokens/s, 574 tokens, context 66, seed 263742790)
Traceback (most recent call last):
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\blocks.py", line 1199, in call_function
    prediction = await utils.async_iteration(iterator)
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 519, in async_iteration
    return await iterator.__anext__()
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
  File "e:\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 649, in gen_wrapper
    yield from f(*args, **kwargs)
  File "e:\text-generation-webui\extensions\superboogav2\script.py", line 49, in _feed_file_into_collector
    text = file.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte
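The UnicodeDecodeError above comes from the upload being decoded as UTF-8 text (`text = file.decode('utf-8')` in superboogav2's script.py), which cannot work for a binary PDF. Until PDF/DOCX support lands in the extension, one workaround is to extract the text yourself and feed the resulting .txt file in. A minimal sketch using pypdf and docx2txt, run outside the webui (file names are placeholders):

```python
# Minimal sketch: convert a PDF or DOCX to plain UTF-8 text so the result
# can be loaded as a .txt file. File paths below are placeholders.
from pathlib import Path

import docx2txt              # pip install docx2txt
from pypdf import PdfReader  # pip install pypdf

def to_text(path: str) -> str:
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".pdf":
        reader = PdfReader(p)
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        return docx2txt.process(str(p))
    # Fallback: read as text, dropping bytes that are not valid UTF-8.
    return p.read_text(encoding="utf-8", errors="ignore")

if __name__ == "__main__":
    Path("manual.txt").write_text(to_text("manual.pdf"), encoding="utf-8")
```

Scanned PDFs with no text layer will still come out empty; those need OCR, which is a separate problem.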
-
omg, best feature ever, and after 7 months, nothing ^^
-
Have a look at https://github.com/langgenius/dify and https://github.com/janhq/jan (RAG coming next week). GPT4All does it too, but the vectorizing process was quite slow when I tried it.
-
Oh, I can write it here too. Programmer guys, maybe take a look at "DocFetcher": it's not an LLM, but an open-source document indexing program. It's old and written in Java.
-
This is for automatic1111. This extension for ooba also has OCR; maybe you can do something with this. I am not a coder, sorry.
-
Is there a plugin that can read documents in text-generation-webui?