Better citation management #1120
Replies: 3 comments 5 replies
-
This is super interesting, but it would result in a lot of verification steps. I wonder if this could be solved with better prompt engineering, for example by pushing the model to only include sources that actually exist?
-
@assafelovic, as you may have seen on the tickets, I am currently digging into document-only work, and there we currently get very bad citations and an empty list of URLs. Both "local" and "hybrid" need to treat document chunks more like URLs for the existing mechanism to work.
-
Sup @danieldekay ❤ it! I also agree there are a lot of interesting things we can do with the metadata of the LangChain Documents. @assafelovic, we've had a lot of cases where users are also confused because the documents flow begins with a web search. I'd be happy to think through the upgrades to the documents and vector store modules with @danieldekay. Something I've also been thinking about is treating vector stores as another swappable component (like we do with LLMs and Retrievers). Happy to hear your thoughts 🙏
-
Currently we ask the LLM to cite using a markdown hyperlink, formatted in APA or whichever citation format is configured.
This works well in some cases but fails in others. Usually you don't notice, because you don't click every link. Since we ask for a hyperlink, the failed cases are always hallucinated URLs, so the only way to catch them is to check each URL for a 404 or another error.
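To illustrate what that check would involve, here is a minimal sketch that pulls the markdown links out of a report and flags the ones that do not resolve. The function names (`extract_links`, `url_is_reachable`, `find_broken_links`) are hypothetical and not part of the project; the regex is deliberately simple and the reachability check is only a rough heuristic.

```python
import re
import urllib.request
from typing import Callable, List, Tuple

# Matches inline markdown links like [text](https://...).
# Assumption: URLs contain no whitespace or ")" characters.
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def extract_links(markdown: str) -> List[Tuple[str, str]]:
    """Return (link text, url) pairs in order of appearance."""
    return LINK_RE.findall(markdown)


def url_is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and treat any error or 4xx/5xx status as unreachable.

    Note: some servers reject HEAD requests, so this can report false negatives.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return False


def find_broken_links(
    markdown: str, check: Callable[[str], bool] = url_is_reachable
) -> List[Tuple[str, str]]:
    """Return the links for which the checker reports failure."""
    return [(text, url) for text, url in extract_links(markdown) if not check(url)]
```

The checker is passed in as a parameter so it can be swapped for a cached or offline version. Note that this only detects dead links; a hallucinated URL that happens to resolve would still pass.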
Would it not be better if we cited with metadata from the documents or chunks, and generated the hyperlinks/table of references more algorithmically?
@assafelovic - have you experimented with something like that?
We could use the metadata in the LangChain documents for the citations, with a universal key for each document or chunk. This way we can even add "page 7 on PDF file ABC" for citations.
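A rough sketch of how this could look: each distinct source gets a short key, the LLM is asked to cite only with those keys, and the reference list is generated from metadata afterwards. Everything here is an assumption for illustration: `Chunk` is a stand-in for a LangChain `Document` (which also carries `page_content` and `metadata`), the `source` and `page` metadata keys may differ per loader, and the `[S1]` key format is made up.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Chunk:
    """Stand-in for a LangChain Document (hypothetical, for illustration)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


Ident = Tuple[str, Optional[int]]
CITE_RE = re.compile(r"\[(S\d+)\]")  # assumed citation key format, e.g. [S1]


def build_registry(chunks: List[Chunk]) -> Dict[str, Ident]:
    """Assign one key (S1, S2, ...) per distinct (source, page) pair."""
    keys_by_ident: Dict[Ident, str] = {}
    for chunk in chunks:
        ident = (chunk.metadata.get("source", "unknown"), chunk.metadata.get("page"))
        if ident not in keys_by_ident:
            keys_by_ident[ident] = f"S{len(keys_by_ident) + 1}"
    return {key: ident for ident, key in keys_by_ident.items()}


def format_reference(source: str, page: Optional[int]) -> str:
    """Render one reference; web sources become markdown hyperlinks."""
    label = source if page is None else f"{source}, page {page}"
    if source.startswith(("http://", "https://")):
        return f"[{label}]({source})"
    return label


def render_citations(report: str, registry: Dict[str, Ident]) -> str:
    """Replace [S#] keys with numbered citations and append a reference list."""
    used: List[str] = []

    def replace(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key not in registry:
            return "[citation not found]"  # key the model made up
        if key not in used:
            used.append(key)
        return f"[{used.index(key) + 1}]"

    body = CITE_RE.sub(replace, report)
    refs = [f"{i}. {format_reference(*registry[key])}" for i, key in enumerate(used, 1)]
    return body + "\n\nReferences\n" + "\n".join(refs)
```

With this approach a hallucinated citation shows up as an unknown key rather than as a plausible-looking but dead URL, so it can be caught without any network request.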
Crazy version: In my own project I convert the final markdown to PDF using pandoc and my own LaTeX template. I bet we could make fancy footnotes and citations here as well if we created an intermediate BibTeX file...
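As a sketch of what the intermediate BibTeX file could contain, the snippet below turns chunk metadata into `@misc` entries. The function names and the metadata keys (`title`, `source`, `page`) are assumptions, and escaping of special characters is omitted for brevity.

```python
from typing import Dict


def to_bibtex_entry(key: str, metadata: dict) -> str:
    """Build one @misc entry from a metadata dict (assumed keys: title, source, page)."""
    fields: Dict[str, str] = {
        "title": metadata.get("title") or metadata.get("source", "Untitled")
    }
    source = metadata.get("source", "")
    if source.startswith(("http://", "https://")):
        fields["url"] = source
    elif source:
        fields["howpublished"] = f"Local file: {source}"
    if metadata.get("page") is not None:
        fields["note"] = f"page {metadata['page']}"
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


def to_bibtex(entries: Dict[str, dict]) -> str:
    """Join entries (citation key -> metadata) into the text of a .bib file."""
    return "\n\n".join(to_bibtex_entry(key, meta) for key, meta in entries.items())
```

The report could then cite with pandoc's `[@key]` syntax and be converted with `--citeproc --bibliography=refs.bib`, leaving footnote and reference styling to pandoc and the LaTeX template.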