server/public_simplechat tiny (50KB) web client updated with reasoning, vision, builtin clientside tool calls and markdown #17506
base: master
Conversation
Update the immediate tool-call triggering failure and tool-call response timeout paths to use the new ToolTemp and MultiChatUI based chat show logic. Errors generated by the tool call itself are already handled by the previous commit's changes.
Pass chatId to the tool call, and use chatId in the tool-call response handler to decide which chat session an async tool-call response belongs to, and in turn whether the auto-submit timer should be started if auto mode is enabled.
This should ensure that tool-call responses can be mapped back to the chat session that triggered them.
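The mapping described above can be sketched roughly as follows; the names here (makeToolCall, routeToolResponse, the chats map and its fields) are illustrative assumptions, not the PR's actual identifiers:

```javascript
// Tag each tool call with the chatId of the session that triggered it,
// so the async response can be routed back to the right session.
function makeToolCall(chatId, callId, name, args) {
    return { chatId, callId, name, args };
}

// On response, look up the originating session; only that session gets the
// tool message, and its auto-submit timer is armed only if auto is enabled.
function routeToolResponse(chats, resp) {
    const chat = chats[resp.chatId];
    if (!chat) return null;
    chat.messages.push({ role: 'tool', callId: resp.callId, content: resp.content });
    if (chat.autoSubmit) chat.timerPending = true;
    return chat;
}
```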
Rather, simplify content_equiv so that it provides a relatively simple and neat representation of the reasoning, along with content and tool calls as the case may be. Also remove the partial new paragraph that was introduced in the initial pass at reasoning.
Update the existing flow so that the next Tool role message is handled directly from within it. Also take care of updating the tool-call UI, if needed, from within this.
Fix up the initial skeleton / logic as needed. Remember that we are working with a potential subset of the session's chat messages, given the sliding-window context management on the client UI side; so fix up the logic to use the right subset of the messages array, and not the global xchat, when deciding whether a message is the last or last-but-one, which need special handling wrt Assistant (with tool call) and Tool (i.e. response) messages. Moving the tool-call UI setup, as well as the tool-call-response-received UI setup, into ChatShow of MultiChatUI ensures that switching between chat sessions properly sets up the UI wrt tool-call triggering and tool-call response submission. Even loading a previously auto-saved chat session that had a pending tool call or tool-call response will set up the chat UI as needed to continue that session properly.
Also clean up the minimal mode of showing chat messages a bit. And add github.com to the allowed list.
Add a newline between name and content in the XML representation of the tool response, so that the two are easier to distinguish. Add the github, linkedin and apnews domains to allowed.domains for simpleproxy.py.
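A minimal sketch of the described wrapping, assuming a simple XML-ish envelope; the tag names here are guesses, not the PR's exact representation:

```javascript
// Wrap a tool response for the model, with a newline separating the tool
// name from the content so the two are visually distinct.
function wrapToolResponse(name, content) {
    return `<tool_response>\n<name>${name}</name>\n${content}\n</tool_response>`;
}
```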
Separate out the message UI block into a container holding a role block and a contents container block. This will allow theming these separately, if required. As part of the same change, the role is currently placed to the side of the message with vertical text flow.
Also make reasoning easily identifiable in the chat
Define rules to ensure that chat message contents wrap, so as to avoid overflowing beyond the size of the screen being viewed. The style used to place the chat message role as vertically oriented text adjacent to the message content seems to create blank-page issues in some browsers, so avoid that styling when printing.
Create the DB store, and the Try Get and Set operations. The post back to the main thread is done from asynchronous paths. NOTE: given that it has been ages since IndexedDB was last used, this is a logical implementation written by referring to MDN as needed.
Update the tooldb logic to match what the db logic and its web worker need. Bring the remaining db helpers into the tools flow.
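A logical sketch, following MDN's IndexedDB API, of what the worker-side get/set helpers might look like; the database/store names and the reply-message shape are assumptions, not the PR's actual code:

```javascript
// Assumed database and object-store names for illustration.
const DB_NAME = 'simplechat.tooldb';
const STORE = 'kv';

function openDB() {
    return new Promise((resolve, reject) => {
        const req = indexedDB.open(DB_NAME, 1);
        req.onupgradeneeded = () => req.result.createObjectStore(STORE);
        req.onsuccess = () => resolve(req.result);
        req.onerror = () => reject(req.error);
    });
}

async function dbSet(key, value) {
    const db = await openDB();
    return new Promise((resolve, reject) => {
        const tx = db.transaction(STORE, 'readwrite');
        // put (not add) so that setting an existing key updates it.
        tx.objectStore(STORE).put(value, key);
        tx.oncomplete = () => resolve(true);
        tx.onerror = () => reject(tx.error);
    });
}

async function dbGet(key) {
    const db = await openDB();
    return new Promise((resolve, reject) => {
        const req = db.transaction(STORE, 'readonly').objectStore(STORE).get(key);
        req.onsuccess = () => resolve(req.result);
        req.onerror = () => reject(req.error);
    });
}

// Reply shape posted back to the main thread from the async paths.
function makeReply(id, op, ok, value) {
    return { id, op, ok, value };
}
```

In a worker, the async `db*` results would be sent back via `postMessage(makeReply(...))` from an `onmessage` handler.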
So mention that the AI may send complex objects in stringified form. Once the type of the value is set to string, the AI should normally do this anyway, but there is no harm in hinting.
In the eagerness of the initial skeleton, I had forgotten that the root/generic tool-call router parses the JSON string into an object before calling the tool call, so there is no need to parse again; fixed the same. Also, the object-based responses from the data-store calls in the db web worker were not converted into a JSON string before being passed to the generic tool-response callback; fixed that too. The thought of making ChatMsgEx.createAllInOne handle either string or object is set aside for now, to keep things simple and consistent to the greatest extent possible across different flows. And good news: the flow is working, at least for the overall happy path. Need to check what corner cases are lurking; calling set on the same key more than once seemed to have some flow oddity, which I need to check later. Also maybe rename the field from data to value in the response to get, to match the field-name convention of set. GPT-OSS is fine with it, but micro / nano / pico models may trip up in the worst case, so better to keep things consistent.
And IndexedDB's add isn't the call to use when updating an existing key.
Update the descriptions of set and get to indicate the possible corner cases, or rather the semantics in such situations. Update the readme a bit as well; the auto save and restore mentioned there has nothing to do with the new data-store mechanism.
Add the basic skeleton on the web worker side for listing keys. TODO: avoid duplicating similar code across some of these db ops.
Update the regex to match both ordered and unordered list items.
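One possible regex covering both marker styles; this pattern and the helper around it are illustrative assumptions, not the PR's exact code:

```javascript
// Matches an optional indent, then either an unordered marker (-, *, +)
// or an ordered marker (digits followed by . or )), then the item text.
const reListItem = /^(\s*)([-*+]|\d+[.)])\s+(.*)$/;

function matchListItem(line) {
    const m = line.match(reListItem);
    if (!m) return null;
    return {
        offset: m[1].length,          // indentation depth of the item
        marker: m[2],                 // the literal list marker
        text: m[3],                   // item content after the marker
        ordered: /\d/.test(m[2]),     // true for "1." / "2)" style markers
    };
}
```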
Avoid separate new-list-level logic for the fresh-list and list-within-list paths; rather, adjust lastOffset specifically for a fresh list. All paths lead to needing to insert a list item, and the differences wrt starting or ending a list level are handled by the respective condition-check blocks directly, without delaying them for later, so the sList state is no longer needed; remove it. Avoid the check for the same-level list-item path, as nothing special needs to be done there currently. Identify the last offset live, when unwinding. NOTE: the logic currently handles ordered lists on their own, unordered lists on their own, or intermixed lists containing both types within them; however, remember that all will be shown as unordered lists. ALERT: if there is a really long line, the logic currently doesn't support it being broken into smaller lines with the same or greater offset than the line identifying the current list item.
Start an ordered or unordered list as the case may be, and push the same into endType for matching unwinding. Ignore empty lines and don't force a list unwind.
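The start/unwind behaviour can be sketched as a stack of indentation offsets, one per open list level; this is an illustration of the described behaviour, with assumed names, not the PR's code:

```javascript
// stack holds the offset of each currently open list level.
// Returns whether the incoming item starts a new (nested) level or is a
// plain item at an existing level, and how many levels were unwound.
function placeListItem(stack, offset) {
    let unwound = 0;
    // Unwind any levels deeper than the incoming item's indentation.
    while (stack.length > 0 && offset < stack[stack.length - 1]) {
        stack.pop();
        unwound++;
    }
    if (stack.length === 0 || offset > stack[stack.length - 1]) {
        stack.push(offset);   // deeper indent opens a nested list level
        return { action: 'start', unwound };
    }
    return { action: 'item', unwound };  // same-level item
}
```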
If a split line is found that remains within the constraints of the preceding list item, then don't unwind the list; rather, for now, add the split line as a new item at the same level.
Rename from unordered to just list, given that the logic now handles both types of lists at a basic level.
If the split lines don't have any empty lines in between and also remain within the block area of the list item they belong to, then the split line will be appended to the corresponding list item; ELSE a new list item will be created. To help with this, a generic keyed empty-lines tracker has been added. TODO: account for similar semantics wrt paragraph-related split lines.
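The keyed empty-lines tracker might look roughly like this; the name and interface are assumptions sketched from the description above:

```javascript
// Tracks a count of consecutive empty lines per key (e.g. 'list', 'para'),
// so each block type can independently ask whether a gap preceded a line.
function makeEmptyLinesTracker() {
    const counts = {};
    return {
        mark(key)  { counts[key] = (counts[key] || 0) + 1; },  // saw an empty line
        reset(key) { counts[key] = 0; },                       // saw content
        count(key) { return counts[key] || 0; },
    };
}
```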
Had forgotten to include this in the examples before.
Similar to list items before, now also allow a paragraph to have its long lines split into adjacent lines; in turn the logic will take care of merging them into a single paragraph. The logic common to both flows has been moved into its own helper function.
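The shared merge behaviour can be illustrated as follows; the helper name and shape are assumptions, not the actual implementation:

```javascript
// Joins adjacent non-empty lines into one paragraph; an empty line ends
// the current paragraph and starts a new one.
function mergeSplitLines(lines) {
    const paras = [];
    let cur = [];
    for (const line of lines) {
        if (line.trim() === '') {
            if (cur.length) { paras.push(cur.join(' ')); cur = []; }
        } else {
            cur.push(line.trim());
        }
    }
    if (cur.length) paras.push(cur.join(' '));
    return paras;
}
```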
Maintain raw and sanitized versions of each line. Make blockquote work with the raw line and not the sanitized line, so that the logic still works irrespective of whether sanitizing is enabled. In turn, re-enable HtmlSanitize.
Also update the readme a bit, so that it better conforms to the md file format.
Given that fetch_web_url_raw can now also fetch local files, if the local-file-access scheme is enabled in simpleproxy.py, rename this tool call by dropping web from its name, given that some AI models were getting confused by it.
Markdown-to-HTML logic should now work well enough in general, especially for basic markdown content. More complex markdown should also potentially display OK at a basic level.
If lines immediately follow a list item, without the list marker at their beginning but with an offset matching the list item, then these lines will be appended to that list item. If an empty line sits between a list item and a new line with some content but without a list marker:
* if the content offset is less than the last list item's, then unwind the lists before such a line.
* if the content offset is larger than the last list item's, then the line will be added as a new list item at the same level as the last list item.
* if the content offset is the same as the last list item's, then unwind the list by one level and insert this line as a new list item at that newly unwound level.
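The rules above can be condensed into a small decision helper; this is a sketch under assumed names, not the actual implementation:

```javascript
// lastOffset: indentation of the last list item; lineOffset: indentation of
// the incoming marker-less line; sawEmptyLine: blank line in between or not.
function classifyContinuation(lastOffset, lineOffset, sawEmptyLine) {
    if (!sawEmptyLine && lineOffset === lastOffset) return 'append-to-item';
    if (sawEmptyLine) {
        if (lineOffset < lastOffset) return 'unwind-lists';
        if (lineOffset > lastOffset) return 'new-item-same-level';
        return 'unwind-one-then-new-item';
    }
    return 'plain-text';  // remaining cases left to the caller's fallback
}
```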
Move all markdown configs into a single object field. Add an always flag which, if set, treats all roles' message contents as markdown; else only the AI assistant's messages are treated as markdown.
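The consolidated config might look like this; the field and function names are assumptions for illustration:

```javascript
// Single object holding the markdown-related settings.
const gMarkdown = {
    enabled: true,   // user can fully disable markdown processing
    always: false,   // false: only assistant messages are rendered as markdown
};

function shouldRenderMarkdown(cfg, role) {
    return cfg.enabled && (cfg.always || role === 'assistant');
}
```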
Clean up and refine exiting, unwinding or continuing lists, and lists within lists, in Markdown. Also, by default keep all contents other than Assistant responses as-is, and only interpret Assistant response contents as Markdown. The user can fully disable Markdown, or force Markdown for all role messages. This way, web page contents et al fetched using tools or loaded directly don't get messed up, while the AI model's output after processing should be better-structured Markdown (or otherwise normal plain text), so markdown processing on it by default should generally be fine.
Add a textFormat field wrt ChatMessageEx. The user can be allowed to change how the text content is interpreted at an individual message level.
Add a format selection box to the popover. Update the show_message logic to allow refreshing an existing message UI element, rather than creating a new one. Trigger a refresh of the message UI element when the format selection changes.
If the user explicitly makes a content text-format selection, that will be used; else a format will be chosen based on session settings. Now, when the popover menu is shown, the current message's format type is reflected in it.
Always show all the info when show_info is called; in turn drop the corresponding all-info enable flag wrt show_info as well as chat_show. Now chat_show gives its caller the option to enable showing of its own chat session's divStream; this is in addition to the multipart-response handler also calling the corresponding divStream show. Previously chat_show would have not only cleared the corresponding chat session's divStream contents but would have also hidden the divStream. Now, except for the clearChat case, the own divStream is unhidden whenever chat_show is called. Without this, when a tool call takes too long and a chat session times it out, and the user switches between chat sessions, if the tool call was external_ai, then its related live AI response would no longer be visible in any of the chat sessions, including the external_ai special chat session, even if the user switched to it. But now, in the external_ai special chat session, the live response will be visible. TODO: with this new chat_show semantic, where an end user can always peek into a chat session's live AI response stream, if any, as long as that chat session's AI server handshake is still active, it is better, after a tool-call timeout (which allows users to switch between sessions), to disable the external AI live divStream in other chat sessions when the user switches into them. This ensures that 1. if the user doesn't switch out of the chat session which triggered external_ai, they can for now continue to see the external AI's live response stream; 2. switching out of the chat session which triggered the external AI will automatically disable viewing of the external AI live response from all chat sessions except the external AI's special chat session. I.e. I need to explicitly clear not just the own divStream, but also the external-AI-related divStream, which is appended to the end of every chat session's UI.
This will tidy up the usage flow and UI, and avoid forcefully showing the external AI tool call's live AI response in other chat sessions which didn't trigger it. Also, in the chat session which did trigger the external AI, it will stop showing if the user exits that chat session. At the same time, the user can always look at the external AI live response stream in the special chat session corresponding to it.
Implement the TODO noted in the last commit, and a bit more. This brings in clearing of the external AI tool call's special chat session divStream during chat show, which ensures that it is hidden by default wrt other chat sessions and only gets enabled if the user triggers a new external AI tool call. This patch also ensures that if an external AI tool call takes too long, and the logic hands back control with a timed-out response as a possible reply to the AI for that tool call, then the external AI tool call's live AI response is no longer visible in the current chat session's UI. The user can go ahead with the timed-out response, or some other user-decided response, as the response to the tool call, and take the chat in a different direction of their and the AI's choosing. Or else, if they want, they can switch to the External AI specific special chat session and continue to monitor the response from the tool call there, to understand what its final response would have been. This should keep the UI flow clean. ALERT: if the user triggers a new external AI tool call while the old one is still alive in the background, then the responses from both will race for user visibility, so beware.
Updated server/public_simplechat additionally with an initial go at simple-minded, minimal markdown-to-HTML logic, so that if the AI model outputs markdown text instead of plain text, the user gets a basic formatted view of it. If things don't seem OK, the user can disable markdown processing from the settings in the UI.
Look into the previous PR #17451 in this series for details wrt other features added to tools/server/public_simplechat, like peeking into reasoning, working with vision models, as well as built-in support for a bunch of useful tool calls on the client side with minimal to no setup.
All features (except for pdf, which has a pypdf dependency) are implemented internally without depending on any external libraries, and in turn should fit within 50KB compressed. Created using pure HTML+CSS+JS in general, with Python additionally for simpleproxy to bypass the CORS++ restrictions of the browser environment for direct web access.