Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle message moves for recent senders data model #1418

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

PIG208
Copy link
Member

@PIG208 PIG208 commented Mar 18, 2025

Fixes #1460.

To test this manually, bring up mentions autocomplete in a channel view by entering @ in the compose box, and move messages from different users around on another device. Moving the cursor before from the @ and back should refresh the options after the recent sender update.

@PIG208 PIG208 force-pushed the pr-moved branch 3 times, most recently from adf3690 to ed9452d Compare March 18, 2025 22:07
@PIG208 PIG208 marked this pull request as ready for review March 18, 2025 22:08
@PIG208 PIG208 added the maintainer review PR ready for review by Zulip maintainers label Mar 18, 2025
@PIG208 PIG208 requested a review from chrisbobbe March 18, 2025 22:08
Copy link
Collaborator

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Small comments below.

@@ -1,5 +1,5 @@
import 'package:collection/collection.dart';
import 'package:flutter/foundation.dart';
import 'package:flutter/foundation.dart' hide binarySearch;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recent_senders [nfc]: Use binarySearch when applicable

This achieves the same effect that we used lowerBound for (testing if
an element is present in a sorted list), but more concisly.

(The other call site of lowerBound is untouched, because we
indeed need the returned position for insertion.)

Commit-message nit: spelling of "concisely"

@@ -68,6 +68,52 @@ class RecentSenders {
[senderId] ??= MessageIdTracker()).add(messageId);
}

/// Handles channel/topic updates to the data resulted from a move.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I having trouble parsing this sentence. Perhaps: "Handles channel/topic updates when messages are moved."?

@@ -142,6 +145,169 @@ void main() {
});
});

group('RecentSenders.handleUpdateMessageUpdate', () {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'RecentSenders.handleUpdateMessageEvent'

Comment on lines 187 to 190
final streamSenderIdsBefore = model.streamSenders
[origChannel.streamId]![userX.userId]!.ids;
final topicSenderIdsBefore = model.topicSenders
[origChannel.streamId]![TopicName(origTopic)]![userX.userId]!.ids;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read "…SenderIds…" in these names, and it sounds like user IDs but the value is actually message IDs. (Same in several places in these tests)

@PIG208
Copy link
Member Author

PIG208 commented Mar 28, 2025

Thanks for the review! Updated the PR.

Copy link
Collaborator

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just one small comment below, and I'll mark for Greg's review.

Comment on lines 187 to 190
final messageIdsByUserInStreamBefore = model.streamSenders
[origChannel.streamId]![userX.userId]!.ids;
final messageIdsByUserInTopicBefore = model.topicSenders
[origChannel.streamId]![TopicName(origTopic)]![userX.userId]!.ids;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These line breaks feel a little awkward; can we try something like:

Suggested change
final messageIdsByUserInStreamBefore = model.streamSenders
[origChannel.streamId]![userX.userId]!.ids;
final messageIdsByUserInTopicBefore = model.topicSenders
[origChannel.streamId]![TopicName(origTopic)]![userX.userId]!.ids;
final messageIdsByUserInStreamBefore =
model.streamSenders[origChannel.streamId]![userX.userId]!.ids;
final messageIdsByUserInTopicBefore =
model.topicSenders[origChannel.streamId]![eg.t(origTopic)]![userX.userId]!.ids;

(here and elsewhere)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't know that eg.t exists! Thanks, that's helpful (it still went over by 7 characters !.ids);, but it's a bit better than 12 with TopicName).

@chrisbobbe chrisbobbe added integration review Added by maintainers when PR may be ready for integration and removed maintainer review PR ready for review by Zulip maintainers labels Apr 1, 2025
@chrisbobbe chrisbobbe assigned gnprice and unassigned chrisbobbe Apr 1, 2025
@chrisbobbe chrisbobbe requested a review from gnprice April 1, 2025 04:04
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Comments below.

In this round I haven't yet read the main commit's tests; I think they can probably be made simpler given one of my comments below.

Comment on lines 1 to 2
import 'package:collection/collection.dart';
import 'package:flutter/foundation.dart';
import 'package:flutter/foundation.dart' hide binarySearch;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's apply hide the other way around, actually — the implementation in flutter/foundation is simpler. (It's the same algorithm, and very similar code; it looks like they started from the package:collection version, then took out the extra options like compare and keyOf.)

Comment on lines +770 to 771
recentSenders.handleUpdateMessageEvent(event, messages);
_messages.handleUpdateMessageEvent(event);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's essential that these lines come in this order. Right?

Let's tweak the interface to make that evident here at the call site:

Suggested change
recentSenders.handleUpdateMessageEvent(event, messages);
_messages.handleUpdateMessageEvent(event);
recentSenders.handleUpdateMessageEvent(event, oldMessages: messages);
_messages.handleUpdateMessageEvent(event);

Hmm but tests don't break if I swap these lines. Maybe it doesn't matter? If it does matter, let's add a test that would catch that.

Copy link
Member Author

@PIG208 PIG208 Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it shouldn't matter because we only need the message IDs and their corresponding sender user IDs; those are not subject to updates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that makes sense.

Let's I guess write that conclusion down in dartdoc on that handleUpdateMessageEvent method: the messages parameter should be store.messages, and it doesn't matter whether the method is called before or after MessageStore itself is updated.

Comment on lines 93 to 97
final movedMessagesInStreamTracker = streamTracker?.popAll(messages);
if (streamTracker?.maxId == null) sendersInChannel?.remove(senderId);
if (movedMessagesInStreamTracker != null) {
((streamSenders[newStreamId] ??= {})
[senderId] ??= MessageIdTracker()).addAll(movedMessagesInStreamTracker);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this last line add only movedMessagesInStreamTracker, or the whole messages list?

I think it should be the whole messages. These are messages we know were sent by the given sender, and now have the given channel, so they belong in the new channel's tracker for that sender.

(Also I think it should be an invariant that movedMessagesInStreamTracker has all the same messages as messages — i.e. that all the messages in messages are known to the RecentSenders data structure. That's because messages only has messages that were found in store.messages, and every message that makes it there should also make it into here.)

This differs from the situation for unreads because a message only belongs in the unreads data structure if it's unread, and the way to find out whether it's unread is by looking in the unreads data structure's own existing data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I agree that this invariant holds. It would be helpful to add an assertion here, but I think we should still use the return value from popAll here, because we otherwise need to construct a QueueList from messages even when there is one that can be reused.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but I guess if the invariant doesn't hold because of a client bug, messages would be a more complete source, since it is always a superset of the popped IDs. You mentioned that one of your comments could simplify the tests, maybe it is this one, since we drop the optimization for reusing the QueueList?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think constructing a QueueList isn't materially more expensive than a plain List, and this implementation is already doing that in _groupStreamMessageIdsBySender.

If the invariant gets broken, it doesn't matter which version of the list we use (messages vs movedMessagesInStreamTracker); that case shouldn't happen, and if it does then bigger things are probably wrong elsewhere than the difference between those lists. Which one to use can be governed by whatever makes the code better for the non-buggy case.

But zooming out… → well, I ended up posting my next thoughts on this below: #1418 (comment)

:origStreamId, :newStreamId, :origTopic, :newTopic) = event.moveData!;

final messagesBySender = _groupStreamMessageIdsBySender(event.messageIds, cachedMessages);
final sendersInChannel = streamSenders[origStreamId];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
final sendersInChannel = streamSenders[origStreamId];
final sendersInStream = streamSenders[origStreamId];

Given all the other "stream" names here, plus the otherwise very similar handleDeleteMessageEvent below which it's useful to compare this code to, I think it's clearest if these particular locals stick with the "stream" names for now. We'll switch them all to "channel" names as part of the post-launch sweep we have in the tracker.

Comment on lines 86 to 96
// The later `popAll` calls require the message IDs to be sorted in
// ascending order. Only sort as many as we need: the message IDs
// with the same sender, instead of all of them in `event.messageIds`.
messages.sort();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like something we should be able to count on the server doing — it's an easy guarantee for the server to make, and I'd guess that the server does sort the list and has since forever.

So at a minimum we should leave a TODO(server) comment about getting that as an API guarantee. It'd also be good to ask about it in #api documentation.

Copy link
Member Author

@PIG208 PIG208 Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug a bit but didn't find solid guarantee that this is sorted. The web app suggests otherwise, albeit with quite an old comment:

https://github.com/zulip/zulip/blob/82f4ec0d6c1c30e1d1ab72d175deea23530b8ff1/web/src/message_events.ts#L535-L537

which is consistent with the error I got from an added assert before sorting this:

BUG: Error handling an event: 'package:zulip/model/recent_senders.dart': Failed assertion: line 89 pos 14: 'isSortedWithoutDuplicates(messages)': is not true.

Leaving a TODO(server) comment and follow up sounds to me.

#api documentation > Make message_ids from message update event sorted @ 💬

PIG208 added 4 commits April 2, 2025 20:08
To be reused later for handling moved messages.
This achieves the same effect that we used lowerBound for (testing if
an element is present in a sorted list), but more concisely.

(The other call site of lowerBound is untouched, because we
indeed need the returned position for insertion.)
The data structure and its helpers rely on message IDs being sorted,
otherwise its behavior is unpredictable.

This will provide useful diagnostic information when that happens. This
is NFC because no change is made live, i.e. no additional sorting.
This is similar to how we add support to handling moves for unreads
(commit 34e6201), especially the optimizations to avoid making
unnecessary copies of message IDs when the entire conversation is moved
(e.g. resolve/unresolve topic).

An alternative approach to this is extracting helpers from
handleMessages and handleDeleteMessageEvent and combining the two to
handle moves, like web does
(https://github.com/zulip/zulip/blob/bd04a30b/web/src/recent_senders.ts#L165-L190).

Compared to that, creating a dedicated helper (this commit) makes it
more straightforward to optimize for our performance needs.

(The tests do not have to use PerAccountStore, but this setup makes it a
bit more integrated.)

Fixes: zulip#901
@PIG208
Copy link
Member Author

PIG208 commented Apr 3, 2025

This has been updated. Thanks for the review!

@gnprice
Copy link
Member

gnprice commented Apr 3, 2025

Thanks for the revision!

As I think about this more, I'm realizing that this path is performance-sensitive for much the same reasons as its Unreads counterpart, and yet is different for a couple of reasons so the Unreads solution from #1311 can't be just transplanted here, and therefore it's going to take significant effort to get to a version that I'm confident is well optimized just from reading it. (Potentially some profiling would show that optimization isn't needed, that it's already fast enough — but profiling would also take significant effort.) See the subthread from #1418 (comment) above.

Meanwhile the behavior this fixes is a lot less high-priority than #1311 was: the unreads data drives several highly-visible parts of the app's UI, while the recent-senders data only affects @-mention autocomplete, and only the ranking of results there. So the main effect of the remaining bug here is that if a topic gets moved or resolved and you try to write a message there and @-mention one of the participants, we'll fail to adequately prioritize them in the autocomplete results. That seems pretty OK for launch.

So I think what I'd like to do to handle this is:

  • I'll file a new issue for the recent-senders data, for M6.
  • I'll re-scope the M5a issue Handle moved messages in unreads data #901 as being about just the unreads data, and mark it as having been fixed by Handle moved messages for unreads #1311.
    (This is probably all a sign that I should have filed it as separate issues in the first place, oh well.)
  • I'll mark this PR as a draft. We can return to it post-launch, when we pick up the issue again.

@gnprice gnprice marked this pull request as draft April 3, 2025 18:49
@gnprice gnprice removed the integration review Added by maintainers when PR may be ready for integration label Apr 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle moved messages in recent-senders data
3 participants