Is your feature request related to a problem? Please describe.
This is the entity passivation code inside the `Shard` class - it passivates actors based on how long ago they processed their last message. We do this to free up memory from unused entity actors:
akka.net/src/contrib/cluster/Akka.Cluster.Sharding/Shard.cs
Lines 1788 to 1802 in 4ae4792
```csharp
private void PassivateIdleEntities()
{
    var deadline = DateTime.UtcNow - _settings.PassivateIdleEntityAfter;
    var refsToPassivate = _lastMessageTimestamp
        .Where(i => i.Value < deadline)
        .Select(i => _entities.Entity(i.Key))
        .Where(i => i != null).ToList();
    if (refsToPassivate.Count > 0)
    {
        Log.Debug("{0}: Passivating [{1}] idle entities", _typeName, refsToPassivate.Count);
        foreach (var r in refsToPassivate)
            Passivate(r!, _handOffStopMessage);
    }
}
```
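For reference, the deadline above is driven by `_settings.PassivateIdleEntityAfter`, which comes from HOCON. A minimal sketch of tuning or disabling it (the key is the real Akka.Cluster.Sharding setting; the values here are illustrative):

```csharp
using Akka.Actor;
using Akka.Configuration;

// Sketch only: the 2m value is an example (it matches the 120s default).
var config = ConfigurationFactory.ParseString(@"
    akka.cluster.sharding {
        # entities that go this long without a message routed through
        # sharding are passivated
        passivate-idle-entity-after = 2m

        # set to off to disable automatic passivation entirely
        # passivate-idle-entity-after = off
    }");

var system = ActorSystem.Create("MySystem", config);
```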
The passivation window is configurable (as sketched above) and this feature can be disabled altogether - but that's not really what this is about. The problem is that the `Shard` actor only uses data from the Cluster.Sharding system itself to keep entity actors alive:
akka.net/src/contrib/cluster/Akka.Cluster.Sharding/Shard.cs
Lines 1932 to 1946 in 4ae4792
```csharp
private IActorRef GetOrCreateEntity(EntityId id)
{
    var child = _entities.Entity(id);
    if (child != null)
        return child;

    var name = Uri.EscapeDataString(id);
    var a = Context.ActorOf(_entityProps(id), name);
    Context.WatchWith(a, new EntityTerminated(a));
    Log.Debug("{0}: Started entity [{1}] with entity id [{2}] in shard [{3}]", _typeName, a, id, _shardId);
    _entities.AddEntity(id, a);
    TouchLastMessageTimestamp(id);
    EntityCreated(id);
    return a;
}
```
akka.net/src/contrib/cluster/Akka.Cluster.Sharding/Shard.cs
Lines 1780 to 1786 in 4ae4792
```csharp
private void TouchLastMessageTimestamp(EntityId id)
{
    if (_passivateIdleTask != null)
    {
        _lastMessageTimestamp[id] = DateTime.UtcNow;
    }
}
```
This can result in weird scenarios where "busy" entity actors are still killed, such as:
- Queueing a ton of messages to an entity actor upfront, where the entity actor takes more than 2m to process them;
- An entity actor that receives traffic from non-`ShardRegion` sources (i.e. `DistributedPubSub` - sketched below) can die;
- Entity actors that receive messages directly from other actors (via their `IActorRef`, rather than through the `ShardRegion`) can die.
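To make the `DistributedPubSub` case concrete, here's a minimal sketch (the `DeviceEntity` actor and topic name are hypothetical):

```csharp
using Akka.Actor;
using Akka.Cluster.Tools.PublishSubscribe;

// Messages published to the topic are delivered straight to Self by the
// DistributedPubSub mediator, bypassing the ShardRegion, so the Shard's
// _lastMessageTimestamp entry for this entity is never touched and the
// entity can be passivated while it is still receiving traffic.
public sealed class DeviceEntity : ReceiveActor
{
    public DeviceEntity()
    {
        var mediator = DistributedPubSub.Get(Context.System).Mediator;
        mediator.Tell(new Subscribe("device-updates", Self));

        Receive<SubscribeAck>(_ => { }); // subscription confirmed
        Receive<string>(update =>
        {
            // real work happens here, but none of it resets the idle deadline
        });
    }
}
```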
I think we can probably broaden the definition of "passivate" to include all sources of message traffic that are not recurring messages (i.e. `IWithTimers` or `Context.System.Scheduler`) and enforce that inside Akka.Cluster.Sharding. This should make the behavior of automatic entity passivation more closely aligned with what users expect, without having to make distinctions between which messages count and which ones don't.
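For reference, Akka.NET already draws this distinction for `ReceiveTimeout`: messages implementing `INotInfluenceReceiveTimeout` don't reset an actor's timeout. A minimal sketch of a recurring self-tick marked that way (the `Tick` type is hypothetical):

```csharp
using Akka.Actor;

// A recurring timer message that opts out of ReceiveTimeout accounting.
// The proposal above would have sharding's idle tracking honor the same
// marker, so scheduler/timer traffic doesn't count as "activity".
public sealed class Tick : INotInfluenceReceiveTimeout
{
    public static readonly Tick Instance = new Tick();
    private Tick() { }
}
```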
Describe the solution you'd like
Two ideas for this:
- A passivation protocol - similar to how `ReceiveTimeout` works, but in this case there are two parties: a passivator (A) and a passivatee (B). A basically tells B to set a `ReceiveTimeout` for some duration and then B does all of the accounting. The rest of the normal `INotInfluenceReceiveTimeout` and timeout-window code that's already in the `ActorCell` applies, and A gets a notification when B hits its receive timeout. The only real thing we'd need to add here is a private message type, handled automatically via `ActorCell`, that allows someone else to set the `ReceiveTimeout`, plus a notification message type back (sketched after this list).
- Optional and not exclusive, but maybe we should add an override to `ActorBase` that allows user-defined actors to customize what happens when they receive a `PoisonPill` - that way we can avoid / address issues like "Akka.Cluster.Sharding / Akka.Persistence: `PoisonPill` message gets processed, kills actor before `Persist()` callback executed" #6321. So if a passivating actor needs to do some work prior to shutting down, maybe it can be given some time to do that. The only problem here: there's no guarantee you get the time you need when shutting down, so this might add some extra complexity the framework doesn't need. It did occur to me, though.
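A rough sketch of the passivation-protocol idea, approximated with today's public `SetReceiveTimeout` API rather than the private `ActorCell` plumbing described above (all message types here are hypothetical):

```csharp
using System;
using Akka.Actor;

// A (the passivator, e.g. the Shard) sends this to B (the passivatee):
public sealed class SetPassivationTimeout
{
    public SetPassivationTimeout(TimeSpan timeout) => Timeout = timeout;
    public TimeSpan Timeout { get; }
}

// B sends this back to A when its receive timeout fires:
public sealed class PassivationTimedOut
{
    public static readonly PassivationTimedOut Instance = new PassivationTimedOut();
    private PassivationTimedOut() { }
}

// Entity-side (B) handling:
public sealed class MyEntity : ReceiveActor
{
    public MyEntity()
    {
        // A tells B to do the idle accounting itself...
        Receive<SetPassivationTimeout>(s => Context.SetReceiveTimeout(s.Timeout));

        // ...and the existing ReceiveTimeout / INotInfluenceReceiveTimeout
        // machinery decides when the entity is truly idle.
        Receive<ReceiveTimeout>(_ => Context.Parent.Tell(PassivationTimedOut.Instance));
    }
}
```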
Describe alternatives you've considered
I also considered the alternative of having the entity actors ping the `Shard` every time they receive a message, but that would get incredibly noisy and would significantly harm the throughput of the sharding system. Better to push the decision-making and state tracking to the edges, where it belongs.