Can't recover from disk full error

### Report

The disk (PVC) filled up; I enlarged the PVCs and let MySQL restart.

Group replication never came back on its own. I set instance 0 to "bootstrap", which got replication working again.

The two other instances never finished "recovering", though, and are now just crash looping. The logs from one of them are attached.


### More about the problem


[mysql-1.txt](https://github.com/user-attachments/files/17495499/mysql-1.txt)

The controller log doesn't contain anything that is useful to me; it seems to think everything is fine-ish. I'm not sure what role the controller has here, though (I'm migrating from the Bitpoke operator, which worked a little differently, with the orchestrator exposed).


```
2024-10-23T16:21:26.676Z	INFO	Crash recovery	Pod is waiting for recovery	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "d8d400bb-a9ba-4351-a213-4dcd61755a65", "pod": "ntpdb-mysql-0", "gtidExecuted": "60a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-17766194,6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-10"}
2024-10-23T16:22:27.762Z	INFO	Crash recovery	Pod is waiting for recovery	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "d8d400bb-a9ba-4351-a213-4dcd61755a65", "pod": "ntpdb-mysql-1", "gtidExecuted": "60a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-17766194,6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-1060a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-16262363,6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-5"}
2024-10-23T16:23:40.357Z	INFO	Crash recovery	Cluster was successfully rebooted	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "d8d400bb-a9ba-4351-a213-4dcd61755a65"}
2024-10-23T16:23:47.288Z	INFO	groupReplicationStatus.ntpdb-mysql-1.ntpdb-mysql.ntpdb	Member is not ONLINE	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "d8d400bb-a9ba-4351-a213-4dcd61755a65", "state": "RECOVERING"}
2024-10-23T16:30:19.004Z	INFO	Crash recovery	Pod is waiting for recovery	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "b7ace1d4-d0eb-47cd-83d3-e9e8f7a81940", "pod": "ntpdb-mysql-0", "gtidExecuted": "60a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-17766305,6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-13"}
2024-10-23T16:31:20.054Z	INFO	Crash recovery	Pod is waiting for recovery	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "b7ace1d4-d0eb-47cd-83d3-e9e8f7a81940", "pod": "ntpdb-mysql-1", "gtidExecuted": "60a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-17766305,6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-1360a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-16262363,6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-5"}
2024-10-23T16:31:55.660Z	INFO	Crash recovery	Cluster was successfully rebooted	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "b7ace1d4-d0eb-47cd-83d3-e9e8f7a81940"}
2024-10-23T16:32:02.594Z	INFO	groupReplicationStatus.ntpdb-mysql-1.ntpdb-mysql.ntpdb	Member is not ONLINE	{"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "b7ace1d4-d0eb-47cd-83d3-e9e8f7a81940", "state": "OFFLINE"}
```
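To make the `gtidExecuted` values in the log above easier to compare, here is a minimal Python sketch that counts how many transactions one member has executed that another has not. It is not part of the operator; the function names are made up for illustration. It assumes the plain MySQL 8.0 GTID set format (`uuid:first-last[:first-last...]`, comma separated), that intervals do not overlap, and that the lagging set is a subset of the other one. The `ntpdb-mysql-1` log entries appear to contain two GTID sets run together, so the second value below is my reading of the trailing part of that entry, which is an assumption.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-4:7,uuid2:1-10' into {uuid: [(first, last), ...]}."""
    gtid_set = {}
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue
        source_uuid, *intervals = entry.split(":")
        ranges = gtid_set.setdefault(source_uuid.lower(), [])
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single transaction is written as 'N' instead of 'N-N'.
            ranges.append((int(first), int(last or first)))
    return gtid_set


def transaction_counts(gtid_set):
    """Number of transactions per source UUID (assumes non-overlapping intervals)."""
    return {
        uuid: sum(last - first + 1 for first, last in ranges)
        for uuid, ranges in gtid_set.items()
    }


def missing_counts(ahead, behind):
    """Per-UUID count of transactions in 'ahead' that 'behind' lacks.

    This is a count difference, not a true set subtraction, so it is only
    meaningful if 'behind' is a subset of 'ahead' (assumed here).
    """
    ahead_counts = transaction_counts(parse_gtid_set(ahead))
    behind_counts = transaction_counts(parse_gtid_set(behind))
    return {uuid: n - behind_counts.get(uuid, 0) for uuid, n in ahead_counts.items()}


# Value logged for ntpdb-mysql-0 at 16:21:26.
MYSQL_0 = (
    "60a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,"
    "6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-17766194,"
    "6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-10"
)
# Trailing part of the ntpdb-mysql-1 entry at 16:22:27 (assumed to be its own set).
MYSQL_1 = (
    "60a1cc96-859c-11ef-99ea-fe24b27f638b:1-4,"
    "6c35e34c-859c-11ef-9ccb-fe24b27f638b:1-16262363,"
    "6c35e5d8-859c-11ef-9ccb-fe24b27f638b:1-5"
)

if __name__ == "__main__":
    for uuid, missing in missing_counts(MYSQL_0, MYSQL_1).items():
        print(f"{uuid}: {missing} transactions behind")
```

If that reading of the log is right, `ntpdb-mysql-1` is roughly 1.5 million transactions behind `ntpdb-mysql-0`, which would mean it has a lot to catch up on during recovery.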


### Steps to reproduce

1. Let the disk fill up; for example, use the default configuration, which doesn't limit how many binlog files are kept.
2. Watch the cluster go down.
3. Watch the cluster fail to recover after disk space has been added.
4. Force mysql-0 to start group replication.
5. Watch the replicas never recover.


### Versions

1. Kubernetes - v1.28.12
2. Operator - 0.8.0
3. Database - the default 8.x version from 0.8.0


### Anything else?

_No response_

Can't recover from disk full error #758