Description
Hello,
Related to previously closed issue: #1617
We are using blobfuse2 on AKS via the Blob CSI driver (latest version).
A few days ago we upgraded the node pool to the latest image, which automatically installed blobfuse2 version 2.4.0.
When we ran our performance tests, nodes started transitioning to the NotReady state after a very short period of time.
After a debugging session, we realized that blobfuse2 is not freeing RAM at all. Its memory usage just keeps growing until the host becomes unresponsive due to lack of memory.
I ran tests on the following blobfuse2 versions:
- 2.3.0 -> no issue
- 2.3.2 -> issue persists
- 2.4.0 -> issue persists
- 2.4.1 -> issue persists
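For anyone who wants to compare versions the same way, here is a minimal sketch of how the memory growth can be sampled on a node. It is not the exact tooling we used. It assumes a Linux host with /proc available and that the process name reported in /proc/<pid>/comm is `blobfuse2`; the function names and the sampling interval are illustrative only.

```python
import os
import time
from typing import List, Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line format: "VmRSS:     123456 kB"
            return int(line.split()[1])
    return None


def find_pids(process_name: str = "blobfuse2") -> List[int]:
    """List PIDs whose /proc/<pid>/comm matches process_name (assumed name)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == process_name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return pids


def sample_rss(interval_s: float = 10.0, samples: int = 6) -> None:
    """Print the resident set size of each matching process at fixed intervals."""
    for _ in range(samples):
        for pid in find_pids():
            try:
                with open(f"/proc/{pid}/status") as f:
                    rss = parse_vm_rss_kib(f.read())
            except OSError:
                continue
            print(f"{time.strftime('%H:%M:%S')} pid={pid} rss_kib={rss}")
        time.sleep(interval_s)


if __name__ == "__main__":
    sample_rss()
```

Running this on the node while the performance test is active should show whether the resident set size ever drops back or only keeps increasing.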
We are using the Blob CSI driver via a PV (RBAC UAMI auth) with the mount options below:
mountOptions:
- '-o allow_other'
- '--file-cache-timeout-in-seconds=0'
- '-o attr_timeout=0'
- '-o entry_timeout=0'
- '-o negative_timeout=0'
- '--attr-timeout=0'
- '--entry-timeout=0'
- '--cancel-list-on-mount-seconds=10'
- '--block-cache'
We are using block cache because of a file cache limitation (it doesn't clean up folders, so the inode limit is reached).
We also tried -o direct_io, but performance was very poor and not acceptable under our SLA.
Any suggestions are welcome, thanks!