
Do not hold mutex when destructing big object #178

Open
wants to merge 3 commits into base: master

Conversation

dongdaoguang

Signed-off-by: jason [email protected]
When running Titan in production, we frequently see get requests with latencies of several hundred milliseconds. The cause is that the get command calls mutex_.Lock() when it creates a snapshot, while the BlobGCJob object is destructed with mutex_ held. That object stores the KVs that need to be written back to the SST files; when there are many such KVs (offline tests saw several hundred thousand), the object becomes very large, its destruction takes a long time, and mutex_ is held for that whole period, so get requests block on the lock. We therefore suggest not holding the lock while destructing large objects. After applying this fix, we have been running it in production for a month and get no longer shows latency spikes.

Before the fix: roughly 100,000 get requests per day exceeded 200 ms latency.
After the fix: roughly 100 get requests per day exceeded 200 ms latency.
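
To make the pattern concrete, here is a minimal standalone sketch of the idea described above, using hypothetical names (BigJob, BackgroundWork) rather than the actual Titan code; the real change under review is shown in the diff below.

```cpp
// Minimal sketch of "do not destroy a large object while holding the mutex".
// BigJob and BackgroundWork are hypothetical names, not Titan's real types.
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct BigJob {
  // KV pairs that must be rewritten to SST files; offline tests saw
  // hundreds of thousands of entries, so destruction is slow.
  std::vector<std::pair<std::string, std::string>> rewritten_kvs;
};

std::mutex mutex_;

void BackgroundWork(BigJob* job) {
  mutex_.lock();
  // ... GC work that must run under the mutex ...
  mutex_.unlock();  // release the lock first ...
  delete job;       // ... so the slow destruction cannot block get
  mutex_.lock();
  // ... remaining bookkeeping under the mutex ...
  mutex_.unlock();
}
```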

Diff under review:

    }

      mutex_.Unlock();
      delete blob_gc_job;
Collaborator

Not sure if unlocking here could cause a race condition, and it is hard to check. A better pattern would be to pass a struct GCContext { std::unique_ptr<BlobGCJob> } created outside of the mutex (in BackgroundCallGC()) into BackgroundGC(), use the context struct to hold the pointer, and make sure the context struct is cleaned up outside of the mutex.
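
For illustration, a rough standalone sketch of this suggestion follows; GCContext's layout and the wiring of BackgroundCallGC()/BackgroundGC() here are simplified assumptions, not the actual Titan implementation.

```cpp
// Sketch of the suggested GCContext pattern: the context is owned outside
// the lock scope, so the large BlobGCJob is destroyed after mutex_ is
// released. Simplified, hypothetical code.
#include <memory>
#include <mutex>

struct BlobGCJob { /* large: holds the KVs to rewrite */ };

struct GCContext {
  std::unique_ptr<BlobGCJob> blob_gc_job;
};

std::mutex mutex_;

void BackgroundGC(GCContext* context) {
  // Runs with mutex_ held; parks the job in the caller-owned context
  // instead of deleting it here.
  context->blob_gc_job.reset(new BlobGCJob());
  // ... run the GC job ...
}

void BackgroundCallGC() {
  GCContext context;  // owned outside the lock scope
  {
    std::lock_guard<std::mutex> lock(mutex_);
    BackgroundGC(&context);
    // ... other bookkeeping under the mutex ...
  }
  // context (and the large BlobGCJob it owns) is destroyed here, after
  // the mutex has been released, so get requests are not blocked.
}
```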

Author

We checked the BlobGCJob destructor and are sure it will not cause a race condition, but using a GCContext to manage the BlobGCJob pointer is a good idea.

Collaborator

Not just the BlobGCJob destructor. I haven't checked whether it is safe if some other job slips in between blob_gc_job->Finish() and blob_gc->ReleaseGcFiles().

Collaborator

@yiwu-arbug left a comment

Thanks a lot for the fix. May I know where you are using Titan?

@yiwu-arbug
Collaborator

And do you mind updating the PR summary in English?

@yiwu-arbug
Collaborator

You can install clang-format and clang-format-diff and run scripts/format-diff.sh once to fix CI.

@dongdaoguang
Author

We use Titan for PIKA, which is a persistent, large-capacity storage service.

@Connor1996
Member

@dongdaoguang friendly ping

@yiwu-arbug
Collaborator

/run-tests

@yiwu-arbug
Collaborator

/run-tests
