Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue #934: multi-thread compaction support #976

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

caijieming-ng
Copy link
Collaborator

#934 compact优化,story3, 多线程compact

@caijieming-ng
Copy link
Collaborator Author

build

@baidubot
Copy link
Collaborator

Reviewers: @taocp @xupeilin

@caijieming-ng caijieming-ng force-pushed the compact_multithread branch 2 times, most recently from 9c789c6 to b38af4b Compare August 25, 2016 02:40
@caijieming-ng caijieming-ng mentioned this pull request Aug 25, 2016
@caijieming-ng caijieming-ng force-pushed the compact_multithread branch 3 times, most recently from 4ee16ed to d84e0b7 Compare August 30, 2016 04:43
// check filesystem, and then check pending_outputs_
std::vector<std::string> filenames;
mutex_.Unlock();
env_->GetChildren(dbname_, &filenames); // Ignoring errors on purpose
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why Ignoring errors on purpose

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

即使本次删除旧sst失败,下一次compact也会继续删除,所以没有必要因为此处失败,而认为本次compact失败,compact重做的话,io开销很重。

@caijieming-ng caijieming-ng force-pushed the compact_multithread branch 3 times, most recently from f28504a to e61e91b Compare September 7, 2016 18:31
@@ -219,10 +219,10 @@ Status DBTable::Init() {
std::vector<uint64_t> snapshot_sequence = options_.snapshots_sequence;
std::map<uint64_t, uint64_t> rollbacks = options_.rollbacks;
for (std::set<uint32_t>::iterator it = options_.exist_lg_list->begin();
it != options_.exist_lg_list->end() && s.ok(); ++it) {
it != options_.exist_lg_list->end() && s.ok(); ++it) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改这个缩进是基于一种怎样的考虑?

@caijieming-ng caijieming-ng force-pushed the compact_multithread branch 2 times, most recently from 59b896b to 1741aa5 Compare September 8, 2016 04:46
@taocp
Copy link
Collaborator

taocp commented Sep 8, 2016

@baidu/tera_dev 请都瞧一瞧。

LGTM

@00k
Copy link
Collaborator

00k commented Sep 13, 2016

build

1 similar comment
@lylei
Copy link
Collaborator

lylei commented Sep 19, 2016

build

const InternalKey* begin; // NULL means beginning of key range
const InternalKey* end; // NULL means end of key range
InternalKey tmp_storage; // Used to keep track of compaction progress
int compaction_conflict; // 0 == idle, 1 == conflict, 2 == wake
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

直接用ManualCompactState?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum内部也是个int吧

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

只是感觉这样好读

@00k
Copy link
Collaborator

00k commented Oct 26, 2016

commit log里,multithread中间加个-,compactiong拼错了

@caijieming-ng caijieming-ng force-pushed the compact_multithread branch 2 times, most recently from 5ec0c93 to 549443c Compare November 21, 2016 08:20
@caijieming-ng
Copy link
Collaborator Author

已修改

@caijieming-ng caijieming-ng changed the title issue #934: multithread compactiong support issue #934: multi-thread compaction support Nov 21, 2016
@caijieming-ng
Copy link
Collaborator Author

build

@caijieming-ng
Copy link
Collaborator Author

build

2 similar comments
@caijieming-ng
Copy link
Collaborator Author

build

@caijieming-ng
Copy link
Collaborator Author

build

@taocp
Copy link
Collaborator

taocp commented May 11, 2017

在看ing,需要点时间。。稍安勿躁!

Copy link
Collaborator

@taocp taocp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

求补点注释:)

}
imm_->SetBeingFlushed(true);

if (imm_->ApproximateMemoryUsage() <= 0) { // imm is empty, do nothing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这是什么情况?

Copy link
Collaborator Author

@caijieming-ng caijieming-ng Jun 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

代码完整性检查:memtablet没有数据时,不做内存的dump

struct CompactionTask {
int64_t id;
double score;
uint64_t timeout;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

求注释下各个字段的基本含义

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已改

@@ -110,19 +116,19 @@ class DBImpl : public DB {

// Compact the in-memory write buffer to disk. Switches to a new
// log-file/memtable and writes a new descriptor iff successful.
Status CompactMemTable()
Status CompactMemTable(bool* sched_idle = NULL)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

函数的参数求注释啊,下同

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

描述改函数是否真正干活,避免空调度

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

防止空调度

// end of tera-specific

bool ScoreSortDesc(std::pair<double, uint64_t> i, std::pair<double, uint64_t> j) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

函数名求不要简写,看不懂。。。

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已改

c->ReleaseInputs();
DeleteObsoleteFiles();
DeleteObsoleteFiles();// delete any sst not in version set. if comapct and flush mem handle in multi-thread, may delete new data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个注释啥意思?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同问?


// Information for a manual compaction
enum ManualCompactState {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

跪求注释,这几种状态的定义

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已改

Copy link
Member

@elithnever elithnever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

缺少单测吧?

uint64_t bg_compaction_timeout_;
int64_t bg_schedule_id_;
std::vector<CompactionTask*> bg_compaction_tasks_;
int bg_compaction_scheduled_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bg_compaction_scheduled_不需要了吧? bg_compaction_score_可以计算出来?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已改

std::vector<CompactionTask*> bg_compaction_tasks_;
int bg_compaction_scheduled_;
std::vector<double> bg_compaction_score_;
std::vector<int64_t> bg_schedule_id_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bg_compaction_score_和bg_schedule_id_ 放在一个结构体里?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

score和id不是同时产生的,没有必要放到同一个结构体里面

const InternalKey* begin; // NULL means beginning of key range
const InternalKey* end; // NULL means end of key range
InternalKey tmp_storage; // Used to keep track of compaction progress
ManualCompactState compaction_conflict; // 0 == idle, 1 == conflict, 2 == wake
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compaction_conflict变量命名有点奇怪? 猜不出什么含义?

Copy link
Collaborator Author

@caijieming-ng caijieming-ng Jun 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

就是手动compact并发冲突

@@ -79,6 +79,13 @@ class MemTable {
empty_ = false;
}

bool BeingFlushed() { return being_flushed_;}
void SetBeingFlushed(bool flag) {
assert(flag ? !being_flushed_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么要加assert?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

放置后续同学改次块代码改错

@@ -314,6 +314,14 @@ struct Options {
// Default: 10 %
uint64_t del_percentage;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

注释中的default值是10, 然而实际的default值是20

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已改

@@ -746,13 +772,19 @@ Status DBImpl::CompactMemTable() {
Log(options_.info_log, "[%s] CompactMemTable SetLastSequence %lu",
dbname_.c_str(), edit.GetLastSequence());
s = versions_->LogAndApply(&edit, &mutex_);
pending_outputs_.erase(number);
} else {
// TODO: dump memtable error, delete temp file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

把TODO补上逻辑还是不需要TODO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已改

MutexLock lock(&mutex_);
Status s;
if (recover_mem_ == NULL) {
DeleteObsoleteFiles();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recover阶段需要增加这步吗?

}

bg_compaction_scheduled_ = false;
bg_compaction_scheduled_--;
std::vector<CompactionTask*>::iterator task_id = std::find(bg_compaction_tasks_.begin(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是代码格式乱了, 还是显示的乱了?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

显示问题

c->ReleaseInputs();
DeleteObsoleteFiles();
DeleteObsoleteFiles();// delete any sst not in version set. if comapct and flush mem handle in multi-thread, may delete new data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同问?

@@ -1118,6 +1216,7 @@ void DBImpl::CleanupCompaction(CompactionState* compact) {
for (size_t i = 0; i < compact->outputs.size(); i++) {
const CompactionState::Output& out = compact->outputs[i];
pending_outputs_.erase(BuildFullFileNumber(dbname_, out.number));
Log(options_.info_log, "[%s] erase pending_output #%lu", dbname_.c_str(), out.number);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

日志里最好都写filename, 便于追查问题, 另外这个日志会不会输出量较大?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

追查问题就是用sst number

@caijieming-ng
Copy link
Collaborator Author

db_test里面基本能覆盖多线程compact的测试

@taocp
Copy link
Collaborator

taocp commented Jun 7, 2017

LGTM

@caijieming-ng caijieming-ng force-pushed the compact_multithread branch from 17702ce to b45434e Compare July 5, 2017 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants