[curve/tools-v2]: add snapshot-clone-status #2346

Closed
Cyber-SiKu opened this issue Mar 30, 2023 · 12 comments
Labels: enhancement (improve feature), good first issue (good for newcomers)

Comments

@Cyber-SiKu
Contributor

Is your feature request related to a problem?

We'd like to support the bs status snapshot command in the curve tool.

  • The implementation of the old tool is here:
    int StatusTool::PrintSnapshotCloneStatus() {
        std::cout << "SnapshotCloneServer status:" << std::endl;
        if (noSnapshotServer_) {
            std::cout << "No SnapshotCloneServer" << std::endl;
            return 0;
        }
        std::string version;
        std::vector<std::string> failedList;
        int res = versionTool_->GetAndCheckSnapshotCloneVersion(&version,
                                                                &failedList);
        int ret = 0;
        if (res != 0) {
            std::cout << "GetAndCheckSnapshotCloneVersion fail" << std::endl;
            ret = -1;
        } else {
            std::cout << "version: " << version << std::endl;
            if (!failedList.empty()) {
                versionTool_->PrintFailedList(failedList);
                ret = -1;
            }
        }
        std::vector<std::string> activeAddrs = snapshotClient_->GetActiveAddrs();
        std::map<std::string, bool> onlineStatus;
        snapshotClient_->GetOnlineStatus(&onlineStatus);
        std::cout << "current snapshot-clone-server: " << activeAddrs << std::endl;
        PrintOnlineStatus("snapshot-clone-server", onlineStatus);
        return ret;
    }
  • The old command's input and output:
curve_ops_tool snapshot-clone-status

-------
output:
SnapshotCloneServer status:
version: ***
current snapshot-clone-server: ***
online snapshot-clone-server list: ***
offline snapshot-clone-server list: ***

Refer to the tool development guide to get started, and paste the command's output in the PR.

Build the compilation environment: https://github.com/opencurve/curve/blob/master/docs/cn/build_and_run.md

Describe the solution you'd like

Add the snapshot subcommand to curve bs status.

Describe alternatives you've considered

Additional context/screenshots

@zhanghuidinah
Member

Cyber-SiKu self-assigned this Apr 20, 2023
@Xinlong-Chen
Contributor

Xinlong-Chen commented Apr 21, 2023

@Cyber-SiKu I am interested in this task, please assign it to me, thanks.

@ilixiaocui
Contributor

@Xinlong-Chen Welcome!

@Xinlong-Chen
Contributor

Xinlong-Chen commented Apr 25, 2023

@Cyber-SiKu While trying to understand this task, I found that it is similar to the implementation of the curve bs status mds command.

But I have a question about the mds implementation.

In tools-v1, three RPCs are needed to get the status (in curve/src/tools/status_tool.cpp@PrintMdsStatus):

  1. versionTool_->GetAndCheckMdsVersion(&version, &failedList)
  2. mdsClient_->GetCurrentMds()
  3. mdsClient_->GetMdsOnlineStatus(&onlineStatus)

tools-v2 is different.

It uses only two RPCs (in curve/tools-v2/pkg/cli/command/curvebs/status/mds/mds.go@Init):

  1. statusMetric := basecmd.NewMetric(addrs, STATUS_SUBURI, timeout)
  2. versionMetric := basecmd.NewMetric(addrs, VERSION_SUBURI, timeout)

These two RPCs in tools-v2 correspond to the first two RPCs in tools-v1.

Why is it acceptable to drop the last RPC?

@Cyber-SiKu
Contributor Author

Cyber-SiKu commented Apr 25, 2023

GetCurrentMds

std::vector<std::string> MDSClient::GetCurrentMds() {
    std::vector<std::string> leaderAddrs;
    for (const auto item : dummyServerMap_) {
        // Get the status to determine which address is currently serving
        std::string status;
        MetricRet ret = metricClient_.GetMetric(item.second,
                                                kMdsStatusMetricName, &status);
        if (ret != MetricRet::kOK) {
            std::cout << "Get status metric from " << item.second
                      << " fail" << std::endl;
            continue;
        }
        if (status == kMdsStatusLeader) {
            leaderAddrs.emplace_back(item.first);
        }
    }
    return leaderAddrs;
}

GetMdsOnlineStatus

int MDSClient::GetListenAddrFromDummyPort(const std::string& dummyAddr,
                                          std::string* listenAddr) {
    assert(listenAddr != nullptr);
    MetricRet res = metricClient_.GetConfValueFromMetric(dummyAddr,
                        kMdsListenAddrMetricName, listenAddr);
    if (res != MetricRet::kOK) {
        return -1;
    }
    return 0;
}

void MDSClient::GetMdsOnlineStatus(std::map<std::string, bool>* onlineStatus) {
    assert(onlineStatus != nullptr);
    onlineStatus->clear();
    for (const auto item : dummyServerMap_) {
        std::string listenAddr;
        int res = GetListenAddrFromDummyPort(item.second, &listenAddr);
        // If the listen address obtained differs from the recorded mds
        // address, also treat the mds as offline
        if (res != 0 || listenAddr != item.first) {
            onlineStatus->emplace(item.first, false);
            continue;
        }
        onlineStatus->emplace(item.first, true);
    }
}

As the code shows, both functions essentially fetch a metric to judge the status of the server. The only difference is that v1 additionally checks whether the listen address reported through the dummy port matches the mds address in the configuration.

@Xinlong-Chen
Contributor

Xinlong-Chen commented Apr 25, 2023

Got it!

For the reasons above, we can also implement the bs status snapshot command with two RPCs.

Here are the corresponding metric strings I found for snapshot. Are they right?

STATUS_SUBURI  = "/vars/snapshotcloneserver_status"
VERSION_SUBURI = "/vars/curve_version"

@Cyber-SiKu
Contributor Author

It should be like this; you can refer to the old code. It seems that there are differences between versions.

@Xinlong-Chen
Contributor

I also have a question: how can I test it?
When I run the tools-v1 command curve_ops_tool snapshot-clone-status in Docker (using the reference image), I get the following result, which confuses me.

root@f8df310173eb:/# curve_ops_tool snapshot-clone-status
SnapshotCloneServer status:
No SnapshotCloneServer

Do I need to take further steps to deploy the snapshot service in order to get the expected result, as follows?

curve_ops_tool snapshot-clone-status

-------
output:
SnapshotCloneServer status:
version: ***
current snapshot-clone-server: ***
online snapshot-clone-server list: ***
offline snapshot-clone-server list: ***

@Cyber-SiKu
Contributor Author

Currently, snapshot-related commands cannot be used with a cluster deployed by curveadm playground (there is no S3, so there is no snapshot).
You need to deploy MinIO first.
Then follow the deployment document to deploy a stand-alone cluster. You can skip formatting the disk; then fill in these configuration items in topology.yaml:

kind: curvebs
global:
...
  s3.nos_address: <>           # ip:9000, the IP and port where MinIO is deployed
  s3.snapshot_bucket_name: <>  # name of the bucket you created
  s3.ak: <>                    # access key: minioadmin
  s3.sk: <>                    # secret key: minioadmin
...

chunkserver_services:
  config:
...
    copysets: 100
    chunkfilepool.enable_get_chunk_from_pool: false
  deploy:
    - host: ${target}
    - host: ${target}
    - host: ${target}
...

@Xinlong-Chen
Contributor

It seems that both the address and the dummy address should be supplied by the user (or a file).
In tools-v1 (mds):

int MDSClient::Init(const std::string& mdsAddr,
                    const std::string& dummyPort) {
    if (isInited_) {
        return 0;
    }
    // Initialize the channel
    curve::common::SplitString(mdsAddr, ",", &mdsAddrVec_);
    if (mdsAddrVec_.empty()) {
        std::cout << "Split mds address fail!" << std::endl;
        return -1;
    }
    int res = InitDummyServerMap(dummyPort);
    if (res != 0) {
        std::cout << "init dummy server map fail!" << std::endl;
        return -1;
    }
    for (uint64_t i = 0; i < mdsAddrVec_.size(); ++i) {
        if (channel_.Init(mdsAddrVec_[i].c_str(), nullptr) != 0) {
            std::cout << "Init channel to " << mdsAddr << "fail!" << std::endl;
            continue;
        }
        // Find which mds is alive
        curve::mds::topology::ListPhysicalPoolRequest request;
        curve::mds::topology::ListPhysicalPoolResponse response;
        curve::mds::topology::TopologyService_Stub stub(&channel_);
        brpc::Controller cntl;
        cntl.set_timeout_ms(FLAGS_rpcTimeout);
        stub.ListPhysicalPool(&cntl, &request, &response, nullptr);
        if (cntl.Failed()) {
            continue;
        }
        currentMdsIndex_ = i;
        isInited_ = true;
        return 0;
    }
    std::cout << "Init channel to all mds fail!" << std::endl;
    return -1;
}

In tools-v1 (snapshot):

int SnapshotCloneClient::Init(const std::string& serverAddr,
                              const std::string& dummyPort) {
    curve::common::SplitString(serverAddr, ",", &serverAddrVec_);
    if (serverAddrVec_.empty()) {
        // no snapshot clone server
        return 1;
    }
    int res = InitDummyServerMap(dummyPort);
    if (res != 0) {
        std::cout << "init dummy server map fail!" << std::endl;
        return -1;
    }
    return 0;
}

In tools-v2 (mds):

func AddBsUint64RequiredFlag(cmd *cobra.Command, name string, usage string) {
	cmd.Flags().Uint64(name, uint64(0), usage+color.Red.Sprint("[required]"))
	cmd.MarkFlagRequired(name)
	err := viper.BindPFlag(BSFLAG2VIPER[name], cmd.Flags().Lookup(name))
	if err != nil {
		cobra.CheckErr(err)
	}
}

// add flag option
// bs mds[option]
func AddBsMdsFlagOption(cmd *cobra.Command) {
	AddBsStringOptionFlag(cmd, CURVEBS_MDSADDR, "mds address, should be like 127.0.0.1:6700,127.0.0.1:6701,127.0.0.1:6702")
}

func AddBsMdsDummyFlagOption(cmd *cobra.Command) {
	AddBsStringOptionFlag(cmd, CURVEBS_MDSDUMMYADDR, "mds dummy address, should be like 127.0.0.1:6700,127.0.0.1:6701,127.0.0.1:6702")
}

I guess these strings come from user input or from a config file such as template.yaml:

mdsAddr: 127.0.0.1:6700,127.0.0.1:6701,127.0.0.1:6702
mdsDummyAddr: 127.0.0.1:7700,127.0.0.1:7701,127.0.0.1:7702
etcdAddr: 127.0.0.1:23790,127.0.0.1:23791, 127.0.0.1:23792

How can I get the address and dummy address in the tools-v2 snapshot implementation? (Add two key-value pairs for snapshot to template.yaml?)

@Cyber-SiKu
Contributor Author

Yes. You can refer to how etcd or mds obtain their configuration: the values can be read from the configuration file or specified by a flag.
