TypeError: getaddrinfo() argument 1 must be string or None #312

@ajayrgb

Problem

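xgboost 2.1.0 changed the order of the RabitTracker constructor parameters:

In 2.0.3, host_ip is the first parameter.
In 2.1.0, host_ip is the second parameter (n_workers now comes first).

This breaks the positional call _RabitTracker(host, num_workers) in _start_rabit_tracker (xgboost_ray/main.py): under 2.1.0 the integer num_workers is bound to host_ip and passed on to socket.getaddrinfo().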
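The failure mode can be reproduced without xgboost installed. The sketch below uses two hypothetical stand-in functions, tracker_v203 and tracker_v210, that only mimic the parameter order described above and the getaddrinfo check visible in the traceback; they are not the real RabitTracker.

```python
import socket


def tracker_v203(host_ip, n_workers):
    """Hypothetical stand-in for the 2.0.3 order: host_ip first."""
    socket.getaddrinfo(host_ip, None)  # same early address check as the tracker
    return host_ip, n_workers


def tracker_v210(n_workers, host_ip):
    """Hypothetical stand-in for the 2.1.0 order: host_ip second."""
    socket.getaddrinfo(host_ip, None)
    return host_ip, n_workers


if __name__ == "__main__":
    host, num_workers = "127.0.0.1", 2

    # Positional call written against the old order works with the old order.
    print(tracker_v203(host, num_workers))

    # The same positional call against the new order binds the integer
    # num_workers to host_ip, so getaddrinfo receives an int.
    try:
        tracker_v210(host, num_workers)
    except TypeError as exc:
        print("TypeError:", exc)
```

The numeric address 127.0.0.1 is used so that the check does not depend on DNS.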

Steps to reproduce

  1. Create a new venv. Tested with python 3.10.13

  2. Install packages

pip install xgboost_ray==0.1.19 xgboost==2.1.0 scikit-learn ray[train]

  3. Run example:

from xgboost_ray import RayDMatrix, RayParams, train
from sklearn.datasets import load_breast_cancer

train_x, train_y = load_breast_cancer(return_X_y=True)
train_set = RayDMatrix(train_x, train_y)

evals_result = {}
bst = train(
    {
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
    },
    train_set,
    evals_result=evals_result,
    evals=[(train_set, "train")],
    verbose_eval=False,
    ray_params=RayParams(
        num_actors=2,  # Number of remote actors
        cpus_per_actor=1))

bst.save_model("model.xgb")
print("Final training error: {:.4f}".format(
    evals_result["train"]["error"][-1]))

Error

2024-07-09 15:35:11,003 INFO main.py:1191 -- [RayXGBoost] Starting XGBoost training.
Traceback (most recent call last):
  File "/home/jovyan/run.py", line 10, in <module>
    bst = train(
  File "/home/jovyan/venv/lib/python3.10/site-packages/xgboost_ray/main.py", line 1612, in train
    bst, train_evals_result, train_additional_results = _train(
  File "/home/jovyan/venv/lib/python3.10/site-packages/xgboost_ray/main.py", line 1194, in _train
    rabit_process, rabit_args = _start_rabit_tracker(alive_actors)
  File "/home/jovyan/venv/lib/python3.10/site-packages/xgboost_ray/main.py", line 261, in _start_rabit_tracker
    rabit_tracker = _RabitTracker(host, num_workers)
  File "/home/jovyan/venv/lib/python3.10/site-packages/xgboost/tracker.py", line 64, in __init__
    get_family(host_ip)  # use python socket to stop early for invalid address
  File "/home/jovyan/venv/lib/python3.10/site-packages/xgboost/tracker.py", line 14, in get_family
    return socket.getaddrinfo(addr, None)[0][0]
  File "/opt/conda/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
TypeError: getaddrinfo() argument 1 must be string or None

Proposed solution

Pin the xgboost dependency to <2.1.0

OR

change the _RabitTracker(host, num_workers) call in _start_rabit_tracker to pass keyword arguments:

rabit_tracker = _RabitTracker(host_ip=host, n_workers=num_workers)
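As a rough illustration of why the keyword form is independent of parameter order, the sketch below calls two hypothetical stand-in classes, TrackerOldOrder and TrackerNewOrder, through one helper. The class names and the start_tracker helper are assumptions made for this example; only the parameter names host_ip and n_workers come from the issue.

```python
class TrackerOldOrder:
    """Hypothetical stand-in with host_ip as the first parameter."""

    def __init__(self, host_ip, n_workers):
        self.host_ip = host_ip
        self.n_workers = n_workers


class TrackerNewOrder:
    """Hypothetical stand-in with host_ip as the second parameter."""

    def __init__(self, n_workers, host_ip):
        self.host_ip = host_ip
        self.n_workers = n_workers


def start_tracker(tracker_cls, host, num_workers):
    # Keyword arguments are matched by name, so the declared order
    # of the constructor parameters no longer matters.
    return tracker_cls(host_ip=host, n_workers=num_workers)


if __name__ == "__main__":
    for cls in (TrackerOldOrder, TrackerNewOrder):
        tracker = start_tracker(cls, "127.0.0.1", 2)
        print(cls.__name__, tracker.host_ip, tracker.n_workers)
```

The keyword call keeps working as long as both versions keep the same parameter names, which is the case for the two versions compared in this issue.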
