[BUG] LightGBM | Required time to fit is too long #2226

victor-cattani-beegol · 2024-05-14T16:40:38Z

SynapseML version

1.0.4

System information

Language version (e.g. python 3.8, scala 2.12):
Spark Version (e.g. 3.5.0):
Spark Platform (e.g. Synapse, Databricks):

Describe the problem

Hello, folks!

I am having trouble training a LightGBM model. Training progresses up to a certain point and then stalls. No error is reported; the job simply stops making progress and stays at the same stage indefinitely. As the code below shows, I have been using parameters such as numTasks, numThreads, numBatches, useSingleDatasetMode and useBarrierExecutionMode to try to improve fit performance.

My dataset has about 418 million rows for training and 18 million for validation. I am working with about 21 features: 10 are categorical and the rest are continuous.

Databricks cluster configuration:

--- Single node
--- 256 GB RAM | 32 cores

Do you have any idea why I am running into this issue?
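For a sense of scale, the figures in this report can be turned into a rough sizing estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes every feature is held as a 64-bit float and that numTasks and numBatches (the values used in the reproduction code) divide the training rows evenly. Neither assumption is confirmed for this SynapseML version, and the real in-memory footprint will differ.

```python
# Back-of-the-envelope sizing from the numbers given in this report.
TRAIN_ROWS = 418_000_000   # "about 418 million" training rows
N_FEATURES = 21            # "about 21 features"
NUM_TASKS = 32             # numTasks passed to LightGBMRegressor
NUM_BATCHES = 500          # numBatches passed to LightGBMRegressor
NODE_RAM_GB = 256          # memory of the single-node cluster

BYTES_PER_VALUE = 8        # ASSUMPTION: each feature value is a 64-bit float


def dense_size_gb(rows, features, bytes_per_value=BYTES_PER_VALUE):
    """Size in GB (10**9 bytes) of a dense rows x features matrix."""
    return rows * features * bytes_per_value / 1e9


train_gb = dense_size_gb(TRAIN_ROWS, N_FEATURES)
rows_per_task = TRAIN_ROWS / NUM_TASKS      # ASSUMPTION: even split across tasks
rows_per_batch = TRAIN_ROWS / NUM_BATCHES   # ASSUMPTION: even split across batches

print(f"dense training matrix: ~{train_gb:.1f} GB of {NODE_RAM_GB} GB RAM")
print(f"rows per task:  ~{rows_per_task:,.0f}")
print(f"rows per batch: ~{rows_per_batch:,.0f}")
```

Under these assumptions the raw feature matrix alone is roughly 70 GB, so any step that holds more than one copy of the data takes a sizeable share of the node's 256 GB.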

Code to reproduce issue

from synapse.ml.lightgbm import LightGBMRegressor

# Tuned hyperparameters for the regressor
dic_params_reg_model_0 = {
    'learningRate': 0.10686341357711826,
    'featureFraction': 0.9064118023259887,
    'maxBin': 5,
    'minDataInLeaf': 6,
    'numIterations': 53,
    'numLeaves': 147,
    'lambdaL2': 45.405492626469716,
    'lambdaL1': 0.0015480184927416942,
}

# Rows flagged in 'validation_col' are used for validation and early stopping
model_cluster_0 = LightGBMRegressor(
    metric='mae', earlyStoppingRound=1, labelCol='target',
    dataTransferMode='streaming', numTasks=32, numThreads=32,
    validationIndicatorCol='validation_col', numBatches=500,
    useSingleDatasetMode=True, useBarrierExecutionMode=True,
).setParams(**dic_params_reg_model_0).fit(train_0)
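One way to narrow down whether the fit is merely slow or actually stuck is to time it on progressively larger samples of the data. The helper below is a minimal sketch of that idea and is not part of SynapseML. `time_fits` and `estimator` are hypothetical names, and the commented usage assumes the estimator is built as in the code above but without the final `.fit(train_0)` call.

```python
import time
from typing import Callable, Dict, Iterable


def time_fits(fit_on_fraction: Callable[[float], object],
              fractions: Iterable[float]) -> Dict[float, float]:
    """Call fit_on_fraction once per fraction and record wall-clock seconds."""
    timings = {}
    for fraction in fractions:
        start = time.perf_counter()
        fit_on_fraction(fraction)
        timings[fraction] = time.perf_counter() - start
    return timings


# Hypothetical usage with the objects from this report (needs a Spark session):
# timings = time_fits(
#     lambda f: estimator.fit(train_0.sample(fraction=f, seed=0)),
#     [0.001, 0.01, 0.1],
# )
# for fraction, seconds in timings.items():
#     print(f"{fraction:>6}: {seconds:.1f} s")
```

If the timings grow roughly in proportion to the sample size, the full fit is probably just slow. A run that never returns even on a small sample points toward a hang.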

Other info / logs

Spark Configuration:

spark.master local[*, 8]
spark.databricks.cluster.profile singleNode
spark.driver.maxResultSize 150g
spark.jars.repositories https://mmlspark.azureedge.net/maven
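For readers unfamiliar with the master setting: in Spark's master URL syntax, local[K,F] runs Spark locally with K worker threads and F allowed task failures, and * stands for the number of logical cores on the machine. The sketch below splits such a string into its two parts. `parse_local_master` is a hypothetical helper written for illustration and is not Spark's own parser.

```python
import re
from typing import Optional, Tuple


def parse_local_master(master: str) -> Optional[Tuple[str, Optional[int]]]:
    """Split a Spark local master URL into (threads, max_failures).

    Returns None when the string is not a local master URL.
    """
    pattern = r"local(?:\[\s*(\*|\d+)\s*(?:,\s*(\d+)\s*)?\])?"
    match = re.fullmatch(pattern, master.strip())
    if match is None:
        return None
    threads = match.group(1) or "1"  # plain "local" means a single worker thread
    failures = int(match.group(2)) if match.group(2) else None
    return threads, failures


print(parse_local_master("local[*, 8]"))  # ('*', 8)
```

Read this way, the configuration above asks for one worker thread per logical core and allows up to 8 failures per task.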

What component(s) does this bug affect?

What language(s) does this bug affect?

language/scala: Scala source code
language/python: Pyspark APIs
language/r: R APIs
language/csharp: .NET APIs
language/new: Proposals for new client languages

What integration(s) does this bug affect?

integrations/synapse: Azure Synapse integrations
integrations/azureml: Azure ML integrations
integrations/databricks: Databricks integrations


victor-cattani-beegol added the bug label May 14, 2024

github-actions bot added the triage label May 14, 2024
