SparkBoostLGBM

class sparklightautoml.ml_algo.boost_lgbm.SparkBoostLGBM(default_params=None, freeze_defaults=True, timer=None, optimization_search_space=None, use_single_dataset_mode=True, max_validation_size=10000, chunk_size=4000000, convert_to_onnx=False, mini_batch_size=5000, seed=42, parallelism=1, use_barrier_execution_mode=False, experimental_parallel_mode=False, persist_output_dataset=True, computations_settings=None)

Bases: SparkTabularMLAlgo, ImportanceEstimator

Gradient boosting on decision trees from the LightGBM library.

default_params: All available parameters are listed in the synapse.ml documentation.

freeze_defaults:

  • True: params may be rewritten depending on dataset.

  • False: params may be changed only manually or with tuning.

timer: Timer instance or None.

fit_predict(train_valid_iterator)

Fit and then predict according to the strategy defined by train_valid_iterator.

If an item is used more than once, its prediction is the mean of its predictions. If an item is not used in training, its prediction is numpy.nan.

Parameters:

train_valid_iterator (SparkBaseTrainValidIterator) – Classic cv-iterator.

Return type:

SparkDataset

Returns:

Dataset with predicted values.
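
The averaging rule described for fit_predict can be illustrated with a small plain-Python sketch. This is not the library's Spark implementation. The function name average_fold_predictions and the per-fold dict layout are hypothetical, chosen only to show how repeated predictions are averaged and how items that never receive a prediction end up as NaN.

```python
import math


def average_fold_predictions(n_items, fold_predictions):
    """Combine per-fold predictions into one value per item.

    fold_predictions is a list with one dict per fold, mapping an
    item index to its predicted value (hypothetical layout, used
    here for illustration only).
    """
    sums = [0.0] * n_items
    counts = [0] * n_items
    for fold in fold_predictions:
        for idx, value in fold.items():
            sums[idx] += value
            counts[idx] += 1
    # Items predicted at least once get the mean of their predictions;
    # items that were never predicted get NaN.
    return [s / c if c else math.nan for s, c in zip(sums, counts)]


# Item 1 is predicted in both folds, item 3 in none.
folds = [{0: 0.25, 1: 0.75}, {1: 0.25, 2: 0.5}]
result = average_fold_predictions(4, folds)
```

Here item 1 receives the mean of 0.75 and 0.25, while item 3 stays NaN because no fold produced a prediction for it.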