SparkTabularMLAlgo

class sparklightautoml.ml_algo.base.SparkTabularMLAlgo(default_params=None, freeze_defaults=True, timer=None, optimization_search_space=None, persist_output_dataset=True, computations_settings=None)

Bases: MLAlgo, TransformerInputOutputRoles, ABC

Machine learning algorithms that accept numpy arrays as input.

property features

Get list of features.

fit_predict_single_fold(fold_prediction_column, validation_column, train, runtime_settings=None)

Train on train dataset and predict on holdout dataset.

Parameters:
  • fold_prediction_column (str) – column name for predictions made for this fold

  • validation_column (str) – name of the column that flags whether a row belongs to the train or the validation part

  • train (SparkDataset) – dataset containing both train and val rows.

  • runtime_settings (Optional[Dict[str, Any]]) – settings important for parallelism and performance that can depend on the processes running at the moment

Return type:

Tuple[PipelineModel, DataFrame, str]

Returns:

Target predictions for the validation dataset.
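
The single-fold contract described above (one dataset holding both train and validation rows, a column that separates them, and a named column that receives the fold's predictions) can be illustrated with a minimal sketch. This is plain Python without Spark and is not the library's implementation: the function name `fit_predict_single_fold_sketch`, the `target_column` argument, the mean-value "model", and the assumption that a truthy value in the validation column marks a validation row are all hypothetical. The order of the returned tuple mirrors the documented return type only as an assumption.

```python
from statistics import mean
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def fit_predict_single_fold_sketch(
    fold_prediction_column: str,
    validation_column: str,
    train: List[Row],
    target_column: str = "target",  # hypothetical: not part of the documented signature
) -> Tuple[float, List[Row], str]:
    """Fit on rows not flagged as validation, predict on rows that are."""
    # Assumption: a truthy value in validation_column marks a validation row.
    train_rows = [row for row in train if not row[validation_column]]
    val_rows = [row for row in train if row[validation_column]]

    # Stand-in "model": the mean of the target over the training rows.
    model = mean(row[target_column] for row in train_rows)

    # Write the prediction into the requested column, for validation rows only.
    predictions = [{**row, fold_prediction_column: model} for row in val_rows]

    # Assumed ordering: (fitted model, predictions, prediction column name).
    return model, predictions, fold_prediction_column


rows = [
    {"x": 1, "target": 2.0, "is_val": 0},
    {"x": 2, "target": 4.0, "is_val": 0},
    {"x": 3, "target": 9.0, "is_val": 1},
]
model, val_preds, pred_col = fit_predict_single_fold_sketch("pred_fold_0", "is_val", rows)
```

Here only the third row is flagged as validation, so it is the only row that receives a value in `pred_fold_0`, and that value comes from a model that never saw its target.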