SparkFeaturesPipeline
- class sparklightautoml.pipelines.features.base.SparkFeaturesPipeline(**kwargs)[source]
Bases: FeaturesPipeline, TransformerInputOutputRoles
Abstract class.
Analyzes the train dataset and creates a composite transformer based on a subset of features. An instance can be interpreted as a Transformer (see LAMLTransformer) with delayed initialization (based on dataset metadata). The main method a user should define in a custom pipeline is .create_pipeline; for an example, look at LGBSimpleFeatures. After a FeaturesPipeline instance is created, it is used like a transformer via its .fit_transform and .transform methods.
- create_pipeline(train)[source]
Analyse dataset and create composite transformer.
- Parameters:
train (SparkDataset) – Dataset with train data.
- Return type:
Union[SparkBaseEstimator, SparkBaseTransformer, SparkUnionTransformer, SparkSequentialTransformer]
- Returns:
Composite transformer (pipeline).
- fit_transform(train)[source]
Create the pipeline, fit it on the train data, then transform.
- Parameters:
train (SparkDataset) – Dataset with train data.
- Return type:
SparkDataset
- Returns:
Dataset with new features.
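The delayed-initialization pattern described above can be sketched in plain Python. This is an illustration only, not the real sparklightautoml API: the classes MyFeaturesPipeline and FillnaTransformer are hypothetical stand-ins, and plain dict rows stand in for a SparkDataset. The point is the control flow: create_pipeline inspects dataset metadata to build a transformer, and fit_transform builds, fits, and applies it in one call.

```python
# Illustrative sketch only: plain-Python stand-ins for the Spark classes.
# MyFeaturesPipeline and FillnaTransformer are hypothetical names.

class FillnaTransformer:
    """Toy transformer: replaces None in the given columns with a fill value."""

    def __init__(self, columns):
        self.columns = columns
        self.fill_value = 0

    def fit(self, rows):
        # A real transformer would estimate parameters from the data here.
        return self

    def transform(self, rows):
        return [
            {k: (self.fill_value if k in self.columns and v is None else v)
             for k, v in row.items()}
            for row in rows
        ]


class MyFeaturesPipeline:
    """Pipeline with delayed initialization: the composite transformer is
    built from dataset metadata only when fit_transform is called."""

    def create_pipeline(self, train):
        # Analyse the dataset (here: just its column names) and build
        # the transformer based on that metadata.
        numeric_cols = [c for c in train[0] if c.startswith("num_")]
        return FillnaTransformer(numeric_cols)

    def fit_transform(self, train):
        self._pipeline = self.create_pipeline(train)
        self._pipeline.fit(train)
        return self._pipeline.transform(train)

    def transform(self, data):
        # Reuses the transformer created during fit_transform.
        return self._pipeline.transform(data)


rows = [{"num_a": 1, "cat_b": "x"}, {"num_a": None, "cat_b": "y"}]
out = MyFeaturesPipeline().fit_transform(rows)
print(out)  # None in the num_a column is replaced with 0
```

In the real library, create_pipeline returns a composite Spark transformer (one of the union/sequential types listed above) rather than a single toy object, but the build-on-first-fit flow is the same.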