SparkFeaturesPipeline
- class sparklightautoml.pipelines.features.base.SparkFeaturesPipeline(**kwargs)[source]
Bases:
FeaturesPipeline, TransformerInputOutputRoles
Abstract class.
Analyzes the train dataset and creates a composite transformer based on a subset of features. An instance can be interpreted as a transformer (see LAMLTransformer) with delayed initialization (based on dataset metadata). The main method the user should define in a custom pipeline is .create_pipeline. For an example, see LGBSimpleFeatures. After a FeaturesPipeline instance is created, it is used like a transformer, with the .fit_transform and .transform methods.
- create_pipeline(train)[source]
Analyse dataset and create composite transformer.
- Parameters:
train (SparkDataset) – Dataset with train data.
- Return type:
Union[SparkBaseEstimator, SparkBaseTransformer, SparkUnionTransformer, SparkSequentialTransformer]
- Returns:
Composite transformer (pipeline).
- fit_transform(train)[source]
Create the pipeline, fit it on the train data, and then transform that data.
- Parameters:
train (SparkDataset) – Dataset with train data.
- Return type:
- Returns:
Dataset with new features.
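The delayed-initialization pattern described for this class can be illustrated with a minimal, library-free sketch. It does not use sparklightautoml or Spark: ToyDataset, ToyScaler, ToyFeaturesPipeline and NumericOnlyPipeline are hypothetical stand-ins invented for illustration, and the "num_" prefix rule for picking features is an assumption. The sketch only shows the control flow: create_pipeline inspects the train dataset and builds a transformer, fit_transform calls it and then fits and transforms, and transform reuses the fitted transformer.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# NOTE: every name below is a hypothetical stand-in, not part of sparklightautoml.
Row = Dict[str, float]


@dataclass
class ToyDataset:
    """Stand-in for a dataset: rows plus feature-name metadata."""
    rows: List[Row]
    features: List[str]


class ToyScaler:
    """Stand-in transformer: divides selected columns by their train maximum."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = columns
        self.max_: Dict[str, float] = {}

    def fit(self, ds: ToyDataset) -> "ToyScaler":
        for col in self.columns:
            # Fall back to 1.0 so an all-zero column does not divide by zero.
            self.max_[col] = max(abs(r[col]) for r in ds.rows) or 1.0
        return self

    def transform(self, ds: ToyDataset) -> ToyDataset:
        names = [f"scaled__{c}" for c in self.columns]
        rows = [
            {n: r[c] / self.max_[c] for n, c in zip(names, self.columns)}
            for r in ds.rows
        ]
        return ToyDataset(rows, names)


class ToyFeaturesPipeline:
    """Abstract sketch: the transformer is built only once train data is seen."""

    def __init__(self) -> None:
        self._transformer: Optional[ToyScaler] = None

    def create_pipeline(self, train: ToyDataset) -> ToyScaler:
        raise NotImplementedError("Define create_pipeline in a subclass.")

    def fit_transform(self, train: ToyDataset) -> ToyDataset:
        # Delayed initialization: build the transformer from dataset metadata.
        self._transformer = self.create_pipeline(train)
        return self._transformer.fit(train).transform(train)

    def transform(self, test: ToyDataset) -> ToyDataset:
        if self._transformer is None:
            raise RuntimeError("Call fit_transform before transform.")
        return self._transformer.transform(test)


class NumericOnlyPipeline(ToyFeaturesPipeline):
    """Custom pipeline: the user defines only create_pipeline."""

    def create_pipeline(self, train: ToyDataset) -> ToyScaler:
        # Assumed convention: numeric features are prefixed with "num_".
        numeric = [f for f in train.features if f.startswith("num_")]
        return ToyScaler(numeric)


train = ToyDataset(
    rows=[{"num_a": 2.0, "cat_b": 1.0}, {"num_a": 4.0, "cat_b": 0.0}],
    features=["num_a", "cat_b"],
)
test = ToyDataset(rows=[{"num_a": 1.0, "cat_b": 1.0}], features=["num_a", "cat_b"])

pipe = NumericOnlyPipeline()
train_out = pipe.fit_transform(train)  # builds, fits, and applies the transformer
test_out = pipe.transform(test)        # reuses statistics learned on train
```

In the real class, the object returned by create_pipeline would be one of the Spark estimator or transformer types listed under its return type, and the datasets would be SparkDataset instances; the toy classes here only mirror the order of calls.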