sparklightautoml.reader

Utils for reading, training and analysing data.

Readers

SparkToSparkReader

Reader to convert a Spark DataFrame to AutoML's SparkDataset.

SparkToSparkReaderTransformer

Transformer of SparkToSparkReader.

SparkReaderHelper

Helper class that provides shared methods for SparkToSparkReader and SparkToSparkReaderTransformer.

Utility functions for advanced roles guessing

get_category_roles_stat

Search for optimal processing of categorical values.

get_gini_func

Returns a generator function that takes an iterator over pandas DataFrames and yields DataFrames with the calculated Gini scores.
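The "factory that returns a generator function over batches" shape can be sketched in plain Python. This is an illustration only, not the library's code: `make_gini_func`, `pairwise_gini` and the `Batch` alias are hypothetical names, and a dict of column lists stands in for a pandas DataFrame so that the sketch runs without pandas or Spark. In PySpark, a function of this shape (iterator of pandas DataFrames in, iterator of pandas DataFrames out) is what `DataFrame.mapInPandas` accepts.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical stand-in for a pandas DataFrame: column name -> list of values.
Batch = Dict[str, List[float]]


def pairwise_gini(target: List[float], feature: List[float]) -> float:
    """Normalized Gini (2 * AUC - 1) of one feature against a binary target."""
    pos = [f for t, f in zip(target, feature) if t == 1]
    neg = [f for t, f in zip(target, feature) if t == 0]
    if not pos or not neg:
        # Undefined with a single class; this sketch treats it as uninformative.
        return 0.0
    # Count positive/negative pairs ranked correctly; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return 2.0 * wins / (len(pos) * len(neg)) - 1.0


def make_gini_func(target_col: str) -> Callable[[Iterable[Batch]], Iterator[Batch]]:
    """Build a generator function that scores every feature column of each batch."""

    def gini_func(batches: Iterable[Batch]) -> Iterator[Batch]:
        for batch in batches:
            target = batch[target_col]
            # One single-row output batch per input batch: feature name -> [score].
            yield {
                name: [pairwise_gini(target, values)]
                for name, values in batch.items()
                if name != target_col
            }

    return gini_func


batches = [
    {"y": [0, 0, 1, 1], "a": [0.1, 0.2, 0.8, 0.9], "b": [0.9, 0.8, 0.2, 0.1]},
    {"y": [0, 1], "a": [0.5, 0.5], "b": [0.0, 1.0]},
]
scores = list(make_gini_func("y")(iter(batches)))
```

Because the scoring function is built once and closes over its configuration (here only `target_col`), the same generator can be applied to each partition of a larger dataset without passing that configuration separately.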

get_null_scores

Get null scores.

get_numeric_roles_stat

Calculate statistics on the performance of different encodings.

get_score_from_pipe

Get normalized gini index from pipeline.
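For a binary target, the normalized Gini index is commonly defined as 2 * AUC - 1, so it ranges from -1 (perfectly inverted ranking) through 0 (random) to 1 (perfect ranking). The sketch below shows that relationship in plain Python. It is an assumption-laden illustration, not the implementation behind get_score_from_pipe: `roc_auc` and `normalized_gini` are hypothetical helper names, and only the binary case is covered.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def normalized_gini(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Rescale AUC from [0, 1] to [-1, 1]."""
    return 2.0 * roc_auc(y_true, y_score) - 1.0


perfect = normalized_gini([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
partial = normalized_gini([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
constant = normalized_gini([0, 1], [0.5, 0.5])
```

The pairwise loop is quadratic in the number of rows, which is fine for illustration; a rank-based AUC would be the usual choice for real data.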