BucketedPersistenceManager

class sparklightautoml.dataset.persistence.BucketedPersistenceManager(bucketed_datasets_folder, bucket_nums=100, parent=None, no_unpersisting=False)

Bases: BasePersistenceManager

Manager that uses the Spark warehouse folder to store bucketed datasets (.bucketBy … .sortBy … .saveAsTable). To make this storage reliable, set 'spark.sql.warehouse.dir' to HDFS or another reliable storage location.
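The write pattern named above can be sketched with the standard PySpark DataFrameWriter API. This is a minimal illustration under stated assumptions, not the manager's actual implementation: the helper names build_session and save_bucketed, the key column, the table name, and the warehouse path are all hypothetical.

```python
def build_session(warehouse_dir):
    """Create a SparkSession whose managed tables live under warehouse_dir.

    warehouse_dir is a hypothetical location; for reliable storage it
    would point at HDFS or a similar durable file system.
    """
    # Imported lazily so the rest of this sketch loads without a Spark install.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("bucketed-persistence-sketch")
        .config("spark.sql.warehouse.dir", warehouse_dir)
        .getOrCreate()
    )


def save_bucketed(df, table_name, key_col, bucket_nums=100):
    """Write df as a bucketed, sorted managed table and return the table name.

    key_col is an assumed bucketing column; the default of 100 mirrors the
    bucket_nums default in the class signature.
    """
    (
        df.write
        .mode("overwrite")               # replace the table if it already exists
        .bucketBy(bucket_nums, key_col)  # hash rows into bucket_nums buckets by key_col
        .sortBy(key_col)                 # sort rows inside each bucket by key_col
        .saveAsTable(table_name)         # bucketed writes require saveAsTable
    )
    return table_name
```

With a running Spark installation, one might call build_session("hdfs:///path/to/warehouse"), create a DataFrame, and pass it to save_bucketed(df, "my_bucketed_table", "id"). The table can then be read back with spark.table("my_bucketed_table"). Spark supports bucketBy only together with saveAsTable, which is why the data lands in the warehouse folder.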