Python API
get_numeric_roles_stat
Calculate statistics on the performance of different encodings.
These statistics are used to derive the rules for advanced role guessing. Applies to numeric data only.
train (SparkDataset) – Dataset.
subsample (Union[float, int, None]) – size of subsample.
random_state (int) – random seed.
manual_roles (Optional[Dict[str, ColumnRole]]) – dict mapping column names to manually assigned roles.
Returns: DataFrame.
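The subsample parameter accepts a float, an int, or None, but the reference above does not say how each form is interpreted. The following is a minimal sketch of one plausible convention (a float as a fraction of rows, an int as an absolute row count, None as no subsampling). The helper name resolve_subsample_size and this convention are assumptions made for illustration and are not part of the library.

```python
from typing import Union


def resolve_subsample_size(total_rows: int, subsample: Union[float, int, None]) -> int:
    """Hypothetical helper: turn a `subsample` argument into a row count.

    Assumed convention (NOT confirmed by the library documentation):
      - None  -> use every row
      - float -> fraction of rows, expected in (0, 1]
      - int   -> absolute number of rows, capped at total_rows
    """
    if subsample is None:
        return total_rows
    if isinstance(subsample, bool):
        # bool is a subclass of int; reject it explicitly to avoid surprises
        raise TypeError("subsample must be a float, an int, or None")
    if isinstance(subsample, float):
        if not 0.0 < subsample <= 1.0:
            raise ValueError("a float subsample must lie in (0, 1]")
        # keep at least one row, but never more than the dataset holds
        return min(total_rows, max(1, int(total_rows * subsample)))
    if subsample <= 0:
        raise ValueError("an int subsample must be positive")
    return min(total_rows, subsample)


if __name__ == "__main__":
    for value in (None, 0.5, 200, 5000):
        print(value, "->", resolve_subsample_size(1000, value))
```

Under this assumed convention, an int larger than the dataset falls back to the full dataset, so the statistics are computed on all rows instead of failing.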