utils.preprocess
Functions
preprocess_data | Runs the entire preprocessing pipeline for the given data.
- utils.preprocess.preprocess_data(data, ref, threshold, depth, sections=[], exclude_cols=[], nan_equivalents={}, verbosity: int = 0)
Runs the entire preprocessing pipeline for the given data.
- Parameters:
data (dataframe) – The perovskite data.
ref (dataframe) – The reference data for features.
threshold (float) – Threshold (%) for the feature density. Used to remove sparse data.
depth (float) – Threshold (%) for the feature layer density. Determines how many feature layers are extracted.
sections (list of str, optional) – List of sections to be included. Defaults to [].
exclude_cols (list of str, optional) – List of columns to be excluded. Defaults to [].
nan_equivalents (dict, optional) – Equivalent values for NaN in the dataset. Defaults to {}.
verbosity (int, optional) – Verbosity level. Defaults to 0.
- Returns:
The preprocessed data.
- Return type:
dataframe
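The density threshold described above can be illustrated with a small pandas sketch. This is not the implementation of preprocess_data, which is not shown here. The helper name drop_sparse_columns, the example column names, and the assumed nan_equivalents layout ({column: [placeholder values]}) are all hypothetical, and "density" is assumed to mean the percentage of non-missing values per column.

```python
import pandas as pd


def drop_sparse_columns(df, threshold, nan_equivalents=None, exclude_cols=None):
    """Hypothetical helper: keep columns whose non-missing share is >= threshold (%)."""
    nan_equivalents = nan_equivalents or {}
    exclude_cols = list(exclude_cols or [])

    # Drop explicitly excluded columns; drop() returns a new frame,
    # so the caller's dataframe is left untouched.
    cleaned = df.drop(columns=exclude_cols, errors="ignore")

    # Treat placeholder values as missing.
    # Assumed layout: {column name: [values equivalent to NaN]}.
    for col, placeholders in nan_equivalents.items():
        if col in cleaned.columns:
            is_placeholder = cleaned[col].isin(list(placeholders))
            cleaned[col] = cleaned[col].mask(is_placeholder)

    # Density = percentage of non-missing entries in each column.
    density = cleaned.notna().mean() * 100

    # Keep only the columns that meet the threshold.
    return cleaned.loc[:, density >= threshold]


# Illustrative data only; the column names are made up for this example.
data = pd.DataFrame(
    {
        "Cell_area": [0.1, 0.2, None, 0.4],
        "Perovskite_composition": ["MAPbI3", "Unknown", "Unknown", "Unknown"],
        "Ref_ID": [1, 2, 3, 4],
    }
)

result = drop_sparse_columns(
    data,
    threshold=50,
    nan_equivalents={"Perovskite_composition": ["Unknown"]},
    exclude_cols=["Ref_ID"],
)
print(list(result.columns))
```

In this sketch, Cell_area is 75% populated and is kept, Perovskite_composition is only 25% populated once "Unknown" is treated as missing and is dropped, and Ref_ID is removed because it was excluded. How preprocess_data itself interprets threshold, depth, and nan_equivalents may differ.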