utils.reduction
Functions
collect_features | Collects the expanded features from a dictionary of features.
find_sparsity | Finds the sparsity of a column of data.
get_valid_patterns | Finds the features which encode layer data with patterns.
has_concentrations | Returns True if the given string has concentrations in it.
is_pattern | Returns True if the given string is a pattern.
is_valid_pattern | Returns a mask of the features which encode layer data with patterns.
matches_regex | Returns True if the given string matches the given regex string.
partition_by_pattern | Given a subset of features, partition the features into a patterned and non-patterned set.
passes_sparsity | Checks if a column of data passes the sparsity threshold.
prune_by_sparsity | Prunes a dictionary of features by their sparsity.
reduce_data | Given a dataset return the columns which pass the given sparsity threshold.
remove_features | Removes features from a dictionary of features.
section_features | Gets the features for a given section from a reference dictionary.
sort_by_sparsity | Sorts a dictionary of features by their sparsity.
- utils.reduction.collect_features(features)
Collects the expanded features from a dictionary of features.
- Parameters:
features (dict) – Dictionary of features.
- Returns:
List of expanded features.
- Return type:
list
- utils.reduction.find_sparsity(column)
Finds the sparsity of a column of data.
- Parameters:
column (series) – Column to be checked.
- Returns:
The sparsity of the data.
- Return type:
float
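The entry above does not say how sparsity is measured. As a minimal sketch, assuming sparsity means the fraction of missing (NaN/None) entries in a pandas Series, the calculation might look like this. The name `sparsity_sketch` is hypothetical and this is not the library's implementation.

```python
import pandas as pd


def sparsity_sketch(column: pd.Series) -> float:
    """Return the fraction of missing entries in a column (assumed definition)."""
    if len(column) == 0:
        # Assumption: an empty column is reported as not sparse.
        return 0.0
    # isna() marks NaN/None entries; the mean of the boolean mask is their share.
    return float(column.isna().mean())


col = pd.Series([1.0, None, 3.0, None])
print(sparsity_sketch(col))
```

If the real function instead reports the fraction of populated entries, the result would be one minus this value.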
- utils.reduction.get_valid_patterns(ref, return_invalid: bool = True)
Finds the features which encode layer data with patterns. Returns the feature names.
- Parameters:
ref (dataframe) – The reference data for features.
return_invalid (bool) – If True, also return a second list containing the names of the features which do not contain patterns. Defaults to True.
- Returns:
Patterned feature names. If return_invalid is True, a second list with the non-patterned feature names is also returned.
- Return type:
list
- utils.reduction.has_concentrations(string)
Returns True if the given string has concentrations in it.
Used to check which features have concentrations encoded in them.
- utils.reduction.is_pattern(string)
Returns True if the given string is a pattern.
Used to check which features have layer data encoded in them.
- Examples of valid patterns:
“[Mat.1; Mat.2; … | Mat.3; … | Mat.4 | …]”
“[Gas1; Gas2 >> Gas3; … >> … | Gas4 >> … | Gas5 | … ]”
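The exact rule behind is_pattern is not given here. The sketch below shows one way such a check could work, assuming a pattern is a string wrapped in square brackets that uses "|" to separate layers, as in the examples above. The name `looks_like_pattern` and the regular expression are illustrative assumptions, and the sample strings use ASCII "..." in place of the ellipsis character.

```python
import re

# Assumed rule: "[" ... "|" ... "]" spanning the whole string.
# The real is_pattern may apply a stricter or different check.
PATTERN_SKETCH = re.compile(r"\[.*\|.*\]")


def looks_like_pattern(string: str) -> bool:
    """Return True if the string resembles a layer pattern (assumed rule)."""
    return PATTERN_SKETCH.fullmatch(string.strip()) is not None


print(looks_like_pattern("[Mat.1; Mat.2; ... | Mat.3; ... | Mat.4 | ...]"))
print(looks_like_pattern("Thickness in nm"))
```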
- utils.reduction.is_valid_pattern(ref)
Returns a mask of the features which encode layer data with patterns.
- Parameters:
ref (dataframe) – The reference data for features.
- Returns:
A mask of the features which encode layer data with patterns.
- Return type:
series
- utils.reduction.matches_regex(rgx, string)
Returns True if the given string matches the given regex string.
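The description does not state whether the whole string must match or whether a match anywhere in the string is enough. The sketch below assumes whole-string matching via `re.fullmatch`; `matches_regex_sketch` is a hypothetical stand-in, not the library's code.

```python
import re


def matches_regex_sketch(rgx: str, string: str) -> bool:
    """Return True if the entire string matches the regex (assumed semantics)."""
    # Assumption: whole-string match. Use re.search instead for "contains" semantics.
    return re.fullmatch(rgx, string) is not None


print(matches_regex_sketch(r"\d+ nm", "25 nm"))
print(matches_regex_sketch(r"\d+ nm", "about 25 nm"))
```

With `re.search`, the second call would also succeed, so the choice between the two affects which feature descriptions are treated as matches.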
- utils.reduction.partition_by_pattern(refs, keys)
Given a subset of features, partition the features into a patterned and non-patterned set.
- Parameters:
refs (dict of dataframe) – A set of references for each section of features. Sections include “Hole transport layer”, “The perovskite”, etc.
keys (list) – List of section names to be included in the partitioning.
- Returns:
Two lists: the patterned features and the non-patterned features.
- Return type:
list
- utils.reduction.passes_sparsity(column, percent=0.0)
Checks if a column of data passes the sparsity threshold.
- Parameters:
column (series) – Column to be checked.
percent (float) – Percentile threshold for the sparsity of the data. Defaults to 0.0.
- Returns:
True if the data sparsity passes the threshold. False otherwise.
- Return type:
bool
- utils.reduction.prune_by_sparsity(features, threshold)
Prunes a dictionary of features by their sparsity.
- Parameters:
features (dict) – Dictionary of features to be pruned.
threshold (float) – Threshold for the sparsity of the data.
- Returns:
Pruned dictionary of features.
- Return type:
dict
- utils.reduction.reduce_data(data, percent=0.0)
Given a dataset return the columns which pass the given sparsity threshold.
- Parameters:
data (dataframe) – Data to be reduced.
percent (float) – Percentile threshold for the sparsity of the data.
- Returns:
The reduced data.
- Return type:
dataframe
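The direction of the threshold is not spelled out in this entry. The following sketch assumes that sparsity is the fraction of missing values in a column and that a column passes when this fraction is at most `percent`. The function name `reduce_data_sketch` and the sample column names are hypothetical.

```python
import pandas as pd


def reduce_data_sketch(data: pd.DataFrame, percent: float) -> pd.DataFrame:
    """Keep the columns whose missing fraction is at most `percent` (assumed rule)."""
    # Fraction of missing values per column (assumed meaning of "sparsity").
    missing_fraction = data.isna().mean()
    # Boolean mask indexed by column name.
    keep = missing_fraction <= percent
    return data.loc[:, keep]


data = pd.DataFrame({
    "dense": [1.0, 2.0, 3.0, 4.0],
    "half": [1.0, None, 3.0, None],
    "empty": [None, None, None, None],
})
reduced = reduce_data_sketch(data, percent=0.5)
print(list(reduced.columns))
```

Under this assumption a threshold of 0.0 keeps only fully populated columns; if the library measures the populated fraction instead, the comparison would be reversed.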
- utils.reduction.remove_features(features, remove)
Removes features from a dictionary of features.
- Parameters:
features (dict) – Dictionary of features.
remove (list) – List of features to be removed.
- Returns:
Dictionary of features without the removed features.
- Return type:
dict
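The layout of the feature dictionary is not documented on this page. As a rough sketch, assuming its keys are feature names and its values are the expanded column names for each feature, removal could be written as a dictionary comprehension. `remove_features_sketch` and the sample feature names are hypothetical.

```python
def remove_features_sketch(features: dict, remove: list) -> dict:
    """Return a copy of `features` without the keys listed in `remove`."""
    to_drop = set(remove)  # set lookup keeps the filter fast for long lists
    # Build a new dict so the caller's dictionary is left unchanged.
    return {name: value for name, value in features.items() if name not in to_drop}


# Hypothetical layout: feature name -> list of expanded column names.
features = {
    "Cell_area": ["Cell_area"],
    "ETL_stack": ["ETL_stack_TiO2", "ETL_stack_SnO2"],
}
print(remove_features_sketch(features, ["ETL_stack"]))
```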
- utils.reduction.section_features(sections, ref)
Gets the features for a given section from a reference dictionary.
- Parameters:
sections (list) – List of sections to be included.
ref (dict) – The reference data for features.
- Returns:
List of features for the given sections.
- Return type:
list
- utils.reduction.sort_by_sparsity(features)
Sorts a dictionary of features by their sparsity.
- Parameters:
features (dict) – Dictionary of features to be sorted.
- Returns:
Sorted dictionary of features.
- Return type:
dict