Correlation Matrix
This transform checks how strongly each variable correlates with the target variable, which helps you identify highly correlated features during feature engineering. It applies only to numerical columns.
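To make the idea concrete, here is a minimal sketch of what a correlation matrix over numerical columns looks like. It is not the transform's implementation: the source does not state which correlation method the transform uses, so the Pearson coefficient is an assumption, and the `pearson` and `correlation_matrix` helpers, the column names, and the data are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {col_a: {col_b: r}} for every pair of numerical columns."""
    # Keep only columns whose values are all numbers; others are skipped.
    numeric = {
        name: values
        for name, values in columns.items()
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)
    }
    return {a: {b: pearson(numeric[a], numeric[b]) for b in numeric} for a in numeric}

# Hypothetical dataset: "price" plays the role of the target variable.
data = {
    "rooms": [2, 3, 4, 5],
    "age": [40, 30, 20, 10],
    "city": ["A", "B", "A", "C"],  # non-numerical, so it is left out
    "price": [100, 150, 200, 250],
}

matrix = correlation_matrix(data)
for feature, r in matrix["price"].items():
    print(f"{feature:>6} vs price: {r:+.2f}")
```

In this toy data, `rooms` rises with `price` (positive correlation) and `age` falls as `price` rises (negative correlation), while the text column `city` never enters the matrix.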
Parameters
The following briefly describes each parameter of the Correlation Matrix transform.
- Name:
By default, the transform name is populated. You can also add a custom name for the transform.
- Raw Dataset:
The file name of the input dataset. Select an uploaded dataset from the drop-down list to check whether each variable is positively or negatively correlated with the target variable. (Required: True, Multiple: False)
The sample input for this transform looks as below:
The dashboard after running the Correlation Matrix transform on the dataset appears as below:
How to use it in Notebook
The following is the code snippet you must use in the Jupyter Notebook editor to run the Correlation Matrix transform:
# Assumes Transform, project, correlation_matrix (the transform template), and
# dataset_w_one_hot_encoding were defined in earlier notebook cells.
transform = Transform()
transform.name = "correlation matrix"
transform.templateId = correlation_matrix.id
transform.variables = {
    "inputDataset": dataset_w_one_hot_encoding.name,
}
recipe_corr_matrix = project.addRecipe([dataset_w_one_hot_encoding], name="correlation_matrix")
#recipe_corr_matrix.prepareForLocal(transform, contextId="correlation_matrix")
recipe_corr_matrix.addTransform(transform)
# TODO: Timing issue. Prior EDA may not have completed.
recipe_corr_matrix.run()