Fill_Nulls
This transform replaces missing values in a dataset using the selected fill strategy: zero, mean, median, or KNN for numerical columns, and mode for categorical columns.
tags: ["Data Preparation"]
Parameters
The following gives a brief description of each parameter in the Fill_Nulls transform.
- Name:
By default, the transform name is populated. You can also add a custom name for the transform.
- Raw Dataset:
The file name of the input dataset. You can select the dataset that was uploaded from the drop-down list. (Required: True, Multiple: False)
- Numerical Columns Fill Strategy:
The strategy to be followed to fill the null values in numerical columns. Possible options:
Mean
Median
Zero
KNN
- Categorical Columns Fill Strategy:
The strategy to be followed to fill the null values in categorical columns. Possible options:
Mode
None
- Target:
The target column to make predictions on.
- Output Dataset:
The file name of the output dataset that is created after the null values are filled. (Required: True, Multiple: False)
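To make the fill strategies above concrete, here is a minimal pandas sketch of what Mean, Median, Zero, and Mode filling could look like. It is an illustration only, not the transform's actual implementation: the function name fill_nulls_sketch and the sample data are hypothetical, the Target parameter is ignored because its effect on filling is not described here, and KNN is left out because it needs a neighbor search (scikit-learn's KNNImputer is one commonly used option).

```python
import pandas as pd


def fill_nulls_sketch(df, numerical_strategy="Mean", categorical_strategy="None"):
    """Hypothetical helper: fill nulls per column, returning a new DataFrame."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numerical column: pick the fill value from the chosen strategy.
            if numerical_strategy == "Mean":
                fill_value = out[col].mean()
            elif numerical_strategy == "Median":
                fill_value = out[col].median()
            elif numerical_strategy == "Zero":
                fill_value = 0
            else:
                raise ValueError(f"Unsupported strategy in this sketch: {numerical_strategy}")
            out[col] = out[col].fillna(fill_value)
        elif categorical_strategy == "Mode":
            # Categorical column: use the most frequent non-null value, if any.
            modes = out[col].mode(dropna=True)
            if not modes.empty:
                out[col] = out[col].fillna(modes.iloc[0])
        # categorical_strategy == "None": leave categorical nulls as they are.
    return out


# Hypothetical sample data with one null in each column.
sample = pd.DataFrame({
    "age": [20.0, None, 40.0],
    "city": ["Pune", None, "Pune"],
})
filled = fill_nulls_sketch(sample, numerical_strategy="Mean", categorical_strategy="Mode")
```

In this sample, the missing age becomes the column mean and the missing city becomes the most frequent value, while the input DataFrame itself is left unchanged.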
Below is the screenshot with sample input data for this transform.
The output after running the Fill_Nulls transform on the dataset appears as below:
The dashboard generated after running the Fill_Nulls transform on the dataset appears as below:
How to use it in Notebook
The following is the code snippet you must use in the Jupyter Notebook editor to run the Fill_Nulls transform:
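# Assumes Transform, project, fill_nulls, dataset_w_drop_bad_cols and
# fill_null_bad_col_ds_name were defined earlier in the notebook.
transform = Transform()
transform.name = "fill null"
transform.templateId = fill_nulls.id
# Parameter values for the transform (see the Parameters section).
transform.variables = {
    "inputDataset": dataset_w_drop_bad_cols.name,
    "numericalColsStrategy": "Mean",
    "categoricalColsStrategy": "None",
    "targetCol": "target",
    "OutputDataset": fill_null_bad_col_ds_name,
}
# Create a recipe on the input dataset, attach the transform, and run it.
recipe_fill_null = project.addRecipe([dataset_w_drop_bad_cols], name="fill_null")
# recipe_fill_null.prepareForLocal(transform, contextId="recipe_fill_null")
recipe_fill_null.addTransform(transform)
recipe_fill_null.run()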
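After the recipe runs, it can be useful to confirm which columns still contain nulls, for example the categorical columns when the categorical strategy is "None". The sketch below assumes the output dataset has already been loaded into a pandas DataFrame; how to load it depends on the platform and is not shown. The helper name remaining_nulls and the sample data are hypothetical.

```python
import pandas as pd


def remaining_nulls(df):
    """Hypothetical helper: map each column that still has nulls to its null count."""
    counts = df.isna().sum()
    return {col: int(n) for col, n in counts.items() if n > 0}


# Stand-in for the output dataset: numerical nulls filled, one categorical null left.
output_df = pd.DataFrame({
    "age": [20.0, 30.0],
    "city": ["Pune", None],
})
leftover = remaining_nulls(output_df)
```

An empty dictionary means no nulls remain in any column.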
Requirements
pandas