Sample Bootstrapping
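This function draws a random sample of rows from a dataset, either as a proportion of the dataset or as a fixed number of rows, optionally weighted by a column.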
tags: [“Data Preparation”]
Parameters
Input Dataset: The dataset to sample from (Required: True, Multiple: False)
Sampling Type: How the sample size is specified. Pick “frac” to take a proportion of the dataset, or “n” to take a fixed number of rows (Required: True, Multiple: False, Datatypes: [‘STRING’], Options: [‘CONSTANT’], Default_value: ‘frac’, Constant_options: [‘frac’, ‘n’])
Sample Rows/Proportion: The sample size. If Sampling Type is “n”, enter the number of rows; if it is “frac”, enter the proportion, for example 0.8 for 80% of the rows (Required: True, Multiple: False, Datatypes: [‘FLOAT’], Options: [‘CONSTANT’], Default_value: ‘0.8’)
Weight Column: Optional column whose values are used as sampling weights (Required: False, Multiple: False, Datatypes: [‘FLOAT’], Options: [‘FIELDS’], Datasets: [‘df’])
Output Dataset: Name of the output dataset that holds the sample (Required: True, Multiple: False)
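The parameters above map closely onto pandas' DataFrame.sample, which the Requirements section suggests is the underlying library. The following is a minimal sketch under that assumption, not the template's actual implementation. The helper name sample_rows, its seed argument, and the toy car data are hypothetical. This page does not say whether rows are drawn with replacement, so the sketch keeps the pandas default (without replacement).

```python
import pandas as pd


def sample_rows(df, sampling_type="frac", number=0.8, weight=None, seed=None):
    """Hypothetical helper mirroring the template's parameters with pandas."""
    if sampling_type == "frac":
        # number is a proportion of the rows, e.g. 0.8 keeps 80% of them
        return df.sample(frac=float(number), weights=weight, random_state=seed)
    if sampling_type == "n":
        # number arrives as a FLOAT parameter, so cast it to a row count
        return df.sample(n=int(number), weights=weight, random_state=seed)
    raise ValueError("sampling_type must be 'frac' or 'n'")


# Toy data for illustration only (not the dataset used in the notebook example)
car = pd.DataFrame({
    "model": ["a", "b", "c", "d", "e"],
    "horsepower": [90.0, 110.0, 150.0, 200.0, 75.0],
})

# 80% of 5 rows, weighted by the horsepower column
by_fraction = sample_rows(car, "frac", 0.8, weight="horsepower", seed=0)
# exactly 3 rows, unweighted
by_count = sample_rows(car, "n", 3, seed=0)
print(len(by_fraction), len(by_count))
```

With “frac”, pandas rounds the proportion times the row count to get the sample size. When a weight column is given, rows with larger weights are more likely to be drawn.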
How to use it in Notebook
# Look up the template by name
template = TemplateV2.get_template_by('Sample Bootstrapping')
recipe_Sample_Bootstrapping = project.addRecipe([car_data, employee_data, temperature_data, only_numeric], name='Sample Bootstrapping')

# Configure the transform with the template's parameters
transform = Transform()
transform.templateId = template.id
transform.name = 'Sample Bootstrapping'
transform.variables = {
    'input_dataset': 'car',
    'type': "frac",
    'number': 0.8,
    'weight': "horsepower",
    'output_dataset': 'car_sample_boot'}

# Attach the transform to the recipe and run it
recipe_Sample_Bootstrapping.add_transform(transform)
recipe_Sample_Bootstrapping.run()
How to use it in RC UI
Requirements
pandas