Recipes
A collection of recipes builds a machine learning model. You run recipes on a dataset to perform a series of transformations, and each transformation performs one action. You can apply transformations at different stages of building a machine learning model on the uploaded dataset and run them as recipes. Running a recipe creates a transformed output, which can be a dataset or a chart. You can then run another recipe on the output dataset, and repeat until you reach the desired outcome.
Recipe Types
The platform supports four types of recipes:
AI-assisted - This recipe allows users to leverage AI to generate the code recipe or write the logic for the recipe using Python.
Rapid Model - Without using the AI-assisted or Template recipe types and with zero coding knowledge, users can rapidly build models with just a few clicks.
Template - This recipe allows users to apply predefined templates to perform data transformation operations.
API Connector recipe - This recipe enables users to create a dataset or add and run a recipe on the dataset by writing Python code.
AI-assisted recipe
If there are no standard templates to perform data transformations, then AI-assisted recipes can be used. AI-assisted functionality is integrated with AI capabilities to enable business users to ask AI to generate the code for the given prompt. After the code is generated, you can add this to the recipe and run the data pipeline to view the output in the form of datasets or charts.
You can ask the AI questions about the dataset and receive answers in the same chat. When you receive a response from the AI, you can rate it using the thumbs up and thumbs down options.
Ask AI
When you click the dataset block on the canvas and select the AI-assisted option from the Add recipes drop-down, the code editor opens. Here you can use the AI to generate a code snippet for the given prompt.
Select the dataset
You must select the dataset on which you want to apply the data transformation and generate code based on the given prompt. You also have the option to choose the output type: dataset, chart, text, or model. If you select a dataset output, you can generate up to five datasets and add them to the data pipeline. However, the number of datasets generated will depend on the given prompt.
Add the generated code to recipes in the data pipeline
Use the +Add to recipe option in the AI-assisted code editor to add the code generated by the AI to the flow or data pipeline. The Add to recipe option is available only after running the text prompt.
Save and Run the code recipe in the flow
Save the code and use the run option to run the code recipe to generate the output, which can be a dataset or a chart. You can continue building custom templates or code recipes using Ask AI.
Adding an AI-assisted recipe
Use this procedure to add an AI-assisted recipe to the data pipeline or ML flow using the integrated AI tool. The recipes can be related to data pre-processing or building a model.
Note
If the application is installed in the customer’s Virtual Private Cloud (VPC) environment, features that rely on external API calls will be restricted. If a user tries to access these features, a notification will inform them that the admin has disabled these features.
To add and run an AI-assisted recipe:
Click the dataset block on the canvas to open the pull-out window.
Select the recipe
Click the plus icon and select the AI-assisted recipe in the pull-out window.
This opens the Ask AI tab where you can type the text prompt in the provided query box to generate the code recipe.
If you want to view the column names and data type of each column in the uploaded dataset, you can expand the datasets in the Inputs section on the left.
Note
Use the delete icon to delete the uploaded dataset.
Use the plus button to add multiple datasets to use in the code recipe. In the drop-down, you can only find the datasets that you have added onto your Project canvas. If there is only one dataset on the canvas, this button remains disabled.
Select the dataset to run the recipe
By default, the input dataset is selected from the list. Here, the input dataset is Titanic.
Note
You can select a maximum of four datasets.
Enter the text prompt
Enter the query in the provided query box. In this example, the query asks the AI to concatenate two columns in the dataset and generate the code for it: “Concatenate the First_name and Last_name columns and generate a new column with name”.
You can use @ in the query box to get the list of column names available in the selected dataset.
Note
(Optional) From the ellipses icon, select Generate Query Limit to run the query only on the selected number of rows in the dataset. You have three options to select from:
Full data
100k rows
1 million rows
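The effect of a query limit can be pictured as running the query on only the first N rows of the dataset. The sketch below is illustrative only: the function name limit_rows, the QUERY_LIMITS mapping, and the assumption that the limit takes the first N rows (rather than a sample) are not taken from the platform.

```python
from itertools import islice

# Hypothetical mapping of the three UI options to row caps (None = no cap).
QUERY_LIMITS = {"Full data": None, "100k rows": 100_000, "1 million rows": 1_000_000}

def limit_rows(rows, option):
    """Return at most the number of rows allowed by the chosen option."""
    cap = QUERY_LIMITS[option]
    return list(rows) if cap is None else list(islice(rows, cap))

# A generator stands in for a large dataset.
rows = ({"id": i} for i in range(250_000))
subset = limit_rows(rows, "100k rows")
print(len(subset))  # 100000
```

A smaller limit is mainly useful for getting a quick response while you are still refining the prompt.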
Select the output type you want to generate
Type a slash (/) in the query box to choose the type of output you want the AI to generate for your query, such as a dataset, chart, model, or text. A prompt suggestions option is also available. Select it and press Enter to view prompt suggestions for the selected dataset. Besides the suggested prompts, you can also enter a query to generate additional prompt suggestions.
If you do not select the output type, the platform will auto-detect the output type based on the entered prompt.
Generate the code
Click the generate icon to generate the code. The generate button is enabled only after you select the dataset and provide the query.
The AI consumes the text prompt and generates the related code for concatenating two columns. You can see the output generated by the AI along with the dataset size (total columns and rows).
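For orientation, the logic behind the concatenation prompt can be sketched in plain Python. This is not the code the AI generates (which is likely to operate on a pandas DataFrame); the function name concatenate_names, the sample rows, and the single-space separator are assumptions made for illustration.

```python
def concatenate_names(rows):
    """Add a 'name' column built from First_name and Last_name."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["name"] = f"{row['First_name']} {row['Last_name']}"
        result.append(new_row)
    return result

# Hypothetical sample rows; each dictionary represents one record.
sample = [
    {"First_name": "Ada", "Last_name": "Lovelace"},
    {"First_name": "Alan", "Last_name": "Turing"},
]
print(concatenate_names(sample)[0]["name"])  # Ada Lovelace
```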
The output view shows at most 100 records. If you want to see fewer records, select the number of records you want to view from the drop-down list. You can also view the size of the dataset.
You can pin the generated dataset to use it as a source for subsequent recipes. To unpin all datasets, click the unpin icon next to the Test button.
For the text responses, you can use the Copy answer option to copy the responses.
View the code
Click the View Code icon to view the code generated by the AI for the given prompt. It is optional to view the code. You can go back to View output using the View Output icon.
You can use thumbs up and down to indicate your satisfaction levels on the output generated by the AI for the given query.
You can generate up to five datasets based on the given prompt in Ask AI.
You can also request an explanation for the code generated by the LLM. It provides a detailed explanation of the code snippets generated by AskAI, helping users understand the logic and functionality behind the generated code.
Add the generated code to recipe
Click + Recipe to add the template for concatenating two columns to the recipe in the data pipeline. If you want to remove the recipe, click Remove from Recipe to remove from the pipeline.
Note
You can use the dataset generated by this recipe as an input for the next prompt using the Query icon.
Provide the custom name for the recipe.
Click Add to Recipe. After the recipe is added, a green line indicates that it is now part of the data pipeline.
Test the recipe on the dataset
Click Test and select a test option to test the recipe on the full dataset, 100k rows, or 1 million rows before saving and running this recipe in the data pipeline. Possible options:
Test (full data)
Test with 100k rows
Test with 1 Million rows
You can see the test output dataset in a new tab as shown in the screenshot below:
Save and run the recipe in the data pipeline
Click Save to save the recipe.
Click the Run icon to run the recipe and generate the output on the canvas. In this example, the recipe generates a new column containing the full name by concatenating the first name and last name columns.
Check the canvas
Go back to the canvas to view the output.
Click the dataset block on the canvas. This opens the pull-out window. The output dataset includes an extra column produced by the concatenation, the ‘name’ column.
Click View to see a dataset generated after concatenating two columns.
Note
To check recipe logs, click the recipe block on the canvas and, from the side panel, click the logs icon. This gives you access to the record of successful and failed recipe runs.
In AI-assisted recipes, you can also use code and snippets (default) to perform data transformations.
You can use the Code option in the AI-assisted recipe to write Python code and define logic for data transformation in the provided code editor. Subsequently, run this code recipe in the pipeline to transform the data and produce a dataset or a chart output.
Before running the code recipe in the data pipeline or flow, you can use the Test option to test the code and view the output. If the output is what you are expecting, you can run the custom code recipe in your flow.
If you want to add a code recipe to the flow, see Adding an AI-assisted recipe.
Adding code snippets to the data pipeline
Use this procedure to add code snippets to the data pipeline or flow. First, search the code snippets list for the syntax you need, then add your own logic within that syntax and run the data transform or recipe.
To add and run a code snippet:
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.
Click the ellipses icon and select Snippets. This displays the SNIPPETS button.
Click SNIPPETS.
Search for the syntax within which the logic (code snippet) must be added. Click Copy to copy the syntax and paste it in the Code tab.
Select the template based on the data transformation you want to perform. For example, to replace a value in the dataset, search for replace value snippets.
Click Copy corresponding to the code block you want to use.
Click the Code tab to paste the copied code snippets in the coding workspace.
Add the name of the dataset in which you want to replace the value, and the new value that should replace the existing one.
Click TEST to test the code and view how the output looks before you save and run this recipe in the data pipeline.
Click Save to save the code recipe. This enables the run button.
Click the Run icon to run the recipe and generate the output to see on the canvas.
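The replace-value step in this procedure boils down to logic like the following sketch. It is not the platform's snippet: the function name replace_value, the column name Embarked, and the sample values are hypothetical, and the platform's snippet presumably works on a pandas DataFrame rather than on plain dictionaries.

```python
def replace_value(rows, column, old_value, new_value):
    """Return new rows where old_value in the given column becomes new_value."""
    return [
        {**row, column: new_value} if row.get(column) == old_value else dict(row)
        for row in rows
    ]

# Hypothetical data: replace the code "S" with a readable label.
data = [{"Embarked": "S"}, {"Embarked": "C"}, {"Embarked": "S"}]
cleaned = replace_value(data, "Embarked", "S", "Southampton")
print([r["Embarked"] for r in cleaned])  # ['Southampton', 'C', 'Southampton']
```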
Writing logic for the template from scratch
Use this procedure to write the data transformation template from scratch.
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.
Click the Code tab.
Write the logic for the code recipe in the provided coding space using Python.
Click Test to test the code you have written.
Click Save and then click the Run icon to run this transformation in the data pipeline.
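The code examples in this guide all use a transform(entities, context) function that reads inputs from entities and returns a dictionary of named outputs. A minimal sketch of that shape, with a local dry run, is shown below. The filtering logic and the sample rows are hypothetical, and plain lists of dictionaries are used here as a stand-in for the DataFrames the platform passes in.

```python
def transform(entities, context):
    # 'titanic' is the input dataset name used elsewhere in this guide.
    rows = entities["titanic"]
    # Keep only rows where Age is known (example logic; replace with your own).
    kept = [row for row in rows if row.get("Age") is not None]
    # The key names the output dataset; change it as needed.
    return {"output_1": kept}

# Local dry run with stand-in inputs before pasting the logic into the Code tab.
fake_entities = {"titanic": [{"Name": "A", "Age": 22}, {"Name": "B", "Age": None}]}
result = transform(fake_entities, context={})
print(len(result["output_1"]))  # 1
```

Keeping the logic inside a single function with explicit inputs and outputs makes it easy to check locally and then verify again with the Test option.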
Writing logic to generate Artifact
Use this procedure to write logic that generates an artifact from the Code tab and adds the generated artifact to the data pipeline.
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.
Click the Code tab and provide the following code to generate the artifact.
def transform(entities, context):
    from utils.notebookhelpers.helpers import Helpers
    from utils.dtos.templateOutput import ArtifactOutput
    input_df_1 = entities['output_1']  # this is for reading input dataset
    import pandas as pd
    import numpy as np
    output_df_1 = input_df_1.drop(['Age'], axis=1)
    artifactsDir = Helpers.getOrCreateArtifactsDir(context, artifactsId="test-artifact")
    output_df_1.head(10).to_csv(artifactsDir + '/test.csv')
    return {
        'output_2': output_df_1,  # output_2 is the name of the output to be generated. Change the name as per your requirements.
        "test-artifact": ArtifactOutput()
    }
Important
You can test the artifact code by using the Test option.
Click Save and then click the Run icon to add the generated artifact to the data pipeline.
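The file-writing part of the artifact code (saving a ten-row preview as test.csv in the artifacts directory) can be tried outside the platform with the standard library alone. In this sketch, tempfile.mkdtemp() is only a stand-in for the directory the platform helper returns, and write_preview and the sample rows are hypothetical names introduced for illustration.

```python
import csv
import os
import tempfile

def write_preview(rows, artifacts_dir, filename="test.csv", limit=10):
    """Write the first `limit` rows to a CSV file and return its path."""
    path = os.path.join(artifacts_dir, filename)
    preview = rows[:limit]
    fieldnames = list(preview[0].keys()) if preview else []
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(preview)
    return path

# tempfile.mkdtemp() stands in for the artifacts directory on the platform.
artifacts_dir = tempfile.mkdtemp()
rows = [{"PassengerId": i, "Fare": 7.25 + i} for i in range(25)]
path = write_preview(rows, artifacts_dir)
with open(path, newline="") as handle:
    print(sum(1 for _ in csv.DictReader(handle)))  # 10
```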
Writing logic to generate a model
Use this procedure to write logic that generates an ML model from the Code tab. You can later use this model on similar datasets to make predictions.
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.
Click the Code tab and provide the following code to generate the model.
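As a rough illustration of the general idea (fit something on a historic dataset, store it, and reuse it for predictions), here is a deliberately tiny, pure-Python sketch. It is not the platform's model code and does not use the platform's model-output classes, which are not shown in this guide. The function names fit_lookup_model and predict, the lookup-table "model", and the sample rows are all hypothetical.

```python
import pickle
from collections import Counter, defaultdict

def fit_lookup_model(rows, feature, target):
    """Learn the most common target value for each value of one feature."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature]][row[target]] += 1
    overall = Counter(row[target] for row in rows).most_common(1)[0][0]
    table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    return {"feature": feature, "table": table, "default": overall}

def predict(model, rows):
    """Predict the target for each row, falling back to the overall majority."""
    return [model["table"].get(row[model["feature"]], model["default"]) for row in rows]

# Hypothetical training rows in the spirit of the Titanic dataset.
train = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
]
model = fit_lookup_model(train, "Sex", "Survived")
restored = pickle.loads(pickle.dumps(model))  # round trip, as when a model is stored
print(predict(restored, [{"Sex": "female"}, {"Sex": "male"}, {"Sex": "unknown"}]))  # [1, 0, 1]
```

In practice you would fit a real estimator from an ML library in place of the lookup table; the fit, store, and predict steps stay the same.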
Writing logic to add global variables
Use this procedure to add global variables to store artifacts and models built on the source dataset in a project.
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.
Click the Code tab and provide the following code to add global variables.
def transform(entities, context):
    from utils.notebookhelpers.helpers import Helpers
    input_df_1 = entities['titanic']  # this is for reading input dataset
    import pandas as pd
    import numpy as np
    output_df_1 = input_df_1.drop(columns=['Sex'])
    print("value of global variable:")
    print(Helpers.get_global_var(context, "test-var"))
    return {
        'output_1': output_df_1,  # output_1 is the name of the output to be generated. Change the name as per your requirements.
    }
Click Save and then click the Run icon to add the global variables.
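If you want to dry-run logic that reads a global variable outside the platform, one option is to register a small stand-in for the helpers module so that the import inside transform resolves. The sketch below assumes only the call shape Helpers.get_global_var(context, name) seen in the code above. The stub's behavior (reading from a plain dictionary), the use of test-var as a fare threshold, and the sample rows are all hypothetical.

```python
import sys
import types

class Helpers:
    """Local stand-in; only mimics the get_global_var call used in this guide."""
    @staticmethod
    def get_global_var(context, name):
        # Assumption: for local testing, globals live in a plain dict.
        return context["globals"][name]

# Register the stub under the import path the recipe code expects.
for module_name in ("utils", "utils.notebookhelpers", "utils.notebookhelpers.helpers"):
    sys.modules.setdefault(module_name, types.ModuleType(module_name))
sys.modules["utils.notebookhelpers.helpers"].Helpers = Helpers

def transform(entities, context):
    from utils.notebookhelpers.helpers import Helpers
    threshold = Helpers.get_global_var(context, "test-var")
    # Hypothetical use of the global: keep rows at or above the threshold.
    rows = [r for r in entities["titanic"] if r["Fare"] >= threshold]
    return {"output_1": rows}

context = {"globals": {"test-var": 10}}
entities = {"titanic": [{"Fare": 7.25}, {"Fare": 71.28}]}
print(transform(entities, context)["output_1"])  # [{'Fare': 71.28}]
```

The stub is for local experiments only; on the platform the real helpers module is used and no registration is needed.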
Snippets
There are default snippets available to use for data cleaning and data preparation. You can test the code snippet before running the code recipe in the flow using the test option.
If you want to write logic for data transformations using Python from scratch, see Writing logic for the template from scratch.
Rapid model recipe
You can use the Rapid Model recipe to solve an ML problem that falls into one of the supported categories, such as classification, regression, or binary classification, by creating an ML model on the historic dataset.
Developing ML models using the Rapid Model recipe
Use this procedure to build simple ML models using the Rapid model recipe type. This type of recipe eliminates writing the code template or using the predefined templates to perform data transformations. The platform performs all the data transformation steps automatically after you select the problem type and target column for the uploaded dataset.
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the Rapid Model recipe. This takes you to the recipe screen.
Select the dataset on which you want to perform the transformations and build an ML model. By default, the dataset is populated. However, if you want to run this recipe on another dataset, select it from the drop-down.
Select the target column on which you want to make predictions or build models by typing in the search box. If there is no target column, as with Anomaly Detection and Clustering, select the I don’t have a Target Column option.
Note
Click the AI Guide icon corresponding to the target column field to open the AI Guide chat window. Using this integration, you can identify the most appropriate column you want the model to predict in the uploaded dataset or query the dataset to learn about the missing values in a particular column or other details.
Example Use Case:
Imagine you are in the process of building a predictive model using the Rapid Model recipe. You have loaded a dataset on the canvas but are unsure which column should be set as the target variable. Here’s how the AI Guide can help:
In the AI chat interface, ask, “What should be the target column for this dataset?”
Receive a Recommendation: The AI analyzes the dataset and provides a recommendation for the most suitable target column based on the data characteristics.
Problem Type is auto-detected by the platform based on the selected target column. Supported problem types by the platform:
Binary Classification
Regression
MultiClass Classification
Timeseries Forecasting
Anomaly Detection
Clustering
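How the platform detects the problem type is not described here. Purely to give a feel for what "based on the selected target column" could mean, the sketch below uses a simple heuristic on the distinct target values. The function name guess_problem_type, the rules, and the threshold of 10 distinct values are illustrative assumptions, not the platform's logic, and time series detection is left out.

```python
def guess_problem_type(target_values):
    """Illustrative heuristic only; not the platform's detection logic."""
    if target_values is None:
        # No target column selected.
        return "Clustering or Anomaly Detection"
    distinct = set(target_values)
    if len(distinct) == 2:
        return "Binary Classification"
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    # Many distinct numeric values suggest a continuous target.
    if all_numeric and len(distinct) > 10:
        return "Regression"
    return "MultiClass Classification"

print(guess_problem_type([0, 1, 1, 0]))                  # Binary Classification
print(guess_problem_type([x * 0.5 for x in range(50)]))  # Regression
print(guess_problem_type(["a", "b", "c"]))               # MultiClass Classification
print(guess_problem_type(None))                          # Clustering or Anomaly Detection
```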
Note
If the selected problem type is Binary Classification, the Dynamic features toggle is displayed. Enabling this toggle displays the Business context field, where you can provide the business context to use the problem type as Binary experimental.
Click to expand the Advanced inputs option. This will display the Freeze columns field, allowing you to select the columns that you want to keep intact and prevent them from being dropped or manipulated during the data pre-processing phase in the rapid model recipe. This Advanced Inputs option is displayed for clustering, Anomaly detection, regression, binary classification and Multi-class classification problem types.
Click Save and then click the Run icon.
Note
The status is set to Running until the model is built.
To check the logs of this recipe model, click the Logs icon.
Once the project run is successful, the link is displayed to open the canvas.
Click OPEN CANVAS. If you want to go back to the dataset view to rerun the recipe by changing the target column, you can click GO BACK TO DATASET VIEW.
You can see the output dataset, chart, model and artifact generated as outputs after running this recipe.
Adding a DataApp for Binary classification problem type
Use this procedure to create a DataApp for binary classification, regression, binary experimental, and multiclass classification problem types in the Rapid Model recipe.
Prerequisites:
Create a prediction service for the model to create an App template. For more information, see Prediction services.
You can create a prediction service for the model directly from the canvas by clicking on the model block. This opens the side panel. Clicking on the Prediction service button takes you to the prediction service page.
To create a DataApp for binary classification problem type:
Select the project in which you want to create a DataApp. You can only create DataApps for the binary classification problem type in the Rapid Model recipe.
Select DataApps from the project navigation menu. This opens the page to create a DataApp.
Click the plus icon. The Create DataApp window is displayed.
Specify this information:
- DataApp Name:
The name of the DataApp.
- DataApp Description:
It is optional. The description for the DataApp.
- Recipe Name:
Select the recipe you want to run in the DataApp.
Click Create to create the DataApp.
Click the DataApp to view the feature importance, model performance, what-if analysis and predictions.
Template recipe
The ready-to-use system templates let you transform data in the UI without writing Python code. Using these standard templates, you can prepare and clean the data, add features, split the data into training and test sets, and build models. Running each recipe transforms the data in the flow.
By default, there are hundreds of system templates available. You can use them to transform the data and build simple to complex machine learning flows and subsequently models. You can also create custom templates at the project and tenant levels from Notebook and use them in your flows.
If you want to add a standard template to the flow, see Adding a standard transform or template within a recipe.
Adding a Template recipe
Use this procedure to add a transform within a template.
Select the project to upload a dataset. For more information, see Connectors
Click on the dataset block to run various data transformations on this dataset and build an ML model.
Use any of these options to add a transform within a template recipe:
Click the plus icon on the canvas page and then select Template.
Select the dataset block. This opens the side panel. Click the plus icon and select the Template recipe.
Select the dataset block to open the side panel in which you click View. This opens the dataset page. Click the plus icon and then select the Template recipe.
The page where you can add data Transformations is displayed.
Click Transformations.
Note
If you want to run a transform on a dataset, you must click on the dataset and add the recipe.
The Transformations side panel is displayed.
Search for the transforms or templates you want to add to the ML flow or data pipeline on the canvas:
There is a set of templates available for each stage of building a machine learning model. All the templates associated with a particular stage are assigned a specific tag. Possible tags:
Data Cleaning
Data Preparation
Data Analysis
Beta
Enter the transform name that you want to add to the data pipeline or filter the transforms by tags from the list.
Click on the transform name to open the transform page.
Specify the information in the respective fields of the selected transform. For more information, see Templates library
In this example, we have selected the suffix transform to add a suffix to all the columns in a dataset.
Click Add to add the transform to the data pipeline and close the transform window.
Note
You can also add multiple transforms simultaneously, using the +New Transform.
Click Test to test the transform and see the output before running this in the data pipeline. (Optional)
Click the Run icon to run this recipe in the flow. Once the run is successful, this generates an output dataset or a dashboard.
Click the back button to navigate to the canvas from the transforms screen.
Click the output dataset block. This opens the pull-out window.
Click View to view the dataset with the suffix added to each column.
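The effect of the suffix transform on column names can be sketched in a few lines. This is not the template's implementation: the function name add_suffix, the suffix _raw, and the sample row are hypothetical.

```python
def add_suffix(rows, suffix):
    """Return rows whose column names all end with the given suffix."""
    return [{f"{col}{suffix}": value for col, value in row.items()} for row in rows]

rows = [{"Age": 22, "Fare": 7.25}]
print(add_suffix(rows, "_raw"))  # [{'Age_raw': 22, 'Fare_raw': 7.25}]
```

Only the column names change; the values in each row stay the same.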
On this View data page, you can:
Click the caret icon to export the output dataset to a csv file, using the Export option.
Click the caret icon to delete the generated output, using the Delete option.
Click the plus icon to perform the following:
Append a file to the source dataset, using the File option. This option is only enabled for the source dataset.
Add a template recipe, using the Template option.
Add an AI-Assisted recipe, using the AI-assisted option.
Add a Rapid Model recipe, using the Rapid Model option.
Add an API Connector recipe, using the API Connector option.
Add a segment to the source dataset, using the +Segment option. This option is enabled only for the source dataset.
Viewing and editing the recipe details
Use this procedure to view the recipe details and edit the type of transform used within a recipe.
To view the recipe details:
Select the recipe block that you have uploaded onto the canvas. This opens the pull-out window.
Click inside the recipe name to change it.
View the recipe details on the pull-out window:
- Recipe type:
The type of recipe used. This is the tag assigned to the transform within this recipe.
- Created:
The date and time at which the recipe was created.
- Last modification:
The last date and time at which the recipe was modified.
- Last build:
The last date and time at which the recipe run was performed.
- Inputs:
The input dataset on which the transformation is applied and the recipe run was performed.
- Outputs:
The output dataset generated after running the recipe.
- Timeout:
The duration after which the recipe stops running, expressed in hours. By default, the duration is set to 2 hours. You can change the duration based on the complexity of the recipe you are running in the flow. If the recipe runs longer than the set period, the run is terminated.
On this pull-out window, you can also:
View the recipe logs, using the Logs icon. This shows a detailed record of all successful and failed recipe runs. To view the full logs, click the Logs option to open the logs page in a new browser tab, and then click View Full Log on that page.
View the recipe details, using the View option. This takes you to the respective recipe page.
Click the ellipses icon and select Delete to delete this recipe from the flow.
Run the recipe without navigating to the recipe page, using the Run option.
Click View to review the details of the recipe. The recipe page is displayed.
Click Transformations to view the list of transforms in the project. On the Transforms tab, select the transform whose details you want to modify.
Click UPDATE.
Exporting the output dataset
Use this procedure to export the output dataset to a csv file.
To export the output dataset:
Select the dataset block, either an input or an output dataset, that you want to export to a csv file.
The pull-out window opens.
Click the ellipses icon.
Select the Export option to download the dataset file onto your local system.
Deleting a recipe
Use this procedure to delete a recipe block from the canvas.
To delete a recipe block:
Select the recipe block that you want to delete from the canvas. This opens the pull-out window.
Click the ellipses icon and select the Delete option to delete the recipe.
You can also delete the recipe from the Transforms list page, using the delete icon available under the caret icon. This page appears when you click View on the side panel window of the recipe block.
A dialog box warns that deleting the recipe deletes the recipe block, output datasets, and associated recipes.
Click Delete to delete the recipe permanently from the canvas view or click Cancel to discard the action.
Running a specific recipe in the data pipeline
Use this procedure to run a particular recipe in the flow or data pipeline.
To run a recipe block:
Select the recipe block that you want to run from the canvas. This opens the side panel.
Click Run to run the recipe. The status of the recipe block changes to Running. Once the recipe run is successful, the status changes to Success.
You can also view output (dataset, model, or artifact) generated after running this recipe.
Exporting the output dataset to the connector
Use this procedure to save the output dataset to the configured connector, which can be a cloud storage solution or a database.
On the canvas, select the output dataset block that you want to save to the connector. This opens the side panel.
Select the Data connector from the drop-down. You can only see the connectors you have configured in this tenant.
Enter the destination folder name and file name with which you want to save the file to this folder in this connector.
Click Save to save the destination details.
Click Export to export the file to the connector.
You can delete the configured connector for this output dataset, using the delete icon.
API Connector Recipe
The API Connector recipe allows users to write Python code directly within the platform to generate datasets, eliminating the need to upload a dataset beforehand. It is the only recipe that users can access without pre-uploading a dataset on the canvas. Additionally, the code workspace enables users to test their code before adding it to the canvas, ensuring the code works as intended.
Adding an API Connector Recipe
Use this procedure to add an API Connector recipe to the canvas.
To add and run an API Connector recipe:
Click the plus icon on the canvas and select the API Connector from the list.
Add the Python code to create a dataset or recipe that you want to add to the canvas. This code should be added on the Code tab.
In the following example, you can see the code to add a dataset.
Click Save to save the code. This enables the TEST option. Use this option to test if the code is working as expected. The output will be displayed on the other tab.
Click Run to run the code to create a dataset. After running the code successfully, this adds the dataset to the canvas.
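As a rough idea of what dataset-creating code could look like, the sketch below turns a JSON payload into rows and returns them as a named output. It assumes, without confirmation from this guide, that the API Connector uses the same transform(entities, context) entry point as the other code recipes. The function name records_from_payload, the output name weather, and the hard-coded payload are hypothetical, and rows are plain dictionaries rather than a DataFrame.

```python
import json

def records_from_payload(payload_text):
    """Turn a JSON array of objects into a list of row dictionaries."""
    records = json.loads(payload_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of objects")
    return [dict(record) for record in records]

def transform(entities, context):
    # In a real recipe the payload would come from an HTTP call to your API;
    # a hard-coded string keeps this sketch runnable offline.
    payload = '[{"city": "Oslo", "temp_c": 4}, {"city": "Lima", "temp_c": 19}]'
    return {"weather": records_from_payload(payload)}

result = transform(entities={}, context={})
print(len(result["weather"]))  # 2
```

Separating the parsing step from the fetch step makes it easier to check the parsing with the TEST option before wiring in a live API call.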
See also
glossary