Rapid model recipe
Overview
Use the Rapid Model recipe to solve an ML problem that falls into one of the supported problem types, such as binary classification, regression, or multi-class classification, by building an ML model on a historical dataset.
Rapid model recipe - Problem Types
Binary classification
Binary classification is a method used to categorize things or make decisions into one of two possible groups. It's like making a simple "yes" or "no" choice or determining whether something belongs to Group A or Group B.
You can apply binary classification to scenarios such as:
Email Filters: Deciding whether an email is spam (1) or not-spam (0).
Bank Transactions: Identifying if a transaction is fraudulent (1) or genuine (0).
Medical Diagnosis: Determining if a patient has a specific disease (1) or not (0).
In the context of Rapid Model Recipe, here are the guidelines for inputs:
Target Column: Select a column that contains only binary values, either 0 or 1.
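As an illustration of the target column guideline above, the following minimal sketch checks whether a column's values are limited to 0 and 1 before you select it as the target. The function name is_binary_target and the sample values are hypothetical; this is not part of the platform, only a pre-check you could run on your own data.

```python
def is_binary_target(values):
    """Return True if every non-missing value is 0 or 1."""
    # Missing entries (None) are ignored for the purpose of this check.
    observed = {v for v in values if v is not None}
    # An empty column is not a usable target, so it fails the check.
    return bool(observed) and observed <= {0, 1}


spam_flags = [0, 1, 1, 0, None, 1]   # hypothetical column values
labels = ["yes", "no", "yes"]        # would need re-encoding to 0/1 first

print(is_binary_target(spam_flags))  # True
print(is_binary_target(labels))      # False
```

A column such as labels above would have to be re-encoded to 0 and 1 before it meets the guideline.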
Regression
Regression is a statistical method used to model and analyze the relationships between variables. Its primary objective is to identify how a dependent variable changes when one or more independent variables are modified. Essentially, it's about predicting a continuous value, rather than categorizing into groups.
Common applications of regression include:
Real Estate: Estimating property prices based on features like size, location, and age.
Sales Forecasting: Predicting future sales based on historical data and market trends.
Supply Chain: Projecting delivery times based on factors like distance, package weight, and transportation mode.
In the context of Rapid Model Recipe, here are the guidelines for inputs:
Target Column: Choose a column representing the variable you wish to predict, which can assume a continuous range of values.
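To make the idea of predicting a continuous value concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. It is not the algorithm the Rapid Model recipe uses (the source does not describe it); the function names and the property data are made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Return the fitted continuous value for a new input x."""
    return slope * x + intercept


# Hypothetical data: property size (square metres) and price (thousands).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
print(predict(slope, intercept, 90))
```

Here the price column plays the role of the target column: it can take any value in a continuous range rather than a fixed set of categories.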
Multi-class classification
Multi-class classification is a method used to categorize things or make decisions into more than two distinct groups. Unlike binary classification, which sorts items into one of two buckets, multi-class classification can sort items into three or more categories.
Common applications of multi-class classification include:
Natural Language Processing: Categorizing news articles into predefined topics such as sports, politics, or entertainment.
Image Recognition: Identifying objects within pictures, such as distinguishing between a car, a bicycle, or a pedestrian.
Customer Segmentation: Grouping customers into distinct segments based on purchasing behavior, preferences, or demographic data.
In the context of Rapid Model Recipe, here are the guidelines for inputs:
Target Column: Choose a column that represents the categories you wish to classify data into. This column should have three or more distinct values representing each class or category.
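The guideline above can be checked before running the recipe. The sketch below counts the distinct classes in a candidate target column and confirms there are at least three; the helper names and the sample topics are hypothetical and not part of the platform.

```python
from collections import Counter


def class_counts(values):
    """Count how often each non-missing class appears in the column."""
    return Counter(v for v in values if v is not None)


def is_multiclass_target(values, min_classes=3):
    """Return True if the column has at least min_classes distinct values."""
    return len(class_counts(values)) >= min_classes


topics = ["sports", "politics", "sports", "entertainment", None]

print(class_counts(topics))
print(is_multiclass_target(topics))
```

Looking at the counts also shows whether any class is very rare, which is worth knowing before you build a model on the column.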
Time series forecasting
Time series forecasting is a method used to predict future values based on historically observed data points ordered in time. It's akin to looking at patterns from the past to anticipate what might come next in a sequence, much like predicting the next note in a familiar song.
Common applications of time series forecasting include:
Stock Market Analysis: Predicting stock prices based on historical trends and fluctuations.
Weather Forecasting: Anticipating weather conditions (e.g., temperature, rainfall) based on past meteorological data.
Sales Projections: Estimating future sales of a product based on its past sales performance.
In the context of Rapid Model Recipe, here are the guidelines for inputs:
Time Column: Designate a column that indicates the time or sequence of each data point, typically represented as dates or periods.
Value Column: Choose a column containing the historical data points that you wish to forecast.
Series Column: If analyzing multiple series (e.g., sales across different regions), specify a column that differentiates each series.
Prediction Period: Define the duration over which you want to forecast. This could be days, months, quarters, etc.
Prediction Frequency: Specify how often predictions should be made within the defined period, such as daily, weekly, or monthly forecasts.
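The five inputs above can be thought of as one small configuration. The sketch below models them as a Python dataclass with a basic consistency check against the dataset's column names. The class name, field names, and validation rules are assumptions made for illustration; they do not reflect the platform's internal representation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ForecastInputs:
    time_column: str                      # dates or periods of each data point
    value_column: str                     # historical values to forecast
    prediction_period: int                # how far ahead to forecast
    prediction_frequency: str             # e.g. "D" for daily
    series_column: Optional[str] = None   # only needed for multiple series

    def validate(self, columns: List[str]) -> List[str]:
        """Return a list of problems; an empty list means no issue was found."""
        problems = []
        for name in (self.time_column, self.value_column, self.series_column):
            if name is not None and name not in columns:
                problems.append(f"column not found: {name}")
        if self.time_column == self.value_column:
            problems.append("time and value columns must differ")
        if self.prediction_period <= 0:
            problems.append("prediction period must be positive")
        return problems


# Hypothetical dataset with regional sales recorded per day.
inputs = ForecastInputs("date", "sales", 10, "D", series_column="region")
print(inputs.validate(["date", "sales", "region"]))
```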
Anomaly detection
Anomaly detection is a method used to identify patterns in data that do not conform to expected behavior. It's akin to spotting the "odd one out" in a group or identifying events that deviate from the norm.
Common applications of anomaly detection include:
Network Security: Monitoring network traffic to detect suspicious activities that may indicate a cyber-attack.
Industrial Equipment: Tracking machine performance to spot potential failures or malfunctions.
E-commerce: Monitoring user behavior to detect fraudulent transactions or unusual purchasing patterns.
In the context of Rapid Model Recipe, no input is required from the user.
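As a rough illustration of spotting the "odd one out", the sketch below flags values that lie far from the mean, measured in standard deviations (a z-score). This is one simple technique chosen for the example; the source does not state which method the Rapid Model recipe applies. The function name, threshold, and readings are hypothetical.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Return indices of values whose z-score magnitude exceeds the threshold."""
    center = mean(values)
    spread = pstdev(values)
    # If all values are identical there is no deviation to measure.
    if spread == 0:
        return []
    return [i for i, v in enumerate(values)
            if abs(v - center) / spread > threshold]


# Hypothetical machine readings with one obvious spike at the end.
readings = [10, 11, 9, 10, 12, 10, 50]
print(flag_anomalies(readings))  # [6]
```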
Clustering
Clustering is a method used to group data points based on their similarity, without having predefined categories. It's akin to organizing a mixed bag of fruits into separate piles where each pile contains only one type of fruit.
Common applications of clustering include:
Market Segmentation: Grouping customers based on purchasing habits, demographics, or preferences to tailor marketing strategies.
Document Classification: Organizing a collection of texts into topics or themes.
Image Recognition: Categorizing images based on their visual content and patterns.
In the context of Rapid Model Recipe, no input is required from the user.
Building an ML model using the Rapid Model recipe
Use this procedure to build simple ML models using the Rapid Model recipe type. This recipe type removes the need to write a code template or use predefined templates for data transformations. The platform performs all the data transformation steps automatically after you select the problem type and target column for the uploaded dataset.
Click the dataset block on the canvas to open the pull-out window.
Select the dataset on which you want to perform the transformations and build the ML model. The dataset is populated by default. To run this recipe on a different dataset, select it from the drop-down.
Select the target column on which you want to make predictions or build models by typing in the search box. If there is no target column, as with Anomaly detection and Clustering, select the I don't have a Target Column option.
Note: Click the AI Guide icon corresponding to the target column field to open the AI Guide chat window. Using this integration, you can identify the most appropriate column you want the model to predict in the uploaded dataset or query the dataset to learn about the missing values in a particular column or other details.
Example Use Case:
Imagine you are in the process of building a predictive model using the Rapid Model recipe. You have loaded a dataset on the canvas but are unsure which column should be set as the target variable. Here's how the AI Guide can help:
In the AI chat interface, ask, "What should be the target column for this dataset?"
Receive a Recommendation: The AI analyses the dataset and provides a recommendation for the most suitable target column based on the data characteristics.
Select the Problem Type. The platform supports these problem types:
Binary Classification
Regression
MultiClass Classification
Timeseries Forecasting
Anomaly Detection
Clustering
Note:
If the selected problem type is Binary Classification, the Dynamic Features toggle is displayed. Enabling this toggle reveals the Business Context field, where you can provide business context to use the problem type as Binary Experimental.
Click to expand the Advanced Inputs option. This displays the Freeze Columns field, allowing you to select columns that should remain intact and not be dropped or manipulated during the data pre-processing phase in the Rapid Model Recipe.
The Advanced Inputs option is available for Clustering, Anomaly Detection, Regression, Binary Classification, and Multi-Class Classification problem types.
Click Save and then click the Run button to run the recipe. If needed, you can stop the recipe run at any time by clicking the Stop button.
Keyboard Shortcuts:
Ctrl+T (or Cmd+T on Mac) → Test a recipe
Ctrl+R (or Cmd+R on Mac) → Run a recipe
Ctrl+S (or Cmd+S on Mac) → Save code changes
Note:
The status is set to Running until the model is built.
To check the logs of this recipe model, click the Logs icon.
Once the run succeeds, a link to open the canvas is displayed.
Click OPEN CANVAS. If you want to go back to the dataset view to rerun the recipe by changing the target column, you can click GO BACK TO DATASET VIEW.
You can see the output dataset, chart, model and artifact generated as outputs after running this recipe.
Editing Recipe Code in Rapid Model Recipe
Use this procedure to edit the default Rapid Model Recipe code in Jupyter Notebook according to your specific requirements.
To edit the Rapid Model Recipe:
Click on the Rapid Model Recipe component or block in the data pipeline within a project. This opens the Rapid Model Recipe page.
Click Recipe Code to expand the section.
Click the Edit in Notebook button. This opens the Jupyter Notebook editor, where you can modify the Rapid Model Recipe code.
Click Save Back to Recipe to save the changes made to the code. Once saved, the updated recipe is added to the data pipeline.
Click Run to execute the data pipeline with the updated recipe and view the new output.
Note: You can revert the changes at any time using the Revert option, which is available while editing the code in Jupyter Notebook.
Developing a Model for Time Series Forecasting Problem Type
Use this procedure to develop a model for a time series forecasting problem type.
Click the dataset block on the canvas to open the pull-out window.
Click the plus icon and select the Rapid Model option.
Select the target column for prediction. Based on this column, the problem type will automatically be set. If it’s a time series forecasting problem, three additional fields will appear.
Select the date column that serves as the time reference for your predictions. This column should contain time-indexed data (such as dates or timestamps) to help establish the forecasting timeline.
Specify the predict period to set the forecast horizon. For example, if you select 10 days, the model will generate predictions for a 10-day period.
Set the frequency of predictions within the forecast period. This setting determines the intervals at which forecasts will be produced. Possible values:
D – Calendar day frequency
W – Weekly frequency
M – Month-end frequency
Q – Quarter-end frequency
A – Year-end frequency
H – Hourly frequency
T – Minutely frequency
S – Secondly frequency
In the Advanced Options section, expand the menu to access the time span field. Here, you can specify a custom interval within the prediction period.
For example, if the predict period is set to 10 days and the time span is set to 3 days, the system will generate forecast outputs every 3 days instead of daily.
This option is useful when specific time intervals are preferred over daily predictions, providing insights at the most relevant intervals.
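The interaction between the predict period and the time span can be sketched in a few lines. The example below lists the dates on which forecasts would be produced for a 10-day period, first daily and then with a 3-day time span. It assumes the first forecast falls one interval after the last observed date, which the source does not specify; the function name and start date are hypothetical.

```python
from datetime import date, timedelta


def forecast_dates(last_observed, predict_period_days, time_span_days=1):
    """List forecast dates within the period, spaced time_span_days apart."""
    offsets = range(time_span_days, predict_period_days + 1, time_span_days)
    return [last_observed + timedelta(days=offset) for offset in offsets]


last_observed = date(2024, 1, 1)  # hypothetical last date in the dataset

daily = forecast_dates(last_observed, 10)
every_three_days = forecast_dates(last_observed, 10, time_span_days=3)

print(len(daily))          # 10
print(every_three_days)    # 4 Jan, 7 Jan and 10 Jan 2024
```

Under this assumption, a 3-day time span over a 10-day period yields three forecast points instead of ten.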
Adding a DataApp for Binary classification problem type
Use this procedure to create a DataApp for the binary classification, regression, binary experimental, and multi-class classification problem types in the Rapid Model recipe.
Prerequisites:
Create a prediction service for the model so that an App template can be created. For details, see Prediction services.
You can create a prediction service for the model directly from the canvas by clicking on the model block. This opens the side panel. Clicking on the Prediction service button takes you to the prediction service page.
To create a DataApp for binary classification problem type:
Select the project in which you want to create a DataApp. You can only create DataApps for the binary classification problem type in the Rapid Model recipe.
Select DataApps from the project navigation menu. This opens the page to create a DataApp.
Specify this information:
DataApp Name: The name of the DataApp.
DataApp Description: Optional. The description for the DataApp.
Recipe Name: Select the recipe you want to run in the DataApp.
Click Create to create the DataApp.
Click the DataApp to view the feature importance, model performance, what-if analysis and predictions.