Create Template

Import Functions

from utils.notebookhelpers.helpers import Helpers
from utils.dtos.templateOutputCollection import TemplateOutputCollection
from utils.dtos.variable import Metadata
from utils.dtos.templateOutput import TemplateOutput
from utils.dtos.templateOutput import OutputType
from utils.dtos.templateOutput import ChartType
from utils.notebookhelpers.schemaHelpers import SchemaHelpers
from utils.functions import FunctionLib

Create context

contextId = 'recipe_split'
context = Helpers.getOrCreateContext(contextId=contextId, localVars=locals())

Create Input Dataset

inputDatasetParameter = Helpers.get_or_create_input_dataset(
    name="inputDataset",
    metadata=Metadata(
        input_name='Input Dataset', 
        is_required=True, 
        tooltip='Dataset to split into train and test sets'
    ),
    local_context=locals()
)

Create input parameters

targetCol = Helpers.get_or_create_input_var(
    name="targetCol",
    metadata=Metadata(
        input_name="Target Column", 
        is_required=True, 
        default_value="target", 
        tooltip="Input the target column to predict on (label y)", 
        datatypes=['STRING']
    ),
    local_context=locals()
)

test_size = Helpers.get_or_create_input_var(
    name="test_size",
    metadata=Metadata(
        input_name="Test Size", 
        is_required=True, 
        default_value=0.2, 
        tooltip="Test Size, default to be 0.2, which means 80% train and 20% test split", 
        datatypes=['DOUBLE']
    ),
    local_context=locals()
)
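
Additional inputs follow the same pattern. The sketch below is illustrative only: the random_state parameter, its default, and its datatype are assumptions and not part of the original template.

# Illustrative sketch: an extra parameter following the pattern above.
# "random_state" is a hypothetical addition; cast it to int before use.
random_state = Helpers.get_or_create_input_var(
    name="random_state",
    metadata=Metadata(
        input_name="Random State",
        is_required=False,
        default_value=42,
        tooltip="Seed for a reproducible train/test split",
        datatypes=['DOUBLE']  # reusing a datatype shown above; the supported set may differ
    ),
    local_context=locals()
)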

Create output dataset parameters

outputDatasetParameter_train = Helpers.get_or_create_output_dataset(
    name="output_train",
    metadata=Metadata(
        input_name='Output Train Dataset', 
        is_required=True, 
        default_value="train", 
        tooltip='Train Dataset to be created'
    ),
    local_context=locals()
)

outputDatasetParameter_test = Helpers.get_or_create_output_dataset(
    name="output_test",
    metadata=Metadata(
        input_name='Output Test Dataset', 
        is_required=True, 
        default_value="test", 
        tooltip='Test Dataset to be created'
    ),
    local_context=locals()
)

# Fetch all parameter values supplied for this run, then read each value
Helpers.getAllParams(context)
inputDataset = inputDatasetParameter.value
targetCol = targetCol.value
test_size = test_size.value
output_train = outputDatasetParameter_train.value
output_test = outputDatasetParameter_test.value

# test_size arrives from the server as a string; convert it to a float
test_size = float(test_size)

print(inputDataset)
print(targetCol)
print(test_size)
print(output_train)
print(output_test)
Example output:

raw_w_one_hot_encoding
SalePrice
0.2
train
test
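
Because parameter values reach the template as strings, a defensive check after the float conversion can catch malformed input early. This snippet is an illustrative addition, not part of the original template:

# Optional sanity check (illustrative): test_size must be a fraction strictly between 0 and 1
if not 0.0 < test_size < 1.0:
    raise ValueError(f"test_size must be between 0 and 1, got {test_size}")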
inDF = Helpers.getEntityData(context, inputDataset)

# Optional: drop the system-added '__index_level_0__' column if it is present
# if '__index_level_0__' in inDF.columns:
#     inDF = inDF.drop(['__index_level_0__'], axis=1)

Build the function you want your template to execute

import pandas as pd
import warnings
warnings.filterwarnings(action='ignore')

from sklearn.model_selection import train_test_split

X = inDF.drop(targetCol, axis=1)
y = inDF[targetCol]

X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    #stratify=y, 
                                                    test_size=test_size, 
                                                    random_state=42)

y_train = pd.DataFrame(y_train)
y_test = pd.DataFrame(y_test)
train = pd.concat([X_train, y_train], axis=1)
test = pd.concat([X_test, y_test], axis=1)

# Drop the stray 'index' column that the split/concat can introduce
for df in (train, test):
    if 'index' in df.columns:
        df.drop(columns=['index'], inplace=True)
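
As a quick, purely illustrative check before registering the outputs, the resulting shapes can be compared against the requested test_size:

# Illustrative check: the test fraction should be close to the requested test_size
print(train.shape, test.shape)
assert abs(len(test) / len(inDF) - test_size) < 0.05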

Create an output collection

outputCollection = Helpers.createOutputCollection(context)

output_train = Helpers.createTemplateOutputDataset(context=context, outputName=output_train, dataFrame=train)
outputCollection.addTemplateOutput(output_train)

output_test = Helpers.createTemplateOutputDataset(context=context, outputName=output_test, dataFrame=test)
outputCollection.addTemplateOutput(output_test)

Helpers.save(context)