
Helper Functions

This page lists all helper functions supported within RapidCanvas.

Import Helper Functions

from utils.notebookhelpers.helpers import Helpers

Context Management

getOrCreateContext

@staticmethod
def getOrCreateContext(contextId, localVars, entities=None):

Purpose: Creates or retrieves a context object which is essential for most helper functions. This is typically the first function called in a notebook.

Parameters:

  • contextId: String identifier for the context (used for local testing)

  • localVars: Local variables in the current environment (pass locals())

  • entities: Optional dictionary of local datasets with entity name as key and file path as value

Returns: A context dictionary that will be used by other helper functions

Example:

# Initialize context at the beginning of your notebook
context = Helpers.getOrCreateContext(
    contextId="my_test_context", 
    localVars=locals(),
    entities={
        "customers": "/path/to/customers.csv",
        "orders": "/path/to/orders.csv"
    }
)

How it works:

  1. Checks if internalContext already exists in local variables

  2. For production environments, returns an empty context

  3. For local execution, it either:

    • Creates a new context with entity paths if provided

    • Retrieves previously prepared context data
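The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the RapidCanvas implementation: the function name `get_or_create_context_sketch`, the `is_production` flag, the `entities` key, and the `_prepared_contexts` cache are all hypothetical names chosen for the example.

```python
# Hypothetical sketch of the get-or-create flow; not the actual Helpers code.
_prepared_contexts = {}  # assumed stand-in for previously prepared context data


def get_or_create_context_sketch(context_id, local_vars, entities=None, is_production=False):
    # Step 1: reuse a context that already exists in the caller's variables
    if "internalContext" in local_vars:
        return local_vars["internalContext"]

    # Step 2: in production, start from an empty context
    if is_production:
        return {}

    # Step 3a: local execution with entity paths creates a new context
    if entities is not None:
        context = {"contextId": context_id, "entities": dict(entities)}
        _prepared_contexts[context_id] = context
        return context

    # Step 3b: otherwise retrieve previously prepared context data, if any
    return _prepared_contexts.get(context_id, {"contextId": context_id, "entities": {}})


ctx = get_or_create_context_sketch("demo", {}, entities={"customers": "/path/to/customers.csv"})
same_ctx = get_or_create_context_sketch("demo", {})  # retrieves the context prepared above
```

The second call passes no entities, so it falls through to the retrieval branch and returns the context prepared by the first call.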

Notes:

  • This function must be called at the beginning of your notebook

  • The context object is used by most other helper functions

  • For local testing, you can specify local file paths for entities

  • The context now includes additional fields such as:

    • vector_stores: List of vector stores for embedding-based retrieval

    • files_data: List of file paths and metadata

    • global_vars: Dictionary of global variables

Parameter and Variable Management

addParam

@staticmethod
def addParam(context, paramKey, paramVal):

Purpose: Adds a parameter to the context dictionary.

Parameters:

  • context: The context dictionary

  • paramKey: Key name for the parameter

  • paramVal: Value of the parameter

Example:

# Store a configuration value in the context
Helpers.addParam(context, "max_iterations", 100)

getParam

@staticmethod
def getParam(context, param):

Purpose: Retrieves a parameter value from the context by its key.

Parameters:

  • context: The context dictionary

  • param: Key name of the parameter to retrieve

Returns: The value of the parameter, or None if the parameter doesn't exist

Example:

# Get a configuration value from the context
max_iterations = Helpers.getParam(context, "max_iterations")
if max_iterations is None:
    max_iterations = 50  # Default value if not found

getAllParams

@staticmethod
def getAllParams(context):

Purpose: Gets a list of all parameter keys available in the context.

Parameters:

  • context: The context dictionary

Returns: List of all parameter keys

Example:

# List all available parameters in the context
all_params = Helpers.getAllParams(context)
print("Available parameters:", all_params)

get_global_var

@staticmethod
def get_global_var(context, key):

Purpose: Retrieves a global variable from the context by key.

Parameters:

  • context: The context dictionary

  • key: Key name of the global variable to retrieve

Returns: The value of the global variable, or None if not found

Example:

# Get a global variable from the context
environment = Helpers.get_global_var(context, "environment")
if environment == "production":
    # Apply production-specific logic
    pass

get_secret

@staticmethod
def get_secret(context: dict, key: str):

Purpose: Securely retrieves and decrypts secret values from the context.

Parameters:

  • context: The context dictionary

  • key: Key name of the secret to retrieve

Returns: The decrypted secret value

Raises: Exception if the secret is not found

Example:

# Get a secret API key
try:
    api_key = Helpers.get_secret(context, "API_KEY")
    # Use the API key for authentication
except Exception as e:
    print(f"Error retrieving secret: {e}")

How it works:

  1. Checks if the secret exists in the context

  2. If it's a string, decodes the base64 encoding

  3. Decrypts the value using AES encryption with environment keys

  4. Removes control characters from the result

  5. If not found in context, tries to get from environment variables

  6. Raises an exception if the secret is not found anywhere
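The lookup order above can be outlined as follows. This is a hedged sketch, not the library's code: the function name `get_secret_sketch`, the `secrets` key in the context, and the pluggable `decrypt` callable are assumptions. The real AES step and its environment keys are not reproduced here, so `decrypt` defaults to a pass-through.

```python
import base64
import os
import re


def get_secret_sketch(context, key, decrypt=lambda raw: raw):
    """Hypothetical outline of the lookup order; `decrypt` stands in for the AES step."""
    secrets = context.get("secrets", {})  # assumed location of secrets in the context
    if key in secrets:
        value = secrets[key]
        if isinstance(value, str):
            # Undo the base64 encoding to recover the encrypted bytes
            value = base64.b64decode(value)
        # Decrypt (placeholder here) and decode the bytes to text
        text = decrypt(value).decode("utf-8")
        # Strip control characters, such as padding bytes left after decryption
        return re.sub(r"[\x00-\x1f\x7f]", "", text)
    # Not in the context: fall back to environment variables
    if key in os.environ:
        return os.environ[key]
    raise Exception(f"Secret '{key}' not found")


# The trailing \x04 bytes imitate block-cipher padding that must be removed
demo_context = {"secrets": {"API_KEY": base64.b64encode(b"abc123\x04\x04").decode("ascii")}}
api_key = get_secret_sketch(demo_context, "API_KEY")
```

Keeping the decryption step as a parameter makes the control flow testable without real keys; it says nothing about how RapidCanvas manages its encryption keys.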

Notes:

  • This function handles decryption of sensitive information

  • Secrets are stored encrypted in the context for security

Entity Data Management

getAllEntities

@staticmethod
def getAllEntities(context):

Purpose: Returns a list of all entity names available in the context.

Parameters:

  • context: The context dictionary

Returns: List of entity names

Example:

# Get all available entities
entities = Helpers.getAllEntities(context)
print("Available entities:", entities)

getEntityData

@staticmethod
def getEntityData(context, entityName, inferDTypesFromSchema=False, numRows=None, pandas_lib=None):

Purpose: Loads an entity (dataset) as a pandas DataFrame.

Parameters:

  • context: The context dictionary

  • entityName: Name of the entity to load

  • inferDTypesFromSchema: If True, will use schema to set data types

  • numRows: Optional limit on number of rows to read

  • pandas_lib: Optional pandas library to use (defaults to standard pandas)

Returns: DataFrame containing the entity data

Raises: Exception if the entity is not found

Example:

# Load customer data
try:
    customers_df = Helpers.getEntityData(context, "customers")
    
    # Load just the first 100 rows
    sample_df = Helpers.getEntityData(context, "customers", numRows=100)
    
    # Process the data
    customers_df['full_name'] = customers_df['first_name'] + ' ' + customers_df['last_name']
except Exception as e:
    print(f"Error loading entity: {e}")

How it works:

  1. Looks up the entity in the context's entity paths

  2. Gets the schema if needed

  3. Downloads the entity file if necessary

  4. Reads the data into a DataFrame

  5. Raises an informative exception if the entity doesn't exist
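A rough, self-contained outline of the lookup, row limit, and error handling described above. It is a sketch under assumptions: it uses the standard `csv` module and returns a list of row dictionaries instead of a DataFrame, it skips the schema and download steps, and the name `get_entity_rows_sketch` and the `entities` context key are hypothetical.

```python
import csv
import os
import tempfile
from itertools import islice


def get_entity_rows_sketch(context, entity_name, num_rows=None):
    """Hypothetical outline: look up the entity path, then read up to num_rows rows."""
    entity_paths = context.get("entities", {})  # assumed key for entity paths
    if entity_name not in entity_paths:
        available = ", ".join(sorted(entity_paths)) or "none"
        raise Exception(f"Entity '{entity_name}' not found. Available entities: {available}")
    with open(entity_paths[entity_name], newline="") as f:
        reader = csv.DictReader(f)
        # islice with a stop value of None reads every row
        return list(islice(reader, num_rows))


# Build a small CSV file so the sketch can run on its own
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "customers.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["first_name", "last_name"])
    writer.writerows([["Ada", "Lovelace"], ["Alan", "Turing"], ["Grace", "Hopper"]])

demo_context = {"entities": {"customers": csv_path}}
all_rows = get_entity_rows_sketch(demo_context, "customers")
sample_rows = get_entity_rows_sketch(demo_context, "customers", num_rows=2)
```

Listing the available entity names in the error message is one way to make the exception informative; the wording of the real exception may differ.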

load_all_entities

@staticmethod
def load_all_entities(context):

Purpose: Loads all entities into a dictionary of DataFrames.

Parameters:

  • context: The context dictionary

Returns: Dictionary with entity names as keys and DataFrames as values

Example:

# Load all available entities as DataFrames
entities_dict = Helpers.load_all_entities(context)

# Access individual DataFrames
customers_df = entities_dict['customers']
orders_df = entities_dict['orders']

# Process multiple datasets
for entity_name, df in entities_dict.items():
    print(f"Entity: {entity_name}, Rows: {len(df)}")

getEntityFilePath

@staticmethod
def getEntityFilePath(context, entityName):

Purpose: Gets the file path for a specific entity.

Parameters:

  • context: The context dictionary

  • entityName: Name of the entity

Returns: File path string, or None if entity doesn't exist

Example:

# Get the path to the customers entity file
entity_path = Helpers.getEntityFilePath(context, "customers")
if entity_path:
    print(f"Entity file is located at: {entity_path}")
else:
    print("Entity not found")

get_data_from_source

@staticmethod
def get_data_from_source(source_type: DataSourceType, source: str, name: str = None, **options):

Purpose: Retrieves data from various data sources.

Parameters:

  • source_type: Type of data source (enum from DataSourceType)

  • source: Path or identifier for the data source

  • name: Optional name for the data source

  • **options: Additional options for the data source

Returns: Data from the source (typically a DataFrame)

Example:

from utils.rcclient.enums import DataSourceType

# Read data from a CSV file
csv_data = Helpers.get_data_from_source(
    source_type=DataSourceType.CSV,
    source="path/to/file.csv"
)

# Read data from a database
db_data = Helpers.get_data_from_source(
    source_type=DataSourceType.SQL,
    source="SELECT * FROM customers",
    connection_string="postgresql://user:pass@localhost/dbname"
)

write_data_to_source

@staticmethod
def write_data_to_source(df, source_type: DataSourceType, target: str, name: str = None, **options):

Purpose: Writes data to various data destinations.

Parameters:

  • df: DataFrame to write

  • source_type: Type of data source (enum from DataSourceType)

  • target: Path or identifier for the data destination

  • name: Optional name for the data source

  • **options: Additional options for the data source

Example:

from utils.rcclient.enums import DataSourceType
import pandas as pd

# Create a simple DataFrame
df = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})

# Write to a CSV file
Helpers.write_data_to_source(
    df=df,
    source_type=DataSourceType.CSV,
    target="path/to/output.csv"
)

# Write to a database table
Helpers.write_data_to_source(
    df=df,
    source_type=DataSourceType.SQL,
    target="my_table",
    connection_string="postgresql://user:pass@localhost/dbname",
    if_exists="replace"  # Options: fail, replace, append
)

Artifact Management

getOrCreateArtifactsDir

@staticmethod
def getOrCreateArtifactsDir(context, artifactsId, purgeOld=False):

Purpose: Creates or retrieves a directory for storing artifacts (models, files, etc.).

Parameters:

  • context: The context dictionary

  • artifactsId: Identifier for the artifacts collection (auto-generated if None)

  • purgeOld: If True, will create a fresh directory even if it exists

Returns: Path to the artifacts directory

Example:

import os
import pickle

# Create an artifacts directory
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "my_model_files")

# Save a model file to the artifacts directory
# (`model` is assumed to be a trained, picklable object defined earlier)
model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

print(f"Artifacts directory: {artifacts_dir}")
print("Use artifacts ID 'my_model_files' to download these artifacts later")

How it works:

  1. If no artifactsId is provided, generates a random UUID

  2. Adds the artifactsId to the context for tracking

  3. Constructs the local path for the artifacts directory

  4. Either creates a fresh directory or downloads existing artifacts

  5. Returns the path to the artifacts directory
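The directory handling in the steps above might look roughly like this. It is a sketch under assumptions, not the actual implementation: the function name, the `artifactsIds` context key, and the `base_dir` parameter are hypothetical, and the step that downloads existing artifacts from remote storage is omitted.

```python
import os
import shutil
import tempfile
import uuid


def get_or_create_artifacts_dir_sketch(context, artifacts_id=None, purge_old=False, base_dir=None):
    """Hypothetical outline of the directory handling; the download step is omitted."""
    # Step 1: generate a random UUID when no ID is given
    if artifacts_id is None:
        artifacts_id = str(uuid.uuid4())
    # Step 2: track the ID in the context (assumed key name)
    ids = context.setdefault("artifactsIds", [])
    if artifacts_id not in ids:
        ids.append(artifacts_id)
    # Step 3: construct the local path
    base_dir = base_dir or os.path.join(tempfile.gettempdir(), "rc_artifacts_sketch")
    path = os.path.join(base_dir, artifacts_id)
    # Step 4: start from an empty directory when requested, otherwise keep what exists
    if purge_old and os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)
    # Step 5: return the path
    return path


demo_context = {}
demo_base = tempfile.mkdtemp()
artifacts_dir = get_or_create_artifacts_dir_sketch(demo_context, "my_model_files", base_dir=demo_base)
with open(os.path.join(artifacts_dir, "notes.txt"), "w") as f:
    f.write("hello")

# Asking again with purge_old=True returns the same path, now empty
fresh_dir = get_or_create_artifacts_dir_sketch(
    demo_context, "my_model_files", purge_old=True, base_dir=demo_base
)
```

Tracking the ID in the context is what lets a later save or upload step find every artifacts directory created during the run.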

Notes:

  • Artifacts are automatically uploaded when you call Helpers.save(context)

  • Use the same artifactsId to access the same artifacts in different sessions

downloadArtifacts

@staticmethod
def downloadArtifacts(context, artifactsId):

Purpose: Downloads artifacts from storage to the local filesystem.

Parameters:

  • context: The context dictionary

  • artifactsId: ID of the artifacts to download

Returns: Dictionary of file names and their paths

Example:

import pickle

# Download model artifacts by ID
files_dict = Helpers.downloadArtifacts(context, "my_model_files")

# Access specific files
model_path = files_dict.get("model.pkl")
if model_path:
    with open(model_path, "rb") as f:
        model = pickle.load(f)

list_artifact_files

@staticmethod
def list_artifact_files(context, artifacts_id):

Purpose: Lists files in an artifact without downloading them.

Parameters:

  • context: The context dictionary

  • artifacts_id: ID of the artifacts to list

Returns: List of file names in the artifact

Example:

# See what files are available in an artifact
file_list = Helpers.list_artifact_files(context, "my_model_files")
print("Available files:", file_list)

# Check if a specific file exists
if "model.pkl" in file_list:
    # Download only that file
    model_path = Helpers.download_artifact_file(context, "my_model_files", "model.pkl")

download_artifact_file

@staticmethod
def download_artifact_file(context, artifacts_id, file_name):

Purpose: Downloads a single file from an artifact.

Parameters:

  • context: The context dictionary

  • artifacts_id: ID of the artifact containing the file

  • file_name: Name of the file to download

Returns: Path to the downloaded file

Example:

import pickle

# Download just the model file from an artifact
model_path = Helpers.download_artifact_file(context, "my_model_files", "model.pkl")

# Load the model
with open(model_path, "rb") as f:
    model = pickle.load(f)

get_artifact

@staticmethod
def get_artifact(context, artifact_id, artifact_name):

Purpose: Gets the path to a specific artifact file, downloading it if necessary.

Parameters:

  • context: The context dictionary

  • artifact_id: ID of the artifact

  • artifact_name: Name of the specific file in the artifact

Returns: Path to the artifact file

Example:

import json

# Get a specific file from an artifact
config_path = Helpers.get_artifact(context, "project_files", "config.json")

# Read the config file
with open(config_path, "r") as f:
    config = json.load(f)

uploadArtifacts

@staticmethod
def uploadArtifacts(context):

Purpose: Uploads all artifacts in the context to remote storage.

Parameters:

  • context: The context dictionary

Example:

# Upload all artifacts registered in the context
Helpers.uploadArtifacts(context)

Notes:

  • This is automatically called by Helpers.save(context)

  • Only uploads artifacts with IDs listed in the context

Output Management

createOutputCollection

@staticmethod
def createOutputCollection(context):

Purpose: Creates or retrieves a template output collection for storing outputs.

Parameters:

  • context: The context dictionary

Returns: TemplateOutputCollection object

Example:

# Create or get the output collection
output_collection = Helpers.createOutputCollection(context)

Notes:

  • Most output creation functions add to this collection automatically

  • This is called internally by other output functions

getOutputCollection

@staticmethod
def getOutputCollection(context):

Purpose: Gets the existing template output collection from the context.

Parameters:

  • context: The context dictionary

Returns: TemplateOutputCollection object or None if not found

Example:

# Get the output collection
output_collection = Helpers.getOutputCollection(context)
if output_collection:
    # Check how many outputs are in the collection
    print(f"Number of outputs: {len(output_collection.templateOutputs)}")

createTemplateOutput

@staticmethod
def createTemplateOutput(context, outputName: str, outputType: OutputType, data=None, 
                         dataType: FileType = FileType.CSV, outputFileName: str = None,
                         custom_params: dict = {}, metadata: dict = {}, 
                         description: str = None, group: str = None):

Purpose: Creates a template output of any type.

Parameters:

  • context: The context dictionary

  • outputName: Name of the output

  • outputType: Type of output from OutputType enum

  • data: Data for the output (typically DataFrame)

  • dataType: Type of data file from FileType enum (CSV, PARQUET, etc.)

  • outputFileName: Optional custom filename

  • custom_params: Additional parameters for the output

  • metadata: Metadata for the output

  • description: Optional description

  • group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

from utils.dtos.templateOutput import OutputType, FileType
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})

# Create a CSV output
output = Helpers.createTemplateOutput(
    context=context,
    outputName="my_output",
    outputType=OutputType.ENTITY,
    data=df,
    dataType=FileType.CSV,
    description="Sample data output",
    group="Data Outputs"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(output)

Notes:

  • This is a general-purpose output creation function

  • Specialized versions exist for specific output types

  • Automatically handles file generation and storage

createTemplateOutputDataset

@staticmethod
def createTemplateOutputDataset(context, outputName, dataFrame):

Purpose: Creates a dataset (entity) output from a DataFrame.

Parameters:

  • context: The context dictionary

  • outputName: Name of the dataset

  • dataFrame: Pandas DataFrame with the data

Returns: TemplateOutput object

Example:

import pandas as pd

# Create a processed dataset
processed_df = pd.DataFrame({'id': [1, 2, 3], 'processed_value': [10, 20, 30]})

# Create a dataset output
dataset_output = Helpers.createTemplateOutputDataset(
    context=context,
    outputName="processed_data",
    dataFrame=processed_df
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(dataset_output)

Notes:

  • This is a convenience wrapper around createTemplateOutput

  • Automatically sets outputType to ENTITY and dataType to PARQUET

create_template_output_file

@staticmethod
def create_template_output_file(context, output_name, file_contents, file_type: FileType = FileType.TEXT):

Purpose: Creates a file output with the specified content.

Parameters:

  • context: The context dictionary

  • output_name: Name of the output

  • file_contents: String content of the file

  • file_type: Type of file from FileType enum (TEXT, JSON, MARKDOWN, HTML). Defaults to TEXT if not specified.

Returns: TemplateOutput object

Example:

import json

from utils.dtos.templateOutput import FileType

# Create a JSON configuration file
config = {
    "model_params": {
        "n_estimators": 100,
        "max_depth": 5
    }
}
json_content = json.dumps(config, indent=2)

# Create a file output
file_output = Helpers.create_template_output_file(
    context=context,
    output_name="model_config",
    file_contents=json_content,
    file_type=FileType.JSON
)

# Create a text file output with default file type
text_output = Helpers.create_template_output_file(
    context=context,
    output_name="readme",
    file_contents="This is a simple text file."
    # file_type parameter omitted, defaults to FileType.TEXT
)

# Add the outputs to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(file_output)
output_collection.addOutput(text_output)

create_template_output_artifact

@staticmethod
def create_template_output_artifact(context, artifact_name):

Purpose: Creates an artifact output for storing files.

Parameters:

  • context: The context dictionary

  • artifact_name: Name of the artifact (should match an artifactsId)

Returns: TemplateOutput object

Example:

import os
import pickle

# First create an artifacts directory
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_files")

# Save model files to the directory
# (`model` is assumed to be a trained, picklable object defined earlier)
model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Create an artifact output
artifact_output = Helpers.create_template_output_artifact(
    context=context,
    artifact_name="model_files"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(artifact_output)

Notes:

  • The artifact_name should match the artifactsId used with getOrCreateArtifactsDir

  • This registers the artifact for proper handling when the context is saved

Chart Output Functions

create_template_output_chart

@staticmethod
def create_template_output_chart(context, title, metadata={}, description=None, group=None):

Purpose: Creates a base chart output without specifying the chart type or data.

Parameters:

  • context: The context dictionary

  • title: Title of the chart

  • metadata: Optional metadata for the chart

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

# Create a base chart output
chart_output = Helpers.create_template_output_chart(
    context=context,
    title="Data Distribution",
    description="Distribution of values across categories",
    group="Analysis Charts"
)

Notes:

  • This creates a basic chart output container

  • You'll typically use one of the more specific chart functions instead

createTemplateOutputEChart

@staticmethod
def createTemplateOutputEChart(context, chartTitle, dataFrame, chartType=ChartType.TABLE, 
                             params={}, description=None, group=None):

Purpose: Creates a chart output using ECharts visualization library.

Parameters:

  • context: The context dictionary

  • chartTitle: Title of the chart

  • dataFrame: Pandas DataFrame with the data to visualize

  • chartType: Type of chart from ChartType enum (default is TABLE)

  • params: Optional parameters for chart configuration

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

from utils.dtos.templateOutput import ChartType
import pandas as pd

# Create a DataFrame with sample data
data = {
    'category': ['A', 'B', 'C', 'D'],
    'value': [10, 25, 15, 30]
}
df = pd.DataFrame(data)

# Create a bar chart
bar_chart = Helpers.createTemplateOutputEChart(
    context=context,
    chartTitle="Category Distribution",
    dataFrame=df,
    chartType=ChartType.BAR,
    description="Distribution of values by category",
    group="Analysis Charts"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(bar_chart)

Notes:

  • Uses ECharts library for interactive visualizations

  • Common chart types include BAR, LINE, PIE, SCATTER, TABLE

createTemplateOutputPlotlibChart

@staticmethod
def createTemplateOutputPlotlibChart(context, chartTitle: str, plt, description=None, group=None):

Purpose: Creates a chart output from a matplotlib plot.

Parameters:

  • context: The context dictionary

  • chartTitle: Title of the chart

  • plt: Matplotlib pyplot object with the plot

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

import matplotlib.pyplot as plt
import numpy as np

# Create a matplotlib plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y)
plt.title("Sine Wave")
plt.xlabel("X")
plt.ylabel("sin(x)")
plt.grid(True)

# Create a chart output from the matplotlib plot
sine_chart = Helpers.createTemplateOutputPlotlibChart(
    context=context,
    chartTitle="Sine Wave Plot",
    plt=plt,
    description="A plot of the sine function",
    group="Mathematical Functions"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(sine_chart)

Notes:

  • Saves the matplotlib figure as a PNG file

  • The plot is automatically saved to the artifacts directory

createTemplateOutputPlotlyChart

@staticmethod
def createTemplateOutputPlotlyChart(context, chartTitle: str, plotly_fig, description=None, group=None):

Purpose: Creates a chart output from a Plotly figure.

Parameters:

  • context: The context dictionary

  • chartTitle: Title of the chart

  • plotly_fig: Plotly figure object

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

import plotly.express as px
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'x': range(10),
    'y': [i**2 for i in range(10)],
    'category': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
})

# Create a Plotly figure
fig = px.scatter(df, x='x', y='y', color='category', 
                title='Quadratic Function by Category')

# Create a chart output from the Plotly figure
scatter_chart = Helpers.createTemplateOutputPlotlyChart(
    context=context,
    chartTitle="Quadratic Scatter Plot",
    plotly_fig=fig,
    description="A scatter plot showing x² by category",
    group="Mathematical Functions"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(scatter_chart)

Notes:

  • Saves the Plotly figure as an HTML file with interactive features

  • Uses CDN for Plotly JavaScript libraries to keep file size small

createTemplateOutputPlotlyChartAsJson

@staticmethod
def createTemplateOutputPlotlyChartAsJson(context, chartTitle: str, plotly_fig, group=None):

Purpose: Creates a chart output from a Plotly figure, saving it as JSON.

Parameters:

  • context: The context dictionary

  • chartTitle: Title of the chart

  • plotly_fig: Plotly figure object

  • group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

import plotly.express as px
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'sales': [200, 150, 300, 250, 400]
})

# Create a Plotly figure
fig = px.bar(df, x='month', y='sales', title='Monthly Sales')

# Create a chart output from the Plotly figure as JSON
json_chart = Helpers.createTemplateOutputPlotlyChartAsJson(
    context=context,
    chartTitle="Monthly Sales Chart",
    plotly_fig=fig,
    group="Business Analytics"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(json_chart)

Notes:

  • Saves the Plotly figure as a JSON file

  • This format is useful when you need to load and modify the chart later
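To illustrate why the JSON form is convenient for later changes, here is a small sketch that uses only the standard `json` module. The figure dictionary is a hand-written stand-in, not output captured from RapidCanvas; it relies only on the fact that a Plotly figure serializes to a JSON object with top-level `data` and `layout` keys.

```python
import json

# Hand-written stand-in for a saved Plotly figure; real files hold the same
# top-level "data" and "layout" keys, usually with many more fields
saved_json = json.dumps({
    "data": [{"type": "bar", "x": ["Jan", "Feb", "Mar"], "y": [200, 150, 300]}],
    "layout": {"title": {"text": "Monthly Sales"}},
})

# Load the figure description back into a dictionary
fig_dict = json.loads(saved_json)

# Modify the chart: change the title and add a bar color
fig_dict["layout"]["title"]["text"] = "Monthly Sales (Q1)"
fig_dict["data"][0]["marker"] = {"color": "steelblue"}

# Serialize the modified figure again
updated_json = json.dumps(fig_dict, indent=2)
```

With Plotly installed, `plotly.io.from_json(updated_json)` rebuilds a figure object from the modified JSON string.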

Model and Metadata Functions

create_template_output_rc_ml_model

@staticmethod
def create_template_output_rc_ml_model(context, model_name, model_obj, artifacts, version="default"):

Purpose: Creates an output for a machine learning model.

Parameters:

  • context: The context dictionary

  • model_name: Name of the model

  • model_obj: Model object (must extend RCMLModel class)

  • artifacts: Dictionary of model artifacts

  • version: Optional version string for the model

Returns: TemplateOutput object

Raises: Exception if model_obj doesn't extend RCMLModel

Example:

from utils.dtos.rc_ml_model import RCMLModel

# Define a custom model class
class MyClassifier(RCMLModel):
    def __init__(self):
        super().__init__()
        self.model = None
    
    def fit(self, X, y):
        # Train the model
        from sklearn.ensemble import RandomForestClassifier
        self.model = RandomForestClassifier()
        self.model.fit(X, y)
    
    def predict(self, X):
        # Make predictions
        return self.model.predict(X)

# Create and train the model
model = MyClassifier()
model.fit(X_train, y_train)

# Save model artifacts
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_artifacts")
artifacts = {
    "model_info": "Random Forest Classifier with default parameters"
}

# Create model output
model_output = Helpers.create_template_output_rc_ml_model(
    context=context,
    model_name="customer_classifier",
    model_obj=MyClassifier,
    artifacts=artifacts,  # the artifacts dictionary defined above
    version="v1.0"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(model_output)

Notes:

  • The model class must extend RCMLModel

  • Artifacts should contain any files needed for the model to function

get_rc_ml_model

@staticmethod
def get_rc_ml_model(context, model_name, model_version="default"):

Purpose: Loads a previously saved ML model.

Parameters:

  • context: The context dictionary

  • model_name: Name of the model to load

  • model_version: Version of the model (default is "default")

Returns: Loaded model object

Example:

# Load a saved model
model = Helpers.get_rc_ml_model(
    context=context,
    model_name="customer_classifier",
    model_version="v1.0"
)

# Use the model to make predictions
predictions = model.predict(X_test)

create_template_output_metadata

@staticmethod
def create_template_output_metadata(context, metadata_list):

Purpose: Creates an output for metadata information.

Parameters:

  • context: The context dictionary

  • metadata_list: List of metadata objects

Returns: TemplateOutput object

Example:

from utils.dtos.metadata import Metadata, MetadataSubjectType

# Create metadata objects
metadata1 = Metadata(
    subject_type=MetadataSubjectType.DATASET,
    subject_name="customers",
    key="row_count",
    value="5000"
)

metadata2 = Metadata(
    subject_type=MetadataSubjectType.MODEL,
    subject_name="customer_classifier",
    key="accuracy",
    value="0.92"
)

# Create metadata output
metadata_output = Helpers.create_template_output_metadata(
    context=context,
    metadata_list=[metadata1, metadata2]
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(metadata_output)

get_metadata_value

@staticmethod
def get_metadata_value(context, subject_type: MetadataSubjectType, subject_name: str, key: str):

Purpose: Retrieves a specific metadata value from the context.

Parameters:

  • context: The context dictionary

  • subject_type: Type of subject from MetadataSubjectType enum

  • subject_name: Name of the subject

  • key: Metadata key to retrieve

Returns: The metadata value, or None if not found

Example:

from utils.dtos.metadata import MetadataSubjectType

# Get metadata value
accuracy = Helpers.get_metadata_value(
    context=context,
    subject_type=MetadataSubjectType.MODEL,
    subject_name="customer_classifier",
    key="accuracy"
)

print(f"Model accuracy: {accuracy}")
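
Since get_metadata_value returns None when the key is not found, and the metadata examples in this guide store values as strings such as "0.92", it can help to normalize the result before using it in comparisons. A minimal sketch, where metadata_as_float is a hypothetical helper and not part of the SDK:

```python
def metadata_as_float(value, default=None):
    """Convert a metadata value to float; return default if it is None or not numeric."""
    if value is None:
        return default
    try:
        return float(value)
    except (TypeError, ValueError):
        return default

# Values as they might come back from a metadata lookup
print(metadata_as_float("0.92"))             # numeric string
print(metadata_as_float(None, default=0.0))  # key not found
print(metadata_as_float("n/a", default=0.0)) # non-numeric value
```

The value returned by Helpers.get_metadata_value can be passed straight into such a helper before a threshold check.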

get_all_metadata

@staticmethod
def get_all_metadata(context):

Purpose: Retrieves all metadata from the context.

Parameters:

  • context: The context dictionary

Returns: Dictionary of all metadata

Example:

# Get all metadata
all_metadata = Helpers.get_all_metadata(context)

# Print metadata information
for metadata in all_metadata:
    print(f"Subject: {metadata['subject_name']}, Key: {metadata['key']}, Value: {metadata['value']}")

create_template_output_answer

@staticmethod
def create_template_output_answer(context, answer):

Purpose: Creates an output containing an answer or result.

Parameters:

  • context: The context dictionary

  • answer: The answer text or object

Returns: TemplateOutput object

Example:

# Analyze data and create an answer
analysis_result = "Based on the data analysis, customer segment A shows the highest retention rate at 87%."

# Create answer output
answer_output = Helpers.create_template_output_answer(
    context=context,
    answer=analysis_result
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(answer_output)

create_template_output_vector_store

@staticmethod
def create_template_output_vector_store(context, vector_store_obj: VectorStoreBase):

Purpose: Creates an output for a vector store (used for embeddings and similarity search).

Parameters:

  • context: The context dictionary

  • vector_store_obj: Vector store object (must extend VectorStoreBase)

Returns: TemplateOutput object

Example:

from utils.libutils.vectorStores.faiss import FaissVectorStore
import numpy as np

# Create a vector store
vector_store = FaissVectorStore(name="document_embeddings")

# Add documents with their embeddings
documents = ["This is document 1", "This is document 2", "This is document 3"]
embeddings = np.random.rand(3, 128)  # Simulated embeddings
for i, (doc, embedding) in enumerate(zip(documents, embeddings)):
    vector_store.add_item(i, doc, embedding)

# Create vector store output
vs_output = Helpers.create_template_output_vector_store(
    context=context,
    vector_store_obj=vector_store
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(vs_output)
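
For readers new to vector stores, the sketch below illustrates the kind of similarity search such a store supports, using plain Python and cosine similarity. It is a conceptual illustration only: the functions cosine_similarity and top_k are hypothetical and do not reflect how FaissVectorStore is implemented or queried.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, items, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query, emb), doc) for doc, emb in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

# Tiny two-dimensional embeddings chosen by hand for illustration
items = [
    ("This is document 1", [1.0, 0.0]),
    ("This is document 2", [0.0, 1.0]),
    ("This is document 3", [0.7, 0.7]),
]
nearest = top_k([1.0, 0.1], items, k=2)
print(nearest)
```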

Convenience Save Functions

These functions combine creating an output and saving it to the context in a single step.

save_output_dataset

@staticmethod
def save_output_dataset(context, output_name, data_frame) -> TemplateOutput:

Purpose: Creates a dataset output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • output_name: Name of the dataset

  • data_frame: Pandas DataFrame with the data

Returns: The created TemplateOutput object

Example:

import pandas as pd

# Create a processed dataset
processed_df = pd.DataFrame({'id': [1, 2, 3], 'processed_value': [10, 20, 30]})

# Create, add to collection, and save the dataset output in one step
Helpers.save_output_dataset(context, "processed_data", processed_df)

Notes:

  • Automatically adds the output to the template output collection

  • Calls Helpers.save(context) for you

  • Returns the TemplateOutput object for reference if needed

save_output_plotly_chart_as_json

@staticmethod
def save_output_plotly_chart_as_json(context, chart_title, plotly_fig, group=None) -> TemplateOutput:

Purpose: Creates a Plotly chart output as JSON, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • chart_title: Title of the chart

  • plotly_fig: Plotly figure object

  • group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

import plotly.express as px
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'sales': [200, 150, 300, 250, 400]
})

# Create a Plotly figure
fig = px.bar(df, x='month', y='sales', title='Monthly Sales')

# Create, add to collection, and save the chart output in one step
Helpers.save_output_plotly_chart_as_json(context, "Monthly Sales Chart", fig, group="Business Analytics")

save_output_plot_lib_chart

@staticmethod
def save_output_plot_lib_chart(context, chart_title, plt, description=None, group=None) -> TemplateOutput:

Purpose: Creates a matplotlib chart output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • chart_title: Title of the chart

  • plt: Matplotlib pyplot object with the plot

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

import matplotlib.pyplot as plt
import numpy as np

# Create a matplotlib plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y)
plt.title("Sine Wave")
plt.xlabel("X")
plt.ylabel("sin(x)")
plt.grid(True)

# Create, add to collection, and save the chart output in one step
Helpers.save_output_plot_lib_chart(context, "Sine Wave Plot", plt, 
                                   description="A plot of the sine function", 
                                   group="Mathematical Functions")

save_output_echart

@staticmethod
def save_output_echart(context, chart_title, data_frame, chart_type: ChartType = ChartType.TABLE, 
                      params=None, description=None, group=None) -> TemplateOutput:

Purpose: Creates an ECharts visualization, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • chart_title: Title of the chart

  • data_frame: Pandas DataFrame with the data to visualize

  • chart_type: Type of chart from ChartType enum (default is TABLE)

  • params: Optional parameters for chart configuration

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

from utils.dtos.templateOutput import ChartType
import pandas as pd

# Create a DataFrame with sample data
data = {
    'category': ['A', 'B', 'C', 'D'],
    'value': [10, 25, 15, 30]
}
df = pd.DataFrame(data)

# Create, add to collection, and save the chart output in one step
Helpers.save_output_echart(context, "Category Distribution", df, ChartType.BAR,
                          description="Distribution of values by category", 
                          group="Analysis Charts")

save_output_chart

@staticmethod
def save_output_chart(context, title, metadata=None, description=None, group=None) -> TemplateOutput:

Purpose: Creates a base chart output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • title: Title of the chart

  • metadata: Optional metadata for the chart

  • description: Optional description of the chart

  • group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

# Create, add to collection, and save a base chart output in one step
Helpers.save_output_chart(context, "Data Distribution",
                         description="Distribution of values across categories",
                         group="Analysis Charts")

save_output_rc_ml_model

@staticmethod
def save_output_rc_ml_model(context, model_name, model_obj, artifacts, version="default") -> TemplateOutput:

Purpose: Creates a machine learning model output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • model_name: Name of the model

  • model_obj: Model object (must extend RCMLModel class)

  • artifacts: Dictionary of model artifacts

  • version: Optional version string for the model (default is "default")

Returns: The created TemplateOutput object

Example:

from utils.dtos.rc_ml_model import RCMLModel

# Define a custom model class
class MyClassifier(RCMLModel):
    def __init__(self):
        super().__init__()
        self.model = None
    
    def fit(self, X, y):
        # Train the model
        from sklearn.ensemble import RandomForestClassifier
        self.model = RandomForestClassifier()
        self.model.fit(X, y)
    
    def predict(self, X):
        # Make predictions
        return self.model.predict(X)

# Create and train the model
model = MyClassifier()
model.fit(X_train, y_train)

# Save model artifacts
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_artifacts")
artifacts = {
    "model_info": "Random Forest Classifier with default parameters"
}

# Create, add to collection, and save the model output in one step
Helpers.save_output_rc_ml_model(context, "customer_classifier", MyClassifier, artifacts, "v1.0")

save_output_artifacts

@staticmethod
def save_output_artifacts(context, artifact_name) -> TemplateOutput:

Purpose: Creates an artifact output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • artifact_name: Name of the artifact

Returns: The created TemplateOutput object

Example:

import os
import pickle

# First create an artifacts directory
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_files")

# Save model files to the directory
model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Create, add to collection, and save the artifact output in one step
Helpers.save_output_artifacts(context, "model_files")

save_output_answer

@staticmethod
def save_output_answer(context, answer) -> TemplateOutput:

Purpose: Creates an answer output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • answer: The answer text or object

Returns: The created TemplateOutput object

Example:

# Analyze data and create an answer
analysis_result = "Based on the data analysis, customer segment A shows the highest retention rate at 87%."

# Create, add to collection, and save the answer output in one step
Helpers.save_output_answer(context, analysis_result)

save_output_metadata

@staticmethod
def save_output_metadata(context, metadata_list) -> TemplateOutput:

Purpose: Creates a metadata output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • metadata_list: List of metadata objects

Returns: The created TemplateOutput object

Example:

from utils.dtos.metadata import Metadata, MetadataSubjectType

# Create metadata objects
metadata1 = Metadata(
    subject_type=MetadataSubjectType.DATASET,
    subject_name="customers",
    key="row_count",
    value="5000"
)

metadata2 = Metadata(
    subject_type=MetadataSubjectType.MODEL,
    subject_name="customer_classifier",
    key="accuracy",
    value="0.92"
)

# Create, add to collection, and save the metadata output in one step
Helpers.save_output_metadata(context, [metadata1, metadata2])

save_output_file

@staticmethod
def save_output_file(context, output_name, file_contents, file_type: FileType = FileType.TEXT) -> TemplateOutput:

Purpose: Creates a file output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • output_name: Name of the output

  • file_contents: String content of the file

  • file_type: Type of file from FileType enum (TEXT, JSON, MARKDOWN, HTML; default is TEXT)

Returns: The created TemplateOutput object

Example:

import json

from utils.dtos.templateOutput import FileType

# Create a JSON configuration file
config = {
    "model_params": {
        "n_estimators": 100,
        "max_depth": 5
    }
}
json_content = json.dumps(config, indent=2)

# Create, add to collection, and save the file output in one step
Helpers.save_output_file(context, "model_config", json_content, FileType.JSON)

save_output_vector_store

@staticmethod
def save_output_vector_store(context, vector_store_obj) -> TemplateOutput:

Purpose: Creates a vector store output, adds it to the collection, and saves the context in one step.

Parameters:

  • context: The context dictionary

  • vector_store_obj: Vector store object (must extend VectorStoreBase)

Returns: The created TemplateOutput object

Example:

from utils.libutils.vectorStores.faiss import FaissVectorStore
import numpy as np

# Create a vector store
vector_store = FaissVectorStore(name="document_embeddings")

# Add documents with their embeddings
documents = ["This is document 1", "This is document 2", "This is document 3"]
embeddings = np.random.rand(3, 128)  # Simulated embeddings
for i, (doc, embedding) in enumerate(zip(documents, embeddings)):
    vector_store.add_item(i, doc, embedding)

# Create, add to collection, and save the vector store output in one step
Helpers.save_output_vector_store(context, vector_store)

Utility Functions

initH2o

@staticmethod
def initH2o(h2o=None, h2oServerUrl=None, init_type=1):

Purpose: Initializes the H2O machine learning library.

Parameters:

  • h2o: Optional H2O module (imported if None)

  • h2oServerUrl: URL of H2O server to connect to

  • init_type: 0 to only connect to an existing server; 1 (default) to start a local instance if the connection fails

Returns: Initialized H2O module

Example:

# Initialize H2O framework
h2o = Helpers.initH2o()

# Load data into H2O frame
h2o_frame = h2o.import_file("/path/to/data.csv")

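# Train a model using H2O
# (predictors and target are assumed to be defined: feature column names and the target column name)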
model = h2o.estimators.gbm.H2OGradientBoostingEstimator()
model.train(x=predictors, y=target, training_frame=h2o_frame)

Notes:

  • If no server URL is provided, the function tries to read one from an environment variable

  • If no server is available, creates a local instance with a random name
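
The resolution order described in these notes can be sketched as a small decision function. This is an assumption-laden illustration, not the SDK's code: the environment variable name H2O_SERVER_URL, the function plan_h2o_startup, and the h2o_ name prefix are all hypothetical, and the sketch does not contact any server.

```python
import os
import uuid

def plan_h2o_startup(h2o_server_url=None, init_type=1, env=None):
    """Describe which server to try and whether a local fallback is allowed."""
    env = os.environ if env is None else env
    # Prefer the explicit URL, then the environment (hypothetical variable name)
    url = h2o_server_url or env.get("H2O_SERVER_URL")
    # init_type 0: connect only; init_type 1: start a local instance if connecting fails
    allow_local = init_type == 1
    return {
        "connect_to": url,
        "allow_local_fallback": allow_local,
        # Random name for the local instance, only relevant when fallback is allowed
        "local_name": f"h2o_{uuid.uuid4().hex[:8]}" if allow_local else None,
    }

plan = plan_h2o_startup(env={"H2O_SERVER_URL": "http://h2o.internal:54321"})
print(plan["connect_to"], plan["allow_local_fallback"])
```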

getChildDir

@staticmethod
def getChildDir(context):

Purpose: Returns the transform directory for saving temporary files.

Parameters:

  • context: The context dictionary

Returns: Path to the child directory

Example:

import os

# Get the transform directory
temp_dir = Helpers.getChildDir(context)

# Use it to save temporary files
temp_file_path = os.path.join(temp_dir, "temp_data.csv")
df.to_csv(temp_file_path, index=False)

getTenantId

@staticmethod
def getTenantId(context):

Purpose: Retrieves the tenant ID from the context.

Parameters:

  • context: The context dictionary

Returns: Tenant ID string

Example:

# Get the tenant ID
tenant_id = Helpers.getTenantId(context)
print(f"Current tenant: {tenant_id}")

Notes:

  • Checks multiple keys for backward compatibility

  • Returns "test-tenant" as fallback if not found
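
The lookup-with-fallback behavior noted above can be pictured as follows. The key names tenant_id and tenantId are hypothetical placeholders (the actual keys checked by the SDK are not listed in this guide); only the "test-tenant" fallback comes from the notes.

```python
def resolve_tenant_id(context, keys=("tenant_id", "tenantId"), fallback="test-tenant"):
    """Return the first non-empty value found under any of the candidate keys."""
    for key in keys:
        value = context.get(key)
        if value:
            return value
    return fallback

print(resolve_tenant_id({"tenantId": "acme"}))  # older-style key still resolves
print(resolve_tenant_id({}))                    # nothing set, fallback is used
```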

get_file_data

@staticmethod
def get_file_data(context, file_name):

Purpose: Gets the content of a file by name.

Parameters:

  • context: The context dictionary

  • file_name: Name of the file to read

Returns: File content as string or None if file doesn't exist

Raises: Exception if the file is not found

Example:

import json

# Get the content of a configuration file
try:
    config_content = Helpers.get_file_data(context, "config.json")
    config = json.loads(config_content)
    print(f"Loaded configuration with {len(config)} settings")
except Exception as e:
    print(f"Error loading file: {e}")

How it works:

  1. Checks if files_data exists in the context

  2. Searches for the file with the matching name

  3. If the file path doesn't exist locally, downloads it

  4. Reads and returns the file contents

  5. Raises an exception if the file is not found, listing available files
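
The steps above can be sketched with a stand-in function. Everything structural here is an assumption made for illustration: files_data is modeled as a list of dictionaries with name and path entries, the download step is a caller-supplied callable, and returning None when the context has no files_data is one possible reading of the Returns description.

```python
import os
import tempfile

def read_file_by_name(context, file_name, download=None):
    """Illustrative stand-in for the lookup steps; not the SDK implementation."""
    files_data = context.get("files_data")            # step 1
    if not files_data:
        return None
    for entry in files_data:                          # step 2
        if entry["name"] == file_name:
            path = entry["path"]
            if not os.path.exists(path) and download is not None:
                download(entry)                       # step 3
            with open(path, "r", encoding="utf-8") as f:
                return f.read()                       # step 4
    available = [entry["name"] for entry in files_data]
    raise Exception(f"File '{file_name}' not found. Available files: {available}")  # step 5

# Demo with a temporary file standing in for an uploaded project file
demo_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write('{"threshold": 0.5}')
demo_context = {"files_data": [{"name": "config.json", "path": demo_path}]}
print(read_file_by_name(demo_context, "config.json"))
```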

generate_warning

@staticmethod
def generate_warning(context, warning):

Purpose: Adds a warning message to the context.

Parameters:

  • context: The context dictionary

  • warning: Warning message to add

Example:

# Generate a warning about missing data
Helpers.generate_warning(
    context, 
    "10% of records have missing values in the 'age' column"
)

Notes:

  • Warnings are automatically included in the output when save() is called
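
Warning messages are often derived from a data-quality check. The sketch below builds such a message from a list of records using only the standard library; the helper missing_value_warning and its 5% threshold are hypothetical, and the resulting string would then be passed to Helpers.generate_warning.

```python
def missing_value_warning(records, column, threshold=0.05):
    """Return a warning string if the share of missing values exceeds threshold, else None."""
    total = len(records)
    if total == 0:
        return None
    missing = sum(1 for record in records if record.get(column) is None)
    share = missing / total
    if share > threshold:
        return f"{share:.0%} of records have missing values in the '{column}' column"
    return None

# Ten records, one of which has no age
records = [{"age": 30 + i} for i in range(9)] + [{"age": None}]
message = missing_value_warning(records, "age")
print(message)
# In a recipe: if message: Helpers.generate_warning(context, message)
```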

save

@staticmethod
def save(context):

Purpose: Saves all outputs and artifacts in the context to disk.

Parameters:

  • context: The context dictionary

Raises: Exception if no outputs are in the template output collection

Example:

# At the end of your notebook, save all outputs
Helpers.save(context)

How it works:

  1. Checks if there are outputs in the template output collection

  2. Uploads artifacts only if should_materialize is absent from the context or set to True

  3. Validates that all artifacts have corresponding outputs

  4. Processes special output types (like ML models)

  5. Persists all outputs to disk

  6. Writes a configuration map with output file names and warnings
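
The flow above can be pictured with a simplified stand-in. This is a sketch under stated assumptions, not the SDK's implementation: the context keys outputs, artifacts, and warnings, the name-matching rule used for step 3, the config.json file name, and the upload_artifacts callable are all hypothetical, and step 4 (special handling for ML models) is omitted.

```python
import json
import os
import tempfile

def save_sketch(context, out_dir, upload_artifacts):
    """Illustrative stand-in for the save flow."""
    outputs = context.get("outputs", [])
    artifacts = context.get("artifacts", [])
    # Step 1: at least one output is required
    if not outputs:
        raise Exception("No outputs in the template output collection")
    # Step 2: upload only if should_materialize is absent or True
    if context.get("should_materialize", True) is True:
        upload_artifacts(artifacts)
    # Step 3: every artifact needs a corresponding output (assumed: matching names)
    output_names = {output["name"] for output in outputs}
    orphans = [name for name in artifacts if name not in output_names]
    if orphans:
        raise Exception(f"Artifacts without outputs: {orphans}")
    # Step 5: persist each output to disk
    file_names = []
    for output in outputs:
        file_name = output["name"] + ".json"
        with open(os.path.join(out_dir, file_name), "w", encoding="utf-8") as f:
            json.dump(output, f)
        file_names.append(file_name)
    # Step 6: write a configuration map with file names and warnings
    config = {"outputs": file_names, "warnings": context.get("warnings", [])}
    with open(os.path.join(out_dir, "config.json"), "w", encoding="utf-8") as f:
        json.dump(config, f)
    return config

uploaded = []
demo_context = {
    "outputs": [{"name": "processed_data", "rows": 3}],
    "artifacts": [],
    "warnings": ["10% of records have missing values in the 'age' column"],
    "should_materialize": False,  # as in a test environment: skip uploading
}
demo_dir = tempfile.mkdtemp()
demo_config = save_sketch(demo_context, demo_dir, uploaded.append)
print(demo_config)
```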

Notes:

  • This should be called at the end of your notebook

  • Handles uploading artifacts, packaging ML models, and saving all outputs

  • You must have at least one output in the template output collection

  • Uses the should_materialize flag to conditionally skip artifact uploading in test environments

