Helper Functions

This page lists all helper functions supported within RapidCanvas.

Import Helper Functions

from utils.notebookhelpers.helpers import Helpers

Context Management

getOrCreateContext

@staticmethod
def getOrCreateContext(contextId, localVars, entities=None):

Purpose: Creates or retrieves a context object which is essential for most helper functions. This is typically the first function called in a notebook.

Parameters:

contextId: String identifier for the context (used for local testing)
localVars: Local variables in the current environment (pass locals())
entities: Optional dictionary of local datasets with entity name as key and file path as value

Returns: A context dictionary that will be used by other helper functions

Example:

# Initialize context at the beginning of your notebook
context = Helpers.getOrCreateContext(
    contextId="my_test_context", 
    localVars=locals(),
    entities={
        "customers": "/path/to/customers.csv",
        "orders": "/path/to/orders.csv"
    }
)

How it works:

Checks if internalContext already exists in local variables
For production environments, returns an empty context
For local execution, it either:
- Creates a new context with entity paths if provided
- Retrieves previously prepared context data
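
The steps above can be sketched in plain Python. This is an illustrative stand-in, not the RapidCanvas implementation: the function name get_or_create_context_sketch, the is_production flag, the _PREPARED_CONTEXTS cache, and the context_id and entity_paths keys are all assumptions made for the example.

```python
# Illustrative stand-in only; names and context layout are assumptions.
_PREPARED_CONTEXTS = {}  # hypothetical cache of contexts prepared for local runs


def get_or_create_context_sketch(context_id, local_vars, entities=None,
                                 is_production=False):
    # Step 1: reuse a context that is already present in the local variables.
    if "internalContext" in local_vars:
        return local_vars["internalContext"]

    # Step 2: in production, start from an empty context.
    if is_production:
        return {}

    # Step 3a: local run with entity paths, so build and remember a new context.
    if entities is not None:
        context = {"context_id": context_id, "entity_paths": dict(entities)}
        _PREPARED_CONTEXTS[context_id] = context
        return context

    # Step 3b: local run without entities, so fall back to prepared data.
    return _PREPARED_CONTEXTS.get(
        context_id, {"context_id": context_id, "entity_paths": {}}
    )


ctx = get_or_create_context_sketch(
    "my_test_context", {}, entities={"customers": "/path/to/customers.csv"}
)
# A second call with the same ID and no entities returns the prepared context.
same_ctx = get_or_create_context_sketch("my_test_context", {})
```

The sketch only mirrors the order of the checks; how the real function detects a production environment or stores prepared contexts is not described on this page.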

Notes:

This function must be called at the beginning of your notebook
The context object is used by most other helper functions
For local testing, you can specify local file paths for entities
The context now includes additional fields such as:
- vector_stores: List of vector stores for embedding-based retrieval
- files_data: List of file paths and metadata
- global_vars: Dictionary of global variables

Parameter and Variable Management

addParam

@staticmethod
def addParam(context, paramKey, paramVal):

Purpose: Adds a parameter to the context dictionary.

Parameters:

context: The context dictionary
paramKey: Key name for the parameter
paramVal: Value of the parameter

Example:

# Store a configuration value in the context
Helpers.addParam(context, "max_iterations", 100)

getParam

@staticmethod
def getParam(context, param):

Purpose: Retrieves a parameter value from the context by its key.

Parameters:

context: The context dictionary
param: Key name of the parameter to retrieve

Returns: The value of the parameter, or None if the parameter doesn't exist

Example:

# Get a configuration value from the context
max_iterations = Helpers.getParam(context, "max_iterations")
if max_iterations is None:
    max_iterations = 50  # Default value if not found
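
Because getParam returns None for a missing key, an explicit "is None" check keeps legitimate falsy values such as 0 or an empty string, which a shorthand like "value or default" would overwrite. A minimal sketch of that pattern follows; get_param_or_default and fake_get_param are hypothetical names, and the "params" key is an assumption about the context layout, used only so the example runs without RapidCanvas.

```python
def get_param_or_default(get_param, context, key, default):
    """Return the stored value, using `default` only when the lookup gives None."""
    value = get_param(context, key)
    return default if value is None else value


# Stand-in for Helpers.getParam so the example runs on its own;
# the "params" key is an assumption about the context layout.
def fake_get_param(context, key):
    return context.get("params", {}).get(key)


demo_context = {"params": {"max_iterations": 0}}
stored = get_param_or_default(fake_get_param, demo_context, "max_iterations", 50)
missing = get_param_or_default(fake_get_param, demo_context, "batch_size", 32)
```

In a notebook, Helpers.getParam would be passed in place of fake_get_param.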

getAllParams

@staticmethod
def getAllParams(context):

Purpose: Gets a list of all parameter keys available in the context.

Parameters:

context: The context dictionary

Returns: List of all parameter keys

Example:

# List all available parameters in the context
all_params = Helpers.getAllParams(context)
print("Available parameters:", all_params)

get_global_var

@staticmethod
def get_global_var(context, key):

Purpose: Retrieves a global variable from the context by key.

Parameters:

context: The context dictionary
key: Key name of the global variable to retrieve

Returns: The value of the global variable, or None if not found

Example:

# Get a global variable from the context
environment = Helpers.get_global_var(context, "environment")
if environment == "production":
    # Apply production-specific logic
    pass

get_secret

@staticmethod
def get_secret(context: dict, key: str):

Purpose: Securely retrieves and decrypts secret values from the context.

Parameters:

context: The context dictionary
key: Key name of the secret to retrieve

Returns: The decrypted secret value

Raises: Exception if the secret is not found

Example:

# Get a secret API key
try:
    api_key = Helpers.get_secret(context, "API_KEY")
    # Use the API key for authentication
except Exception as e:
    print(f"Error retrieving secret: {e}")

How it works:

Checks if the secret exists in the context
If it's a string, decodes the base64 encoding
Decrypts the value using AES encryption with environment keys
Removes control characters from the result
If the secret is not found in the context, tries to read it from environment variables
Raises an exception if the secret is not found anywhere
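
The lookup order above can be sketched as follows. This is not the library's code: the function name get_secret_sketch and the "secrets" key are assumptions, and the AES step is represented by a caller-supplied decrypt callable so that the example runs with the standard library alone.

```python
import base64
import os
import unicodedata


def get_secret_sketch(context, key, decrypt):
    """Illustrative flow only; `decrypt` is a caller-supplied stand-in for AES."""
    value = context.get("secrets", {}).get(key)  # "secrets" key is an assumption
    if value is not None:
        if isinstance(value, str):
            value = base64.b64decode(value)  # undo the base64 encoding
        plaintext = decrypt(value).decode("utf-8")
        # Remove control characters (Unicode category Cc), e.g. cipher padding.
        return "".join(ch for ch in plaintext if unicodedata.category(ch) != "Cc")
    if key in os.environ:  # fall back to environment variables
        return os.environ[key]
    raise Exception(f"Secret '{key}' not found")


# Demo with an identity "cipher" so the example needs no crypto library.
encoded = base64.b64encode(b"abc123\x05\x05").decode("ascii")
demo_context = {"secrets": {"API_KEY": encoded}}
api_key = get_secret_sketch(demo_context, "API_KEY", decrypt=lambda raw: raw)
```

The trailing \x05 bytes in the demo value imitate block-cipher padding, which is one reason control characters can appear after decryption.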

Notes:

This function handles decryption of sensitive information
Secrets are stored encrypted in the context for security

Entity Data Management

getAllEntities

@staticmethod
def getAllEntities(context):

Purpose: Returns a list of all entity names available in the context.

Parameters:

context: The context dictionary

Returns: List of entity names

Example:

# Get all available entities
entities = Helpers.getAllEntities(context)
print("Available entities:", entities)

getEntityData

@staticmethod
def getEntityData(context, entityName, inferDTypesFromSchema=False, numRows=None, pandas_lib=None):

Purpose: Loads an entity (dataset) as a pandas DataFrame.

Parameters:

context: The context dictionary
entityName: Name of the entity to load
inferDTypesFromSchema: If True, uses the entity schema to set column data types
numRows: Optional limit on number of rows to read
pandas_lib: Optional pandas library to use (defaults to standard pandas)

Returns: DataFrame containing the entity data

Raises: Exception if the entity is not found

Example:

# Load customer data
try:
    customers_df = Helpers.getEntityData(context, "customers")
    
    # Load just the first 100 rows
    sample_df = Helpers.getEntityData(context, "customers", numRows=100)
    
    # Process the data
    customers_df['full_name'] = customers_df['first_name'] + ' ' + customers_df['last_name']
except Exception as e:
    print(f"Error loading entity: {e}")

How it works:

Looks up the entity in the context's entity paths
Gets the schema if needed
Downloads the entity file if necessary
Reads the data into a DataFrame
Raises an informative exception if the entity doesn't exist
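
A rough sketch of the lookup, row limit, and error handling described above. It uses the standard-library csv module in place of pandas so it runs without dependencies, and it skips the schema and download steps; get_entity_rows_sketch and the "entity_paths" key are assumed names, not part of the Helpers API.

```python
import csv
import itertools
import os
import tempfile


def get_entity_rows_sketch(context, entity_name, num_rows=None):
    """Illustrative lookup-and-read flow; returns a list of dicts, not a DataFrame."""
    entity_paths = context.get("entity_paths", {})  # key name is an assumption
    if entity_name not in entity_paths:
        available = ", ".join(sorted(entity_paths)) or "none"
        raise Exception(
            f"Entity '{entity_name}' not found. Available entities: {available}"
        )
    with open(entity_paths[entity_name], newline="") as f:
        # islice with a stop of None reads every row; otherwise it stops early.
        return list(itertools.islice(csv.DictReader(f), num_rows))


# Build a tiny CSV so the example is runnable end to end.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "customers.csv")
with open(csv_path, "w", newline="") as f:
    f.write("first_name,last_name\nAda,Lovelace\nAlan,Turing\nGrace,Hopper\n")

demo_context = {"entity_paths": {"customers": csv_path}}
sample = get_entity_rows_sketch(demo_context, "customers", num_rows=2)
```

Listing the available entity names in the error message is one way to make the exception informative; the wording of the real message is not documented here.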

load_all_entities

@staticmethod
def load_all_entities(context):

Purpose: Loads all entities into a dictionary of DataFrames.

Parameters:

context: The context dictionary

Returns: Dictionary with entity names as keys and DataFrames as values

Example:

# Load all available entities as DataFrames
entities_dict = Helpers.load_all_entities(context)

# Access individual DataFrames
customers_df = entities_dict['customers']
orders_df = entities_dict['orders']

# Process multiple datasets
for entity_name, df in entities_dict.items():
    print(f"Entity: {entity_name}, Rows: {len(df)}")

getEntityFilePath

@staticmethod
def getEntityFilePath(context, entityName):

Purpose: Gets the file path for a specific entity.

Parameters:

context: The context dictionary
entityName: Name of the entity

Returns: File path string, or None if entity doesn't exist

Example:

# Get the path to the customers entity file
entity_path = Helpers.getEntityFilePath(context, "customers")
if entity_path:
    print(f"Entity file is located at: {entity_path}")
else:
    print("Entity not found")

get_data_from_source

@staticmethod
def get_data_from_source(source_type: DataSourceType, source: str, name: str = None, **options):

Purpose: Retrieves data from various data sources.

Parameters:

source_type: Type of data source (enum from DataSourceType)
source: Path or identifier for the data source
name: Optional name for the data source
**options: Additional options for the data source

Returns: Data from the source (typically a DataFrame)

Example:

from utils.rcclient.enums import DataSourceType

# Read data from a CSV file
csv_data = Helpers.get_data_from_source(
    source_type=DataSourceType.CSV,
    source="path/to/file.csv"
)

# Read data from a database
db_data = Helpers.get_data_from_source(
    source_type=DataSourceType.SQL,
    source="SELECT * FROM customers",
    connection_string="postgresql://user:pass@localhost/dbname"
)

write_data_to_source

@staticmethod
def write_data_to_source(df, source_type: DataSourceType, target: str, name: str = None, **options):

Purpose: Writes data to various data destinations.

Parameters:

df: DataFrame to write
source_type: Type of data source (enum from DataSourceType)
target: Path or identifier for the data destination
name: Optional name for the data source
**options: Additional options for the data source

Example:

from utils.rcclient.enums import DataSourceType
import pandas as pd

# Create a simple DataFrame
df = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})

# Write to a CSV file
Helpers.write_data_to_source(
    df=df,
    source_type=DataSourceType.CSV,
    target="path/to/output.csv"
)

# Write to a database table
Helpers.write_data_to_source(
    df=df,
    source_type=DataSourceType.SQL,
    target="my_table",
    connection_string="postgresql://user:pass@localhost/dbname",
    if_exists="replace"  # Options: fail, replace, append
)

Artifact Management

getOrCreateArtifactsDir

@staticmethod
def getOrCreateArtifactsDir(context, artifactsId, purgeOld=False):

Purpose: Creates or retrieves a directory for storing artifacts (models, files, etc.)

Parameters:

context: The context dictionary
artifactsId: Identifier for the artifacts collection (auto-generated if None)
purgeOld: If True, creates a fresh directory even if one already exists

Returns: Path to the artifacts directory

Example:

import os
import pickle

# Create an artifacts directory
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "my_model_files")

# Save a model file to the artifacts directory
# (model is assumed to be an object created earlier in the notebook)
model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

print(f"Artifacts directory: {artifacts_dir}")
print("Use artifacts ID 'my_model_files' to download these artifacts later")

How it works:

If no artifactsId is provided, generates a random UUID
Adds the artifactsId to the context for tracking
Constructs the local path for the artifacts directory
Either creates a fresh directory or downloads existing artifacts
Returns the path to the artifacts directory
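
The local part of these steps can be sketched with the standard library. The download of existing artifacts is omitted, and the function name, the base_dir parameter, and the "artifacts_ids" key are assumptions made for illustration rather than details of the real implementation.

```python
import os
import shutil
import tempfile
import uuid


def get_or_create_artifacts_dir_sketch(context, artifacts_id=None, purge_old=False,
                                       base_dir=None):
    """Illustrative flow; the remote download step is omitted."""
    if artifacts_id is None:
        artifacts_id = str(uuid.uuid4())  # random ID when none is given
    # Track the ID in the context ("artifacts_ids" is an assumed key name).
    ids = context.setdefault("artifacts_ids", [])
    if artifacts_id not in ids:
        ids.append(artifacts_id)
    base_dir = base_dir or os.path.join(tempfile.gettempdir(), "artifacts_sketch")
    path = os.path.join(base_dir, artifacts_id)
    if purge_old and os.path.isdir(path):
        shutil.rmtree(path)  # discard previous contents
    os.makedirs(path, exist_ok=True)
    return path


demo_context = {}
demo_base = tempfile.mkdtemp()
first = get_or_create_artifacts_dir_sketch(demo_context, "my_model_files",
                                           base_dir=demo_base)
with open(os.path.join(first, "model.pkl"), "wb") as f:
    f.write(b"placeholder")
# Same ID with purge_old=True: same path, but the directory is emptied.
second = get_or_create_artifacts_dir_sketch(demo_context, "my_model_files",
                                            purge_old=True, base_dir=demo_base)
```

Reusing the same ID returns the same path, which is what lets later sessions find the same files.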

Notes:

Artifacts are automatically uploaded when you call Helpers.save(context)
Use the same artifactsId to access the same artifacts in different sessions

downloadArtifacts

@staticmethod
def downloadArtifacts(context, artifactsId):

Purpose: Downloads artifacts from storage to the local filesystem.

Parameters:

context: The context dictionary
artifactsId: ID of the artifacts to download

Returns: Dictionary of file names and their paths

Example:

import pickle

# Download model artifacts by ID
files_dict = Helpers.downloadArtifacts(context, "my_model_files")

# Access specific files
model_path = files_dict.get("model.pkl")
if model_path:
    with open(model_path, "rb") as f:
        model = pickle.load(f)

list_artifact_files

@staticmethod
def list_artifact_files(context, artifacts_id):

Purpose: Lists files in an artifact without downloading them.

Parameters:

context: The context dictionary
artifacts_id: ID of the artifacts to list

Returns: List of file names in the artifact

Example:

# See what files are available in an artifact
file_list = Helpers.list_artifact_files(context, "my_model_files")
print("Available files:", file_list)

# Check if a specific file exists
if "model.pkl" in file_list:
    # Download only that file
    model_path = Helpers.download_artifact_file(context, "my_model_files", "model.pkl")

download_artifact_file

@staticmethod
def download_artifact_file(context, artifacts_id, file_name):

Purpose: Downloads a single file from an artifact.

Parameters:

context: The context dictionary
artifacts_id: ID of the artifact containing the file
file_name: Name of the file to download

Returns: Path to the downloaded file

Example:

import pickle

# Download just the model file from an artifact
model_path = Helpers.download_artifact_file(context, "my_model_files", "model.pkl")

# Load the model
with open(model_path, "rb") as f:
    model = pickle.load(f)

get_artifact

@staticmethod
def get_artifact(context, artifact_id, artifact_name):

Purpose: Gets the path to a specific artifact file, downloading it if necessary.

Parameters:

context: The context dictionary
artifact_id: ID of the artifact
artifact_name: Name of the specific file in the artifact

Returns: Path to the artifact file

Example:

import json

# Get a specific file from an artifact
config_path = Helpers.get_artifact(context, "project_files", "config.json")

# Read the config file
with open(config_path, "r") as f:
    config = json.load(f)

uploadArtifacts

@staticmethod
def uploadArtifacts(context):

Purpose: Uploads all artifacts in the context to remote storage.

Parameters:

context: The context dictionary

Example:

# Upload all artifacts registered in the context
Helpers.uploadArtifacts(context)

Notes:

This is automatically called by Helpers.save(context)
Only uploads artifacts with IDs listed in the context

Output Management

createOutputCollection

@staticmethod
def createOutputCollection(context):

Purpose: Creates or retrieves a template output collection for storing outputs.

Parameters:

context: The context dictionary

Returns: TemplateOutputCollection object

Example:

# Create or get the output collection
output_collection = Helpers.createOutputCollection(context)

Notes:

Most output creation functions add to this collection automatically
This is called internally by other output functions

getOutputCollection

@staticmethod
def getOutputCollection(context):

Purpose: Gets the existing template output collection from the context.

Parameters:

context: The context dictionary

Returns: TemplateOutputCollection object or None if not found

Example:

# Get the output collection
output_collection = Helpers.getOutputCollection(context)
if output_collection:
    # Check how many outputs are in the collection
    print(f"Number of outputs: {len(output_collection.templateOutputs)}")

createTemplateOutput

@staticmethod
def createTemplateOutput(context, outputName: str, outputType: OutputType, data=None, 
                         dataType: FileType = FileType.CSV, outputFileName: str = None,
                         custom_params: dict = {}, metadata: dict = {}, 
                         description: str = None, group: str = None):

Purpose: Creates a template output of any type.

Parameters:

context: The context dictionary
outputName: Name of the output
outputType: Type of output from OutputType enum
data: Data for the output (typically DataFrame)
dataType: Type of data file from FileType enum (CSV, PARQUET, etc.)
outputFileName: Optional custom filename
custom_params: Additional parameters for the output
metadata: Metadata for the output
description: Optional description
group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

from utils.dtos.templateOutput import OutputType, FileType
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})

# Create a CSV output
output = Helpers.createTemplateOutput(
    context=context,
    outputName="my_output",
    outputType=OutputType.ENTITY,
    data=df,
    dataType=FileType.CSV,
    description="Sample data output",
    group="Data Outputs"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(output)

Notes:

This is a general-purpose output creation function
Specialized versions exist for specific output types
Automatically handles file generation and storage

createTemplateOutputDataset

@staticmethod
def createTemplateOutputDataset(context, outputName, dataFrame):

Purpose: Creates a dataset (entity) output from a DataFrame.

Parameters:

context: The context dictionary
outputName: Name of the dataset
dataFrame: Pandas DataFrame with the data

Returns: TemplateOutput object

Example:

import pandas as pd

# Create a processed dataset
processed_df = pd.DataFrame({'id': [1, 2, 3], 'processed_value': [10, 20, 30]})

# Create a dataset output
dataset_output = Helpers.createTemplateOutputDataset(
    context=context,
    outputName="processed_data",
    dataFrame=processed_df
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(dataset_output)

Notes:

This is a convenience wrapper around createTemplateOutput
Automatically sets outputType to ENTITY and dataType to PARQUET

create_template_output_file

@staticmethod
def create_template_output_file(context, output_name, file_contents, file_type: FileType = FileType.TEXT):

Purpose: Creates a file output with the specified content.

Parameters:

context: The context dictionary
output_name: Name of the output
file_contents: String content of the file
file_type: Type of file from FileType enum (TEXT, JSON, MARKDOWN, HTML). Defaults to TEXT if not specified.

Returns: TemplateOutput object

Example:

import json

from utils.dtos.templateOutput import FileType

# Create a JSON configuration file
config = {
    "model_params": {
        "n_estimators": 100,
        "max_depth": 5
    }
}
json_content = json.dumps(config, indent=2)

# Create a file output
file_output = Helpers.create_template_output_file(
    context=context,
    output_name="model_config",
    file_contents=json_content,
    file_type=FileType.JSON
)

# Create a text file output with default file type
text_output = Helpers.create_template_output_file(
    context=context,
    output_name="readme",
    file_contents="This is a simple text file."
    # file_type parameter omitted, defaults to FileType.TEXT
)

# Add the outputs to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(file_output)
output_collection.addOutput(text_output)

create_template_output_artifact

@staticmethod
def create_template_output_artifact(context, artifact_name):

Purpose: Creates an artifact output for storing files.

Parameters:

context: The context dictionary
artifact_name: Name of the artifact (should match an artifactsId)

Returns: TemplateOutput object

Example:

import os
import pickle

# First create an artifacts directory
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_files")

# Save model files to the directory
model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Create an artifact output
artifact_output = Helpers.create_template_output_artifact(
    context=context,
    artifact_name="model_files"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(artifact_output)

Notes:

The artifact_name should match the artifactsId used with getOrCreateArtifactsDir
This registers the artifact for proper handling when the context is saved

Chart Output Functions

create_template_output_chart

@staticmethod
def create_template_output_chart(context, title, metadata={}, description=None, group=None):

Purpose: Creates a base chart output without specifying the chart type or data.

Parameters:

context: The context dictionary
title: Title of the chart
metadata: Optional metadata for the chart
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

# Create a base chart output
chart_output = Helpers.create_template_output_chart(
    context=context,
    title="Data Distribution",
    description="Distribution of values across categories",
    group="Analysis Charts"
)

Notes:

This creates a basic chart output container
You'll typically use one of the more specific chart functions instead

createTemplateOutputEChart

@staticmethod
def createTemplateOutputEChart(context, chartTitle, dataFrame, chartType=ChartType.TABLE, 
                             params={}, description=None, group=None):

Purpose: Creates a chart output using ECharts visualization library.

Parameters:

context: The context dictionary
chartTitle: Title of the chart
dataFrame: Pandas DataFrame with the data to visualize
chartType: Type of chart from ChartType enum (default is TABLE)
params: Optional parameters for chart configuration
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

from utils.dtos.templateOutput import ChartType
import pandas as pd

# Create a DataFrame with sample data
data = {
    'category': ['A', 'B', 'C', 'D'],
    'value': [10, 25, 15, 30]
}
df = pd.DataFrame(data)

# Create a bar chart
bar_chart = Helpers.createTemplateOutputEChart(
    context=context,
    chartTitle="Category Distribution",
    dataFrame=df,
    chartType=ChartType.BAR,
    description="Distribution of values by category",
    group="Analysis Charts"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(bar_chart)

Notes:

Uses ECharts library for interactive visualizations
Common chart types include BAR, LINE, PIE, SCATTER, TABLE

createTemplateOutputPlotlibChart

@staticmethod
def createTemplateOutputPlotlibChart(context, chartTitle: str, plt, description=None, group=None):

Purpose: Creates a chart output from a matplotlib plot.

Parameters:

context: The context dictionary
chartTitle: Title of the chart
plt: Matplotlib pyplot object with the plot
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

import matplotlib.pyplot as plt
import numpy as np

# Create a matplotlib plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y)
plt.title("Sine Wave")
plt.xlabel("X")
plt.ylabel("sin(x)")
plt.grid(True)

# Create a chart output from the matplotlib plot
sine_chart = Helpers.createTemplateOutputPlotlibChart(
    context=context,
    chartTitle="Sine Wave Plot",
    plt=plt,
    description="A plot of the sine function",
    group="Mathematical Functions"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(sine_chart)

Notes:

Saves the matplotlib figure as a PNG file
The plot is automatically saved to the artifacts directory

createTemplateOutputPlotlyChart

@staticmethod
def createTemplateOutputPlotlyChart(context, chartTitle: str, plotly_fig, description=None, group=None):

Purpose: Creates a chart output from a Plotly figure.

Parameters:

context: The context dictionary
chartTitle: Title of the chart
plotly_fig: Plotly figure object
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

import plotly.express as px
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'x': range(10),
    'y': [i**2 for i in range(10)],
    'category': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
})

# Create a Plotly figure
fig = px.scatter(df, x='x', y='y', color='category', 
                title='Quadratic Function by Category')

# Create a chart output from the Plotly figure
scatter_chart = Helpers.createTemplateOutputPlotlyChart(
    context=context,
    chartTitle="Quadratic Scatter Plot",
    plotly_fig=fig,
    description="A scatter plot showing x² by category",
    group="Mathematical Functions"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(scatter_chart)

Notes:

Saves the Plotly figure as an HTML file with interactive features
Uses CDN for Plotly JavaScript libraries to keep file size small

createTemplateOutputPlotlyChartAsJson

@staticmethod
def createTemplateOutputPlotlyChartAsJson(context, chartTitle: str, plotly_fig, group=None):

Purpose: Creates a chart output from a Plotly figure, saving it as JSON.

Parameters:

context: The context dictionary
chartTitle: Title of the chart
plotly_fig: Plotly figure object
group: Optional group name for organizing outputs

Returns: TemplateOutput object

Example:

import plotly.express as px
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'sales': [200, 150, 300, 250, 400]
})

# Create a Plotly figure
fig = px.bar(df, x='month', y='sales', title='Monthly Sales')

# Create a chart output from the Plotly figure as JSON
json_chart = Helpers.createTemplateOutputPlotlyChartAsJson(
    context=context,
    chartTitle="Monthly Sales Chart",
    plotly_fig=fig,
    group="Business Analytics"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(json_chart)

Notes:

Saves the Plotly figure as a JSON file
This format is useful when you need to load and modify the chart later

Model and Metadata Functions

create_template_output_rc_ml_model

@staticmethod
def create_template_output_rc_ml_model(context, model_name, model_obj, artifacts, version="default"):

Purpose: Creates an output for a machine learning model.

Parameters:

context: The context dictionary
model_name: Name of the model
model_obj: Model object (must extend RCMLModel class)
artifacts: Dictionary of model artifacts
version: Optional version string for the model

Returns: TemplateOutput object

Raises: Exception if model_obj doesn't extend RCMLModel

Example:

from utils.dtos.rc_ml_model import RCMLModel

# Define a custom model class
class MyClassifier(RCMLModel):
    def __init__(self):
        super().__init__()
        self.model = None
    
    def fit(self, X, y):
        # Train the model
        from sklearn.ensemble import RandomForestClassifier
        self.model = RandomForestClassifier()
        self.model.fit(X, y)
    
    def predict(self, X):
        # Make predictions
        return self.model.predict(X)

# Create and train the model
# (X_train and y_train are assumed to be defined earlier in the notebook)
model = MyClassifier()
model.fit(X_train, y_train)

# Save model artifacts
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_artifacts")
artifacts = {
    "model_info": "Random Forest Classifier with default parameters"
}

# Create model output
model_output = Helpers.create_template_output_rc_ml_model(
    context=context,
    model_name="customer_classifier",
    model_obj=MyClassifier,
    artifacts=artifacts,  # the dictionary defined above
    version="v1.0"
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(model_output)

Notes:

The model class must extend RCMLModel
Artifacts should contain any files needed for the model to function

get_rc_ml_model

@staticmethod
def get_rc_ml_model(context, model_name, model_version="default"):

Purpose: Loads a previously saved ML model.

Parameters:

context: The context dictionary
model_name: Name of the model to load
model_version: Version of the model (default is "default")

Returns: Loaded model object

Example:

# Load a saved model
model = Helpers.get_rc_ml_model(
    context=context,
    model_name="customer_classifier",
    model_version="v1.0"
)

# Use the model to make predictions
predictions = model.predict(X_test)

create_template_output_metadata

@staticmethod
def create_template_output_metadata(context, metadata_list):

Purpose: Creates an output for metadata information.

Parameters:

context: The context dictionary
metadata_list: List of metadata objects

Returns: TemplateOutput object

Example:

from utils.dtos.metadata import Metadata, MetadataSubjectType

# Create metadata objects
metadata1 = Metadata(
    subject_type=MetadataSubjectType.DATASET,
    subject_name="customers",
    key="row_count",
    value="5000"
)

metadata2 = Metadata(
    subject_type=MetadataSubjectType.MODEL,
    subject_name="customer_classifier",
    key="accuracy",
    value="0.92"
)

# Create metadata output
metadata_output = Helpers.create_template_output_metadata(
    context=context,
    metadata_list=[metadata1, metadata2]
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(metadata_output)

get_metadata_value

@staticmethod
def get_metadata_value(context, subject_type: MetadataSubjectType, subject_name: str, key: str):

Purpose: Retrieves a specific metadata value from the context.

Parameters:

context: The context dictionary
subject_type: Type of subject from MetadataSubjectType enum
subject_name: Name of the subject
key: Metadata key to retrieve

Returns: The metadata value, or None if not found

Example:

from utils.dtos.metadata import MetadataSubjectType

# Get metadata value
accuracy = Helpers.get_metadata_value(
    context=context,
    subject_type=MetadataSubjectType.MODEL,
    subject_name="customer_classifier",
    key="accuracy"
)

print(f"Model accuracy: {accuracy}")

get_all_metadata

@staticmethod
def get_all_metadata(context):

Purpose: Retrieves all metadata from the context.

Parameters:

context: The context dictionary

Returns: Dictionary of all metadata

Example:

# Get all metadata
all_metadata = Helpers.get_all_metadata(context)

# Print metadata information
for metadata in all_metadata:
    print(f"Subject: {metadata['subject_name']}, Key: {metadata['key']}, Value: {metadata['value']}")

create_template_output_answer

@staticmethod
def create_template_output_answer(context, answer):

Purpose: Creates an output containing an answer or result.

Parameters:

context: The context dictionary
answer: The answer text or object

Returns: TemplateOutput object

Example:

# Analyze data and create an answer
analysis_result = "Based on the data analysis, customer segment A shows the highest retention rate at 87%."

# Create answer output
answer_output = Helpers.create_template_output_answer(
    context=context,
    answer=analysis_result
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(answer_output)

create_template_output_vector_store

@staticmethod
def create_template_output_vector_store(context, vector_store_obj: VectorStoreBase):

Purpose: Creates an output for a vector store (used for embeddings and similarity search).

Parameters:

context: The context dictionary
vector_store_obj: Vector store object (must extend VectorStoreBase)

Returns: TemplateOutput object

Example:

from utils.libutils.vectorStores.faiss import FaissVectorStore
import numpy as np

# Create a vector store
vector_store = FaissVectorStore(name="document_embeddings")

# Add documents with their embeddings
documents = ["This is document 1", "This is document 2", "This is document 3"]
embeddings = np.random.rand(3, 128)  # Simulated embeddings
for i, (doc, embedding) in enumerate(zip(documents, embeddings)):
    vector_store.add_item(i, doc, embedding)

# Create vector store output
vs_output = Helpers.create_template_output_vector_store(
    context=context,
    vector_store_obj=vector_store
)

# Add it to the collection
output_collection = Helpers.getOutputCollection(context)
output_collection.addOutput(vs_output)

Convenience Save Functions

These functions combine creating an output and saving it to the context in a single step.

save_output_dataset

@staticmethod
def save_output_dataset(context, output_name, data_frame) -> TemplateOutput:

Purpose: Creates a dataset output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
output_name: Name of the dataset
data_frame: Pandas DataFrame with the data

Returns: The created TemplateOutput object

Example:

import pandas as pd

# Create a processed dataset
processed_df = pd.DataFrame({'id': [1, 2, 3], 'processed_value': [10, 20, 30]})

# Create, add to collection, and save the dataset output in one step
Helpers.save_output_dataset(context, "processed_data", processed_df)

Notes:

Automatically adds the output to the template output collection
Calls Helpers.save(context) for you
Returns the TemplateOutput object for reference if needed

save_output_plotly_chart_as_json

@staticmethod
def save_output_plotly_chart_as_json(context, chart_title, plotly_fig, group=None) -> TemplateOutput:

Purpose: Creates a Plotly chart output as JSON, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
chart_title: Title of the chart
plotly_fig: Plotly figure object
group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

import plotly.express as px
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'sales': [200, 150, 300, 250, 400]
})

# Create a Plotly figure
fig = px.bar(df, x='month', y='sales', title='Monthly Sales')

# Create, add to collection, and save the chart output in one step
Helpers.save_output_plotly_chart_as_json(context, "Monthly Sales Chart", fig, group="Business Analytics")

save_output_plot_lib_chart

@staticmethod
def save_output_plot_lib_chart(context, chart_title, plt, description=None, group=None) -> TemplateOutput:

Purpose: Creates a matplotlib chart output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
chart_title: Title of the chart
plt: Matplotlib pyplot object with the plot
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

import matplotlib.pyplot as plt
import numpy as np

# Create a matplotlib plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y)
plt.title("Sine Wave")
plt.xlabel("X")
plt.ylabel("sin(x)")
plt.grid(True)

# Create, add to collection, and save the chart output in one step
Helpers.save_output_plot_lib_chart(context, "Sine Wave Plot", plt, 
                                   description="A plot of the sine function", 
                                   group="Mathematical Functions")

save_output_echart

@staticmethod
def save_output_echart(context, chart_title, data_frame, chart_type: ChartType = ChartType.TABLE, 
                      params=None, description=None, group=None) -> TemplateOutput:

Purpose: Creates an ECharts visualization, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
chart_title: Title of the chart
data_frame: Pandas DataFrame with the data to visualize
chart_type: Type of chart from ChartType enum (default is TABLE)
params: Optional parameters for chart configuration
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

from utils.dtos.templateOutput import ChartType
import pandas as pd

# Create a DataFrame with sample data
data = {
    'category': ['A', 'B', 'C', 'D'],
    'value': [10, 25, 15, 30]
}
df = pd.DataFrame(data)

# Create, add to collection, and save the chart output in one step
Helpers.save_output_echart(context, "Category Distribution", df, ChartType.BAR,
                          description="Distribution of values by category", 
                          group="Analysis Charts")

save_output_chart

@staticmethod
def save_output_chart(context, title, metadata=None, description=None, group=None) -> TemplateOutput:

Purpose: Creates a base chart output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
title: Title of the chart
metadata: Optional metadata for the chart
description: Optional description of the chart
group: Optional group name for organizing outputs

Returns: The created TemplateOutput object

Example:

# Create, add to collection, and save a base chart output in one step
Helpers.save_output_chart(context, "Data Distribution",
                         description="Distribution of values across categories",
                         group="Analysis Charts")

save_output_rc_ml_model

@staticmethod
def save_output_rc_ml_model(context, model_name, model_obj, artifacts, version="default") -> TemplateOutput:

Purpose: Creates a machine learning model output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
model_name: Name of the model
model_obj: Model object (must extend RCMLModel class)
artifacts: Dictionary of model artifacts
version: Optional version string for the model

Returns: The created TemplateOutput object

Example:

from utils.dtos.rc_ml_model import RCMLModel

# Define a custom model class
class MyClassifier(RCMLModel):
    def __init__(self):
        super().__init__()
        self.model = None
    
    def fit(self, X, y):
        # Train the model
        from sklearn.ensemble import RandomForestClassifier
        self.model = RandomForestClassifier()
        self.model.fit(X, y)
    
    def predict(self, X):
        # Make predictions
        return self.model.predict(X)

# Create and train the model (assumes X_train and y_train are already defined)
model = MyClassifier()
model.fit(X_train, y_train)

# Save model artifacts
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_artifacts")
artifacts = {
    "model_info": "Random Forest Classifier with default parameters"
}

# Create, add to collection, and save the model output in one step
Helpers.save_output_rc_ml_model(context, "customer_classifier", MyClassifier, "model_artifacts", "v1.0")

save_output_mlflow_model

@staticmethod
def save_output_mlflow_model(context, model_name, rc_version=None, mlflow_model_version=None, experiment_name=None, run_id=None) -> TemplateOutput:

Purpose: Creates a template output for an MLflow-tracked model, adds it to the collection, and saves the context in one step.

Parameters:

context: The workflow or notebook context dictionary.
model_name: (str) Name of the MLflow model.
rc_version: (str, optional) Version identifier for the model within RapidCanvas. Defaults to "default" if not provided.
mlflow_model_version: (str or int, optional) Specific version of the MLflow model to reference.
experiment_name: (str, optional) MLflow experiment name where the model run is logged.
run_id: (str, optional) MLflow run ID to uniquely identify a model run.

Returns: The created TemplateOutput object representing the MLflow model output

Example:

# Assuming 'context' is initialized and MLflow tracking is set up

model_name = "churn_prediction_model"
rc_version = "v2.1"
mlflow_model_version = 5
experiment_name = "ChurnPrediction"
run_id = "e9bdc6c1e2b442b38bdaafc8798e8a1f"

# Store the MLflow model reference and save the output in one step
Helpers.save_output_mlflow_model(
    context, 
    model_name, 
    rc_version=rc_version, 
    mlflow_model_version=mlflow_model_version, 
    experiment_name=experiment_name, 
    run_id=run_id
)

Notes:

This method is typically used after training and logging a model to MLflow.
It leverages the MLflowUtils helper methods to ensure that model references and outputs are properly recorded.
The context is saved automatically after adding the output, ensuring changes are persisted.

save_output_artifacts

@staticmethod
def save_output_artifacts(context, artifact_name) -> TemplateOutput:

Purpose: Creates an artifact output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
artifact_name: Name of the artifact

Returns: The created TemplateOutput object

Example:

import os
import pickle

# First create an artifacts directory
artifacts_dir = Helpers.getOrCreateArtifactsDir(context, "model_files")

# Save model files to the directory (assumes model is a trained, picklable object)
model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Create, add to collection, and save the artifact output in one step
Helpers.save_output_artifacts(context, "model_files")
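
Before registering an artifact, it can help to confirm that the file written to the artifacts directory loads back correctly. The sketch below is a standalone check under stated assumptions: a temporary directory stands in for the path returned by `Helpers.getOrCreateArtifactsDir`, and a plain dictionary stands in for a trained model.

```python
import os
import pickle
import tempfile

# Stand-in for the directory returned by Helpers.getOrCreateArtifactsDir;
# a temporary directory is used so the sketch runs anywhere.
artifacts_dir = tempfile.mkdtemp(prefix="model_files_")

# Stand-in for a trained model: any picklable object works for this check.
model = {"type": "example", "weights": [0.1, 0.2, 0.3]}

model_path = os.path.join(artifacts_dir, "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Read the file back to confirm the artifact is complete and loadable.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(sorted(os.listdir(artifacts_dir)))
```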

save_output_answer

@staticmethod
def save_output_answer(context, answer) -> TemplateOutput:

Purpose: Creates an answer output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
answer: The answer text or object

Returns: The created TemplateOutput object

Example:

# Analyze data and create an answer
analysis_result = "Based on the data analysis, customer segment A shows the highest retention rate at 87%."

# Create, add to collection, and save the answer output in one step
Helpers.save_output_answer(context, analysis_result)

save_output_metadata

@staticmethod
def save_output_metadata(context, metadata_list) -> TemplateOutput:

Purpose: Creates a metadata output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
metadata_list: List of metadata objects

Returns: The created TemplateOutput object

Example:

from utils.dtos.metadata import Metadata, MetadataSubjectType

# Create metadata objects
metadata1 = Metadata(
    subject_type=MetadataSubjectType.DATASET,
    subject_name="customers",
    key="row_count",
    value="5000"
)

metadata2 = Metadata(
    subject_type=MetadataSubjectType.MODEL,
    subject_name="customer_classifier",
    key="accuracy",
    value="0.92"
)

# Create, add to collection, and save the metadata output in one step
Helpers.save_output_metadata(context, [metadata1, metadata2])

save_output_file

@staticmethod
def save_output_file(context, output_name, file_contents, file_type: FileType = FileType.TEXT) -> TemplateOutput:

Purpose: Creates a file output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
output_name: Name of the output
file_contents: String content of the file
file_type: Type of file from FileType enum (TEXT, JSON, MARKDOWN, HTML)

Returns: The created TemplateOutput object

Example:

import json

from utils.dtos.templateOutput import FileType

# Create a JSON configuration file
config = {
    "model_params": {
        "n_estimators": 100,
        "max_depth": 5
    }
}
json_content = json.dumps(config, indent=2)

# Create, add to collection, and save the file output in one step
Helpers.save_output_file(context, "model_config", json_content, FileType.JSON)
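
Because file_contents is passed as a string, a malformed JSON string would presumably only be noticed downstream. A minimal pre-check, using only the standard library, might look like the sketch below; `is_valid_json` is a hypothetical helper and not part of RapidCanvas.

```python
import json

config = {"model_params": {"n_estimators": 100, "max_depth": 5}}
json_content = json.dumps(config, indent=2)


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


valid = is_valid_json(json_content)           # well-formed document
broken = is_valid_json('{"model_params": ')   # truncated document
print(valid, broken)
```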

save_output_vector_store

@staticmethod
def save_output_vector_store(context, vector_store_obj) -> TemplateOutput:

Purpose: Creates a vector store output, adds it to the collection, and saves the context in one step.

Parameters:

context: The context dictionary
vector_store_obj: Vector store object (must extend VectorStoreBase)

Returns: The created TemplateOutput object

Example:

from utils.libutils.vectorStores.faiss import FaissVectorStore
import numpy as np

# Create a vector store
vector_store = FaissVectorStore(name="document_embeddings")

# Add documents with their embeddings
documents = ["This is document 1", "This is document 2", "This is document 3"]
embeddings = np.random.rand(3, 128)  # Simulated embeddings
for i, (doc, embedding) in enumerate(zip(documents, embeddings)):
    vector_store.add_item(i, doc, embedding)

# Create, add to collection, and save the vector store output in one step
Helpers.save_output_vector_store(context, vector_store)

Utility Functions

initH2o

@staticmethod
def initH2o(h2o=None, h2oServerUrl=None, init_type=1):

Purpose: Initializes the H2O machine learning library.

Parameters:

h2o: Optional H2O module (imported if None)
h2oServerUrl: URL of H2O server to connect to
init_type: 0 to connect to an existing server only; 1 (default) to start a local instance if the connection fails

Returns: Initialized H2O module

Example:

# Initialize H2O framework
h2o = Helpers.initH2o()

# Load data into H2O frame
h2o_frame = h2o.import_file("/path/to/data.csv")

# Train a model using H2O (assumes predictors and target are already defined)
model = h2o.estimators.gbm.H2OGradientBoostingEstimator()
model.train(x=predictors, y=target, training_frame=h2o_frame)

Notes:

If no server URL is provided, tries to read it from an environment variable
If no server is available, creates a local instance with a random name
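
The fallback order described above can be sketched without H2O installed. The block below is a rough illustration under several assumptions: the environment variable name `H2O_SERVER_URL`, the `can_connect` callback, the `local-h2o-` name prefix, and the error raised for init_type=0 are all hypothetical and may differ from what `initH2o` actually does.

```python
import os
import uuid

# Hypothetical environment variable name, chosen for this sketch only.
ENV_VAR = "H2O_SERVER_URL"


def choose_h2o_target(server_url=None, init_type=1, can_connect=None, env=None):
    """Decide between connecting to a server and starting a local instance."""
    env = os.environ if env is None else env
    can_connect = can_connect or (lambda url: False)

    # An explicit URL wins; otherwise fall back to the environment variable.
    url = server_url or env.get(ENV_VAR)
    if url and can_connect(url):
        return ("connect", url)

    # init_type 0 means "connect only" (assumed here to fail loudly).
    if init_type == 0:
        raise ConnectionError("no reachable H2O server and init_type=0")

    # Otherwise start a local instance under a random name.
    return ("local", f"local-h2o-{uuid.uuid4().hex[:8]}")


remote = choose_h2o_target(
    env={ENV_VAR: "http://h2o.example:54321"},
    can_connect=lambda url: True,
)
local = choose_h2o_target(env={})
print(remote, local)
```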

getChildDir

@staticmethod
def getChildDir(context):

Purpose: Returns the transform directory for saving temporary files.

Parameters:

context: The context dictionary

Returns: Path to the child directory

Example:

import os

# Get the transform directory
temp_dir = Helpers.getChildDir(context)

# Use it to save temporary files (assumes df is an existing pandas DataFrame)
temp_file_path = os.path.join(temp_dir, "temp_data.csv")
df.to_csv(temp_file_path, index=False)

getTenantId

@staticmethod
def getTenantId(context):

Purpose: Retrieves the tenant ID from the context.

Parameters:

context: The context dictionary

Returns: Tenant ID string

Example:

# Get the tenant ID
tenant_id = Helpers.getTenantId(context)
print(f"Current tenant: {tenant_id}")

Notes:

Checks multiple keys for backward compatibility
Returns "test-tenant" as fallback if not found
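
A lookup with backward-compatible keys and a fallback value, as these notes describe, can be sketched as follows. The key names in `CANDIDATE_KEYS` are assumptions made for illustration; the keys that `getTenantId` actually checks are not listed on this page.

```python
# Hypothetical key names; the real helper may check different keys.
CANDIDATE_KEYS = ("tenantId", "tenant_id")


def get_tenant_id(context, default="test-tenant"):
    """Return the first non-empty tenant value found, else the default."""
    for key in CANDIDATE_KEYS:
        value = context.get(key)
        if value:
            return value
    return default


from_new_key = get_tenant_id({"tenantId": "acme"})
from_old_key = get_tenant_id({"tenant_id": "acme-legacy"})
fallback = get_tenant_id({})
print(from_new_key, from_old_key, fallback)
```

The fallback means a missing tenant ID does not raise an error, so code that must not run against the test tenant may want to compare the result against "test-tenant" explicitly.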

get_file_data

@staticmethod
def get_file_data(context, file_name):

Purpose: Gets the content of a file by name.

Parameters:

context: The context dictionary
file_name: Name of the file to read

Returns: File content as a string

Raises: Exception if the file is not found

Example:

import json

# Get the content of a configuration file
try:
    config_content = Helpers.get_file_data(context, "config.json")
    config = json.loads(config_content)
    print(f"Loaded configuration with {len(config)} settings")
except Exception as e:
    print(f"Error loading file: {e}")

How it works:

Checks if files_data exists in the context
Searches for the file with the matching name
If the file path doesn't exist locally, downloads it
Reads and returns the file contents
Raises an exception if the file is not found, listing available files
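
The lookup steps above can be approximated with a short standalone sketch. It assumes that each files_data entry is a dictionary with `name` and `path` keys, which this page does not confirm, and it leaves out the download step. `read_file_by_name` is a hypothetical stand-in, not the library function.

```python
import os
import tempfile


def read_file_by_name(context, file_name):
    """Find file_name in context['files_data'] and return its text."""
    files = context.get("files_data") or []
    for entry in files:
        if entry["name"] != file_name:
            continue
        if not os.path.exists(entry["path"]):
            # The real helper is described as downloading the file here;
            # this sketch only reports the missing path.
            raise FileNotFoundError(entry["path"])
        with open(entry["path"], "r", encoding="utf-8") as f:
            return f.read()
    # No match: report which files are available.
    available = [entry["name"] for entry in files]
    raise Exception(f"File '{file_name}' not found. Available files: {available}")


# Build a small demo context backed by a temporary file.
tmp_dir = tempfile.mkdtemp()
config_path = os.path.join(tmp_dir, "config.json")
with open(config_path, "w", encoding="utf-8") as f:
    f.write('{"threshold": 0.5}')

demo_context = {"files_data": [{"name": "config.json", "path": config_path}]}
content = read_file_by_name(demo_context, "config.json")

try:
    read_file_by_name(demo_context, "missing.txt")
    error_message = None
except Exception as exc:
    error_message = str(exc)

print(content)
print(error_message)
```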

generate_warning

@staticmethod
def generate_warning(context, warning):

Purpose: Adds a warning message to the context.

Parameters:

context: The context dictionary
warning: Warning message to add

Example:

# Generate a warning about missing data
Helpers.generate_warning(
    context, 
    "10% of records have missing values in the 'age' column"
)

Notes:

Warnings are automatically included in the output when save() is called

save

@staticmethod
def save(context):

Purpose: Saves all outputs and artifacts in the context to disk.

Parameters:

context: The context dictionary

Raises: Exception if no outputs are in the template output collection

Example:

# At the end of your notebook, save all outputs
Helpers.save(context)

How it works:

Checks if there are outputs in the template output collection
Uploads artifacts only if should_materialize is absent from the context or set to True
Validates that all artifacts have corresponding outputs
Processes special output types (like ML models)
Persists all outputs to disk
Writes a configuration map with output file names and warnings
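
A simplified sketch of this flow is shown below. It is a rough approximation under stated assumptions: the context keys `outputs`, `artifacts`, and `warnings`, the file naming, and the shape of the configuration map are all hypothetical. Artifact validation and special handling for ML models are omitted.

```python
import json
import os
import tempfile


def save_sketch(context, out_dir):
    """Approximate the save() flow with plain dictionaries and JSON files."""
    outputs = context.get("outputs", [])
    if not outputs:
        raise Exception("No outputs in the template output collection")

    # "Upload" artifacts unless should_materialize is present and falsy.
    uploaded = []
    if context.get("should_materialize", True):
        uploaded = list(context.get("artifacts", []))

    # Persist each output to disk as its own JSON file.
    file_names = []
    for output in outputs:
        file_name = f"{output['name']}.json"
        with open(os.path.join(out_dir, file_name), "w", encoding="utf-8") as f:
            json.dump(output, f)
        file_names.append(file_name)

    # Write a configuration map with output file names and warnings.
    config_map = {"outputs": file_names, "warnings": context.get("warnings", [])}
    with open(os.path.join(out_dir, "config_map.json"), "w", encoding="utf-8") as f:
        json.dump(config_map, f)
    return config_map, uploaded


out_dir = tempfile.mkdtemp()
demo_context = {
    "outputs": [{"name": "processed_data", "rows": 3}],
    "artifacts": ["model_files"],
    "warnings": ["10% of records have missing values in the 'age' column"],
    "should_materialize": False,
}
config_map, uploaded = save_sketch(demo_context, out_dir)
print(config_map, uploaded)
```

With should_materialize set to False, as in this demo, outputs and the configuration map are still written but the artifact list is left untouched.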

Notes:

This should be called at the end of your notebook
Handles uploading artifacts, packaging ML models, and saving all outputs
You must have at least one output in the template output collection
Uses the should_materialize flag to conditionally skip artifact uploading in test environments
