Create a template
NOTE: Charts and Artifacts to be added.
This is a guide to creating a curated template in RapidCanvas. The templates you create will be used by no-code users on the RapidCanvas platform to perform a wide range of tasks on a wide range of datasets.
Goal
Your goal is to create a generalized template notebook file that is well documented, usable and understandable for no-code users, and works across any relevant dataset.
Setup
RapidCanvas allows you to create templates and to use existing ones. Templates make it easier to work with different types of data, save you time and effort, and make it easy to create common logic.
To start building a template, follow the instructions below:
You will work within a folder named template-lib. Your project will be located in the folder "projects".
Under projects you will find an example implementation, the employee-v3-syntax project. This folder contains the transforms and the main flow file, where you can observe how the SDK is used in a project.
Create an account on http://staging.dev.rapidcanvas.net/#/
Go to the template-lib folder, run the run_tmpltlib.py script (./run_tmpltlib.py), and follow the steps in your command line.
If you open the example employee project folder, you'll see a few folders (data, output, and transforms) as well as a Jupyter notebook file. For now, you only need to be concerned with the transforms and the flow notebook.
Building a solution in RapidCanvas involves two types of Jupyter notebook files: transform files and flow files. Transform files are notebook files that each contain an individual data transformation or operation that you perform on an input dataset.
Transforms are used inside of the main flow file. The flow file is where we create our project, authenticate on RapidCanvas, and build our data pipeline. To use our transform file inside of the flow, we create a recipe. A recipe is an instantiation of one or more transforms. See the titanic example. The recipe is then run inside of the flow file and produces an output dataset.
Try running employee_flow.ipynb and follow along inside of the RapidCanvas UI step-by-step to see how the pipeline is built from the flow.
Creating the Template
To create a template, you will write a Jupyter notebook that takes a set of input parameters and produces the required output. Requirements for each specific template will be sent to you.
You are given a project folder called “employee-v3-syntax” which contains a flow file, 2 example transform Notebooks, and some test datasets to use to build your template.
Start by launching your flow file with the run_tmpltlib.py script (you may need to edit the flow notebook to remove the existing auth token) to see how the flow corresponds to the UI. Follow along on the UI to see what is happening.
Next, create a copy of one of the transform notebooks inside of the transforms folder. Change the name of the notebook to the name of the transform that you are building.
Inside of this notebook is where you will write all of the code for your template.
Then, run and debug the notebook you have created by following the example code inside of the flow notebook. The notebook demonstrates how to publish and use your template with the two given examples.
Documenting the Template
When creating the template, you need to document a few things inside of the notebook:
The input parameters. These should be understandable to a no-code user and have solid descriptions for the tooltip in the UI. See the example transform for more. While creating the template, be sure to check how it behaves in the UI; this is the main point. A business user should be able to use the template on their own dataset.
Usage of the template. You should write, in a markdown block, the expected usage of the template, including which datasets it works for, best practices, and any usage boundaries/conditions.
Testing the Template
To test the template, pick a few relevant datasets. Some have been provided in the data folder of the template_creation folder. Test the template using the chosen datasets. Try using it inside of the UI by modifying the input parameters there. You can do this by editing the recipe and hitting the edit & test button on the listed transform.
After testing, fix any unexpected results/bugs and document any usage boundaries that you have discovered. Once this is done and you are satisfied with the result, send the project folder back to us. We will then review your results and come back with notes or any needed changes.
Displaying a template on the UI
When creating a transform, you can display it on the UI. You can use this to expose parameters inside of the UI so that your transform can be used by non-technical users and work across projects.
Displaying the transformation using DataApp v3
To expose parameters on the UI, you must define each parameter and its properties in the second code block of your transform notebook. Each parameter is created by a call to a “get or create” function that takes three arguments: a name, which is the name used inside the flow file to pass variables into your transformation; metadata, a dictionary that defines how the parameter will be used and displayed; and local_context, which will always be equal to locals().
Creating an input dataset
To add an input dataset to be used in your transformation, use the get_or_create_input_dataset method. Here is an example:
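A minimal sketch of such a call, assuming the (name, metadata, local_context) signature described above. The function defined first is a hypothetical stand-in so that the snippet runs on its own; in a real transform notebook the function comes from the RapidCanvas SDK, and the way it resolves the value may differ. The dataset name "employee" is only an illustration.

```python
from types import SimpleNamespace


def get_or_create_input_dataset(name, metadata, local_context):
    # Hypothetical stand-in for the SDK function (an assumption, not the real code):
    # it returns an object whose .value is whatever was passed in under `name`.
    return SimpleNamespace(name=name, metadata=metadata, value=local_context.get(name))


# Simulates the value a flow would pass in under the parameter name.
inputDataset = "employee"

inputDatasetParameter = get_or_create_input_dataset(
    name="inputDataset",
    metadata={
        "input_name": "Input Dataset",
        "is_required": True,
        "tooltip": "Dataset the transformation is applied to.",
    },
    local_context=locals(),
)

print(inputDatasetParameter.value)
```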
The input dataset can then be used in your transform by using inputDatasetParameter.value.
We recommend reading the input dataset value once and assigning it to a variable (for example, input_dataset = inputDatasetParameter.value), then using that variable in the rest of the notebook.
Required metadata fields: input_name, is_required, tooltip
Creating an input variable
To add an input variable to be used in your transformation, use the get_or_create_input_var method. Here is an example:
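A minimal sketch of an input variable, assuming the same (name, metadata, local_context) signature. The function below is a hypothetical stand-in for the SDK call; its fallback to default_value mirrors how this guide describes that field, but the real resolution logic is an assumption. The tooltip text and the default date are illustrative.

```python
from types import SimpleNamespace


def get_or_create_input_var(name, metadata, local_context):
    # Hypothetical stand-in for the SDK function (assumption):
    # use the value passed in under `name`, else fall back to default_value.
    value = local_context.get(name, metadata.get("default_value"))
    return SimpleNamespace(name=name, metadata=metadata, value=value)


start_dateParameter = get_or_create_input_var(
    name="start_date",
    metadata={
        "input_name": "Start date",
        "is_required": True,
        "tooltip": "First date to include in the output.",
        "multiple": False,            # the user picks a single value
        "datatypes": ["TIMESTAMP"],   # only timestamp values are accepted
        "options": ["CONSTANT"],      # the user types a value instead of picking a column
        "default_value": "2020-01-01",
    },
    local_context=locals(),
)

print(start_dateParameter.value)
```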
Then, the variable can be used in your transform by using start_dateParameter.value.
Required metadata fields: input_name, is_required, tooltip, multiple, datatypes, options
Creating an output dataset parameter
To give users the ability to define the name of an output dataset, use the get_or_create_output_dataset method. Here is an example:
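A minimal sketch, again with a hypothetical stand-in for the SDK function so the snippet is self-contained. The fallback to default_value and the plain dict used to hold the result are assumptions made for illustration; the real notebook hands its output back through the SDK.

```python
from types import SimpleNamespace


def get_or_create_output_dataset(name, metadata, local_context):
    # Hypothetical stand-in for the SDK function (assumption).
    value = local_context.get(name, metadata.get("default_value"))
    return SimpleNamespace(name=name, metadata=metadata, value=value)


outputDatasetParameter = get_or_create_output_dataset(
    name="outputDataset",
    metadata={
        "input_name": "Output Dataset",
        "is_required": True,
        "tooltip": "Name to give the dataset produced by this transformation.",
        "default_value": "employee_filtered",
    },
    local_context=locals(),
)

# Illustration only: store the result rows under the name the user chose.
rows = [{"id": 1}, {"id": 2}]
outputs = {outputDatasetParameter.value: rows}
print(list(outputs))
```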
The output dataset name given by the user can then be accessed by using outputDatasetParameter.value.
Required metadata fields: input_name, is_required, tooltip
Creating an output chart
To give users the ability to name the output charts, use the get_or_create_output_chart method. Example:
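A minimal sketch with a hypothetical stand-in for the SDK function. Here a value is supplied under the parameter name, so it takes precedence over default_value; that precedence is an assumption about how the real function behaves, and both chart names are illustrative.

```python
from types import SimpleNamespace


def get_or_create_output_chart(name, metadata, local_context):
    # Hypothetical stand-in for the SDK function (assumption).
    value = local_context.get(name, metadata.get("default_value"))
    return SimpleNamespace(name=name, metadata=metadata, value=value)


# Simulates the chart name a user entered in the UI.
outputChart = "Attrition by year"

outputChartParameter = get_or_create_output_chart(
    name="outputChart",
    metadata={
        "input_name": "Output Chart",
        "is_required": False,
        "tooltip": "Name shown for the chart produced by this transformation.",
        "default_value": "Headcount by department",
    },
    local_context=locals(),
)

chart_title = outputChartParameter.value
print(chart_title)
```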
The name of the output chart given by the user is then accessed inside the transform notebook using outputChartParameter.value.
Required metadata fields: input_name, is_required, tooltip
Metadata fields
default_value
Value that will be used if the user does not enter any input for the parameter.
input_name
Name of the parameter to be displayed on the UI
datatypes
List that limits the datatypes accepted by the parameter to any of STRING, LONG, DOUBLE, BOOLEAN, TIMESTAMP, or ALL.
options
List that defines what the data input options are. They can be 'FIELDS', 'CONSTANT', or both. FIELDS allows a user to select a column/field from the datasets listed in datasets. If you choose FIELDS for options, you must define the datasets from which the user can select the columns. If CONSTANT is used, you can add a list of the possible options that users can choose from by defining constant_options.
constant_options
List of options that you want presented to the user. See options above.
datasets
List of datasets that will be used to populate field/column options. See options above.
tooltip
Description of the input parameter. This will be displayed to the user when they click on a “?” icon. Here is where you should clearly and concisely describe the parameter and how it will be used. It is best to keep it to a sentence or two.
is_required
Boolean value that defines whether or not the value is needed for your transformation
multiple
Boolean value that defines whether or not a user can select multiple values for the input parameter.
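The rules in this section can be checked before a transform is published. The helper below is a hypothetical convenience, not part of the RapidCanvas SDK; it only encodes what this guide states: the required fields per parameter type, the allowed datatypes and options values, and the requirement that datasets be defined when options includes FIELDS.

```python
REQUIRED_FIELDS = {
    "input_dataset": {"input_name", "is_required", "tooltip"},
    "input_var": {"input_name", "is_required", "tooltip", "multiple", "datatypes", "options"},
    "output_dataset": {"input_name", "is_required", "tooltip"},
    "output_chart": {"input_name", "is_required", "tooltip"},
}
ALLOWED_DATATYPES = {"STRING", "LONG", "DOUBLE", "BOOLEAN", "TIMESTAMP", "ALL"}
ALLOWED_OPTIONS = {"FIELDS", "CONSTANT"}


def check_metadata(kind, metadata):
    """Return a list of problems found in a metadata dict (empty if none)."""
    missing = REQUIRED_FIELDS[kind] - set(metadata)
    problems = [f"missing field: {name}" for name in sorted(missing)]

    bad_types = set(metadata.get("datatypes", [])) - ALLOWED_DATATYPES
    if bad_types:
        problems.append(f"unknown datatypes: {sorted(bad_types)}")

    bad_options = set(metadata.get("options", [])) - ALLOWED_OPTIONS
    if bad_options:
        problems.append(f"unknown options: {sorted(bad_options)}")

    # FIELDS needs at least one dataset to take the columns from.
    if "FIELDS" in metadata.get("options", []) and not metadata.get("datasets"):
        problems.append("options includes FIELDS but datasets is not defined")

    return problems


dataset_meta = {"input_name": "Input Dataset", "is_required": True, "tooltip": "Dataset to use."}
column_meta = {
    "input_name": "Column",
    "is_required": True,
    "tooltip": "Column to filter on.",
    "datatypes": ["STRING"],
    "options": ["FIELDS"],
}

print(check_metadata("input_dataset", dataset_meta))
print(check_metadata("input_var", column_meta))
```

Running such a check while developing the notebook catches an incomplete metadata dictionary before you test the parameter in the UI.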
Using a template in a flow notebook
To use a template inside of the flow notebook, start by creating a new template using the syntax in the following example:
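As a rough sketch of this step, assuming a template carries the name, description, tags, source, and status fields this guide mentions. The Template class is a hypothetical stand-in so the snippet runs on its own; the real class and its import path come from the RapidCanvas SDK and may be named differently (employee_flow.ipynb shows the actual syntax). The name, description, and tags are illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Template:
    # Hypothetical stand-in for the SDK's template object (assumption).
    name: str
    description: str
    tags: List[str] = field(default_factory=list)
    source: str = "CUSTOM"
    status: str = "ACTIVE"


template = Template(
    name="Date range filter",
    description="Keeps the rows whose date falls inside a chosen range.",
    tags=["filter", "date"],
    source="CUSTOM",
    status="ACTIVE",
)

print(template)
```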
Give your template a name, description, and tags. For now, your source should always be "CUSTOM" and status should always be "ACTIVE".
Next, you will add the transform notebook you have made to a template transform. See the following example for syntax:
Then you will add the template transform to the template and publish your template:
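The last two steps (wrapping the notebook in a template transform, then attaching it to the template and publishing) can be sketched as follows. Every class, attribute, and method name here is a hypothetical stand-in chosen for illustration, and the notebook file name is made up; the stub publish() only assigns an id, whereas the real call publishes the notebook to RapidCanvas.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TemplateTransform:
    # Hypothetical stand-in: points at the transform notebook you wrote.
    notebook_name: str


@dataclass
class Template:
    # Hypothetical stand-in for the SDK's template object (assumption).
    name: str
    transforms: List[TemplateTransform] = field(default_factory=list)
    id: Optional[str] = None

    def publish(self):
        # Stub behaviour only: refuse an empty template, then assign an id
        # that later steps can refer to.
        if not self.transforms:
            raise ValueError("add at least one template transform before publishing")
        self.id = str(uuid.uuid4())
        return self.id


template = Template(name="Date range filter")

# Step 1: add the transform notebook to a template transform.
template_transform = TemplateTransform(notebook_name="date_range_filter.ipynb")

# Step 2: add the template transform to the template and publish.
template.transforms.append(template_transform)
template.publish()

print(template.id)
```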
To use your published template as a transform, you then need to create a transform object, set its templateId to the id of your published template, give it a name, and pass in the values you want to use for your variables:
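A minimal sketch of this step. The Transform class and its variables attribute are hypothetical stand-ins (only templateId is named in this guide), and the id, transform name, and values are placeholders. The keys of the variables dictionary are the parameter names declared in the transform notebook.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Transform:
    # Hypothetical stand-in for the SDK's transform object (assumption).
    templateId: Optional[str] = None
    name: Optional[str] = None
    variables: Dict[str, object] = field(default_factory=dict)


# Placeholder for the id of your published template.
published_template_id = "template-123"

transform = Transform()
transform.templateId = published_template_id
transform.name = "filter_from_2020"
transform.variables = {
    "inputDataset": "employee",
    "start_date": "2020-01-01",
    "outputDataset": "employee_from_2020",
}

print(transform)
```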
Note that the names of the variables were defined previously inside of the transform notebook with the get_or_create...() method.
Now you are ready to add the transform to a recipe and run the recipe.
For further reference, see the employee_flow.ipynb example in the employee-v3-syntax project folder.