Connect to external connectors
The RapidCanvas connectors module lets you connect to external data sources, import data into the platform, and make predictions on this data with the machine learning models you have built.
List of connectors supported:
Google Cloud Storage. For more information, see Import data from Google Cloud Storage.
Amazon S3. For more information, see Import data from Amazon S3.
Azure Blob. For more information, see Import data from Azure Blob.
MongoDB. For more information, see Import data from Mongo DB.
Snowflake. For more information, see Import data from Snowflake.
MySQL. For more information, see Import data from MySQL.
Amazon Redshift. For more information, see Import data from Amazon Redshift.
Fivetran connectors. Example: Google Drive. For more information, see Import data from Google Drive.
Viewing options in dataset pull-out window
Use this procedure to view all the actions you can perform through the pull-out window of a dataset.
To view dataset options:
Click on the dataset block on the canvas. This opens the pull-out window.
Perform any of these actions:
Click Preview to view the data in the file you have uploaded. For more information, see Viewing the dataset information.
Click the AI Guide icon to navigate to the AI Guide to query about the dataset.
Click the plus icon to select Template Recipe, AI-assisted Recipe, or Rapid Model Recipe options. For more information, see Recipes.
View the significance of the dataset node. This AI snippet is visible only after the AI snippet generator has been run. You can update the content at any time using the Update content option.
Click the ellipses icon and select Export to export the file onto your local system in CSV format. For more information, see Exporting a dataset to the local system.
View the summary of the dataset in the Summary section. You can create the summary in the Notebook.
Review these details:
Created: The date on which the file was uploaded.
Updated: The date on which the file was last updated.
Total size: The total file size.
Rows: The total number of rows in the file.
Columns: The total number of columns in the file.
Source: The source from where the dataset has been imported. Clicking on the file import will redirect you to the View Data page.
If the dataset has been imported from connectors, you can view the following source details:
Data connector name: The name of the connector. Clicking on the link will redirect you to the data connector configuration page.
Type: The type of connector.
If this is an output dataset, you can view the Destination Details section:
Select the data connector to which you want to send the output dataset.
If the selected connector is GCS, specify the destination folder name and destination file name, then click Save to save these details. Later, you can export the file. You can also use the Delete option to remove the configured destination.
If the selected connector is MySQL, specify the table name. There are two options:
Append: Appends the dataset to the existing table, provided both have the same schema.
Replace: Replaces the existing data and schema with the new one.
Once configured, click Save to save the details. Later, you can export the file.
If the selected connector is Snowflake, specify the table name, database name, schema, warehouse, and role. There are two options:
Append: Appends the dataset to the existing table, provided both have the same schema.
Replace: Replaces the existing data and schema with the new one.
Once configured, click Save to save the details. Later, you can export the file.
Note: Fivetran connectors cannot be configured in the Destination Details section.
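The difference between the Append and Replace options can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a MySQL or Snowflake destination; the `write_table` helper and the `predictions` table are hypothetical names chosen for illustration and are not part of RapidCanvas.

```python
import sqlite3

def write_table(conn, table, columns, rows, mode):
    """Write rows to `table` using 'append' or 'replace' semantics.

    Illustration only: table and column names are interpolated directly,
    so they must come from a trusted source.
    """
    cur = conn.cursor()
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    if mode == "replace":
        # Replace: discard the existing data and schema, then recreate the table.
        cur.execute(f"DROP TABLE IF EXISTS {table}")
        cur.execute(f"CREATE TABLE {table} ({col_list})")
    elif mode == "append":
        # Append: the incoming columns must match the existing table's schema.
        existing = [row[1] for row in cur.execute(f"PRAGMA table_info({table})")]
        if existing != list(columns):
            raise ValueError(f"schema mismatch: {existing} != {list(columns)}")
    else:
        raise ValueError(f"unknown mode: {mode}")
    cur.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
write_table(conn, "predictions", ["id", "score"], [(1, 0.9), (2, 0.4)], "replace")
write_table(conn, "predictions", ["id", "score"], [(3, 0.7)], "append")
row_count = conn.execute("SELECT COUNT(*) FROM predictions").fetchone()[0]
```

In this sketch, an append with a different set of columns raises an error instead of writing, which mirrors the requirement that both datasets share the same schema.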
Viewing the dataset information
Use this procedure to view dataset details and perform various actions.
To view data:
Select the dataset block uploaded onto the canvas. This opens the pull-out window.
Click Preview to navigate to the Data page. The dataset records are displayed in a tabular format.
View the data in each column under the Data tab:
Check the data type associated with each column.
Use the search box to find specific terms and clear the search when needed.
Column Options
Click the ellipses icon next to each column to:
Hide – Hide the column in the table.
Sort ascending – Sort the column in ascending order.
Sort descending – Sort the column in descending order.
Search column – Search for a specific column name.
Additional options:
Click Resize Columns to adjust column width.
Click Schema to view the data type of each column in the dataset.
View associated segments under the Segments tab:
Name: The name of the segment.
Description: The description of the segment.
Created: The date and time the segment was created.
Rows: The row limit for segmentation.
Actions: Edit or delete the segment.
Under the Data Analysis tab, analyze the dataset to identify:
Missing values
Total variables (numeric, text, categorical)
Total observations
Duplicate values
You can also generate charts to gain insights from complex data. The charts help you understand the data patterns and trends needed for feature engineering. Four charts are generated initially; running an advanced analysis, which sends the dataset to AI, gives you up to 10 charts.
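The kind of summary shown under the Data Analysis tab can be approximated offline on an exported dataset. The following is a minimal sketch, assuming the dataset is loaded as a list of dictionaries and that empty strings and `None` count as missing values; the `profile` function is a hypothetical helper, and the platform may compute these figures differently.

```python
def is_missing(value):
    """Treat None and empty strings as missing (an assumption of this sketch)."""
    return value is None or value == ""

def profile(rows):
    """Summarise a dataset given as a list of dicts, one dict per row."""
    columns = list(rows[0]) if rows else []
    # Missing values per column.
    missing = {c: sum(1 for r in rows if is_missing(r.get(c))) for c in columns}
    # A row counts as a duplicate if an identical row appeared earlier.
    seen = set()
    duplicates = 0
    for r in rows:
        key = tuple(r.get(c) for c in columns)
        if key in seen:
            duplicates += 1
        seen.add(key)
    # A column is numeric if every non-missing value is an int or a float.
    numeric = []
    for c in columns:
        values = [r.get(c) for r in rows if not is_missing(r.get(c))]
        if values and all(isinstance(v, (int, float)) for v in values):
            numeric.append(c)
    return {
        "observations": len(rows),
        "variables": len(columns),
        "numeric": numeric,
        "missing": missing,
        "duplicates": duplicates,
    }

sample = [
    {"age": 34, "city": "Pune"},
    {"age": None, "city": "Oslo"},
    {"age": 34, "city": "Pune"},
]
summary = profile(sample)
```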
Navigate to the Correlation tab to extract correlations and relationships within the data. The correlation heat map shows how each variable in the dataset is correlated with one another, as a color-coded matrix.
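As an illustration of what a correlation matrix contains, the sketch below computes pairwise Pearson coefficients for numeric columns. Pearson correlation is an assumption here; the source does not state which measure the Correlation tab uses. The `pearson` and `correlation_matrix` functions and the sample data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Raises ZeroDivisionError if either list is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def correlation_matrix(columns):
    """Return {column: {column: coefficient}} for a dict of numeric columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

data = {
    "hours": [1, 2, 3, 4],
    "score": [2, 4, 6, 8],    # rises with hours
    "errors": [8, 6, 4, 2],   # falls as hours rise
}
matrix = correlation_matrix(data)
```

Each cell of a heat map corresponds to one entry of such a matrix: values near 1 indicate that two variables rise together, values near -1 that one falls as the other rises, and values near 0 that there is little linear relationship.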
Under the Alerts sub-tab in Data Analysis, view alerts and associated tags.
Click the Source tab. This tab is visible only for SQL-based connectors. It lets you modify the SQL query directly within the interface so that the dataset reflects the latest query results. Any change to the query automatically updates the dataset with the most current data available. However, updating the query in the data pipeline moves all recipes linked to this dataset to an unbuilt state. To incorporate the changes, you must re-run the entire data flow.
Review the source details:
Source Type: SQL-based connector type.
Connection Name: Name of the connector.
Updated on: Date the connector was last updated.
Query: Click Edit to modify the SQL query and run it.
Important: Updating the query in the data pipeline moves all related recipes to an unbuilt state, requiring a re-run of the data flow.
Click Save Query to update the query. Doing this replaces the existing data with the latest query data.
Click Yes, Replace in the dialog box. This replaces the existing query with the new SQL query.
Click Audit History in the Actions tab to check the log of user activities. Each entry includes the user who performed the action, the type of action performed, and the timestamp.
You can perform the following actions on the data view page by clicking the plus icon:
Append a file: Add a new file to the dataset (default is Append).
Add a template recipe: Use the Template option.
Use AI-Assisted recipe: Use the AI-assisted option.
Run Rapid Model recipe: Use the Rapid Model option.
Add a segment: Available only for the source dataset.
You can perform the following actions by clicking the Actions drop-down:
Download the dataset, using the Export option.
Use the Export Search Results option in the Data tab to export only the rows that match your search term.
Delete the dataset and its associated recipes, using the Delete option.
You can also use the AI Guide to query this dataset, using the AI Guide icon.
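The effect of Export Search Results can be reproduced on a local copy of the data. This is a minimal sketch that assumes a row matches when any cell contains the search term, ignoring case; the exact matching rule used by the platform is not stated in this document, and `export_search_results` is a hypothetical helper.

```python
import csv
import io

def export_search_results(rows, term, out):
    """Write to `out`, as CSV, only the rows where any cell contains `term`.

    Matching is case-insensitive. Returns the number of rows written.
    """
    if not rows:
        return 0
    needle = term.lower()
    matches = [
        row for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(matches)
    return len(matches)

rows = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Oslo"},
    {"id": 3, "city": "Pune"},
]
buffer = io.StringIO()  # stands in for a file opened with newline=""
exported = export_search_results(rows, "pune", buffer)
```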
Exporting a dataset to the local system
Use this procedure to download an input or output dataset to your local system in CSV format.
To export a dataset:
Select the dataset block that you have uploaded onto the canvas. This opens the pull-out window.
Click Export to download the dataset in CSV format to your local system.
You can also export the dataset from the dataset page, using the EXPORT option. The dataset page opens when you click VIEW DATA in the pull-out window.
Deleting the uploaded dataset
Use this procedure to delete a dataset block from the canvas.
To delete a dataset block:
Select the dataset block that you want to delete from the canvas. This opens the pull-out window.
Click Delete to delete the dataset.
A dialog box prompts that deleting the dataset also deletes the recipes associated with it.
Click Delete to delete the dataset permanently from the project or click Cancel to discard the action.
Reloading the latest dataset from a Fivetran connector
Use this procedure to reload fresh data from a Fivetran connector. The dataset syncs with the remote storage and retrieves the latest data.
To reload a dataset:
Select the dataset block that you want to reload from the canvas. This opens the pull-out window.
Click Reload to reload the dataset.
A dialog box appears. Click Reload to fetch the latest dataset. Ensure that the schema of the reloaded dataset is the same as that of the current one.
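One way to check the schema before reloading is to compare column names and types of the current dataset against the incoming one. The sketch below assumes each schema is available as a dictionary of column name to type name, for example copied from the Schema view; `schema_diff` and `is_same_schema` are hypothetical helpers, not RapidCanvas functions.

```python
def schema_diff(current, incoming):
    """Compare two schemas given as {column_name: type_name} dicts."""
    shared = set(current) & set(incoming)
    return {
        # Columns present now but absent from the incoming dataset.
        "missing": sorted(set(current) - set(incoming)),
        # Columns that only exist in the incoming dataset.
        "added": sorted(set(incoming) - set(current)),
        # Columns present in both but with a different type.
        "type_changed": sorted(c for c in shared if current[c] != incoming[c]),
    }

def is_same_schema(current, incoming):
    """True when the two schemas have identical columns and types."""
    return not any(schema_diff(current, incoming).values())

current = {"id": "int", "amount": "float", "city": "string"}
incoming = {"id": "int", "amount": "string", "region": "string"}
diff = schema_diff(current, incoming)
```

An empty result in all three categories means the reload should not change the structure that downstream recipes depend on.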
Configuring Destination Details for Output Datasets
When working on the Canvas, you may generate an output dataset. You can configure the Destination Details to specify where the latest output dataset will be stored each time the canvas flow runs.
Steps to Configure Destination Details:
Select the Output Dataset – Click on the output dataset block on the canvas to open the side sheet.
Choose a Data Connector – Select the connector where you want to save the output dataset. The available fields vary based on the selected connector.
Specify Table Name (for MySQL Connector) – If using a MySQL database connector, enter the table name where the output dataset should be stored.
Set Save Mode – Choose either:
Append – Adds new data to the existing dataset, provided both share the same schema.
Replace – Replaces the existing dataset with the latest output.
Save the Configuration – Click Save to apply the changes. You can then use the Export option to send the dataset to the selected connector.
Once saved, the destination connector appears as a node on the canvas, serving as a visual indicator that the output dataset is linked to a destination.
Additional Actions:
Delete Connector – Remove the connector linked to the output dataset using the Delete button.
Preview Data – Click Preview to view the contents of the dataset.
Use AI Guide – Get AI-generated prompts to explore and analyze the dataset.
Run Recipes – Click the + (plus) button to apply different recipes and perform data transformations.
Export Dataset – Click the ellipses (⋮) icon to export the dataset as CSV or Parquet.
Delete Dataset – Click the ellipses (⋮) icon to remove the output dataset from the canvas.