The Databricks data lakehouse unifies the best of data warehouses and data lakes in one simple platform to handle all your data, analytics, and AI use cases. You can create a direct connection to export data from Crisp to Databricks via Databricks-to-Databricks Delta Sharing (an open protocol created by Databricks for secure, real-time exchange of large datasets).
Requirements
To create this connection you will need to meet the following requirements and gather the required information:
- Have a Databricks account with admin access.
- Set up a Unity Catalog metastore (with Delta Sharing enabled) that is associated with the workspace to which you want Crisp data to flow. To ensure your Databricks account is set up correctly to share data, make sure you have completed the following steps (you can click the links to see the relevant Databricks documentation).
Note: In Databricks, documentation is cloud-provider specific, so these links display Amazon Web Services documentation by default, but you can change the cloud provider in the top-right corner of the screen.
- Get your Databricks sharing identifier (a minimal lookup sketch follows this list).
- Requesting a sharing identifier. Note: Though that article is written as if you were directing a data recipient to locate their sharing identifier, the steps for locating your own sharing identifier are the same.
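If you prefer to look up the sharing identifier from a notebook instead of the Catalog Explorer UI, the sketch below shows one way to do it. It assumes the notebook is attached to Unity Catalog-enabled compute; the `CURRENT_METASTORE()` SQL function returns the global metastore ID (format `<cloud>:<region>:<uuid>`), which is the value Databricks uses as the sharing identifier.

```python
# Minimal sketch: look up your Databricks sharing identifier from a notebook.
# Assumes the notebook is attached to Unity Catalog-enabled compute, where the
# `spark` session is predefined.

# CURRENT_METASTORE() returns the global metastore ID, e.g. "aws:us-east-1:<uuid>",
# which is the value to paste into the Crisp connector as the sharing identifier.
sharing_identifier = spark.sql("SELECT CURRENT_METASTORE()").collect()[0][0]
print(f"Databricks sharing identifier: {sharing_identifier}")
```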
Setting up the connection in Crisp
- Log into your Crisp account.
- From the main menu on the left side of the screen, select Connectors.
- Select the Add connector button.
- Select the Data Destinations tab, then select the Databricks connector tile.
- If you want, update the default name of the connector, then enter or paste your Databricks sharing identifier. For instructions on getting your sharing identifier, see Databricks documentation > Requesting a sharing identifier. Note: Though that article is written as if you were directing a data recipient to locate their sharing identifier, the steps for locating your own sharing identifier are the same.
- From the Databricks region drop-down menu, select a Databricks cloud region (e.g., us-east-1). Note: You can select any cloud region, but for optimal performance we recommend selecting the same cloud region as your workspace in Databricks. If you need to look up your cloud region, you can find it on the Workspaces page in Databricks. For more information, see Databricks documentation > Update a workspace.
- In the Connector source data section, select the Select button.
The Select source data screen appears.
- Select the retailer(s) for which you want to export data (e.g., UNFI), then choose which data tables and columns to include in the export. For detailed instructions on making your data feed selections, see Selecting Destination Data Sources.
Hint: You can select the book icon to the right of a table name to see its documentation.
- To complete your selections, select the Accept button.
The window closes and you return to the connector setup screen.
- If you want, choose additional columns to include in your export:
- To include a column that identifies which source connector the data came from, select the Include Source Connector Configuration ID option.
Hint: This is useful if you have multiple connectors for the same retailer, such as one for U.S. sales and one for Canadian sales.
- To include a column that provides the date and time data was last sent from Crisp to the destination, select Include Last Exported Timestamp.
Note: Crisp retroactively ingests and re-exports data for a set period of time after the initial data capture, known as a lookback period, to account for changes from the source (such as missing register sales added later). You can use the last exported timestamp column to confirm whether data was recently exported from Crisp, which can help you troubleshoot issues or reduce costs by automating how often data flows from the destination to other systems at your organization (see the sketch after these setup steps). For more on retroactive data changes, see Data changes for destination connections.
- Select the Save button. The connection setup is complete. You can check the status of the connection by clicking the new Databricks tile.
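If you enable either optional column, you can use it downstream to limit how much data you reprocess. The sketch below is a hypothetical PySpark example: the catalog, schema, table, and column names (`source_connector_configuration_id`, `last_exported_timestamp`) are illustrative assumptions and may differ from what Crisp actually emits, so check the shared table's schema in Databricks before using it.

```python
# Hypothetical sketch: use the optional Crisp export columns to limit downstream
# processing to recently exported rows. Catalog/table/column names are assumptions;
# inspect the shared table in Databricks to confirm the actual names.
from pyspark.sql import functions as F

fact_sales = spark.table("crisp_share.unfi.fact_sales")  # assumed shared table name

# Keep only rows Crisp exported in the last day (covers lookback re-exports).
recent = fact_sales.where(
    F.col("last_exported_timestamp") >= F.current_timestamp() - F.expr("INTERVAL 1 DAY")
)

# Break the refresh down by source connector, e.g. U.S. vs. Canadian sales connectors.
recent.groupBy("source_connector_configuration_id").count().show()
```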
Accessing Crisp Data in Databricks
The Crisp connection uses Databricks-to-Databricks sharing. For more information on accessing shared data in Databricks, see Databricks documentation > Read data shared using Databricks-to-Databricks Delta Sharing.
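As a starting point, the sketch below shows the typical recipient-side steps in a Databricks notebook: find the provider created for the Crisp share, list its shares, and mount a share as a catalog so its tables can be queried. The provider, share, catalog, schema, and table names are placeholders; substitute the names you see in your own workspace, and note that creating a catalog from a share requires the appropriate Unity Catalog privileges.

```python
# Sketch of the recipient-side steps for Databricks-to-Databricks Delta Sharing.
# Provider, share, catalog, schema, and table names below are placeholders.

# 1. List providers visible to your metastore and find the one for the Crisp share.
spark.sql("SHOW PROVIDERS").show(truncate=False)

# 2. List the shares that provider has granted you access to.
spark.sql("SHOW SHARES IN PROVIDER `crisp_provider`").show(truncate=False)

# 3. Mount a share as a catalog so its tables appear in Unity Catalog.
spark.sql("CREATE CATALOG IF NOT EXISTS crisp_data USING SHARE `crisp_provider`.`crisp_share`")

# 4. Query the shared tables like any other Unity Catalog table.
spark.table("crisp_data.unfi.fact_sales").limit(10).show()
```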