Data Upload Architecture

The material in this document is for informational purposes only. This guide assumes that the most recent version of Rampiva Automate is in use unless otherwise noted in the prerequisites. The products it describes are subject to change without prior notice, due to the manufacturer’s continuous development program. Rampiva makes no representations or warranties with respect to this document or with respect to the products described herein. Rampiva shall not be liable for any damages, losses, costs or expenses, direct, indirect or incidental, consequential or special, arising out of, or related to the use of this material or the products described herein.

Introduction

The Data Upload feature in Automate allows users to upload and stage data for processing directly in the Automate portal, without requiring the use of a dedicated third-party data transfer solution.

This document describes the architectural considerations of the solution. For an overview of the Automate Security Design, please see https://rampiva.atlassian.net/wiki/spaces/KB/pages/1292238849/Automate+-+Security+Design.

Data Flow

When using the Data Upload feature, data is transferred from the user’s computer, through the browser used to access the Automate portal, to the Scheduler service, which writes the data stream to a Data Set as it is received. Data Sets are folders associated with a specific Matter in Automate and are organized in a Data Repository. A Data Repository typically corresponds to a Windows File Share folder, or a Linux mount point such as Amazon FSx for Lustre.
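
For illustration, a Data Repository backed by a Windows File Share might be organized as follows; the server and folder names shown are hypothetical, and the actual layout is determined by the Automate configuration:

    \\fileserver\AutomateRepository\        <- Data Repository (Windows File Share)
        Client-Acme\
            Matter-001\
                DataSet-2023-01-Upload\     <- Data Set (upload target)
                    invoice.pdf
                    mailbox.pst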

Data Upload is performed using the tus protocol, an open protocol for resumable file uploads, and is compatible with modern browsers without requiring the installation of a browser plugin.
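
As a minimal sketch of what a tus-based upload looks like from the client side, the following TypeScript example uses the open-source tus-js-client library; the endpoint URL is hypothetical, and the Automate portal performs the equivalent steps internally:

    import * as tus from "tus-js-client";

    // Minimal resumable upload using the tus protocol (hypothetical endpoint).
    function uploadToDataSet(file: File): void {
      const upload = new tus.Upload(file, {
        endpoint: "https://automate.example.com/api/uploads", // hypothetical URL
        retryDelays: [0, 3000, 5000, 10000], // back off and retry on transient failures
        metadata: { filename: file.name, filetype: file.type },
        onProgress: (bytesUploaded, bytesTotal) =>
          console.log(`Uploaded ${bytesUploaded} of ${bytesTotal} bytes`),
        onError: (error) => console.error("Upload failed:", error),
        onSuccess: () => console.log(`Upload of ${file.name} complete`),
      });
      // tus uploads are resumable: if the connection drops, calling start()
      // again continues from the last offset confirmed by the server.
      upload.start();
    }

Because the protocol tracks the upload offset on the server, interrupted transfers resume from where they stopped rather than restarting, which matters for large data sets.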

When files are being uploaded, system metadata is automatically generated for each file, containing the name and size of the file, the name of the user who performed the upload, the upload time, and the cryptographic hash value of the file. Additionally, the user can associate custom metadata with each file and update it later, for example custodian information or other tracking-related labels. The metadata is stored in the internal Scheduler database and can be downloaded by the user from the Automate portal.
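
The following sketch illustrates the shape of such a per-file metadata record, together with a streaming hash computation; the field names and the choice of SHA-256 are illustrative assumptions, as this document does not specify the schema or hash algorithm:

    import { createHash } from "node:crypto";
    import { createReadStream } from "node:fs";

    // Hypothetical shape of the system and custom metadata stored per file.
    interface DataSetFileMetadata {
      fileName: string;
      sizeBytes: number;
      uploadedBy: string;             // user who performed the upload
      uploadedAt: string;             // upload time, ISO 8601
      hash: string;                   // cryptographic hash value (SHA-256 assumed)
      custom: Record<string, string>; // e.g. { custodian: "J. Smith" }
    }

    // Compute the hash incrementally, without loading the whole file into
    // memory, mirroring how a hash can be derived from the incoming stream.
    function hashFile(path: string): Promise<string> {
      return new Promise((resolve, reject) => {
        const hash = createHash("sha256");
        createReadStream(path)
          .on("data", (chunk) => hash.update(chunk))
          .on("end", () => resolve(hash.digest("hex")))
          .on("error", reject);
      });
    }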

When running a Job, the Engine assigned to the Job reads the previously uploaded data directly from the Windows File Share folder and receives the associated metadata from the Scheduler server.
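
As a rough sketch of how these two inputs come together at job time (the path and endpoint below are hypothetical):

    import { readdirSync } from "node:fs";
    import { join } from "node:path";

    // Hypothetical Data Set location on the Windows File Share.
    const DATA_SET_PATH =
      "\\\\fileserver\\AutomateRepository\\Client-Acme\\Matter-001\\DataSet-2023-01-Upload";

    async function loadJobInputs(schedulerUrl: string, dataSetId: string) {
      // The file content is read directly from the file share...
      const files = readdirSync(DATA_SET_PATH).map((name) => join(DATA_SET_PATH, name));
      // ...while the metadata comes from the Scheduler (hypothetical endpoint).
      const response = await fetch(`${schedulerUrl}/api/datasets/${dataSetId}/metadata`);
      const metadata = await response.json();
      return { files, metadata };
    }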

Multi-Region Data Transfers

When using a centralized Scheduler server to manage Data Repositories and Engines across multiple regions, data uploaded by users in remote regions is sent across the region borders to the Scheduler instance, which in turn stores that data on the configured Data Repository.

To address the inefficiency and potential compliance issues associated with transferring data outside of the region borders, a Rampiva Scheduler Proxy server can be deployed in the remote region. In this scenario, the main Scheduler server only receives metadata from the remote region, and all data transfer and processing is performed in the region where the data originated.
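
A sketch of this pattern is shown below: the bulk data stream is persisted in-region, and only the comparatively small metadata record crosses the region border. The endpoint and function names are hypothetical:

    import { createWriteStream } from "node:fs";
    import { join } from "node:path";
    import { pipeline } from "node:stream/promises";

    async function handleRemoteUpload(
      localRepositoryPath: string,   // Data Repository in the remote region
      centralSchedulerUrl: string,   // central Scheduler in another region
      fileName: string,
      data: NodeJS.ReadableStream,
    ): Promise<void> {
      // 1. Keep the data in-region: stream it to the local Data Repository.
      await pipeline(data, createWriteStream(join(localRepositoryPath, fileName)));

      // 2. Forward only the metadata across the region border (hypothetical endpoint).
      await fetch(`${centralSchedulerUrl}/api/metadata`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ fileName, region: "remote" }),
      });
    }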

Access Control

To set up Data Sets on a specific Matter in Automate, the user must have the Modify permission on that Matter, as well as the View permission on the Data Repository to which the Data Set is assigned. Then, to upload data to the Data Set, the user must have the Modify permission on the Matter.

To view and download the metadata associated with the data previously uploaded, the user must have the View permission on that Matter.
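
The rules above can be summarized with a small sketch; this is an illustrative encoding only, not Automate’s actual policy model:

    type Permission = "View" | "Modify";

    // Effective permissions a user holds on the relevant objects.
    interface UserGrants {
      matter: Set<Permission>;          // permissions on the Matter
      dataRepository: Set<Permission>;  // permissions on the Data Repository
    }

    // Creating a Data Set requires Modify on the Matter and View on the
    // Data Repository to which the Data Set is assigned.
    const canCreateDataSet = (g: UserGrants) =>
      g.matter.has("Modify") && g.dataRepository.has("View");

    // Uploading data to an existing Data Set requires Modify on the Matter.
    const canUploadData = (g: UserGrants) => g.matter.has("Modify");

    // Viewing and downloading the metadata requires View on the Matter.
    const canViewMetadata = (g: UserGrants) => g.matter.has("View");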

Permissions in Automate are configured in the Security Policies configuration section and can be applied at the user or group level, either on a specific Matter or on all Matters of a Client.