If you’re new to Unstructured, read this note first.

Before you can create a source connector, you must first sign in to your Unstructured account:
- If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
- If you already have an Unstructured account, sign in by using the URL of the sign in page that Unstructured provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.
- A Google Cloud account.
- The Google Drive API enabled in the account. Learn how.
- Within the account, a Google Cloud service account and its related credentials.json key file or its contents in JSON format. Create a service account. Create credentials for a service account.

  To ensure maximum compatibility across Unstructured service offerings, you should give the service account key information to Unstructured as a single-line string that contains the contents of the downloaded service account key file (and not the service account key file itself). To print this single-line string without line breaks, suitable for copying, you can run one of the following commands from your Terminal or Command Prompt. In this command, replace <path-to-downloaded-key-file> with the path to the credentials.json key file that you downloaded by following the preceding instructions.

  - For macOS or Linux:
  - For Windows:
- A Google Drive shared folder or shared drive.
- Give the service account access to the shared folder or shared drive. To do this, share the folder or drive with the service account’s email address. Learn how. Learn more.
- Get the shared folder’s ID or shared drive’s ID. This is a part of the URL for your Google Drive shared folder or shared drive, represented in the following URL as {folder_id}: https://drive.google.com/drive/folders/{folder_id}.
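
The single-line key string and the folder ID described in the list above can also be produced with a short script. The following is a minimal Python sketch, not an official Unstructured utility: the function names `key_file_to_single_line` and `folder_id_from_url` are hypothetical, and the URL pattern is assumed to match the form shown above. Note that the key file is parsed and re-serialized as compact JSON, which is equivalent in content but not byte-identical to simply deleting line breaks.

```python
import json
import re
from pathlib import Path


def key_file_to_single_line(path: str) -> str:
    """Return the contents of a service account key file as one line of JSON."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Compact separators keep the output on a single line with no extra spaces.
    # Newlines inside string values (such as the private key) stay escaped.
    return json.dumps(data, separators=(",", ":"))


def folder_id_from_url(url: str) -> str:
    """Extract the ID from a URL of the form .../drive/folders/{folder_id}."""
    match = re.search(r"/drive/folders/([^/?#]+)", url)
    if match is None:
        raise ValueError(f"No folder ID found in URL: {url}")
    return match.group(1)
```

For example, `print(key_file_to_single_line("<path-to-downloaded-key-file>"))` prints the string to copy, after you replace the placeholder with the path to your downloaded key file.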
Document permissions metadata
The source connector outputs any permissions information that it can find in the source location about the processed source documents and associates that information with each corresponding element that is generated. This permissions information is output into the permissions_data field, which is within the data_source field under the element’s metadata field. This information lists the users or groups, if any, that have permissions to read, update, or delete the element’s associated source document.
The following example shows what the output looks like. Ellipses indicate content that has been omitted from this example for brevity.
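
As a minimal sketch of how a consumer might read this field, the Python snippet below walks the nesting described above (metadata, then data_source, then permissions_data). It assumes the element is available as a Python dict; the helper name `get_permissions_data` is hypothetical, and the sample element, including the shape and values of the permissions entry, is illustrative only and not actual connector output.

```python
from typing import Any, Optional


def get_permissions_data(element: dict[str, Any]) -> Optional[Any]:
    """Return metadata.data_source.permissions_data, or None if it is absent."""
    metadata = element.get("metadata") or {}
    data_source = metadata.get("data_source") or {}
    return data_source.get("permissions_data")


# Hypothetical element: the permissions entry below is an assumed shape,
# used only to demonstrate the lookup path.
element = {
    "text": "...",
    "metadata": {
        "data_source": {
            "permissions_data": [
                {"read": {"users": ["user-1"], "groups": []}},
            ],
        },
    },
}

print(get_permissions_data(element))
```

The helper returns `None` rather than raising an error when an element carries no permissions information, which is possible because the connector outputs only what it can find.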
In the following examples, you must specify the
service_account_key value as a JSON-formatted object
that contains the ID of the related registered secret and its encryption type. This information represents the
encrypted version of the contents of the Google Cloud service account’s credentials.json key file. You get this
information by following the instructions in Secrets.

If you specify the service_account_key value as a plain-text string instead,
Unstructured might still create the connector successfully. However, when you then try to test or use the new connector,
the connector will fail and the following error message is returned:
Field is sensitive and must be wrapped in as a secret reference or new secret value.

- <name> (required) - A unique name for this connector.
- <drive-id> - The ID for the target Google Drive folder or drive.
- For service_account_key, specify the ID of the registered secret and its encryption type, representing the encrypted contents of the credentials.json key file. For more information, see Secrets.
- For extensions, set one or more <extension> values (such as pdf or docx) to process files with only those extensions. The default is to include all extensions. Do not include the leading dot in the file extensions. For example, use pdf or docx instead of .pdf or .docx.
- Set recursive to true to recursively process data from subfolders within the target folder or drive. The default is false if not otherwise specified.
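
The settings above can be assembled and sanity-checked before they are sent anywhere. The following Python sketch is illustrative only: `build_source_config` is a hypothetical helper, not part of any Unstructured SDK, and the `drive_id` key name and the `id` and `type` keys of the secret reference are assumptions made for this example. Consult Secrets for the actual shape of the secret reference.

```python
from typing import Any, Iterable, Optional


def build_source_config(
    drive_id: str,
    service_account_key: dict[str, str],
    extensions: Optional[Iterable[str]] = None,
    recursive: bool = False,
) -> dict[str, Any]:
    """Assemble the connector settings described above as a plain dict."""
    if not isinstance(service_account_key, dict):
        # A plain-text key might be accepted at creation time but fail on use.
        raise TypeError("service_account_key must be a secret reference object")
    config: dict[str, Any] = {
        "drive_id": drive_id,  # assumed key name for the <drive-id> value
        "service_account_key": service_account_key,
        "recursive": recursive,  # defaults to False, matching the text above
    }
    if extensions is not None:
        # Extensions are given without the leading dot: "pdf", not ".pdf".
        config["extensions"] = [ext.lstrip(".") for ext in extensions]
    return config


config = build_source_config(
    "<drive-id>",
    {"id": "<secret-id>", "type": "<encryption-type>"},  # assumed key names
    extensions=[".pdf", "docx"],
    recursive=True,
)
print(config["extensions"])
```

Leaving `extensions` unset omits the key entirely, which corresponds to the default of including all extensions.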

