Self-Serve First Party Imports
The first party file import feature allows you to bring offline user data into Salesforce Audience Studio, contributing to a more comprehensive view of your customer. Offline data can include CRM data, data from other technology partners like email marketing tools, and consent data from a CMP.
Highlights of the feature include the ability to:
- Import user attributes, page attributes, transaction data, platform segments, and consent data in a self-service manner (capabilities may vary by Audience Studio Edition).
- Match customer records using one of many accepted identifier types: Salesforce Audience Studio ID (KUID), Mobile Device ID (MAID), Hashed Email, First-Party ID, and others (capabilities may vary by Audience Studio Edition).
- Customize the upload file type and format, including Audience Studio format and CSV files.
- View the status of all configured offline data connectors, as well as detailed information on prior file import jobs, including number of records imported, validated, matched and consented.
How To Use the Self-Service First Party Import File Feature
In the main menu of the UI, navigate to Manage > Data Capture Sources.
If you are importing offline data files, select the Offline Files header to view the Connector Dashboard.
Within the Offline Files Connector Tile, you will see a summary for configured data connectors, including:
- Connector Name
- Data type (user, page, transaction, platform or consent data)
- Received records (# lines in the file)
- Available records (records Audience Studio has received consent for)
Here you can create a new connector for a new import or select an existing connector under the Offline Files source. You can also disable an existing connector from this screen to pause future processing of files related to that connector.
Clicking into a specific offline data connector allows for a more detailed view, including:
- Received records: Number of lines in the imported files
- Valid Records: Number of lines that were imported with no errors
- Matched Records: Number of devices (KUIDs and/or Mobile Ad IDs) in Audience Studio that match to the User IDs uploaded in the offline file:
- Native Matched Records: Devices matched using a client’s 1st party ID graph
- Identity Matched Records: Devices matched using the Salesforce ID graph
- Available Records: Devices for which Audience Studio has received consent for data collection
- A graph visual of the metrics
- Metrics for previous imports
Please note, if your import is keyed off an identifier other than KUIDs or MAIDs (e.g. a HEM or Bridge Key**), the Matched Records count may appear significantly higher than the Received Records and Valid Records counts. This is because the Matched Records count includes all of the KUIDs and Mobile Ad IDs in your instance that are tied to each user ID in the import.
** A Bridge Key is an identifier used to tie one or more KUIDs to devices. See User Identifier Definition for more information.
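As a rough illustration of why the Matched Records count can exceed the Received Records count, the sketch below counts the devices linked to a handful of uploaded IDs. The identifiers and the `id_graph` mapping are made up for the example, and counting distinct devices is an assumption about how the metric is tallied, not a description of the platform's internals.

```python
def matched_record_count(uploaded_ids, id_graph):
    """Count distinct devices linked to the uploaded user IDs.

    id_graph is a hypothetical mapping of user ID -> linked device IDs
    (KUIDs and/or Mobile Ad IDs).
    """
    devices = set()
    for user_id in uploaded_ids:
        # IDs with no entry in the graph contribute no matched devices.
        devices.update(id_graph.get(user_id, ()))
    return len(devices)


# Hypothetical data: three received records, one of them unmatched.
id_graph = {
    "hem_a": ["kuid_1", "kuid_2", "maid_9"],
    "hem_b": ["kuid_3"],
}
uploaded = ["hem_a", "hem_b", "hem_c"]
received = len(uploaded)
matched = matched_record_count(uploaded, id_graph)
```

Here three received records produce four matched devices, because one uploaded ID is tied to three devices.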
How To Create a New Connector
1. In the Data Capture Sources dashboard, select New Connector under the Offline Files source.
2. Enter a name for your connector in the top left of the screen.
3. Click “Add Description” to enter details about the connector being created.
4. Select the Data Type that will be uploaded via this connector.
a. If you are using the Onboarding feature, you will select User Attribute Data to upload attributes, or Platform Data to upload a list of segment members.
b. Once you’ve selected the Data Type, click Next.
5. Select the Identifier Type, or the ID associated with the connector.
a. For Onboarding CRM files, select Hashed Email Address.
b. After selecting Hashed Email Address, you will be prompted to select the algorithm used to hash the emails being uploaded. Audience Studio supports the SHA1, SHA256, and MD5 algorithms.
6. Select the Data Refresh Handling setting to specify how new files should be processed. Click Next. There are two options for the data refresh setting:
a. Append: Files will be imported incrementally (each new file is added to the existing imported files). This option also requires a look back window, which defines how far back the system should look for files (that is, when the users expire). For example, if you enter a look back window of 30 days, the system will only look for files from the last 30 days, no further.
b. Overwrite: Each new file will be a full refresh of the data so all users will be updated. (Note: This setting requires a look back of 0 days.)
7. Enter the File Location. The file location must be within your pre-provisioned S3 bucket. While there is flexibility in the subfolder names and the number of subfolders, there must be one subfolder in the predefined date format (manually type "YYYY-MM-DD" as a placeholder) so the system can identify when a new file is available on a given day in the S3 bucket. Click Next.
8. Select the File Compression Type. Click Next.
9. Select the File Format. You can choose between the proprietary Krux format or CSV formats. Click Next.
10. For CSV format, specify the Delimiter if it differs from the predefined delimiter. Enter the attributes from the file (e.g. the name for the Segment ID column) and click Next.
11. If you are importing user attributes, enter one row below for each column in the imported file. First, enter the attribute name as it should display in the Segment Builder. Then, enter the attribute type (e.g. “identifier”, “user attribute”, or “ignore” to disregard the field). Ensure that the rows are entered in the same order as the columns in your offline file.
a. Note: Salesforce Audience Studio only supports a single ID per file; users cannot have multiple ID types in the same file. Click Next.
12. After clicking Next, you will be presented with a summary screen, where you can make necessary edits to any of the steps conducted to this point. When ready to save, click Create Connector.
13. After clicking Create Connector, you will be brought to the Connector Summary where your new connector will appear. Note, imports run once per day. As such, your connector summary will populate the following day after it is created.
14. When you're ready to import a file, add the file to the S3 bucket with the date folder in the YYYY-MM-DD format. Audience Studio will look for today's date minus one day.
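The date-folder convention in steps 7 and 14 can be sketched as follows. This is a minimal illustration, assuming the "YYYY-MM-DD" placeholder is swapped for an ISO-formatted date and, following step 14 as written, that an import run checks the folder for the previous day. The bucket name, path, and function names are hypothetical.

```python
from datetime import date, timedelta

PLACEHOLDER = "YYYY-MM-DD"


def resolve_location(template: str, day: date) -> str:
    """Replace the date placeholder in a File Location with a real date."""
    if PLACEHOLDER not in template:
        raise ValueError("File Location needs one YYYY-MM-DD subfolder")
    return template.replace(PLACEHOLDER, day.isoformat())


def folder_date_for_run(run_day: date) -> date:
    """Date folder an import run looks for: the run date minus one day."""
    return run_day - timedelta(days=1)


# Hypothetical File Location as it might be typed into the connector setup.
template = "s3://example-bucket/offline/crm/YYYY-MM-DD/"
checked = resolve_location(template, folder_date_for_run(date(2024, 3, 15)))
```

With these inputs, `checked` is `s3://example-bucket/offline/crm/2024-03-14/`.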
How to Import Offline Files with the Onboarding Feature
To use the onboarding feature, you will follow the exact same process outlined above for conducting traditional offline file imports:
- Set Up an Offline File Connector
a. Acceptable Data Types are “User Attribute Data” and “Platform Data”
b. Identifier Type must be Hashed Email Address
- Drop files to the corresponding S3 path designated by the user at the time of the connector creation
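Because Onboarding connectors are keyed on hashed email, addresses need to be hashed before the file is written. The sketch below uses Python's standard `hashlib` with the three algorithms named in the connector setup. Trimming whitespace and lowercasing before hashing is an assumption (a common convention, not something this article specifies), so confirm the expected normalization with your Audience Studio team. The function name is illustrative.

```python
import hashlib

# Algorithms listed in the connector setup, by their hashlib names.
SUPPORTED_ALGORITHMS = ("sha1", "sha256", "md5")


def hash_email(email: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of an email address."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    # Assumption: normalize by trimming and lowercasing before hashing.
    normalized = email.strip().lower()
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


digest = hash_email("  User@Example.com ")
```

Whichever algorithm is chosen must match the one selected for the connector, since a digest produced with a different algorithm will not match.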
If the Onboarding SKU has not been purchased, Audience Studio will attempt to match uploaded HEM-keyed attributes to device IDs using your instance’s native bridgetable only. This KUID-BridgeKey table is built over time if you are generating and passing hashed emails to Audience Studio during authenticated customer interactions (e.g. an email click, form completion, user sign-up, log in, etc.).
If the SKU has been purchased, it is auto-provisioned on the back-end and available for use within 24-48 hours. When provisioned, Audience Studio uses your native bridgetable (if present) and the Salesforce ID graph to match user attributes to device IDs for any and all HEM-based files loaded via an offline connector (newly created, or prior-existing).
Note: For Onboarding, consent files must be uploaded keyed on hashed email 24 to 48 hours in advance of uploading any User Attributes or Platform Segments. Audience Studio will propagate the consumer consent flags to all linked devices within the native and licensed identity graphs, allowing for proper gating of data collection and other data activities in subsequent uploads.
Additional First-Party Data Import Instructions
You can send a sample of the data to your Salesforce Audience Studio Implementation team to validate before sending to the Amazon S3 bucket (S3 access information will be provided securely via Box.com).
Data Privacy and Security Considerations
Audience Studio does not accept PII data, and sending it to us is a breach of the MSA. If your data is connected to email addresses and you are concerned about user matching for offline data imports, Onboarding might be a better option. Usage and setup fees apply.
Note: First and last names and plain-text email addresses are considered PII. City and ZIP code are considered PII in some countries but not all, because they describe the population of a geographic area rather than an individual.
If conducting a Full Refresh, the file should be complete: all fields and all active users should be present. For example, if you import a file with “Registration - Age” and “Registration - Cable Provider” values on 10/12, and the next import on 11/12 contains only “Registration - Age”, then the “Registration - Cable Provider” data will be removed from the platform. The full refresh only overwrites data from the standard import file, not data from any other source (e.g. your website). A full refresh of users means that every user you want Audience Studio to be aware of must be included in the file on every update.
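One way to guard against accidental removals is to compare the new full-refresh file with the previous one before uploading. The sketch below is a local pre-upload check only. It assumes each file has already been parsed into a mapping of user ID to attributes, and the function name and sample values are hypothetical.

```python
def removed_by_full_refresh(previous, new):
    """List attributes per user that the previous file had and the new one lacks.

    Both arguments map user ID -> {attribute name: value}.
    Users missing from the new file lose all of their attributes.
    """
    removed = {}
    for user_id, attributes in previous.items():
        missing = set(attributes) - set(new.get(user_id, {}))
        if missing:
            removed[user_id] = missing
    return removed


# Hypothetical parsed files mirroring the example in the text.
previous_file = {
    "u1": {"Registration - Age": "34", "Registration - Cable Provider": "Acme"},
    "u2": {"Registration - Age": "51"},
}
new_file = {
    "u1": {"Registration - Age": "35"},
}
removed = removed_by_full_refresh(previous_file, new_file)
```

In this example `removed` reports the cable provider attribute for `u1` and every attribute for `u2`, which is what a full refresh with `new_file` would drop from the imported data.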
Important: If you wish to pass through a corresponding file for consent to provide consent signals for the devices included in your first party import file, you must drop the consent file three days prior to dropping the first party import file on S3. This way we can guarantee that the consent for the devices will be ingested prior to ingesting records from the import file, and that we can correctly gate data collection and other data activities accordingly.
- The file must be formatted according to the configuration chosen during the connector setup process.
- The file must be UTF-8 encoded (US-ASCII is also accepted).
The date directory should be the day you send over the data, where YYYY-MM-DD is the date format. If you send the data on 10/12 but the date directory is 2015-10-11 then the file will not be imported. For each individual attribute, you can view the most recent date of the import in the Attributes tab to ensure the import worked properly.
File Drop Considerations
Omit any headers in the first row of the file.
The name of the file cannot contain spaces or special characters. Hyphens (-) and underscores (_) are allowed.
Make sure the upload file name is as short and precise as possible. If the name of the file uploaded is too long, it might exceed the max length cap (approximately 32 characters), and AWS will truncate the name and cause a “file not found” error.
- Example file directory format + file name:
- The date should be the day you upload
- File name example =
Audience Studio stores any data collected through online mechanisms (including user matching data) for a variable period as defined in the contractual agreement between Salesforce and your organization. First-party imports remove any data not included in subsequent imports (full refresh). For best user match results, files should be uploaded frequently, or daily, even if the data output has no change.
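The file-name rules in this section can be checked locally before a drop. The sketch below is a hedged example: it treats the approximate 32-character cap as a hard limit, and it assumes dots are acceptable for file extensions, which this article does not state explicitly. The function name and sample file names are illustrative.

```python
import re

# Assumption: the article gives "approximately 32 characters" as the cap.
MAX_NAME_LENGTH = 32

# Letters, digits, hyphens and underscores, plus dot-separated extensions.
# Allowing dots for extensions is an assumption.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)*")


def file_name_problems(name: str) -> list:
    """Return a list of problems found in an upload file name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"longer than {MAX_NAME_LENGTH} characters")
    if not _ALLOWED_NAME.fullmatch(name):
        problems.append("contains spaces or special characters")
    return problems


ok = file_name_problems("crm_attrs.csv.gz")
bad = file_name_problems("crm attrs (final).csv")
```

An empty list means the name passed both checks. Any other result lists what to fix before uploading.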
Amazon S3 Bucket Information
For general guidance on connecting to the Amazon S3 bucket with Cyberduck (Terminal can also be used), see: https://trac.cyberduck.io/wiki/help/en/howto/s3#ConnectingtoAmazonS3
- Download Cyberduck from https://cyberduck.io
- Once installed, follow these instructions to test the credentials you were supplied
- Open Cyberduck and click Open Connection in the top left.
- Select S3 (Amazon Simple Storage Service) from the drop down box at the top (if the server is not showing as s3.amazonaws.com and port is not showing as 443, the wrong connection was selected in the drop down)
- Username: (Paste the Access Key ID supplied)
- Password: (Paste the Secret Access Key supplied)
- Select the arrow by more options and Path:
- Select Connect
- You should see a file called _SUCCESS
- Click the Action wheel and select Upload
- The file must be UTF-8 encoded (US-ASCII is also accepted)
- The file cannot have line breaks. Here is an article on how to fix line breaks when a file is saved via a PC/Mac vs Linux/Unix. Please refer to problem 5. If you would like to upload a test file, your Audience Studio representative will view it to verify there are no line breaks. Your IT department should be able to assist with formatting in Terminal.
- Please let us know once the file is uploaded. Once it is uploaded, our engineers have to deploy the pipeline. A safe timeline would be 2 weeks for this to be completed.
- Moving forward, you can drop the file into the S3 bucket and the Audience Studio system will automatically ingest it. The date should be the day you send the file so that the system ingests it properly. Please make sure the format is correct.
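The encoding and line-break requirements above can be checked locally before uploading. This is a minimal sketch, assuming the "line breaks" problem refers to carriage-return characters left by PC or Mac editors. The function name is illustrative, and the check does not replace validation by your Audience Studio representative.

```python
def upload_file_problems(data: bytes) -> list:
    """Check raw file bytes for encoding and line-ending problems."""
    problems = []
    try:
        # US-ASCII is a subset of UTF-8, so one decode covers both.
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 (first bad byte at offset {exc.start})")
    # Assumption: carriage returns are the line breaks that need fixing.
    if b"\r" in data:
        problems.append("contains carriage returns (non-Unix line endings)")
    return problems


clean = upload_file_problems(b"abc123\tage:34\n")
windows = upload_file_problems(b"abc123\tage:34\r\n")
```

To check a real file, read it in binary mode, for example `upload_file_problems(open(path, "rb").read())`, so that Python does not silently translate the line endings.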