DataX Formats, Schemas, Identifiers & Limits¶
Abstract¶
This section describes the ingestion formats, partner identifiers, rate limits for taxonomy and audience posts, and audience data retention for DataX partners. It also lists the error responses.
Ingestion Formats¶
Partner will call the POST /v1/usermatch endpoint with their assigned OAuth 1.0 credentials.
Partner Match data is shared using one of the following methods:
CSV File with bzip2 compression¶
Define the schema in the .csv file header.
Example 1:
| PXID | SHA256EMAIL |
| --- | --- |
| Partner ID 1 | HashedEmail Value 1 |
| Partner ID 2 | HashedEmail Value 2 |
Example 2:
| SHA256EMAIL | PXID | GPADVID |
| --- | --- | --- |
| HashedEmail Value 1 | Partner ID 1 | GPADVID1 |
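As a minimal sketch of this method, the following Python snippet writes a comma-delimited file whose first line is the schema header and compresses it with bzip2. The function name `build_match_file` and the sample values are hypothetical; the comma delimiter follows the validation rules described later in this section.

```python
import bz2
import csv
import io


def build_match_file(rows, schema=("PXID", "SHA256EMAIL")):
    """Return bzip2-compressed CSV bytes whose first line is the schema header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(schema)  # schema header line
    for row in rows:
        if len(row) != len(schema):
            raise ValueError("row field count must match the schema")
        writer.writerow(row)
    return bz2.compress(buffer.getvalue().encode("utf-8"))


# Placeholder values only; real rows carry partner IDs and SHA256 email hashes.
payload = build_match_file([
    ("partner-id-1", "a" * 64),
    ("partner-id-2", "b" * 64),
])
```

The resulting bytes can be written to a `.csv.bz2` file or attached to the upload request.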
HTTP POST¶
Note that before uploading your data, you need to hash every email address with SHA256 and place the hashed values in your file instead of the raw addresses. For example, if the original email is barry@verizonmedia.com, the hashed value in the file would be
d48adb3c108a657adf7597921f3bfc591ee3f00d658d2d288e0bb396ac0d5964.
Important
Your file name must be properly normalized using lowercase and contain no spaces.
DataX only supports pre-hashed files. Hashing every email address with the SHA256 function before upload protects the personal data in the file, so that no raw emails are ever stored.
If an email address is not hashed in the proper format, DataX will not process the audience records.
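The hashing step can be sketched with Python's standard `hashlib` module. Trimming whitespace and lowercasing the address before hashing are assumptions made for this illustration; confirm the exact normalization DataX expects before relying on it. The helper name `hash_email` is hypothetical.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the SHA256 hex digest (64 characters) of an email address."""
    # Assumption: strip surrounding whitespace and lowercase before hashing.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


digest = hash_email("  User@Example.com ")
```

The digest is always 64 hexadecimal characters, which is the length the hashed email validation checks for.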
Follow these steps:
Define schema parameter during a POST call.
Introduce a comma-separated schema parameter that defines the column order and the name values.
Example
POST /v1/usermatch?schema=PXID,SHA256EMAIL
Partner ID1, HashedEmail Value 1
Partner ID2, HashedEmail Value 2
Partner ID3, HashedEmail Value 3
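The request above could be assembled as in the sketch below, which builds only the URL and the body. The host `datax.example.com` and the function name are placeholders, not real DataX values. OAuth 1.0 signing and the exact way the body is transmitted are not shown, because this section does not specify them.

```python
from urllib.parse import urlencode

BASE_URL = "https://datax.example.com"  # hypothetical host, replace with the real one


def build_usermatch_request(rows, schema=("PXID", "SHA256EMAIL")):
    """Return (url, body) for a POST /v1/usermatch call with a schema parameter."""
    # Keep the comma literal so the query reads schema=PXID,SHA256EMAIL.
    query = urlencode({"schema": ",".join(schema)}, safe=",")
    url = f"{BASE_URL}/v1/usermatch?{query}"
    # One record per line, fields in the same order as the schema.
    body = "\n".join(",".join(row) for row in rows)
    return url, body


url, body = build_usermatch_request([
    ("partner-id-1", "a" * 64),
    ("partner-id-2", "b" * 64),
])
```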
Notes:
A Partner has the option to define the schema as they wish.
When a schema is defined in more than one place, the following priority applies: schema parameter > schema header in the file > default schema.
If a schema is not defined during ingestion, the default schema format will be PXID|SHA256EMAIL.
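The priority rule in the notes above can be expressed as a small resolver. This is an illustrative sketch of the documented order, not the DataX implementation; the function and argument names are hypothetical.

```python
DEFAULT_SCHEMA = ["PXID", "SHA256EMAIL"]


def resolve_schema(param_schema=None, header_schema=None):
    """Pick the effective schema: parameter > file header > default."""
    if param_schema:
        return [name.strip() for name in param_schema.split(",")]
    if header_schema:
        return [name.strip() for name in header_schema.split(",")]
    return list(DEFAULT_SCHEMA)


# The schema parameter wins even when the file also carries a header.
effective = resolve_schema("SHA256EMAIL,PXID", "PXID,SHA256EMAIL")
```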
Data Validation¶
When you call the POST /v1/usermatch endpoint, the format and content of the uploaded bz2 file are validated. The header line and up to the first 10 content lines are checked. If validation fails, one of the following error codes and messages is returned:
| Error Code | Error Message | Description |
| --- | --- | --- |
| 400 DxFileIsNull | No data file provided in the request | Make sure the data file is provided. |
| 400 DxFileIsEmpty | Empty data file provided in the request | Check the content of the file. |
| 400 DxFileNotValidBZ2 | Bz2 file is malformed | Check the file is a valid bz2 file. |
| 400 DxFileNotValidSchema | Insufficient number of valid schema fields | Check the file schema field number is 2 and the delimiter is comma. |
| 400 DxFileHeaderOnly | File with only header line | Check the content lines in the file. |
| 400 DxFileInvalidDelimiter | Invalid delimiter | Check the delimiter used is comma. |
| 400 DxFileInconsistentFieldAndSchema | Field numbers not consistent with schema numbers | Check the number of content fields and schema fields are the same. |
| 400 DxFileInvalidPxid | Invalid pxid | Make sure PXID is formed by the following ASCII characters from 33 to 126 in ASCII table (44 is comma and is excluded). |
| 400 DxFileInvalidHashedEmail | Invalid hashedEmail | Make sure hashedEmail’s length is 64 and the valid characters are ‘a’-‘f’, ‘A’-‘F’ and ‘0’-‘9’. |
| 400 DxFileInconsistentContent | Inconsistent schema and content | Make sure the header field order is the same as the content field order. |
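Running similar checks locally before upload can catch most of these errors early. The sketch below mirrors a subset of the rules in the table. The order of the checks, the treatment of an empty PXID as invalid, and the function names are assumptions for illustration; the server-side validation remains authoritative.

```python
import string

HEX_CHARS = set(string.hexdigits)  # 0-9, a-f, A-F


def is_valid_pxid(value):
    # Printable ASCII 33 to 126, excluding the comma (44). Empty is treated as invalid.
    return bool(value) and all(33 <= ord(ch) <= 126 and ch != "," for ch in value)


def is_valid_hashed_email(value):
    return len(value) == 64 and all(ch in HEX_CHARS for ch in value)


def validate_lines(lines, max_content_lines=10):
    """Return the first matching error code name, or None if the checks pass."""
    if not lines:
        return "DxFileIsEmpty"
    schema = [name.strip() for name in lines[0].split(",")]
    if len(schema) < 2:
        return "DxFileNotValidSchema"
    content = lines[1:1 + max_content_lines]  # only the first lines are checked
    if not content:
        return "DxFileHeaderOnly"
    for line in content:
        fields = line.split(",")
        if len(fields) != len(schema):
            return "DxFileInconsistentFieldAndSchema"
        record = dict(zip(schema, fields))
        if "PXID" in record and not is_valid_pxid(record["PXID"]):
            return "DxFileInvalidPxid"
        if "SHA256EMAIL" in record and not is_valid_hashed_email(record["SHA256EMAIL"]):
            return "DxFileInvalidHashedEmail"
    return None


result = validate_lines(["PXID,SHA256EMAIL", "partner-id-1," + "a" * 64])
```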
DataX Partner Identifier¶
| Variable | Syntax |
| --- | --- |
| PXID - Partner Cross Identifier | |
Rate Limitation¶
DataX enforces the following rate limit on taxonomy and audience posts:
| Error Code | Error Message | Description |
| --- | --- | --- |
| 429 Too many requests | Rate Limit exceeded per hour (Limit: 100) | Number of requests allowed in an hour per provider. |
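A client can avoid 429 responses by throttling itself before sending. The sketch below uses a sliding one-hour window, which is an assumption: this section does not say whether the server counts requests in a fixed or a sliding window, so this is a conservative client-side guard rather than a mirror of the server logic. The class name is hypothetical.

```python
import time
from collections import deque


class HourlyRateLimiter:
    """Allow at most `limit` requests within any `window_seconds` span."""

    def __init__(self, limit=100, window_seconds=3600, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock          # injectable for testing
        self.sent = deque()         # timestamps of recent requests

    def try_acquire(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True


limiter = HourlyRateLimiter()
allowed = limiter.try_acquire()  # True for the first request
```

Call `try_acquire()` before each taxonomy or audience post and delay the request when it returns `False`.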
Data Retention¶
Segment Expiration:
Audiences will be linked to the active segment for 45 days.
Audience refreshes can be posted at any time (for example, daily, weekly, or monthly).
If audiences are not refreshed before the default expiration TTL elapses, they will be unlinked from the segment(s).
User Match Expiration:
Partner IDs will be stored in-house for measurement.
In-house data will be stored and linked to a Verizon Media ID for 90 days to support audience refreshes.
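To plan refresh schedules, the two retention windows can be turned into dates. This sketch assumes each window is counted in whole days from the date of the last refresh or match upload; the exact start time and time zone DataX uses are not stated here, and the function names are hypothetical.

```python
from datetime import date, timedelta

SEGMENT_LINK_TTL = timedelta(days=45)  # audience-to-segment link
USER_MATCH_TTL = timedelta(days=90)    # in-house user match data


def segment_unlink_date(last_refresh: date) -> date:
    """Date on which an unrefreshed audience would be unlinked."""
    return last_refresh + SEGMENT_LINK_TTL


def user_match_expiry_date(last_match: date) -> date:
    """Date on which stored user match data would expire."""
    return last_match + USER_MATCH_TTL


unlink_on = segment_unlink_date(date(2024, 1, 1))      # 2024-02-15
expires_on = user_match_expiry_date(date(2024, 1, 1))  # 2024-03-31
```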
Error Responses¶
| Error Code | Error Message | Description |
| --- | --- | --- |
| 400 DxInvalidRequest | <urnType> is not supported | The urnType used is not supported. |
| 400 DxJobNotFound | Cannot Find Job with id <request_id> | The request_id is not found in the DataX database. |
| 400 Bad Request | Bad Application Id | The application id is not correct. |
| 500 DxInternalError | Unable to Create Job | The upload job can’t be created. |
| 500 UNABLE_TO_PROCESS_REQUEST | Failed to process. Try again after some time | Server is not available during the processing time. |
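One way a client might act on these responses is sketched below. Retrying only on 429 and on 500 UNABLE_TO_PROCESS_REQUEST is an assumption based on the "Try again after some time" message; the backoff values and function names are illustrative, not DataX recommendations.

```python
def should_retry(status: int, error_code: str) -> bool:
    """Decide whether a failed request is worth retrying later."""
    if status == 429:
        return True  # rate limited: retry once the hourly quota frees up
    # Assumption: only this 500 error is documented as transient.
    return status == 500 and error_code == "UNABLE_TO_PROCESS_REQUEST"


def backoff_seconds(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Exponential backoff delay for the given zero-based attempt number."""
    return min(cap, base * (2 ** attempt))


retry = should_retry(500, "UNABLE_TO_PROCESS_REQUEST")
```

Errors in the 400 range, such as DxJobNotFound, indicate a problem with the request itself, so the sketch does not retry them.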