Data Integration - Data Import
This feature is only available on the Enterprise plan.
Data Import
The Data Import feature lets you send event data stored in AWS S3 or GCP GCS to Hackle. Data is imported once a day.
Supported Cloud Storage
Cloud    Storage     Supported
AWS      S3          Y
AWS      Redshift    Coming soon
GCP      GCS         Y
GCP      BigQuery    Coming soon
Requirements
The following tasks are required before importing data.
Create Key and Grant Permissions: GCP GCS
For GCP GCS, you can create a Key by referring to the GCP IAM > Creating and managing service account keys documentation.
The following permissions are required when creating a Key for GCS access.
storage.buckets.get
storage.objects.get
storage.objects.create
storage.objects.delete
storage.objects.list
Create Key and Grant Permissions: AWS S3
For AWS S3, you can create a Key and grant the necessary permissions by referring to the following documents.
Follow the AWS Docs: Creating an IAM User documentation to create an AWS IAM User.
Follow the AWS Docs: Creating IAM Policies documentation to create a Policy that includes the IAM Policy attached below. Then attach that Policy to the IAM User created in the previous step.
Follow AWS Docs: Creating IAM Keys to create a Key.
Data Import Format
Data Import currently supports the Apache Parquet format. Process and store your data according to the Parquet schema described below.
Field: Insert ID
Column: insert_id
Type: STRING
Example: 8fb8e088-9245-4fce-bb87-7e09d9917ed6
Description: UUID value used to check for duplicate events.

Field: Event Key
Column: event_key
Type: STRING
Example: purchase
Description: Event name.

Field: Client Timestamp
Column: ts
Type: TIMESTAMP
Example: 2023-01-01 00:01:02.333 (UTC)
Description: Timestamp in UTC (sub-millisecond values are truncated).

Field: Metric Value
Column: metric_value
Type: DECIMAL(24, 6)
Example: 0.0
Description: Used for value calculations in analytics and experiments. (Store 0.0 if not needed)

Field: Identifiers
Column: identifiers
Type: Map<String, String>
Example: { "id": "8fb8e088-9245-4fce-bb87-7e09d9917ed6", "device_id": "89ABCDEF-01234567-89ABCDEF", "user_id": "49591", "session_id": "1659710029.4.1.1659710504.0" }
Description: Map containing user identifiers.
  (Optional) user_id: Logged-in user identifier (corresponds to userId when sent via Hackle SDK)
  (Required) id: Device identifier (corresponds to id when sent via Hackle SDK)
  (Required) device_id: Device identifier (corresponds to deviceId when sent via Hackle SDK)
  (Optional, stored when using GA) ga_session_id, ga_device_id
  Identifier key values are stored in lowercase.

Field: Event Properties
Column: event_properties
Type: Map<String, String>
Example: { "product_id": "33537", "product_category": "LEISURE", "order_id": "291994100" }
Description: Properties containing event information. Property key values are stored in lowercase.

Field: User Properties
Column: user_properties
Type: Map<String, String>
Example: { "grade": "GOLD", "date_signed": "2022-07-01", "date_recent": "2023-01-17" }
Description: Properties containing user information. Property key values are stored in lowercase.

Field: Platform Properties
Column: platform_properties
Type: Map<String, String>
Example (Android): { "osname":"Android", "appversion": "6.9.0", "language":"ko", "osversion":"12", "devicevendor":"samsung", "versionname":"6.77.0-DEBUG", "platform":"Mobile", "devicemodel":"SM-S908N" }
Example (iOS): { "osname":"iOS", "appversion": "6.9.3", "language":"ko-KR", "osversion":"16.0.2", "devicevendor":"Apple", "versionname":"6.77.0", "platform":"Mobile", "devicemodel":"iPhone14,2" }
Description: Properties containing platform information.
  (Required) osname (Android, iOS)
  (Required) appversion
  Property key values are stored in lowercase.
Below is a summary of the data format described in the table above.
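As a rough illustration of the schema above, the following Python sketch shapes one event into a dict whose keys match the column names. The helper names (normalize_event, lower_keys) and the specific checks are assumptions made for illustration and are not part of any Hackle tooling. The required-key lists follow the (Required) notes in the schema. The sketch does not check DECIMAL precision, and writing the result to a Parquet file would additionally need a Parquet library, which is not shown here.

```python
from datetime import datetime, timezone
from decimal import Decimal
import uuid

# Keys marked (Required) in the schema notes above.
REQUIRED_IDENTIFIERS = ("id", "device_id")
REQUIRED_PLATFORM_KEYS = ("osname", "appversion")


def lower_keys(mapping):
    """Return a dict with lowercase string keys and string values."""
    return {str(k).lower(): str(v) for k, v in (mapping or {}).items()}


def normalize_event(event_key, ts, identifiers, platform_properties,
                    metric_value="0.0", event_properties=None,
                    user_properties=None, insert_id=None):
    """Hypothetical helper: shape one event to match the columns above."""
    identifiers = lower_keys(identifiers)
    platform_properties = lower_keys(platform_properties)
    for key in REQUIRED_IDENTIFIERS:
        if key not in identifiers:
            raise ValueError(f"missing required identifier: {key}")
    for key in REQUIRED_PLATFORM_KEYS:
        if key not in platform_properties:
            raise ValueError(f"missing required platform property: {key}")
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    # Convert to UTC and drop everything below one millisecond.
    ts_utc = ts.astimezone(timezone.utc)
    ts_utc = ts_utc.replace(microsecond=ts_utc.microsecond // 1000 * 1000)
    return {
        "insert_id": insert_id or str(uuid.uuid4()),
        "event_key": event_key,
        "ts": ts_utc,
        # DECIMAL(24, 6): keep six fractional digits.
        "metric_value": Decimal(str(metric_value)).quantize(Decimal("0.000001")),
        "identifiers": identifiers,
        "event_properties": lower_keys(event_properties),
        "user_properties": lower_keys(user_properties),
        "platform_properties": platform_properties,
    }
```

Passing a fixed insert_id for each source event, rather than letting the helper generate one, keeps the value stable across retries, which matters because insert_id is used to check for duplicate events.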
Processing Data for Data Import
Process the data into the Parquet format described above and store it in the bucket, partitioned by day.
Below are examples of stored partitions and files.
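The exact partition layout is not shown in this section, so the following Python sketch only illustrates the general idea of grouping events by day and deriving one object path per daily partition. The dt=YYYY-MM-DD directory name, the part-NNNNN.parquet file name, the bucket prefix, and the choice of the UTC calendar day as the partition boundary are all hypothetical. Confirm the actual layout with Hackle before relying on it.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_partition_key(ts):
    """Return the UTC calendar day of a timezone-aware timestamp as YYYY-MM-DD."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def group_by_day(events):
    """Group event dicts (each with a timezone-aware 'ts') by UTC day."""
    groups = defaultdict(list)
    for event in events:
        groups[daily_partition_key(event["ts"])].append(event)
    return dict(groups)


def partition_path(prefix, day, file_index):
    """Hypothetical layout: <prefix>/dt=<day>/part-<index>.parquet."""
    return f"{prefix.rstrip('/')}/dt={day}/part-{file_index:05d}.parquet"


# Usage with a placeholder bucket and prefix.
events = [
    {"event_key": "purchase", "ts": datetime(2023, 1, 1, 23, 59, tzinfo=timezone.utc)},
    {"event_key": "purchase", "ts": datetime(2023, 1, 2, 0, 0, tzinfo=timezone.utc)},
]
for day, day_events in sorted(group_by_day(events).items()):
    print(partition_path("s3://my-bucket/events/", day, 0), len(day_events))
```

Keeping each day in its own directory means a daily import can read one partition without scanning the rest of the bucket.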
Data Import Request
Please contact Hackle to request Data Import. The following information is required for Data Import.