Sensors Analytics provides SDKs in several languages. Although their external interfaces differ, they all use the same data format internally. This section describes that data format in detail.

If you collect data without an SDK, you must construct each record according to the data format described in this section.

  • Note: This section defines the underlying data transmission format, which is independent of any specific SDK's calling interface.

1. Overall Data Format

Data is sent as JSON. The format is built on the JSON data types, with some specific restrictions.

1.1. Event Data Example

Record an event and associated attributes.

{ 	
	"distinct_id": "0f485d4daaadedae5f",
	"anonymous_id":"0f485d4daaadedae5f", 
	"time": 1434556935000,
	"type": "track",
	"event": "ViewProduct",
	"project": "ebiz_test",
	"time_free": true, // recommended when importing historical data; not recommended for real-time data collected by SDKs
	"identities":{ 
		"$identity_android_id":"0f485d4daaadedae5f"
	},
	"properties": {
		"$app_version":"1.3",
		"$wifi":true,
		"$province":"湖南",
		"$city":"长沙",
		"$user_agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_2 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) CriOS/58.0.3029.113 Mobile/14F89 Safari/602.1",
		"$screen_width":320,
		"$screen_height":568,
		"product_id":12345,
		"product_name":"苹果",
		"product_classify":"水果",
		"product_price":14.0
	}
}


The fields above are explained as follows:

  • distinct_id: String. The user identifier. For users who are not logged in, use a device identifier, cookie ID, or similar; for logged-in users, use the registered account. This example assumes an anonymous user, so a device ID is used;
  • login_id, anonymous_id: String. The user's identities. A user who is not logged in has only anonymous_id and no login_id;
  • time: Number. The timestamp at which the event actually occurred, in milliseconds;
  • type: The operation this record represents (see section 1.5, Summary). track records an event; here it is assumed to be a product view;
  • event: The event name. It must be a valid variable name: it cannot start with a digit and may contain only uppercase and lowercase letters, digits, underscores, and $. A leading $ marks a preset system event, so custom event names must not start with $. The length of the event field cannot exceed 100;
  • project: The name of the project this record belongs to. If it is not specified, the default project is used. The specified project must already exist in the system; otherwise the record is invalid. For more about projects, see Multi-Project v1.13;
  • time_free: Optional. Indicates that this event is not filtered by its occurrence time. As long as the time_free key is present and its value is not null, time is not checked against the allowed import time range. This field is intended for importing historical data;
  • identities: The user identification field used by Global User Association. It can contain multiple user identities. For details, see Global User Association;
  • properties: The attributes of this event, in the form of a dict. Names starting with $ are preset system attributes whose types and display names are predefined.
    A custom attribute name must be a valid variable name: it cannot start with a digit and may contain only uppercase and lowercase letters, digits, and underscores. Custom attribute names cannot start with $.
    A property with the same name must keep the same definition and type across different events;
    Property names cannot differ only in letter case. If a lowercase property already exists, the corresponding uppercase property cannot be imported (for example, if the metadata contains a property named abc, you cannot pass ABC, Abc, and so on); otherwise the data fails validation and is not stored.

    • $app_version: The version of the app the user is using;
    • $wifi: Whether the user was on Wi-Fi when the event occurred;
    • $province, $city: Province and city. If these two fields are not filled in, they are resolved from the IP address;
    • $user_agent: Optional. If this parameter is passed in, the User-Agent is parsed, and the result includes: device manufacturer, device model, operating system, operating system version, browser, browser version, and crawler name (if it is a crawler).
      Sensors Analytics currently identifies crawlers through the UA and stores the result in the preset property $bot_name (crawler name). Not every type of crawler can be added in advance. Mainstream crawlers are recognized, but two cases cannot be identified:
      • An illegitimate crawler that triggers the JS script but does not identify itself in the UA;
      • A crawler that does not trigger the JS script. It never triggers the SDK's event collection, so it is not counted at all.
    • $screen_width, $screen_height: The width and height of the screen;
    • product_id, product_name, product_classify, product_price: Custom properties describing the product.
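
The naming rules for the event field can be checked on the sending side before a record is built. The following is a minimal Python sketch, not part of any SDK: the function name is_valid_event_name and its custom flag are hypothetical, and the limit of 100 is assumed to count characters.

```python
import re

# Letters, digits, underscores and $; the first character must not be a digit.
NAME_PATTERN = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
MAX_NAME_LENGTH = 100  # assumption: the limit counts characters


def is_valid_event_name(name: str, custom: bool = True) -> bool:
    """Return True if `name` follows the event naming rules described above."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    if custom and name.startswith("$"):
        # A leading $ is reserved for preset system events.
        return False
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_event_name("ViewProduct") passes, while "1abc" and a custom "$SignUp" are rejected.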

1.2. User associated event data example

This feature is relatively complex. Please read Identify users before using it, and contact our technical support staff if necessary.


{ 	
	"distinct_id":"130xxxx1234",
	"original_id":"0f485d4d12345e5f",
	"login_id":"130xxxx1234",
	"anonymous_id":"0f485d4d12345e5f",
	"time": 1434557935000,
	"type": "track_signup",
	"event": "$SignUp",
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4d12345e5f",
		"$identity_login_id":"130xxxx1234"
	},
	"properties": {
		"$manufacturer":"Apple",
		"$model": "iPhone5,2",
		"$os":"iOS",
		"$os_version":"7.0",
		"$app_version":"1.3",
		"$wifi":true,
		"$ip":"180.79.35.65",
		"$province":"湖南",
		"$city":"长沙",
		"$screen_width":320,
		"$screen_height":568
	}
}

This data means that a user with the Android ID 0f485d4d12345e5f has successfully registered, and that the account ID after registration is 130xxxx1234. The system backend then treats the user with Android ID 0f485d4d12345e5f and the user with registration ID 130xxxx1234 as the same user.

Note that distinct_id and original_id are required fields in this data structure: distinct_id has the same value as login_id, and original_id has the same value as anonymous_id.
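
As a rough illustration, a track_signup record that satisfies this constraint could be assembled as follows. The helper build_track_signup is hypothetical and is not an SDK interface; it only mirrors the fields of the example in this section.

```python
import json
from typing import Any, Dict, Optional


def build_track_signup(login_id: str, anonymous_id: str, time_ms: int,
                       identities: Optional[Dict[str, str]] = None,
                       properties: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble a track_signup record (hypothetical helper, not an SDK call)."""
    record: Dict[str, Any] = {
        "distinct_id": login_id,      # required; same value as login_id
        "original_id": anonymous_id,  # required; same value as anonymous_id
        "login_id": login_id,
        "anonymous_id": anonymous_id,
        "time": time_ms,              # event time in milliseconds
        "type": "track_signup",
        "event": "$SignUp",
        "properties": properties or {},  # must exist, may be empty
    }
    if identities:
        record["identities"] = identities
    return record


record = build_track_signup(
    "130xxxx1234", "0f485d4d12345e5f", 1434557935000,
    identities={"$identity_android_id": "0f485d4d12345e5f",
                "$identity_login_id": "130xxxx1234"},
)
print(json.dumps(record, ensure_ascii=False))
```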


Note: Before using the track_id_bind event, confirm that the user association strategy of the current Sensors Analytics project is Global User Association. Under the simple user association strategy, this data is rejected entirely.

{
	"time":1622199005123,
	"type":"track_id_bind",
	"distinct_id":"3335654b922c4686",
	"anonymous_id":"3335654b922c4686",
	"identities":{
		"$identity_android_id":"3335654b922c4686",
		"$identity_email":"test@163.com"
	},
	"event":"$BindID",
	"properties":{
		"$app_name":"Test",
		"$device_id":"3335654b922c4686",
		"$model":"Redmi Note 4X",
		"$os_version":"7.0",
		"$app_version":"1.0",
		"$wifi":true,
		"$network_type":"WIFI",
		"$lib_method":"code",
	}
}

This data indicates that an attempt was made to associate an Android ID of 3335654b922c4686 and an email of test@163.com. After the association is successful, when the two IDs report events independently in the future, they will be treated as the same user in the Sensors Analytics system.

Note: Before using the track_id_unbind event, confirm that the user association strategy of the current Sensors Analytics project is Global User Association. Under the simple user association strategy, this data is rejected entirely.

{
	"time":1622199169262,
	"type":"track_id_unbind",
	"distinct_id":"3335654b922c4686",
	"anonymous_id":"3335654b922c4686",
	"identities":{
		"$identity_email":"test@163.com"
	},
	"event":"$UnbindID",
	"properties":{
		"$app_name":"test",
		"$device_id":"3335654b922c4686",
		"$model":"Redmi Note 4X",
		"$os_version":"7.0",
		"$app_version":"1.0",
		"$wifi":true,
		"$network_type":"WIFI",
		"$lib_method":"code",
	}
}

This data indicates that the email test@163.com is disassociated from the existing user in the system. After the disassociation succeeds, no user in Sensors Analytics is associated with the email test@163.com.

1.3. Example of User Data

User data operations are mainly used to update and delete user attributes.

A profile_set record directly sets user attributes. If an attribute already exists, it is overwritten; if it does not exist, it is created automatically.

{
	"distinct_id": "12345",
	"login_id":"12345",
	"anonymous_id":"0f485d4da1111fe5f",
	"type": "profile_set",
	"time": 1435290195610,
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4da1111fe5f",
		"$identity_login_id":"12345"
	},
	"properties": {
		"$province":"湖南",
		"FavoriteFruits": ["苹果","香蕉","芒果"],
		"Age":33,
		"$city":"长沙",
		"IncomeLevel": "3000~5000",
		"$name": "小明",
		"Gender":"男",
		"$signup_time": "2015-06-26 11:43:15.610"
	}
}

Unlike profile_set, a profile_set_once record is ignored for any attribute that already exists, so it never overwrites existing data. If the attribute does not exist, it is created automatically.

Therefore, profile_set_once is more suitable for setting attributes that are only valid when first set, such as the user's first activation time and first registration time.

{
	"distinct_id": "12345",
	"login_id":"12345",
	"anonymous_id":"0f485d4da1111fe5f",
	"type": "profile_set_once",
	"time": 1435290195610,
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4da1111fe5f",
		"$identity_login_id":"12345"
	},
	"properties": {
		"$province":"湖南",
		"FavoriteFruits": ["苹果","香蕉","芒果"],
		"Age":33,
		"$city":"长沙",
		"IncomeLevel": "3000~5000",
		"$name": "小明",
		"Gender":"男",
		"$signup_time": "2015-06-26 11:43:15.610"
	}
}

A profile_increment record increases or decreases the value of a NUMBER attribute of a user, for example incrementing the user's age attribute by 1.

If the user does not exist in the users table, a record for the user is created automatically, and the attribute is set by adding the uploaded value to a default value of 0.

{
	"distinct_id": "12345",
	"login_id":"12345",
	"anonymous_id":"0f485d4da1111fe5f",
	"type": "profile_increment",
	"time": 1435290200354,
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4da1111fe5f",
		"$identity_login_id":"12345"
	},
	"properties": {
		"Age": 1
	}
}

A profile_append record appends one or more values to an array attribute of a user. By default, duplicates are not removed: this applies both to values that duplicate ones already stored in the system and to values repeated within the uploaded data.

{
	"distinct_id": "12345",
	"login_id":"12345",
	"anonymous_id":"0f485d4da1111fe5f",
	"type": "profile_append",
	"time": 1437280200354,
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4da1111fe5f",
		"$identity_login_id":"12345"
	},
	"properties": {
		"FavoriteFruits": ["橘子","西瓜"]
	}
}

A profile_unset record sets some of a user's attributes to null. In the uploaded data, set the value of each attribute to be cleared to any non-null value, such as true.

{
	"distinct_id":"12345",
	"login_id":"12345",
	"anonymous_id":"0f485d4da1111fe5f",
	"type":"profile_unset",
	"time":1437280200354,
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4da1111fe5f",
		"$identity_login_id":"12345"
	},
	"properties":{
		"Age":true,
		"FavoriteFruits":true
	}
}

Delete a user's record.

{
	"distinct_id": "12345",
	"login_id":"12345",
	"anonymous_id":"0f485d4da1111fe5f",
	"type": "profile_delete",
	"time": 1437290200354,
	"project": "ebiz_test",
	"identities":{
		"$identity_android_id":"0f485d4da1111fe5f",
		"$identity_login_id":"12345"
	},
	"properties":{
	}
}
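
The semantics of the profile_* operations in this section can be summarized with a small in-memory model. This is only a sketch under simplifying assumptions: the users dict stands in for the users table, apply_profile_record is a hypothetical name, users are keyed by distinct_id alone, and type checks and identity association are ignored.

```python
from typing import Any, Dict

PROFILE_OPS = {"profile_set", "profile_set_once", "profile_increment",
               "profile_append", "profile_unset", "profile_delete"}


def apply_profile_record(users: Dict[str, Dict[str, Any]],
                         record: Dict[str, Any]) -> None:
    """Apply one profile_* record to an in-memory stand-in for the users table."""
    op = record["type"]
    if op not in PROFILE_OPS:
        raise ValueError(f"unsupported type: {op}")
    user_id = record["distinct_id"]
    if op == "profile_delete":
        users.pop(user_id, None)
        return
    profile = users.setdefault(user_id, {})  # create the user if missing
    for key, value in record.get("properties", {}).items():
        if op == "profile_set":
            profile[key] = value                      # overwrite or create
        elif op == "profile_set_once":
            profile.setdefault(key, value)            # keep an existing value
        elif op == "profile_increment":
            profile[key] = profile.get(key, 0) + value  # default value 0
        elif op == "profile_append":
            # Duplicates are kept, matching the default described above.
            profile[key] = list(profile.get(key) or []) + list(value)
        elif op == "profile_unset":
            profile[key] = None                       # uploaded value is ignored
```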

1.4. Item Data Example

Item data is mainly used to add, update, and delete records in the item table. Examples follow.

An item_set record creates or updates a record in the item table. If the record already exists, it is overwritten; if not, it is created automatically.

{
	"type":"item_set",
	"item_id":"12",
	"item_type":"dub",
	"project": "ebiz_test",
	"properties":{
		"title":"because of u",
		"sub_title":"st",
		"xxx":"xxx"
	}
}


Delete a record in the item table.

{
	"type":"item_delete",
	"item_id":"16",
	"item_type":"dub",
	"project": "ebiz_test"
}


The fields above are explained as follows:

  • type: item_set creates or updates a record; item_delete deletes a record;
  • item_id: The ID of the item;
  • item_type: The type of the item, which distinguishes different item tables. It must be a valid variable name: it cannot start with a digit, may contain only uppercase and lowercase letters, digits, underscores, and $, and has a maximum length of 100;
    • Note: the item table uses item_id and item_type as a composite primary key;
  • project: The name of the project this record belongs to. If it is not specified, the default project is used. The specified project must already exist in the system; otherwise the record is invalid. For more about projects, see Multi-Project v1.13;
  • properties: The attributes of the reported item, in the form of a dict. Attribute names must be valid variable names: they cannot start with a digit and may contain only uppercase and lowercase letters, digits, and underscores.
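
The composite primary key can be pictured as a dict keyed by the pair (item_type, item_id). The sketch below is illustrative only; the items dict and the function apply_item_record are hypothetical stand-ins for the item table and its import logic.

```python
from typing import Any, Dict, Tuple

ItemKey = Tuple[str, str]  # (item_type, item_id)


def apply_item_record(items: Dict[ItemKey, Dict[str, Any]],
                      record: Dict[str, Any]) -> None:
    """Apply one item_set or item_delete record to an in-memory item table."""
    key = (record["item_type"], record["item_id"])
    if record["type"] == "item_set":
        # Overwrites an existing record or creates a new one.
        items[key] = dict(record.get("properties", {}))
    elif record["type"] == "item_delete":
        items.pop(key, None)
    else:
        raise ValueError(f"unsupported type: {record['type']}")
```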

1.5. Summary

The type field in the data represents the specific operation of a piece of data, which is to record a user's action, update a user's attributes, or create an item record. Therefore, there must be a type field in each piece of data. If this field is missing, the data will be rejected by the system and cannot be stored.

See the table below for the type and corresponding operations:

Type | Corresponding operation
track | Data is imported into the events table; each row represents an event
track_signup | Data is imported into the events table, and the users table records the login ID and anonymous ID associated with the event
track_id_bind, track_id_unbind | Data is imported into the events table, and the corresponding ID is added to or removed from the users table
profile_* | Data is imported into the users table; each row represents a user
item_* | Data is imported into the items table; each row represents an item

2. Attribute data type

2.1. Automatic recognition rule of attribute data type

If an attribute is not predefined in the system, the system determines its data type from the value in the first record that imports it.

Type in JSON | Example value | Data type recognized after import | Constraints for this type
Number | 12 or 12.0 | NUMBER | -9E15 to 9E15, with up to 3 decimal places retained
Bool | true or false | BOOL | None
String | "SensorsData" | STRING | Maximum length after UTF-8 encoding is 1024 bytes. Longer values are truncated to the first 1024 bytes and stored normally
List | ["orange","watermelon"] | LIST (array of strings) | See the notes on LIST below
String | "2015-06-19 17:51:21.234", "2015-06-19 17:51:21", or "2015-06-19" | DATETIME (date time) | See the notes on DATETIME below

Notes on LIST: By default a LIST is an array of string elements (input strings are not deduplicated). The maximum number of elements is 500, and each element can be at most 255 bytes long after UTF-8 encoding. To change whether a LIST behaves as an array or as a set, contact Sensors Analytics technical support. If an append would exceed the maximum number of elements, the newly stored elements displace the earliest stored elements.

Notes on DATETIME: The first of the following formats is recommended, where SSS stands for milliseconds. The year must be in the range [1900, 2199].

  • yyyy-MM-dd HH:mm:ss.SSS
  • yyyy-MM-dd HH:mm:ss
  • yyyy-MM-dd (hours, minutes, and seconds are treated as 00:00:00)
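
The recognition rule can be approximated in Python as follows. This is a sketch, not the system's actual logic: infer_property_type is a hypothetical name, Python's %f accepts 1 to 6 fractional digits rather than exactly 3, and the year range [1900, 2199] is not enforced here.

```python
from datetime import datetime
from typing import Any

# Python equivalents of the three documented date time formats.
DATETIME_FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def infer_property_type(value: Any) -> str:
    """Guess the data type assigned to a property from its first value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOL"
    if isinstance(value, (int, float)):
        return "NUMBER"
    if isinstance(value, list):
        return "LIST"
    if isinstance(value, str):
        for fmt in DATETIME_FORMATS:
            try:
                datetime.strptime(value, fmt)
            except ValueError:
                continue
            return "DATETIME"
        return "STRING"
    raise TypeError(f"unsupported JSON value: {value!r}")
```

Because the first imported value fixes the type, a property whose first value happens to look like a date is recognized as DATETIME rather than STRING.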

2.2. Property Data Type Conversion Rules

Once a property has been created in the system, its data type is fixed. If a later import supplies a value whose type does not match the type recorded in the system, the system attempts to convert the value. If the conversion fails, the record is rejected entirely.

The system attempts the following type conversions. Columns give the original type, rows give the target type, and a dash means no conversion is attempted:

Target type ↓ / Original type → | Numeric | Boolean | String | String set | Date time
Numeric | - | true -> 1; false -> 0 | Empty string "" discards this attribute; other values are parsed as numbers | - | -
Boolean | 0 -> false; non-zero values -> true | - | The strings "true" and "false" are converted to boolean | - | -
String | The original value as a string | The original value as a string | - | The original value as a string, e.g. ["Hello","World"] | The original value as a string
String set | - | - | - | - | -
Date time | Converted from a UNIX timestamp in seconds or milliseconds, within a certain range | - | Parsed using various date time format patterns | - | -

  • In the table above, the leftmost column lists the target types and the header row lists the original types. The target type is the data type recorded in the metadata. The original type is the type of the attribute value in the uploaded data.
  • When to use a numeric attribute:
    • For values that need to be aggregated (for example, sum or average) or sorted by range, such as price, duration, or age.
    • Unless there is a special need, IDs of any kind (such as order IDs) should not be stored as numeric types.
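
Part of the conversion rules can be sketched in Python. The function convert_value and the DISCARD sentinel are hypothetical; the sketch covers only the numeric, boolean, and string targets, assumes that "true" and "false" are matched exactly, and leaves out date time conversion because its accepted range and patterns are not specified here.

```python
import json
from typing import Any

DISCARD = object()  # sentinel: drop this attribute but keep the record


def convert_value(value: Any, target: str) -> Any:
    """Convert `value` to `target`, or raise ValueError (record rejected)."""
    if target == "NUMBER":
        if isinstance(value, bool):
            return 1 if value else 0
        if isinstance(value, (int, float)):
            return value
        if isinstance(value, str):
            if value == "":
                return DISCARD
            return float(value)  # raises ValueError for non-numeric text
    elif target == "BOOL":
        if isinstance(value, bool):
            return value
        if isinstance(value, (int, float)):
            return value != 0
        if value == "true":
            return True
        if value == "false":
            return False
    elif target == "STRING":
        if isinstance(value, str):
            return value
        if isinstance(value, (bool, int, float, list)):
            # Compact JSON text, e.g. ["Hello","World"]
            return json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    raise ValueError(f"cannot convert {value!r} to {target}")
```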

3. Limitations on Importing Data

3.1. General Limitations

  1. Event names (the value of event) and attribute names (the keys in properties) must be valid variable names: they cannot start with a digit, may contain only uppercase and lowercase letters, digits, underscores, and $, and have a maximum length of 100. Custom event and property names cannot start with $;
  2. Names cannot duplicate the names of virtual events or virtual attributes that already exist in the system;
  3. Names are checked for case conflicts: a name that differs from an existing name only in letter case is rejected;
  4. The value of the type field must be one of the values listed above (track, profile_*, etc.), and it is case sensitive;
  5. The properties field must be present, but it may be empty ({});
  6. Custom event and property names cannot be the same as system reserved fields, which are listed in section 3.6.
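
The case-conflict rule in item 3 can be checked against a list of known names before importing. The sketch below assumes the caller already has the existing names (for example, exported from the metadata); find_case_conflict is a hypothetical helper, not a system API.

```python
from typing import Iterable, Optional


def find_case_conflict(name: str, existing_names: Iterable[str]) -> Optional[str]:
    """Return an existing name that differs from `name` only in case, if any."""
    lowered = name.lower()
    for existing in existing_names:
        if existing != name and existing.lower() == lowered:
            return existing
    return None
```

With an existing property abc, the names ABC and Abc both conflict, while abc itself and unrelated names do not.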

3.2. Event time limit

Importing user events with unreasonable timestamps (for example, future-dated events caused by a wrong client clock) affects data accuracy, so the event time is limited by default:

  1. For data imported through a client SDK (iOS, Android, Web, Mini Program, etc.), the server by default accepts only events that occurred between 10 days in the past and 1 hour in the future, relative to the current system time;
  2. For data imported through a backend SDK (such as Java or Python) or an import tool (such as LogAgent), only events that occurred between 2 years in the past and 1 hour in the future, relative to the current time, can be imported.

Note:

  • To import data outside the default time window, contact the on-duty support staff to change the window limit, or add a `time_free` field to the data (see the event data example in this document).
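
The default window check can be sketched as follows. This is an approximation for illustration: is_time_accepted and the source labels "client" and "server" are hypothetical, and 2 years is assumed to mean 730 days.

```python
from typing import Any, Dict

HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS

# How far in the past an event may be, by import path.
PAST_WINDOW_MS = {
    "client": 10 * DAY_MS,   # client SDKs: 10 days
    "server": 730 * DAY_MS,  # backend SDKs and import tools: 2 years (assumed 730 days)
}
FUTURE_WINDOW_MS = HOUR_MS   # both paths: up to 1 hour in the future


def is_time_accepted(record: Dict[str, Any], now_ms: int,
                     source: str = "client") -> bool:
    """Return True if the event time passes the default import window."""
    if record.get("time_free") is not None:
        return True  # key present with a non-null value: time is not checked
    event_ms = record["time"]
    earliest = now_ms - PAST_WINDOW_MS[source]
    latest = now_ms + FUTURE_WINDOW_MS
    return earliest <= event_ms <= latest
```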

About the time correction mechanism for events:

  • Because an app can only use the client time as the time an event occurs, an inaccurate client clock produces abnormal data at collection. Sensors Data therefore enables a time correction mechanism by default. Let t1 be the event time recorded by the app (client time), t2 the time the data is sent, _flush_time (client time; _flush_time is not stored), and t3 the time the server receives the data, $receive_time (server time). If t3 - t2 > 60s or t2 > t3, the client time is considered inaccurate and the event time is corrected to t1' = t1 + (t3 - t2).
    The event time is not corrected in the following scenario:
    • Delayed reporting. For example, the user force-kills the app before the data is sent, so some data is not sent in time; it is cached locally and re-sent the next time the app is opened with a working network. In this case the _flush_time at sending is accurate, so the event time is not corrected.
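
The correction rule can be written out directly. The function name correct_event_time is hypothetical; all three times are taken in milliseconds.

```python
def correct_event_time(t1_ms: int, t2_ms: int, t3_ms: int,
                       threshold_ms: int = 60_000) -> int:
    """Return the event time after applying the correction rule above.

    t1_ms: event time recorded by the app (client clock)
    t2_ms: _flush_time, when the data was sent (client clock)
    t3_ms: $receive_time, when the server received the data (server clock)
    """
    if t3_ms - t2_ms > threshold_ms or t2_ms > t3_ms:
        # The client clock is considered inaccurate: shift by the clock offset.
        return t1_ms + (t3_ms - t2_ms)
    return t1_ms
```

In the delayed-reporting case, t2 is close to t3 even though t1 is much earlier, so neither condition holds and t1 is kept unchanged.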

3.3. Property limits in different tables

For event table properties, one property can only have one type (the types of the same named property must be the same for different specific events);

For user table properties, one property can only have one type;

For item table properties, one property can only have one type;

For the same property name, it can have different types in the event table, the user table, and the item table.

3.4. Property length restriction

The data type of the property and the length limit of special fields are as follows:

Item | Limit
Data type NUMBER | -9E15 to 9E15, with up to 3 digits after the decimal point
Data type STRING | Using UTF-8 encoding, the maximum length is 1024 bytes. Longer values are truncated to the first 1024 bytes and stored normally
Data type LIST | Each LIST can contain up to 500 strings, each not exceeding 255 bytes
User identifier ($identity_login_id, etc.) | Maximum length 255 bytes
distinct_id, original_id | Maximum length 255 bytes
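
Because these limits are counted in bytes after UTF-8 encoding, not in characters, it can help to check them on the sending side. The helpers below are a hypothetical sketch. How the system treats a multi-byte character that straddles the 1024-byte boundary is not stated here, so this sketch simply drops the incomplete character.

```python
from typing import List


def truncate_utf8(text: str, max_bytes: int = 1024) -> str:
    """Keep at most `max_bytes` bytes of the UTF-8 encoding of `text`."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing partial multi-byte character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def fits_list_limits(values: List[str], max_items: int = 500,
                     max_item_bytes: int = 255) -> bool:
    """Return True if a LIST value respects the element count and size limits."""
    if len(values) > max_items:
        return False
    return all(len(v.encode("utf-8")) <= max_item_bytes for v in values)
```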

3.5. Attribute Limit

Keep the number of properties in each project's event table, user table, and item table reasonable. Too many properties degrade import and query performance, and reaching the hard limit causes import failures.

Suggested value | Hard limit
Under 300 | 2000

3.6. Reserved Fields

In order to ensure that the attribute name does not conflict with the system variable name when querying, the following reserved fields are set. Please avoid using them as event names and attribute names (keys in properties):

Reserved prefixes:
  • $
  • identity_
  • user_tag
  • user_group
  • segment_

Reserved fields:
  • user_id
  • distinct_id
  • original_id
  • time
  • properties
  • id
  • first_id
  • second_id
  • users
  • events
  • event
  • date
  • datetime

Extra reserved fields, event table:
  • event_id
  • event_bucket
  • day
  • week_id
  • month_id
  • _offset
  • sampling_group

Extra reserved fields, user table:
  • _offset
  • first_id_type
  • second_id_type
  • generated_from
  • merged_to

Extra reserved fields, item table:
  • item_type
  • item_id
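
A custom name can be screened against the reserved prefixes and the common reserved fields before it is used. The sketch below is illustrative only: is_reserved_name is a hypothetical helper, matching is assumed to be case sensitive, and the table-specific extra reserved fields are left out for brevity.

```python
RESERVED_PREFIXES = ("$", "identity_", "user_tag", "user_group", "segment_")
RESERVED_FIELDS = frozenset({
    "user_id", "distinct_id", "original_id", "time", "properties", "id",
    "first_id", "second_id", "users", "events", "event", "date", "datetime",
})


def is_reserved_name(name: str) -> bool:
    """Return True if a custom event or property name collides with a reserved one."""
    # str.startswith accepts a tuple and matches any of the prefixes.
    return name in RESERVED_FIELDS or name.startswith(RESERVED_PREFIXES)
```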