Choosing the right user identification scheme has a significant impact on the accuracy of user behavior analysis, especially for user-related analysis functions such as funnels, retention, and sessions. Therefore, before integrating any data, first decide how users will be identified. This document introduces the principles of user identification in Sensors Analytics and the identification schemes for several typical situations.

Note: Do not switch the data receiving URL between projects directly on live pages, as this leads to anomalies in values such as the first-day and first-time flags. When testing offline, send test data to a test project. Once the data is verified, send the data collected online directly to the production project.

Please note that there are two types of simple user association schemes, one-to-one and many-to-one. A project can switch from one-to-one to many-to-one, but it cannot switch back from many-to-one to one-to-one. Choose the most suitable association scheme before formal integration.

After a project is deployed or reset, it uses the one-to-one association scheme by default. To switch to the many-to-one association scheme, contact the Sensors Analytics on-duty staff.

1. Basic Concepts

1.1. Sensors ID

Sensors Analytics uses the Sensors ID (i.e., user_id in the events table and id in the users table) to uniquely identify each product user. The Sensors ID is generated from the reported ID data according to certain rules.

1.2. Device ID (Anonymous ID)

Note that the device ID is not necessarily a unique identifier for the device. For example, cookies on the Web can be cleared (for instance by security or cleanup tools), and the IDFV on iOS differs between apps from different vendors.

SDK type | Rules
Android | Before version 1.10.5, the SDK defaults to a UUID (for example: 550e8400-e29b-41d4-a716-446655440000), which changes if the app is uninstalled and reinstalled. To keep the device ID stable, you can configure the SDK to use AndroidId (for example: 774d56d682e549c3). Version 1.10.5 and later use AndroidId as the default device ID, and fall back to a random UUID if AndroidId cannot be obtained.
iOS | Before version 1.10.18, the SDK first tries to use IDFV (for example: 1E2DFA10-236A-47UD-6641-AB1FC4E6483F); if IDFV cannot be obtained, it uses a randomly generated UUID (for example: 550e8400-e29b-41d4-a716-446655440000), although under normal circumstances IDFV is obtainable. You can also configure the SDK to use IDFA (for example: 1E2DFA89-496A-47FD-9941-DF1FC4E6484A); with IDFA enabled, the SDK tries IDFA first and falls back to IDFV if that fails. Starting from version 1.10.18, if the App includes the AdSupport library, the SDK uses IDFA as the anonymous ID by default. Using IDFA avoids device ID changes when the user reinstalls the App.
JavaScript | By default, the cookie_id (for example: 15ffdb0a3f898-02045d1cb7be78-31126a5d-250125-15ffdb0a3fa40a) is used. The cookie_id is generated by the Sensors Analytics JavaScript SDK and stored in the browser's cookie. To ensure uniqueness, it combines five segments with different meanings: two timestamp segments, a screen width and height segment, a random number segment, and a UA value segment.
WeChat Mini Program | By default, a UUID (for example: 1558509239724-9278730-00c1875d5f63f8-41373096) is used, but the UUID changes when the mini program is deleted. To keep the device ID stable, it is recommended to obtain and use the openid (for example: oWDMZ0WHqfsjIz7A9B2XNQOWmN3E). If you use the openid, pay attention to operation caching: obtaining the openid is asynchronous, so the cold start event and other events occur before it is available, and those events would carry the wrong device ID. Therefore, operations that happen first are cached, and data is sent only after the openid has been obtained and sa.init() has been called. For how to obtain the openid and cache operations, refer to the WeChat Mini Program SDK document.
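The operation caching described for the WeChat Mini Program can be pictured with a small model. The following is only a sketch of the idea in Python, not the SDK's implementation: the class `CachingTracker`, its `send` callback, and the event names are hypothetical, and only the buffering behavior (hold events until the ID is known, then flush them in order) is taken from the text.

```python
from typing import Callable, Dict, List, Optional


class CachingTracker:
    """Buffers events until a stable device ID (e.g. an openid) is known."""

    def __init__(self, send: Callable[[Dict], None]) -> None:
        self._send = send                      # hypothetical transport callback
        self._device_id: Optional[str] = None
        self._pending: List[Dict] = []         # events recorded before the ID arrived

    def track(self, event: str) -> None:
        record = {"event": event}
        if self._device_id is None:
            self._pending.append(record)       # hold back: the ID is not known yet
        else:
            self._emit(record)

    def init(self, device_id: str) -> None:
        """Called once the asynchronous ID lookup has finished."""
        self._device_id = device_id
        for record in self._pending:           # flush in the original order
            self._emit(record)
        self._pending.clear()

    def _emit(self, record: Dict) -> None:
        self._send({**record, "distinct_id": self._device_id})


sent: List[Dict] = []
tracker = CachingTracker(sent.append)
tracker.track("cold_start")                    # happens before the openid is available
tracker.init("oWDMZ0WHqfsjIz7A9B2XNQOWmN3E")   # openid obtained, cached events are sent
tracker.track("page_view")
```

In this model the cold start event is delivered with the openid as its device ID, because nothing is sent until `init` has been called.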

1.3. Login ID

The Login ID is usually the primary key or another unique identifier in the business database, so it is more precise and longer-lasting than the device ID. However, users may use the product without registering or logging in, in which case there is no Login ID.

The Login ID is stored in the $identity_login_id field of the users table.

Once the Login ID is determined, avoid modifying it whenever possible. If you need to modify it, contact the Sensors Analytics on-duty staff.

Note that a user in Sensors Analytics is the subject that performs events. It is not necessarily an end user; it can also be an enterprise, a merchant, or even a car. Define it flexibly according to your analysis requirements.

1.4. Scheme Description

1.4.1. One-to-one: one device ID is associated with one login ID

1.4.1.1. Applicable Scenario

This scheme applies when the product has a user registration/login system. After a device ID and a login ID are associated, actions under that device ID or that login ID are treated as actions by one user (one Sensors ID). The user is also counted once in event, funnel, retention, and other user-related analysis.

Although associating the device ID and login ID enables more accurate user tracking, it also increases the complexity of tracking instrumentation. Therefore, in general, we recommend considering ID association only when the following conditions are met:

  1. You need to link a user's behavior before and after registration on one device.
  2. You need to link a registered user's behavior on different devices after logging in.

1.4.1.2. Limitations

  • One device ID can only be associated with one login ID, but in reality, multiple users may use one device.
  • One login ID can only be associated with one device ID, but in reality, a user may use one login ID to log in on multiple devices.
  • Not following the required sequence of Sensors Analytics interface calls (e.g., during historical data import) can lead to incorrect user identification and affect the accuracy of statistics.

1.4.1.3. Client Access Implementation Method

Client integration refers to recording data with the iOS, Android, JavaScript, and other client SDKs. The call flow is as follows:

  1. After the SDK is initialized, it automatically generates a device ID as the user identifier.
  2. When the user successfully registers or logs in, or when the SDK is initialized (if the login ID is available at that point), the client actively calls the login(Login ID) interface.
  3. When the user logs out, there are several choices:
    1. Do nothing. In this case, Sensors Analytics continues to use the previous user identifier for tracking. Unless there are special circumstances, this option is generally recommended.
    2. Actively call the logout() method, which clears the login ID and reuses the device ID as the user identifier. There is generally no need to choose this option.
    3. For the JavaScript SDK, you can also call the logout(true) method, which not only clears the login ID but also regenerates the device ID.
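The call flow above can be summarized as a small state model. The following Python sketch is an illustration under stated assumptions rather than the behavior of any particular SDK: `ClientIdentity` and its attribute names are hypothetical, and a random UUID stands in for whatever device ID the real SDK generates.

```python
import uuid
from typing import Optional


class ClientIdentity:
    """Tracks which identifier a client would send as distinct_id."""

    def __init__(self) -> None:
        # Stands in for the device ID generated at SDK initialization.
        self.anonymous_id: str = uuid.uuid4().hex
        self.login_id: Optional[str] = None

    @property
    def distinct_id(self) -> str:
        # The login ID takes precedence once login() has been called.
        return self.login_id if self.login_id is not None else self.anonymous_id

    def login(self, login_id: str) -> None:
        self.login_id = login_id

    def logout(self, reset_anonymous_id: bool = False) -> None:
        # logout() clears the login ID; logout(True) also regenerates the device ID.
        self.login_id = None
        if reset_anonymous_id:
            self.anonymous_id = uuid.uuid4().hex


identity = ClientIdentity()
device_id = identity.anonymous_id
identity.login("A")
after_login = identity.distinct_id       # the login ID
identity.logout()
after_logout = identity.distinct_id      # back to the original device ID
identity.login("A")
identity.logout(reset_anonymous_id=True)
after_reset = identity.distinct_id       # a freshly generated device ID
```

If the client does nothing on logout (the generally recommended choice), `distinct_id` simply stays at the login ID.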

Note 1:

SDK type | Method to get the ID from the front-end cache
Android | Use the getAnonymousId method to get the anonymous ID assigned by the Sensors Analytics SDK: String anonymousId = SensorsDataAPI.sharedInstance().getAnonymousId();
iOS | Use the anonymousId method to get the anonymous ID that the Sensors Analytics iOS SDK assigned to the current user: NSString *anonymousId = [[SensorsAnalyticsSDK sharedInstance] anonymousId]; (Swift code example: let anonymousId:String = SensorsAnalyticsSDK.sharedInstance().anonymousId()).
JavaScript | sensors.quick('getAnonymousID'); returns the anonymous ID (supported in SDK version 1.13.4 and above).
WeChat Mini Program | sensors.getAnonymousID();

1.4.1.4. Server-side access implementation method

Server-side access covers the Java, Python, PHP, and other server SDKs, as well as importing directly with tools such as LogAgent and FormatImporter. The process is as follows:

  • When doing server-side tracking or importing historical data, if you pass a login ID to the track or profile_set interface, the is_login_id parameter must be true, which tells Sensors Analytics that the behavior was generated by a login ID. Taking the Java SDK as an example:
    • If the behavior was generated by a login ID: sa.track(registerId, true, "SubmitOrderDetail", properties);
    • If the behavior was generated by an anonymous ID: sa.track(deviceId, false, "SubmitOrderDetail", properties);
  • For any login ID, once any data has been imported for it, that login ID can no longer be associated with any device ID. Therefore, when importing historical data (data generated before integrating Sensors Analytics), it is recommended to proceed as follows:
    • First, complete the normal SDK integration and make sure all users are associated through the login/track_signup interfaces. Import historical data only after this has been running for a period of time, because by then most active users should have been successfully associated.
    • If the historical data contains the mapping between a login ID and its device ID, you can first construct track_signup requests to import this mapping, and then import the user behavior or user attribute data.
    • Because client-side tracking may lose data, we recommend that developers also call the track_signup method in the server's registration interface to associate the device ID and login ID of new users, which makes user identification more accurate.
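The import order recommended above (association records first, behavior and attribute data afterwards) can be sketched as a simple stable sort. This is a minimal illustration in Python, not the actual import format: the record layout, including the `type`, `distinct_id`, `original_id`, and `is_login_id` keys and the event name `ViewProduct`, is an assumption made for the example.

```python
from typing import Dict, List


def order_for_import(records: List[Dict]) -> List[Dict]:
    """Put association (track_signup) records ahead of all other records.

    sorted() is stable, so the relative order inside each group is preserved.
    """
    return sorted(records, key=lambda record: record["type"] != "track_signup")


# Hypothetical historical records; the field names are illustrative only.
history = [
    {"type": "track", "distinct_id": "A", "is_login_id": True,
     "event": "SubmitOrderDetail"},
    {"type": "track_signup", "distinct_id": "A", "original_id": "X"},
    {"type": "track", "distinct_id": "X", "is_login_id": False,
     "event": "ViewProduct"},
]

ordered = order_for_import(history)
```

Importing `ordered` from top to bottom associates login ID A with device ID X before any behavior data for A is written, which is the situation the recommendation is meant to guarantee.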

1.4.1.5. Case Study

Detailed steps are described as follows:

  1. A user installs an App on a Xiaomi phone and performs a series of operations. The device ID generated by the SDK is X, so the distinct_id sent is X, and the assigned Sensors ID is 1. Sensors ID 1 and device ID X are stored in the id and first_id fields of the users table.
  2. The user registers and logs in with login ID A. The SDK's login (client) or track_signup interface is called, and device ID X is successfully associated with login ID A. Login ID A is stored in the second_id field of the users table, and the Sensors ID is still 1.
  3. After logging in, the user performs a series of operations. The distinct_id sent is A, and Sensors ID 1 identifies the user.
  4. The user logs out and performs a series of operations. The SDK does not call any method, so the distinct_id sent is still A, and Sensors ID 1 identifies the current user (because login ID A is bound to Sensors ID 1).
  5. The user gives the phone to a friend, who logs in on device X with their own account (already registered, but not yet connected to Sensors Analytics), whose login ID is B. The Sensors Analytics SDK tries to associate device ID X with login ID B, but X is already associated with A, so the association fails. A new Sensors ID 2 is assigned to identify this user, and login ID B is stored in both the first_id and second_id fields of the users table. (The friend's account had not been associated with any device before and its first device association failed, so the login ID is also recorded in first_id.)
  6. The friend then uses account B to perform a series of operations on device X. The distinct_id sent is B, and Sensors ID 2 identifies the user (because login ID B is bound to Sensors ID 2).
  7. The user switches to a new iPhone and performs a series of operations. Since the user has not logged in yet, the distinct_id sent is the new device ID Y, and the assigned Sensors ID is 3. Sensors ID 3 and device ID Y are stored in the id and first_id fields of the users table.
  8. When the user logs in with account A on the iPhone, Sensors Analytics tries to associate device ID Y with login ID A. Since A is already associated with X, the association fails, but the user is still switched to login ID A, whose Sensors ID is still 1.
  9. After logging in, the distinct_id sent is A, so the user is still identified by Sensors ID 1.
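The nine steps above can be replayed with a toy model of the one-to-one rules. The Python sketch below is only an illustration of the rules as described in this case, not the server's actual implementation; the class `OneToOneResolver` and its method names are hypothetical.

```python
from typing import Dict


class OneToOneResolver:
    """Toy model: one device ID <-> one login ID."""

    def __init__(self) -> None:
        self._next_id = 1
        self.by_device: Dict[str, int] = {}    # device ID -> user ID
        self.by_login: Dict[str, int] = {}     # login ID -> user ID
        self.bound_login: Dict[int, str] = {}  # user ID -> associated login ID

    def _new_user(self) -> int:
        user_id = self._next_id
        self._next_id += 1
        return user_id

    def resolve(self, distinct_id: str, is_login_id: bool) -> int:
        """Return the user ID for an incoming distinct_id, creating one if needed."""
        index = self.by_login if is_login_id else self.by_device
        if distinct_id not in index:
            index[distinct_id] = self._new_user()
            if is_login_id:
                self.bound_login[index[distinct_id]] = distinct_id
        return index[distinct_id]

    def signup(self, login_id: str, device_id: str) -> int:
        """Try to associate a device ID with a login ID (login / track_signup)."""
        if login_id in self.by_login:
            # The login ID is already bound: association fails, existing user is used.
            return self.by_login[login_id]
        device_user = self.resolve(device_id, is_login_id=False)
        if device_user in self.bound_login:
            # The device is taken by another login ID: the login ID gets a new user.
            return self.resolve(login_id, is_login_id=True)
        self.by_login[login_id] = device_user
        self.bound_login[device_user] = login_id
        return device_user


r = OneToOneResolver()
ids = [
    r.resolve("X", False),  # step 1: anonymous use on device X
    r.signup("A", "X"),     # step 2: X is associated with A
    r.resolve("A", True),   # steps 3-4: events sent with distinct_id A
    r.signup("B", "X"),     # step 5: association fails, new user for B
    r.resolve("B", True),   # step 6
    r.resolve("Y", False),  # step 7: anonymous use on the new device Y
    r.signup("A", "Y"),     # step 8: association fails, still user A
    r.resolve("A", True),   # step 9
]
```

In this model `ids` evaluates to `[1, 1, 1, 2, 2, 3, 1, 1]`, and the anonymous behavior on device Y stays with user 3.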


In the above case, cross-device user linking is achieved to a large extent, but there are still limitations:

  • When a user changes phones, the behavior after logging in on the new phone is linked with the behavior before the change, but the behavior before the first login on the new device is not linked and is still recognized as the behavior of a new user.
  • After the user gives the old phone to a friend, the old phone can no longer be associated with the friend's login ID because it is already associated with the user's own login ID. Anyone who later uses the old phone without logging in is identified as the same user (the login ID that the old phone was successfully associated with).

1.4.2. Many-to-one: multiple device IDs are associated with one login ID

Although the one-to-one association of device ID and login ID achieves cross-device user linking, it is still not accurate enough for some application scenarios. Sensors Analytics therefore provides another association scheme that allows one login ID to be bound to multiple device IDs, enabling more accurate user tracking.

1.4.2.1. Applicable Scenario

It is common for a user to log in on multiple devices, for example on both the Web and the App. When multiple device IDs are associated with one login ID, the user's behavior across those devices is linked and treated as the behavior of a single Sensors ID.

1.4.2.2. Limitations

  • A device ID can only be associated with a single login ID, when in fact a device may be used by multiple users.
  • Once a device ID is associated with a login ID, or a login ID is associated with a device ID, the association cannot be removed, and it is never released automatically.

1.4.2.3. Implementation method

The client and server implementation is exactly the same as in the one-to-one scheme; only the processing on the Sensors Analytics server differs:

  • A login ID that is already associated with a device can still be associated with a new device ID. The new device ID is stored in the $device_id_list field of the users table.
  • A routine task reads, each day, the list of IDs that need to be fixed from the users table, i.e. $device_id_list. It then scans all events data from the past 7 days, finds the data that needs to be repaired, and changes the user_id field to match the id field in the users table.

1.4.2.4. Case

Detailed steps are described as follows:

  1. A user installs an App on a Xiaomi phone and performs a series of operations. The device ID generated by the SDK is X, so the distinct_id sent is X, and the assigned Sensors ID is 1. Sensors ID 1 and device ID X are stored in the id and first_id fields of the users table.
  2. The user registers and logs in with login ID A. The SDK's login (client) or track_signup interface is called, and device ID X is successfully associated with login ID A. Login ID A is stored in the second_id field of the users table, and the Sensors ID is still 1.
  3. After logging in, the user performs a series of operations. The distinct_id sent is A, and Sensors ID 1 identifies the user.
  4. The user logs out and performs a series of operations. The SDK does not call any method, so the distinct_id sent is still A, and Sensors ID 1 identifies the current user (because login ID A is bound to Sensors ID 1).
  5. The user gives the phone to a friend, who logs in on device X with their own account (already registered, but not yet connected to Sensors Analytics), whose login ID is B. The Sensors Analytics SDK tries to associate device ID X with login ID B, but X is already associated with A, so the association fails. A new Sensors ID 2 is assigned to identify this user, and login ID B is stored in both the first_id and second_id fields of the users table. (The friend's account had not been associated with any device before and its first device association failed, so the login ID is also recorded in first_id.)
  6. The friend then uses account B to perform a series of operations on device X. The distinct_id sent is B, and Sensors ID 2 identifies the user (because login ID B is bound to Sensors ID 2).
  7. The user switches to a new iPhone and performs a series of operations. Since the user has not logged in yet, the distinct_id sent is the new device ID Y, and the assigned Sensors ID is 3. Sensors ID 3 and device ID Y are stored in the id and first_id fields of the users table.
  8. When the user logs in with account A on the iPhone, Sensors Analytics associates device ID Y with login ID A. The association succeeds, and the corresponding Sensors ID is still 1. Device ID Y is also added to the $device_id_list field of Sensors ID 1 in the users table.
  9. After logging in, the distinct_id sent is A, so the user is still identified by Sensors ID 1.
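The difference from the one-to-one case shows up in step 8, and it can be replayed with a toy model. The Python sketch below only illustrates the many-to-one rules as described in this case and is not the server's actual implementation; `ManyToOneResolver`, `device_owner`, and `pending_repairs` are hypothetical names, while `device_id_list` mirrors the $device_id_list field mentioned in the text.

```python
from typing import Dict, List, Tuple


class ManyToOneResolver:
    """Toy model: many device IDs -> one login ID; a device has at most one login ID."""

    def __init__(self) -> None:
        self._next_id = 1
        self.by_device: Dict[str, int] = {}              # device ID -> user ID
        self.by_login: Dict[str, int] = {}               # login ID -> user ID
        self.device_owner: Dict[str, str] = {}           # device ID -> bound login ID
        self.device_id_list: Dict[int, List[str]] = {}   # user ID -> extra device IDs
        self.pending_repairs: List[Tuple[int, int]] = [] # (old user ID, new user ID)

    def resolve(self, distinct_id: str, is_login_id: bool) -> int:
        index = self.by_login if is_login_id else self.by_device
        if distinct_id not in index:
            index[distinct_id] = self._next_id
            self._next_id += 1
        return index[distinct_id]

    def signup(self, login_id: str, device_id: str) -> int:
        owner = self.device_owner.get(device_id)
        if owner is not None and owner != login_id:
            # The device already belongs to another login ID: association fails.
            return self.resolve(login_id, is_login_id=True)
        if login_id not in self.by_login:
            # First association: the login ID joins the device's existing user.
            self.by_login[login_id] = self.resolve(device_id, is_login_id=False)
            self.device_owner[device_id] = login_id
            return self.by_login[login_id]
        user_id = self.by_login[login_id]
        if owner is None:
            # Additional device for an already-associated login ID.
            old_user = self.by_device.get(device_id)
            self.device_owner[device_id] = login_id
            self.by_device[device_id] = user_id
            self.device_id_list.setdefault(user_id, []).append(device_id)
            if old_user is not None and old_user != user_id:
                self.pending_repairs.append((old_user, user_id))
        return user_id


r = ManyToOneResolver()
ids = [
    r.resolve("X", False), r.signup("A", "X"), r.resolve("A", True),  # steps 1-4
    r.signup("B", "X"), r.resolve("B", True),                         # steps 5-6
    r.resolve("Y", False), r.signup("A", "Y"), r.resolve("A", True),  # steps 7-9
]
```

In this model `ids` is again `[1, 1, 1, 2, 2, 3, 1, 1]`, but device Y now appears in `device_id_list` for user 1, and `pending_repairs` records that the data of user 3 should be re-attributed to user 1.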

Subsequent fixes are as follows:

  • Because device Y is now associated with login ID A, the data generated on device Y before login is repaired: Sensors ID 3 -> Sensors ID 1. Note that for the repaired data, a new parquet file is generated with the new user_id. The source file is not modified for the time being; the index only marks which data in the source file has become invalid.
  • At the same time, the user attributes of Sensors ID 3 in the users table are merged into Sensors ID 1. During the merge, if an attribute of Sensors ID 1 already has a value, that value is not modified. If an attribute of Sensors ID 1 has no value and the same attribute of Sensors ID 3 has a value, the value is copied to Sensors ID 1. The data of Sensors ID 3 is then deleted from the users table.
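The repair and merge rules above can be illustrated with two small functions. This Python sketch is a simplified model under stated assumptions, not the actual repair job: the function names and attribute names are hypothetical, `None` stands for "no value", and the 7-day window and parquet/index handling are not modeled.

```python
from typing import Any, Dict, List


def merge_profiles(target: Dict[str, Any], source: Dict[str, Any]) -> Dict[str, Any]:
    """Merge source attributes into target without overwriting existing values."""
    merged = dict(target)
    for key, value in source.items():
        if merged.get(key) is None:   # only fill attributes that have no value yet
            merged[key] = value
    return merged


def repair_events(events: List[Dict[str, Any]], old_id: int, new_id: int) -> List[Dict[str, Any]]:
    """Return repaired copies of the events; the input list is left untouched."""
    return [
        {**event, "user_id": new_id} if event["user_id"] == old_id else event
        for event in events
    ]


# Hypothetical data: attribute and event names are illustrative only.
users = {
    1: {"city": "Beijing", "vip": None},
    3: {"city": "Shanghai", "vip": True, "os": "iOS"},
}
events = [
    {"user_id": 3, "event": "AppStart"},
    {"user_id": 1, "event": "SubmitOrderDetail"},
]

users[1] = merge_profiles(users[1], users.pop(3))  # merge, then drop user 3
repaired = repair_events(events, old_id=3, new_id=1)
```

Returning new event records instead of editing them in place loosely mirrors the described behavior of writing a new file and leaving the source file unmodified.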

In the above case, cross-device user linking is fully achieved, and the repair step solves the one-to-one scheme's problem of unlinked behavior before the first login on a new phone. However, there are still limitations:

  • A device can only be associated with one login ID. When a user gives their old phone to a friend, the old phone cannot be associated with the friend's login ID because it is already associated with the user's own login ID. Anyone who later uses the old phone without logging in is recognized as the same user (the login ID that the old phone was successfully associated with).
  • In practice, it is hard to tell who is behind later anonymous activity on the old phone, and it may be more reasonable to attribute it to the user who logged in most recently before that activity.

1.5. Scheme Comparison

  • Scheme One: Association of Device ID and Login ID (One-to-One)
    • After a user changes phones, the behavior after logging in is linked with the behavior before the change, but the behavior before the first login on the new device is not linked and is still identified as the behavior of a new user.
    • When a user gives their old phone to a friend, the old phone cannot be associated with the friend's login ID because it is already associated with the user's own login ID. Anyone who later uses the old phone without logging in is recognized as the same user.
  • Scheme Two: Association of Device ID and Login ID (Many-to-One)
    • When a user gives their old phone to a friend, the old phone cannot be associated with the friend's login ID because it is already associated with the user's own login ID. Anyone who later uses the old phone without logging in is recognized as the same user.
    • In practice, it is hard to tell who is behind later anonymous activity on the old phone, and it may be more reasonable to attribute it to the user who logged in most recently before that activity.

Neither of these two schemes is right or wrong. We suggest choosing the one that fits the product's application scenario and the acceptable complexity of tracking.

Accurately identifying users is a very complex problem. Sensors Analytics is committed to finding more reasonable and accurate methods to meet various application scenarios.