Overall process

Sensors Analytics provides a complete data access solution. Whatever technical architecture your product uses, you can connect it to the Sensors system easily.

In general, a complete data access process is as follows:

  1. Learn the basic concepts of Sensors Analytics and what it does. Pay particular attention to the Data model documentation.
  2. If a Sensors Analytics data analyst is assisting you, confirm that you have received the event design plan, which should include all recommended events and attributes.
  3. If you use a private deployment, check with your operations colleagues that the correct data access addresses are configured. If you are not sure, contact Sensors technical support.
  4. Send test data and production data to different projects. For the concepts involved, see Multiple Projects.
  5. Following the event design and related requirements, carry out the actual access work, such as implementing tracking on the client or importing historical data. For details, see Section 2.
  6. Test and validate the data, then go live in the production environment.

The above procedure is for reference only. If you have any questions, please contact your Sensors advisor.

Access procedure

Key points

  1. Whichever access method you use, we recommend reading Data format first to better understand how Sensors data access works.
  2. During development, we recommend using Debug Mode to verify that data is collected correctly and to trace problems.
    Note: Debug mode exists for the convenience of developers. It checks records one by one and throws an exception when a check fails, so it is much slower than normal mode. Using Debug mode in a production environment seriously degrades performance and risks crashes. You must replace or disable Debug mode before launching the product.
  3. Use User Behavior Tracking Management regularly to view access details.
  4. Implement tracking strictly according to the event design. In particular, events and attributes from different sources (such as Android, iOS, or historical data) must be defined consistently to avoid conflicts, especially in data types. Some typical mistakes:
    1. The Android payment event is called pay_order and the iOS payment event is called PayOrder, which makes the data harder to use and understand.
    2. The amount attribute is called money and is a number on Android, while iOS sends it as a string, which causes the data to fail to import.
  5. The type of an attribute is determined by the type used when the attribute is first imported. Subsequent imports are accepted only if they use the same type; data of a different type is rejected.
  6. Generally, event names, attribute names, and attribute types cannot be modified, so confirm the event and attribute design before sending data. If events or attributes change during the test phase, you can use the project reset function to re-initialize the test environment. See Multiple Project Management Tool User Guide.
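
The naming and type conflicts described in the key points can be caught before any data is sent. The following is a minimal sketch, not part of any Sensors SDK: it assumes you keep each platform's event design as a plain dictionary, and the function names and the "number"/"string" type labels are hypothetical.

```python
def normalize(name):
    """Collapse naming styles so that pay_order and PayOrder compare equal."""
    return name.replace("_", "").lower()


def find_conflicts(definitions):
    """Return a list of conflicts between sources.

    definitions maps source -> event name -> property name -> type label,
    for example {"android": {"pay_order": {"money": "number"}}}.
    """
    problems = []
    seen_names = {}  # normalized event name -> (source, original name)
    seen_types = {}  # (normalized event name, property) -> (source, type)
    for source, events in definitions.items():
        for event, props in events.items():
            key = normalize(event)
            if key in seen_names and seen_names[key][1] != event:
                other_source, other_name = seen_names[key]
                problems.append(
                    f"event name mismatch: {other_source}:{other_name} "
                    f"vs {source}:{event}"
                )
            seen_names.setdefault(key, (source, event))
            for prop, type_name in props.items():
                type_key = (key, prop)
                if type_key in seen_types and seen_types[type_key][1] != type_name:
                    other_source, other_type = seen_types[type_key]
                    problems.append(
                        f"type mismatch for {prop}: {other_source}={other_type} "
                        f"vs {source}={type_name}"
                    )
                seen_types.setdefault(type_key, (source, type_name))
    return problems


designs = {
    "android": {"pay_order": {"money": "number"}},
    "ios": {"PayOrder": {"money": "string"}},
}
for problem in find_conflicts(designs):
    print(problem)
```

Running a check like this in continuous integration is one way to enforce the rule that the first imported type wins, since a mismatch found after launch cannot be corrected by renaming or retyping.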

How to identify users

To identify a user is to select an appropriate identifier (such as a device ID or registration ID) as the distinct_id when sending data to Sensors. Whether you choose the right distinct_id has a significant impact on the accuracy of data analysis, so confirm your user identification method before starting any data access. Sensors Analytics provides flexible and powerful user identification capabilities, and you can choose the solution that fits your needs. For details, read the document Identifying Users - Easy User Association (IDM 2.0 & IDM 1.0). If you are still unsure about user identification, please contact our data analysts.

Client-side tracking

If you do not need to access data from the client, you can skip this section.

The following client access schemes are available:

  1. Use the Sensors SDK directly (Client SDK). This solution is relatively simple and easy to use, and the SDK provides more built-in features (such as Channel Track) and reliability guarantees (such as delayed sending on a poor network). All Sensors SDKs are also fully open source, so there is no need to worry about security issues such as backdoors. In general, we recommend this option.
  2. Use your existing business API to send the required data to the business server, then add it on the server side with a Sensors server SDK (Java SDK). This is essentially server-side tracking. Its advantages are that business statistics may be more accurate (because tracking is synchronous with business calls) and that security is relatively high (client-side encryption can make data harder to falsify). Its disadvantage is that it is more difficult to implement. We generally recommend this approach for critical business events such as purchases and payments.
  3. Use your own tracking SDK. If you already have a mature tracking SDK of your own, you can keep using it and then send the data through the server side as in Solution 2.
  4. If your client (such as PC/Mac software) is not currently supported by Sensors, you can use Solution 2 or Solution 3, or call the Sensors Data Access API directly from the client.

Server-side tracking

If you do not need to access data from the server, you can skip this section.

Whether client tracking data is sent to your server through an API or tracking is added directly to the server's existing business logic, both count as server access. Server access uses a Server SDK. Each SDK offers several delivery schemes (Consumers) to choose from, which fall into two categories:

  1. Use a Consumer that sends data directly (such as AsyncBatchConsumer in Java) to send data to the Sensors service in real time. The advantage is that it is simple and convenient. The disadvantage is that a small amount of data may be lost in extreme cases such as machine failure or heavy traffic, and the business service may be affected to some extent. This scheme is suitable when the data volume is not large.
  2. Use a Consumer that writes to a local log (such as LoggingConsumer in Java), and import the log with LogAgent. The advantage is local persistence and higher reliability. The disadvantage is that the code is slightly more complicated and you are responsible for operations work such as storing and deleting local logs. This scheme is recommended when the data volume is large or the requirement for data accuracy is high.
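
To make the second category concrete, here is a minimal sketch of the local-log pattern. It is not the SDK's LoggingConsumer: the LocalLogWriter class, the read_pending helper, and the record fields are hypothetical and only illustrate why local persistence is more reliable than sending directly.

```python
import json
import os


class LocalLogWriter:
    """Hypothetical stand-in for a log-writing Consumer (not the SDK's class)."""

    def __init__(self, path):
        self._path = path

    def send(self, record):
        # Append one JSON record per line and force it to disk, so the record
        # survives a process crash and can be picked up later by an importer.
        with open(self._path, "a", encoding="utf-8") as log_file:
            log_file.write(json.dumps(record, ensure_ascii=False) + "\n")
            log_file.flush()
            os.fsync(log_file.fileno())


def read_pending(path):
    """Read the records back, the way an import tool would scan the log."""
    with open(path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]
```

The trade-off matches the text above: every record is durable as soon as send returns, but the log file grows until something rotates and deletes it, which is the operations work you take on with this scheme.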

In general, we recommend adding tracking at the service entry point (such as the Controller layer in MVC), where most of the data that tracking needs is available and tracking can be managed in one place. Additional client data (such as device information) can be passed in through API parameters. Business data (such as the discount information of an order event) can be returned to the Controller layer by the business processing module.

Tool import (Historical data import)

If you do not have historical data to access, skip this section.

For historical data that already exists, whether events or user attributes, we recommend using the Java SDK or PHP SDK to generate data in the required format and then importing it with a tool such as LogAgent (SaaS version), BatchImporter (small data volume / private deployment), or HdfsImporter (large data volume / private deployment / cluster version). You can also generate the data directly in the data format without using an SDK and then import it.
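
As an illustration of generating data without an SDK, the sketch below converts one historical order row into a single JSON line. The field names (distinct_id, time in milliseconds, type, event, properties) reflect my understanding of the Data format document and should be verified against it; the row layout, the pay_order event, and the UTC assumption for stored timestamps are hypothetical.

```python
import json
from datetime import datetime, timezone


def order_row_to_record(row):
    """Map a legacy order row to one event record (field names are assumptions)."""
    # Assumes the legacy timestamp is stored in UTC.
    occurred = datetime.strptime(row["paid_at"], "%Y-%m-%d %H:%M:%S")
    occurred = occurred.replace(tzinfo=timezone.utc)
    return {
        "distinct_id": str(row["user_id"]),
        "time": int(occurred.timestamp() * 1000),  # event time in milliseconds
        "type": "track",
        "event": "pay_order",
        # Keep the type identical to the live data: money is always a number.
        "properties": {"money": float(row["amount"])},
    }


def to_import_line(row):
    """Serialize one record as a single line for a file-based import tool."""
    return json.dumps(order_row_to_record(row), ensure_ascii=False)


print(to_import_line({"user_id": 42, "paid_at": "2020-01-01 00:00:00", "amount": "19.9"}))
```

Converting the amount to a number here matters because historical and real-time data share one attribute definition, so a string in the historical file would conflict with the type already established by live data.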

In general, historical data can be imported either before or after real-time data goes live; the order does not affect the final analysis result. However, if you use login / track_signup, be sure to read the user identification documentation to avoid user ID association errors caused by importing data in the wrong order.

User attribute access

Profile data is optional. If you do not need user attributes in your business, you can skip this section.

An Event is always tracked at the moment it occurs. A user Profile has no such fixed timing; when to set it depends on how each attribute is obtained. Generally, there are the following ways:

  1. Along with an event: For example, the client SDK provided by Sensors sets attributes such as the first access time by default when a user first visits. Similarly, you can call profile_set_once the first time a user makes a purchase to set a first purchase time attribute.
  2. When an attribute is modified: Call the profile_set family of interfaces at the same time a user attribute is modified (for example, in the interface that updates user data).
  3. Periodic synchronous import: Periodically export data from the business database or other data sources, then import it into the Sensors system with LogAgent or BatchImporter. Try to implement incremental imports using fields such as the last update time; otherwise each full import may affect performance.
  4. Real-time synchronous import: For example, for MySQL you can use Applier to parse and synchronize Binlog data in real time; other databases can use similar tools. The advantage of this solution is that the existing business code does not need to be modified and coupling is low, but additional development is required for each specific database type.

Because user attributes are less efficient to import than events, you should try to avoid unnecessary profile operations, such as repeatedly updating a profile when the data has not changed.
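
One way to avoid unnecessary profile operations is to select only rows updated since the last sync and, within those, send only the attributes whose values actually changed. The sketch below assumes a hypothetical updated_at column and an in-memory copy of the previously sent values; neither helper is part of a Sensors SDK.

```python
_MISSING = object()  # sentinel so that a stored None is not mistaken for "absent"


def rows_updated_since(rows, last_sync):
    """Incremental selection: keep rows modified after the previous sync."""
    return [row for row in rows if row["updated_at"] > last_sync]


def changed_properties(previous, current):
    """Return only the attributes whose value differs from what was sent before."""
    return {
        key: value
        for key, value in current.items()
        if previous.get(key, _MISSING) != value
    }


previously_sent = {"city": "Beijing", "level": 1}
latest = {"city": "Beijing", "level": 2}
update = changed_properties(previously_sent, latest)
if update:
    # Only here would a profile_set call be made, with `update` as the payload.
    print(update)
```

When changed_properties returns an empty dictionary, the profile call can be skipped entirely, which is the case the paragraph above warns about.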