In this section, you need to achieve the following goals:

  • Understand the ideas behind data collection scheme design
  • Learn the writing specifications for data collection schemes
  • Write data collection schemes based on business requirements

1. What is data collection scheme design

To collect user behavior data, first clarify which target behaviors need to be collected based on business analysis needs, then determine where tracking points should be placed and what kind they should be. The output of this process is generally referred to as a "Data Requirements Document (DRD)". In most internet companies, the standardized product iteration process has the business-side product manager produce a "Product Requirements Document (PRD)" while the data product manager or analyst produces a DRD, so that both sets of requirements enter development and testing at the same time.

Since the underlying data model of Sensors Analytics is an event + user model, tracking points are referred to as "events" within Sensors Analytics, and the work of specifying tracking requirements is referred to as "data collection scheme design". To complete this section, you will need the "Data Collection Scheme" template provided by Sensors Data. Please contact your Customer Success representative or analyst for access to the template.

2. Ideas behind data collection scheme design

The core ideas behind data collection scheme design can be summarized as follows:

  1. Break down user behavior into individual clicks or browsing actions;
  2. Abstract the target actions that need to be analyzed as "events" and add event dimensions;
  3. Refine the data collection scheme design as a whole according to business requirements.
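
The first two ideas can be sketched in a few lines of Python. This is only an illustration of the event + user model described above, not the Sensors Analytics SDK: the `Event` class, the event names, and the property names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """One tracked action: who did it, what it was, and its dimensions."""
    distinct_id: str                      # the user who performed the action
    event: str                            # hypothetical event name
    properties: Dict[str, Any] = field(default_factory=dict)  # event dimensions


# Idea 1: break one purchase journey down into individual actions.
# Idea 2: abstract each action as an event and attach its dimensions.
journey = [
    Event("user_001", "view_product", {"product_id": "p_42", "category": "shoes"}),
    Event("user_001", "add_to_cart", {"product_id": "p_42", "quantity": 1}),
    Event("user_001", "submit_order", {"order_id": "o_9001", "order_amount": 59.9}),
]

event_names = [e.event for e in journey]
```

Idea 3 then amounts to reviewing this list against the business questions and adding or removing events and properties until every question can be answered.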

We have recorded a video tutorial on this topic, "Design Ideas for Data Collection Schemes". If you still have questions after watching it, please contact the corresponding analyst.

3. Data Collection Scheme Template

To help you understand the data collection scheme template, we have recorded another tutorial video, "Data Collection Scheme Template". If you still have questions after watching it, please contact the corresponding analyst.

4. Common Issues in Data Collection Scheme Design

4.1. Designing Events Based on Scenarios

For similar scenarios, such as submitting ticket orders or flight orders, should each scenario be designed as a separate event or combined into one event? There are two design approaches to consider:

A. Design them as the same event. Applicable when the events require similar properties and the focus is on overall analysis.

B. Design a different event for each scenario. Applicable when the properties required for each event differ greatly and the analysis scenarios are diverse. If this approach is adopted, it is still recommended to use the same property names for the properties the events have in common, so that they can later be analyzed together using the "virtual event" feature.
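
A minimal sketch of why shared property names matter under approach B. The events are plain Python dictionaries, and the event and property names (`submit_ticket_order`, `order_amount`, and so on) are hypothetical. The combined set below is only similar in spirit to a virtual event; it is not how the feature is implemented.

```python
from collections import Counter

# Two separate events with scenario-specific properties (seat_class, cabin)
# but identical names for the properties they have in common.
events = [
    {"event": "submit_ticket_order",
     "properties": {"order_amount": 120.0, "pay_method": "card", "seat_class": "second"}},
    {"event": "submit_flight_order",
     "properties": {"order_amount": 880.0, "pay_method": "wallet", "cabin": "economy"}},
    {"event": "submit_flight_order",
     "properties": {"order_amount": 640.0, "pay_method": "card", "cabin": "economy"}},
]

# Hypothetical combined view: treat both order events as one for overall analysis.
ORDER_EVENTS = {"submit_ticket_order", "submit_flight_order"}
combined = [e for e in events if e["event"] in ORDER_EVENTS]

# Because the common properties share names, one expression covers both events.
total_amount = sum(e["properties"]["order_amount"] for e in combined)
orders_by_pay_method = Counter(e["properties"]["pay_method"] for e in combined)
```

If the two events had used different names for the amount, for example `ticket_amount` and `flight_amount`, the combined total would need per-event special cases.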

Example: When simply tracking clicks on three buttons A, B, and C, there is no need to create three separate events such as "click button A", "click button B", and "click button C". Instead, create a single event "click button" and pass the name of the clicked button (A, B, or C) as the property "button name".
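
The button example can be sketched as follows. The helper `click_button_event` and the names `click_button` and `button_name` are hypothetical, chosen only to mirror the example above.

```python
from collections import Counter


def click_button_event(distinct_id, button_name):
    """Build one 'click button' event; the button is a property, not a new event."""
    return {
        "distinct_id": distinct_id,
        "event": "click_button",
        "properties": {"button_name": button_name},
    }


clicks = [
    click_button_event("user_001", "A"),
    click_button_event("user_002", "B"),
    click_button_event("user_001", "A"),
    click_button_event("user_003", "C"),
]

# A single event name, broken down by the button_name property.
distinct_event_names = {c["event"] for c in clicks}
clicks_per_button = Counter(c["properties"]["button_name"] for c in clicks)
```

Adding a fourth button later requires no new event, only a new value of `button_name`.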

4.2. Passive event

Passive event: Some analyses in Sensors Analytics, such as funnel analysis and retention analysis, require every event in the sequence to be triggered by the same user. When a step in that sequence is not actually performed by the user, it can be recorded as a passive event. For example, after a user submits authentication information, the authentication process that follows is not actively triggered by the user, so it can be set as a passive event.
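
A minimal sketch of the idea, assuming the authentication result is recorded under the same user ID as the submission. The event names, the `passive` flag, and the `completed_funnel` helper are hypothetical and only illustrate why the step must be attached to the same user; they do not show how Sensors Analytics stores or evaluates passive events.

```python
events = [
    # Actively triggered by the user.
    {"distinct_id": "user_001", "event": "submit_authentication", "time": 1, "passive": False},
    {"distinct_id": "user_002", "event": "submit_authentication", "time": 2, "passive": False},
    # Not triggered by the user, but recorded against the same distinct_id.
    {"distinct_id": "user_001", "event": "authentication_approved", "time": 5, "passive": True},
]


def completed_funnel(events, first, second):
    """Return users who have `first` followed later by `second` under the same ID."""
    seen_first = set()
    done = set()
    for e in sorted(events, key=lambda e: e["time"]):
        uid = e["distinct_id"]
        if e["event"] == first:
            seen_first.add(uid)
        elif e["event"] == second and uid in seen_first:
            done.add(uid)
    return done


converted = completed_funnel(events, "submit_authentication", "authentication_approved")
```

Had the approval been recorded under an operator's or system ID instead, no user would complete the two-step funnel.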

4.3. Custom metric calculation requirements

Custom metrics in event analysis allow arithmetic operations to be performed on event metrics. Any property that takes part in such a calculation must have a numeric value type.
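
As an illustration, the sketch below computes a hypothetical custom metric, average order amount, as the sum of `order_amount` divided by the number of `submit_order` events. The event and property names and the `is_numeric` check are assumptions made for this example only.

```python
events = [
    {"event": "submit_order", "properties": {"order_amount": 100.0}},
    {"event": "submit_order", "properties": {"order_amount": 50.0}},
    {"event": "view_product", "properties": {"product_id": "p_42"}},
]


def is_numeric(value):
    """True for int/float values; bool and strings such as '100' are rejected."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


orders = [e for e in events if e["event"] == "submit_order"]

# The property used in the calculation must be numeric in every event.
all_numeric = all(is_numeric(e["properties"]["order_amount"]) for e in orders)

# Custom metric: total order amount / number of orders.
total_amount = sum(e["properties"]["order_amount"] for e in orders)
average_order_amount = total_amount / len(orders)
```

If `order_amount` had been collected as the string `"100"`, the check would fail and the sum could not be computed, which is why the value type should be fixed as numeric when the scheme is designed.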

4.4. Considerations for the Users table

• Unilateral and bilateral users

The distinction between unilateral and bilateral users is only relevant for products with multiple user identities. A unilateral product has only one type of user, such as the fitness app Keep or the messaging tool QQ. Bilateral users are found in O2O products, where users can be both regular consumers and merchants. Depending on the product, user identification and the corresponding user properties need to be designed in advance.

• Slowly-changing dimensions

If an attribute may change over time, such as a user's VIP level, it is not sufficient to store it only as a user attribute in the user table. The VIP level at the time of the event must also be recorded as an attribute in the event table, because statistics by current VIP level and statistics by VIP level at the time of the event are two different scenarios.
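
The difference between the two scenarios can be seen in a small sketch. The data and the names `vip_level` and `vip_level_at_event` are hypothetical. The user below was VIP level 1 when placing the first order and has since reached level 3.

```python
from collections import defaultdict

# User table: holds only the current value of the attribute.
user_profiles = {"user_001": {"vip_level": 3}}

# Event table: each event also records the value at the time it happened.
events = [
    {"distinct_id": "user_001", "event": "submit_order",
     "properties": {"order_amount": 100.0, "vip_level_at_event": 1}},
    {"distinct_id": "user_001", "event": "submit_order",
     "properties": {"order_amount": 300.0, "vip_level_at_event": 3}},
]

amount_by_level_at_event = defaultdict(float)
amount_by_current_level = defaultdict(float)

for e in events:
    amount = e["properties"]["order_amount"]
    # Scenario 1: group by the level the user had when the order was placed.
    amount_by_level_at_event[e["properties"]["vip_level_at_event"]] += amount
    # Scenario 2: group by the level stored in the user table today.
    amount_by_current_level[user_profiles[e["distinct_id"]]["vip_level"]] += amount
```

With only the user table, the order placed at level 1 would be attributed to level 3, so a question such as "how much do level 1 users spend" could not be answered correctly.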