1. Data model introduction

Sensors Analytics uses the Event model to describe the various things users do in a product. This model is the core basis for all interface and feature design in Sensors Analytics.

Put simply, the Event model has two core entities, Event and User, which can be combined with the Item entity for analysis across various dimensions. Sensors Analytics provides separate interfaces for uploading and modifying each of these two types of data, and the two types can be used separately or together in a given analysis or query. Both concepts are described in detail below.

1.1. Event Model Vs. Page View

In the traditional Web era, PV (Page View) was the usual measure of a product's quality. In the era of mobile Internet and O2O e-commerce, PV falls far short of what product and operations teams need to analyze.

Today, each product has its own core metric for measuring success. It may be the number of posts, the number of videos played, the number of orders, or some other metric that reflects the product's core value, and none of these can be measured by a simple PV count.

In addition, the PV model cannot support more detailed, fine-grained analysis. For example, we may want to know which types of products sell best; the age and gender composition of the users who visit the website; the conversion, retention, and repeat purchase rates of users from each channel; or the average order value, transaction volume, and subsidy ratio of new versus existing users. Traditional PV-centered statistics cannot answer these questions.

Sensors Analytics therefore uses the Event model as its basic data model. The Event model captures more information about what users actually do in a product, gives a more comprehensive and specific view, and so supports better decisions.

Of course, PV statistics are still possible with the Event model, and they are simple to produce: use the SDK or the import tool to upload data similar to the following:

{
    "distinct_id": "2b0a6f51a3cd6775",
    "time": 1434556935000,
    "type": "track",
    "anonymous_id": "2b0a6f51a3cd6775",
    "event": "PageView",
    "properties": {
        "$ip": "180.79.35.65",
        "user_agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.)",
        "page_name": "Website home page",
        "url": "www.demo.com",
        "referer": "www.referer.com"
    }
}
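As a rough illustration of how PV statistics fall out of event data, the sketch below counts PageView events per page from a small in-memory list of records. The sample records, the page names, and the helper `page_views` are hypothetical; this is not how Sensors Analytics computes PV internally.

```python
from collections import Counter

# Hypothetical sample records following the event format shown above.
events = [
    {"distinct_id": "u1", "time": 1434556935000, "type": "track",
     "event": "PageView", "properties": {"page_name": "Home"}},
    {"distinct_id": "u2", "time": 1434556936000, "type": "track",
     "event": "PageView", "properties": {"page_name": "Home"}},
    {"distinct_id": "u1", "time": 1434556940000, "type": "track",
     "event": "PageView", "properties": {"page_name": "Product"}},
    {"distinct_id": "u1", "time": 1434556950000, "type": "track",
     "event": "Buy", "properties": {"Money": 30}},
]


def page_views(records):
    """Count PageView events per page_name, ignoring all other events."""
    return Counter(
        r["properties"].get("page_name", "(unknown)")
        for r in records
        if r.get("event") == "PageView"
    )


pv = page_views(events)
print(pv["Home"], pv["Product"], sum(pv.values()))
```

PV here is just one aggregation over one event type; the same records can answer other questions by filtering on a different event name or grouping by a different property.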

1.2. Event

1.2.1. The five elements of an Event

Simply put, an Event describes a user completing a specific thing at a certain time, in a certain place, and in a certain way. A complete Event therefore contains the following five key elements:

  • Who: the user who takes part in the event. In the data interface, distinct_id sets a unique ID for the user. For users who are not logged in, this can be a cookie_id, a device ID, or another anonymous ID; for logged-in users, we recommend the actual user ID assigned by the backend. We also provide the track_signup interface, called when a user registers, which links the anonymous ID the user had before registration with the actual ID assigned after registration so that both are analyzed as the same user.
  • When: the actual time at which the event occurred. In the data interface, the time field records the event time with millisecond precision. If the caller does not set the time explicitly, each SDK automatically uses the current time as the value of the time field.
  • Where: the location where the event occurred. You can set the $ip property in properties, and the system will automatically resolve the corresponding province and city from the IP address. Alternatively, you can obtain the location from the application's GPS result or by other means, and then set $city and $province manually. Besides these two preset fields, you can add other region-related fields. For example, a community O2O product that cares about individual housing estates can add a custom field "HousingEstate", and a product that operates in several countries can add a custom field "Country".
  • How: the way in which the user took part in the event. This is a broad concept that covers the user's device, browser, App version, operating system version, acquisition channel, the referer the user arrived from, and so on. The following fields are currently preset to describe this kind of information; you can also add custom fields as needed.

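    $app_version: application version
    $city: city
    $manufacturer: device manufacturer, string, e.g. "Apple"
    $model: device model, string, e.g. "iphone6"
    $os: operating system, string, e.g. "iOS"
    $os_version: operating system version, string, e.g. "8.1.1"
    $screen_height: screen height, number, e.g. 1920
    $screen_width: screen width, number, e.g. 1080
    $wifi: whether the device is on Wi-Fi, BOOL, e.g. true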
  • What: the specific content of what the user did. In the data interface, the event name gives a first-level classification of the user's action. There are guidelines for dividing and designing Events, described in detail later. Apart from the crucial event field, we define few preset fields; instead, set fields according to the actual situation and analysis requirements of each product and each Event. Some typical examples:
    - For a "purchase" event, the fields to record might be: product name, product type, purchase quantity, purchase amount, payment method, etc.
    - For a "search" event, the fields to record might be: search keyword, search type, etc.
    - For a "click" event, the fields to record might be: clicked URL, clicked title, click position, etc.
    - For a "user registration" event, the fields to record might be: registration channel, registration invitation code, etc.
    - For a "user complaint" event, the fields to record might be: complaint content, complaint target, complaint channel, complaint method, etc.
    - For a "return request" event, the fields to record might be: return amount, return reason, return method, etc.
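The five elements above map onto a single event record. The following sketch assembles one by hand to show where each element lands. The helper `build_event`, the event name "Buy", and the custom properties "ProductName" and "Money" are illustrative assumptions; in practice an SDK would construct and send the record.

```python
import json
import time


def build_event(distinct_id, event, properties, ip=None, time_ms=None):
    """Assemble an event record from the five elements (illustrative helper)."""
    props = dict(properties)      # What details, plus any How fields supplied
    if ip is not None:
        props["$ip"] = ip         # Where: the server can resolve province/city
    return {
        "distinct_id": distinct_id,                      # Who
        "time": time_ms if time_ms is not None
                else int(time.time() * 1000),            # When, in milliseconds
        "type": "track",
        "event": event,                                  # What: the event name
        "properties": props,
    }


record = build_event(
    "2b0a6f51a3cd6775",
    "Buy",
    {"$os": "iOS", "$app_version": "1.3", "ProductName": "Apple", "Money": 30},
    ip="180.79.35.65",
    time_ms=1434556935000,
)
print(json.dumps(record, indent=2))
```

Defaulting the timestamp to the current time mirrors the SDK behavior described under "When"; passing `time_ms` explicitly is useful when importing historical data.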

1.2.2. Event division and field design principles

To make the most of the analysis features that Sensors Analytics provides, we strongly recommend that you take some time to sort out your data requirements and then divide Events and design fields accordingly.

The Sensors Analytics team also provides technical support and services for dividing and designing Events. In addition, we have summarized some basic principles here in the hope that they will be helpful.

1.2.2.1. Client-side tracking Vs. Recording Events on the backend

Traditional analytics tools such as Umeng and Baidu Analytics track events through an SDK embedded on the client side. We strongly recommend recording Events on the backend instead, for the following reasons:

  1. Many actions, such as placing an order, involve fields that cannot be obtained on the front end (App and web interfaces). Some actions, such as offline purchases, have no front-end counterpart at all, so the front end has no way to obtain the data.
  2. Modifying a backend program is more convenient. If data is recorded in the App, each change has to wait for an App release and for users to update.
  3. Collecting data in the App carries the risk of data loss and delayed uploads. To avoid wasting the user's data traffic, the App generally batches multiple data points and waits until the network is good and the App is in the foreground before compressing and uploading them. Uploads are therefore delayed by design: data from one day may well take several days to reach the server, which skews daily metrics. Moreover, because the App can cache only a limited amount of data and the user's network connection may fail, there is currently no good way to fully guarantee that data collected in the App is never lost.

Based on the above considerations, unless an action only occurs on the front end and doesn't require any backend requests, we recommend always collecting data on the backend.

1.2.2.2. Event division principles

We have the following recommendations for dividing Events:

  1. To save costs, start from your requirements and record only the Events that will be analyzed. This is a major difference from traditional PV analysis products. Events are recorded in order to understand how users use the product, so usage scenarios that will not be analyzed for now need not be recorded yet.
  2. Do not define too many Events. For a typical consumer product, no more than 20 Events is advisable. This is only a design guideline; the system itself imposes no such limit. Similar user operations can be merged into one Event. For example, if a product cares about visits to a series of product category pages, each category page does not need its own Event. Instead, define a single Event for visiting a product category page and record the category as a field.
  3. Events are not limited to what users do on the front end. Phone complaints, services received offline, and offline purchases at merchants can also be treated as Events, provided the data can be obtained and will be used in analysis.
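Recommendation 2 can be sketched as follows: one Event name for all category pages, with the category carried as a field. The event name "ViewCategoryPage", the property name "category", and the helper function are hypothetical choices made for this example.

```python
def category_view_event(distinct_id, category, time_ms):
    """One shared Event for every category page; the category is a field."""
    return {
        "distinct_id": distinct_id,
        "time": time_ms,
        "type": "track",
        "event": "ViewCategoryPage",           # single event name
        "properties": {"category": category},  # the dimension to analyze by
    }


visits = [
    category_view_event("u1", "Books", 1434556935000),
    category_view_event("u1", "Toys", 1434556936000),
    category_view_event("u2", "Books", 1434556937000),
]

event_names = {v["event"] for v in visits}
categories = sorted({v["properties"]["category"] for v in visits})
print(event_names, categories)
```

Adding a new category then requires no new Event definition, and existing analyses can group by the category field without changes.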

1.2.2.3. Field design principles

We have the following recommendations for designing fields for each Event:

  1. First, list the metrics and dimensions for analysis based on the requirements, and then deduce the fields that need to be recorded for each Event from the metrics and dimensions.
  2. Sensors Analytics is a data analysis tool, not a log storage and backup system. Therefore, some unused fields, such as the complete contents of a Cookie and the backend request return code, do not need to be recorded and collected as fields for an Event.
  3. If a field already exists among the predefined fields, reuse the predefined field wherever possible. For descriptions of all predefined fields, see the "Data Format" documentation.
  4. Once a field of an Event has been designed, do not change its type or the meaning of its values. For example, suppose the "Buy" Event has a numeric field "Money" that records the purchase amount in yuan, and we later want to record the amount in fen. We recommend abandoning the "Money" field and adding a new field "MoneyByCents" rather than changing the meaning of "Money".
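A minimal sketch of recommendation 4, assuming the usual conversion of 1 yuan = 100 fen: new "Buy" events write "MoneyByCents" and stop writing "Money", so the meaning of historical "Money" values never changes. The helper name `buy_properties` is hypothetical.

```python
from decimal import Decimal


def buy_properties(amount_yuan):
    """Properties for new "Buy" events: record fen in MoneyByCents only."""
    # Go through Decimal(str(...)) so that a float such as 19.99 converts to
    # exactly 1999 fen instead of being truncated by binary rounding error.
    cents = int(Decimal(str(amount_yuan)) * 100)
    return {"MoneyByCents": cents}


print(buy_properties(19.99))
print(buy_properties(30))
```

Storing the amount as an integer number of fen also avoids accumulating floating-point error when amounts are summed later.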

1.3. User

1.3.1. Record and collect User Profile

Each User entity corresponds to a real user and is identified by distinct_id. It describes the user's long-term properties (also known as the Profile) and can be associated with the behaviors the user engages in, that is, with Events.

Sensors Analytics provides a series of profile_xxx interfaces for recording and modifying a user's Profile.

Typically, a User Profile is recorded only on a few occasions: when a user registers, completes personal information, or modifies personal information. As with Events, we strongly recommend recording and collecting User Profiles on the backend.

Which fields should be collected as User Profile depends entirely on the product form and analysis needs. In simple terms, among the user attributes that can be obtained, those that are helpful for analysis should be collected as Profile.

1.3.2. Choosing whether a field belongs in the Profile or in an Event

In some cases it is hard to decide whether a user-related field should be recorded in the Profile or in an Event. The basic principle is that the Profile records attributes that characterize the user, such as place of birth, gender, place of registration, and type of initial ad source, while Event fields record characteristics of the event itself, whose values depend on the scenario, such as province, city, device model, and login status.
Take "address" as an example: an Event records the address used when placing a particular order, whereas the Profile usually records the user's "frequently used address". Profile field values are user characteristics that are either entered by the user or derived under certain conditions, and they change as the user updates their information.
For Profile fields, we generally prefer values that stay fixed, in order to reduce maintenance costs. Consider "age" versus "date of birth": the value of "age" has to be updated regularly, yet it can always be calculated from the date of birth. There is therefore no need for a frequently updated "age" field, and we prefer to record "date of birth".
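To illustrate why "date of birth" is sufficient, the sketch below derives age at analysis time from a stored birth date. The function name `age_on` is hypothetical.

```python
from datetime import date


def age_on(birth_date, today):
    """Full years elapsed between birth_date and today."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


print(age_on(date(1990, 6, 15), date(2024, 6, 14)))  # day before the birthday
print(age_on(date(1990, 6, 15), date(2024, 6, 15)))  # on the birthday
```

Because the age is computed on demand, the Profile value never goes stale and needs no scheduled update.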

1.4. Item

Starting from Sensors Analytics 1.14, the Item entity is supported.

In the Event-User model, Events are designed to be immutable for reasons of performance and interpretability. Logically this seems unproblematic, because an Event represents something that has already happened and generally does not need to be updated.

In practice, however, things are not always so ideal. For example, during data collection and analysis you may find that:

  • Much of the basic information recorded in Events keeps changing.
  • Some Events have incomplete data when they are first collected.

At this time, the Item entity can be used to supplement the Event-User model.

Strictly speaking, an Item is an entity associated with user behavior, such as a product, a video series, or a novel. Items are used in many scenarios; the two most common are described below.

1.4.1. The application of the Item model in Sensors Analytics

A typical use of the Item model is as a dimension table for Sensors Analytics.
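The dimension-table idea can be sketched with plain dictionaries: Events carry only an item identifier, and mutable item attributes are looked up at analysis time. The table layout, the property names "item_id" and "quantity", and the helper function are assumptions made for this example and do not reflect the actual Sensors Analytics storage format.

```python
from collections import defaultdict

# Hypothetical Item table keyed by item ID. Attributes here can be corrected
# or updated at any time without touching already-recorded Events.
items = {
    "sku-1": {"category": "Books", "price": 30},
    "sku-2": {"category": "Toys", "price": 15},
}

# Immutable Events that reference Items only by ID.
events = [
    {"distinct_id": "u1", "event": "Buy",
     "properties": {"item_id": "sku-1", "quantity": 2}},
    {"distinct_id": "u2", "event": "Buy",
     "properties": {"item_id": "sku-2", "quantity": 1}},
    {"distinct_id": "u3", "event": "Buy",
     "properties": {"item_id": "sku-1", "quantity": 1}},
]


def quantity_by_category(events, items):
    """Join each Event to its Item and sum quantity per item category."""
    totals = defaultdict(int)
    for e in events:
        item = items.get(e["properties"]["item_id"], {})
        totals[item.get("category", "(unknown)")] += e["properties"]["quantity"]
    return dict(totals)


totals = quantity_by_category(events, items)
print(totals)
```

If the category of "sku-1" is later changed in the Item table, rerunning the same query reflects the new value even though no Event was modified.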

1.4.2. Item application in the Sensors Analytics recommendation system

For more details about our recommendation service, please contact our technical support.

The core value of recommendation is to work out the items a user is most likely to consume and return them to the product's front end for the user. Based on Item data, the recommendation system builds a profile of each item (a high-dimensional vector) and computes the items the user is most likely to consume, or items similar to a given one.

When integrating with the Sensors recommendation system, developers upload Item data through the SDK's itemSet family of interfaces. Sensors also provides a management backend for managing the Item table directly. For example, administrators can ban a specific item or adjust its weight to optimize recommendation performance.