以下列舉常用的使用自定義查詢滿足的需求:

1. 根據用戶的 distinct_id 查詢

直接使用 distinct_id 查詢這個用戶在某天的具體行為:

SELECT * FROM events WHERE distinct_id = 'wahaha' AND date = '2015-09-10' LIMIT 100
SQL

2. 查詢每天上午 10 點至 11 點的下單用戶數

使用標準的 SQL 日期函數 EXTRACT 來取出小時訊息。

SELECT date, COUNT(*) FROM events 
WHERE EXTRACT(HOUR FROM time) IN (10, 11) AND event = 'SubmitOrder'
GROUP BY 1
SQL

3. 查詢一段時間內的用戶下單次數分佈情況

首先計算每個用戶的下單次數,然後使用 CASE..WHEN 語法來分組。

SELECT 
CASE
WHEN c < 10 THEN '<10'
WHEN c < 20 THEN '<20'
WHEN c < 100 THEN '<100'
ELSE '>100'
END,
COUNT(*)
FROM (
SELECT user_id, COUNT(*) AS c FROM events
WHERE date BETWEEN '2015-09-01' AND '2015-09-20' AND event = 'SubmitOrder'
GROUP BY 1
)a 
GROUP BY 1
SQL

4. 查詢做了行為 A 而沒有做行為 B 的用戶數

使用 LEFT OUTER JOIN 計算差集。

SELECT a.user_id FROM (
SELECT DISTINCT user_id FROM events WHERE date='2015-10-1' AND event = 'BuyGold'
) a
LEFT OUTER JOIN (
SELECT DISTINCT user_id FROM events WHERE date='2015-10-1' AND event = 'SaleGold'
) b
ON a.user_id = b.user_id
WHERE b.user_id IS NULL
SQL

5. 計算用戶的使用時長

使用分析函數,根據每個用戶相鄰的兩個事件的間隔估算累計使用時長,如果兩次使用間隔超出10分鐘則不計算。

SELECT
user_id,
SUM(
CASE WHEN
end_time - begin_time < 600
THEN
end_time - begin_time
ELSE
0
END
) FROM (
SELECT
user_id,
EXTRACT(EPOCH FROM time) AS end_time,
LAG(EXTRACT(EPOCH FROM time), 1, NULL) OVER (PARTITION BY user_id ORDER BY time asc) AS begin_time
FROM events
WHERE date='2015-5-1'
) a
GROUP BY 1
SQL