题解 | #2021年11月每天新用户的次日留存率#
2021年11月每天新用户的次日留存率
https://www.nowcoder.com/practice/1fc0e75f07434ef5ba4f1fb2aa83a450
- 求次日留存,同一时间段相关的问题都要注意看in_time & Out_time 的关系,都可以参考union all 将开始和停止独立出来做相互隔离的事件
- 解题思路:因为登录登出跨天算两天登陆,则可以把时间看作相互独立事件。首先unionall 求独立的时间拉链表。其次求出每个uid的first——login date。最后join 合并成新表即 uid,操作时间,首次登陆天,是否为次日(1是/0否)。之后总结进行按天,用户分类统计是否有次日留存情况。总会进行按天汇总和求平均留存率
WITH dt_log AS (
SELECT uid, in_time AS dt FROM tb_user_log
UNION ALL
SELECT uid, out_time AS dt FROM tb_user_log
), firstLogin AS (
SELECT uid, date_format(MIN(dt),'%Y-%m-%d') AS first_login_dt
FROM dt_log
GROUP BY uid
), nextDayActivity AS (
select dl.uid as uid,date_format(dl.dt,'%Y-%m-%d') as dt,fl.first_login_dt as fl_dt, case when date_format(dl.dt,'%Y-%m-%d') = date_add(fl.first_login_dt,interval 1 day) then 1 else 0 end as isNull
from dt_log dl left join firstLogin fl on dl.uid = fl.uid
order by dl.uid,dl.dt
)
select t2.fl_dt as dt,round(avg(isChurn),2) as uv_left_rate
from (
select fl_dt,case when sum(isNull) >= 1 then 1 else 0 end as isChurn
from nextDayActivity
group by fl_dt,uid) t2
where t2.fl_dt >= '2021-11-01'
group by t2.fl_dt
# select dt
# aa as uv_left_rate
# from tb_user_log
# group by dt
