题解 | #SQL70 牛客每个日期的新用户的留存率#
牛客每个人最近的登录日期(五)
http://www.nowcoder.com/practice/ea0c56cd700344b590182aad03cc61b8
求每个日期的新用户的留存率
关键点:
1.每个日期的新用户-->对于每个用户都有自己的最小的登录日期,在最小登录日期这天,他就是新用户
2.留存率如何计算-->当日的新用户在第二天仍然登录的比例(第二天留下来的"新用户"数/当日总的新用户数)
第一种方法:分别求两个数,总的用户数和留存的用户数(都要基于最小登录日期进行去重统计,统计留存加一个筛选条件即可);这里通过临时创建两张表来保存这两个数。
with login_new as ( select user_id, date, min(date) over(partition by user_id) as min_date from login ), ret_user as ( select t1.min_date, count(distinct(t1.user_id)) as ret_num from login_new t1 where t1.date = date_add(t1.min_date, interval 1 day) group by t1.min_date ), all_user as ( select min_date, count(distinct(user_id)) as all_num from login_new group by min_date ) select l.date, round(coalesce(t.ret_num/t.all_num,0),3) as p from login_new l left join ( select a.min_date as date, a.all_num, r.ret_num from all_user a join ret_user r on a.min_date=r.min_date) t on l.date = t.date group by l.date
第二种方法:还是分别求那两个数,只不过不用创建两个表,而是直接在一个表里,用case when统计留存。
with login_new as ( select user_id, date, min(date) over(partition by user_id) as min_date from login ), ret_user as ( select t1.min_date, sum( case when t1.date = date_add(t1.min_date, interval 1 day) then 1 else 0 end) as ret_num, count(distinct(user_id)) as all_num from login_new t1 group by t1.min_date ) select l.date, round(coalesce(t.ret_num/t.all_num,0),3) as p from login_new l left join ret_user t on l.date = t.min_date group by l.date
注意需要输出所有日期的新用户的留存率,而不是只有min_date!!