题解 | #连续两次作答试卷的最大时间窗#
连续两次作答试卷的最大时间窗
http://www.nowcoder.com/practice/9dcc0eebb8394e79ada1d4d4e979d73c
本人稀碎答案
最大时间差
- 给完成时间排序:
Row_number() over (partition by uid order by start_time asc) as rn1 Row_number() over (partition by uid order by start_time desc) as rn2
- 首次作答:if(rn1=1,start_time,null) as first_time
- 末次作答:if(rn2=1,start_time,null) as last_time
- 对单个用户来说首次和最后一次作答的时间差 Timestampdiff(day,first_time,last_time) as sub_time
但其实只需要 datediff(max(start_time),min(start_time)) as sub_time
最大时间窗
1.对时间正序、倒序排列
Select *,date(start_time) as act_date,
row_number() over (partition by uid order by date(start_time) as rn1,
row_number() over (partition by uid order by date(start_time) desc) as rn2
2.仿照次日留存率,把相同的表格错位连接
Select uid,date(q1.start_time) as date1,date(q2.start_time) as date2,
q1.rn1 as rn1,q1.rn2 as rn2
from
(Select *,date(start_time) as act_date,
row_number() over (partition by uid order by date(start_time) as rn1,
row_number() over (partition by uid order by date(start_time) desc) as rn2
) as q1
left join
(Select *,date(start_time) as act_date,
row_number() over (partition by uid order by date(start_time) as rn1,
row_number() over (partition by uid order by date(start_time) desc) as rn2
) as q2
on q1.uid=q2.uid and q1.rn1=q2.rn1-1
3.取时间窗,首次作答及末次作答时间
select uid,max(datediff(date2,date1)) as days_window,
if(rn1=1,act_date,null) as first_time,
if(rn2=1,act_date,null) as last_time
Group by uid
4.计算时间窗及平均试卷数
select uid,days_window,
round(count(question_id)/Timestampdiff(day,first_time,last_time)*days_window,2) as avg_exam_cnt
- 组装结果——
select uid,days_window,
round(exam_cnt*days_window/sub_time,2) as avg_exam_cnt
from
(select uid,
max(datediff(next_time,start_time))+1 as days_window,
count(start_time) as exam_cnt,
datediff(max(start_time),min(start_time))+1 as sub_time
from
(select uid,start_time,
lead(start_time,1) over(partition by uid order by start_time)
as next_time
from exam_record
where year(start_time)=2021
)as q1
group by uid
)as q2
where sub_time>1
order by days_window desc,avg_exam_cnt desc
运行结果
提交答案
知识储备
- lead(字段名,n) over()——把一列数据往前推n个
- lag(字段名,n) over()——把一列数据往后推n个
- lag(字段名,n,x) over()——把一列数据往后推n个且将空出来的位置填上x
分析:
计算平均试卷数 =试卷总数*最大窗(max(datediff()))/最大相隔天数(datediff(max-min))
- 利用窗口函数把相邻日期的错开——为max-min做准备
Select uid,start_time,
Lead (start_time,1) over (partition by uid order by start_time) as next_time
%%%%%%不必连接多个表格利用链接条件on来筛选
From exam_record
Where year(start_time)=2021
- 计算最大窗——max(datediff())、datediff(max-min)
Select uid,count(start_time) as exam_cnt,
datediff(max(start_time),min(start_time))+1 as diff_time,
%%%%%%%%不必特意标记初始时间和末次时间
max(datediff(next_time,start_time))+1 as days_window
- 计算平均试卷数——总公式
Select uid,days_window,
Round(Exam_cnt*days_window/diff_time,2) as avg_exam_cnt
- 组装
Select uid,days_window,
round(Exam_cnt*days_window/diff_time,2) as avg_exam_cnt
from
(select uid,count(start_time) as exam_cnt,
datediff(max(start_time),min(start_time))+1 as diff_time,
max(datediff(next_time,start_time))+1 as days_window
from
(select uid,start_time,
lead (start_time,1) over (partition by uid order by start_time)
as next_time
from exam_record
where year(start_time)=2021) as q1
group by uid) as q2
where diff_time>1
order by days_window desc,avg_exam_cnt desc