题解 | #未完成率较高的50%用户近三个月答卷情况#
未完成率较高的50%用户近三个月答卷情况
https://www.nowcoder.com/practice/3e598a2dcd854db8b1a3c48e5904fe1c
花了 1.5小时的题目还是值得我纪念一下。。其中前一个小时我都在“去重”的坑里。。
看到解释表中的第二行——用户 1002未完成数、作答总数分别是 4、8,而不是 5、9。我找了半天原因,于是在 exam_record 表中发现了这样两行(牛客这自带插入表格真难用😥):
id | uid | exam_id | start_time | submit_time | score |
10 | 1002 | 9002 | 2020-02-02 12:01:01 | None | None |
11 | 1002 | 9002 | 2020-02-02 12:01:01 | 2020-02-02 12:43:01 | 81 |
因此我以为需要先去重(即 uid 用户ID, exam_id 试卷ID, start_time 开始作答时间,这三项完全一致的为重复行,现实生活中这样认为也比较合理,例如重定向导致的重复行)。
去重:
- 首先用 dense_rank 根据 uid, exam_id, start_time 排序编号,并按 submit_time倒叙排序,记为 t1。这样属于重复行的可以编号相同,并且submit_time 不为空的会优先排在该组的第一行;
- 然后用 row_number 以dr分组,根据 uid, exam_id, start_time 排序编号,记为 t2;
- 最后筛选出 t2 中的 rk = 1 即为去重结果。
with t as ( select * from ( select *, row_number() over(partition by dr order by uid, exam_id, start_time) rk from ( select *, dense_rank() over(order by uid, exam_id, start_time) dr from exam_record order by dr, submit_time desc )t1 )t2 where rk=1 )
结果一个小时过去了,案例不通过,重读题目,嗯????SQL试卷?!?tnnd
with a1 as ( select uid, sum(if(submit_time is null, 1, 0)) incomplete_cnt, count(*) total_cnt, sum(if(submit_time is null, 1, 0))/count(*) incomplete_rate, PERCENT_RANK() over(order by sum(if(submit_time is null, 1, 0))/count(*) desc) rk from exam_record left join examination_info using(exam_id) where tag='SQL' group by uid ) select uid, start_month, total_cnt, complete_cnt from ( select uid, date_format(start_time, '%Y%m') start_month, count(start_time) total_cnt, sum(if(submit_time is not null,1,0)) complete_cnt, row_number() over(partition by uid order by date_format(start_time, '%Y%m') desc) time_rk from exam_record where uid in ( select uid from a1 left join user_info using(uid) where rk<=0.5 and (level=6 or level=7) ) group by uid, start_month )a2 where time_rk<=3 # 查找作答时间最近的3个月 order by uid, start_month ;
划重点:
- PERCENT_RANK() 用得比较少,新知识点
- 查找作答时间最近的3个月,可以用排序编号的方法选出前三,根选出每个部门工资前三的员工的题目思路类似。