题解 | #SQL 31.未完成率较高的50%用户近三个月答卷情况#

未完成率较高的50%用户近三个月答卷情况

http://www.nowcoder.com/practice/3e598a2dcd854db8b1a3c48e5904fe1c

明确题意:

统计SQL试卷上未完成率较高的50%用户中,6级和7级用户在有试卷作答记录的近三个月中,每个月的答卷数目和完成数目。 按用户ID、月份升序排序。


问题分解:

  • 统计每次试卷作答记录的月份编号,生成子表 t_exam_record_month_rank:
    • 生成当次作答的月份排名,按年月降序进行窗口连续排名:
      • DENSE_RANK() OVER(ORDER BY DATE_FORMAT(start_time, "%Y%m") DESC) as start_month_rank
  • 筛选月份排名小于等于3的记录(近三个月的):WHERE start_month_rank <= 3
  • 筛选SQL试卷上未完成率较高的50%用户:
    • 计算每个用户在SQL试卷上的未完成率百分比排名:
      • 计算每个用户在SQL试卷上的未完成率:
        • 筛选SQL试卷:WHERE exam_id in (SELECT exam_id FROM examination_info WHERE tag='SQL')
        • 计算未完成率:1 - COUNT(submit_time) / COUNT(1) as incomplete_rate
      • 统计百分比排名,按未完成率降序uid升序进行百分比排名:
        • PERCENT_RANK() over(ORDER BY incomplete_rate DESC, uid) as incomp_rate_rank
    • 筛选未完成率较高的50%用户(因为按未完成率倒序了,所以<=):WHERE incomp_rate_rank <= 0.5
  • 筛选6、7级用户:uid IN (SELECT uid FROM user_info WHERE level>=6)
  • 按用户ID、月份分组:GROUP BY uid, start_month

细节问题:

  • 表头重命名:as
  • 按用户ID、月份升序排序:ORDER BY uid, start_month

完整代码:

SELECT uid, start_month, COUNT(1) as exam_cnt, COUNT(submit_time) as complete_cnt
FROM (
    SELECT uid, submit_time, DATE_FORMAT(start_time, "%Y%m") as start_month,
        DENSE_RANK() over(
            ORDER BY DATE_FORMAT(start_time, "%Y%m") DESC) as start_month_rank -- 按作答月份降序编号
    FROM exam_record
) as t_exam_record_month_rank
WHERE start_month_rank <= 3
AND uid IN (
    SELECT uid FROM (
        SELECT uid,
            PERCENT_RANK() over(ORDER BY incomp_rate DESC, uid) as incomp_rate_rank
        FROM (
            SELECT uid, 1 - COUNT(submit_time) / COUNT(1) as incomp_rate -- 此人未完成率
            FROM exam_record
            WHERE exam_id in (SELECT exam_id FROM examination_info WHERE tag='SQL')
            GROUP BY uid
        ) as t_exam_incom_rate -- 在SQL试卷上的未完成率
    ) as t_exam_incom_rate_rank -- 在SQL试卷上的未完成率百分比排名
    WHERE incomp_rate_rank <= 0.5
) -- 在SQL试卷上未完成率最高的50%用户
AND uid IN (SELECT uid FROM user_info WHERE `level`>=6)
GROUP BY uid, start_month
ORDER BY uid, start_month;
SQL进阶 文章被收录于专栏

SQL进阶step by step

全部评论
不是应该dense_rank() over(partition by uid order by date_format(start_time,'%Y%m')),因为是选取每个用户近三个月答卷情况,那应该对每个用户分组排名。当然本题没影响。
1 回复 分享
发布于 2022-02-17 22:06
请问count(1)是什么意思呢
点赞 回复 分享
发布于 2022-05-02 14:54
PERCENT_RANK() over(ORDER BY incomplete_rate DESC)这里为什么要降序呢,升序不行吗
点赞 回复 分享
发布于 2022-06-23 21:40

相关推荐

小红书 后端开发 总包n+8w+期权
点赞 评论 收藏
分享
牛客771574427号:恭喜你,华杰
点赞 评论 收藏
分享
点赞 评论 收藏
分享
评论
16
3
分享
牛客网
牛客企业服务