看了这个SQL实战面试题集,打满了鸡血,彷佛一切大厂都是弟弟
今天要分享一点出来的是,飞书文档中的 第40题: 打卡必须给我对准了再打 之 这卡不打也罢,年轻就是资本,我有的是资本
题的来源文档(数据开发必学):https://ykg8hl7h33.feishu.cn/docx/VNo7dJLJfoDwT1xNNVpcLg8Snlc
题目:
求每个员工每天的有效打卡次数,有效打卡定义:首次打卡有效,以首次打卡为基准,后续每次打卡间隔应大于等于1小时,同时要求小于3小时
有效打卡定义:
- 首次打卡有效
- 以首次打卡为基准,后续每次打卡间隔应大于等于1小时,同时要求小于3小时
备注:如果第二次打卡无效了 那第三次还可以和第二次比较
数据:
with data as ( select '张三' as name , '2024-04-16 08:31:00' as clock_time union all select '张三' as name , '2024-04-16 09:32:00' as clock_time union all select '张三' as name , '2024-04-16 10:31:00' as clock_time union all select '张三' as name , '2024-04-16 11:33:00' as clock_time union all select '张三' as name , '2024-04-16 12:35:00' as clock_time union all select '张三' as name , '2024-04-16 14:31:00' as clock_time union all select '张三' as name , '2024-04-16 15:31:00' as clock_time union all select '张三' as name , '2024-04-16 16:31:00' as clock_time union all select '张三' as name , '2024-04-16 17:30:00' as clock_time union all select '张三' as name , '2024-04-16 18:31:00' as clock_time union all select '李四' as name , '2024-04-16 08:33:00' as clock_time union all select '李四' as name , '2024-04-16 09:32:00' as clock_time union all select '李四' as name , '2024-04-16 10:31:00' as clock_time union all select '李四' as name , '2024-04-16 11:33:00' as clock_time union all select '李四' as name , '2024-04-16 12:35:00' as clock_time union all select '李四' as name , '2024-04-16 16:31:00' as clock_time union all select '李四' as name , '2024-04-16 17:30:00' as clock_time union all select '李四' as name , '2024-04-16 18:31:00' as clock_time union all select '王朝' as name , '2024-04-16 08:33:00' as clock_time union all select '王朝' as name , '2024-04-16 09:32:00' as clock_time union all select '王朝' as name , '2024-04-16 10:31:00' as clock_time union all select '王朝' as name , '2024-04-16 11:33:00' as clock_time union all select '王朝' as name , '2024-04-16 12:35:00' as clock_time union all select '王朝' as name , '2024-04-16 16:31:00' as clock_time union all select '王朝' as name , '2024-04-16 17:30:00' as clock_time union all select '王朝' as name , '2024-04-17 17:35:00' as clock_time union all select '王朝' as name , '2024-04-16 18:31:00' as clock_time )
解题思路:
--解题思路: --既然首次打卡有效,所以每天的第一条打卡就不用关心 --我们怎么计算第二次以及后面多次是否和上一次的时间范围在 1<=x<3 大于等于1小时,且小于3小时 --我感觉需要拿到上一条数据才行,所以我们按照 name,day 来作分区 取每条数据的上一条,如果上一条数据为null,则无需多言,因为只会影响我的速度(哈哈 --剩下拿到了上一条数据的,我们和当前时间的小时做一个比较即可 1 <= x <3 就是我们的公式 --select -- name , -- clock_time , -- unix_timestamp( clock_time ) clock_time_second, -- lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) as prev_time , --拿到每个人在每天的打卡记录上一个时间,为空代表是第一天 -- unix_timestamp( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) ) as prev_time_second --拿到每个人在每天的打卡记录上一个时间,为空代表是第一天 --from -- data --2. 我们和当前时间的小时做一个比较即可 1 <= x <3 就是我们的公式 --select -- name , -- clock_time , -- lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) as prev_time, --拿到每个人在每天的打卡记录上一个时间,为空代表是第一天 -- --用时间戳去比较 -- if( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) is null , --为空代表是第一天,直接为有效 -- true, -- if( -- --用当前打卡时间 - 上一次打卡时间 = 与上次打卡的时间差 -- ( unix_timestamp( clock_time ) - unix_timestamp( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) ) ) >= 60*60 and -- >=60*60 = 大于1小时 -- ( unix_timestamp( clock_time ) - unix_timestamp( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) ) ) < 60*60*3 , -- <60*60*3 = 小于3小时 -- true, -- false -- ) -- ) as is_valid -- 是否有效打卡 --from -- data --3. 对有效状态进行聚合 select name, to_date(clock_time), count(is_valid) as day_total_valid_cnt from( select name , clock_time , --用时间戳去比较 if( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) is null , --为空代表是第一天,直接为有效 true, if( --用当前打卡时间 - 上一次打卡时间 = 与上次打卡的时间差 ( unix_timestamp( clock_time ) - unix_timestamp( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) ) ) >= 60*60 and -- >=60*60 = 大于1小时 ( unix_timestamp( clock_time ) - unix_timestamp( lag(clock_time) over(partition by name , to_date(clock_time) order by clock_time asc) ) ) < 60*60*3 , -- <60*60*3 = 小于3小时 true, false ) ) as is_valid -- 是否有效打卡 from data )t where is_valid = true group by name, to_date(clock_time)
结果: