题解 | #对试卷得分做min-max归一化#
对试卷得分做min-max归一化
https://www.nowcoder.com/practice/2b7acdc7d1b9435bac377c1dcb3085d6
with
t1 as (
select
b.uid,
a.exam_id,
b.score
from
examination_info as a,
exam_record as b
where
a.exam_id = b.exam_id
and a.difficulty = 'hard'
and b.score is not null
),
t2 as (
SELECT
a.exam_id,
MAX(b.score) - MIN(b.score) as diff_score,
MIN(b.score) as min_score,
Max(b.score) as max_score
FROM
examination_info as a
JOIN exam_record as b ON a.exam_id = b.exam_id
WHERE
a.difficulty = 'hard'
AND b.score IS NOT NULL
GROUP BY
a.exam_id
),
tc as
( select
t1.uid,
t1.exam_id,
t1.score,
if (
t2.min_score = t2.max_score,
t2.min_score,
(t1.score - t2.min_score) * 100 / t2.diff_score
) as diff_last_score
from
t1
left join t2 on t1.exam_id = t2.exam_id
)
select
uid,
exam_id,
round(avg(diff_last_score), 0) as avg_new_score
from
tc
group by
uid,
exam_id
order by
exam_id asc,
avg_new_score desc
#先读题:请你将用户作答高难度试卷的得分在每份试卷作答记录内执行min-max归一化后缩放到[0,100]区间,并输出用户ID、试卷ID、归一化后分数平均值;最后按照试卷ID升序、归一化分数降序输出
1、条件:作答高难度试卷(join加hard),最后按照试卷ID升序、归一化分数降序输出
2、内容:在每份试卷作答记录内执行min-max归一化后缩放到[0,100]区间,并输出用户ID、试卷ID、归一化后分数平均值
解决:
作答高难度试卷(join加hard)--------------join两个表备用,用where筛选
select
b.uid,
a.exam_id,
b.score
from
examination_info as a,
exam_record as b
where
a.exam_id = b.exam_id
and a.difficulty = 'hard'
and b.score is not null
min-max归一化后缩放到[0,100]区间---------求这些最大值,最小值,最大值-最小值
SELECT
a.exam_id,
MAX(b.score) - MIN(b.score) as diff_score,
MIN(b.score) as min_score,
Max(b.score) as max_score
FROM
examination_info as a
JOIN exam_record as b ON a.exam_id = b.exam_id
WHERE
a.difficulty = 'hard'
AND b.score IS NOT NULL
GROUP BY
a.exam_id
最后就是通过join 对数据进行标准化
select
t1.uid,
t1.exam_id,
t1.score,
if (
t2.min_score = t2.max_score,
t2.min_score,
(t1.score - t2.min_score) * 100 / t2.diff_score
) as diff_last_score
from
t1
left join t2 on t1.exam_id = t2.exam_id
最后在求相应的平均值,round()函数,avg()聚合函数
select
uid,
exam_id,
round(avg(diff_last_score), 0) as avg_new_score
from
tc
group by
uid,
exam_id
order by
exam_id asc,
avg_new_score desc
不过现在max() over 窗口函数貌似可以一步到位
MIN(score) OVER(PARTITION BY exam_id) min_x, #求每类试卷的得分最小值
MAX(score)OVER(PARTITION BY exam_id) max_x #求每类试卷的得分最大值
结束了,这个的判断很重要avg求均值,如果是1个数据就会叛定为null,所以必须使用if函数进行限制
结束
