MySql语句
1.GROUP BY 后增加筛选条件,用HAVING效率更高
having用于分组后的过滤,放到group by后面,常和聚合函数一起用 where 用于全表数据集的筛选,在分组动作的前面
LeetCode.182 查找重复的电子邮箱
三种方法统计重复
having +聚合函数 > SELECT Email from Person group by Email having count(Email) > 1;
自链接 > SELECT DISTINCT p1.Email from Person p1, Person p2 where p1.Email = p2.Email and p1.Id != p2.Id;
子表查询 > SELECT Email from (select count(1) as t, Email from Person group by Email) r where r.t > 1;
LeetCode.196 删除重复的邮箱
有慢查询优化经验会清楚,在实际生产中,面对千万上亿级别的数据,连接的效率往往最高,因为用到索引的概率较高。
DELETE p1 FROM Person p1, Person p2 WHERE p1.Email = p2.Email AND p1.Id > p2.Id
1、DELETE p1
在DELETE官方文档中,给出了这一用法,比如下面这个DELETE语句👇
DELETE t1 FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;
这种DELETE方式很陌生,竟然和SELETE的写法类似。它涉及到t1和t2两张表,DELETE t1表示要删除t1的一些记录,具体删哪些,就看WHERE条件,满足就删;
这里删的是t1表中,跟t2匹配不上的那些记录。
所以,官方sql中,DELETE p1就表示从p1表中删除满足WHERE条件的记录。
2、p1.Id > p2.Id
继续之前,先简单看一下表的连接过程,这个搞懂了,理解WHERE条件就简单了👇
a. 从驱动表(左表)取出N条记录;
b. 拿着这N条记录,依次到被驱动表(右表)查找满足WHERE条件的
a. 从表p1取出3条记录;
b. 拿着第1条记录去表p2查找满足WHERE的记录,代入该条件p1.Email = p2.Email AND p1.Id > p2.Id后,发现没有满足的,所以不用删掉记录1;
c. 记录2同理;
d. 拿着第3条记录去表p2查找满足WHERE的记录,发现有一条记录满足,所以要从p1删掉记录3;
e. 3条记录遍历完,删掉了1条记录,这个DELETE也就结束了。
2. SQL最值问题之IN关键字
查找部门工资最高的员工
给出Employee员工表(Id,Name,Salary,DepartmentId)与Department部门表(Id,DepartmentName)
查询每个部门工资最高的员工姓名、薪水、部门名称
SELECT d.Name as 'Department' , e.Name as 'Employee' ,e.Salary as 'Salary' FROM Employee e JOIN Department d ON e.DepartmentId = d.Id WHERE (e.Salary , e.DepartmentId) IN (SELECT MAX(Salary) , DepartmentId FROM Employee GROUP BY DepartmentId)
查询部门工资前三高的所有员工
思路1 分组内取前几名的问题,可以先group by然后用having count()来筛选
思路2 利用函数来解决dense_rank() + partition by + order by
SELECT Department , Employee , Salary FROM ( SELECT d.Name Department, e.Name Employee , e.Salary Salary, DENSE_RANK() OVER( PARTITION BY DepartmentId ORDER BY Salary DESC ) AS rk FROM Employee as e JOIN Department as d ON e.DepartmentId = d.Id ) f WHERE rk <= 3