在PostgreSQL中,窗口函数(也称为窗口或分析功能)是在结果集中与当前行相关的一组行进行计算的强大功能。窗口函数通常用于涉及排名,聚合和比较特定行窗口中的值的任务。与传统的聚集功能相比 这是与窗口功能有关的一些关键特征和概念:
1.分区:窗口函数通常涉及根据一个或多个列将结果设置为子集或行组。这些分区定义了窗口函数运行的范围。指定分区列中具有相同值的行属于同一窗口。
2.排序:在每个分区中,您可以根据一个或多个列定义订单。排序确定如何布置分区中的行。这对于等级,累积总和和计算百分位数等功能至关重要。
3.Window框架:窗框定义了相对于窗口函数运行的当前行的行范围。您可以使用ROWS BETWEEN
,RANGE BETWEEN
或UNBOUNDED PRECEDING
之类的关键字来设置边界。
4.窗口函数语法:窗口函数在SELECT
claueser中使用,并具有独特的语法。其次是OVER
par,指定分区,订购和框架从句。这是一个基本的语法示例:
SELECT
column1,
column2,
window_function(column3) OVER (
PARTITION BY partition_column
ORDER BY order_column
frame_clause
) AS result_column
FROM table_name;
PostgreSQL中的常见窗口函数包括:
1.ROW_NUMBER()函数:
此功能根据指定顺序为分区中的每一行分配一个唯一的整数。它不会留下空白,并且连续的行接收连续的整数。
假设我们有一个具有以下数据的“员工”表:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
employee_name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
INSERT INTO employees (employee_name, department, salary)
VALUES
('Alice', 'HR', 50000),
('Bob', 'HR', 52000),
('Charlie', 'Finance', 60000),
('David', 'Finance', 55000),
('Eve', 'IT', 65000),
('Frank', 'IT', 62000);
row_number()函数
SELECT
department,
employee_name,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
预期输出:
department | employee_name | salary | row_num
-----------+---------------+--------+--------
Finance | Charlie | 60000 | 1
Finance | David | 55000 | 2
HR | Bob | 52000 | 1
HR | Alice | 50000 | 2
IT | Eve | 65000 | 1
IT | Frank | 62000 | 2
分区:该结果集由部门列分区,为每个部门创建单独的分区。
订购:在每个部门的分区中,行以降级命令订购。
rank()函数:
RANK()
在一个分区内为行分配了排名,并且在纽带的情况下留出了差距。如果多行具有相同的值,则它们会获得相同的等级,并且下一个排名。
示例:
SELECT
department,
employee_name,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;
预期输出:
-- If there is another employee in the Finance department who receives a salary of 55000, their rank will also be 2 but the next one will be 4 instead of 3.
department | employee_name | salary | rank
-----------+---------------+--------+------
Finance | Charlie | 60000 | 1
Finance | David | 55000 | 2
Finance | Mark | 55000 | 2
Finance | John | 45000 | 4
HR | Bob | 52000 | 1
HR | Alice | 50000 | 2
IT | Eve | 65000 | 1
IT | Frank | 62000 | 2
3.dense_rank()函数:
DENSE_RANK()
is类似于RANK()
,但在纽带的情况下不会留出空白。如果多行具有相同的值,则它们获得相同的等级,并且下一个排名不会跳过。
示例:
SELECT
department,
employee_name,
salary,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;
预期输出:
-- If there is another employee in the Finance department who receives a salary of 55000, their rank will also be 2 and the next one will be 3.
department | employee_name | salary | dense_rank
-----------+---------------+--------+------------
Finance | Charlie | 60000 | 1
Finance | David | 55000 | 2
Finance | Mark | 55000 | 2
Finance | John | 45000 | 3
HR | Bob | 52000 | 1
HR | Alice | 50000 | 2
IT | Eve | 65000 | 1
IT | Frank | 62000 | 2
--If there is another employee in the Finance department who receives a salary of 55000, their rank will be 3
4.Lead()函数:
这些功能允许您从分区中的下一行访问值,并尊重指定的顺序。
示例:
SELECT
department,
employee_name,
salary,
LEAD(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS next_highest_salary
FROM employees;
预期输出:
department | employee_name | salary | next_highest_salary
-----------+---------------+--------+---------------------
Finance | Charlie | 60000 | 55000
Finance | David | 55000 |
HR | Bob | 52000 | 50000
HR | Alice | 50000 |
IT | Eve | 65000 | 62000
IT | Frank | 62000 |