窗口功能
#postgres #database #apacheage #postgressql

在PostgreSQL中,窗口函数(也称为窗口或分析功能)是在结果集中与当前行相关的一组行进行计算的强大功能。窗口函数通常用于涉及排名,聚合和比较特定行窗口中的值的任务。与传统的聚集功能相比 这是与窗口功能有关的一些关键特征和概念:

1.分区:窗口函数通常涉及根据一个或多个列将结果设置为子集或行组。这些分区定义了窗口函数运行的范围。指定分区列中具有相同值的行属于同一窗口。

2.排序:在每个分区中,您可以根据一个或多个列定义订单。排序确定如何布置分区中的行。这对于等级,累积总和和计算百分位数等功能至关重要。

3.Window框架:窗框定义了相对于窗口函数运行的当前行的行范围。您可以使用ROWS BETWEENRANGE BETWEENUNBOUNDED PRECEDING之类的关键字来设置边界。

4.窗口函数语法:窗口函数在SELECTclaueser中使用,并具有独特的语法。其次是OVERpar,指定分区,订购和框架从句。这是一个基本的语法示例:

SELECT
    column1,
    column2,
    window_function(column3) OVER (
        PARTITION BY partition_column
        ORDER BY order_column
        frame_clause
    ) AS result_column
FROM table_name;

PostgreSQL中的常见窗口函数包括:

1.ROW_NUMBER()函数:
此功能根据指定顺序为分区中的每一行分配一个唯一的整数。它不会留下空白,并且连续的行接收连续的整数。

假设我们有一个具有以下数据的“员工”表:

CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    employee_name VARCHAR(100),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);

INSERT INTO employees (employee_name, department, salary)
VALUES
    ('Alice', 'HR', 50000),
    ('Bob', 'HR', 52000),
    ('Charlie', 'Finance', 60000),
    ('David', 'Finance', 55000),
    ('Eve', 'IT', 65000),
    ('Frank', 'IT', 62000);

row_number()函数

SELECT
    department,
    employee_name,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;

预期输出:

department | employee_name | salary | row_num
-----------+---------------+--------+--------
Finance    | Charlie       | 60000  | 1
Finance    | David         | 55000  | 2
HR         | Bob           | 52000  | 1
HR         | Alice         | 50000  | 2
IT         | Eve           | 65000  | 1
IT         | Frank         | 62000  | 2

分区:该结果集由部门列分区,为每个部门创建单独的分区。
订购:在每个部门的分区中,行以降级命令订购。

rank()函数:
RANK()在一个分区内为行分配了排名,并且在纽带的情况下留出了差距。如果多行具有相同的值,则它们会获得相同的等级,并且下一个排名。

示例:

SELECT
    department,
    employee_name,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;

预期输出:

-- If there is another employee in the Finance department who receives a salary of 55000, their rank will also be 2 but the next one will be 4 instead of 3.

department | employee_name | salary | rank
-----------+---------------+--------+------
Finance    | Charlie       | 60000  | 1
Finance    | David         | 55000  | 2
Finance    | Mark          | 55000  | 2
Finance    | John          | 45000  | 4
HR         | Bob           | 52000  | 1
HR         | Alice         | 50000  | 2
IT         | Eve           | 65000  | 1
IT         | Frank         | 62000  | 2


3.dense_rank()函数:
DENSE_RANK()is类似于RANK(),但在纽带的情况下不会留出空白。如果多行具有相同的值,则它们获得相同的等级,并且下一个排名不会跳过。
示例:

SELECT
    department,
    employee_name,
    salary,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;

预期输出:

-- If there is another employee in the Finance department who receives a salary of 55000, their rank will also be 2 and the next one will be 3.

department | employee_name | salary | dense_rank
-----------+---------------+--------+------------
Finance    | Charlie       | 60000  | 1
Finance    | David         | 55000  | 2
Finance    | Mark          | 55000  | 2
Finance    | John          | 45000  | 3
HR         | Bob           | 52000  | 1
HR         | Alice         | 50000  | 2
IT         | Eve           | 65000  | 1
IT         | Frank         | 62000  | 2

--If there is another employee in the Finance department who receives a salary of 55000, their rank will be 3

4.Lead()函数:
这些功能允许您从分区中的下一行访问值,并尊重指定的顺序。
示例:

SELECT
    department,
    employee_name,
    salary,
    LEAD(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS next_highest_salary
FROM employees;

预期输出:

department | employee_name | salary | next_highest_salary
-----------+---------------+--------+---------------------
Finance    | Charlie       | 60000  | 55000
Finance    | David         | 55000  | 
HR         | Bob           | 52000  | 50000
HR         | Alice         | 50000  | 
IT         | Eve           | 65000  | 62000
IT         | Frank         | 62000  |