精品丨DAX性能优化问题

BOSS:那个,白茶啊,这个报表刷新有点慢啊,你看,每次我点筛选或者刷新都会让我等很久。
白茶:(认真脸)BOSS,那您再等会就好!

BOSS:白茶!!两小时了!!还没出来!!
白茶:(思考)老板,这个有点难啊,这个问题技术要求比较高。
BOSS:加钱!!!!
白茶:好嘞!

一张好的报表是如何界定的?DAX计算无误、前端展现明了、业务思路清晰、报表响应速度,白茶觉得这些因素就可以界定一张好的报表。

本期我们来聊一聊PowerBI中DAX函数性能优化的问题。

毕竟一张可视化报表需要15分钟刷新才能呈现出来,这对用户来说太不友好了。

先来看看本期的示例文件:

一张产品维度表,一张销售明细表。

需求是什么?

这张是销售明细表中的分店维度信息,为了便于小伙伴理解,白茶单独整理出来。

这张表是需求的计算逻辑图。什么意思呢?就是当Key小于15时,计算每个Key对应的分店,当Key大于14时,根据计算逻辑对不同的分店进行汇总计算。

编写基础的DAX计算代码:

SalesAmount =
SUMX ( 'Fact_SalesDetail', [Quantity] * RELATED ( Dim_Product[SalesAmount] ) )

在不考虑性能的情况下,DAX计算逻辑如下:

SalesAmountByDisplay = 
SUMX (
    'Dim_DisplayDepartment',
    SWITCH (
        TRUE (),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 1,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 1 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 2,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 3,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 3 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 4,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 4 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 5,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 5 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 6,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 6 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 7,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 7 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 8,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 8 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 9,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 9 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 10,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 10 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 11,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 11 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 12,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 12 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 13,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 13 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 14,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 14 )
            ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 15,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 13 )
            )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 14 )
                ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 16,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 1 )
            )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 7 )
                )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 9 )
                ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 17,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 1 )
            )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
                )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 3 )
                )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 11 )
                ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 18,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 9 )
            )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 11 )
                )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 12 )
                )
                - CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
                ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 19,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 4 )
            )
                + CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 7 )
                ),
        SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 20, CALCULATE ( [SalesAmount], ALLSELECTED ( Fact_SalesDetail ) )
    )
)

相信不用白茶多说,小伙伴也能看出来代码的问题,太长了。

这段代码功能基本上是实现了,问题点有哪些呢?

1.SELECTEDVALUE复用度较高,可以使用变量代替

2.多个条件汇总迭代次数较多,可以使用提供List

这也是小伙伴常见的问题,如果DAX的构建可以绕开这两个问题,那么性能会有很大的提升。

简化版写法:

SalesAmountByDisplay2 = 
VAR CurrentDepartmentKey =
    SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] )
VAR CurrentDepartment =
    TREATAS (
        VALUES ( Dim_DisplayDepartment[DepartmentKey] ),
        Fact_SalesDetail[DepartmentKey]
    )
VAR Results =
    SWITCH (
        TRUE (),
        CurrentDepartmentKey = 15,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 13, 14 } )
            ),
        CurrentDepartmentKey = 16,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 1, 7, 9 } )
            ),
        CurrentDepartmentKey = 17,
            CALCULATE (
                [SalesAmount],
                FILTER (
                    'Fact_SalesDetail',
                    'Fact_SalesDetail'[DepartmentKey] IN { 1, 2, 3, 11 }
                )
            ),
        CurrentDepartmentKey = 18,
            CALCULATE (
                [SalesAmount],
                FILTER (
                    'Fact_SalesDetail',
                    'Fact_SalesDetail'[DepartmentKey] IN { 9, 11, 12 }
                )
            )
                - CALCULATE (
                    [SalesAmount],
                    FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
                ),
        CurrentDepartmentKey = 19,
            CALCULATE (
                [SalesAmount],
                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 4, 7 } )
            ),
        CurrentDepartmentKey = 20, CALCULATE ( [SalesAmount], ALLSELECTED ( Fact_SalesDetail ) ),
        CALCULATE ( [SalesAmount], CurrentDepartment )
    )
RETURN
    Results

相较于之前的写法,这个写法通过VAR定义变量,和使用TREATAS来减少代码计算逻辑的书写。

通过提供List来减少迭代遍历的次数。

那么有没有继续可以优化的空间?有的。

优化写法:

SalesAmountByDisplay3 = 
VAR CurrentDetail =
    ADDCOLUMNS (
        DISTINCT ( 'Dim_DisplayDepartment' ),
        "@CurrentValue",
            VAR CurrentDepartmentKey = 'Dim_DisplayDepartment'[DepartmentKey]
            RETURN
                SWITCH (
                    TRUE (),
                    CurrentDepartmentKey = 15,
                        CALCULATE (
                            [SalesAmount],
                            FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 13, 14 } )
                        ),
                    CurrentDepartmentKey = 16,
                        CALCULATE (
                            [SalesAmount],
                            FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 1, 7, 9 } )
                        ),
                    CurrentDepartmentKey = 17,
                        CALCULATE (
                            [SalesAmount],
                            FILTER (
                                'Fact_SalesDetail',
                                'Fact_SalesDetail'[DepartmentKey] IN { 1, 2, 3, 11 }
                            )
                        ),
                    CurrentDepartmentKey = 18,
                        CALCULATE (
                            [SalesAmount],
                            FILTER (
                                'Fact_SalesDetail',
                                'Fact_SalesDetail'[DepartmentKey] IN { 9, 11, 12 }
                            )
                        )
                            - CALCULATE (
                                [SalesAmount],
                                FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
                            ),
                    CurrentDepartmentKey = 19,
                        CALCULATE (
                            [SalesAmount],
                            FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 4, 7 } )
                        ),
                    CurrentDepartmentKey = 20, CALCULATE ( [SalesAmount], ALLSELECTED ( Fact_SalesDetail ) ),
                    CALCULATE (
                        [SalesAmount],
                        'Fact_SalesDetail'[DepartmentKey] = CurrentDepartmentKey
                    )
                )
    )
RETURN
    SUMX ( CurrentDetail, [@CurrentValue] )

这种写法,相较于上一种书写量多一些,通过定义虚拟表来减少迭代遍历的次数。从理论上来说,因为定义了虚拟表,无论源代码需要对事实表迭代多少次,这个思路迭代的永远都是虚拟表,优化度很高。

我们来对比一下:

DAX Studio测试:

三种写法的差距很明显。其实这里有一些争议的,简化写法是通过减少资源占用来实现优化,优化写法是通过减少迭代遍历实现优化。

从测试结果上来看,是简化写法优化度较高,但是在实际应用中,测试发现优化写法的方式响应更迅速。

你以为到这里就结束了么?

其实还有第四种优化的思路,只不过这个思路比较难。

SalesAmountByDisplay4 = 
SUMX (
    VALUES ( 'Dim_DisplayDepartment'[DepartmentKey] ),
    VAR CurDpmKey = 'Dim_DisplayDepartment'[DepartmentKey]
    VAR TempTable =
        FILTER (
            ALL ( 'Dim_DisplayDepartment'[DepartmentKey] ),
            'Dim_DisplayDepartment'[DepartmentKey] <= 14
        )
    VAR AllDetail =
        ADDCOLUMNS (
            TempTable,
            "SalesAmount",
                VAR CurrentDepartment = 'Dim_DisplayDepartment'[DepartmentKey]
                RETURN
                    CALCULATE (
                        [SalesAmount],
                        ALL ( 'Dim_DisplayDepartment' ),
                        'Fact_SalesDetail'[DepartmentKey] = CurrentDepartment
                    )
        )
    VAR FilterContent =
        CALCULATE (
            MAX ( 'Dim_ComputationalLogic'[FilterContent] ),
            ALL ( Dim_DisplayDepartment ),
            'Dim_ComputationalLogic'[DepartmentKey] = CurDpmKey
        )
    VAR Length =
        LEN ( FilterContent )
    VAR FilterTable =
        SELECTCOLUMNS (
            GENERATESERIES ( 1, Length, 4 ),
            "DepmKey", MID ( FilterContent, [Value], 3 ) * 1
        )
    VAR Result =
        SUMX (
            FilterTable,
            VAR DpmKey = [DepmKey]
            VAR SalesValue =
                SUMMARIZE (
                    FILTER ( AllDetail, 'Dim_DisplayDepartment'[DepartmentKey] = ABS ( DpmKey ) ),
                    [SalesAmount]
                )
            RETURN
                IF ( DpmKey >= 0, SalesValue, - SalesValue )
        )
    RETURN
        Result
)
--作者:夕枫

这个优化的思路,是**@夕枫**大佬提出来的。通过定义计算表,减少代码书写量,使用ALL减少上下文转换的消耗,减少查询次数,命中缓存。

DAX Studio测试:

总结一下:

1.可以通过变量和定义表来减少代码书写量

2.可以通过减少资源调用优化

3.可以通过虚拟表减少迭代遍历

4.可以通过命中缓存进行优化

比较常用的是前三种,第四种难度系数较高。

往期推荐:

《精品丨CALCULATE进阶》

《精品丨上下文扩展》

《精品丨PowerBI内嵌分页报表》

《精品丨扩展表理论》

小伙伴们❤GET了么?

(白茶:别问我第四种,我不会TAT)

这里是白茶,一个PowerBI的初学者。

Fabric丨白茶 文章被收录于专栏

数据分析进阶之路,带你深入了解可视化技巧。

全部评论

相关推荐

11-24 11:23
门头沟学院 C++
点赞 评论 收藏
分享
牛客410815733号:这是什么电影查看图片
点赞 评论 收藏
分享
10-15 16:27
门头沟学院 C++
LeoMoon:建议问一下是不是你给他付钱😅😅
点赞 评论 收藏
分享
评论
点赞
收藏
分享
牛客网
牛客企业服务