This is the fourth article of the series: DAX Series — Learn the core of Power BI
4. Iterators and Aggregators
In the end of this article I have a link to my GitHub where you can find one empty Power BI Model and one full model with the measures and similar I have used in this article.
In this article I will describe what Iterators and Aggregators are in DAX. When should you use each and what is the theory behind these?
You can also find the story here:
What are Iterators and Aggregators?
Aggregators are functions that aggregate values of a column in a table and return a single value. They could work while only taking in one parameter, as the column Order Quantity in the example below.
Aggregators are simple functions that aggregate columns, such as
- SUM: Total Quantity = SUM ( ‘Sales’[Quantity] )
- MAX: Max Sales = MAX ( ‘Sales’[Quantity] )
- MIN: Min Sales = MIN ( ‘Sales’[Quantity] )
- AVERAGE: Average Sales = AVERAGE ( ‘Sales’[Quantity] )
Iterators on the other hand need to take in two parameters to work, which means that they do not work by only sending in a column. The first parameter that iterators works on are the table that they need to “scan”.
(They need to know which virtual table that the expression, which is the second parameter, should iterate over.)
The second parameter is in most of the cases this expression that the iterator will evaluate for each row in the table above. Enough talking, let us first create exactly the same measure as above but with an iterator instead of an aggregator.
As you can see does the iterator SUMX require the table, Sales, and the Expression, which in this case is just the column, to compute the order quantity. The result of this measure and the one earlier will be exactly the same and the performance as well.
Actually all the aggregators are internally, inside the Power BI engine, iterators. The SUM formula above is the same as the SUMX formula here below. The aggregators are only in Power BI to make it possible to write quicker syntax.
Some common iterators are:
- SUMX: Total Quantity = SUMX ( ‘Sales’, ‘Sales’[Quantity] )
- MAXX: Max Sales = MAXX ( ‘Sales’, ‘Sales’[Quantity] )
- MINX: Min Sales = MINX ( ‘Sales’, ‘Sales’[Quantity] )
- AVERAGEX: Average Sales = AVERAGEX ( ‘Sales’, ‘Sales’[Quantity] )
Yes, you can see that in these cases the only differences between the Iterators, such as SUMX, and the aggregator, such as SUM, is the table parameter as the first parameter in the iterator (‘Sales’).
When the Expression only is a column, such as ‘Sales’[Quantity] the two can be used interchangeable, but I would still recommend to write the long form/iterator when you begin to learn DAX.
As you embark on your DAX journey, taking the additional time to write in long form will pay dividends down the line. If you understand what DAX is doing under the hood, you have a much better chance of writing bug-free measures. Or at least measures you can debug. And that is worth every indent and seemingly repetitive line of code that you will ever have to write.
The picture above represents the bugs that will start spreading if you only use the short-form DAX when beginning to learn how to work with aggregators and iterators.
One quote they also say is that before you start using the short versions/Aggregators (SUM etc.) you should have written the long versions/Iterators (SUMX) at least 500 times.
The reason is that it is much easier to remember and understand how the iterator works than the aggregators when just looking at the syntax. It will also be much easier to write bug-free measures in the future.
(There are also “A-functions” (AVERAGEA, MINA etc.) but I have never used them while I have worked with DAX, and as their behavior is not as intended while working with text columns or columns with missing values, it is best to not use these functions at all at the moment).
In the next paragraphs I will describe iterators in more detail and I think that you will understand more the importance of using iterators instead of aggregators, even in cases where both works. I will describe the two parameters that are sent into iterators and common use cases for iterators.
How does iterators works?
Let us go through the syntax again
There are at least two parameter that are sent in to an iterator, the table to iterate through and the expression that will be evaluated. Let us go through these two parameter case by case first.
- First parameter — Table to Scan
This first parameter, Sales in the picture above, is what differentiates the iterator from the aggregator. It is also the reason for why it is important to write out the whole syntax and not the syntax sugared version.
The first iterator is the table to scan, where the table can be thought of like a virtual table inside DAX that the iterators will scan and then run the expressions against.
As we want to calculate the Order Quantity in the expression above we must include the Sales table as the first parameter. If this would be an aggregator the sales table would not be included in the expression.
- Second parameter — Expression to evaluate
The second parameter is the expression that we want to evaluate. The expression can be a single column like the example above, but it can also be a specific expression like the example below.
The specific thing with this expression is that it could not be written in an aggregator. As long as you want to write an expression that is not a single column, you must use an iterator.
But as I said above, I would still recommend to use aggregators for single column expressions as well, even if it just to understand what is really going on behind the scene better.
Let us try to evaluate what really happens in the iterator above.
- First the table ‘Sales’ is scanned
- For every row in this “Virtual table” ‘Sales’, the expression is evaluated (A row context is created by the iterator).
- After the expression have been evaluated for each row the rows are aggregated according to what this iterators says. (In this case they are summed together as there is a SUMX iterator).
I think it is easiest to think like the calculated column example I went through in a previous article or like an Excel Sheet. By including the Sales table as the table to scan you are “Creating” the Excel Sheet ‘Sales’ behind the scene, which have all the columns in the Sales table.
As the columns ‘Order Quantity’, ‘Discount Applied’ and ‘Unit Price’ is included in this table, we can now express a result of these columns for every row. Therefore the expression above is done for row 1, row 2, row 3 etc.
In the end after all of these rows from this virtual sales table have been evaluated in an additional virtual column, they need to somehow have be aggregated together, as this is the meaning of this expression.
As we use the SUMX iterator the virtual column that is the result of the expression is summed together. Sales Amount for Row 1 + Sales Amount for Row 2 + Sales Amount for Row 3 etc. are aggregated to one number.
If we would have used another aggregator as AVERAGEX, we would calculate the average instead of all of these rows.
Other people explaining Iterators and Aggregators
I think it is always good to get different perspectives when learning new concepts, especially when it comes down to a such important concept as this one. Below are some of the content that I went through to learn about Iterators and Aggregators.
- SQLBI — The Definitive Guide to DAX
If you have the book “The Definitive Guide to DAX” there are a part of the book where the aggregators and iterators are explained. This is the part where I learned about these new concepts the first time.
- CSG Pro — Elements of DAX
I really like the elements of DAX series by CSG Pro. I think this is one of the best online resources for really learning about iterators and aggregators. They explain it with virtual tables in a way that no one else has done before.
- EnterpriseDNA -Youtube (SUM vs SUMX), (Iterating Functions) etc.
EnterpriseDNA have several Youtube videos about iterating concepts. Both videos about the difference between SUM and SUMX and several deep dive videos about iterating functions.
- Try out yourself
Probably the most important part when it comes to iterating and aggregating functions is to try out these concepts yourself. You will get error messages and bugs, but when you try to solve these you will learn and understand the concepts better
EnterpriseDNA: Iterating Functions Deep Dive
Radacad: SUM vs SUMX — What is the Difference Between the Two DAX Functions in Power BI
Power BI Models available on GitHub
You can find the Power BI files for all of the Blog posts below:
Inside PowerBI-Datamodels you can find the folder 4. Iterators and Aggregators, where I have posted the “Clean version” and the “Answer Key” with all the measures and similar done in this article.
Questions or inquiries
If you have any questions or input please leave an input or contact me on LinkedIn: www.linkedin.com/in/elias-nordlinder
Have a great week everybody