Home Posts About FAQ?

On DAX's difficulty and CALCULATE: a rant

Why is DAX so hard to learn? It is an evergreen topic. DAX is difficult to learn because of its inherent complexity. DAX is presented as difficult to learn in ways that do not align with this complexity. The topic is evergreen because of demography among DAXists. This difficulty has spawned a movement, for lack of a better term, rebelling against CALCULATE and purporting to improve the experience of writing DAX by writing it worse. In this post, I will explore these topics and share some opinions on the same.

Before diving in, it is important to address the paucity of links and attributions in this post. I have several reasons.

  1. I do not want to signal boost incorrectness.
  2. As mentioned above, these topics are evergreen, so you can find plenty of examples on your own if you wish.
  3. This article is not about attacking anyone or making a point-by-point rebuttal of any individual's position. Rather, it is commentary on trends and ideas in the community.

Whence DAX's difficulty?

DAX is difficult to learn. It has inherent complexities that force one to learn not just DAX, but also data modeling and visualization concurrently. The community around DAX insists on an aura of the mysterious and arcane, which hurts new DAXists. The population of DAXists is primarily made up of new users, thanks to the rate of growth of the language, particularly driven by Power BI. The first must be addressed by improved didactic tools, the second by deciding we can do better, and the third is not even a problem.

DAX is objectively complex

DAX is an analytical query language with several foundational influences: Excel, SQL, and MDX. It was explicitly designed to copy a number of Excel functions and to be syntanctically similar to Excel's formula language. It is a functional relational query language whose primary operations are on tables and relationships. This relational heritage is shared with SQL, among other languages. DAX is also, in part, a response to MDX, Microsoft's earlier dimensional query language for SSAS Multidimensional, née OLAP. Jeffrey Wang, one of the creators of DAX, gives a lot of detail in this excellent presentation. DAX's history is not the source of its complexity.

DAX is objectively complex. I use Rich Hickey's definition of simplicity and complexity. Simply put, complexity is intertwining. When multiple things are deeply interconnected with one another in such a way that they form a single construct which cannot be understood except by understanding all parts together, we can say that these individual components are complected with one another.

DAX is complected with the data model which it is being executed against. Unlike many other languages, reading all of the DAX in a model is insufficient to understand what it does and what it means. Contrast this with a relational database. If you read all the SQL DDL that creates a database and have access to all of the SQL statements that have been run in that database, then you would be able to understand exactly what has been computed. If you read all of the DAX, then you can, at best, make some guesses and assumptions about what it is computing. The same DAX executed against two different models can yield very different meanings, not merely different results. Because DAX is complected with the data model, you must be proficient in both data modeling and DAX; worse, you will be limited by whichever you are less proficient at, no matter how good you may be at the other.

DAX is complected with the intended presentation of data. I posit that most people writing DAX are writing measures with the intent of displaying these in Power BI visuals and Excel pivot tables. If it is not "most" in the population of all DAXists, it is certainly the path of almost all newcomers to the language; regardless of the proportion, this is a significant and relevant portion of DAX and worth making this point. Power BI visuals are really quite dumb things. Excel pivot tables are smarter, but with respect to DAX, they are conceptually quite similar to Power BI visuals. In the context of DAX and a Tabular model, these are both tools to write queries against the data model. They do not know how to do much else besides generate a query and then present the resultset back to you. Any programmatic interaction, dynamicity, or customization comes through DAX measures. This means that when writing a measure, its author must keep in mind all possible visuals where the measure might be used. A narrowly defined measure may not work as intended in all visuals and contexts. A measure should be defined to cover the most general use case, so that it can be used in any visual, but this total generality requires deep understanding of the entire model.

DAX's concept of filter context does not have a direct analog in many other programming languages. The relationships defined in the data model are invisible from DAX code -- they are defined in the model, outside of any DAX code, and they implicitly affect all DAX executed in the scope of the model. This point is technically a component of DAX's complexity with the data model, but it is worth calling out separately. This language feature is incredibly powerful, but is also riddled with nuance and is novel even to experienced programmers. Much of the time, filter context propagation does exactly the right thing that makes intuitive sense. This is a design triumph. Some of the time, filter context is terribly difficult to understand. All of the time, filter context is there, implicitly present in the model, and affecting all of your DAX. You must keep it in mind.

Considering Power BI as an environment in which much DAX is used and learned, we must also consider the complexity that is a report. A report is usually comprised of many visuals on one or more pages. All visuals interact with one another and reports have multiple levels of filters. Thus, the environment where our DAX is executed is rather complex. To understand the filter context that exists for our DAX code, we must understand the model, the visual, and all the interactions with other report elements, not to mention understanding filter context in the first place! DAX is inherently complex.

DAX is subjectively difficult, and that is our fault

We, you reading this and I writing it, are part of a community. This community is made up of all DAXists. We meet in many fora, both online and in person. We find each other in our places of work and outside. Those new to DAX often find their way to some forum or another and engage, even if passively, in the community.

And we, the community collectively, do a disservice to these new DAXists. So much of our discussion of DAX uses language coded to make it seem magical and mystical and difficult to master. Do not misunderstand me, there are inherent complexities in DAX, but these are not mystical. DAX is a programming language with readily inventoried complexities. CALCULATE, especially, gets the treatment of a Sacred Mystery as if understanding it is some initiation rite in The Church of DAX. The words we use do have power, not as magic spells, but in the ideas we communicate.

Another theme in communication with new DAXists is one that I am personally guilty of. It is common when someone asks about what they need to know to understand DAX to hand them the Definitive Guide and tell them to read a bunch of it first. The Definitive Guide is as its title suggests; this is a good thing, but it does not lend itself to a friendly introduction to the language. Certainly, there are other friendlier resources that are recommended frequently. It is a common theme that I have observed, though, that completing the Definitive Guide is treated as a milestone in one's DAX journey that must occur before competence is achieved. I do believe that anyone serious about DAX should read the Guide, but it is an intimidating thing to require, even implicitly, of new DAXists.

If we, as a community, insist on treating this modest query language as if it is ridiculously difficult or somehow magical, then we can expect that people will believe us when we say so. DAX is not simple, but we need not burden new DAXists with unrealistic expectations that they will never be able to understand it. We can tell new DAXists whatever we want about the language; why don't we try on something like, "Here is what you need to know and you can do it!" rather than, "You should despair of understanding this if you are not a wizard."

DAX's demography

DAX has been growing rapidly, with Power BI being one of its primary vectors. Like, bonkers fast. My own casual observations indicate that the user base of Power BI has had an annual growth rate of more than 2x per year since launch. If the user base doubles every year, then it is a simple mathematical identity that at any given time, half of all users have less than one year of experience. This will remain true for at least the next several years as well; if nothing else, the recently announced inclusion of Power BI in Office will make sure of that. The topic of DAX's difficulty will remain an evergreen topic so long as the language's user base is growing rapidly (unless we come up with profoundly better ways of teaching it).

DAX is a relatively young language. Even Jeffrey Wang does not have 15 years of experience with the language (based on the 2009 patent date in the presentation linked above). Most of those recognized as experts in the community have less than a decade of experience, though there is a small and growing decade club. Just think about how few human endeavors have such a low cap on the tenure of experts. We are still figuring out how best to write DAX and how to communicate and teach this. We are young, so let us embrace the discovery!

The No-CALCULATE movement

There is a fringe movement in the field of DAX pedagogy that eschews one function, CALCULATE. Proponents of this approach have decided that one of the novel components of the language, filter context, is not worth understanding. Rather than using this language feature, the approach utilizes a series of table variables to create a more imperative programming style in DAX. There is a vocal minority who preach this style and find it to be more approachable when writing and learning DAX.

This has come up in conversation among peers in the past, and several times recently. Some respected members of the community have been vocally dismissive of the approach, saying things like, "Avoiding CALCULATE is the worst thing you can do in DAX." I have been similarly dismissive in the past, but I feel I need to walk that back.

To be abundantly clear, the no-CALCULATE approach is incorrect and will manifest performance pathologies in larger models. I am not endorsing the approach, but I must admit that casual dismissiveness is an unhelpful response. In fact, there are useful things to learn from the movement, just not any of its actual DAX.

The blunt truth is that for small N, it just does not matter. And the vast majority of people coming to Power BI have small N. A shocking number of Power BI models fit in CPU cache; hardware today is simply incredible. And the bigger challenge for many people new to Power BI and DAX is to just understand what is going on and get the results they need. After all, DAX is inherently complex!

I applaud the marketing acumen on display in this movement:

  • "DAX Counterculture"
  • "Ignore the complexity of CALCULATE"
  • "Make your DAX easier to understand"

These are great bits of marketing and you can see this signature across all the no-CALCULATE writings that pop up, regardless of author. The proponents of this approach have done a great job of crafting a message that resonates and largely self-perpetuates.

And you know what? I agree that a series of tables makes for an understandable computational model. You know who else thought that was a good idea? The Power Query folks. Power Query generates a series of steps where each step is a table representing a partial result. These partial results are refined until they add up to the result table we want. Doing the same thing in DAX lets you keep the same mental model of calculation. It is a good mental model of analytical computation!

Ultimately, we are all made better off by the proliferation of DAX and Power BI. The more people that use these tools, the larger our job market and consulting market and ISV market for Power BI-related things. And the more people using Power BI, the more product adoption, the more market penetration, the more Microsoft will continue to invest and improve the tool and DAX (more on this, soon!).

Now, I could make a point-by-point rebuttal of any of the no-CALCULATE articles out there and have done in the past. But what do any of us get out of that? I get to feel superior because I know "more DAX better", but that does not actually help anyone. I could tear someone down and thereby put myself in a position of authority, but that is a nasty way to gain authority. There are other experts who seem willing to do this, so I will let them. And ultimately, the no-CALCULATE movement has probably helped more people solve actual problems in DAX than I have in my career. Because it is more approachable to a novice. It allows a more imperative style, which is usually a more intuitively understandable programming paradigm than functional. This is especially true when the language is DAX, a functional and relational language with a concept of filter context that does not exist in the vast majority of programming languages. And it matches the way you should think about transformations in Power Query, the other data engine that goes with Tabular. And it does not matter for so very many models that will never grow beyond a few hundred thousand rows of fact data.

But it is incorrect.

Conclusion

This is a rant so there is not much of a conclusion. But bring on the new DAXists still figuring out how it all works. After all, there is a lot to figure out and keep track of. And there will only ever be more people starting. With so many people, some are bound to get it wrong. Bring on the bad DAX and the good. When someone wants thousands-of-words rants about how to think about DAX and how to write it right ... well, I will be here with those words when they are ready (:

As always, feel free to get in touch.