In the previous pumpkin I put down what I intend to learn during this exploration of MDX and how I’m going to go about it. But what does MDX mean, anyway? And where did it come from? And how is it used? And for what purpose?
Well, those are a lot of questions at once, aren’t they? How about we go through them one-by-one?
What is MDX?
MDX stands for MultiDimensional eXpressions and it is a language that I can use to query OLAP cubes, such as those in SQL Server Analysis Services. MDX also allows me to write expressions that are stored in those cubes, allowing the users to include them in their reports. The look and feel of MDX is quite similar to that of formulas in an Excel spreadsheet.
Why use MDX?
Meaning, why not just use T-SQL?
It turns out that while T-SQL is efficient enough for manipulating raw data in an OLTP-style database, it quickly proves a very clumsy and cumbersome language when we want to query the information stored in an OLAP-style database in a meaningful way. While SQL code is great for retrieving specific bits of data, when it comes to creating reports with complicated calculations, things tend to get very crazy, very fast.
Truth be told, T-SQL has improved quite a lot these past few years. In recent releases we’ve seen goodies such as new Windowed and Statistical functions that simplify old-age reporting challenges such as moving averages. However, with all of its improvements, T-SQL still tends to require a lot more code than MDX for the same end.
This comes mainly from an inherent advantage of OLAP cubes themselves, where MDX is used. OLAP cubes understand the data. Specifically, OLAP cubes understand how each entity in the data is related to other entities.
What do you mean?
Let’s say we are writing a query to list all sales of a given product.
In T-SQL, we, as programmers, need to specify in the query how sales relate to products, via a JOIN clause. We would also need a SUM and GROUP BY in there. Do you realize what this means? It means we need to say how we want to retrieve that data. This is because SQL Server only understands raw data, it doesn’t know how sales and products are related in the human world.
However, in MDX can we ask straight away for the values we require. We don’t say how to get them. We use no SUM or GROUP BY. The OLAP cube understands how sales and products are related so it takes care of that for us. MDX is built to take advantage of this.
How did MDX come about?
The MDX language was initially introduced by Microsoft in 1997 as part of the OLE DB for OLAP specification.
This is was followed up by Microsoft releasing OLAP Services 7.0 in 1998, the precursor to what we now know as SQL Server Analysis Services, the first version of which was launched in 2000.
What was remarkable in the history of MDX was that, despite starting as a Microsoft proprietary specification, it proved to be so effective in its purpose, that it was adopted by the rest of OLAP developers in the market. Vendors such as MicroStrategy, Pentaho, Oracle, SAP, Cognos, Business Objects, Crystal Reports, quickly started using MDX in their own products too. MDX became a de-facto standard by its own merit.
So when did MDX become an official standard?
It didn’t take very long at all. A mere three years after MDX was introduced, the XMLA specification was created, which proposed a standard way of communicating with OLAP databases and officiated MDX as its query language. The XMLA specification was founded by Microsoft, Hyperion and SAS and is backed by more 25 other companies in the OLAP business.
How Can We Use MDX?
Great question! There are two major ways we can use MDX. We can use it to write expressions and we can use it to write queries.
We can think of writing MDX expressions as writing formulas on an Excel spreadsheet, the Cube being the spreadsheet itself.
MDX expressions allow us to store complicated business logic in a single place – the OLAP cube. By storing complicated calculations in a single spot, we achieve two useful goals:
- Ease the report developer’s life by providing that complicated calculation off-the-shelf.
- Ensure each user sees the same information, instead of each having to rely on their own, possibly different, implementation.
Using MDX to write expressions in an OLAP cube makes it easier for us to enforce the single version of the truth mantra of OLAP deployments.
We can also use MDX to write stand-alone queries against an OLAP cube. In this regard, we can use MDX pretty much in the same way as we would use SQL against a regular database.
What’s cool about MDX is that while we can write custom expressions in the query itself, we can also make use of MDX expressions already stored in the OLAP cube, saving us from having to write them up again from scratch. Not only that, but when those expressions are updated, our queries will transparently inherit any new behaviour as well. Single Version of the Truth right there!
In the next few pumpkins, I will explore how MDX compares with T-SQL, what makes them similar and what makes them different.
Next Pumpkin: Coming Soon!
Pumpkin Index: Learn / MDX