As SQL developers, we develop a well-defined understanding of relational data structure over the years that makes it feel natural to manipulate relational data using the SQL query language. Switching to MultiDimensional eXpressions (MDX) queries can be painful if some basic differences between SQL and MDX are not clearly understood. In this article, I will explain those differences from a SQL developer point of view and illustrate via some examples.
OLTP and OLAP Databases
In OLTP databases, we are familiar with lookup tables and transaction tables; in the world of OLAP databases, their equivalent are dimensions and measures. SQL queries are written against an OLTP database, and the queries contain the elements that define how to join lookup tables and transactions tables. Similarly, MDX queries are written against an OLAP database, and the relationship among the various dimensions and measures are defined as part of the OLAP database design itself. Hence when we write MDX queries, it doesn’t contain any elements to describe relationships between dimensions and measures.
MDX Query Structure and Terminology
The below figure illustrates overall MDX query structure. MDX query will have a SELECT clause followed by one or more AXIS and FROM clauses. Optionally, there can be one or more named MEMBER and/or a SET defined prior to the SELECT statement; starting with the key word WITH. Named members in MDX is equivalent to a calculated column in SQL, and named set is like a SQL view. Named member/set can be used in the same statement in which they are defined. Alternatively, they can be defined as part of an OLAP database design and can be utilized across other MDX queries. An optional WHERE clause can also be added to filter the data.
Each AXIS produces one set. A Set is made up of one or more Tuple/Tuple sets. A Tuple is made up of one or more Attributes (from dimensions and/or Measures) just like a SQL statement contains one or more columns. A Tuple that returns more than one row is called a Tuple Set.
MDX Rules
As SQL developers, we need to keep in mind certain basic differences between writing MDX queries and SQL queries. We will review some key terms and then summarize various differences as MDX rules. We will use the following Product and Measures dimension from the AdventureWorks2012 sample OLAP database:
Attribute Hierarchy
Attributes in dimensions are analogous to columns in SQL database tables. In Product dimension, as shown above, Category, Large Photo and Model Name are some of the attributes. These attributes are populated with data from their corresponding columns in the underlying OLTP table. Each Attribute in a dimension has its unique hierarchy name.
User-Defined Hierarchy
An OLAP database allows us to create a user-defined hierarchy from the existing attributes of a dimension. As you see in the Product dimension above, one of the user-defined hierarchies is Product Categories and is made up of Category, Subcategory and Product attributes. Attributes in the user-defined hierarchy become levels underneath it and share the same hierarchy as the root. In the above example, Product Categories has its unique hierarchy name (Product Categories), and all levels (Category, Subcategory, Product) underneath share the same hierarchy.
The Measures Dimension
This dimension in the OLAP database is a single hierarchy dimension. All the measures in the Measures dimension belong to the same single hierarchy name.
Now that we went over some background concepts and terms, we are at an appropriate point to talk about MDX rules. Note that I will use attribute and attribute hierarchy interchangeably.
Rule #1: The same attribute Hierarchy cannot be repeated within the same Tuple.
In the MDX query below, the attribute Category is repeated within the same tuple — hence it will throw an error when executed:
SELECT { ( [Product].[Category].Children, [Product].[Subcategory].Children, [Product].[Product].Children, [Product].[Category].Children ) } ON AXIS (0) FROM [Adventure Works]
In the next MDX query below, the Category attribute is repeated within the same tuple but statement will execute successfully. The reason is [Product].[Product Categories].[Category] belongs to hierarchy [Product].[Product Categories] and NOT to [Product].[Category], thus no same attribute hierarchy is repeated within the same tuple:
SELECT { ( [Product].[Category].Children, [Product].[Subcategory].Children, [Product].[Product].Children, [Product].[Product Categories].[Category].Members ) } ON AXIS (0) FROM [Adventure Works]
To clarify this further, the execution of the MDX query below will result in an error message suggesting [Product].[Product Categories] is used more than once. As discussed earlier, all the attributes in a user-defined hierarchy share the same root hierarchy, which in this case is [Product].[Product Categories]:
SELECT { ( [Product].[Product Categories].[Category].Members, [Product].[Product Categories].[Subcategory].Members, [Product].[Product Categories].[Product].Members ) } ON AXIS(0) FROM [Adventure Works]
Rule #2: A Set with multiple tuples needs to follow UNION operator-like rules in SQL (i.e. the number and order of the attributes used in multiple tuples within the same set should match).
The below MDX query with empty set on AXIS(0) has a set with two tuples on AXIS(1). Both the tuples fulfill the UNION operator-like rule in the sense that both have same number and type of attributes across tuples, and thus will execute successfully:
SELECT {} ON AXIS(0), { ([Product].[Category].Members), ([Product].[Category].Members) } ON AXIS(1) FROM [Adventure Works]
The below MDX query will also execute successfully because multiple tuples in the set have the same hierarchy structure — only the member filtering value differs, which is allowed:
SELECT {} ON AXIS(0), { ([Product].[Category].[Bikes]), ([Product].[Category].[Accessories]) } ON AXIS(1) FROM [Adventure Works]
The below MDX query will result in an error when executed because tuples in the same set have a different hierarchy structure, or we can say a different dimensionality:
SELECT {} ON AXIS(0), { ([Product].[Category].Members), ([Product].[Subcategory].Members) } ON AXIS(1) FROM [Adventure Works];
In the below query, order of attributes used across multiple tuples in the same set on AXIS(1) doesn’t match, hence it will result in an error:
SELECT {} ON AXIS(0), { ([Product].[Category].Members, [Product].[Subcategory].Members, [Product].[Product].Members), ([Product].[Category].Members, [Product].[Product].Members, [Product].[Subcategory].Members) } ON AXIS(1) FROM [Adventure Works];
Similarly, in the below query the number of attributes used across multiple tuples in the same set don’t match on AXIS(1), hence it will return error when executed:
SELECT {} ON AXIS(0), { ([Product].[Category].Members, [Product].[Subcategory].Members, [Product].[Product].Members), ([Product].[Category].Members, [Product].[Subcategory].Members) } ON AXIS(1) FROM [Adventure Works]
Below are three examples, they use attributes from the user-defined hierarchy [Product].[Product Categories].
Example #1 returns an error because the set on AXIS(0) is made up of single tuple where same attribute hierarchy is getting repeated within the same tuple. Example #2 executes successfully because the set on AXIS(0) is made up of three tuples (notice the tuple identifier brackets) all of which belong to the same hierarchy of [Product].[Product Categories]. Example #3 also runs successfully as it is syntactically the same as Example #2, because when no tuple identifiers are added each attribute is considered as one tuple, hence that is also treated as a set with three tuples:
Example #1 returns an error:
SELECT { ( [Product].[Product Categories].[Category].Members, [Product].[Product Categories].[Subcategory].Members, [Product].[Product Categories].[Product].Members ) } ON AXIS(0) FROM [Adventure Works];
Example #2 runs successfully:
SELECT { ([Product].[Product Categories].[Category].Members), ([Product].[Product Categories].[Subcategory].Members), ([Product].[Product Categories].[Product].Members) } ON AXIS(0) FROM [Adventure Works];
Example #3 also runs successfully:
SELECT { [Product].[Product Categories].[Category].Members, [Product].[Product Categories].[Subcategory].Members, [Product].[Product Categories].[Product].Members } ON AXIS(0) FROM [Adventure Works];
Rule #3: Attributes used in one Axis cannot be repeated in any other Axis or in the WHERE clause in the same query.
In the below MDX query, attribute [Product].[Category] is repeated in the WHERE clause, hence it will return an error message saying Category hierarchy already appears in AXIS(0):
SELECT { ([Product].[Category].Members) } ON AXIS (0) FROM [Adventure Works] WHERE [Product].[Category].[Bikes];
This different variation of the above query executes successfully. The reason is we are using the different attribute hierarchy in the WHERE clause to achieve the filtering, instead of repeating the attribute used in AXIS(0):
SELECT { ([Product].[Category].Members) } ON AXIS (0) FROM [Adventure Works] WHERE [Product].[Product Categories].[Category].[Bikes];
In the two examples below, the first one returns the error message that Product Categories hierarchy already appears in the AXIS(0), as all attributes under the user-defined hierarchy share the common root hierarchy. (In this case that’s [Product].[Product Categories].) The second example runs successfully because the two attributes used on each AXIS have their own unique hierarchy.
Example #1 returns an error:
SELECT { ([Product].[Product Categories].[Category].Members) } ON AXIS(0), { ([Product].[Product Categories].[Subcategory].Members) } ON AXIS(1) FROM [Adventure Works];
Example #2 executes successfully:
SELECT { ([Product].[Category].Members) } ON AXIS(0), { ([Product].[Subcategory].Members) } ON AXIS(1) FROM [Adventure Works];
Lastly, let’s look at the following two MDX query examples. As we noted earlier, Measures dimension is a single hierarchy dimension where all the measures belong to the same Measures hierarchy. The query in Example #1 below will error on execution because the same attribute hierarchy is getting repeated within the same tuple, hence violating Rule #1. In the second example, the same attribute hierarchy is getting repeated across the sets on AXIS(0) and AXIS(1). This essentially means that the measures can be used only in a single AXIS or in the WHERE clause:
Example #1 returns an error:
SELECT { ([Measures].[Internet Order Count], [Measures].[Internet Sales Amount]) } ON AXIS (0) FROM [Adventure Works]
Example #2 returns an error:
SELECT { ([Measures].[Internet Order Count]) } ON AXIS (0), { ([Measures].[Internet Sales Amount]) } ON AXIS (1) FROM [Adventure Works]
Conclusion
If your background is n SQL queries, it is important to know the basic differences between SQL and MDX queries. In this article, we first talked about related concepts and terminology and followed up with some examples to have a productive experience in writing MDX queries. You can find more detailed exercises and presentations on related topics at my personal blog site www.doSQLBI.com.
AIS is hiring experienced SQL developers! Click here for our current list of opportunities.