Sunday, August 12, 2012

What’s the deal with PowerPivot?


“Insanity: doing the same thing over and over again and expecting different results.”Albert Einstein

I’ve struggled for months to not only understand PowerPivot but to make use of it in my daily healthcare analytics. I read the same Microsoft hoping to uncover the “magic” of DAX. As I dive deeper into Microsoft’s BI stack, PowerPivot keeps coming up.
I now realize that there are two core concepts that I must master:
· The various data models supported by PowerPivot (Check out Data models in PowerPivot)
· The practical use of “Filter Context” (Check out PowerPivot’s Filter Context from a SQL perspective)

Before I write about my new found understanding of PowerPivot, I want to describe why PowerPivot is even needed. In my daily life, I use SQL to extract data from relational databases like Oracle and SQL Server. I use Excel to present that data in an interactive model or to simply pass data to an end-user.
PivotTables had become an after thought before using data cubes. PivotTables make possible to display data cubes in Excel. Mainly PivotTables are analytic tools that allow one to dynamically query data. In SQL terms, PivotTables perform aggregation functions like SUM and COUNT in each cell. This is great until you want to produce a useful metric like ER rates per 1000 or percent compliance rates. There are tricks such as using a PivotTable as a source for another PivotTable. PivotTables aren’t as flexible as a well written SQL statement.
Enter PowerPivot! Now you can,
· Extract data from multiple data sources (text, data cubes, relational databases)
· Create complex formulas that answer real-world questions using DAX
· Prototype new data-sets and pass to cube developers for the creation of robust data cubes.
· Perform Business Intelligence analysis and ad-hoc analysis without knowing MDX (the language of cubes)
· Share your PowerPivot model through SharePoint to allow others to directly connect to it
That’s nice but, what’s the big deal with PowerPivot? Let’s take a typical data-set and a PivotTable
ID
Product
Color
amount
1
Pen
Red
2
3
Pencil
Blue
6
5
Pen
Blue
10
7
Pencil
Red
10
I created the PivotTable below by:
· I created an Excel Table (using my favorite shortcut Ctrl-T)
· Click “Summarize with PivotTable” from the table’s design ribbon
· Dragged Color into Row labels
· Dragged Name into Column labels
· Dragged amount into Values and leaving the “Summarize values as” SUM function
· No Slicer or PivotTable’s report filters were used
You get something like below,
Sum of amount
Column Labels
Row Labels
Pen
Pencil
Blue
10
6
Red
2
10
Not too useful. Let me switch to my SQL analogy. One can think of each cell in the PivotTable as executing a single dynamically created SQL statement.
Column Label (Product)
Row Label (Color)
Pen
Pencil
Blue
select sum(amount)
from data_table
where Color = 'Blue' and Product = 'Pen'
select sum(amount)
from data_table
where Color = 'Blue' and Product = 'Pencil'
Red
select sum(amount)
from data_table
where Color = 'Red' and Product = 'Pen'
select sum(amount)
from data_table
where Color = 'Red' and Product = 'Pencil'

Each of the 4 cells in my PivotTable executes a query from my data_table. In the cell that intersects Pen and Blue, SUM(amount) function will execute only for rows in the data_table where Color = 'Blue' and Product = 'Pen', producing the amount of 10. No matter where I place this PivotTable in my worksheet or the entire workbook, this cell will always execute the SUM(amount) function where Color = 'Blue' and Product = 'Pen'. By adding a field to the report filter or adding a slicer, the “where” clause statement automatically includes that additional attribute.
What’s so great about PowerPivot? It allows you to change the “where” clause statement. This wasn’t possible prior to PowerPivot’s introduction. This is typically the domain of the cube developers who utilize Microsoft’s SSAS and MDX. The door is now open for an Excel Power user or analyst to create useful cube-like data models. Now, I can compute the sum of all items from my data_table and place the result next to individual amounts already in the PivotTable. I can blend my data_table with another table and display a related item in the PivotTable.
In my next post, I’ll discuss the data model concept behind PowerPivot. In order to utilize DAX, you have to understand how to arrange your data.

No comments:

Post a Comment