Sunday, November 6, 2011

Add PowerPivot to your Excel now!

"Don't be too proud of this technological terror you've constructed. The ability to destroy a planet is insignificant next to the potential of the Force." - Darth Vadar, Star Wars

Ok, so Microsoft hasn't created the means to destroy a planet. They have elevated Excel's usefulness. With PowerPivot for Excel, analysts can sift through millions records, merge multiple data sources, and create a data cube, all within Excel. 



PowerPivot for Excel is a free add-in from Excel that gives Business Intelligence tools to Excel Power users. I say power users because PowerPivot isn't for the average Excel user. If you don't understand pivot tables or dimensional modeling, then you'll have a steep learning curve. If you're like me, warnings such as mine only make you more curious. You can download PowerPivot from Microsoft. After downloading the 91 Megabyte file, run the setup executable file. Then, open Excel's developer ribbon and select the add-in icon to add PowerPivot to your Excel environment. 


Once installed, you'll be able to view a new ribbon and dialog (see below). In a nutshell, you pull in data from Oracle, text files, SSRS 2008 reports, or a host of other sources inside the PowerPivot window. Then, you can relate each table by using a common key in both tables (without writing a single SQL statement). You have the option of displaying the data in a pivot table, pivot table, both, or a flattened pivot table (a glorified table). That's it. Instead of using vlookups and sumifs functions, PowerPivot allows you to express your data in a pivot table (just like you would if connected to a SQL Server SSAS data cube). 




PowerPivot version 1.0 was released with SQL Server 2008 R2 in May 2010. A newer version will be released with SQL Server 2012 in early 2012. Check out what's new in PowerPivot 2012 here.

When I happened upon an Excel blog that mentioned PowerPivot, I was captivated by the claim it could handle millions of records. It turns out that it can do so much more. 


Microsoft states that with PowerPivot, one can:

  • Take advantage of familiar Excel tools and features
  • Process massive amounts of data in seconds
  • Load even the largest data sets from virtually any source
  • Use powerful new analytical capabilities, such as Data Analysis Expressions (DAX)
  • Make the most of multi-core processors and gigabytes of memory

There are a few gotchas with PowerPivot:

  • PowerPivot only works in Excel 2010. If you send your workbook to someone without Excel 2010, then they won't be able to interact with your pivot table based on PowerPivot data.
  • In the corporate environment, it's best to install PowerPivot for SharePoint 2010 so you readily share your PowerPivot models with others. Microsoft is making a very good case for organizations to purchase SharePoint. 
  • If your organization uses Microsoft terminal servers (i.e. Citrix) to deploy windows to its employees, you'll need to do extra homework to get PowerPivot working.
  • To merge data from multiple systems (for example Oracle and SQL Server), you must ensure the keys used to relate the tables have the same data type. Oracle and SQL Server don't define numbers the same way. It's best to import your data into PowerPivot as text and convert to a number within PowerPivot.  
  • To create a data cube, you must understand dimensional modeling. Without such knowledge, DAX functions won't make much sense. 
    • Design your data structure first, then pull in the data you need. It begins with one fact table and multiple dimension tables.
    • Slicing data becomes tricky when you have multiple dimensions in a single pivot table
  • Key concepts to understand: filter context, how to change filter context, row context, and the 35 new DAX functions.
  • Version 1.0 isn't truly stable. It works...most of the time. I've experienced a few occasions where Excel claims it can't open the PowerPivot data. Sometimes, Excel can't open the PowerPivot ribbon. If you migrate from SQL Server 2005 to SQL Server 2008 R2, you'll need to re-install PowerPivot. 

The gotchas can be overcome. There is active community of power users and Microsoft most valuable professionals who blog about PowerPivot everyday. 


Monday, October 24, 2011

BI tools for data analysis

I recently performed a Google search on selecting Business Intelligence (BI) software tools. The first prominent link led me to read the white paper, How to Choose the right Business Intelligence Technology, I like how the article lays out the various styles of analysis. It goes on to describe the Microsoft tools needed to implement the each style. 


The article got me thinking about my quest to define the role of the analyst. Reading the blog "Reporting vs. Analysis: What’s the Difference?", help formalize my thoughts. Check out Brent Dykes's key thoughts on the differences between reporting and analysis:

Data Reporting is...
  • Organizing data into informational summaries 
  • following a push approach, where reports are pushed to users who are then expected to extract meaningful insights and take appropriate actions for themselves (i.e., self-serve) 
  • Outputs: canned reports, dashboards, and alerts (reports sent to users based on a triggering event) 
  • providing no or limited context about what’s happening in the data. In some cases, the end users already possess the necessary context to understand and interpret the data correctly. 
  • not going to answer the “so what?” question on its own
Data analysis is...
  • Exploring data and reports in order to extract meaningful insights 
  • Following a pull approach, where particular data is pulled by an analyst in order to answer specific business questions. 
  • Outputs: Ad hoc responses (reports) and Analysis presentations (a comprehensive, deep-dive analysis) 
  • Providing context which is critical to good analysis. In order to tell a meaningful story with the data to drive specific actions, context becomes an essential component of the story line. 
  • Emphasizing data points that are significant, unique, or special - and explain why they are important to the business 
Upon retrospect, I would add one more style of BI reporting to their list of 5 styles.I call it:

Business Analysis
This style is performed by a business/clinical/health data/data analyst with the purpose of answering a business question that leads to action. The analysis must sniff out the business question through questioning and discovering the appropriate business context. The end user will receive a written analysis with supporting charts and tables. The analyst will need tools that allow to him/her sift through sometimes millions of rows and find a meaningful summaries of this data.

Tool Options to support Business Analysis

MS Excel and Power Pivot for MS Excel

The most versatile tool in Microsoft's arsenal continues to be MS Excel. Excel can consume SQL data extracts, data cubes, and external data from vendors. Utilize various built-in functions and add-ins to perform analysis on the data points. With the Power Pivot add-in installed, Excel merge disparate data sources with relative ease. You can even create your own data cube of the source data.

MS Word
Yes, Word is a tool. One needs to express the end results of analysis in a coherent and organized fashion. Word provides to means to accomplish this.

MS Outlook
I know that it is strange to list Outlook as a BI tool. Here is my thought, rarely is an analysis so compelling that it leads to immediate action. Part of the analsyst's job is to follow-up with the business user. If no action is taken then the analysis was neither useful or needed. Utilizing Tasks and Calendar events to track follow-up conversations is vital to a successful analysis.

Describe impact on an organization
I've heard "we have lots of data but no information". In order to answer a business question, one must transform data into a meaningful story. Given the right data and the appropriate tools, an analyst can create a compelling story that can be understood by many.

I find myself travelling down this path toward insights and actionable data. After walking down the reporting path for many years, this will be a major paradigm shift.