Agile Dimension Modeling – Part I

Agile and rapid are some of the most frequently used buzzwords in the data warehousing community these days, so I thought it was about time to put them in the context of timeXtender. Those who know our tool and company are well aware that timeXtender was created from day one for agile implementation of ETL and data warehouses, and that it is a rapid development tool. Just to set expectations, this post and the following ones will not attempt to cover any specific agile methodology, such as scrum, in full, nor will they be a complete treatment of the Kimball dimensional modeling approach.

Agile methodologies

In software development, several iterative and incremental methodologies are commonly referred to as agile software development methodologies, including scrum and extreme programming. In agile methodologies, teams are usually cross-functional and most often include a customer or business representative. There is no fixed team size, but teams are generally quite small, typically three to five people.
User stories are one of the artifacts commonly used in agile development, and one that I will emphasize for use with agile dimensional modeling.

Dimensional modeling

Kimball encourages a four-step dimensional design process, in which the same four decisions are always made in the same order. In short, the steps are:

  1. Select the business process to model
  2. Define the grain of the fact table
  3. Define the dimensions for the fact
  4. Define the numeric measures of the fact

Consistency in the model, achieved through conformed dimensions and facts, is key to a scalable data warehouse. The star schema (or schemas) that results from the dimensional design process is usually considered the smallest deliverable, and often represents several days or weeks of work for developers to build.
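
As a rough illustration only, the four steps and the idea of a conformed dimension can be written down as a small Python sketch. This is not timeXtender functionality and is not taken from Kimball's material; the retail sales and inventory examples, the class names, and the shared_dimensions helper are all hypothetical and exist purely to make the steps concrete.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dimension:
    """A dimension table, reduced to a name and its descriptive attributes."""
    name: str
    attributes: List[str]


@dataclass
class FactTable:
    """A fact table described by the outcome of the four design steps."""
    business_process: str                                       # step 1
    grain: str                                                  # step 2
    dimensions: List[Dimension] = field(default_factory=list)   # step 3
    measures: List[str] = field(default_factory=list)           # step 4


def shared_dimensions(a: FactTable, b: FactTable) -> List[str]:
    """Names of dimensions defined identically in both fact tables.

    Identical definitions are used here as a simple stand-in for
    conformed dimensions (hypothetical helper).
    """
    return [d.name for d in a.dimensions if d in b.dimensions]


# Dimensions defined once and reused across fact tables (illustrative names).
date_dim = Dimension("date", ["calendar_date", "month", "year"])
product_dim = Dimension("product", ["product_name", "category"])
store_dim = Dimension("store", ["store_name", "region"])
warehouse_dim = Dimension("warehouse", ["warehouse_name", "region"])

# The four steps, applied in order to a hypothetical retail sales process.
sales = FactTable(
    business_process="retail sales",                      # 1. business process
    grain="one row per product per sales transaction",    # 2. grain
    dimensions=[date_dim, product_dim, store_dim],        # 3. dimensions
    measures=["quantity_sold", "sales_amount"],           # 4. numeric measures
)

# A second star schema that reuses the date and product dimensions.
inventory = FactTable(
    business_process="inventory snapshot",
    grain="one row per product per warehouse per day",
    dimensions=[date_dim, product_dim, warehouse_dim],
    measures=["quantity_on_hand"],
)

print(shared_dimensions(sales, inventory))
```

Because both fact tables reuse the same date and product definitions, the two star schemas can be compared along those dimensions, which is the kind of consistency that conformed dimensions are meant to provide.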

Agile + Dimensional Modeling

The ever-changing requirements for data warehouses today make it natural to adopt elements from agile methodologies in order to keep up. If you are not able to deliver in very fast, short iterations, chances are that your dimensional models will be outdated before they are implemented.
I do not remember a single project where the customer did not have a few requirements that were key to the business and represented urgent needs. It usually does not take much analysis and discussion to agree that these early identifiable requirements should be implemented first.
It is about getting actionable information as fast as possible, mainly because it serves an urgent business need, but also because in today's economy projects are under pressure to deliver return on investment faster than ever before.

Agile + Iterative

Agile and iterative really go hand in hand, as agile development promotes a set of techniques that use iterative development for rapid delivery. As agile traditionally focuses only on the development part of the data warehouse lifecycle, I suggest focusing on being agile rather than on using agile. This includes being agile and using iterative techniques in the requirements specification phase as well.
At a high level, the agile approach can be illustrated by the following:

[Figure: Iterative Lifecycle]

In the next few posts, I will introduce user stories and prototyping around dimensional modeling as a way to remain true to a scalable dimensional model while using a "deliver fast, deliver often" approach.