When designing a data warehouse or reporting architecture, the smallest questions can pose some of the biggest challenges. Identifying what may appear to be an obvious term or metric can slow down or even derail a business intelligence initiative. Early in my career as a BI professional, I was faced with just such a scenario when trying to address a simple question: “What is a day?”
A Small Question
Years ago, I was working for a healthcare company during a major system implementation. This two-year project was incredibly challenging and at times physically taxing, with 90+ hour weeks during the final few months leading up to the cut-over to the new system. As that initiative wrapped up, I was given a new set of responsibilities: Build out a new data warehouse and reporting architecture to help the business make the most of our new system. I set out to interview the users who would be the primary data consumers, to learn about their needs and expectations and get a sense of how they expected to use the data.
Although I was a bit out of my element – I was still relatively green in the world of BI – I used what I gleaned from those user interactions to construct an early version of a data warehouse. In this early proof-of-concept I had accounted for the big-picture design and some of the more complex technical components, so I was surprised when the first big hurdle of the project was a simple but fundamental question that had not yet been fully answered.
What Is A Day?
Once the initial prototype was built, my team began testing with real data against existing metrics that had already been validated. Excitement about the new data warehouse quickly turned sour when our testing revealed that many of those metrics did not match up. We discovered that our data validation tests were accurate when slicing or filtering across all dimensions except date. Even then, the data was only incorrect when reporting against certain departments. After double-checking the date filtering logic, we were left with an unsettling conclusion: The question, “What is a day?” had different answers for each business unit.
Most of us envision a day as the 24-hour period between midnight and the following midnight, and typically data is stored with a date/time stamp (or sometimes just a date) to support this. However, what we found during the data validation phase of this project was that some business units look at a day differently than simply the 24 hours between midnights. Ask someone in a healthcare setting how they define a day, and you may get an answer below:
- The 24-hour period beginning at midnight and ending just before the previous midnight
- A 24-hour period which begins at the outset of the business day
- Any continuous 24-hour period, regardless of whether it spans more than one calendar day
- Any duration of time, which may be repeated to produce two or more “business days” per day (such as if a patient has multiple outpatient therapy visits in the same calendar day)
- Three consecutive eight-hour shifts, or two consecutive 12-hour shifts
Answering the Question
Given that the care of a single patient in a single day can traverse multiple departments – emergency, inpatient, imagining, and therapy, among others – clearly setting the boundaries as to what constitutes a day with respect to patient care is complicated. Departments such as billing and staff capacity planning complicate this formula. Because these support services are often day-centric in their metrics, reconciling these overlapping data points becomes a multi-front initiative.
Sometimes, answering these fundamental questions is a matter of just getting everyone to use the same definition of a day (or hour, or project, or widget). Other domains, such as patient care, aren’t as easily rebranded. The data and applications through which it flows must adapt to how patient care is done, not the other way around. In the project I mentioned above, none of the definitions of a day were incorrect when taken in context. Therefore, it was up to us as data professionals to adapt the analytics solution to reflect the needs of the respective patient care and business units.
Making It All Work
There is no one-size-fits-all solution to data challenges such as these. However, there are practices I recommend to help you better understand the problem you’re trying to solve and help get others on board with the solution you eventually come up with:
- Make sure you understand how the business uses their data. Data professionals, especially those of us working in business intelligence and analytics, must spend time with key users to get to know their workflow and their challenges. You can’t effectively offer help or advice without first understanding the use cases and pain points.
- Get more specific in your definitions. As I mentioned, none of the definitions of a day in the project above were incorrect. Part of the solution to this particular problem was that we got more specific with what we meant by a day, and built reporting concepts including “business day”, “inpatient day”, and “outpatient treatment episode”. This allowed each clinical and business unit to maintain their own operational reports while building a bridge to allow better cross-domain analysis (such as charges per day for patients who move from one clinical unit to another in a single day).
- Communicate the challenges. You don’t necessarily have to teach set theory to nontechnical personnel, but an effective problem-solver can break down a technical or architectural challenge into concepts that can be easily understood by users and clients. In clearly communicating the challenges presented by how the data is expected to be consumed, you can build allies in addressing the data challenges and potentially solicit suggestions for improving the design or workflow.
- Be flexible. You may have some of your ego built into the design you’ve crafted – and that’s normal. Just don’t be hypersensitive to feedback or potential changes. Remember, the consumers of the data ultimately decide whether the project is a success.
- Document and educate. Whether you change the underlying data or the reporting and analytics layer on top of it, be sure that any adjustments to the way data is represented are fully documented in a language and format that is easily understood by and accessible to those who will use the data on a regular basis. Spend the time to train those impacted by any change.
Conclusion
In the biggest of data solutions, it’s sometimes the smallest of questions that cause the most excruciating headaches. When you discover a disconnect in a fundamental definition such as, “What is a day?”, get in front of it as soon as possible! Delay only costs time and money later. Work with those users and business units who would be impacted by any change to how the data is represented, and be sure to communicate clearly and document the technical and nontechnical aspects of the answer everyone agrees to.
Some of those challenges can be solved with a robust date dimesnion while other are unfortunately (or fortunately) buried in the fact. Google the “Two Midnight Rule” and ask yourself what kind of a monkey wrench that would have thrown into your solution….