By Wayne Yaddow
In Part 1 of this series, we described how data warehousing (DW) and business intelligence (BI) projects are a high priority for many organizations. Project sponsors seek to enable more and better data-driven decisions and actions throughout their enterprises; they intend to expand the user base for BI, analytics, and data discovery so that more users can reach informed decisions.
This series on mistakes to avoid focuses on helping organizations sidestep quality issues that are all too common on DW projects. The tips offered here will help DW teams maintain stakeholder satisfaction as they plan and implement new functions and capabilities.
Falling Short in Reviews of DW/BI Project Requirements and Technical Designs
Shift-left quality assurance is an approach to DW development in which quality assurance is performed early and frequently in the lifecycle, to the left of its usual position in the overall project plan. “Shift left” covers both static testing (conducting reviews and inspections) and dynamic testing (Data Quality verifications, unit testing, and integration testing). However, many DW projects have yet to embrace this potentially important process.
Shifting verification to the left in the project lifecycle is an agile practice that runs verifications alongside (or in parallel with) design and development activities. That is, development and QA teams collaborate to plan, manage, and execute tests that accelerate feedback on issues to the business and developers.
When quality assurance tasks are performed earlier in the project lifecycle, DW architects, business and data analysts, and ETL and BI report programmers assume “testing” roles to identify issues. Implemented early, validations focused on business rules and security, and even acceptance tests, can be a tangible benefit for project quality and success.
Spending weeks or months at the end of a release cycle to find and fix issues is inefficient. An IBM study reported that “it is 100 times less expensive to fix a defect early than after release to production.” That calculation alone should get most teams behind shift-left testing.
Another goal of shift-left testing is to catch issues that would otherwise emerge later, even in production. When organizations adopt a shift-left strategy, they can test, analyze progress, and evaluate the system incrementally rather than all at once.
Neglecting to Identify Best Practice Test Processes That Verify Data Transformations and BI Reports
The majority of testing for many (perhaps most) DW projects is performed by running SQL scripts and then gathering the results in spreadsheets for further analysis.
That approach to ETL testing is slow and error-prone. Performing DW tests manually, without dedicated tools to design and manage those tests, may thwart test automation in both the near and long term.
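As a minimal sketch of what moving such checks out of spreadsheets can look like, the fragment below automates a source-to-target row-count reconciliation. It uses Python’s built-in sqlite3 module as a stand-in for a real warehouse connection, and the staging and warehouse table names are hypothetical:

```python
import sqlite3

def compare_row_counts(conn, source_table, target_table):
    # Basic source-to-target reconciliation: do row counts agree?
    src = conn.execute(f"SELECT COUNT(*) FROM {source_table}").fetchone()[0]
    tgt = conn.execute(f"SELECT COUNT(*) FROM {target_table}").fetchone()[0]
    return src, tgt, src == tgt

# Demo with an in-memory database; stg_orders/dw_orders are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_orders (order_id INTEGER, amount REAL);
    CREATE TABLE dw_orders  (order_id INTEGER, amount REAL);
    INSERT INTO stg_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO dw_orders  VALUES (1, 10.0), (2, 20.0);
""")
src, tgt, ok = compare_row_counts(conn, "stg_orders", "dw_orders")
print(f"source rows: {src}, target rows: {tgt}, match: {ok}")
```

In practice the same function would run against the real source and target connections and be wired into a scheduled regression suite, so a count mismatch surfaces automatically instead of in a spreadsheet review.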
Repeated testing is vital to ensuring a high level of ETL Data Quality: the more you test, the more bugs you will discover before going live. This is especially crucial for business intelligence projects, because when users can’t trust the data, adoption of BI or analytics solutions is in jeopardy. Implementing advanced ETL testing processes supports frequent regression and smoke testing.
The decision to implement cutting-edge processes and tools for ETL testing depends on a budget that supports advanced testing requirements. Test tools built and maintained in-house are likely to be better than no test tools at all. Many ETL tool providers offer test solutions directly focused on their ETL tool (e.g., Informatica’s Data Validation Option, DVO).
A few of many notable test processes include:
- Source-to-Target Data Test Verifications: Record counts, duplicate data checks, data transformation tests, regression testing, smoke testing
- Load-Testing Processes: Tests that realistically simulate an appropriate number of users to validate application infrastructure for performance, reliability, and scalability
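The duplicate-data check named in the first bullet can likewise be automated with a short script. This sketch again uses Python’s built-in sqlite3 as a stand-in for a warehouse connection, with a hypothetical customer table and key column:

```python
import sqlite3

def find_duplicates(conn, table, key_columns):
    # Duplicate-data check: return key values that appear more than once.
    cols = ", ".join(key_columns)
    sql = (f"SELECT {cols}, COUNT(*) FROM {table} "
           f"GROUP BY {cols} HAVING COUNT(*) > 1")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dw_customers (customer_id INTEGER, name TEXT);
    INSERT INTO dw_customers VALUES (101, 'Ann'), (102, 'Bob'), (101, 'Ann');
""")
dups = find_duplicates(conn, "dw_customers", ["customer_id"])
print(dups)  # [(101, 2)] -- customer_id 101 was loaded twice
```

Because the check is parameterized by table and key columns, the same function can be reused across every target table in a regression run.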
Enlisting Third-Party and In-House Testers with Inadequate DW Testing Skills
The impulse to cut DW project costs is often intense, especially in late project phases. A common error is to delegate numerous testing responsibilities to resources with limited business and data testing skills. DW project leads and other hands-on testers are frequently expected to demonstrate extensive experience designing, planning, and executing database and data warehouse test cases.
In recent years, DW projects have trended toward relying on business analysts, ETL developers, outsourced testers, and even business users to plan and conduct much of the QA process. Doing so can be risky. The following are among the skills frequently required for DW testing:
- An in-depth understanding of the business, DW technical requirements, business processes, and business terminology
- An ability to develop strategies, test plans, and test cases specific to data warehousing and the enterprise’s business
- Skills for deploying code to databases
- Skills for troubleshooting ETL sessions and workflows
- A firm understanding of DW and database concepts
- Advanced skills with SQL queries and scripting
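To illustrate the SQL skill level involved, one common DW test pattern is a source-minus-target comparison: apply the documented transformation rule to the source data and diff it against the loaded target. The sketch below uses SQLite’s EXCEPT operator; the tables and the cents-to-dollars transformation rule are hypothetical examples:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (id INTEGER, amount_cents INTEGER);
    CREATE TABLE dw_sales  (id INTEGER, amount REAL);
    INSERT INTO stg_sales VALUES (1, 1000), (2, 2550);
    -- ETL rule under test: amount = amount_cents / 100.0
    INSERT INTO dw_sales VALUES (1, 10.0), (2, 25.0);  -- row 2 loaded wrong
""")
# Apply the transformation to the source, then diff against the target:
# any row returned is present in the transformed source but not the target.
mismatches = conn.execute("""
    SELECT id, amount_cents / 100.0 FROM stg_sales
    EXCEPT
    SELECT id, amount FROM dw_sales
""").fetchall()
print(mismatches)
```

A tester fluent in set-based SQL can express most transformation verifications this way, without row-by-row spreadsheet comparisons.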
Data warehousing projects can fail for many reasons: poor Data Architecture, inconsistent data definitions, lack of features to combine data from various data sources, missing or inaccurate data values, inconsistent use of data fields, unacceptable query performance, and so on.
Most crucial project risks and failures are reduced when well-trained and experienced testers provide ongoing support from the earliest phases of development through project completion.