
Governed Self-Service Analytics, People, and Processes

By Aditi Raiter


Self-service BI is lucrative but demands an eye on governance and security. In this article, I will stress the behaviors and strategies needed around managing self-serve BI/reporting tools.

Self-service analytics revolves around business users having access to data preparation and dashboard-building tools. There are many challenges when self-service BI tools are open to all users in an enterprise. In the article The Impact of Data Governance on Self-Service Analytics, the author rightly calls out common risks: multiple data models with unnecessary changes, multiple truths and analysis outcomes, security audit failures, BI systems maintenance nightmares, etc. In the article Safeguarding Against the Risks of Self-Service Data Preparation, the author emphasizes the impact such models have on data quality and data security. When these solutions are made too flexible they become critically vulnerable; they can even lead to enterprise data being stored on a server under a power user's desk!

Governed self-service models revolve around IT being in charge of data model builds and data preparation while business users are in charge of building reports. IT creates a centralized reporting database environment and enhances it with new data elements as new business needs arise. Business users build reports on that shared data model, which reduces wait times and improves flexibility. Below is a commonly used enterprise data flow diagram that depicts the data flow in an organization. Business power users can build reports when they have access to flattened-out, documented reporting data sets. A reporting warehouse is dedicated to every business line, with a documented database design.

Figure 1
Image Source: Aditi Raiter
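To make the idea of a flattened, documented reporting data set concrete, here is a minimal sketch of how a power user might pull such a data set into a dataframe. The connection string, schema, and view name are hypothetical placeholders, not part of the architecture in the diagram above.

    # Minimal sketch (hypothetical names): a power user reads a flattened,
    # documented reporting view maintained by IT into a dataframe.
    import pandas as pd
    from sqlalchemy import create_engine

    # Read-only connection to the business line's reporting warehouse (placeholder DSN)
    engine = create_engine("postgresql://readonly_user:secret@reporting-warehouse/logistics")

    # IT owns the view definition; power users only query it
    orders = pd.read_sql(
        "SELECT order_id, customer_name, warehouse_id, receipt_ts, putaway_ts "
        "FROM reporting.order_putaway_flat",
        engine,
    )
    print(orders.head())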

Some cooperative strategies allow reporting tools to connect directly to ERP/CRM standby environments. The question is: Is this the right long-term approach? Even though standby environments can absorb some of the data retrieval load, they are not meant to be used as a playground for self-service BI tools. They are not reporting replicas!

The success of a governed self-service model is also about the processes needed to coordinate people's roles. According to Dan Madison's book Process Mapping, Process Improvement, and Process Management, 85 percent of organizational problems fall into the categories of process, control mechanism, or structure, with the bulk of those in the process category. Some companies pay no attention to governance processes for their self-service BI tool because there is no easy ROI to justify the value of these approaches. They let power users build reports without verification and rely on IT to later fix reports if the numbers seem strange. On the other hand, a few organizations have governance processes so stringent that they make it hard for power users to build reports, discouraging them from creating their own. A governed self-service model is the way to go for most of the world's leading companies. In How to Succeed with Self-Service Analytics, the author describes how to set up a report governance process.

The intent here is to curate the right balance between flexibility and governance. The right reporting governance framework must be able to support data discovery, data security, data quality, audits, and change management on an interface that has access to flattened-out reporting datasets for power users. The platform must also be rewarding to power users so that they are proud of what they can do and feel accomplished.

The platform description starts off by defining user types.

Image Source: Aditi Raiter

The article referenced above stresses the importance of the power business user role in the report-building process. These users also have the authority to raise requests to manage other users on their business teams, such as casual users and guests. Power users have the flexibility to generate their reports on their own timelines, with approvals from the right authority.

Self Service Analytics — Enterprise High-Level Process

Figure 2
Image Source: Aditi Raiter

The above process can be described with a practical example. Joe, our protagonist, has joined a logistics company in the operations department. Joe sets off in this new work environment as a power user to create a Key Performance Indicator (KPI) report for a warehouse process. He is required to build a Receipt of Put-Away Performance (by Date Range) Report that shows detailed information about the average time taken for orders to be put away. To build the above report, Joe’s manager recommends he attend mandatory HR power user training.

1. Attend Mandatory Power User Training

It is a lot to expect operations staff to cultivate data reporting abilities, even when it is as simple as dragging and dropping fields on the screen. In most organizations, these are additional responsibilities outside of their day job. As part of performance management, this type of training is mandated for power users, encouraging system awareness.

Image Source: Timo Elliott

Joe, in our example, begins by attending a training program to understand what reporting means. He gets answers to some of his questions:

  • How can on-demand, real-time reporting help retain a customer?
  • Is a prospective client an existing customer with any other business line?
  • Are they worthy of a gold or platinum service if they are already a customer?
  • Are customer orders shipped on time?

People find data worthy only when they can imagine its benefits.

Joe is introduced to the self-serve reporting process (outlined in Figure 2). He is trained on building reports with the reporting tool and its features, connecting to the reporting database environment for his business line. He also learns about a version control tool that helps him check in and save his work. He now understands the IT term "metadata" and how it is used in the search engines deployed within the company. He also now understands that the IT help desk team is on his side if he needs help.

Joe realizes the importance of his elevated access — if used in the right way, it can help Joe and his teammates answer frequent questions for themselves.

2. Document Reporting Requirements

Another important step for power users is to gather their needs on a standard power user reporting requirements template. This template documents some of the commonly asked requirements for reports, such as:

  1. Report update frequency
  2. Report users and their permission types to view data elements
  3. Summary/detail reports
  4. Additional data elements useful for comparison
  5. Outlier definitions
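As a loose illustration of what such a template might capture, here is a hypothetical sketch in Python; every field name and value is illustrative only, not a prescribed standard.

    # Hypothetical sketch of a power user reporting requirements template;
    # field names and values are illustrative only.
    report_requirements = {
        "report_name": "Receipt of Put-Away Performance (by Date Range)",
        "update_frequency": "daily",
        "users": {
            "power_users": ["joe"],            # build and edit rights
            "casual_users": ["ops_team"],      # run with parameters
            "guests": ["customer_contact"],    # execute-only access
        },
        "level_of_detail": "detail lines with summary totals",
        "comparison_elements": ["warehouse_id", "country", "region", "cluster"],
        "outlier_definition": "put-away time greater than 24 hours",
    }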

Joe, in our example, fills in the standard reporting power user template with data elements for each order line and the selection criteria as outlined in the image below.

Image Source: Aditi Raiter

Joe realizes the following:

  • The template helped him organize his thoughts on requirements.
  • The signed-off requirements document ensured that Joe and his business team were on the same page.

3. Metadata and Report Catalog Search

In the article Business Intelligence Meets Metadata Challenges, the author stresses the growing concern over metadata. Storing data lineage and reporting table content information alone does not give a full picture of metadata.

Metadata is the catalog of data. It shows you where to find the information you need, its source, where it is used, and the formulas it participates in. Metadata can be anything: IT reporting table columns, IT ETL processes that describe transformation rules and job dependencies, data steward MS Excel templates, IT reports and their user bases, business data quality indicators, data content information, IT data models, access permissions, systems, etc. Any documentation that gives more information about data is metadata.

Data catalog tools are a natural fit for self-serve platforms, as they aid in building trust around enterprise data.

Some report catalogs are implemented by storing report metadata in a graph database, where resources such as data tables, dashboards, reports, users, teams, and business outcomes become nodes, and their connectivity reflects their relationships: consumption, production, association, etc. Airbnb's data portal is an interesting success story for this approach. In the piece Disrupting Metadata with Metadata Automation, the author talks about the advantages of metadata automation.
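As a minimal sketch of the graph idea described above, the snippet below models a few catalog resources and relationships with networkx; the node names, attributes, and choice of library are assumptions for illustration, not a specific vendor's implementation.

    # Minimal sketch: report catalog metadata as a graph (illustrative names).
    import networkx as nx

    catalog = nx.DiGraph()

    # Resources become nodes
    catalog.add_node("reporting.order_putaway_flat", kind="table")
    catalog.add_node("Put-Away Performance KPI", kind="report")
    catalog.add_node("joe", kind="user")
    catalog.add_node("operations", kind="team")

    # Relationships become edges
    catalog.add_edge("Put-Away Performance KPI", "reporting.order_putaway_flat", rel="consumes")
    catalog.add_edge("joe", "Put-Away Performance KPI", rel="produces")
    catalog.add_edge("joe", "operations", rel="member_of")

    # Discovery question: which reports already consume this table?
    consumers = [src for src, _, data in
                 catalog.in_edges("reporting.order_putaway_flat", data=True)
                 if data["rel"] == "consumes"]
    print(consumers)  # ['Put-Away Performance KPI']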

Joe, in our example, sets off on a data discovery platform. He uses a metadata tool to see if the report already exists or if an existing one can be enhanced to fulfill his requirements. He finds reports that measure put-away performance as the volume of stock put away per warehouse clerk per hour. This is close to the logic he needs but not the same. On other reports, he notices the right logic for his need, but they are not built for his customer. He begins by testing a similar report, built for a different client, that satisfies his needs. As the database and the tables are similar, he can get in touch with the business team who built the report. Alternatively, Joe has access to the Reporting Warehouse Database Design document, which indicates the tables that could be used to build the report from scratch. Joe chooses to enhance a standard report instead.

Joe realizes:

  • He no longer has to spend days looking for data.
  • He is no longer dependent on IT old-timers whose metadata knowledge comes only from having been partially involved in every project that built reporting tables.

Image Source: Dataedo

4. Collaborate with Other Power Users

Once users are aware of the reports and dashboards already built and have access to the above metadata discovery database, they usually need to collaborate with the teams that built those reports. A collaboration tool can be used to bring together like-minded people who are building the same type of business report. They can interact with the people who have already built, or are in the process of building, a similar report.

This collaboration platform can be as simple as Microsoft Teams or Skype, or a more advanced one that enables sharing analytics dashboards and offers features like the ones listed below:

  1. Different streams for different business lines — a business line lane that could further fork into report groups for that business
  2. Ability to share experiences, best practices, discussions around potential improvements, and hurdles encountered while building a kind of report
  3. Ability to share dashboards and visuals
  4. Ability to view, write, edit, or delete annotations and start new discussions around the data
  5. Option to share dashboards with colleagues or customers who do not have a tool account by sending them a session link via email

Joe, in our example, starts off on a collaboration analytics platform. He sends a request to join the reporting lane for the other client. He reads previous comments on the report experiences. The comments hint that he should get in touch with a certain person on the team. After getting in touch, he learns that the existing report, unfortunately, hardcodes customer-specific SKUs and intended locations.

Joe realizes:

  • The collaborative platform helped him quickly get answers to all his additional questions from the person/team who originally built a similar report.

5. Assistance from IT Help Desk Support Team

Support teams can provide technical assistance for solving a software or hardware-related problem. Part of this group, called the IT help desk, is dedicated to supporting data-related issues/requests as they support IT applications within the organization. Obviously, if documentation is not in place or metadata cannot be found, the task will need to be redirected to core IT teams. Automated metadata tools can help minimize these concerns. IT help desk support teams also have additional access to view data lineage from source systems to reporting platforms.

Data Lineage and Profiling

Most data catalog or ETL tools have visibility into data profiling, which is the process of collecting summaries and statistics of data from the source. It is an audit that details the number of null values in each column, the maximum and minimum value in a field, data quality indicators, etc. Data lineage tools track business data flow from the originating source through all the steps in its lifecycle to its destination. Some tools go as far as keeping time slices of the lineage: if the requirement is to know how a field was calculated at some point in the past, an option to view both previous and current lineage is available. There are also open-source data lineage tools available if Spark is used for pipelines.
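To make the profiling step concrete, here is a minimal sketch in pandas that produces the kind of per-column audit described above; the dataframe and column names are placeholders standing in for a source extract.

    # Minimal profiling sketch: per-column null counts, distinct values, min/max.
    import pandas as pd

    def profile(df: pd.DataFrame) -> pd.DataFrame:
        """Return a per-column summary of the kind a profiling audit produces."""
        return pd.DataFrame({
            "null_count": df.isna().sum(),
            "distinct_values": df.nunique(),
            "min": df.min(),
            "max": df.max(),
        })

    # Placeholder source extract
    orders = pd.DataFrame({
        "order_id": [1001, 1002, 1003],
        "putaway_minutes": [35.0, None, 48.5],
        "warehouse_id": ["ATL1", "ATL1", "DFW2"],
    })
    print(profile(orders))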

Joe, in our example, sets off by raising a ticket to the IT help desk. The support team helps him check the data lineage of the report field across the various systems. They let him know that the report is pulling the right field from the source systems, even though it has some hardcoded SKUs. Joe also verifies that a few orders pulled by the report match the source WMS screens. The support team also confirms that there are no additional transformation rules if Joe builds the same logic for his customer. With the solution now researched, Joe begins preparing standard report development documents for the change approval board. He proposes modifying the existing report, as the board approves a brand-new customer report only when necessary. He updates a standard Report Design template.

Joe realizes:

  • Approaching the IT help desk was a very convenient option to finalize his design. Though their help was not mandatory, the process proved to be beneficial.

6. Center of Excellence (CoE) Report Design Approval

The CoE team validates the proposal for enabling the report cube for both customers. Their intent is to verify whether the report tables suggested meet the design requirements. Some business lines, like Freight Management, have mandatory user profiling rules for shipment visibility. The CoE team determines whether the report satisfies core data visibility authorization concerns. They review the proposed SQL and indexes and consider any improvements suggested for the reporting database.

Joe, in our example, attends the CoE board meeting with the report design template, proposing an additional customer name parameter on the existing report. He submits the SQL used and verifies that the report is performant. Each user group sees its own variant of the report for its customer. The board then approves the report enhancement.
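As a hedged sketch of the kind of change Joe proposes, the snippet below replaces hardcoded, customer-specific SKUs with a bind parameter so one report definition can serve multiple customers; the table, columns, and sample data are illustrative placeholders.

    # Sketch: parameterizing a report query instead of hardcoding customer SKUs.
    import sqlite3

    conn = sqlite3.connect(":memory:")  # stand-in for the reporting warehouse
    conn.execute("""CREATE TABLE order_putaway_flat (
        warehouse_id TEXT, customer_name TEXT, receipt_date TEXT, putaway_minutes REAL)""")
    conn.executemany(
        "INSERT INTO order_putaway_flat VALUES (?, ?, ?, ?)",
        [("ATL1", "Acme Logistics", "2023-01-05", 35.0),
         ("ATL1", "Acme Logistics", "2023-01-06", 48.5)],
    )

    # Before: customer-specific SKUs hardcoded in the report SQL, e.g.
    #   WHERE sku IN ('CUSTA-001', 'CUSTA-002')
    # After: customer name and date range become report parameters bound at run time.
    query = """
        SELECT warehouse_id, AVG(putaway_minutes) AS avg_putaway_minutes
        FROM order_putaway_flat
        WHERE customer_name = ? AND receipt_date BETWEEN ? AND ?
        GROUP BY warehouse_id
    """
    print(conn.execute(query, ("Acme Logistics", "2023-01-01", "2023-01-31")).fetchall())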

Joe realizes:

  • This kind of strict governance often plays an influential role in curbing broad data mapping issues, incorrect data-related decisions, and redundant evaluations. It was an opportunity for effective self-serve analytics as the design team approved (and initiated) database disk space additions for forecasted volumes. Attending a one-hour call was not burdensome red tape, after all!

Image Source: Timo Elliott

7. Report Creation/Enhancement

Some reporting and BI tools come with easy access to online help and community forums. Online help includes sample reports and dashboard demos that can come in very handy. A few BI strategies allow relevant master data to be loaded into the reporting databases so that additional reference/master data details can be shown on reports.

Joe then modifies the existing standard report on the reporting tool’s UAT environment. He chooses to involve other teammates so that they can help with testing. He enhances the KPI report by adding the warehouse id, country, region, and cluster details.
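A minimal sketch of this kind of enhancement is shown below: computing the put-away time per order and enriching it with warehouse master data for country, region, and cluster. The dataframes and column names are toy placeholders, not the actual report definition.

    # Sketch: per-order put-away KPI enriched with warehouse master data.
    import pandas as pd

    fact = pd.DataFrame({
        "order_id": [1001, 1002, 1003],
        "warehouse_id": ["ATL1", "ATL1", "DFW2"],
        "receipt_ts": pd.to_datetime(["2023-01-05 08:00", "2023-01-05 09:00", "2023-01-06 07:30"]),
        "putaway_ts": pd.to_datetime(["2023-01-05 08:35", "2023-01-05 09:50", "2023-01-06 08:10"]),
    })
    master = pd.DataFrame({
        "warehouse_id": ["ATL1", "DFW2"],
        "country": ["US", "US"],
        "region": ["Southeast", "Southwest"],
        "cluster": ["East", "Central"],
    })

    fact["putaway_minutes"] = (fact["putaway_ts"] - fact["receipt_ts"]).dt.total_seconds() / 60
    kpi = (fact.merge(master, on="warehouse_id")
               .groupby(["warehouse_id", "country", "region", "cluster"], as_index=False)
               ["putaway_minutes"].mean())
    print(kpi)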

In a very rare scenario, Joe realizes that the put-away for a pallet type is not updated at all in the system. He decides to open a ticket with the IT engineering team to fix the bug. As the bug is not very critical to his business case, he also considers the option of fixing the bug in release 2.

Joe realizes:

  • He can manage and negotiate report delivery timelines with the customer himself. He is not dependent on IT to quote on fixing any unexpected problems.
  • Thankfully, with the IT help desk team’s support, Joe was not alone in resolving this issue.

Image Source: Dataedo

8. Report Migration to PROD

After the UAT phase, like any other object, the report is moved to PROD via help desk and ticketing software with workflow management. It is a lot to expect the power user to manage IT tasks involving documentation for change board approvals and releases; this task is better handled by the IT help desk.

Report enhancements or new report builds are not critical releases involving any major downtime. In most organizations, the change board does not mandate IT representation for such requests.

Joe, in our example, opens a help desk ticket to get the report cube released to PROD. His final UAT-tested report is checked in to Git.

The IT help desk validates the following details received from the ticket:  

  1. Report cube object to be migrated to PROD
  2. Guest user list that has execute-only rights to the report in PROD
  3. Casual users who can use the migrated report cubes to build a report version in PROD
  4. UAT approval from all business teams who tested the report cube along with the business need and test case documents

They additionally verify that:

  • The report columns do not contain personally identifiable information
  • The report satisfies expected performance indicators/timelines for common parameter executions
  • New report metadata is registered into the system
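As a hedged sketch, checks like these can be partially automated; the function below illustrates the idea with hypothetical ticket fields and rules that are not part of any specific help desk tool.

    # Sketch: automating a few migration-ticket checks (illustrative fields/rules).
    SENSITIVE_COLUMNS = {"ssn", "date_of_birth", "home_address"}

    def validate_ticket(ticket: dict) -> list:
        """Return a list of issues blocking migration to PROD; empty means OK."""
        issues = []
        if not ticket.get("uat_approvals"):
            issues.append("Missing UAT approval from business teams")
        pii = SENSITIVE_COLUMNS & set(ticket.get("report_columns", []))
        if pii:
            issues.append("Report exposes sensitive columns: " + ", ".join(sorted(pii)))
        if ticket.get("avg_runtime_seconds", 0) > 300:
            issues.append("Report exceeds expected runtime for common parameters")
        if not ticket.get("metadata_registered", False):
            issues.append("New report metadata not registered")
        return issues

    ticket = {
        "report_object": "putaway_performance_kpi",
        "report_columns": ["order_id", "warehouse_id", "putaway_minutes"],
        "uat_approvals": ["operations"],
        "avg_runtime_seconds": 42,
        "metadata_registered": True,
    }
    print(validate_ticket(ticket))  # [] means the ticket is ready for the change board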

The IT help desk fills in other typical change board documents on Joe’s behalf confirming that they have the project details and critical assessment details for the change board’s review. Once approved, they get in touch with the release team to move Joe’s objects to PROD. 

Joe realizes:

  • The request ticket form was in place to maintain stable and secure systems. His PROD work environment is stable as any changes to the reports or user access need his approval.

9. Advanced Predictive Analytic Responsibilities

Some companies find it worthwhile to invest in granting predictive analytics capabilities to power business users.

Popular uses of predictive analytics answer two kinds of questions:

  • Regression answers “how much”
  • Classification answers “which one”

A predictive model is like a car: you need to upgrade it from time to time and change its features. The difference is that the magnitude of a model upgrade is usually not as drastic; improving a predictive model's accuracy can often be achieved by enhancing a couple of features or predictors.

A predictive analytics project includes the following tasks:

  1. Select the target variable
  2. Get the historical data and prepare it
  3. Split the data into training and test sets
  4. Experiment with prediction models and predictors (features) — pick the most accurate model
  5. Train the model on the training set
  6. Validate it on the test set with standard accuracy measures
  7. Implement it — in addition, do not forget to follow up on it from time to time
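As a minimal sketch of the workflow above, under illustrative assumptions (a classification target of late put-away and two toy predictors, using scikit-learn), the steps might look like this:

    # Sketch of the predictive workflow above (toy data, illustrative target/features).
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # 1-2. Target variable and prepared historical data (toy stand-in)
    data = pd.DataFrame({
        "pallet_count": [3, 10, 2, 8, 5, 12, 1, 9],
        "staff_on_shift": [5, 2, 6, 3, 4, 2, 6, 3],
        "is_late_putaway": [0, 1, 0, 1, 0, 1, 0, 1],  # "which one" -> classification
    })

    # 3. Split the data into training and test sets
    X = data[["pallet_count", "staff_on_shift"]]
    y = data["is_late_putaway"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    # 4-5. Pick a model and train it on the training set
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_train, y_train)

    # 6. Validate it on the test set with a standard accuracy measure
    print("accuracy:", accuracy_score(y_test, model.predict(X_test)))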

A BI tool that has the following features might thrive in the area of self-service report generation:

  1. No coding or scripting 
  2. GUI to common R/Python machine learning packages
  3. Workbench for advanced analytics
  4. Stats analysis, visualizations, transformations, built-in models for pattern discovery and model testing

The above tool features can help achieve rapid model creation. Product features and power user training in this area are not solidified yet, so this capability still demands careful oversight when a data scientist's role is effectively placed in the hands of a smart power user.

Summary

If the right processes are not in place, even the best-of-breed tools cannot deliver insights when they are needed. Data quality and data catalog features help build trust around enterprise data. The required governance helps build confidence in IT operations. The true value of a self-serve capability lies not in a tool but in a framework.

And finally, Joe realizes that designing great-looking BI reports and dashboards isn't just about spectacular charts; the goal is to share easily understandable information by efficiently getting to the right numbers at the right time!
