The Data Challenges of a Return to Service

By Kaycee Lai

COVID has been particularly rough on companies that provide in-person entertainment experiences – theme parks, cruise lines, movie theatres, casinos, concert promoters, resorts, and more. With summer approaching, we’re seeing signs that some of these companies may be opening up operations again. Disneyland, for instance, just announced it is reopening its California parks on April 30, and Royal Caribbean announced it will resume cruises in June.

While the green light to open is most welcome, the practicalities of the return to service may prove to be the ultimate test of data analytics capabilities. For these companies, it’s much more complicated than simply turning on the “open for business” sign and starting up payroll again. They’ve got to coordinate the efforts of procurement, HR, supply chain, marketing, security, medical services, and more.

Think about Disneyland, which has now been closed for over a year, and the data-related questions their analytics teams and various departments need to answer.

Supply Chain Data Challenges

How does an established company re-engage a massive supply chain that has been gathering dust for over a year? Sadly, many of its external vendors have probably gone out of business, so it’s going to need to establish connections with new ones. Those that are still in business will need to renegotiate contracts and more.

One major data challenge with supply chains is that a significant portion of the data resides outside of the enterprise, in the hands of partners, while the rest is spread across internal systems. This means that a company like Disneyland has to figure out a way to integrate its own data seamlessly with that of its vendors. And, of course, this brings with it all the challenges of highly dispersed, distributed datasets.

HR Data Challenges

How do they find the best way to prioritize the return of employees? Is it by location, job, past performance, or something else? One of the top challenges that HR faces in dealing with data is that information is often inaccurate and incomplete. Profiles and resumes are not always updated when candidates are brought on board, or kept current over the course of their employment. These problems are bound to be compounded when you’re bringing back a workforce that has largely been furloughed for 12 months. The opportunities for inaccurate data to creep in are immense, and the ability to expedite the process of data prep is critical.
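As a toy illustration of that data-prep step – the field names, records, and staleness threshold below are all invented, not drawn from any real HR system – a first pass might simply flag records that are incomplete or haven’t been touched since the shutdown began:

```python
from datetime import date

# Hypothetical HR records; in practice these would come from an HRIS export.
employees = [
    {"id": 1, "name": "A. Nguyen", "role": "ride operator", "last_updated": date(2021, 3, 1)},
    {"id": 2, "name": "B. Ortiz", "role": None, "last_updated": date(2020, 2, 15)},
    {"id": 3, "name": "C. Park", "role": "food service", "last_updated": date(2020, 1, 10)},
]

# Records untouched since the (illustrative) closure date are suspect.
STALE_BEFORE = date(2020, 3, 14)

def needs_review(rec):
    """Flag records with missing fields or no updates since the shutdown."""
    missing = any(rec[k] is None for k in ("name", "role"))
    stale = rec["last_updated"] < STALE_BEFORE
    return missing or stale

flagged = [rec["id"] for rec in employees if needs_review(rec)]
print(flagged)  # ids needing manual review before recall → [2, 3]
```

Running checks like this up front is far cheaper than discovering mid-recall that a furloughed employee’s role or contact information is a year out of date.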

Marketing Data Challenges

Can we identify and market to customers who are less risk-averse and more likely to venture into the park? Since Disneyland is opening at limited capacity, not everyone who would normally come will be able, or want, to come out. Additionally, the park will initially not be permitted to admit non-California residents. Disneyland marketing needs to be able to pinpoint the right customers to market to. It also needs to be able to educate many others, such as die-hard fans who live in neighboring states like Nevada and Arizona and who are used to being able to cross the border any time they want and enjoy the magic. And then there are the people who used to live in California, and are still on the mailing lists, but have now moved out of state.

To compound the challenges further, based on information about prospective park visitors, Disneyland needs to be able to predict demand for food and other amenities at the park.

All of this requires the ability to quickly run predictive models on accurate datasets. Some of the necessary data may reside in different departments. Some of it may need to be modeled based on what has happened at Disneyland’s sister parks in Florida, which have remained open but with similar strict guidelines. From a data perspective, this may mean connecting quickly to data generated on the other side of the continent within an entirely different part of the organization. 
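To make the modeling step concrete – with the caveat that every figure below is invented for illustration, not an actual Disney number – the simplest version of borrowing from the Florida parks’ experience is a per-visitor rate scaled to California’s capacity limit:

```python
# Invented figures for illustration only; real inputs would come from the
# Florida parks' operational data and California's capacity restrictions.
florida_daily_visitors = 25_000
florida_meals_served = 30_000  # observed under comparable COVID guidelines

# Per-visitor demand rate learned from the sister park.
meals_per_visitor = florida_meals_served / florida_daily_visitors  # 1.2

# Apply that rate to the capacity-limited California reopening.
california_capacity = 15_000
expected_meals = meals_per_visitor * california_capacity
print(round(expected_meals))  # → 18000
```

A real forecast would of course segment by day of week, weather, and visitor mix, but even this back-of-the-envelope version depends on data from two departments on opposite coasts being accessible in one place.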

Healthcare Data Challenges

When you’ve got as many as 40,000 people attending in a single day, and the specter of the pandemic still looming, how do you monitor signs of trouble in real time (e.g., people refusing to wear masks, or employees or guests who show signs of being COVID-positive)? You may have to make some serious decisions within minutes, and making data-informed decisions will mean integrating data that’s being streamed in real time – from exit/entrance numbers to temperatures being taken at rides – with historical data sources. It also means integrating with external data sources, such as The COVID Tracking Project or the CDC.
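A minimal sketch of one such real-time check might look like the following. The 100.4°F fever threshold is the CDC’s standard definition; the guest IDs and readings are invented, and a production system would consume these from a streaming pipeline rather than a list:

```python
# CDC's standard fever definition in Fahrenheit.
FEVER_THRESHOLD_F = 100.4

# Simulated stream of (guest_id, temperature) readings at a park entrance;
# in practice these would arrive continuously from screening stations.
readings = [("g1", 98.6), ("g2", 101.2), ("g3", 99.1), ("g4", 100.4)]

# Flag any guest at or above the threshold for follow-up screening.
alerts = [guest for guest, temp in readings if temp >= FEVER_THRESHOLD_F]
print(alerts)  # → ['g2', 'g4']
```

The hard part isn’t the rule itself – it’s landing those streamed readings next to historical attendance data and external public-health feeds fast enough for a decision within minutes.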

Conclusion: Unify Data Access…Without Moving Any Data

As Disneyland, Royal Caribbean and other companies return to service, the challenge for most teams will not so much be in analyzing the data as it will be in getting to the data in the first place, much of which might be buried in departmental silos. At a time when answers are required within hours or even minutes, it might take months to hunt down and prepare the right datasets for analysis. To make sure this is not the case, companies must have the Data Management technologies in place that ensure that all of the company’s internal data – as well as relevant external datasets – can be accessed through a single source.

To be clear, this is not something most large companies have currently, but it is something that has been talked about for some time. For instance, Matt Aslett at 451 Research has promoted the idea of an Enterprise Intelligence Platform that combines functionality from data integration, data storage and processing, and analytics. John Santaferraro of EMA has also promoted the idea of a Unified Analytics Warehouse, which ties together multi-structured data stored in any hardware or cloud platform, providing analytical capabilities across all storage tiers.

In either case, these paradigms would not entail actually moving the data. This would simply be too big of a feat to be practical for large, complex enterprises. Rather, they would rely on virtualization, or the ability to create a virtual layer over complex architecture that allows the data to stay put – in Snowflake, Teradata, Hadoop, S3 buckets, Oracle, business systems and more – but treats it as a single entity. As I previously noted, most companies haven’t implemented this, but thanks to technologies like Trino, it’s now entirely possible to run a query on a highly distributed – and perhaps disjointed – data infrastructure. For big companies returning to service after a year-long hiatus, this may be exactly what they need to make sure the process goes smoothly. And the nice thing about the virtualized approach is that, unlike moving data, it can be applied very quickly – even in time for an April 30 opening.
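Trino does this at enterprise scale, letting one SQL query span catalogs backed by entirely different systems. As a self-contained stand-in for that idea – the tables and figures are invented, and SQLite’s ATTACH is only a toy analogy for Trino’s catalogs – here is one query running across two separately stored databases without moving data between them:

```python
import sqlite3

# "Internal" database plus an attached, separately stored "vendor" silo.
# This mimics, in miniature, what a federation layer like Trino does
# across Snowflake, Hadoop, S3, Oracle, etc. All names/data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS vendor")

conn.execute("CREATE TABLE bookings (guest TEXT, meals INTEGER)")
conn.executemany("INSERT INTO bookings VALUES (?, ?)", [("g1", 2), ("g2", 3)])

conn.execute("CREATE TABLE vendor.supplies (item TEXT, per_meal REAL)")
conn.execute("INSERT INTO vendor.supplies VALUES ('buns', 1.0)")

# One query spanning both "systems", as if they were a single source.
total = conn.execute(
    "SELECT SUM(b.meals * s.per_meal) FROM bookings b, vendor.supplies s"
).fetchone()[0]
print(total)  # → 5.0
```

The design point is the same at either scale: the query layer knows where each dataset lives, so analysts write one query against one logical source while the data stays put.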
