The Key to Unstructured Data Management: Communicating About Your Data

By Steve Leeper

Sharing information about what’s happening with the unstructured data in your organization is much more difficult than it might appear. Miscommunication can negatively affect virtually every part of the organization, from IT, storage teams, and app developers all the way to the business and other end users. Yet an accurate, full picture of your unstructured data is critical to conducting your business safely, efficiently, cost-effectively, and successfully.

How is it possible that unstructured data has been accessed over file protocols for as long as 40 years without a clear way to communicate important details about that data? Storage teams are responsible for managing platforms and act as stewards of data for numerous stakeholders, but they aren’t the data owners. End users and application owners should usually be in charge of managing unstructured data, but that rarely, if ever, happens. So, in the end, no one manages the data. No other area of IT makes decisions about its platform so blindly or leaves its end users to manage such a large portion of it.

Historically, unstructured data management has relied on free tools that slowly scan file systems and report capacity and file counts for planning. Each time a new file tree is selected, the tools restart the primitive scan from the beginning, forcing users to wait hours or even longer for it to complete (particularly on large file systems). Afterward, time stamps on an assortment of random files are spot-checked. At best, these combined steps offer a muddied view of an organization’s unstructured data.
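To make the limitation concrete, here is a minimal Python sketch of the kind of scan described above. It is an illustration only, not any particular tool’s implementation, and the function name `scan_tree` is hypothetical. It walks a file tree from the top on every call and returns just a file count and a byte total, which is why it is slow on large file systems and says nothing about who owns the data.

```python
import os


def scan_tree(root):
    """Walk `root` from scratch and return (file_count, total_bytes)."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symbolic links
            except OSError:
                continue  # file vanished or is unreadable; skip it
            file_count += 1
            total_bytes += st.st_size
    return file_count, total_bytes
```

Nothing is cached between calls, so pointing the function at a different subtree repeats the full walk for that subtree.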

Sometimes, storage vendors’ own tools can provide more clarity, but even then only limited detail on the data is available, and the picture becomes more complex in environments with multiple storage platform vendors. In the end, if IT decides to remove or archive data, it begins a search for the data’s owner that rarely succeeds. Realistically, no one knows what’s in a company’s data “junk drawer,” and no one is responsible for it. The absence of clear reporting on unstructured data has made it functionally impossible to manage that data or to communicate about it across the organization. Without communication, data will continue to grow at an exponential rate, making the problem increasingly worse.

Thankfully, there are solutions out there. Organizations should look for a vendor that provides visibility into unstructured data and delivers reports to all relevant stakeholders. Additionally, solutions that allow you to organize and act on data can help teams implement life cycle management strategies for unstructured data. When IT can quickly get details about the top users and groups consuming capacity, data whose owner has departed (orphaned data), the cost of datasets and their associated emissions, and the age of data, then real action can be taken. Tagging data is another useful ability to look for, so teams can organize datasets and assign ownership to them. Custom reporting also allows for complex queries on metadata and assigned tags.
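As a rough illustration of the reports described above, the Python sketch below works over file metadata that has already been collected. Everything in it is an assumption made for the example rather than any product’s API: the `FileRecord` fields, the function names, and the idea of passing in a set of currently active owners to find orphaned data.

```python
from collections import Counter
from dataclasses import dataclass, field

SECONDS_PER_DAY = 86_400


@dataclass
class FileRecord:
    """Hypothetical metadata record for one file."""
    path: str
    owner: str
    size_bytes: int
    mtime: float  # last-modified time, in seconds since the epoch
    tags: set = field(default_factory=set)


def capacity_by_owner(records):
    """Total bytes consumed per owner."""
    usage = Counter()
    for r in records:
        usage[r.owner] += r.size_bytes
    return usage


def orphaned(records, active_owners):
    """Records whose owner is no longer in the set of active owners."""
    return [r for r in records if r.owner not in active_owners]


def older_than(records, days, now):
    """Records last modified more than `days` days before `now`."""
    cutoff = now - days * SECONDS_PER_DAY
    return [r for r in records if r.mtime < cutoff]


def tag(records, label, predicate):
    """Add `label` to every record for which `predicate(record)` is true."""
    for r in records:
        if predicate(r):
            r.tags.add(label)
```

With records in hand, `capacity_by_owner(records).most_common(10)` lists the ten largest consumers, and a custom report becomes an ordinary filter over metadata and tags.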

Solutions that help companies understand unstructured data allow them to give IT management, data owners, and storage, compliance, and security teams the information each of them needs. This way, better-informed discussions can be held. Products that are designed from the ground up to work at enterprise scale and across storage vendors empower organizations to take action on their unstructured data, wherever it’s located. Communicating effectively and accurately about data is the first step to managing it, and managing data can bring vast improvements to your entire organization.