Case Study: Unilog Leverages Data Preparation to Monetize Upstream Data


“Paxata has facilitated the monetization of our data,” said Noah Kays, Director of Content Subscriptions at Unilog. Unilog is a global technology company specializing in e-commerce solutions and content for business-to-business distributors in the industrial products (electrical, plumbing, etc.) market. The solution includes a Product Information Management (PIM) system and a Content Management System (CMS), which help distributors create an e-storefront platform. The platform also enables Unilog’s customers to connect to their ERP systems and integrate their data with the content provided by Unilog.

Since 1999, Unilog has expanded into creating product content for e-commerce for other industries as well. “We’ll build the SKUs, the specifications, the images, we’ll build out your taxonomy, and we’ll deliver that all to you for your e-commerce website,” Kays said.

Unilog also recently started offering a PIM-as-a-Service solution to serve the needs of many of its smaller customers who don’t have the capacity to put thousands or hundreds of thousands of SKUs on their websites. “So we do build the item data for them within our own platform,” he said. The company also offers product content subscriptions for large buying groups, enabling their members to subscribe to content for their websites.

The Data Preparation Problem

Unilog previously focused more on one-off deliverables, but over time, the volume of upstream data it receives and the number of databases it maintains have grown significantly. A typical project might entail building out a product line of 100,000 SKUs for a retailer, and once delivered, the customer would take over management. “Now we’re seeing more and more people use us as their whole product information management department where we maintain the systems and data over time,” Kays said.

Because the upstream product data is constantly expanding, rapidly onboarding new product information ensures that Unilog provides continuous value to its clients. This ongoing product expansion includes constantly analyzing and reconciling the differences between Unilog’s own production database and the industry Data Warehouses where multiple manufacturers upload pertinent product data. In many cases, this reconciliation also includes enrichment: associating images and technical documents, applying proper categorization, filling gaps and normalizing attributes, and adding longer descriptions so that the information is ready for use by downstream distributors.

For more than 15 years, the company managed its product information in a series of Excel spreadsheets and in-house databases. However, a growing number of far larger projects requiring delivery and maintenance of millions of items overwhelmed those legacy systems. “Suddenly those Excel-based processes that worked for so long started to break down due to the limitations within Excel itself,” he said.

Complex Considerations

Although there were solutions that could manage the workload, Kays said he needed a product that could be used by non-technical business users without a big learning curve:

“I needed a solution that would enable the Data Preparation and general functionality of Excel, but at scale for dealing with millions of items at the same time.”

Unilog needed a product that could profile, check, match, de-duplicate, and normalize data across all items, all within a user-friendly tool.

A growing number of customers have been changing the way they use Unilog services as well, and any solution needed to be able to gracefully manage this change. For example, smaller customers are now pooling their resources to create master catalogues of items they can use for their e-commerce sites, and Unilog is cleaning, hosting, managing, and distributing that growing database on an ongoing basis.

“So it’s scale, it’s velocity and it’s disparate data sources as well. It’s data democracy between IT, the business, and in ways we’ve never experienced before,” he commented.

Paxata: A Key to Unilog’s Data Monetization

While narrowing his search down to a short list of Data Quality and ETL tools, Kays stumbled across Paxata. After a few demos, he saw right away that it was similar enough to Excel that the content teams would be able to grasp it quickly, but without Excel’s limitations. Paxata used similar terminology and an intuitive interface that wouldn’t require a lot of training. “So I knew that the adoption of the application within our teams was going to be more successful than with some of the other platforms that were not as user-friendly,” he said.

Another deciding factor was a flexible licensing model, which is based on the number of active rows of data being analyzed rather than the number of users. “That meant that I could expand as much as possible and get as many people in there without incurring additional cost, and that was huge for us.” The license model aligned with Unilog’s service model: the more product data the company enriched for downstream use, the more value it got.


From an implementation perspective, Kays said that the only slowdown was on his part, because he needed to set up his account and connect to the Paxata Cloud. “Once I got my side of it done, the connection was set up and I was extracting data the next day – and it was running extremely fast.” After Paxata was connected, new data started to roll in, and “a kind of snowball effect took place.” He said that it not only met their initial expected needs, but they soon found more and different things they could do with it. “And now it’s really embedded as a core function within our content creation processes and our data quality processes.”

Impressive Results

“Recently we really found a lot of value and longer-term return on investment in the speed of issue resolution,” Kays said. He works with item information from an electrical Data Warehouse, which has about two million items that can be used as seed data for creating web-ready item data, as well as his own item data stored in Oracle and taxonomy information from Excel.

Paxata allows Kays to sync all that disparate information in one project. “I can map and merge all those different datasets together and figure out what items I might be missing, and what things might be out of sync between the different databases.” This has been empowering for his analyst teams, because it’s easy to get all the data they need to quickly resolve issues in one view.
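The kind of reconciliation Kays describes, finding missing items and values that are out of sync between two sources, can be illustrated with a minimal Python sketch. This is not how Paxata works internally; the function name `reconcile`, the field names, and the sample records are all hypothetical and invented for illustration.

```python
def reconcile(seed, production, fields):
    """Compare two SKU-keyed catalogs (dict of dicts).

    Returns (missing, out_of_sync):
      missing      -> SKUs present in seed but absent from production
      out_of_sync  -> {sku: {field: (seed_value, production_value)}}
    """
    missing = sorted(set(seed) - set(production))
    out_of_sync = {}
    for sku in sorted(set(seed) & set(production)):
        # Collect only the fields whose values differ between the sources.
        diffs = {
            f: (seed[sku].get(f), production[sku].get(f))
            for f in fields
            if seed[sku].get(f) != production[sku].get(f)
        }
        if diffs:
            out_of_sync[sku] = diffs
    return missing, out_of_sync


# Hypothetical sample records, not real Unilog data.
seed = {
    "A100": {"description": "1/3 HP disposal", "category": "Plumbing"},
    "A200": {"description": "1/2 HP disposal", "category": "Plumbing"},
    "A300": {"description": "Wire nut, red", "category": "Electrical"},
}
production = {
    "A100": {"description": "1/3 HP disposal", "category": "Plumbing"},
    "A200": {"description": "0.5 HP disposal", "category": "Plumbing"},
}

missing, out_of_sync = reconcile(seed, production, ["description", "category"])
print(missing)       # SKUs to onboard into production
print(out_of_sync)   # fields that disagree between the two sources
```

Keying both sources on the SKU keeps each lookup constant-time, which is the main reason this style of comparison scales better than row-by-row spreadsheet lookups.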

He used an example of a customer with 100,000 SKUs who needs help finding an error. In the past, getting data for those 100,000 SKUs would require a complicated query or a lot of different lookups across different Excel spreadsheets – a process that could have taken hours.

“Now I can just drop that list of items in Paxata, do a couple of quick lookups against my entire database of 4 million items, and get all the data that I need to do a full analysis of that potential issue in minutes.”

Another challenge is normalizing attributes for items. For example, a customer shopping for a garbage disposal sees a variety of models with different horsepower ratings. With potentially 1,000 garbage disposals in a single category, and a large number of Unilog content engineers working on this data, “Making sure that we represent one-third horsepower consistently across all 1,000 SKUs can be a challenge,” he said.

The process of normalizing and standardizing values gets more challenging the larger the data set is: one-third horsepower may be represented as .33, .3, .3330, or 1/3. Paxata adds another layer of validation, along with a feature called ‘cluster and edit,’ that uses NLP (Natural Language Processing) under the covers to look for similar values within a dataset and suggest how they can be normalized. “And that has been huge for us. This has allowed us to really expand across these larger data sets,” he said. Once normalized, the data can be quickly loaded back into the PIM.
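To make the horsepower example concrete, here is a minimal Python sketch that maps the variants above onto a single canonical value. It is not Paxata’s ‘cluster and edit’ feature and uses no NLP; it simply snaps each number to the nearest entry in an assumed list of standard ratings. The list `CANONICAL_HP`, the tolerance, and the function names are hypothetical choices made for illustration.

```python
from decimal import Decimal
from fractions import Fraction

# Assumed list of standard horsepower ratings (hypothetical, for illustration).
CANONICAL_HP = [Fraction(1, 4), Fraction(1, 3), Fraction(1, 2), Fraction(3, 4), Fraction(1)]
# Assumed tolerance: how far a raw value may sit from a canonical rating.
TOLERANCE = Fraction(4, 100)


def parse_value(raw):
    """Parse '1/3', '.33', or '0.3330' into an exact Fraction."""
    text = raw.strip()
    if "/" in text:
        num, den = text.split("/")
        return Fraction(int(num), int(den))
    return Fraction(Decimal(text))


def normalize_hp(raw):
    """Return the canonical rating as text, or None if nothing is close enough."""
    value = parse_value(raw)
    nearest = min(CANONICAL_HP, key=lambda c: abs(c - value))
    if abs(nearest - value) <= TOLERANCE:
        return str(nearest)
    return None  # leave for a human to review rather than guess


variants = [".33", ".3", ".3330", "1/3"]
print([normalize_hp(v) for v in variants])
```

Returning `None` for values outside the tolerance keeps ambiguous entries in front of a content engineer instead of silently forcing them into the wrong rating.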

With his previous system, Kays also had to run a nightly export of two million items from the database, and even with a 350 mb download speed, the process took up to three hours. “Once I connected Paxata to it, I got that same amount of data in seven minutes.” And because Paxata automates that process, he no longer has to log in and press “a button” every night – the extracts are done automatically.

Implementation was straightforward. “We were pretty much off and running with the vast majority of the team fully bought in and trained within a month or so,” he said.


Unilog serves hundreds of mid-size distributors across the globe. Some have been around for 100 years, and many have databases they’ve been using for 30 years, with a variety of salespeople and employees adding information throughout that time.

“They have this somewhat messy data, and they’re trying to match it to our data, which is relatively clean and ready for them to use, but because of their legacy systems, they struggle with it.”

So Unilog has now developed a data matching service, another data monetization strategy, in which distributors can send in their data. Unilog will clean it up and use Paxata to match it against the master database to tell them which items should be on their e-commerce website.
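A matching step of this kind can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the service Unilog built: it assumes items can be matched on a part number once case, spaces, and punctuation are stripped, and the names `clean_key` and `match_to_master` and the sample part numbers are hypothetical.

```python
import re


def clean_key(part_number):
    """Uppercase and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", part_number.upper())


def match_to_master(distributor_items, master_skus):
    """Match messy distributor part numbers to master SKUs.

    Returns (matched, unmatched): matched maps each raw distributor
    value to its master SKU; unmatched lists values with no match.
    """
    master_by_key = {clean_key(sku): sku for sku in master_skus}
    matched, unmatched = {}, []
    for raw in distributor_items:
        sku = master_by_key.get(clean_key(raw))
        if sku is None:
            unmatched.append(raw)
        else:
            matched[raw] = sku
    return matched, unmatched


# Hypothetical part numbers, invented for illustration.
master_skus = ["GD-1033", "WN-200R"]
distributor_items = ["gd 1033", "WN200R ", "XX-999"]

matched, unmatched = match_to_master(distributor_items, master_skus)
print(matched)
print(unmatched)
```

Exact matching on a cleaned key is deliberately conservative; real legacy data would likely also need fuzzy matching and manual review of the unmatched list.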

“So, we’ve taken a piece of the Paxata functionality and operationalized it and commercialized it into a service that we offer to our customers, and we’re actively generating revenue based on that.”

Lessons Learned and Best Practices

Kays said that defining his main data sources, doing some end-user education, and doing a bit more data governance planning beforehand might have made the transition somewhat easier. The other thing he heartily recommends is consulting with Paxata on options before building out new projects or solving problems. “You might spend two hours trying to build out one way when there was a simpler way.” Julie, his Paxata account rep, was able to quickly jump on a phone call and offer other options, approaches and scenarios. “To some extent, I almost consider our account representative another member of my team,” he said.

Next Steps

In the next year, Kays would like to roll out Paxata to more departments within his organization and empower more of his business analysts. “By giving them all the data at their fingertips – that’s really where the value of this comes out.”

Before Paxata, when Unilog received an ERP dump of product information from 60 different distributors, numerous duplicates and other time-consuming issues arose.

“If I had known this tool was around back then, it would have gone a whole different way. It’s really about enabling your Data Analysts with the data and the tools that they need to get their job done, and Paxata is the next generation of that tool. Paxata has given us the speed and productivity we need to bring our data, content, and services to market a lot quicker.”



Photo Credit: agsandrew/
