How Much Time Do Data Scientists Spend Cleaning Data?

by Angela Guess

Rick Delgado recently wrote in Dataconomy, “You probably had some big ideas in mind when you first started thinking about adopting big data solutions for your business… Hiring a qualified data science team is usually one of the first priorities, along with all the investment in equipment and technology needed to properly collect and analyze all the big data you’ll want. Over time though, that excitement might have worn off. Insights from big data analytics were likely coming in, but not at the pace you were hoping for. Is this a result of your data scientists simply not getting the job done well enough? Is it a case of laziness on their part? As easy as it is to think that big data insights should be reached one after the other in a short amount of time, more than likely the data scientists on your staff are doing everything they can. There are reasons for them not being more inventive, and it has nothing to do with their work ethic.”

Delgado continues, “There’s a lot that goes into a data scientist’s job. Some of their time is spent exploring the vast amounts of data they have to work with. Some of it requires preparations of data visualizations. And still other times they’re working on extract, transform, and load (ETL). While these are all valuable tasks in their own right, chances are most of their time is taken up in something far less glamorous. It’s sometimes referred to as data cleaning, but other terms include data wrangling and data munging. Many data scientists jokingly refer to themselves as data janitors, with a lot of time spent getting rid of the bad data so that they can finally get around to utilizing the good data. After all, bad data can alter results, leading to incorrect and inaccurate insights. The costs of bad data are high, with some research stating it costs a typical business more than $13 million every year. So data cleaning is important, but it’s time-consuming and not all that fun.”

Read more here.

Photo credit: Flickr/go_greener_oz
