
More Data Isn’t Important, More Information Is

June 7, 2012

by Angela Guess

Douglas Merrill of Forbes has written a new article about the nature of Big Data and how math lies behind its power. He writes, “As a general rule, more data is always better than less data.  You can do more math magic with more data.  In general, it gives you more degrees of freedom. Most importantly, more data makes it easier to avoid a problem called overfitting. Stay tuned, I’ll come right back to that. First, to learn a new machine learning model, you need a bunch of stuff.  You need data; at least some of that data needs to be tagged with the outcome you are hoping to learn. So, for example, if you are trying to predict the probability a borrower will default on a loan, you need some cases where borrowers defaulted, and some where borrowers paid off their loans.”

He continues, “So, you have a bunch of tagged observations, so you know that these loans went bad, these paid off, and so forth.  Off to model! First, what do you do? Recall the math issues I mentioned last time; you can still screw up the math.  More data will mean it will take a few more seconds to generate garbage data, but the data will still be garbage. Again, it’s not the number of bits, it’s the amount of information you can generate. However, since you’re drowning in bits, your first step will be to divide your data into a training segment — that you’ll use to teach the model — and a testing segment — that you will use to test the quality of your model.  If you have extra data, you might also create a third group, called a validation group, that you will use to make sure you didn’t overfit your data.”
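The split Merrill describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `split_dataset` helper, the 60/20/20 proportions, and the made-up `loans` records are hypothetical and do not come from the Forbes article. The group names follow the quote above, where the third group is the one held back to check for overfitting.

```python
import random


def split_dataset(records, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle tagged records and split them into three groups.

    Hypothetical helper: the fractions are illustrative defaults,
    not values taken from the article.
    """
    if train_frac <= 0 or test_frac < 0 or train_frac + test_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split repeatable

    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)

    train = shuffled[:n_train]                  # used to teach the model
    test = shuffled[n_train:n_train + n_test]   # used to test model quality
    validation = shuffled[n_train + n_test:]    # extra data, held back to check for overfitting
    return train, test, validation


# Made-up tagged observations: (loan_id, defaulted). Every fourth loan "went bad".
loans = [(loan_id, loan_id % 4 == 0) for loan_id in range(100)]
train, test, validation = split_dataset(loans)
print(len(train), len(test), len(validation))
```

Shuffling before slicing matters here: if the records arrive sorted by outcome or by date, an unshuffled split could put all the defaults into a single group.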

Read more here.


