by Angela Guess
Derrick Harris of GigaOM reports, “Some of the U.S. government’s most research-intensive agencies want your help to come up with better ways to analyze their expansive data sets. NASA, along with the National Science Foundation and the Department of Energy, launched a competition on TopCoder called the Big Data Challenge series. Essentially, it’s a competition to crowdsource a solution to the very big problem of fragmented and incompatible federal data.”
He goes on, “The first contest in the series involves answering a question, albeit a difficult one. From the contest page: ‘How can we make heterogeneous (dissimilar and incompatible) data sets homogeneous (uniformly accessible, compatible, able to be grouped and/or matched) so usable information can be extracted? How can information then be converted into real knowledge that can inform critical decisions and solve societal challenges?’”
Harris notes, “This is a problem that’s magnified in government agencies, but that plagues companies of all types that try to get started with big data. While future big data strategies might mandate a particular data format or other standards, the past and, often, the present is a messy pile of stuff created by different divisions within different agencies or departments. The dream of creating spectacular algorithms, beautiful visualizations and uncovering hidden insights often only comes after untold man-hours spent cleaning and munging data (often with Hadoop) into formats that software systems can work with.”
photo credit: NASA