
Big Data and the Semantic Web: Their Paths Will Cross

By Jennifer Zaino  /  September 12, 2012

Big Data and the Semantic Web are on a track to intersect. And businesses that want to be on track to profit from the explosion in data should start looking a little more closely at that intersection, and soon.

“We’ve got more data now than ever before coming at us, and it is coming faster and faster,” says Frank Coyle, director of the Software Engineering program in the Lyle School of Engineering at Southern Methodist University, whose research is in the area of web services and semantic web technologies. “So the semantic angle is how can you organize this data to take advantage of it, to do queries over it.” Those in the semantic web community say RDF is the way to go, he says, adding that people now use the term linked data as another way of describing semantic data. “If you take Big Data and link it, then you have semantics – you have meaning now introduced into the equation.”

Coyle – who will be presenting a talk entitled, “Relationships Matter – Leveraging Semantic Technology to Extend Your Business Horizons,” at the upcoming Semantic Technology and Business Conference in the U.K. next week – says that most companies today are dealing with Big Data in all its forms: structured, semi-structured and unstructured. And RDF, the language of the semantic web, offers a simple sentence structure, where triples consist of a subject, predicate and object, to help put all those data elements in relationship to each other.
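To make the subject-predicate-object structure concrete, here is a minimal Python sketch. The `http://example.org/` namespace, the customer and order identifiers, and the `to_ntriples_line` helper are all hypothetical and used only for illustration; the output loosely follows the N-Triples notation and skips details such as datatypes and escaping.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace, not from the article

# Three simple "sentences" that relate a customer to an order.
triples = [
    Triple(EX + "customer/42", EX + "name", "Acme Ltd"),
    Triple(EX + "customer/42", EX + "placedOrder", EX + "order/1001"),
    Triple(EX + "order/1001", EX + "total", "250.00"),
]


def to_ntriples_line(t: Triple) -> str:
    """Render a triple in simplified N-Triples-like syntax (no escaping)."""
    # Objects that look like web addresses are written as resources,
    # everything else as a plain string literal.
    obj = f"<{t.obj}>" if t.obj.startswith("http://") else f'"{t.obj}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


lines = [to_ntriples_line(t) for t in triples]
for line in lines:
    print(line)
```

Because the object of the second triple is itself the subject of the third, the statements chain together into a small graph, which is the "relationship" aspect Coyle refers to.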

So, if your business is grappling with trying to get value out of all three kinds of data, “if you take it to the simple sentence structures of RDF, if you convert it to RDF, then you are in a position to use semantic web tools such as SPARQL to do queries over this data, and integrate it in a cohesive way rather than separately dealing with each of those categories,” he says.
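As a rough illustration of that integration idea, the sketch below turns a structured row, a semi-structured JSON document, and a pointer to an unstructured document into triples in one list, then queries them with a single pattern matcher. It uses plain Python rather than a real RDF store or SPARQL engine; the `http://example.org/` namespace, the sample data, and the helper names (`row_to_triples`, `json_to_triples`, `match`) are assumptions made for the example.

```python
import json

EX = "http://example.org/"  # hypothetical namespace


def row_to_triples(table, key, row):
    """Structured data: one triple per non-key column of a database row."""
    subject = f"{EX}{table}/{row[key]}"
    return [(subject, EX + col, str(val)) for col, val in row.items() if col != key]


def json_to_triples(subject, doc):
    """Semi-structured data: one triple per top-level JSON field."""
    return [(subject, EX + field, str(val)) for field, val in doc.items()]


def match(triples, s=None, p=None, o=None):
    """Return triples fitting a pattern; None acts as a wildcard,
    loosely like a variable in a SPARQL triple pattern."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


customer = EX + "customer/42"
store = []
store += row_to_triples("customer", "id", {"id": 42, "name": "Acme Ltd", "city": "London"})
store += json_to_triples(customer, json.loads('{"twitter": "@acme", "segment": "retail"}'))
# Unstructured data: at minimum, link the subject to the document's URL.
store.append((customer, EX + "mentionedIn", EX + "docs/report.html"))

# One query mechanism now covers all three kinds of source data.
about_42 = match(store, s=customer)
in_london = [s for s, _, _ in match(store, p=EX + "city", o="London")]
print(len(about_42), in_london)
```

In practice a triple store and SPARQL would replace the list and the `match` function; the point of the sketch is only that once everything is expressed as triples, the origin of each fact no longer matters to the query.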

The hardest of the three data types to structure in RDF triple form is unstructured data, he says. But because your unstructured data generally is on the web on some server, it has a URL associated with it. “When you get into the details of RDF you have to have a URL to find your subject,” Coyle says. “So immediately you can at least begin to talk about some of this unstructured data. Once you access it, if it’s text there are tools you can use to run over it to extract subjects, predicates and objects from that, like Open Calais,” for one.

Most companies aren’t close to players like Google or Facebook when it comes to deriving incredible value from Big Data, but many of them at least have experience mining structured and semi-structured data for hidden jewels, which will stand them in good stead for moving to the next level of dealing with unstructured data. And, Coyle adds, there’s no reason not to tap into existing staff talent to begin to generate triples from all the data the company has, and then use SPARQL to navigate and perform queries over those triple stores. “You can take a conventional database person who knows SQL and you can easily get them up to speed on SPARQL,” he says. “An advantage of this approach is that you can presumably use the expertise you have in the company to help you with this vs. going outside and hiring specialized people.”
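To give a feel for how close the two query languages are, the sketch below runs a SQL query with Python's built-in `sqlite3` module and places a comparable SPARQL query next to it. The table, the `ex:` vocabulary, and the sample data are hypothetical. The SPARQL text is shown only as a string, since executing it would need a SPARQL engine outside the standard library; the two triple patterns are instead evaluated by hand to show what such an engine would compute.

```python
import sqlite3

# SQL side: a conventional relational query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Acme Ltd", "London"), (2, "Globex", "Dallas")],
)
sql_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY name", ("London",)
    )
]

# SPARQL side: the same question asked as two triple patterns that share
# the ?customer variable (hypothetical ex: vocabulary).
SPARQL_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?name
WHERE {
  ?customer ex:city "London" .
  ?customer ex:name ?name .
}
ORDER BY ?name
"""

# The same data as triples, with prefixed names kept as plain strings.
triples = [
    ("ex:customer/1", "ex:name", "Acme Ltd"),
    ("ex:customer/1", "ex:city", "London"),
    ("ex:customer/2", "ex:name", "Globex"),
    ("ex:customer/2", "ex:city", "Dallas"),
]

# Hand evaluation of the two patterns: find subjects located in London,
# then collect the names of those same subjects.
london = {s for s, p, o in triples if p == "ex:city" and o == "London"}
sparql_like_names = sorted(o for s, p, o in triples if p == "ex:name" and s in london)

print(sql_names, sparql_like_names)
```

The SELECT, WHERE and ORDER BY keywords carry over directly; the main shift for someone coming from SQL is thinking in triple patterns joined by shared variables instead of tables joined by keys.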

And, not only does the semantic approach present new opportunities for companies to make use of data for their own internal ends, but you could imagine a situation where a company might be able to devise linked semantic data products to offer to other companies as a paid service, he says. “We are at the beginning of things but the thing is, the payoff is going to be surprising.”

You can register to attend SemTech U.K. here.

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
