Five Steps to Linked Data Integration

Last week, we covered the story of how Chris Testa, Director of Engineering at Ad.ly, Inc., brought the Semantic Web to Hollywood. Today, in Part II, Chris shares his recommended five-step process for Linked Data integration.

1. Understand what your "things" are

  • Look for the high-value entities in your system: the ones that bring in revenue and give you a business-intelligence edge over competitors (Examples: Advertisers, Brands, Celebrities)
  • Look for models that are growing quickly in your system (For us, it was Celebrities)
  • Look for things that are popular in culture & technology and well annotated

2. Choose a Linked Dataset:

  • DBpedia and Freebase are cornerstones of the Linked Data movement
  • There are tons of specialized datasets in many fields (biomedical, events, news, government, and many more!)
  • Once you link up, linking to more becomes much easier!

3. Reconcile your things:

  • Reconciling is matching the entities in your database with remote linked data sources
  • Freebase's matchmaker is a really useful tool for reconciling
  • Make it a game, put experts on it to ensure high quality datasets
  • Heuristic methods exist for working through reconciliation queues of 100,000+ entities
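
As a rough illustration of the heuristic approach, here is a minimal Python sketch that scores local names against candidate labels from a remote dataset using plain string similarity. It is not Freebase's tooling or Ad.ly's method; the function names, the thresholds (0.95 and 0.75), and the "ex:" identifiers are all assumptions made up for this example. Confident matches are accepted automatically, borderline ones go to a human review queue, and the rest stay unmatched.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop periods, and collapse whitespace before comparing."""
    return " ".join(name.lower().replace(".", "").split())


def similarity(a, b):
    """Return a 0..1 similarity score between two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def reconcile(local_names, candidates, auto_accept=0.95, needs_review=0.75):
    """Split local names into matched, review, and unmatched groups.

    candidates maps a remote identifier to its label. The thresholds are
    illustrative assumptions, not recommended values.
    """
    matched, review, unmatched = {}, {}, []
    for name in local_names:
        best_id, best_score = None, 0.0
        for remote_id, label in candidates.items():
            score = similarity(name, label)
            if score > best_score:
                best_id, best_score = remote_id, score
        if best_score >= auto_accept:
            matched[name] = best_id        # safe to link automatically
        elif best_score >= needs_review:
            review[name] = best_id         # hand to an expert to confirm
        else:
            unmatched.append(name)         # no plausible candidate found
    return matched, review, unmatched
```

Only the review group needs human attention, which is what keeps a very large queue manageable. Real reconciliation would also compare properties such as profession or birth date, not just names.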

4. Build business intelligence:

  • Tip: There are really simple things you can do with linked data that are cool!
  • For example, display context to users around reconciled entities in your project. Context makes things easier for users.
  • Index and search on reconciled properties like full name, gender, genre, profession, etc.
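
To make the indexing idea concrete, here is a small Python sketch of an in-memory index over reconciled properties. The entity records, field names, and the build_index and search helpers are hypothetical; a production system would more likely use a search engine or database index, but the principle is the same.

```python
from collections import defaultdict


def build_index(entities, fields):
    """Map (field, lowercased value) pairs to the set of entity ids having them."""
    index = defaultdict(set)
    for entity_id, props in entities.items():
        for field in fields:
            value = props.get(field)
            if value is None:
                continue
            # Treat single values and multi-valued properties the same way.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for v in values:
                index[(field, str(v).lower())].add(entity_id)
    return index


def search(index, **criteria):
    """Return ids of entities that match every field=value criterion."""
    result = None
    for field, value in criteria.items():
        ids = index.get((field, str(value).lower()), set())
        result = set(ids) if result is None else result & ids
    return result or set()


# Hypothetical records, as they might look after reconciliation.
entities = {
    "celeb-1": {"full_name": "Jane Doe", "gender": "female",
                "profession": ["actor", "singer"]},
    "celeb-2": {"full_name": "John Roe", "gender": "male",
                "profession": ["actor"]},
}
index = build_index(entities, ["gender", "profession"])
actors = search(index, profession="actor")
female_actors = search(index, profession="actor", gender="female")
```

Because the properties come from the linked dataset rather than from hand entry, the same index can be rebuilt whenever new entities are reconciled.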

5. Feedback & maintenance

  • Users won't trust the data unless it is manicured.
  • Add lots of negative feedback loops (Unlike buttons!) to make sure that users are heard.
  • A few minutes a day of cleanup does wonders!
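
One possible shape for such a feedback loop, sketched in Python: each "unlike" click is counted per entity and property, and anything flagged often enough lands in a short daily cleanup queue. The FeedbackLog class, its method names, and the threshold of three flags are assumptions for illustration only.

```python
from collections import Counter


class FeedbackLog:
    """Collects negative feedback and surfaces the most-flagged data first."""

    def __init__(self, review_threshold=3):
        # Number of flags needed before an item enters the cleanup queue
        # (an arbitrary value chosen for this sketch).
        self.review_threshold = review_threshold
        self._flags = Counter()

    def unlike(self, entity_id, prop):
        """Record one user's complaint about a property of an entity."""
        self._flags[(entity_id, prop)] += 1

    def cleanup_queue(self):
        """Return flagged (entity_id, prop) pairs, most-flagged first."""
        return [key for key, count in self._flags.most_common()
                if count >= self.review_threshold]

    def resolve(self, entity_id, prop):
        """Clear the flags once a maintainer has fixed or confirmed the data."""
        self._flags.pop((entity_id, prop), None)
```

A maintainer would work through cleanup_queue() for a few minutes each day and call resolve() on each item once it has been checked.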

See Chris' SemTech 2011 presentation on SlideShare, "How Hollywood Learned to Love the Semantic Web":
http://slidesha.re/mhXXOJ

Additional Reporting by Jennifer Zaino with contributions from Chris Testa, Director, Engineering, Adly, Inc.