Hungry to discover the top 100 restaurant dishes across the good old USA? Well, have your fill of them over at Dishtip.com, which today debuted a semantic mashup that covers the bases from salted caramel ice cream at San Francisco’s Bi-Rite Creamery to chorizo-stuffed dates at Chicago’s Avec to Hirata pork buns at Ippudo NY. Users can filter by type of food (e.g. vegetarian or comfort), cuisine, meal course, and price. (The list is admittedly heavy on big-city restaurant dishes, given that that’s where a lot of the web discussion action is, but it is as likely to come up with an offering from a street vendor’s cart as from a swank spot.)
The minds behind Dishtip, TipSense LLC founders Joel Fisher and Dave Schorr, say this hasn’t been possible before, and wouldn’t be now without the deep layers of semantic analysis that reveal detailed and relevant information drawn from millions of sources. The list aims to further play out what has been the company’s value proposition to foodies: the ability to search at the level of lobster roll vs. seafood, say, or chicken parm vs. Italian food.
“We have all this data now from across the USA and we don’t think anywhere else is this data aggregated and analyzed on a national scale,” says Fisher. “No one else has been able to do this to test where the Semantic Web is today, where the capabilities are. We have harnessed vast amounts of information to even understand this kind of data and it gets much, much deeper.”
The service takes an all-hands-on-deck approach, crawling anywhere and everywhere people have posted reviews, photos, and other content on restaurant dishes to glean information that feeds both the site at large and the new list. As Schorr describes it, Dishtip’s semantic approach does not work from a specific pre-built ontology of dishes. Instead, it involves training the system to learn about food-related items, drawing on a variety of structured and unstructured document and website content pools to understand common types of foods, attributes relevant to various items consumed in restaurants, trending words, and more. “We can learn things not in a traditional food directory based on statistical patterns in the data,” he adds. “It isn’t easy to figure out all these attributes. They’re all derived from lots of deep inspection,” he says, describing digging through millions of recipes to figure out, for instance, which components make something creamy vs. spicy or savory vs. salty.
“That’s what goes on in the backend with tuples and weights. We take the semantics in multiple different data areas, mash them up and combine and roll them back up into the dish domain. You couldn’t figure this out looking at menus or reviews,” he says. “You have to join all this from other data sources, some that you don’t even see, some that might not even be directly related to that dish. But that’s how semantic stuff works. There are lots and lots of statistics and weights and learning.” To ensure it’s matching the right dishes with the right restaurants from the wealth of web commentary out there, Dishtip also took a multifaceted approach to creating unique identity attributes that give it statistical confidence that it is the same venue verified across other sources.
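To make the “tuples and weights” idea a little more concrete, here is a minimal sketch of how attribute weights learned at the ingredient level might be rolled up into the dish domain. It is an illustration only, not Dishtip’s actual method: the tuple values, the `INGREDIENT_ATTRIBUTE_WEIGHTS` table, and the `roll_up` and `top_attribute` helpers are all hypothetical, and the simple averaging stands in for whatever statistics the company really uses.

```python
from collections import defaultdict

# Hypothetical (ingredient, attribute, weight) tuples. In a real system these
# would be learned from recipe data; the numbers here are invented.
INGREDIENT_ATTRIBUTE_WEIGHTS = [
    ("cream", "creamy", 0.9),
    ("butter", "creamy", 0.6),
    ("chili", "spicy", 0.9),
    ("chorizo", "spicy", 0.5),
    ("chorizo", "savory", 0.7),
    ("bacon", "savory", 0.8),
    ("bacon", "salty", 0.6),
]


def roll_up(ingredients, tuples=INGREDIENT_ATTRIBUTE_WEIGHTS):
    """Average ingredient-level attribute weights into dish-level scores."""
    ingredients = set(ingredients)
    if not ingredients:
        return {}
    totals = defaultdict(float)
    for ingredient, attribute, weight in tuples:
        if ingredient in ingredients:
            totals[attribute] += weight
    # Divide by the number of ingredients so larger dishes do not score
    # higher simply because they have more components.
    return {attr: total / len(ingredients) for attr, total in totals.items()}


def top_attribute(ingredients):
    """Return the highest-scoring attribute for a dish, or None."""
    scores = roll_up(ingredients)
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    # Ingredient list is assumed for illustration, not taken from a real menu.
    print(roll_up({"chorizo", "bacon"}))
    print(top_attribute({"chorizo", "bacon"}))
```

The point of the sketch is that nothing in a menu or review has to say a dish is savory; the label falls out of data joined from another source, which is the kind of indirect inference Schorr describes.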
From early tests of response to the Top 100 list, “people’s comments tell us the results are extremely accurate as it relates to is this a top dish or not,” says Fisher. “Which I think is a testament to how well the weighting and all those computations are working. That’s part of making this a useable good experience.”
Fisher adds that image recognition gets tricky when you try to associate an image and various attributes using an algorithm. “Although the images are not currently 100% perfect on our site (although we have a mechanism for corrective learning), I think we are ahead of the curve,” he says.
Fisher says that the technology behind Dishtip could work for other verticals, as well. One example the company gives is in the travel realm, where users can also benefit from asking questions differently than they’ve become accustomed to. Just as diners can search for blueberry pancakes rather than for a diner, those with beach travel in mind could search more granularly, for gentle surf or perhaps nice views for that family-friendly vacation. “Now if you search on TripAdvisor they can provide answers for beach or family vacation, but that doesn’t really narrow it down enough to the specific type you are looking for, and so you might have to comb through hundreds or thousands of reviews to find a match,” Schorr says. “But when you supply semantic analysis of tens of thousands of reviews of hotels around the world you can start to extract correlations of family-friendly and reconcile that against third-party vocabularies, like geographic features, historical weather trends and so on, and you can take travel search to the next level.”
Schorr says that many people who see the site assume that the content is provided by Dishtip users (either consumers or restaurants). “When you really point out that the site is all driven by machine intelligence, then people can see just how far the technology has come,” he says.