AlchemyAPI has released its AlchemyVision Face Detection/Recognition API, which, in response to an image file or URI, returns the position, age, gender, and, in the case of celebrities, the identities of the people in the photo and connections to their web sites, DBpedia links and more.
According to founder and CEO Elliot Turner, the company is taking a different direction from Google and Baidu with its visual recognition technology. Those two vendors, he says in an email response to questions from The Semantic Web Blog, “use their visual recognition technology internally for their own competitive advantage. We are democratizing these technologies by providing them as an API and sharing them with the world’s software developers.”
The business case for those developers to leverage the Face Detection/Recognition API includes demographic profiling: companies can use facial recognition to understand the age and gender characteristics of their audience based on profile images and sharing activity, Turner says.
“Large publishers and content recommendation networks are interested in identifying when famous people appear in photographs and video as this typically drives increased engagement of online content,” he adds. “Lastly, via a customer’s ability to load custom faces into the system, they are able to drive a wealth of different app ideas such as automatically tagging one’s friends in their shared photos.”
The API leverages AlchemyAPI’s internal knowledge graph to link well-known entities in photos to other pieces of information. “The AlchemyAPI knowledge graph can be described as a taxonomy of ideas, it encompasses relationships between people, places, things, etc.,” says Turner. “The knowledge graph can be used with facial recognition and comes back with a type hierarchy.” For example, in the image below, Barack Obama is identified by name, but the knowledge graph also shows that he is a person, a politician and a Democrat, he says.
“It’s the taxonomy of an idea; in this case a proper noun. Barack Obama can then be disambiguated into further subtypes such as President, PoliticalAppointer, AwardWinner, U.S. Congressperson, and USPresident,” Turner notes.
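To make the shape of such a result concrete, here is a minimal Python sketch of how a developer might handle a response that carries position, age, gender and, for recognized celebrities, a name with its type hierarchy and links. The JSON layout, every field name, and the sample values are assumptions made up for illustration; they are not AlchemyAPI’s documented schema, and no network call is made.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical payload: the structure, field names and values below are
# illustrative assumptions, not the real AlchemyVision response format.
SAMPLE_RESPONSE = """
{
  "faces": [
    {
      "position": {"x": 120, "y": 80, "width": 200, "height": 200},
      "age_range": "45-54",
      "gender": "MALE",
      "identity": {
        "name": "Barack Obama",
        "types": ["Person", "Politician", "President"],
        "links": {"dbpedia": "http://dbpedia.org/resource/Barack_Obama"}
      }
    },
    {
      "position": {"x": 400, "y": 95, "width": 180, "height": 180},
      "age_range": "25-34",
      "gender": "FEMALE"
    }
  ]
}
"""


@dataclass
class Face:
    """One detected face; identity fields stay empty for unknown people."""
    position: dict
    age_range: str
    gender: str
    name: Optional[str] = None
    types: List[str] = field(default_factory=list)
    links: dict = field(default_factory=dict)


def parse_faces(payload: str) -> List[Face]:
    """Turn the (assumed) JSON payload into Face objects."""
    faces = []
    for raw in json.loads(payload).get("faces", []):
        # Only recognized faces carry an "identity" object in this sketch.
        identity = raw.get("identity") or {}
        faces.append(
            Face(
                position=raw["position"],
                age_range=raw["age_range"],
                gender=raw["gender"],
                name=identity.get("name"),
                types=identity.get("types", []),
                links=identity.get("links", {}),
            )
        )
    return faces


def recognized(faces: List[Face]) -> List[Face]:
    """Keep only the faces that were matched to a known person."""
    return [f for f in faces if f.name is not None]


if __name__ == "__main__":
    for face in recognized(parse_faces(SAMPLE_RESPONSE)):
        print(face.name, "->", ", ".join(face.types))
```

In this sketch every detected face carries demographic attributes, while the name, type hierarchy and links are filled in only when a face is matched to a known person, mirroring the split between detection and recognition described above.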
The latest API is underpinned by the company’s deep learning capabilities to understand a picture’s content and context for extraction and tagging. Because AlchemyAPI used an unsupervised deep learning approach to build AlchemyVision, the Face Detection/Recognition system “can learn new people and other visual concepts without the need for human annotation. This is a unique capability that allows us to rapidly grow the system’s understanding of the world in an automated fashion,” Turner says. At launch, the face recognition system was capable of identifying 60,000 different celebrities.
Turner says that, like the company’s AlchemyLanguage product, AlchemyAPI views AlchemyVision as a product family to which it will continue to add features that make it possible to take advantage of all of the actionable information contained in the world’s images. “Reading signs, logos and product labels is the next step in the AlchemyVision family and we’re excited about making this public later in the year,” he says.