by Neil Raden
I haven’t been formally trained on WolframAlpha, nor have I thoroughly investigated it. In fact, I’ve spent more time reading the hype about it than I have actually kicking the tires. But from the time I’ve spent, some things are already obvious. First and most prominently, WolframAlpha does not rely on semantic technology, whether Semantic Web or Linked Data concepts, and it appears to have no underlying ontology driving its structure or information. I may be mistaken in saying “no underlying ontology,” as there may be elements of ontology that I’m not aware of, but overall, it is not an application based on semantics. There does not appear to be a taxonomy of terms linking them across knowledge domains.
This doesn’t really seem strange because Dr. Wolfram, for all his good intentions, has developed software for mathematicians, scientists and engineers, not enterprise applications and data management. In the lingo of data people, the data in WolframAlpha is hard-coded and proprietary, a horrifying prospect. The way the knowledge domains seem to be developed does not appear to conform to what would be considered good data management practice in an enterprise today. However, WolframAlpha is a product, not a data management tool. There doesn’t appear to be a way to expand its knowledge base except via Wolfram Research’s “curating” process which is, presumably, only done by Wolfram Research.
The implications of this are rather severe (though in fairness we can expect the product to mature). In essence, WolframAlpha is based on Wolfram’s previous (and extremely successful) product, Mathematica. Therefore concepts like equations, infinite series, Fourier transforms, topological concepts and algorithms for solving mathematical constructs are all part of the foundation of the product. What is new is the seemingly vast domain of knowledge that has been added in many areas. What is missing, though, is the essence of semantic technology: the meaning and relationships of things. Instead, terms are embedded in a knowledge domain and expressed in mathematical terms, so they are only understood in their own context. Terms like velocity, weight, distance, country, airfoil, downforce, oxidative phosphorylation, etc. may be understood as words that are part of mathematical statements, but there is no representational framework (taxonomy and/or ontology) that would allow them to be understood across domains or (and this is a big problem, I think) accessed through means other than the WolframAlpha interface (more on this in a minute).
Initially I assumed that the interface was operating on natural language, but it is not. It doesn’t actually parse language and derive the meaning of questions; it merely tolerates a statement and picks out the words or symbols it understands. Actually, “tolerate” is a bit of an overstatement. Until I got the hang of it, I presented 42 queries to WolframAlpha without getting a single answer. The lack of real parsing quickly becomes obvious when entering a question like, “How much coal does it take to power a Prius for 50,000 miles and how does this compare to the CO2 output of a conventional Camry?” or, “How is fuel consumption of a Boeing 777 with GE90 engines affected if the average ambient temperature of the planet falls 4 degrees centigrade?” This is the sort of thing the tool should actually be able to calculate, but only if the question is phrased in the dialect the system understands. Phrased as above, it is not going to give you a satisfactory answer. Search engines like Google and Yahoo accept questions like this too, but they develop no understanding of their meaning. WolframAlpha, likewise, silently overlooks your intent and works with what it knows.
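To make the distinction concrete, here is a toy sketch of keyword spotting, the behavior described above. It is not WolframAlpha’s implementation, which I have not seen; the vocabulary `KNOWN_TERMS` and the function `spot_keywords` are invented for illustration. The point is only that once a question is reduced to the terms a system recognizes, two questions with different intent can become indistinguishable.

```python
import re
from typing import List

# Hypothetical vocabulary: the only terms this toy system "understands".
KNOWN_TERMS = {"prius", "camry", "coal", "co2", "miles"}


def spot_keywords(question: str) -> List[str]:
    """Return the recognized terms in order, discarding everything else."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t in KNOWN_TERMS]


# Two questions that ask different things...
q1 = "How much coal does it take to power a Prius for 50,000 miles?"
q2 = "Does a Prius for 50,000 miles take more coal than it saves?"

# ...reduce to the same bag of recognized terms once intent is dropped.
terms1 = spot_keywords(q1)
terms2 = spot_keywords(q2)
same_bag = sorted(terms1) == sorted(terms2)
```

Here `same_bag` is true: the words that carried the intent (“how much,” “more than it saves”) were never in the vocabulary, so they vanish without any warning to the user.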
Doug Lenat wrote about this in his blog, and it is actually quite a severe problem. He gave this example:
“If it knows that exactly 17 people were killed in a certain attack, and if it also knows that 17 American soldiers were killed in that attack, it doesn’t return that attack if you ask for ones in which there were no civilian casualties, or only American casualties. It doesn’t perform that sort of deduction.”
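The deduction Lenat describes is small enough to sketch. The following is a minimal illustration, not a description of how WolframAlpha or any real system stores its facts; the `Attack` record, its field names, and the sample data are all hypothetical. It contrasts a literal lookup, which only finds facts that were recorded explicitly, with a single inference rule: if every death is accounted for by soldiers, there were no civilian deaths.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attack:
    # Hypothetical record layout, invented for this illustration.
    name: str
    total_killed: int
    american_soldiers_killed: int
    civilians_killed: Optional[int] = None  # None means "never recorded"


def lookup_no_civilians(attacks: List[Attack]) -> List[str]:
    """Literal lookup: finds only attacks where the fact is stored explicitly."""
    return [a.name for a in attacks if a.civilians_killed == 0]


def deduce_no_civilians(attacks: List[Attack]) -> List[str]:
    """Adds one rule: if soldiers account for all deaths, no civilians died."""
    found = []
    for a in attacks:
        if a.civilians_killed == 0:
            found.append(a.name)
        elif (a.civilians_killed is None
              and a.total_killed == a.american_soldiers_killed):
            found.append(a.name)
    return found


# Made-up sample data; the first record mirrors the numbers in Lenat's example.
attacks = [
    Attack("attack A", total_killed=17, american_soldiers_killed=17),
    Attack("attack B", total_killed=30, american_soldiers_killed=12),
]

literal = lookup_no_civilians(attacks)   # misses attack A
deduced = deduce_no_civilians(attacks)   # finds attack A
```

The literal lookup returns nothing, while the version with the rule returns attack A. The rule itself is trivial; the hard part, and the part a semantic layer is meant to supply, is knowing that soldiers and civilians are disjoint categories in the first place.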
Because “context” in WolframAlpha is limited to the equations (I’m using the term “equations” loosely) of a particular knowledge domain, the meaning of a term can change depending on how it is used, giving you completely wrong answers, with no warning that they are wrong. All of this is enough to make a data governance geek cringe. The only thing worse than being wrong is being wrong and believing you are right, and acting on it. This argues against WolframAlpha being used in any sort of decision management environment such as those James Taylor and I described in our book “Smart (Enough) Systems.”
It is too early to forecast the trajectory of WolframAlpha. Will it get gobbled up by a search vendor or one of the technology giants, will it catch on and take Wolfram to heights far beyond its success with Mathematica, or will it settle into a comfortable niche in the same computation market the company serves now? In a way, it’s possible to see WolframAlpha as a nice complement to search engines, especially if the search engines themselves are able to tunnel to WolframAlpha and craft the questions themselves, but this is unlikely to happen without a semantic layer within WolframAlpha. Because knowledge isn’t static, one can assume that new knowledge and knowledge domains will flow in too quickly for Wolfram’s current “curating” approach. The only way to keep meaning and relationships under control without a million clerks is with semantic technology.
ABOUT THE AUTHOR
Neil Raden is the founder of Hired Brains, a research and advisory firm in the BI industry. In the past, he’s been a mathematician, an actuary, a software developer, a DSS consultant, and a founder of an SI doing DSS/data warehousing/BI. He has almost 30 years of experience and is a widely published writer, well-known speaker, and consultant.
He has personally designed over 100 data warehouses and implemented dozens of large analytical applications in finance, marketing, distribution, logistics, actuarial, scientific, statistical, consumer products and more. He has written and/or contributed to three books:
“Smart (Enough) Systems,” Prentice Hall, 2007
“2008 BPM and Workflow Handbook,” Future Strategies, 2008 (contrib.)
“Planning and Designing the Data Warehouse,” Prentice Hall, 1996