The problem isn’t a shortage of data that could yield valuable insights. Rather, the data is scattered: it lives in diverse sources, from SEC EDGAR filings to articles in publications such as the New York Times and Yahoo! Finance to real-time commentary on social platforms like Twitter. That makes it difficult for fund managers to form a comprehensive view of the companies they cover, and they lack reliable Linked Data solutions. Their headaches mount when the data sits in financial reports that use different accounting standards and can’t easily be converged.
For the last few years, MIT Senior Industrial Liaison Officer Graham G. Rong has been researching how to improve financial and social analytics using a Semantic Web approach. The work has revolved around converting XBRL (Extensible Business Reporting Language) financial data to RDF (Resource Description Framework), using the XBRL taxonomy’s data elements together with an ontology developed to capture the meaning of that information. The converted data is then combined with other Internet data to help fund managers compare companies, find connections among them, and draw conclusions.
For instance, with the digitization of financial reporting it is possible to understand information such as quarterly or annual revenue in the context of other factors that might be found on the web, such as a company’s environmental, or green, ranking, or whether it has close relationships with partners in unstable countries overseas, either of which can affect investment decisions. “The problem to address is to help fund managers who are looking for comprehensive information that includes a company and the ecosystem information around it, which is not always that easily available, so that you can analyze a company proactively upstream and downstream,” he says.
Rong’s effort continues, with a recent next-stage prototype of the application advancing the concept of XBRL convergence.
The XBRL Dialect Challenge
As Rong explains, a challenge exists: While XBRL is a global standard for filing, publishing, and exchanging financial statements and for analyzing the information they contain, different countries have different XBRL dialects, so to speak. The disparity arises because their financial accounting standards differ, which undermines the semantic consistency of taxonomies across jurisdictions. Under US GAAP (Generally Accepted Accounting Principles), for example, operating profit equals gross profit minus operating expenses. But China’s version of operating profit, known as YingYeLiRun, is a combination of main profit plus other profits, he says. The differences in tag language aren’t the problem to address, as essentially “the XBRL tags are the same, [but] the meanings [behind them] are different,” he says.
Today, firms expend a lot of labor and time trying to manually converge – or estimate, as they call it – those meanings so as to make company comparisons possible, he explains. Linked Data can be a means of performing XBRL convergence automatically; it provides a best practice for exposing, sharing, and connecting pieces of data, information, and knowledge using URIs (Uniform Resource Identifiers) and RDF. “With Linked Data we can do the matching and mapping for this to happen automatically,” Rong says, meaning the conversion of another country’s standard for operating profit, for instance, to US GAAP, or vice versa.
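To make the dialect problem concrete, here is a minimal Python sketch of the two operating-profit definitions described above, stored as data so that a program can compare them. It is not Rong's system: the jurisdiction keys, the component names (`GrossProfit`, `MainProfit` and so on), and the figures in the usage note are hypothetical labels chosen for illustration. Only the two formulas come from the article.

```python
# Each definition is a list of (component, sign) terms.
# The formulas follow the article; all names are hypothetical labels.
DEFINITIONS = {
    "us-gaap": [("GrossProfit", +1), ("OperatingExpenses", -1)],
    "cn": [("MainProfit", +1), ("OtherProfits", +1)],  # YingYeLiRun
}


def operating_profit(jurisdiction, facts):
    """Compute operating profit under one jurisdiction's definition."""
    return sum(sign * facts[name] for name, sign in DEFINITIONS[jurisdiction])


def same_semantics(a, b):
    """True only if two jurisdictions define the concept identically."""
    return DEFINITIONS[a] == DEFINITIONS[b]
```

With made-up figures, `operating_profit("us-gaap", {"GrossProfit": 500, "OperatingExpenses": 320})` gives 180 and `operating_profit("cn", {"MainProfit": 150, "OtherProfits": 40})` gives 190, while `same_semantics("us-gaap", "cn")` is `False`. The shared tag hides two different calculations, which is the gap an ontology-based mapping would have to bridge. How the Chinese components relate to the US ones is not specified here, so the sketch stops short of converting one into the other.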
“The Linked Data concept lets you develop ontologies to do matching and so to ease convergence,” he says. Rong’s work has included developing comprehensive ontologies for such convergence purposes, which are extensible by users to incorporate other knowledge bases, as well. This is useful for companies filing financial reports with the SEC, for example, which allows for extensible XBRL elements. With these ontologies in place, “you can do apples to apples comparisons” across different companies’ data, regardless of what standard they adhered to when filing their financial reports.
A big benefit of taking this approach is that it can drastically reduce the costs of information analysis. At the same time, it makes it possible for fund managers or data analysts to spend more time focused on meaningful work, rather than tedious, manual data convergence, he says. “They can get data – useful information to work with, not just raw data – more quickly,” he explains.
Another thing Rong and his research team realized is that the same Linked Data capability – mapping an ontology of a specific domain with XBRL taxonomies and word synonym dictionaries, then querying the ontology and searching/reasoning against XBRL data – can be leveraged to actually validate an XBRL filing, as well. “The quality of a financial filing matters” to analysis, he says, and it also matters to government agencies that oversee financial reporting and can rely on the capability to ensure that companies make necessary corrections to their data.
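A rough idea of how such validation might work, sketched in Python under stated assumptions: a small synonym dictionary resolves the labels in a filing to canonical concepts, and a single rule (the US GAAP operating-profit relationship mentioned earlier) is checked against the reported figures. The dictionary entries, concept names, and the `validate` function are hypothetical; the real system reasons over an ontology rather than a hard-coded rule.

```python
# Hypothetical synonym dictionary: filing label -> canonical concept.
SYNONYMS = {
    "gross profit": "GrossProfit",
    "operating expenses": "OperatingExpenses",
    "opex": "OperatingExpenses",
    "operating profit": "OperatingProfit",
    "operating income": "OperatingProfit",
}

REQUIRED = {"GrossProfit", "OperatingExpenses", "OperatingProfit"}


def validate(filing, tolerance=0.0):
    """Return a list of issues found in a {label: value} filing."""
    facts, issues = {}, []
    for label, value in filing.items():
        concept = SYNONYMS.get(label.strip().lower())
        if concept is None:
            issues.append(f"unrecognized label: {label}")
        else:
            facts[concept] = value
    missing = REQUIRED - facts.keys()
    if missing:
        issues.append("missing concepts: " + ", ".join(sorted(missing)))
        return issues
    # Rule: operating profit = gross profit - operating expenses.
    expected = facts["GrossProfit"] - facts["OperatingExpenses"]
    if abs(facts["OperatingProfit"] - expected) > tolerance:
        issues.append(
            f"operating profit {facts['OperatingProfit']} != expected {expected}"
        )
    return issues
```

A filing whose figures are consistent returns an empty list; one with a mismatched total or an unknown label returns a description of each problem, which is the kind of feedback a filer or regulator could act on.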
Moving Financials Forward
As for the system’s main capability, translating XBRL to RDF, Rong says it can be useful to organizations such as the EDM Council, which originated and stewards the Financial Industry Business Ontology (FIBO). FIBO is a business conceptual ontology and an operational ontology delivered together, designed to be useful both to the financial industry and the regulatory community in understanding the complex patterns and relationships of information characteristic of the sector, with the goal of driving greater transparency.
“This can be a good test bed for them because we have a whole system that can grab XBRL data from the SEC, run it through, convert it into RDF and put it into a triple store for consumption and analysis by integrating and using the FIBO ontologies,” Rong says.
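The conversion step in that pipeline can be sketched in a few lines of Python using only the standard library. This is an illustration rather than the prototype's code: the sample instance is heavily simplified (it omits unit definitions, for example), and the `http://example.org/` IRIs, the `gaap` prefix, and the all-zero company identifier are placeholders. A real implementation would use the official taxonomy namespaces and load the output into an actual triple store.

```python
import xml.etree.ElementTree as ET

XBRLI = "http://www.xbrl.org/2003/instance"
EX = "http://example.org/"  # hypothetical base IRI for subjects
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Simplified, made-up XBRL instance with two facts.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:gaap="http://example.org/taxonomy#">
  <context id="FY2015">
    <entity>
      <identifier scheme="http://www.sec.gov/CIK">0000000000</identifier>
    </entity>
    <period><startDate>2015-01-01</startDate><endDate>2015-12-31</endDate></period>
  </context>
  <gaap:Revenues contextRef="FY2015" unitRef="USD" decimals="0">1000</gaap:Revenues>
  <gaap:OperatingIncomeLoss contextRef="FY2015" unitRef="USD" decimals="0">180</gaap:OperatingIncomeLoss>
</xbrl>"""


def xbrl_to_triples(xml_text):
    """Turn each fact into a (subject, predicate, value) triple."""
    root = ET.fromstring(xml_text)
    # Map each context id to the reporting entity's identifier.
    entities = {}
    for ctx in root.findall(f"{{{XBRLI}}}context"):
        ident = ctx.find(f"{{{XBRLI}}}entity/{{{XBRLI}}}identifier")
        entities[ctx.get("id")] = ident.text.strip()
    triples = []
    for el in root:
        ctx_id = el.get("contextRef")
        if ctx_id is None:
            continue  # contexts and other non-fact elements
        namespace, _, local = el.tag[1:].partition("}")
        subject = f"{EX}company/{entities[ctx_id]}"
        triples.append((subject, namespace + local, el.text.strip()))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with decimal-typed values."""
    return "\n".join(
        f'<{s}> <{p}> "{o}"^^<{XSD_DECIMAL}> .' for s, p, o in triples
    )
```

Calling `to_ntriples(xbrl_to_triples(SAMPLE))` yields one line per fact. Periods and units are ignored here to keep the sketch short; a production converter would carry them into the graph as well.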
Similarly, government agencies’ own ontologies can be integrated into the system and leveraged for purposes very different from those of most fund managers. They may be more interested in how accurate the data in a filing is, for instance, or whether a company’s performance data seems out of whack compared to others in its industry in that region, which may trigger some alarm bells. “It’s using the same information, the same data, but based on different views you can get different conclusions based on your interest or concerns,” Rong says.
The new prototype also features integration with a mapping tool, which offers a way to plot companies on a map, and view related corporate information when rolling over the location. As an example, Rong says a fund manager may want to run a SPARQL query, based on financial data translated into RDF, of all companies in a particular sector in a particular area that had revenue above a certain level in the last quarter.
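A query of that kind might look something like the one assembled by the Python helper below. The `ex:` prefix and the `ex:sector`, `ex:region`, and `ex:quarterlyRevenue` properties are hypothetical; the vocabulary the prototype actually uses is not described here, so treat this as a sketch of the query's shape only.

```python
def build_query(sector, region, min_revenue):
    """Build a SPARQL query for companies in a sector and region
    whose quarterly revenue exceeds a threshold."""

    def literal(text):
        # Escape backslashes and quotes so the value stays a string literal.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    threshold = float(min_revenue)  # only a number reaches the FILTER
    return f"""PREFIX ex: <http://example.org/schema#>
SELECT ?company ?revenue WHERE {{
  ?company ex:sector {literal(sector)} ;
           ex:region {literal(region)} ;
           ex:quarterlyRevenue ?revenue .
  FILTER (?revenue > {threshold})
}}
ORDER BY DESC(?revenue)"""
```

For example, `build_query("Energy", "Texas", 1000000)` produces a query that a SPARQL endpoint over the converted data could run, with the results then handed to the mapping tool for plotting.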
“In 2016 on the R&D side we are going to keep developing new features and functions like this that could be useful,” Rong says. “Linked Data can be considered part of Big Data, providing lots of good features and functionality to contribute to its powerhouse value.”