A Brief History of Semantics

As a word, “semantics” was first used by Michel Bréal, a French philologist (a language historian), in 1883. He studied how languages are organized, how they change over time, and the connections within languages. Generally speaking, semantics is the study of language and its meaning.

More specifically, semantics can be used to describe how words can have different meanings for different people because of their experiential and emotional backgrounds. A language can be a natural language, such as French, Dutch, or Hindi, or it can be an artificial language, such as a programming language for computers.

Theoretical computer scientists study and develop artificial languages, while linguists study natural languages.

In 1967, Robert W. Floyd wrote a paper, “Assigning Meanings to Programs,” describing the use of language semantics in computers, and he has been given credit for starting the field of programming language semantics. Floyd described programming languages as having two parts: semantics (meaning) and syntax (form). For a computer to read an algorithm, both its syntax and its semantics must be encoded precisely so the computer can process them automatically. (Humans do this at the subconscious level.)
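The split between syntax and semantics can be illustrated with a small sketch. It is not Floyd's method, only an informal illustration: Python's standard `ast` module handles the syntax (turning text into a tree), and a hypothetical `evaluate` function supplies the semantics (turning the tree into a value). The function names and the restriction to `+` and `*` are assumptions made for brevity.

```python
import ast


def evaluate(node):
    """Assign a meaning (a number) to a parsed arithmetic expression."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        left, right = evaluate(node.left), evaluate(node.right)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Mult):
            return left * right
    # Anything else is valid syntax that this sketch gives no meaning to.
    raise ValueError("unsupported syntax")


def meaning(source):
    tree = ast.parse(source, mode="eval")  # syntax: text -> tree
    return evaluate(tree)                  # semantics: tree -> value


# Two different forms (syntax) that share one meaning (semantics).
same = meaning("2 + 3 * 4") == meaning("(3 * 4) + 2")
```

Here `same` is true because both strings denote the same number even though they are written differently, which is the distinction between form and meaning described above.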

Professor Donald Knuth said this about Floyd: “In the old days, programmers would just twiddle with programs till they seemed to work. His approach of marrying math with computer science was a revelation to the field.”

The concept of a global information system became technologically possible in the late 1980s. By 1985, the internet was gaining popularity in Europe. In 1988, the first direct IP connection between North America and Europe occurred. This was quickly followed by discussions of a web-like communications and information system.

The World Wide Web and Social Media

When people began working on the World Wide Web, the “type” of companies interested in the technology determined the business direction it would take. As people experimented with the World Wide Web, the focus shifted to social interactions and social media platforms like Facebook, LinkedIn, Google+, Instagram, Vine, Pinterest, Twitter, and Tumblr, all of which require human interaction.

Because natural language has a structure humans can interpret but machines cannot, humans had to “read” a natural language’s meanings and become a part of the system.

More recently, researchers have begun merging programming languages with linguistics, combining semantics and big data as they strive to take artificial intelligence to the next level. Semantics is more of a cognitive process than anything files and computer memory alone can capture. It is the process of designing and using a language for communicating and expressing knowledge. It can also provide a foundation for the process of thinking.

The Semantic Web vs. the World Wide Web

In May of 2001, Scientific American published an article titled “The Semantic Web,” written by Tim Berners-Lee, James Hendler, and Ora Lassila. (Berners-Lee founded the World Wide Web Consortium, or W3C, in 1994 and served as its director.) Their article described a new way to use and search the Internet, an added dimension full of new possibilities. While a human can read the text of an HTML web page and understand it, a computer or search engine cannot (unless tags it can read are deliberately inserted). This is because HTML is a markup language that describes how a page is structured and displayed, not what its content means.

The Semantic Web is an extension of the World Wide Web, and has focused on technology. The World Wide Web needs a human to read and interpret its pages, while the Semantic Web needs a human presence only to initiate the request. It uses the “hidden,” encoded data and, more recently, natural language processing to search for, compile, and organize information from the web.

Semantics and Linked Data

The concept of linked data has been a very useful aspect of the Semantic Web, and is remarkably functional as an education tool. It can be used to publish and share information all across the internet. The phrase “Linked Open Data” has been used since at least 2007, when the mailing list for Linking Open Data was first created. The Linking Open Data community’s goal was to extend the web with a data commons, providing information, commonly in the form of graphs, as free information.

The internet provides a nearly infinite amount of information, ranging from spreadsheets to images and from videos to the websites that bring it all together. Links connect one site to another and allow us to discover a constantly growing stream of information. The World Wide Web is described as a web of linked “documents,” while linked data describes a web of linked “data.”
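A web of linked “data” can be pictured as a set of small facts, each connecting one resource to another or to a value. The following is a minimal sketch under stated assumptions: the `example.org` identifiers, the predicate names, and the `follow` helper are all hypothetical, and real linked data would typically use RDF and a published vocabulary rather than plain Python tuples.

```python
# Each fact is a (subject, predicate, object) triple.
# All identifiers and predicate names below are hypothetical.
TRIPLES = [
    ("http://example.org/person/breal", "name", "Michel Bréal"),
    ("http://example.org/person/breal", "introduced",
     "http://example.org/term/semantics"),
    ("http://example.org/term/semantics", "label", "semantics"),
    ("http://example.org/term/semantics", "firstUsed", "1883"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def follow(subject, path, triples=TRIPLES):
    """Follow a chain of predicates from one resource to the data it links to."""
    current = [subject]
    for predicate in path:
        current = [o for s in current for o in objects(s, predicate, triples)]
    return current


# Start at a person, hop to the term they introduced, then to its year.
year = follow("http://example.org/person/breal", ["introduced", "firstUsed"])
```

Because the object of one triple can be the subject of another, a program can move from fact to fact much as a reader moves from page to page by clicking links.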

Linked data allows computers to combine data and information in many complex ways. This was made possible by standardized vocabularies and their adoption by the major search engines. Bing, Google, and Yahoo have started using microdata formats placed within HTML documents to communicate information.
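To make the idea of microdata placed within HTML more concrete, here is a small sketch that reads `itemprop` annotations from a page fragment using Python's standard `html.parser` module. The sample page and the `ItempropParser` class are illustrative assumptions, and the sketch is deliberately simplified: it ignores nested items and values carried in attributes, which a real microdata reader would need to handle.

```python
from html.parser import HTMLParser


class ItempropParser(HTMLParser):
    """Collect itemprop name/value pairs from microdata-annotated HTML."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop of the tag we are inside, if any

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("itemprop")

    def handle_data(self, data):
        # Record the visible text of an annotated tag as the property value.
        if self._current and data.strip():
            self.properties[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


# Hypothetical page fragment annotated with the schema.org Person type.
PAGE = """
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Michel Bréal</span> was a
  <span itemprop="jobTitle">philologist</span>.
</div>
"""

parser = ItempropParser()
parser.feed(PAGE)
properties = parser.properties
```

A human reader sees only the sentence, while the program recovers labeled values (`name` and `jobTitle`) from the same markup, without having to interpret the surrounding prose.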

Computers that can use natural languages gain access to many new forms of data. Consider these sentences in a spoken format: “fruit flies like pears” and “time flies like a butterfly.” While the sentence structure of each example is quite similar, their meanings are very different, with the words “flies” and “like” having different definitions that are determined by context. The example shows how even a remarkably simple sentence requires a significant amount of linguistic understanding.

While computers are excellent at using the simple language of mathematics, human languages are remarkably confusing in their complexity and periodic exceptions to the rules. A chess-playing program can play against and defeat most people in a chess game. The same cannot be said for trivia-playing programs. A normal child could beat such a program, because the program lacks a sufficiently broad understanding of the language’s meaning, context, and subtleties. This problem applies to a significant number of services and applications.

Without understanding context, a search engine cannot return relevant results for words with multiple meanings.
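One very simple way to use context is to score each possible sense of a word by how many of its typical neighbors appear in the sentence. The sketch below applies that idea to the two “flies” sentences. The tiny lexicon, the sense labels, and the `disambiguate` function are invented for illustration, and production systems rely on far richer models than word overlap.

```python
# Hypothetical mini-lexicon: each sense lists cue words that tend to appear near it.
SENSES = {
    "flies": {
        "insect (noun)": {"fruit", "pears", "banana", "swarm"},
        "passes quickly (verb)": {"time", "arrow", "butterfly", "hours"},
    },
}


def disambiguate(word, sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence."""
    context = set(sentence.lower().split()) - {word}
    scores = {
        sense: len(cues & context)
        for sense, cues in senses[word].items()
    }
    return max(scores, key=scores.get)


first = disambiguate("flies", "fruit flies like pears")
second = disambiguate("flies", "time flies like a butterfly")
```

With this lexicon, the first sentence selects the insect sense and the second selects the verb sense. The same word receives different meanings purely because of the words around it, which is the kind of context a search engine needs in order to handle ambiguous queries.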

Semantics and Virtual Assistants

Barry Zane, vice president of engineering for Cambridge Semantics, said:

“Semantic-based technologies are the key to making data easily understandable to both humans and computers, enabling data harmonization using common business meanings.”

With the World Wide Web as a foundation, and the evolution of semantics to include natural languages, virtual assistants are now becoming a reality. Apple’s Siri provides a good example of a virtual assistant. Siri doesn’t just retrieve information; it also helps people complete their online work more quickly and easily. Siri can interpret the spoken word, up to a point, and can also perform a variety of services for the user. Initially, the tasks Siri could perform focused on the mobile internet user. It would book restaurant reservations, check the status of a flight, or coordinate various Internet activities. Siri has now transitioned to other platforms and devices, including automobiles.

In the last two decades, the dream of adding natural language processing to computers, and having them speak as casually as humans, has developed significantly.

Virtual assistants and services are beginning to exchange useful information all over the Semantic Web. Virtual assistants, such as Google Now and Siri, have inspired a wide range of start-ups, especially those providing automated services. We are witnessing the emergence of new semantics services and technologies. The merging of trends in technology and the business world is creating a new cycle of innovation affecting the way individuals and businesses perform their work, and even how data is collected into useful information.

The flexibility of virtual assistants working across the web, and available on different devices, is an important part of the Semantic Web’s purpose.

One aspect of the Semantic Web is its ability to communicate with other computers and operate without human presence. A human needs to initiate the work, but then they can do something else with their time. The use of semantics provides a virtual assistant capable of working independently and processing significant amounts of data.

The Development of Chatbots

Chatbots, a relatively new tool for communicating with customers and potential customers, started gaining popularity around 2018-2020. Chatbots are designed to simplify communications between computers and humans. Natural language processing systems built on “transformers” (introduced in 2017), combined with the open-source nature of many of these models, have somewhat improved communications between bots and humans.

Chatbots provide a new way for organizations to deal with the needs of potential customers in real time. Though still in early stages of use, chatbots can respond 24 hours a day to online queries. Several organizations, including Google, Amazon, Facebook, Apple, and Microsoft, have developed chatbots, though some are still working out the kinks.

