Machine Learning: From Then Until Now

Machine Learning is a form of Artificial Intelligence (AI) that allows computers to learn through observation and experience, rather than rigid pre-programming. Machine Learning uses computer programs that can grow and change as they process new data. Using algorithms, Machine Learning allows computers to develop responses based on the repeated behaviors and actions of the person using the computer. The concept of learning from repeated behavior is important: as models are presented with new data, they adapt, building on earlier experience to provide reliable, consistent results and responses. While the science of Machine Learning is not new, it has gained renewed popularity as it becomes a fundamental building block of AI technology, Big Data, and the evolution of virtual assistants.

Machine Learning can be used to develop programs for:

  • Email spam filtering
  • Fraud detection
  • Web search results
  • Text-based sentiment analysis
  • Credit scoring and next-best offers
  • Network intrusion detection
  • Pattern and image recognition

Various recently developed computing technologies have helped Machine Learning evolve. While the algorithms for Machine Learning have existed for some time now, the ability to apply complicated mathematical calculations to Big Data, repeatedly and ever faster, is a recent development. Machine Learning has been used for a variety of computing tasks, including optical character recognition (OCR), spam filtering, computer vision, and search engines.

Neural Networks Lead To Algorithms

In 1949, Donald Hebb published a book titled The Organization of Behavior, whose concepts revolutionized the way neurons were thought to work. In it, he proposed what came to be called Hebb’s rule:

“When an axon of cell A is near enough to excite cell B, and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.”

This means that as one neuron repeatedly excites another, the connection between them is strengthened, and that this strengthening is a fundamental process for learning and memory. Put in the simplest of terms, practice makes perfect.
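Hebb’s rule translates directly into a simple weight-update formula: strengthen a connection in proportion to correlated firing. The sketch below is a minimal, hypothetical illustration in Python; the learning rate and the toy activation values are assumptions for illustration, not anything from Hebb’s work.

    # Minimal Hebbian learning sketch: when two connected units fire
    # together, the weight between them grows ("practice makes perfect").
    # The learning rate and activation values are illustrative assumptions.

    LEARNING_RATE = 0.1

    def hebbian_update(weight, pre_activation, post_activation):
        """Strengthen the connection in proportion to correlated firing."""
        return weight + LEARNING_RATE * pre_activation * post_activation

    weight = 0.0
    for _ in range(10):  # repeatedly present a pattern where both neurons fire
        weight = hebbian_update(weight, pre_activation=1.0, post_activation=1.0)

    print(weight)  # approximately 1.0 -- the connection has strengthened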

Around 1952, Arthur Samuel of IBM wrote the first program for playing checkers that included a learning component. The learning program was quite successful, steadily improving the machine’s checkers play, and his work provided a boost to the public image of Machine Learning.

In 1957, Frank Rosenblatt brought these theories and discoveries together when he developed the perceptron algorithm, with funding from the U.S. Office of Naval Research. Originally, the perceptron was meant to be a machine designed to imitate the memory and thinking processes of the human brain. The concept of the perceptron as an algorithmic program developed once it was realized the program could run on different kinds of machines. It was first implemented as software on the IBM 704, and later in customized hardware such as the “Mark 1 perceptron.” That particular machine had been developed for image recognition, and worked with an arrangement of 400 photocells randomly connected to its “neurons.”
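The perceptron itself is a remarkably short algorithm: take a weighted sum of the inputs, pass it through a threshold, and nudge the weights whenever a prediction is wrong. Below is a minimal sketch in Python; the AND-gate training data and learning rate are assumptions chosen for illustration, not Rosenblatt’s originals.

    # Minimal perceptron sketch: threshold a weighted sum of inputs,
    # then correct the weights whenever the prediction misses.
    # The AND-gate data and learning rate are illustrative assumptions.

    def predict(weights, bias, inputs):
        total = bias + sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > 0 else 0

    # Learn the logical AND function (linearly separable, so it converges).
    examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    weights, bias, rate = [0.0, 0.0], 0.0, 0.1

    for _ in range(20):  # a few passes over the training data
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error

    print([predict(weights, bias, x) for x, _ in examples])  # [0, 0, 0, 1]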

Unfortunately, in 1958, Rosenblatt made some unrealistic statements about the perceptron at a press conference organized by the U.S. Navy. The New York Times reported the perceptron to be “the embryo of an electronic computer that the Navy expects will be able to walk, talk, see, write, reproduce itself, and be conscious of its existence.” These exaggerated claims stirred up considerable conversation and controversy in the newly formed Artificial Intelligence community. Marvin Minsky and his colleague Seymour Papert took issue with the exaggerations, and exposed the weaknesses of neural networks in a book titled Perceptrons: An Introduction to Computational Geometry.

The perceptron did seem promising initially, but as Minsky and Papert pointed out, it could not be taught to recognize certain classes of patterns; most famously, a single-layer perceptron cannot learn the XOR function. The exaggerated claims, followed by the discovery of the algorithm’s limitations, led first to dashed expectations and then to many years of stagnation in neural network research after Minsky’s accurate and enlightening book.

Statistics Renew Interest in Machine Learning

Machine Learning regained popularity in the early 1990s, due to the merging of Computer Science and Statistics. This combination resulted in a new way of thinking about Machine Learning and AI: with this method (called the Uncertainty Approach), “uncertainty” is integrated into the models’ framework. Many of Machine Learning’s modern success stories grew out of the ideas developed in the 1990s.

The Use of Big Data

In Vermont, the Chamber of Commerce held a conference discussing the benefits and ethics of Big Data. Jim Higgins, the VP for Sales at Cigna, stated that his company had begun talking with the Dartmouth-Hitchcock Medical Center about how insurance claims data could help doctors and hospitals improve their patients’ health. Higgins said:

“If we’ve had someone who’s been on the Cigna medical plan for a year or even longer, we can probably forecast from the data we have, are they a health risk? What a great idea if we can share that with a nurse from a local physician’s office.”

“Statistical” Artificial Intelligence is the core of Big Data analysis, and has accompanied exponential growth in the volume of data used for scientific research. All the sciences are on the verge of major changes. For example, by 2024 new radio telescopes are expected to take in more than one Exabyte (10^18 bytes) of data each day, and in biology there will soon be about one Exabyte of genetic data available worldwide. Handling this deluge of data requires a new scientific discipline: Big Data is being used to develop new ways of storing gigantic amounts of data, and then quickly finding complex patterns within it.

Machine Learning Techniques

Two very popular Machine Learning techniques are supervised learning and unsupervised learning. Roughly 70 percent of Machine Learning is supervised learning, while unsupervised learning makes up roughly 10 to 20 percent. Reinforcement learning and semi-supervised learning are two other approaches sometimes used.

Supervised learning algorithms use labeled examples in the training process: each input is provided with its “correct output” already known. The learning algorithm receives a set of inputs along with the corresponding correct outputs, learns by comparing its actual output with the correct one, and then modifies the model accordingly. Supervised learning is often used in applications with historical data, to predict probable future events.
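As a minimal sketch of this workflow, the hypothetical Python example below (using the scikit-learn library; the tiny “hours studied / prior score” dataset is an assumption invented for illustration) trains on labeled examples and then predicts labels for unseen inputs:

    # Supervised learning sketch: fit a classifier on labeled (input, output)
    # pairs, then predict outputs for new inputs. Requires scikit-learn.
    # The toy pass/fail data is invented purely for illustration.
    from sklearn.linear_model import LogisticRegression

    X_train = [[1, 40], [2, 55], [8, 80], [9, 95], [3, 50], [7, 85]]
    y_train = [0, 0, 1, 1, 0, 1]  # the known "correct outputs" (0 = fail, 1 = pass)

    model = LogisticRegression()
    model.fit(X_train, y_train)   # learn by comparing predictions to labels

    print(model.predict([[6, 75], [1, 45]]))  # predictions for unseen inputs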

Unsupervised learning is used when the system is not given the “correct output.” The algorithm must explore the data it is shown and find a structure or pattern within it on its own. Unsupervised learning is often applied to transactional data; it can, for example, identify groups of customers with similar buying patterns, who can then be sent the same type of advertising.
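A common unsupervised technique is clustering. The hypothetical sketch below (again with scikit-learn; the two-feature customer data is invented for illustration) groups customers by spending behavior without being given any labels:

    # Unsupervised learning sketch: k-means clustering groups similar
    # customers with no "correct output" given. Requires scikit-learn.
    # The (visits per month, average spend) data is invented for illustration.
    from sklearn.cluster import KMeans

    customers = [[2, 20], [3, 25], [2, 22],        # light spenders
                 [10, 150], [12, 160], [11, 155]]  # heavy spenders

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
    print(kmeans.labels_)  # e.g. [0 0 0 1 1 1] -- two discovered customer groups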

Semi-supervised learning is essentially a mix of supervised and unsupervised Machine Learning. It uses both labeled and unlabeled data while training – typically a small amount of labeled data with a large amount of unlabeled data. This kind of learning is often used with methods such as classification, regression, and prediction, and semi-supervised techniques are chosen when the costs of labeling are too high. Early examples of this technology include recognizing a person’s face on a webcam.
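One simple semi-supervised strategy is self-training: fit a model on the few labeled examples, let it “pseudo-label” the unlabeled pool where it is confident, and retrain. Below is a hypothetical sketch; the one-dimensional toy data and the 0.8 confidence threshold are assumptions for illustration.

    # Semi-supervised "self-training" sketch: train on the few labeled points,
    # pseudo-label the unlabeled pool where the model is confident, retrain.
    # Requires scikit-learn; the toy data and threshold are assumptions.
    from sklearn.linear_model import LogisticRegression

    X_labeled, y_labeled = [[1.0], [2.0], [8.0], [9.0]], [0, 0, 1, 1]
    X_unlabeled = [[1.5], [8.5], [2.2], [7.8]]  # plentiful but unlabeled

    model = LogisticRegression().fit(X_labeled, y_labeled)

    # Adopt the model's own confident predictions as labels.
    for x, probs in zip(X_unlabeled, model.predict_proba(X_unlabeled)):
        confidence, label = max(probs), int(probs.argmax())
        if confidence > 0.8:
            X_labeled.append(x)
            y_labeled.append(label)

    model = LogisticRegression().fit(X_labeled, y_labeled)  # retrain, larger set
    print(model.predict([[5.5]]))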

Reinforcement learning is often used for gaming, robotics, and even navigation. With reinforcement learning, the algorithm discovers through trial and error which actions provide the greatest rewards. This style of learning has three basic components: 1) the agent (the decision maker); 2) the environment (everything the agent interacts with); and 3) the actions (the choices available to the agent). The agent’s goal is to choose the actions that maximize the expected reward.
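A stripped-down illustration of this trial-and-error loop is the multi-armed bandit, sketched below in Python; the reward probabilities and the exploration rate (epsilon) are assumptions for illustration.

    # Reinforcement learning sketch: an epsilon-greedy agent learns by trial
    # and error which of three "slot machine" actions pays off most often.
    # The reward probabilities and epsilon are illustrative assumptions.
    import random

    reward_probs = [0.2, 0.5, 0.8]  # the environment (hidden from the agent)
    estimates = [0.0, 0.0, 0.0]     # the agent's estimated value of each action
    counts = [0, 0, 0]
    epsilon = 0.1                   # how often to explore instead of exploit

    for _ in range(5000):
        if random.random() < epsilon:
            action = random.randrange(3)              # explore: random action
        else:
            action = estimates.index(max(estimates))  # exploit: best so far
        reward = 1 if random.random() < reward_probs[action] else 0
        counts[action] += 1
        # Incrementally update the running average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]

    print(estimates)  # should approach [0.2, 0.5, 0.8]; action 2 wins out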

At Present

New learning algorithms, such as Bayesian networks and support vector machines, are now used in ordinary day-to-day commercial systems, supported by cheaper, more powerful computational processing and data storage. Ultimately, this means it is possible to quickly and efficiently produce models that can analyze more complex data and deliver faster, more detailed results.
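A support vector machine, for instance, can be trained in a few lines. A minimal sketch with scikit-learn follows; the two-cluster toy data is an assumption for illustration.

    # Support vector machine sketch: find the boundary that separates two
    # classes with the widest margin. Requires scikit-learn; data invented.
    from sklearn.svm import SVC

    X = [[0, 0], [1, 1], [1, 0], [8, 8], [9, 9], [8, 9]]
    y = [0, 0, 0, 1, 1, 1]

    clf = SVC(kernel="linear")  # a linear kernel suffices for separable toy data
    clf.fit(X, y)
    print(clf.predict([[2, 2], [7, 8]]))  # -> [0 1]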

Deep Learning has become a fast-growing field of research within Machine Learning, providing breakthroughs in text, speech, and image recognition. It uses a neural network with multiple hidden layers, allowing a computer to organize information, learn tasks, and find patterns. Deep Learning is being used to train Virtual Assistants, for robotics and automation, and even for scanning images and extracting their associated text.
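The “multiple hidden layers” idea can be seen in miniature with scikit-learn’s multi-layer perceptron, a small cousin of the deep networks described here. In this hypothetical sketch (the layer sizes and XOR data are illustrative assumptions), hidden layers let the network learn the very pattern that stumped Rosenblatt’s single-layer perceptron:

    # Deep-learning-in-miniature sketch: a neural network with a hidden layer
    # learns XOR, a pattern a single-layer perceptron famously cannot.
    # Requires scikit-learn; layer size and settings are assumptions.
    from sklearn.neural_network import MLPClassifier

    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    y = [0, 1, 1, 0]  # XOR: not linearly separable

    net = MLPClassifier(hidden_layer_sizes=(4,), activation="tanh",
                        solver="lbfgs", max_iter=5000, random_state=0)
    net.fit(X, y)
    print(net.predict(X))  # typically recovers [0 1 1 0]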

Siri, Apple’s Virtual Assistant, uses Deep Learning algorithms to develop its speech recognition abilities. Craig Federighi, Apple’s Senior VP of Software Engineering, said, “The speech recognition capability in Siri now has a 5 percent word error rate, thanks to a 40 percent reduction on the part of Apple.”

Machine Learning has become an integral part of Internet research and marketing, and will prove to be an essential technology for the future of Data Management, Big Data, and AI.
