Artificial Intelligence is hot. We can hardly do anything without coming into contact, consciously or unconsciously, with some form of Artificial Intelligence, and it is becoming increasingly important. This article is an introduction to the field of Artificial Intelligence. It starts with a definition and then explores the different sub-specialties, each with a description and some applications.
WHAT IS ARTIFICIAL INTELLIGENCE?
Artificial Intelligence (AI) uses computers and machines to imitate people’s problem-solving and decision-making skills. One of the leading textbooks in the field of AI is Artificial Intelligence: A Modern Approach (link resides outside Axisto) by Stuart Russell and Peter Norvig. In it they elaborate four possible goals or definitions of AI.
- Systems that think like people
- Systems that behave like people
- Systems that think rationally
- Systems that act rationally
Artificial intelligence plays a growing role in, among others, the (Industrial) Internet of Things ((I)IoT), where (I)IoT platform software can provide integrated AI capabilities.
SUB-SPECIALTIES WITHIN ARTIFICIAL INTELLIGENCE
There are several subspecialties that belong to the domain of Artificial Intelligence. While there is some interdependence between many of these specialties, each has unique characteristics that contribute to the overarching theme of AI. The Intelligent Automation Network (link resides outside Axisto) distinguishes seven subspecialties, figure 1.
Each subspecialty is further explained below.
Machine learning is the field that focuses on using data and algorithms to let computers imitate the way humans learn, without being explicitly programmed, while gradually improving their accuracy. The article “Axisto – an introduction to Machine Learning” takes a closer look at this specialty.
MACHINE LEARNING AND PREDICTIVE ANALYTICS
Predictive analytics and machine learning go hand in hand. Predictive analytics encompasses a variety of statistical techniques, including machine learning algorithms. Statistical techniques analyse current and historical facts to make predictions about future or otherwise unknown events. These predictive analytics models can be trained over time to respond to new data.
The defining functional aspect of these engineering approaches is that predictive analytics provides a predictive score (a probability) for each “individual” (customer, employee, patient, product SKU, vehicle, part, machine, or other organisational unit) in order to determine, inform or influence organisational processes involving large numbers of “individuals”. Applications can be found in, for example, marketing, credit risk assessment, fraud detection, manufacturing, healthcare and government activities, including law enforcement.
Unlike other Business Intelligence (BI) technologies, predictive analytics is forward-looking: past events are used to anticipate the future. Often the unknown event of interest lies in the future, but predictive analytics can be applied to any type of “unknown”, be it past, present, or future. Examples are identifying suspects after a crime has been committed, or detecting credit card fraud as it occurs. The core of predictive analytics is capturing the relationships between explanatory variables and the predicted variables from past events, and exploiting them to predict the unknown outcome. Of course, the accuracy and usefulness of the results depend strongly on the level of data analysis and the quality of the assumptions.
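The idea of a predictive score per “individual” can be illustrated with a minimal sketch. It assumes a logistic regression model, which is only one of many possible techniques, and a tiny invented data set; the feature names (late payments, credit utilisation) and all figures are hypothetical and serve only to show how relationships captured from past events yield a probability for a new case.

```python
import math


def predict_score(weights, x):
    """Return a probability between 0 and 1 for one individual."""
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_score(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical past events: (late payments last year, credit utilisation 0..1)
history = [(0, 0.1), (1, 0.3), (0, 0.2), (4, 0.9), (3, 0.8), (5, 0.7)]
defaulted = [0, 0, 0, 1, 1, 1]  # known outcome for each past customer

weights = train_logistic(history, defaulted)

# Score two new, unseen customers
low_risk = predict_score(weights, (0, 0.15))
high_risk = predict_score(weights, (4, 0.85))
print(f"low-risk customer: {low_risk:.2f}, high-risk customer: {high_risk:.2f}")
```

In practice a library implementation and far more data would be used; the point here is only that the model outputs a score that can feed an organisational process, such as a credit decision.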
Machine Learning and predictive analytics can make a significant contribution to any organisation, but implementation without thinking about how they fit into day-to-day operations will severely limit their ability to deliver relevant insights.
To extract value from predictive analytics and machine learning, it’s not just the architecture that needs to be in place to support these solutions. High-quality data must also be available to nurture them and help them learn. Data preparation and quality are important factors for predictive analytics. Input data can span multiple platforms and contain multiple big data sources. To be usable, they must be centralised, unified and in a coherent format.
To this end, organisations must develop a robust approach to monitor data governance and ensure that only high-quality data is captured and stored. Furthermore, existing processes need to be adapted to include predictive analytics and machine learning as this will enable organisations to improve efficiency at every point in the business. Finally, they need to know what problems they want to solve in order to determine the best and most appropriate model.
NATURAL LANGUAGE PROCESSING (NLP)
Natural language processing is the ability of a computer program to understand human language as it is spoken and written – also known as natural language. NLP is a way for computers to analyse and extract meaning from human language so that they can perform tasks such as translation, sentiment analysis, and speech recognition.
This is difficult, as it involves a lot of unstructured data. The style in which people speak and write (“tone of voice”) is unique to individuals and is constantly evolving to reflect popular language use. Understanding context is also a problem, one that requires semantic analysis based on machine learning. Natural Language Understanding (NLU) is a branch of NLP that picks up these nuances through machine “reading comprehension” rather than simply understanding the literal meanings. The purpose of NLP and NLU is to help computers understand human language well enough that they can converse naturally.
All these functions get better the more we write, speak and talk to computers: they are constantly learning. A good example of this iterative learning is a service like Google Translate, which uses a system called Google Neural Machine Translation (GNMT). GNMT works with a large artificial neural network to translate more smoothly and accurately. Instead of translating one piece of text at a time, GNMT tries to translate entire sentences. Because it draws on millions of examples, GNMT uses a broader context to derive the most relevant translation.
A SELECTION OF NLP TASKS
The following is a selection of tasks in natural language processing (NLP). Some of these tasks have direct real-world applications, while others more often serve as sub-tasks used to solve larger tasks.
- Optical Character Recognition (OCR)
- Determining the text associated with a given image representing printed text.
- Speech Recognition
- Determining the textual representation of the speech on the basis of a sound fragment of a speaking person or persons. This is the opposite of text-to-speech and is an extremely difficult problem. In natural speech, there are hardly any pauses between consecutive words, so speech segmentation is a necessary subtask of speech recognition (see “Word Segmentation” below). In most spoken languages, the sounds representing successive letters merge into one another in a process called coarticulation. Thus, the conversion of the analog signal to discrete characters can be a very difficult process. Since words in the same language are spoken by people with different accents, the speech recognition software must also be able to recognise a wide variety of inputs as identical to each other in terms of textual equivalents.
- Text-to-Speech
- The elements of a given text are transformed and a spoken representation is produced. Text-to-speech can be used to help the visually impaired.
- Word Segmentation (Tokenization)
- Splitting a piece of continuous text into individual words. For a language like English, this is quite trivial, as words are usually separated by spaces. However, some written languages such as Chinese, Japanese, and Thai do not mark word boundaries in such a way, and in those languages, text segmentation is an important task that requires knowledge of the vocabulary and morphology of words in the language. Sometimes word segmentation is also applied in, for example, creating a bag of words (BOW) in data mining.
- Document AI
- A Document AI platform sits on top of NLP technology, allowing users with no previous experience with artificial intelligence, machine learning, or NLP to quickly train a computer to extract the specific data they need from different document types. NLP-powered Document AI enables non-technical teams to quickly access information hidden in documents, e.g. lawyers, business analysts and accountants.
- Grammatical Error Correction
- Grammatical error detection and correction involves a wide range of problems at all levels of linguistic analysis (phonology/orthography, morphology, syntax, semantics, pragmatics). Grammatical error correction has a major impact because it affects hundreds of millions of people who use or learn a second language. In terms of spelling, morphology, syntax, and certain aspects of semantics, with the development of powerful neural language models such as GPT-2, this can be regarded as a largely solved problem since 2019. Various commercial applications are available in the market.
- Machine Translation
- Automatically translating text from one human language to another is one of the most difficult problems: many different kinds of knowledge are required to do it properly, such as grammar, semantics and real-world facts.
- Natural Language Generation (NLG)
- Converting information from computer databases or semantic intent into human readable language.
- Natural Language Understanding (NLU)
- NLU concerns the understanding of human language, such as Dutch, English, and French, which allows computers to understand commands without the formalised syntax of computer languages. NLU also allows computers to communicate back to people in their own language. The main goal of NLU is to create chat and voice-enabled bots that can communicate with the public unsupervised. How does understanding natural language work? NLU analyses data to determine its meaning by using algorithms to reduce human speech to a structured ontology: a data model made up of semantic and pragmatic definitions. Two fundamental concepts of NLU are intent recognition and entity recognition. Intent recognition is the process of identifying user sentiment in input text and determining its purpose. This is the first and most important part of NLU as it captures the meaning of the text. Entity recognition is a specific type of NLU that focuses on identifying the entities in a message and then extracting key information about those entities. There are two types of entities: named entities and numeric entities. Named entities are grouped into categories, such as people, businesses, and locations. Numeric entities are recognised as numbers, currencies and percentages.
- Question Answering
- Determining the answer to a question posed in human language. Typical questions have a specific correct answer, such as “What is the capital of Finland?”, but sometimes open questions are also considered (such as “What is the meaning of life?”).
- Text-to-picture generation
- Given a textual description of an image, generating an image that matches the description.
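The word segmentation task in the list above can be illustrated with a minimal sketch. It assumes a greedy “forward maximum matching” strategy, one simple dictionary-based approach among many, and uses English text with the spaces removed as a stand-in for a language that does not mark word boundaries. The vocabulary is invented for the example.

```python
def segment(text, vocabulary):
    """Greedy forward maximum matching.

    At each position, take the longest vocabulary word that matches;
    if nothing matches, emit a single character and move on.
    """
    longest = max(len(word) for word in vocabulary)
    words, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical vocabulary; real systems use large dictionaries or learned models
vocabulary = {"the", "them", "theme", "park", "me", "is", "open"}
print(segment("themeparkisopen", vocabulary))
```

The sketch shows why knowledge of the vocabulary matters: the same characters could start with “the”, “them” or “theme”, and only the dictionary (or a statistical model) can decide between them.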
Natural language processing – understanding people – is key to AI justifying its claim to intelligence. New deep learning models are constantly improving the performance of AI in Turing tests. Google’s Director of Engineering Ray Kurzweil predicts AIs will “reach human levels of intelligence by 2029” (link resides outside Axisto).
By the way, what people say is sometimes very different from what people do. Understanding human nature is by no means easy. More intelligent AIs raise the prospect of artificial consciousness, opening up a new field of philosophical and applied research.
Speech recognition is also known as automatic speech recognition (ASR), computer speech recognition or speech-to-text. It is a capability that uses natural language processing (NLP) to convert human speech into a written format. Many mobile devices incorporate speech recognition into their systems to perform voice searches, e.g. Siri from Apple.
An important area of speech in AI is speech-to-text, the process of converting audio and speech into written text. It can help visually or physically impaired users and can promote safety with hands-free operation. Speech-to-text systems rely on machine learning algorithms that learn from large datasets of human voice samples to reach adequate quality for practical use. Speech-to-text has value for businesses because it can help transcribe video or phone calls. Text-to-speech converts written text into audio that sounds like natural speech. These technologies can be used to help people with speech disorders. Polly from Amazon is an example of a technology that uses deep learning to synthesise human-sounding speech, for example for e-learning and telephony.
Speech recognition is a task where speech is received by a system through a microphone and checked against a large vocabulary database using pattern recognition. When a word or phrase is recognised, the system responds with the corresponding verbal response or performs a specific task. Examples of speech recognition include Apple’s Siri, Amazon’s Alexa, Microsoft’s Cortana, and Google’s Google Assistant. These products must be able to recognise a user’s speech input and assign the correct speech output or action. Even more sophisticated are attempts to create speech based on brain waves for those who cannot speak or have lost the ability to speak.
An expert system uses a knowledge base about its application domain and an inference engine to solve problems that normally require human intelligence. An inference engine is the part of the system that applies logical rules to the knowledge base to derive new information. Application areas of expert systems include financial management, business planning, credit authorisation, computer installation design, and airline planning. For example, an expert traffic management system can help design smart cities by acting as a “human operator” to relay traffic feedback for appropriate routes.
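The interplay between a knowledge base and an inference engine can be sketched in a few lines. The example assumes a simple forward-chaining engine, which is one common approach, and a hypothetical credit-authorisation rule set; the fact and rule names are invented for illustration and do not come from any real system.

```python
def infer(facts, rules):
    """Forward chaining: keep firing rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical knowledge base: each rule is (conditions, conclusion)
rules = [
    (("stable_income", "low_debt"), "good_repayment_capacity"),
    (("good_repayment_capacity", "no_defaults"), "approve_credit"),
    (("recent_default",), "refer_to_analyst"),
]

# Facts known about one applicant
derived = infer({"stable_income", "low_debt", "no_defaults"}, rules)
print(sorted(derived))
```

The engine derives “approve_credit” in two steps, even though no single rule states it directly from the initial facts. That chaining of rules is what produces new information from the knowledge base.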
A limitation of expert systems is that they lack the common sense people have, such as an understanding of the limits of their skills and how their recommendations fit into the bigger picture. They lack the self-awareness of people. Expert systems are not a substitute for decision makers because they lack human capabilities, but they can dramatically ease the human work required to solve a problem.
PLANNING, SCHEDULING AND OPTIMISATION
AI planning is the task of determining how a system can best achieve its goals. It is choosing sequential actions that have a high probability of changing the state of the environment incrementally in order to achieve a goal. These types of solutions are often complex. In dynamic environments with constant change, they require frequent trial-and-error iteration to fine-tune.
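Choosing a sequence of actions that changes the state of the environment step by step can be sketched as a search problem. The example below assumes a deterministic toy world (real environments are often uncertain and dynamic) and uses breadth-first search, one of the simplest planning strategies. The robot, the box and the action names are hypothetical.

```python
from collections import deque


def plan(start, is_goal, actions):
    """Breadth-first search for the shortest action sequence to a goal state."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if is_goal(state):
            return steps
        for name, apply_action in actions.items():
            nxt = apply_action(state)
            if nxt is not None and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, steps + [name]))
    return None  # no plan exists


# State: (robot position, box position, robot is holding the box?)
def move(delta):
    def apply_action(state):
        robot, box, holding = state
        pos = robot + delta
        if not 0 <= pos <= 3:  # the corridor has positions 0..3
            return None
        return (pos, pos if holding else box, holding)
    return apply_action


def pick(state):
    robot, box, holding = state
    return (robot, box, True) if robot == box and not holding else None


def drop(state):
    robot, box, holding = state
    return (robot, box, False) if holding else None


actions = {"left": move(-1), "right": move(1), "pick": pick, "drop": drop}

# Goal: the box lies at position 0 and the robot has let go of it
steps = plan((0, 2, False), lambda s: s[1] == 0 and not s[2], actions)
print(steps)
```

Each action changes the state incrementally, and the planner returns the shortest sequence that reaches the goal. In larger or changing environments, exhaustive search like this quickly becomes impractical, which is why more advanced planning techniques are needed.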
Scheduling is making schedules, that is, temporal assignments of activities to resources, taking into account goals and constraints. Planning determines the sequence and timing of the actions generated by the algorithm. These actions are typically performed by intelligent agents, autonomous robots and unmanned vehicles. When designed properly, they can solve organisational scheduling problems in a cost-effective way. Optimisation can be achieved by using one of the most popular ML and Deep Learning optimisation strategies: gradient descent. This is used to train a machine learning model by changing its parameters iteratively so that a particular function is minimised towards a local minimum.
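Gradient descent itself fits in a few lines. The sketch below assumes a one-parameter function chosen purely for illustration, f(x) = (x - 3)^2 + 1, whose gradient is 2(x - 3); a real model has many parameters, but the update rule is the same.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient to move towards a (local) minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Illustrative function f(x) = (x - 3)^2 + 1 with gradient 2 * (x - 3)
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))
```

The learning rate controls the step size: too small and convergence is slow, too large and the updates can overshoot the minimum.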
See also our “More Optimal Planning and Optimalisation Software”.
Artificial Intelligence is at one end of the Intelligent Automation spectrum, while Robotic Process Automation (RPA), software robots that mimic human actions, is at the other end. One is concerned with replicating how people think and learn, while the other is concerned with replicating how people do things. Robotics develops complex sensor-motor functions that enable machines to adapt to their environment. Robots can sense the environment using computer vision.
The main idea of robotics is to make robots as autonomous as possible through learning. Although robots have not achieved human-like intelligence, there are many successful examples of robots performing autonomous tasks such as carrying boxes and picking up and putting down objects. Some robots can learn decision making by associating an action with a desired outcome. Kismet, a robot at M.I.T.’s Artificial Intelligence Lab, learns to recognise both body language and voice and to respond appropriately. This MIT video (link resides outside Axisto) gives a good impression.
Computer vision is an area of AI that trains computers to capture and interpret information from image and video data. By applying machine learning (ML) models to images, computers can classify objects and respond to them, for example by using facial recognition to unlock a smartphone or approve intended actions. When computer vision is coupled with Deep Learning, it combines the best of both worlds: optimised performance combined with accuracy and versatility. Deep Learning offers IoT developers greater accuracy in object classification.
Machine vision goes one step further by combining computer vision algorithms with image capture systems to better control robots. An example of computer vision is a computer that can “see” a unique series of stripes on a universal product code, scan it and recognise it as a unique identifier. Optical Character Recognition (OCR) uses image recognition of letters to decipher printed paper records and/or handwriting, despite the wide variety of fonts and handwriting variations.