AI Glossary: Key Terms and Definitions

This glossary provides definitions for common terms used in the field of Artificial Intelligence (AI), including related areas like Machine Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP). It is intended to be a helpful resource for both beginners and those with more experience. It also includes terms specific to Google Cloud Platform’s AI offerings.

AI (Artificial Intelligence)

The broad field of computer science dedicated to creating machines and computer programs capable of performing tasks that typically require human intelligence. These tasks include, but are not limited to, learning, reasoning, problem-solving, decision-making, perception, understanding natural language, and interacting with the environment. [Google Cloud, 2] [U.S. Department of State, 15] [NASA, 14] [University of Illinois Chicago, 3] [IBM, 13]

Related Terms: Machine Learning, Deep Learning, Natural Language Processing

Algorithm

A well-defined sequence of instructions or a set of rules designed to perform a specific task or solve a particular problem. In the context of artificial intelligence and machine learning, algorithms are the computational procedures that enable systems to analyze data, identify patterns, make predictions, and learn from experience. [Heinz College, 4]

Related Terms: Model

Artificial General Intelligence (AGI)

A theoretical form of artificial intelligence that possesses the ability to understand, learn, and apply knowledge across a wide range of tasks, much like a human being. Unlike AI systems designed for specific tasks, AGI would exhibit general cognitive abilities comparable to human intellect, allowing it to perform any intellectual task that a human can. [Google Cloud, 2]

Related Terms: Artificial Superintelligence (ASI), Artificial Intelligence (AI)

Artificial Superintelligence (ASI)

A hypothetical stage of artificial intelligence development where machines possess intellectual capabilities far exceeding those of the most intelligent humans in virtually every field, including scientific creativity, general wisdom, and problem-solving. [Google Cloud, 2]

Related Terms: Artificial General Intelligence (AGI)

Backpropagation

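A core algorithm for training neural networks. It applies the chain rule to calculate the gradient of the loss function with respect to each of the network’s weights, working backward from the output layer. An optimizer such as gradient descent then uses these gradients to adjust the weights and reduce the loss.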
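As a rough illustration, the sketch below runs backpropagation by hand on a tiny two-layer scalar network. The network shape, the tanh activation, the squared-error loss, and every function name are assumptions chosen for this example, not taken from any particular library.

```python
import math

def forward(w1, w2, x):
    """Two-layer scalar network: hidden = tanh(w1 * x), output = w2 * hidden."""
    hidden = math.tanh(w1 * x)
    output = w2 * hidden
    return hidden, output

def loss_fn(w1, w2, x, target):
    """Squared-error loss for a single training example."""
    _, output = forward(w1, w2, x)
    return 0.5 * (output - target) ** 2

def backprop(w1, w2, x, target):
    """Return (dLoss/dw1, dLoss/dw2) via the chain rule, output layer first."""
    hidden, output = forward(w1, w2, x)
    d_output = output - target                    # dLoss/d(output)
    grad_w2 = d_output * hidden                   # gradient for the output-layer weight
    d_hidden = d_output * w2                      # error passed back to the hidden unit
    grad_w1 = d_hidden * (1.0 - hidden ** 2) * x  # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2

def train(w1, w2, x, target, learning_rate=0.1, steps=200):
    """Plain gradient descent driven by the backpropagated gradients."""
    for _ in range(steps):
        grad_w1, grad_w2 = backprop(w1, w2, x, target)
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2
    return w1, w2

# Hypothetical starting weights and a single (input, target) pair.
trained_w1, trained_w2 = train(0.5, -0.3, x=1.0, target=0.8)
print("loss before:", loss_fn(0.5, -0.3, 1.0, 0.8))
print("loss after: ", loss_fn(trained_w1, trained_w2, 1.0, 0.8))
```

Real frameworks compute these gradients automatically; writing them out once makes the backward flow of the error signal visible.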

Related Terms: Neural Network, Weight, Loss Function

Bias (in AI)

Systematic errors in an AI system’s output that result from skewed training data or flawed algorithms. Bias can lead to unfair or discriminatory outcomes.

Related Terms: Fairness, Explainable AI (XAI)

Deep Learning (DL)

A subfield of machine learning that utilizes artificial neural networks with multiple layers (hence, “deep”) to analyze and extract complex patterns from large volumes of data. These deep neural networks can automatically learn hierarchical representations of data, making deep learning particularly effective for tasks such as image and speech recognition, natural language processing, and other complex data-driven applications. [Heinz College, 4]

Related Terms: Neural Network, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN)

Embedding

A representation of discrete variables (like words or categories) as continuous vectors. Embeddings allow AI models to capture semantic relationships between different items.
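One way to see what "capturing semantic relationships" means is to compare vectors by cosine similarity. The three-dimensional vectors below are made up for illustration; real embeddings are learned from data and usually have hundreds of dimensions.

```python
import math

# Hypothetical hand-written embeddings, not produced by any real model.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(word):
    """Return the other word whose embedding is most similar to the given word."""
    others = [w for w in EMBEDDINGS if w != word]
    return max(others, key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]))

print(nearest("cat"))  # the vector for "dog" points in nearly the same direction
```

Looking up the most similar vector in this way is the basic operation behind vector search.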

Related Terms: Word Embedding, Vector Search

Fine-tuning

The process of taking a pre-trained AI model and further training it on a specific, smaller dataset to improve its performance on a particular task.

Related Terms: Pre-trained Model, Transfer Learning

Gemma 3 (Google Cloud Specific)

A family of lightweight, open-source AI models built by Google, based on the technology used in Gemini models. Suitable for deployment on platforms like Google Cloud Run.

Related Terms: Gemini 2.0, Open-Source AI Models, Google Cloud Run, Vertex AI Model Garden

Gemini 2.0 (Google Cloud Specific)

A powerful, multi-modal AI model from Google, capable of understanding and processing various types of data, including text, images, and audio.

Related Terms: Multi-modal Capabilities, Large Language Model

Generative AI

A class of artificial intelligence models focused on generating new, original content that resembles the data on which they were trained. This includes the creation of text, images, audio, video, and other forms of data. Generative AI models learn the underlying patterns and structures within the training data and can then produce novel outputs that are statistically similar to that data. [Heinz College, 4]

Related Terms: Large Language Model (LLM)

Hallucination (in AI)

In the context of artificial intelligence, particularly with large language models, hallucination refers to the phenomenon where the AI system generates information that is factually incorrect, nonsensical, or fabricated, yet presents it with confidence as if it were accurate and truthful. [Heinz College, 4]

Related Terms: Large Language Model (LLM)

Inference

The process of using a trained AI model to make predictions or generate outputs on new, unseen data. This is the “application” phase after training.

Related Terms: Model, Prediction

AI Hypercomputer (Google Cloud Specific)

An architecture designed by Google Cloud that integrates AI-optimized hardware, software, and consumption models to improve the productivity and efficiency of AI workloads.

Related Terms: AI Optimized Hardware, AI Training, AI Inference

AI Optimized Hardware (Google Cloud Specific)

Hardware specifically designed to accelerate AI workloads, such as GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units).

Related Terms: AI Hypercomputer, GPU, TPU

AI Training (Google Cloud Specific)

The process of feeding data to an AI algorithm to allow it to learn and improve its performance.

Related Terms: AI Inference, Training Data, Model

Large Language Model (LLM)

An artificial intelligence model, typically based on deep learning architectures like transformers, that has been trained on an extremely large dataset of text and code. LLMs possess the ability to understand and generate human-like text, making them capable of performing a wide range of natural language processing tasks, including text completion, translation, summarization, and conversational AI applications like chatbots. [Heinz College, 4]

Related Terms: Transformer, Generative AI, Prompt Engineering

LLM Comparator (Google Cloud Specific)

A tool within Google Cloud’s Vertex AI platform that provides an interface for human evaluators to review side-by-side comparisons of the outputs from different LLMs.

Related Terms: Vertex AI Evaluation Service, Pairwise Model Evaluation, Large Language Model (LLM)

Machine Learning (ML)

A subfield of artificial intelligence that focuses on enabling computer systems to learn from data without being explicitly programmed. Machine learning algorithms identify patterns in data, allowing them to make predictions, classify information, and improve their performance over time through experience and exposure to more data. It is a core technology driving many of the advancements in artificial intelligence. [Google Cloud, 5] [U.S. Department of Energy, 6] [AWS, 7] [MIT Sloan, 12] [IBM, 8] [Datacamp, 10] [SAS, 9]

Related Terms: Supervised Learning, Unsupervised Learning, Reinforcement Learning, Algorithm

Model

In AI, a model is a mathematical representation of a real-world process. It’s the output of the training process, embodying the learned patterns and relationships from the data.

Related Terms: Algorithm, Training, Inference

Multi-modal Capabilities

The ability of an AI model to process and understand multiple types of data, such as text, images, audio, and video.

Related Terms: Gemini 2.0

Natural Language Processing (NLP)

A branch of artificial intelligence that deals with enabling computers to understand, interpret, and generate human language. NLP techniques allow machines to process and analyze large amounts of text and spoken language, facilitating tasks such as language translation, sentiment analysis, text summarization, and the development of conversational agents like chatbots. [Heinz College, 4]

Related Terms: Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Tokenizer, Token

Neural Network

A computational model inspired by the structure and function of the human brain, consisting of interconnected nodes (neurons) organized in layers. These networks learn to process information by adjusting the strengths (weights) of the connections between neurons based on the input data they receive. Neural networks are a fundamental building block of many machine learning and deep learning algorithms, particularly effective for tasks involving pattern recognition and complex data analysis. [Heinz College, 4]

Related Terms: Neuron, Layer, Activation Function, Deep Learning, Backpropagation, Weight

Open-Source AI Models

AI models whose source code is publicly available, allowing for community contributions, modifications, and use.

Related Terms: Gemma 3

Overfitting

A problem that occurs when an AI model learns the training data too well, including its noise and irrelevant details. An overfit model performs poorly on new, unseen data.
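The toy comparison below sketches the effect under simplifying assumptions. The data set is invented, two training labels are deliberately wrong to stand in for noise, and the threshold model is written by hand rather than learned. A 1-nearest-neighbor "memorizer" reproduces every training label, noise included, and then does worse on held-out points than the simpler rule.

```python
# Hypothetical 1-D data. True rule: label is 1 when x >= 5.
# The training labels at x=3 and x=7 are intentionally noisy.
TRAIN = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
TEST = [(1.5, 0), (2.6, 0), (3.4, 0), (6.8, 1), (7.3, 1), (8.5, 1)]

def memorizer(x):
    """1-nearest-neighbor: copies the label of the closest training point."""
    _, nearest_label = min(TRAIN, key=lambda pair: abs(pair[0] - x))
    return nearest_label

def threshold_model(x):
    """A simpler model with a single decision boundary."""
    return 1 if x >= 5 else 0

def accuracy(model, data):
    """Fraction of examples whose predicted label matches the given label."""
    correct = sum(1 for x, label in data if model(x) == label)
    return correct / len(data)

for name, model in [("memorizer", memorizer), ("threshold", threshold_model)]:
    print(name, "train:", accuracy(model, TRAIN), "test:", accuracy(model, TEST))
```

The gap between training accuracy and test accuracy is the usual symptom to look for.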

Related Terms: Underfitting, Regularization

Pairwise Model Evaluation

A method of comparing two AI models directly against each other to assess their relative performance on a specific task. Useful for comparing LLMs.
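A minimal sketch of how pairwise judgments might be summarized, assuming each comparison has already been labeled "A", "B", or "tie" by a human or automated judge. The label scheme and the function name are assumptions for this example and do not reflect the output format of any specific evaluation service.

```python
from collections import Counter

def pairwise_win_rates(judgments):
    """Return the share of comparisons won by model A, won by model B, or tied."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    counts = Counter(judgments)
    total = len(judgments)
    return {label: counts.get(label, 0) / total for label in ("A", "B", "tie")}

# Hypothetical verdicts for five prompts.
judgments = ["A", "B", "A", "tie", "A"]
print(pairwise_win_rates(judgments))
```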

Related Terms: LLM Comparator, Vertex AI Evaluation Service

Parameter

A variable within an AI model that is learned during the training process. The parameters define the model’s behavior.

Related Terms: Weight

Pre-trained Model

An AI model that has been trained on a large, general dataset and can be used as a starting point for fine-tuning on a more specific task.

Related Terms: Fine-tuning, Transfer Learning

Prompt

The input text or instructions given to an AI model, particularly a large language model, to elicit a specific response or output.

Related Terms: Prompt Engineering, Large Language Model (LLM)

Prompt Engineering

The art and science of crafting effective prompts to get the desired results from AI models, especially large language models. It involves carefully choosing words, phrasing, and context.

Related Terms: Prompt, Large Language Model (LLM)

Prompt Injection

A type of security vulnerability where an attacker manipulates the input (prompt) to an AI model to cause it to perform unintended actions or reveal sensitive information.

Related Terms: Model Armor, AI Security

Reinforcement Learning

A paradigm of machine learning where an agent learns to make decisions in an environment by interacting with it and receiving feedback in the form of rewards or penalties. The agent’s objective is to learn a policy, which is a mapping from states to actions, that maximizes the total reward it accumulates over time. This type of learning is particularly useful for training agents to perform complex tasks, such as playing games or controlling robots, through trial and error. [Google Cloud, 5]
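The agent, policy, and reward loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm, on a made-up five-cell corridor. The environment, the hyperparameter values, and all names below are assumptions chosen to keep the example small.

```python
import random

N_STATES = 5                 # states 0..4; state 4 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Deterministic corridor: move one cell; reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q(state, action) by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q

def greedy_policy(q):
    """The learned policy: the highest-valued action in each non-goal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]

print(greedy_policy(q_learning()))
```

The discount factor gamma makes rewards that arrive sooner worth more, which is why the learned policy heads straight for the goal.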

Related Terms: Agent, Policy, Reward

Supervised Learning

A type of machine learning where an algorithm learns from a dataset that is labeled, meaning that each input data point is paired with its corresponding correct output. The algorithm’s goal is to learn a function that can map new, unseen inputs to their correct outputs based on the patterns it learned from the labeled training data. Supervised learning is commonly used for tasks such as classification and regression. [Google Cloud, 5]
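As a small illustration of learning a mapping from labeled pairs, the sketch below fits a straight line by ordinary least squares, a basic regression method. The data points and function names are invented for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labeled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted line to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical labeled data: inputs paired with their correct outputs.
inputs = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]

model = fit_line(inputs, labels)
print(predict(model, 5.0))
```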

Related Terms: Labeled Data, Classification, Regression

Token

In the context of natural language processing and large language models, a token is the smallest unit of text that the model processes. This could be a word, a part of a word, or even a single character. The process of breaking down a piece of text into individual tokens is called tokenization, which is a crucial first step in preparing text data for analysis by these models. [Heinz College, 4]

Related Terms: Tokenizer, Natural Language Processing (NLP)

Tokenizer

A component of an NLP system that breaks down text into smaller units called tokens. Tokens can be words, subwords, or characters, depending on the tokenizer.
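The sketch below shows two of the simplest strategies, word-level and character-level splitting, plus the mapping from tokens to integer IDs that a model actually consumes. The regular expression and function names are assumptions for illustration; production tokenizers typically use learned subword schemes, which this example does not attempt.

```python
import re

def word_tokenize(text):
    """Split into runs of word characters and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokenize(text):
    """Treat every character, including spaces, as its own token."""
    return list(text)

def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab

def encode(tokens, vocab):
    """Replace each token with its ID."""
    return [vocab[token] for token in tokens]

tokens = word_tokenize("the cat saw the dog.")
vocab = build_vocab(tokens)
print(tokens)
print(encode(tokens, vocab))
```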

Related Terms: Token, Natural Language Processing (NLP)

Transformer

A type of neural network architecture that has become dominant in natural language processing. Transformers use a mechanism called “attention” to weigh the importance of different parts of the input data.
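The weighing step can be sketched as scaled dot-product attention for a single query. This is a simplified, pure-Python illustration: the vectors are made up, and a real transformer adds learned projection matrices, multiple attention heads, and batching.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output

# Hypothetical inputs: the query points in the same direction as the first key.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(weights, output)
```

Because the query matches the first key more closely, the first value receives the larger weight in the output.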

Related Terms: Large Language Model (LLM), Attention Mechanism

Turing Test

A test proposed by Alan Turing in 1950 as a measure of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. In a typical setup, a human evaluator engages in natural language conversations with both a human and a machine, without knowing which is which. If the evaluator cannot reliably distinguish the machine from the human based on their responses, the machine is said to have passed the Turing test, suggesting a certain level of artificial intelligence. [University of Illinois Chicago, 3]

Related Terms: Artificial Intelligence (AI)

Underfitting

A problem that occurs when a model fails to capture the underlying patterns in the data, so it performs poorly even on the training set. Common causes include an overly simple model, too little training, or input features that carry too little information.

Related Terms: Overfitting

Unsupervised Learning

A type of machine learning where an algorithm learns from a dataset that is not labeled, meaning there are no predefined output labels associated with the input data. The algorithm’s objective is to discover hidden patterns, structures, or groupings within the data without any prior knowledge of the correct outputs. Unsupervised learning is often used for tasks such as clustering, dimensionality reduction, and anomaly detection. [Google Cloud, 5]
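Clustering, one of the tasks named above, can be sketched with a one-dimensional k-means loop. The data points, the starting centers, and the function name are assumptions for this example; no labels are supplied at any point.

```python
def kmeans_1d(points, initial_centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(range(len(centers)), key=lambda i: abs(point - centers[i]))
            clusters[nearest].append(point)
        # Move each center to the mean of its cluster; keep it in place if the cluster is empty.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical unlabeled measurements that form two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, initial_centers=[points[0], points[-1]])
print(centers)
print(clusters)
```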

Related Terms: Clustering, Dimensionality Reduction, Anomaly Detection

Vertex AI Evaluation Service (Google Cloud Specific)

A service within Google Cloud’s Vertex AI platform that automates the evaluation of machine learning models, including LLMs, using various metrics.

Related Terms: Pairwise Model Evaluation, LLM Comparator, Vertex AI Model Garden

Vertex AI Model Garden (Google Cloud Specific)

A repository of pre-trained and open-source AI models within Google Cloud’s Vertex AI platform, making it easier to deploy and manage AI models.

Related Terms: Gemma 3, Pre-trained Model, Fine-tuning, Vertex AI Evaluation Service

Weight

A type of parameter: a numerical value associated with a connection between neurons in a neural network. Weights determine how strongly one neuron influences another.

Related Terms: Parameter, Neural Network
