History of generative AI
Generative AI sets itself apart from traditional AI by its ability to create new content such as images, audio, and text. It may seem like this technology has only recently emerged, but that is not entirely true: the first generative algorithms date back to the origins of AI as a field of computer science.
Yet, generative AI has only recently begun to evolve by leaps and bounds. Here's a look at the history of generative AI and why it's been developing so rapidly in recent years.
A Brief History of Artificial Intelligence
The birth of AI
Before diving into generative AI, let's take a moment to look at the development of AI as a whole. Many scientists from different fields were involved in the exploration of artificial intelligence in the middle of the 20th century. Perhaps one of the most famous was the highly acclaimed mathematician Alan Turing, who had been engaged with the problem of machine intelligence since at least 1941. Turing made one of his earliest mentions of "intelligent machinery" in 1947. In a paper of the same name, he explored the question of whether a machine could exhibit intelligent behavior.
In his 1950 paper Computing Machinery and Intelligence, Alan Turing posed the question "Can machines think?". This seminal work in the field of artificial intelligence proposed the idea of the Turing Test. The test requires a person to interact with both a computer and a human being. Based on the answers to certain questions, the person must determine whether they are talking to a human or a machine. The program's task is to trick the person into believing that they are having a conversation with a human being.
AI was born in 1956 when a workshop called The Dartmouth Summer Research Project on Artificial Intelligence was held. Researchers from different fields of science, including linguistics, philosophy and computer science, came together. They debated the potential of computing machines to simulate reasoning, intelligence, and creative processes. Among other events at the workshop, John McCarthy, an American computer scientist, proposed a name for a new academic discipline: artificial intelligence.
Considerable progress was made over the following decades. Machines were becoming cheaper, faster, and capable of storing more information, and machine learning (ML) algorithms were also improving. Arthur Samuel coined the term "machine learning" in 1959, demonstrating it with the first self-learning program, which taught itself to play checkers.
In the late 1950s, Frank Rosenblatt introduced the perceptron, which could be described as the first working realization of a neural network. The perceptron is a basic ML model designed to help computers learn from a diverse range of data.
Around the same time, John McCarthy developed the LISP programming language for artificial intelligence tasks. In the 1960s, the first expert systems appeared, designed to model a human expert's knowledge in a specific field. Dendral, for example, was the first AI expert system, built to identify the molecular structure of unknown organic compounds.
Among the first functioning examples of generative AI was the ELIZA chatbot, created in the mid-1960s by German-American computer scientist Joseph Weizenbaum. It was one of the first conversational computer programs: it simulated the work of a psychotherapist and could communicate with a human in natural language.
In the 1960s and 70s, research was carried out on computer vision and basic pattern recognition, and more advanced expert systems were developed, such as MYCIN, a program for identifying bacteria that cause serious infections. By the late 1970s and during the 1980s, however, AI research went into decline due to reduced funding.
The rise of AI: deep learning and generative AI
In the 1990s and 2000s, computer processing capacity grew substantially. The Deep Blue chess computer defeated the world chess champion in 1997, and Dragon Systems released NaturallySpeaking, the first publicly available voice recognition system.
The rise of the Internet led to an explosion in the amount of data being collected and processed. In the 2000s, the processing power of computers reached the level needed to handle enormous data flows, and new technologies and concepts emerged that supported the development of artificial intelligence.
Machine learning, neural networks, and deep learning became more widely accessible and opened up new opportunities to develop smarter, more responsive systems. Deep learning grew particularly fast in the 2010s. It is a type of machine learning that employs multi-layered neural networks trained on large datasets.
Although the first ideas behind deep learning date back to the 1940s and 1950s, computers at the time were not powerful enough, and the available data was too limited, for deep learning systems to be developed at full scale. The rapid progress of the 2010s was made possible by growing GPU computing power and the success of convolutional neural networks.
Modern generative AI is based mainly on deep learning techniques, so it too began to develop rapidly in the 2010s.
First generative AI
As mentioned earlier, one of the first primitive generative AI systems was ELIZA, a text chatbot created in the 1960s by Joseph Weizenbaum. ELIZA was one of the first examples of natural language processing (NLP): it mimicked the work of a psychotherapist and could communicate with humans in natural language.
ELIZA followed a simple pattern: it recognized keywords in the user's text and then returned pre-programmed generic responses. The chatbot's ability to hold a conversation created the impression that the machine could understand human speech. In reality, the program treated all words as character data, without assigning them meaning the way a human does. According to its developer, ELIZA was just a parody of a psychotherapist and was completely non-intelligent.
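To make the mechanism concrete, here is a minimal, hypothetical sketch of ELIZA-style keyword matching in Python. The keywords and canned replies are invented for illustration and are not Weizenbaum's original scripts.

```python
import random

# Illustrative keyword rules: each keyword maps to a few canned replies.
RULES = {
    "mother": ["Tell me more about your family.", "How do you feel about your mother?"],
    "sad":    ["I am sorry to hear you are sad.", "Why do you think you feel sad?"],
    "always": ["Can you think of a specific example?"],
}
DEFAULT = ["Please go on.", "Tell me more.", "How does that make you feel?"]

def respond(user_input: str) -> str:
    words = user_input.lower().split()
    for keyword, replies in RULES.items():
        if keyword in words:
            return random.choice(replies)   # canned reply tied to the first matched keyword
    return random.choice(DEFAULT)           # generic fallback when no keyword matches

print(respond("I am always sad about my mother"))
```

The program never models the meaning of the sentence; it only pattern-matches on surface keywords, which is exactly why Weizenbaum described it as non-intelligent.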
Although it was quite primitive and relatively simple to implement compared to later generations of chatbots, ELIZA paved the way for subsequent advances in natural language processing over the following decades.
Development of generative AI
Generative AI is a type of AI that can create realistic images and videos and generate text or music. To achieve this, generative models are used; their purpose is to produce new samples that resemble what was already in the training data.
Some of the first generative models were Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs), devised in the 1950s. They were used to model sequential data such as speech; one of the first applications of HMMs was speech recognition. The capabilities of generative models, however, only took off after the rise of deep learning.
In the field of natural language processing, recurrent neural networks (RNNs), introduced in the late 1980s, are used for language modeling tasks. RNNs can model relatively long dependencies, which allows them to generate longer sentences. Long Short-Term Memory (LSTM), a kind of recurrent neural network with a more complex architecture, was developed later.
One of the fundamental breakthroughs in generative AI was the creation of Generative Adversarial Networks (GANs) in 2014 by the American computer scientist Ian Goodfellow and his colleagues. A GAN is an unsupervised machine learning framework that sets two neural networks in competition with each other: one network, the generator, produces content, while the other, the discriminator, tries to determine whether a given sample is authentic or generated.
The success of one network comes at the expense of the other: the generator learns to maximize the discriminator's probability of error, while the discriminator tries to minimize it. After many thousands of training iterations, the generator can eventually produce images that the discriminator can no longer distinguish from real ones.
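As a rough illustration of this adversarial loop, here is a minimal sketch in Python using PyTorch. It trains a toy generator to mimic one-dimensional data instead of images; the architectures, data, and hyperparameters are illustrative assumptions, not the setup from the original 2014 paper.

```python
import torch
import torch.nn as nn

# Toy GAN: the generator learns to mimic samples drawn from N(4, 1).
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                # noise -> fake sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # sample -> P(real)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) + 4.0          # "real" data
    fake = G(torch.randn(64, 8))             # generated data

    # 1) The discriminator tries to minimize its error on real vs. fake samples.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) The generator tries to maximize the discriminator's error (be labeled "real").
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

print("mean of generated samples:", G(torch.randn(1000, 8)).mean().item())  # should approach 4.0
```

In a real image GAN the two networks are deep convolutional models, but the competition between the generator's and discriminator's losses works in the same way.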
This method, along with others developed around the same time, such as variational autoencoders (VAEs) and diffusion models, greatly improved the image generation process.
Another type of model that has played a significant role in the development of generative AI is the transformer, a deep neural network architecture introduced in 2017. Transformers have powered many generative models in various domains. Like recurrent neural networks (RNNs), transformers are designed to process sequences such as natural language text.
In NLP, the transformer architecture led to the creation of large language models such as BERT and GPT. The Vision Transformer combines visual features with the transformer architecture to handle computer vision (CV) tasks.
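The core operation inside a transformer is attention, in which every element of a sequence is compared with every other element to decide what to focus on. Below is a minimal NumPy sketch of scaled dot-product attention; the toy shapes and random inputs are assumptions for illustration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each query attends to all keys."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the sequence
    return weights @ V                                   # weighted mix of the values

# Toy self-attention over a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)              # Q = K = V for self-attention
print(out.shape)                                         # (4, 8): one context-aware vector per token
```

Because every token can attend to every other token in one step, transformers handle long-range dependencies more easily than RNNs, which process a sequence one element at a time.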
Recent breakthroughs
Natural language processing
A major breakthrough in generative AI, and in the development of NLP in particular, was the introduction of GPT (Generative Pre-trained Transformer) models. In 2018, the first version of GPT was created by OpenAI.
It is a neural network that employs a deep learning architecture to generate text, engage in conversations with a user, and fulfill various language tasks. Its creation marked a turning point in the widespread use of machine learning: this technology can now be applied to automate and refine a wide variety of tasks, from translating text and writing promotional materials to coding and researching complex topics. The value of such models lies in the speed and scale at which they can operate.
GPT is a large language model built on the transformer architecture and trained in a self-supervised mode on vast amounts of text from the Internet. The model performs a language modeling task, i.e. it predicts the next word (or part of a word) given the previous context. In 2023, GPT-4 was released, capable of handling up to 25,000 words of text, a significant improvement over previous versions.
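To illustrate the language modeling objective itself, here is a toy Python sketch of next-word prediction and autoregressive generation. A simple bigram frequency table stands in for a real transformer, and the tiny corpus is invented for the example.

```python
from collections import Counter, defaultdict

# Toy corpus and bigram counts: how often each word follows another.
corpus = "the cat sat on the mat and the cat slept on the mat".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word: str) -> str:
    return bigrams[word].most_common(1)[0][0]   # greedy choice: the most likely next word

# Autoregressive generation: feed each prediction back in as the new context.
context = "the"
generated = [context]
for _ in range(5):
    context = predict_next(context)
    generated.append(context)
print(" ".join(generated))                      # e.g. "the cat sat on the cat"
```

A GPT model does the same thing at a vastly larger scale: instead of a frequency table, a transformer trained on Internet-scale text estimates the probability of every possible next token given the whole preceding context.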
Deep learning text-to-image models
At the beginning of the current decade, several AI image generation systems came to light and marked breakthroughs in photorealism.
DALL-E is a machine learning model created by OpenAI in 2021 that generates photorealistic images from textual descriptions; it was built on the foundation of GPT-3. It is capable of generating high-quality, realistic images and can also perform additional functions, such as adding, replacing, or removing certain objects, or generating alternative variants of a given image according to a textual description. Importantly, the model is continuously trained on new data.
The DALL-E algorithm incorporates three neural networks:
CLIP (Contrastive Language-Image Pre-training) is responsible for recognizing text and creating a sketch of the future image;
GLIDE is responsible for converting the sketch into a final low-resolution image;
The third neural network is responsible for increasing the resolution of the generated image and adding finer details.
Stable Diffusion, released in 2022, is an open-source neural network by Stability AI that generates images based on text prompts, allowing people to create striking AI art in a short amount of time. It can also complete sketches and edit existing images. Stable Diffusion relies on a diffusion model that generates images from random noise. During training, noise is gradually added to the original images and the model learns to reverse the process, removing the noise step by step; at generation time, it starts from pure noise and denoises it over a series of steps into a photorealistic image or artwork.
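As a simplified illustration of this idea, the sketch below shows the forward noising step and its exact reversal on a one-dimensional signal. The linear noise schedule and the toy signal are assumptions for the example; in a real diffusion model such as Stable Diffusion, a trained neural network predicts the noise rather than it being known in advance.

```python
import numpy as np

# Diffusion sketch: gradually noise a clean sample, then undo the noise.
T = 1000
betas = np.linspace(1e-4, 0.02, T)          # noise added at each step (illustrative schedule)
alphas_bar = np.cumprod(1.0 - betas)        # fraction of the original signal kept after t steps

rng = np.random.default_rng(0)
x0 = np.sin(np.linspace(0, 2 * np.pi, 64))  # stand-in for a clean image

def add_noise(x0, t):
    """Forward process: jump straight to the noisy version at step t."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return xt, eps

xt, eps = add_noise(x0, T - 1)              # almost pure noise by the last step
# A trained denoiser would predict `eps` from `xt`; here we use the true noise
# just to show that removing it recovers the clean signal.
x0_recovered = (xt - np.sqrt(1.0 - alphas_bar[T - 1]) * eps) / np.sqrt(alphas_bar[T - 1])
print(np.allclose(x0_recovered, x0))        # True
```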
Another generative AI tool for image creation is the Midjourney neural network, launched in 2022. Midjourney is powered by natural language processing and image-generating algorithms: simply put, it interprets a text prompt and converts it into an image. Like Stable Diffusion, Midjourney is a diffusion neural network that creates images from noise. It is renowned for making it remarkably easy to generate artistic images with lots of finely detailed features.
Generative AI timeline
1956 - Introduction of Artificial Intelligence as a science;
1958 - Frank Rosenblatt proposed the perceptron, a device modeled on processes in the human brain and the world's first neural network;
1964 - Creation of one of the first functioning generative AI systems, the ELIZA chatbot;
1982 - RNNs are created, which take prior information into account and generate sentences;
1997 - A type of RNN with a more complex architecture called LSTM is developed, which allows efficient processing of long sequences of data and identifies patterns;
2013 - Creation of the variational autoencoder (VAE), a new type of generative model;
2014 - Creation of GANs, a breakthrough in generative AI, as they were among the first models to generate high-quality images. GANs also received more attention than VAEs, partly because the concept behind them is more straightforward than the VAE's theoretical foundations;
2015 - Introduction of diffusion models, which work by adding noise to the training data and then learning to reverse the process to restore the data;
2017 - The deep learning architecture known as the transformer was proposed;
2018 - The groundbreaking Generative Pre-trained Transformer (GPT), a type of large language model, was introduced by OpenAI;
2021 - AI platform DALL-E intended for generating and editing unique artworks and photorealistic images was launched;
2022 - Open source Stable Diffusion and proprietary Midjourney AI image-generating tools were introduced;
2023 - GPT-4 was released in March 2023, capable of handling longer texts of up to 25,000 words.
Conclusion
Generative AI technology has a relatively short history. Its development began around the middle of the last century, but the biggest discoveries in the field happened recently, in the 2010s, and it is now advancing at an extraordinary rate.
As generative AI continues to evolve, it's important for businesses and society to stay up to date with the latest developments and adapt to the changes it will bring. Generative AI has the potential to add value to both everyday life and business, and recognizing that potential, along with its influence on society as a whole, is essential to successfully adapting and taking advantage of the opportunities it brings.
About Toloka
Toloka is a European company based in Amsterdam, the Netherlands that provides data for Generative AI development. Toloka empowers businesses to build high quality, safe, and responsible AI. We are the trusted data partner for all stages of AI development from training to evaluation. Toloka has over a decade of experience supporting clients with its unique methodology and optimal combination of machine learning technology and human expertise, offering the highest quality and scalability in the market.
Article written by:
Toloka Team
Updated:
Aug 22, 2023