AI is Changing Your World
The pace of development in artificial intelligence (AI) is accelerating, and so are the changes and innovations it drives. It is affecting how we write, how we draw and even how we think. It is making inroads into any profession that involves writing prose, writing code, or creating images and videos. Let’s explore some of the places where AI is creating opportunities. (AI was not used to write this blog, but look for that in future posts.)
A Frog in a suit holding a frog. This might make sense later.
It is alive
No, not really. It is just an illusion, but some of the new chat technology plays along with us to project a sort of consciousness. Tools such as ChatGPT, Bard and Sydney are being used to explore language. They come from a type of AI model called the Large Language Model, or LLM.
The LLM comes from innovations in AI / ML techniques for prediction. The idea is that, given a series of words, the model predicts the next word or the missing words. You can give one of these tools a sentence and it will write a whole story or a poem. Often they are used in symbiosis with a writer to create new work and inspire creativity.
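To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is not how an LLM works internally: it only counts which word most often follows another in a tiny made-up text, where a real model uses a large neural network trained on billions of words. The function names and the sample text are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

def continue_text(following, start, length=5):
    """Extend `start` one predicted word at a time."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(following, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

# Tiny made-up training text (an assumption for this example).
corpus = "the frog wore a suit . the frog held a frog . the suit was green ."
model = train_bigrams(corpus)

print(predict_next(model, "the"))
print(continue_text(model, "the", length=4))
```

The toy version and a large model share the same basic loop: predict a word, append it, and predict again.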
One example is Sasha Stiles, who has written a book of poetry, Technelegy, mainly using a model from OpenAI called GPT. In it, the poetry is written as a sort of conversation between the poet and the machine. The machine is not really thinking, only predicting and changing direction based on the poet’s input. But it is directed with instructions in ordinary words, not code. An example is:
I wonder what language I should hold.
Maybe I am a prophet, or a stork, or a spy.
Models have continued to grow as they are trained on more and more content. The current state-of-the-art models are trained on hundreds of billions of words, with data from multiple languages and even code. The models take months to train on huge clusters of servers with over 400 GPUs. GPUs are graphics processing units. They are like CPUs but specialized for graphics. Originally designed for putting pixels on a screen fast, they turned out to be very good at the highly parallel arithmetic used to build AI models.
What used to take multiple specialized models is now possible for a single model to do. There were individual models for translating each language, and others to find keywords, check grammar, or find entities in text. Now one model can somewhat know many languages and do many things. These models do take more computing resources, though. One of them, BLOOM, takes 4 to 8 GPUs to run.
These models are not controlled only with code. You just ask in a sentence for a particular task, such as: Translate “They went to the store and bought candy.” into French. The model returns an answer. This is a new way of working with a model. It used to be that you had to write code to make a model do some unit of work. Now you can even prompt them in 40 to 100 different languages and get an answer.
These LLMs are making their way into all sorts of tools people use, such as search engines, word processors and desktop publishing tools. I have seen them in Word plugins, Canva and now the Bing search engine. There are multiple sites where they are being used to write papers or produce legal documents.
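As a rough illustration of what "asking in a sentence" means in practice, the sketch below builds several tasks for the same sentence as plain text prompts. The helper `build_prompt` and the stand-in `fake_model` are invented for this example. No real model or API is called, and the reply is a placeholder rather than a real translation.

```python
def build_prompt(task, text):
    """Combine an instruction and the text to work on into one prompt."""
    return f'{task}: "{text}"'

def fake_model(prompt):
    """Stand-in for a real LLM (an assumption): it only echoes the prompt."""
    return f"[model reply to] {prompt}"

sentence = "They went to the store and bought candy."

# One model, many tasks: only the wording of the request changes.
tasks = [
    "Translate into French",
    "List the keywords in",
    "Correct the grammar of",
]
prompts = [build_prompt(task, sentence) for task in tasks]

for prompt in prompts:
    print(fake_model(prompt))
```

The point of the sketch is that the "program" for each task is just a different sentence sent to the same model.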
These tools are also being built into code editors, so a developer can ask for blocks of code to perform specific tasks, such as “get the reviews from Yelp, perform sentiment analysis and group by state, creating a list.” The prompt is sent to a remote model on a cluster of computers, which returns the requested code.
In a future post we will explore our research into two models, GPT and BLOOM. Marimsa Solutions has researched and explored these tools in depth.
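For a sense of what such a request might produce, here is a hedged sketch of the kind of code an assistant could return for that prompt. It is a simplified illustration, not output from any real tool. The reviews are hard-coded sample data instead of a call to Yelp, and the sentiment check is a naive word list invented for this example, not a trained model.

```python
from collections import defaultdict

# Tiny hand-made word lists (assumptions for this example).
POSITIVE = {"great", "good", "friendly", "delicious", "love"}
NEGATIVE = {"bad", "slow", "rude", "cold", "terrible"}

def sentiment(text):
    """Label text by counting positive and negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def group_by_state(reviews):
    """Return {state: [(review text, sentiment label), ...]}."""
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review["state"]].append((review["text"], sentiment(review["text"])))
    return dict(grouped)

# Hypothetical sample data standing in for reviews fetched from a service.
reviews = [
    {"state": "CA", "text": "Great tacos and friendly staff!"},
    {"state": "CA", "text": "Slow service and cold food."},
    {"state": "NY", "text": "The bagels were fine."},
]

result = group_by_state(reviews)
for state, items in result.items():
    print(state, items)
```

A developer would still need to review code like this, replace the sample data with a real data source, and decide whether a word list is good enough for the job.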
Visions of the Future
AI models are not only used for language; they are also used to create images and videos. Here again, a large model combined with a language model is transforming the visual arts. Previously, a model for generating images had to be trained on a specific set of images. If you wanted foxes, you had to find 3,000 or more images of foxes to train a single model. These models were called GANs, or generative adversarial networks.
Today there is a new kind of model. Much like the LLM, it is a huge model trained on billions of images. In addition, it is trained on the captions that accompany the images, so the model learns the language and the images together. A few of these are Stable Diffusion, DALL-E and Midjourney, and more are coming from Google, Microsoft, OpenAI and BigScience.
Orange Cat from a GAN model
GAN models are known to produce strange artifacts in their images. Artists use these effects for artistic purposes, as in this cat that is not a cat. The model was trained on a thousand images of real cats.
But the new models like Stable Diffusion can produce more realistic images from just an instruction, such as “a comic about Putin talking with Mark Twain.”
Stable Diffusion - comic of Putin talking with Mark Twain
Or perhaps a poster about Putin. These particular models do not yet know how to write words, so the text comes out jumbled.
Stable Diffusion Putin Propaganda Poster
These image models are being iterated on daily by many researchers. Within days these experiments are put into tools like Photoshop and Runway ML. In fact, one called InstructPix2Pix came out recently; it takes an image and lets you edit it with prompts. So I could say “put a hat on Putin” or “make him into a cyborg.” Let’s try it:
It is not perfect, as it thinks the shape of his pocket is a face. (See the frog image.)
Other AI on the Near Horizon
Currently there are also efforts to generate video and speech from text. One project uses GPT-3 and ChatGPT to generate stories on a particular subject. The text is fed into a text-to-image AI, which produces illustrations to match the story. Voice synthesis reads the story aloud, and the result is used to create YouTube videos. Automating content is becoming very easy. What will this mean for culture and work?
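The story-to-video pipeline described above can be sketched as three steps chained together. This is only an outline under stated assumptions: the functions passed in for writing, illustrating and narrating are placeholders invented for this example, and a real project would call an actual language model, image model and voice synthesizer in their place.

```python
from dataclasses import dataclass

@dataclass
class VideoAssets:
    story: str
    images: list
    narration: str

def make_video_assets(subject, write_story, illustrate, narrate):
    """Run the three stages: text, then one image per paragraph, then audio."""
    story = write_story(subject)
    paragraphs = [p for p in story.split("\n") if p.strip()]
    images = [illustrate(p) for p in paragraphs]
    narration = narrate(story)
    return VideoAssets(story=story, images=images, narration=narration)

# Placeholder stages (assumptions): each returns a label, not real media.
assets = make_video_assets(
    "a frog in a suit",
    write_story=lambda s: f"Once there was {s}.\nIt went to work.",
    illustrate=lambda p: f"<image for: {p}>",
    narrate=lambda t: f"<audio, {len(t.split())} words>",
)

print(assets.images)
print(assets.narration)
```

Because each stage is just a function handed to the pipeline, any one of them can be swapped for a different tool without changing the rest.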
Disruption To The Meaning of Work
Will all of this technology again change the nature of our work? It seems likely that people will be doing their jobs differently. These tools could also make it easier to create disinformation and propaganda. They could change the landscape of work.
Please continue watching Marimsa Solutions for more information about the artificial intelligence and machine learning space.