DALL-E 3 produces much more detailed images and will be integrated into ChatGPT

Taras Mishchenko - 21 September 2023, 10:05 AM

OpenAI has unveiled DALL-E 3, the latest version of its AI image synthesis model. This version is integrated with ChatGPT and can generate images that accurately match complex descriptions. It can also render text within images, such as labels and captions, which was a challenge for previous versions. The model will be available to ChatGPT Plus and Enterprise users in early October.

DALL-E 3, like its predecessors, is a text-to-image generator that creates new images from written prompts. Although OpenAI has not disclosed specific technical details of DALL-E 3, it can be assumed that the model, like previous versions, was trained on millions of human-made images, some of them sourced from stock photo libraries such as Shutterstock. The new model probably also benefits from new training methods and more computational training time.

OpenAI's sample images for DALL-E 3 suggest that it outperforms other image synthesis models at following prompts accurately. The generated images appear to follow the given instructions precisely, rendering objects with minimal distortion. OpenAI emphasizes that DALL-E 3 renders fine details such as hands better than DALL-E 2.

DALL-E 3 also demonstrates an improved ability to embed text in images, which was difficult for its predecessor. For example, a prompt describing an avocado in a therapist's chair saying "I feel so empty inside" produced a cartoon avocado with that exact phrase in a speech bubble.

OpenAI emphasizes that DALL-E 3 has been "natively built" on ChatGPT. This integration will let users refine images conversationally, using the AI assistant as a brainstorming partner. It also means that ChatGPT can generate images based on the context of the current conversation, potentially opening up new possibilities. It is worth noting that Microsoft's Bing Chat AI assistant, which uses OpenAI technology, has been generating images within conversations since March.