Graphic Designers, Take Note: A New Tool Could Change the Game
Introducing COLE, a new tool named after Henry Cole, who introduced the first commercial Christmas card in 1843. Users enter a graphic design idea, such as "a poster for a Winter Holiday concert with musicians in warm clothes amidst falling snow," and COLE uses AI to generate both the image and the complementary text.
What is COLE?
COLE combines several AI models, including fine-tuned versions of Meta’s Llama2-13B, DeepFloyd IF, and LLaVA1.5-13B, along with GPT-4V and the open-source graphics renderer Skia. Developed by a team of 12 researchers from Microsoft Research Asia and Peking University, COLE addresses two problems: the complexity of graphic design and the scarcity of training data in key formats, particularly SVG files. The researchers simplified SVG elements into a unified image layer, allowing the AI to describe background layers through text.
COLE's background model was trained on a collection of 100,000 high-quality graphic design images sourced from the internet.
More Than Just a Product
Currently, COLE functions more as a framework than a commercial product. However, its capabilities are impressive. By simply inputting prompts, COLE can create crisp, organized graphic designs that seamlessly integrate visuals and stylized text. This marks a significant advancement, as generating integrated text and imagery has been challenging for many AI art generators, including leaders like Midjourney and DALL-E 3.
Editable AI-Generated Designs
Perhaps the most remarkable feature of COLE is its ability to produce images with editable text and visual elements. Users can modify text directly within the framework without needing to export to software like Adobe Photoshop or InDesign. For instance, they can easily change the font or adjust the visuals, transforming a grocery bag from a photorealistic style to a cartoon representation.
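To make the idea of editable output concrete, here is a minimal sketch of a layered design in which text is stored as data rather than baked into pixels. It is an illustration only, not COLE's actual format or API: the class names (ImageLayer, TextLayer, Design), the methods, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class ImageLayer:
    """A raster layer, such as the background or a foreground object (hypothetical)."""
    name: str
    style: str  # illustrative label, e.g. "photorealistic" or "cartoon"


@dataclass(frozen=True)
class TextLayer:
    """A text block kept as editable data instead of rendered pixels (hypothetical)."""
    content: str
    font: str
    color: str


@dataclass(frozen=True)
class Design:
    """A design as a stack of image layers plus separate text layers."""
    images: List[ImageLayer] = field(default_factory=list)
    texts: List[TextLayer] = field(default_factory=list)

    def with_font(self, index: int, font: str) -> "Design":
        """Return a copy in which one text layer uses a different font."""
        texts = list(self.texts)
        texts[index] = replace(texts[index], font=font)
        return replace(self, texts=texts)

    def with_style(self, name: str, style: str) -> "Design":
        """Return a copy in which the named image layer has a new style label."""
        images = [
            replace(img, style=style) if img.name == name else img
            for img in self.images
        ]
        return replace(self, images=images)


# Sample values are invented for illustration.
poster = Design(
    images=[
        ImageLayer("background", "photorealistic"),
        ImageLayer("grocery bag", "photorealistic"),
    ],
    texts=[TextLayer("Fresh Market", "Helvetica", "#ffffff")],
)

# Change the font and restyle one object without touching the other layers.
edited = poster.with_font(0, "Georgia").with_style("grocery bag", "cartoon")
```

Because each layer is a separate record, an edit touches only the layer it targets and leaves the original design unchanged. In a real system, restyling an image layer would mean regenerating that layer; here it is reduced to a label.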
According to the researchers in their recent arXiv paper, “A scalable, high-quality graphic design generation system should require minimal effort from users, produce accurate typography, and offer flexible editing options.” COLE is their attempt to meet all three requirements.
Competitive Quality in Graphic Design
The researchers assert that COLE produces outputs of "very competitive quality," even when compared to DALL-E 3. They thoroughly tested COLE on 200 graphic design projects ranging from advertisements to event promotions, documenting their prompts for transparency.
COLE performs best on covers, headers, and posters, and it offers finer-grained editing of individual elements than DALL-E 3 and similar tools.
However, COLE is not yet a complete solution. Users cannot change the arrangement of text blocks, and the tool currently allows only one color of typography per image. The researchers plan to address these limitations in future developments.
A New Era for Graphic Designers?
High-quality graphic design is often taken for granted, yet it is an art form in itself. Designs—whether concert posters or functional graphics like road signs—reflect skill and creativity.
Does COLE pose a threat to graphic designers? The answer is nuanced. COLE's editable fields let users refine outputs and apply human expertise. At the same time, it reduces a process that traditionally requires professional skill to writing an effective prompt, which puts polished designs within reach of people without formal training.
In essence, COLE aims to democratize high-quality graphic design, a goal that companies like Adobe and Canva already pursue. COLE could therefore compete with existing tools on the market or be used to enhance them.
For now, COLE is not publicly available, but the researchers say a demo will soon be released on the project's GitHub page.