Our clients frequently ask us about emerging technologies that are attracting a lot of hype. They want to know how ready and how capable these technologies really are, compared with what they read in online articles and discussions.
The AI QR code generator is, without a doubt, one of these technologies. Recently, one of our clients asked us exactly that: whether the technology has matured enough to be used effectively by digital companies while meeting precise briefs and company branding.
Our Lead Web Developer, Benoit Zenker, who led the exploration, shares the team's findings here.
Context/State of the art
Obtaining reliable information on AI can be quite a challenge. The technology is progressing at a remarkable pace, with new AI breakthroughs making headlines almost daily. As part of our exploration, we have delved into the workings of the AI technology behind popular social media posts.
The following are some of the main QR codes that sparked the curiosity of our client and, indeed, everyone else:
What do we know about these QR images?
In June 2023, a set of eight QR code images made by a Chinese research team, including the images above, was unveiled on Reddit. It instantly captured the attention of the web community and left many intrigued by the potential applications.
To generate the QR codes, the team used Stable Diffusion, an AI model developed by Stability AI. Custom models, purposely trained for QR code generation, had to be designed. As it stands, Stable Diffusion is the only AI model capable of generating QR codes; neither Midjourney nor DALL-E has this capability.
Numerous other users of Stable Diffusion have attempted to develop their own models for the same purpose. However, thus far, we have not encountered anything as impressive or reliable as the QR codes created by the Chinese research team.
While the Chinese team's model remains closed and inaccessible, many other models developed by different users are open source, meaning we could use them to create our own QR codes. That's precisely what we did!
Target Brief
It’s not enough to be able to create a visually stunning QR code; it must also meet real-life production standards. Therefore, before experimenting with the available technology, we defined the key performance indicators (KPIs) that we think determine whether the tool is ready for use in a brand marketing environment.
The identified KPIs are as follows:
Alignment with Branding: Ensuring the QR code design can seamlessly align with the client's branding.
Functional Efficiency: Establishing consistent QR code functionality across various mobile devices.
Scalability: Enabling ongoing QR code creation while preserving the desired campaign style and client branding.
Tool Exploration
The key available technology for generating AI QR codes is ControlNet, a neural network specifically designed to control the image generation process. At the moment, ControlNet is only compatible with Stable Diffusion.
With these two tools in hand, we embarked on the journey of generating QR codes.
The power of these tools lies in their adaptability, as they can be combined with different models. A model, in this context, refers to an algorithm trained on a specific set of images to acquire a particular style or subject.
To access a suitable model, you can download one from Civitai, an extensive online library of community models. Alternatively, if you have sufficient data, you can train your own custom model.
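To make the moving parts concrete, here is a minimal sketch of how a text-to-image request with one ControlNet unit might be assembled for the Stable Diffusion web UI listed in the bibliography. It is an illustration only, not a record of the exact configuration used in this exploration. The body layout (an alwayson_scripts entry holding a list of ControlNet units) and the field names image, module, model and weight are assumptions based on the ControlNet extension's API and may differ between versions. The function name, the default sizes and the model name passed in are hypothetical. The sketch only builds the request body; it does not send anything.

```python
import base64
import json


def build_txt2img_payload(prompt, qr_png_bytes, controlnet_model,
                          control_weight=1.0, steps=30, size=768):
    """Build a JSON-serialisable body for a QR-guided text-to-image request.

    All field names below are assumptions about the web UI and its
    ControlNet extension; check them against the installed version.
    """
    # The plain black-and-white QR code is passed as the control image.
    qr_b64 = base64.b64encode(qr_png_bytes).decode("ascii")
    controlnet_unit = {
        "image": qr_b64,            # assumed field name for the control image
        "module": "none",           # assumed: no preprocessing of the QR image
        "model": controlnet_model,  # hypothetical name of a QR ControlNet model
        "weight": control_weight,   # how strongly the QR pattern is enforced
    }
    return {
        "prompt": prompt,
        "steps": steps,             # arbitrary default
        "width": size,              # arbitrary default
        "height": size,
        "alwayson_scripts": {"controlnet": {"args": [controlnet_unit]}},
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real QR code PNG file.
    body = build_txt2img_payload("a snowy mountain village",
                                 b"placeholder", "hypothetical_qr_model")
    print(json.dumps(body, indent=2)[:200])
```

The point of the structure is the division of labour: the prompt and base model decide what the picture looks like, while the ControlNet unit and its weight decide how closely the picture follows the QR pattern.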
To begin with, we took on the challenge of converting a brand logo into a QR code while preserving the logo's brand and design identity. All output we obtained from these attempts had little resemblance to a QR code and failed to maintain the logo's original design identity. Equally important, none of our efforts resulted in a properly functioning QR code.
During these attempts, we learned two key lessons:
The AI system requires images with a certain level of complexity and detail to successfully generate an intriguing and functional QR code image.
The AI system produced much better results when given text prompts than when fed existing images.
Text Prompts
Much like all current AI image generators, Stable Diffusion provides a "text to image" mode, where users input a series of keywords (referred to as "prompts") to describe the desired output. Compared with the "image to image" mode, which uses existing pictures as input, this approach offers two distinct advantages:
Increased Creative Freedom: In the "text to image" mode, the AI enjoys greater flexibility, allowing it to shift pixels more freely, leading to more imaginative and diverse results.
Enhanced QR Code Efficiency: By incorporating keywords related to QR codes, such as geometric, contrast, and highly detailed, in combination with other descriptive keywords, the system can optimise the generation process, resulting in more efficient and effective QR code designs.
This picture is an example of an output we generated using the prompt “Frontal view of old, wooden stairs in front of a snowy mountain with many idyllic cabins”.
Notice how the keywords “wooden stairs” and “snowy” contribute to the creation of dark geometric shapes against a white background.
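The idea of combining a scene description with QR-friendly keywords can be sketched as a small helper. This is an illustrative assumption about how a prompt might be assembled, not the tooling used for the image above; the function name build_prompt is hypothetical, and the default keyword list is simply the three keywords mentioned earlier.

```python
# Keywords named in the article as helping QR code generation.
QR_FRIENDLY_KEYWORDS = ("geometric", "contrast", "highly detailed")


def build_prompt(description, extra_keywords=QR_FRIENDLY_KEYWORDS):
    """Append QR-friendly keywords to a scene description, skipping duplicates."""
    parts = [description.strip().rstrip(".")]
    seen = {parts[0].lower()}
    for keyword in extra_keywords:
        keyword = keyword.strip()
        # Ignore empty entries and case-insensitive repeats.
        if keyword and keyword.lower() not in seen:
            parts.append(keyword)
            seen.add(keyword.lower())
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "Frontal view of old, wooden stairs in front of a snowy mountain "
        "with many idyllic cabins"
    ))
```

Keeping the descriptive part and the QR-related keywords separate makes it easier to reuse the same keyword set across different scenes.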
Aside from the obvious visual mistakes in the mountains, which can easily be resolved, the more significant issue is that the QR code blends in so well with the image that a phone no longer recognises it. We came across this issue with almost all of the images we generated. We managed to create impressive QR images quite easily, but nearly all of them failed to function reliably as QR codes.
To address this issue, we experimented with several approaches.
Fine tuning the AI settings
In addition to text prompts, Stable Diffusion exposes several settings that alter the outcome of the generation process. While it involves a fair amount of trial and error, we managed to keep the same text prompt and, through these settings alone, make the QR code more visible and distinguishable.
However, it's important to note that this fine-tuning process must be approached on a case-by-case basis. The same settings, when applied to a different text prompt, might yield unexpected results. This process quickly becomes a true dialogue between the user and the machine, requiring careful adjustments to achieve the desired outcome for each unique input.
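One way to keep this trial and error organised is to vary a few settings systematically while holding the prompt fixed. The sketch below simply enumerates combinations to try. The three settings chosen (ControlNet control weight, CFG scale and seed) are assumptions about which knobs are worth sweeping, not a list of the settings adjusted for the images in this article, and the function name is hypothetical.

```python
from itertools import product


def settings_grid(control_weights, cfg_scales, seeds):
    """Yield one settings dict per combination, for use with a fixed prompt."""
    for weight, cfg, seed in product(control_weights, cfg_scales, seeds):
        yield {"control_weight": weight, "cfg_scale": cfg, "seed": seed}


if __name__ == "__main__":
    # Arbitrary example values; suitable ranges depend on the prompt and model.
    for settings in settings_grid([0.9, 1.1, 1.3], [7.0], [1, 2]):
        print(settings)
```

Because the same values can behave differently with another prompt, a grid like this would need to be re-run, and its results re-scanned, for each new prompt.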
Using different models
A great thing about Stable Diffusion is that it’s open source, meaning there’s a very active community of users sharing their tools. This gives easy access to a diverse array of downloadable models, each trained to produce a distinct style, such as photorealistic, cartoony, or painting-like, or a particular subject, such as human figures, landscapes, or objects.
Here we have two images generated using different models: on the left is the image created with the "Neo Sci-Fi" model, and on the right is the one generated with the "Ghostmix" model:
While the visual outcome may indeed appear impressive, it is crucial to understand that this is primarily because the images are generated without any constraints in terms of adhering to a specific brief or brand guidelines. As a result, the AI has the freedom to explore and generate diverse shapes and designs, leading to visually captivating outputs.
Achieving the same level of detail and design while remaining consistent with a specific brand will require a substantial amount of effort and/or a bespoke model, purposely trained to output branded images.
In the Sci-Fi building image on the left, you'll notice rounded corners on the QR code and a larger margin surrounding it. Such adjustments can be used to integrate the QR code seamlessly into the picture.
However, both of these QR codes suffer from the same kind of functional issues as the previous ones. The first does not scan at all, while the second works only on some devices, and not consistently.
A creative approach
We quickly learned that utilising AI is not a fully automated process; it requires creative guidance, particularly when aiming to achieve an output that aligns with a specific brand identity.
To address this, we made the decision to involve a graphic designer from our creative team. With their experience and creative mindset, they provided relevant art direction, resulting in more captivating and brand-aligned images.
Our graphic designer offered a valuable insight, suggesting that AI generation could be integrated into a larger creative process that incorporates more traditional design tools.
For this AI-generated picture, we applied manual edits, adjusting the contrast to enhance the visibility of the black pattern. Additionally, we seamlessly integrated the original QR code as an overlay on top of the image, using a low opacity setting to maintain subtlety while ensuring full functionality.
This process not only helped us come closer to adhering to a specific brief and branding but also contributed to increasing the reliability of the QR functionality.
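The two manual edits described above, raising the contrast and overlaying the original QR code at low opacity, are normally done in an image editor. As a rough illustration of what they do to pixel values, here is a minimal sketch on small grayscale grids (values 0 to 255). The function names, the pivot of 128 and the linear blend are assumptions made for the example, not a description of the designer's actual workflow.

```python
def stretch_contrast(image, factor, pivot=128):
    """Scale each pixel's distance from the pivot, clamping to 0..255.

    A factor above 1 pushes dark pixels darker and light pixels lighter.
    """
    return [
        [max(0, min(255, round(pivot + factor * (pixel - pivot)))) for pixel in row]
        for row in image
    ]


def overlay(base, qr, opacity):
    """Alpha-blend the QR pattern over the base image.

    opacity 0.0 keeps the base untouched; 1.0 replaces it with the QR code.
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0 and 1")
    if len(base) != len(qr) or any(len(b) != len(q) for b, q in zip(base, qr)):
        raise ValueError("base and qr must have the same dimensions")
    return [
        [round((1 - opacity) * b + opacity * q) for b, q in zip(base_row, qr_row)]
        for base_row, qr_row in zip(base, qr)
    ]


if __name__ == "__main__":
    artwork = [[100, 200], [90, 180]]   # tiny stand-in for the AI image
    pattern = [[0, 255], [255, 0]]      # tiny stand-in for the QR code
    print(overlay(stretch_contrast(artwork, 1.5), pattern, 0.2))
```

The opacity is the trade-off dial: a higher value makes the code easier to scan but makes the traditional QR pattern more visible in the artwork.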
Meeting the KPIs
With a better grasp of the tool's capabilities, it's time to revisit the key points mentioned earlier and evaluate whether the AI can meet our expectations.
Alignment with Branding
This remains close to impossible with currently available models and the current state of the technology, as the AI still lacks a true understanding of a brand's identity and requires a good amount of manual human work.
As it stands, we mostly control the creative output using text input, so aligning it with a specific visual identity can be quite complex. The AI may use colours, shapes, and proportions that do not align with the brand guidelines. These issues could be fixed by manually editing the AI's output, but that would be quite time-consuming.
Functional Efficiency
It is crucial to ensure that the AI-generated QR code maintains the same level of reliability as the traditionally used QR code functionality. However, as we’ve discussed, the more creative we become, the less reliable the scanning functionality becomes, and vice versa. Striking the right balance between creativity and QR code reliability proves to be a challenging task.
When attempting to "camouflage" a QR code with AI, we are essentially pushing the QR code to its limits. This is also evident in the QR codes created by the Chinese team, which do not function optimally either. Although we only conducted a light smoke test using a few different devices, both iOS and Android, it was sufficient to determine that successful scanning demands a precise distance and appropriate lighting conditions. Moreover, depending on the specific phone model, some QR codes may not work at all.
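A smoke test like this is easier to compare across QR codes if the outcomes are recorded in a consistent way. The sketch below assumes each attempt is logged as a (QR code, device, scanned) record and summarises the results; the record layout and function names are hypothetical and do not reflect how our own test was tracked.

```python
from collections import defaultdict


def scan_success_by_code(results):
    """Return {qr_id: fraction of scan attempts that succeeded}.

    results: iterable of (qr_id, device, scanned) tuples, scanned being a bool.
    """
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for qr_id, _device, scanned in results:
        attempts[qr_id] += 1
        successes[qr_id] += bool(scanned)
    return {qr_id: successes[qr_id] / attempts[qr_id] for qr_id in attempts}


def failing_devices(results):
    """Return {qr_id: sorted devices on which the code failed at least once}."""
    failed = defaultdict(set)
    for qr_id, device, scanned in results:
        if not scanned:
            failed[qr_id].add(device)
    return {qr_id: sorted(devices) for qr_id, devices in failed.items()}
```

For production use, a sensible bar would be a success fraction of 1.0 and an empty failing-device list for every code; distance and lighting could be added to each record as extra fields.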
Scalability
During our exploration, we found that very few impressive-looking QR codes worked perfectly right out of the box. Despite this, we were able to generate beautiful and functional QR codes.
Because of the nature of AI, the creation process for every QR code needs to start almost from scratch each time. Additionally, the more QR codes you create for the same campaign, the more challenging it becomes, because you have to train the AI to preserve certain design elements while altering others to generate a new design. AI is not a magic tool; it requires human supervision and feedback in order to unlock its full potential.
Given the current state of the technology, if we are to attempt to create a fully automated process, we’ll need to compromise on the reliability of the QR functionality or have the traditional QR pattern clearly visible, or possibly both.
The Verdict is in…
The technology is undoubtedly exciting and advancing rapidly, but we believe it is not yet fully ready for creating branded QR codes.
Although these AI-generated QR codes look impressive, they still face two significant challenges: functional efficiency and alignment with branding.
However, we believe that these issues could be resolved in the near future. From our perspective, the most encouraging prospect lies in training a custom model on a carefully selected set of reference images. This approach would provide better control over both the functionality and the visual appearance of the generated images.
We are also very excited about the possibilities that the ControlNet plugin offers. While we used it for QR code generation, it provides many other ways to maintain precise control over Stable Diffusion’s output. This potential could be a game-changer for the commercial application of AI-generated images.
Bibliography
How to:
The most complete technical analysis of how to generate QR codes with Stable Diffusion https://antfu.me/posts/ai-qrcode-101
Installing and using Stable Diffusion https://stable-diffusion-art.com/beginners-guide/
Code we used:
Stable Diffusion https://github.com/AUTOMATIC1111/stable-diffusion-webui
ControlNet https://github.com/lllyasviel/ControlNet
ControlNet models: https://civitai.com/models/90940/controlnet-qr-pattern-qr-codes, https://huggingface.co/monster-labs/control_v1p_sd15_qrcode_monster, https://huggingface.co/DionTimmer/controlnet_qrcode-control_v1p_sd15
Models
https://civitai.com/models/100287
https://civitai.com/models/36520
Some very recent paid services (low reliability):
https://qrbtf.com/ai by the pioneering Chinese team