Alibaba's new model AnyText: Make an e-commerce promotion poster in 1 minute by just talking.

Image Source: Generated by Wujie AI

Rendering legible text inside images has long been a weakness of large text-to-image models. AnyText, a new model developed by Alibaba, offers a solution to this problem.

For example, given the prompt "An e-commerce advertisement for a pen, with the words 'Double 12 Big Promotion!', 'Smooth water output', 'Immediate delivery', 'Free shipping', '50 yuan off'," AnyText quickly generates the following image:

It can even be used directly as an e-commerce image without modification.

Currently, AnyText supports four languages: Chinese, English, Japanese, and Korean. The project has released a demo in the ModelScope community, where it can be tested directly, and it can also be deployed locally.

Given the prompt "A blue wall with the word 'happy' in Chinese, English, Japanese, and Korean," AnyText produced this image:

Although the text looks a bit odd, the model does at least render it. Let's try another theme. The "Southern Little Potato" meme is popular right now, so let's follow the trend and head to Harbin:

The result is good, good enough to suggest that AnyText has surpassed Midjourney, currently the strongest image generator, at rendering text. Notably, the recently updated Midjourney can only output simple English text, and its results are merely average.

AnyText understands the prompt and pairs suitable graphics with suitable text. The results are not necessarily highly artistic, but they are fully practical. At the very least, it offers another way to make meme stickers.

Generated by AnyText

AnyText currently provides two functions: image generation and image editing. As the names suggest, image generation creates an image containing text from the user's description, while image editing lets the AI change or add text in an existing image.

Image editing is a very practical function of AnyText. Simply upload the image, paint over the area where you want to change or add text, and write a prompt describing the new text. You can replace existing text or add new text to the image.

The above image shows the effect after editing with AnyText, and the image below is the original image.

The left image is the original, and the right image is the modified effect.

This image editing function can greatly speed up the work of graphic designers who need to modify images. With tools like AnyText available, however, everyone may need to be extra careful when judging the authenticity of content in images.

The other function, image generation, is AnyText's main function and can to some extent replace graphic design work. In addition to writing a prompt, users can control where the text appears. AnyText provides three modes for this: random, freehand, and drag-box.

Freehand mode lets users paint the text position freely. Users who prefer a more regular placement can use drag-box mode to drag out a rectangular text box and let the AI fill it in.

Freehand

Drag-box

If you can't think of a suitable position, you can also choose random and let AI arrange it on its own.
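As a rough illustration of the drag-box and random modes, a text position can be thought of as a rectangle marked on the canvas. The sketch below is only an assumption about how such a position might be represented. The article does not describe how the demo encodes positions internally, and the names `box_mask` and `random_box` are hypothetical.

```python
import random


def box_mask(width, height, box):
    """Return a height x width grid with 1 inside the box and 0 elsewhere.

    box is (left, top, right, bottom) in pixels; right and bottom are exclusive.
    This stands in for a rectangle the user drags out in drag-box mode.
    """
    left, top, right, bottom = box
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def random_box(width, height, box_w, box_h, rng=None):
    """Pick a box of the given size at a random position that fits the canvas.

    This stands in for random mode, where the placement is left to the tool.
    """
    rng = rng or random.Random()
    left = rng.randint(0, width - box_w)   # randint is inclusive on both ends
    top = rng.randint(0, height - box_h)
    return (left, top, left + box_w, top + box_h)


# Drag-box style: the user supplies the rectangle.
mask = box_mask(8, 4, (2, 1, 5, 3))

# Random style: the position is chosen automatically.
auto_box = random_box(8, 4, box_w=3, box_h=2, rng=random.Random(0))
auto_mask = box_mask(8, 4, auto_box)
```

A freehand region would be an arbitrary painted shape rather than a rectangle, so it would need a mask drawn cell by cell instead of computed from four coordinates.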

After choosing the text position, we can enter the prompt and adjust the image parameters. Here are more images generated by AnyText:

Input in random mode: Generate a newspaper from 1980 with the title "New Newspaper"

Input in freehand mode: An oval nameplate with the words "Name: Luo Jiancheng, ID: 0875"

Input in freehand mode: Generate a futuristic LOGO with the words "GENAI New World"

Input in drag-box mode: A classical portrait with the inscribed poem "Do you know, should be green and red thin"

Input in random mode: Draw a cream cake decorated with fruits, with the words "Happy Birthday" underneath

Input in drag-box mode: A children's crayon drawing, with a candy house in the forest, titled "Candy House"

Input in freehand mode: A woman standing in front of a bulletin board with the words "Safety Production"

These images show that AnyText renders text far better than its peers. Both Chinese and English text are clearly legible, and it can even display ancient Chinese characters with ease.

That makes it all the more regrettable that AnyText's image quality and prompt understanding cannot keep up with its text output. AnyText is like a student who excels in one subject but is only average overall, which is more regrettable than a model that excels at nothing.

AnyText also has a big problem: generation time. Many image generation models need some time to produce content, but none take as long as AnyText. One batch of images typically takes 3-4 minutes, and some runs exceed 5 minutes. The estimated time that AnyText itself reports often does not match the actual time spent, which makes the wait feel even longer. AnyText may also run into bugs that force users to regenerate the image.

Another issue is that although AnyText lets users change the image's resolution, intensity, seed, style, and other professional parameters, the interface offers little guidance. Many people can hardly find where to change the parameters unless they click around at random. It is regrettable that after a year of development in large generative models, users still have to discover these basic functions on their own.

Overall, AnyText is not yet a mature product. It has a clear advantage in text output, but given its current image quality, it probably needs further training before it is ready for practical use.

