AI painting has recently become a hot topic in art circles. For decades, painting was seen as a barrier that AI could not breach. Today, that barrier may be facing the same kind of challenge that AI once posed to the game of Go.
"Starry sky, a hillside full of red roses, a dilapidated stone castle, Van Gogh style, oil painting." Only a few minutes after entering these five key phrases, the reporter received an oil painting in a romantic, distinctly Van Gogh style that, at first glance, bears some resemblance to the famous "The Starry Night". The creator of all this was not a human but AI software with deep learning capabilities.
Artistic creativity, once regarded as humanity's "moat", may now be within AI's reach. Last month, at the Colorado State Fair art competition in the United States, a work called "Space Opera" passed several rounds of judging and was ultimately awarded the competition's gold medal. The artwork, however, was not painted by hand: a game designer generated it with an AI painting tool called Midjourney. This has sparked a heated debate in the art world.
Soon after Midjourney became popular, a large number of AI painting platforms emerged in China. Notably, the "Wenxin" platform that Baidu presented at this year's World Artificial Intelligence Conference also includes an AI painting function. At the time, Baidu CEO Robin Li said: "In the past year, artificial intelligence has made great progress at both the technical level and the commercial application level, and in some respects has even changed direction. It is a directional change in which AI goes from understanding language, text, pictures, and videos to generating content."
In moving from input to output, AI is now shifting from quantitative change to qualitative change. As early as 2015, Google introduced a program that could use AI for simple image generation, and now the technology has seen another important innovation.
"What attracted the most attention in the industry this time was the appearance of the Stable Diffusion model, which fixed the shortcomings of the earlier Google Disco Diffusion model in painting human faces." A senior programmer told the reporter that faces are extremely demanding to paint: if the likeness falls short, the result triggers the "uncanny valley" effect, a problem earlier models had been unable to solve well. Being able to draw faces means that the application scenarios for AI painting are greatly expanded, and it is also a valuable breakthrough in the field of multimodal pre-training.
"Multimodal pre-training in AI is not really new." An industry engineer told the reporter that "multimodal" refers to integrating several different types of information, such as text, images, and sound, when training an AI model. AI painting, for example, is essentially the process of converting the semantics of text into visual images. "In fact, the familiar speech-to-text function is also a multimodal pre-trained AI."
Because this kind of AI painting is essentially a "training project" built on these open-source models, the development threshold is not high, and a large number of AI painting platforms have emerged in China over the past two months. A search for the keyword "AI painting" among WeChat mini programs returns more than 20 related programs.
The transformative power of AI painting for the industry is already apparent. Shortly after Midjourney became popular, a number of internationally renowned newspapers and magazines began using works it generated for covers and illustrations.
Domestically, Baidu used its "Wenxin" platform at this year's artificial intelligence conference to create a number of posters for popular TV dramas in a short time, several of which prompted users to exclaim that "professional artists might need several days to paint this." At the time, the relevant person in charge said that the function had been tested on hundreds of platforms and was open to some authors. In the future, Baidu will rely on the Wenxin platform to launch more advanced creation tools on hundreds of websites, including automatically matching generated AI pictures with corresponding music and text, and generating short videos with one click, without creators having to edit clips themselves.
As these AI painting programs are gradually put into commercial use, the most visible change is that the cost of creating multimedia content will fall significantly. According to media reports, Midjourney has more than 3 million registered users and offers paid plans for generating paintings for as little as $30 a month, far less than the cost of hiring a traditional illustrator.
Clearly, the illustration industry will face challenges. The reporter learned that some businesses have begun to consider using AI painting to save money in the low-end illustration market. Some illustrators recently told the media that, for the foreseeable future, key artwork will still require well-known illustrators, but most less demanding illustrations will be handled by artificial intelligence.
Of course, the rise of AI painting will also bring new opportunities. Recently, many short video platforms and game companies have posted openings for related positions such as "multimodal intelligent creation algorithm engineer". Judging from the job descriptions, the main task is to achieve intelligent content output by training the relevant models.
Interestingly, the reporter noted that some recruitment platforms now also list positions responsible for tuning and assisting AI painting. These positions, labeled "illustrator", actually involve constantly adjusting keywords to help the AI mass-produce acceptable illustrations.
This also reveals the problem with most current AI painting platforms. In hands-on testing, the reporter found that the quality of paintings generated by most platforms on the market is still not very high, and that platforms widely fail to interpret and integrate all of the keywords. "At their core, these models need to keep being fed large amounts of high-quality data for the AI to get better and better. Many small models do not have enough data, so the quality of their paintings is bound to be unsatisfactory. Another reason is that many mature models were developed in English, and the semantic logic of Chinese is clearly different from that of English."
Source: Shanghai Securities News