kittey said:
Sure, that’s possible, but the thing is, the art itself is somewhat useless if you don’t know what it depicts. I’m pretty sure the AI makers scraped Danbooru because it is known as the best-tagged anime image board. NAI Diffusion, for example, works with the same tags as Danbooru to give its users what they want, because exactly those tags were used as training input.
Pixiv tags are pretty useless for that. Not only does Pixiv allow only a very small number of tags per image (12, I believe?), which is not enough to actually describe an image, but those tags are often used inconsistently, with several different tags for the same thing. That forces artists to spend their precious few tag slots on redundancy, leaving even less room to describe anything besides characters and copyright. From the perspective of an AI developer, Pixiv tags are absolutely useless. Scraping a site with untagged images is of course possible, but will only let you generate basically random images.
ComradeMokou said:
Pixiv and most other image boards in general don’t have nearly as good tags as Danbooru does. AI training needs data. Good data in, good model out.
This neglects the fact that AIs can already label data, and that AI-labeled data can then be used to train a different AI.
https://openai.com/blog/vpt/
This paper shows a technique where OpenAI:
1. Downloaded a huge amount of unlabeled data (Minecraft Let’s Play videos) from the internet
2. Paid a small number of people to create a small amount of labeled data (by playing Minecraft and recording all their mouse and keyboard inputs)
3. Trained an AI on the small amount of labeled data that predicts inputs
4. Ran that AI on all the unlabeled data to create labels for it
5. Trained a second AI on the large amount of now-labeled data to play Minecraft
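The five steps above boil down to a generic pseudo-labeling loop. Here is a minimal sketch of that data flow, where `train_labeler` is a trivial stand-in (majority vote) for the real deep networks, just to make the pipeline concrete; none of this is OpenAI's actual code:

```python
from collections import Counter

def train_labeler(labeled):
    """'Train' on (example, label) pairs.
    Stand-in model: just remember the majority label."""
    majority = Counter(label for _, label in labeled).most_common(1)[0][0]
    return lambda example: majority

def run_pipeline(unlabeled, small_labeled):
    labeler = train_labeler(small_labeled)               # step 3: small labeled set
    auto_labeled = [(x, labeler(x)) for x in unlabeled]  # step 4: pseudo-label the rest
    return train_labeler(auto_labeled)                   # step 5: train on everything

final_model = run_pipeline(
    unlabeled=["video_1", "video_2", "video_3"],
    small_labeled=[("clip_a", "jump"), ("clip_b", "jump"), ("clip_c", "mine")],
)
print(final_model("video_4"))  # -> jump
```

The point is that the expensive human labeling (step 2) only has to cover a tiny seed set; the cheap model then amplifies it over the whole corpus.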
Even if Danbooru never existed, someone could apply this technique to Pixiv and other sites to get a bunch of well-tagged images.
But Danbooru does exist, so step 2 can be skipped, no need to pay people to tag images.
And Danbooru already has an open source AI tagging model, so step 3 can be skipped too.
All someone really needs is the AI tagging model plus the ability to scrape images from Pixiv, Twitter, etc., and they can get tagged images. Danbooru's database itself isn't a key component here.
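In code, that scrape-and-tag pipeline is tiny. A minimal sketch, assuming a pretrained tagger that maps image data to per-tag confidences; `fake_tagger` is a stand-in for a real open-source model (such as DeepDanbooru), and the threshold logic is the part that carries over to real use:

```python
def tag_images(images, tagger, threshold=0.5):
    """Auto-tag scraped images, keeping only tags the model is confident about."""
    tagged = {}
    for name, pixels in images.items():
        scores = tagger(pixels)  # dict of tag -> confidence
        tagged[name] = sorted(t for t, p in scores.items() if p >= threshold)
    return tagged

# Stand-in tagger returning fixed confidences for demonstration.
fake_tagger = lambda _pixels: {"1girl": 0.98, "touhou": 0.91, "outdoors": 0.32}

result = tag_images({"scraped_001.png": b""}, fake_tagger)
print(result)  # -> {'scraped_001.png': ['1girl', 'touhou']}
```

Run over a scraped corpus, this produces Danbooru-style tags for images that were never uploaded to Danbooru.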
evazion said:
I don't think this is quite true. Stable Diffusion is so effective because it's trained on literally billions of images. Instead of tags, it's trained on captions: text-image pairs scraped from thousands of different sites. It doesn't matter that the data is incredibly noisy; the lesson from recent advances in AI is that more data beats less data, no matter how noisy it is.
An AI trained directly on Pixiv, Twitter, ArtStation, DeviantArt, or any other artist site would be even better than one trained on Danbooru, because it would have more data, no matter how noisy the tags are. And those tags are still a lot less noisy than captions scraped off random sites on the internet. The success of AI comes from its ability to cut through incredible noise to find the signal.
The benefit of NovelAI's tag-based approach has less to do with the quality of the images themselves and more to do with consistency and control for the user. Noisy data can produce good images, but it can be frustrating (or impossible) to get exactly the character you want in two different poses/scenes/expressions/etc. due to the randomness inherent in current AI techniques.
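The control argument can be made concrete: with tag-based prompting, you hold a character's tags fixed and vary only one tag at a time. A toy illustration (the tag names are just Danbooru-style examples, not a real API):

```python
# Fixed character tags plus one varying tag gives prompts that differ
# in exactly one controlled dimension.
base_tags = ["hakurei_reimu", "1girl", "solo", "detached_sleeves"]

def build_prompt(extra_tag):
    return ", ".join(base_tags + [extra_tag])

print(build_prompt("sitting"))   # -> hakurei_reimu, 1girl, solo, detached_sleeves, sitting
print(build_prompt("standing"))  # same character tags, only the pose changes
```

With free-text captions there is no such guarantee that two prompts describe "the same character", which is exactly the consistency problem described above.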
But with proper consistent tagging, you get this: https://old.reddit.com/r/NovelAi/comments/xn8r0v/image_generation_progress_showcase_when_you/