Donmai

Re-uploading pictures off of Yande.re and other boorus?

Posted under General

There are a lot of HQ pictures on Yande.re and other boorus. I think it'd be great if a tech-savvy person could write a bot/scraper to re-upload everything from those sites onto Danbooru.

Clarification: I'm not talking about Gelbooru, Sankaku or other low quality boorus.

Updated

Unbreakable said:

Just because an image is HQ doesn't mean it's suitable for Danbooru

A lot of images isn't exactly one image though.

Unbreakable said:

who would tag/check for correct tagging on all of those uploads?

The community. Why would it be any different?

teacapo said:

A lot of images isn't exactly one image though.

Uhm, what?

teacapo said:

The community. Why would it be any different?

You can't expect to have someone mass dump images with low/bad tagging and expect other users to fix it.

Unbreakable said:

Uhm, what?

Most (if not all) of the images on Yande.re are suitable for Danbooru.

Unbreakable said:

You can't expect to have someone mass dump images with low/bad tagging and expect other users to fix it.

Who do you think is adding and correcting tags on Danbooru? One uploader, or the community?

Agree with Unbreakable here. I am by no means a fan of automatic uploading, even if some of the other Boorus do it. Each upload should be deliberate, and the tagging should be thoroughly done by the uploader. Anything else will likely earn the user a permaban or loss of upload privileges.

BrokenEagle98 said:

Agree with Unbreakable here. I am by no means a fan of automatic uploading, even if some of the other Boorus do it. Each upload should be deliberate, and the tagging should be thoroughly done by the uploader. Anything else will likely earn the user a permaban or loss of upload privileges.

We already have automatic uploading from various different sites though. What difference would it make?

I would say Danbooru is the most strict board in terms of quality and tagging. Other boorus already scrape Danbooru. It would be better if the other boorus scraped each other, and we kept up the Danbooru quality.
~75% of Gelbooru is Danbooru posts. Of the 1,000,000 posts unique to Gelbooru, I would guess a very low percentage would be accepted nowadays. If you tagged and sourced them up to standard, maybe close to 10% would get approved?

TL;DR: Learn to judge quality, browse Gelbooru with -user:danbooru (or the equivalent on other sites), upload what you find good to Danbooru, and tag it well. The notable uploaders on Danbooru can look at higher-quality sets of pictures elsewhere, and are more motivated to do that than to look at Gelbooru or the others (Yande.re is a bit better than Gelbooru, though).
It would be fine if you did it yourself. For automated uploading, look elsewhere, because approval is strict here. There are other "the one big single booru" projects around.

Updated

teacapo said:

Most (if not all) of the images on Yande.re are suitable for Danbooru.

Not all of them, you would still need someone to manually check each image unless they are a better version of an existing one.

teacapo said:

Who do you think is adding and correcting tags on Danbooru? One uploader, or the community?

There's a difference between fixing/adding a bunch of tags to some uploads here and there and fully tagging up hundreds or thousands of images, no one would agree to do that.

teacapo said:

We already have automatic uploading from various different sites though. What difference would it make?

We do? Where?

teacapo said:

A lot of images isn't exactly one image though.

So what you're saying is, automatically uploading thousands of undesirable images to Danbooru is somehow less bad than uploading just one?

The community. Why would it be any different?

That's pretty rich, coming from a user with precisely zero contributions to this site thus far.

iridescent_slime said:

undesirable images

Yande.re's content is a lot more desirable than half of the stuff getting approved here. I see that it clearly frustrates you for whatever reason. If you want to lash out, do it somewhere else.

Unbreakable said:

Not all of them, you would still need someone to manually check each image unless they are a better version of an existing one.

That'd become slightly problematic for sure. I think this is the only proper argument so far too. How often does someone upload an image from Twitter, and somebody else uploads a better version of the same image from Pixiv or somewhere else?

fredgido said:

I would say Danbooru is the most strict board in terms of quality

Danbooru is one of the most strict boorus in terms of quality, yeah.

fredgido said:

There are other "the one big single booru" projects around.

Such as? Hydrus?

teacapo said:

Such as? Hydrus?

Yes, and maybe http://tbib.org/index.php?page=about. Sankaku also eats a lot of other boorus. Any of those is a better candidate for this.

If you want to do things by hand but as automatically as possible, you can start by scraping all the post pages of Yande.re, getting the Danbooru posts table, and removing the matching MD5s from your Yande.re posts table. Once you are left with the files unique to Yande.re, you can maybe reverse search them all, automatically and slowly, to check whether any similar but not identical images are on Danbooru. After removing those (or not), sort the remainder by score and favcount (you could already do some filtering on tag count and upload the ones whose tag count is ready for Danbooru).
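
The MD5 filtering and sorting steps could look roughly like this in Python. This is only a sketch: the `Post` fields, the `upload_candidates` name and the `min_tags` threshold are made up for illustration and are not the actual column names of either site's posts table. It only drops byte-identical files, so the reverse search step is still needed for similar images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    # Hypothetical fields; real posts tables may name these differently.
    md5: str
    score: int
    fav_count: int
    tag_count: int


def upload_candidates(yandere_posts, danbooru_md5s, min_tags=0):
    """Return Yande.re posts whose MD5 is not on Danbooru, best first."""
    # Normalise case so "ABC..." and "abc..." count as the same hash.
    known = {m.lower() for m in danbooru_md5s}
    unique = [
        p for p in yandere_posts
        if p.md5.lower() not in known and p.tag_count >= min_tags
    ]
    # Highest score first, fav count as the tie-breaker.
    return sorted(unique, key=lambda p: (p.score, p.fav_count), reverse=True)
```

The result is just a list of candidates to look through by hand. Judging quality and tagging is still up to whoever uploads.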

fredgido said:

Yes, and maybe http://tbib.org/index.php?page=about. Sankaku also eats a lot of other boorus. Any of those is a better candidate for this.

If you want to do things by hand but as automatically as possible, you can start by scraping all the post pages of Yande.re, getting the Danbooru posts table, and removing the matching MD5s from your Yande.re posts table. Once you are left with the files unique to Yande.re, you can maybe reverse search them all, automatically and slowly, to check whether any similar but not identical images are on Danbooru. After removing those (or not), sort the remainder by score and favcount (you could already do some filtering on tag count and upload the ones whose tag count is ready for Danbooru).

I'll check out TBIB. It appears they're not using MD5 hash values, which is a bummer.

teacapo said:

I'll check out TBIB. It appears they're not using MD5 hash values, which is a bummer.

The images' MD5 is their file name.

I think it gathers Gelbooru, e621, oreno, Konachan, Danbooru, r34, etc. Just tip the guy on Patreon and ask for something from it.
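
If the MD5 really is the file name, it can be read straight off a file URL without downloading anything. A rough sketch in Python follows. The `md5_from_file_url` helper and the example URL are hypothetical, and the code only assumes the file name without its extension is the 32-character hex digest.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# An MD5 digest written as hex is exactly 32 characters of 0-9 and a-f.
MD5_RE = re.compile(r"[0-9a-f]{32}")


def md5_from_file_url(url):
    """Return the MD5 embedded in the file name, or None if there is none."""
    # Take the last path component and drop the extension.
    stem = PurePosixPath(urlparse(url).path).stem.lower()
    return stem if MD5_RE.fullmatch(stem) else None


# Hypothetical URL, for illustration only.
print(md5_from_file_url(
    "https://example.org/images/12/d41d8cd98f00b204e9800998ecf8427e.jpg"
))
```

This trusts the file name. It does not check that the file's contents actually hash to that value.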

Disregarding the quality aspect of it, Yandere hosts a fair number of the same images from Danbooru, either taken from the same source or from Danbooru itself, and they (apparently) change the md5 while doing it, so automatic rehosting from Yandere would result in the creation of a lot of duplicate entries. My vote goes to no for automatic rehosting.

CodeKyuubi said:

Disregarding the quality aspect of it, Yandere hosts a fair number of the same images from Danbooru, either taken from the same source or from Danbooru itself, and they (apparently) change the md5 while doing it, so automatic rehosting from Yandere would result in the creation of a lot of duplicate entries. My vote goes to no for automatic rehosting.

Yeah, I wasn't for automated uploading, just for automating the curation of upload candidates. You still have to decide on quality and then do the tagging, but with a much easier job.
I just checked, and the MD5s really are different on Yandere. I guess file hashes are a lost cause then, though there are a lot of other ways to remove matches. (edit: wow, they remove the EXIF info... assholes)

Just let me say again, never do automatic uploads.

Updated

BrokenEagle98 said:

Agree with Unbreakable here. I am by no means a fan of automatic uploading, even if some of the other Boorus do it. Each upload should be deliberate, and the tagging should be thoroughly done by the uploader. Anything else will likely earn the user a permaban or loss of upload privileges.

I know the uploader gets a point or two added to their post count, but if someone else does the tagging, what do they get besides "top tagger" credit?

g3gen said:

I know the uploader gets a point or two added to their post count, but if someone else does the tagging, what do they get besides "top tagger" credit?

Nothing else at this time. As far as I know, nobody has ever brought up the idea of having some other kind of credit (like on your user profile). If there were interest in such a thing, though, I'm sure that something could be added.
