Donmai

Tumblr sources (also bookmarklet-related)

Posted under General

Not yet. Just Dmail an active approver like myself with the post and what you need it replaced with, and they'll handle it for you, especially if it's bad_id and you have a direct link to the image.

Replaced. I'll open up a thread for it later, since I think it might prove useful while we wait for potential backups on albert's end (so we can dig out full-size links for sample images).

I get not wanting duplicates, but this

DanbooruBot said:

Mikaeri replaced this post with a new image:

Old: kage-matu-re.tumblr.com/post/...C%E7%94%9F%E8%AA%95%E7%A5%AD2017 | f82681b5142e453c10c3ee0000b7819a | jpg | 1158 x 1789 | 803.1 KB
New: i.imgur.com/fBBN1Mw.jpg | 341f3296a60c82b9af40e1d65506acae | jpg | 485 x 750 | 68.27 KB

and then deleting it because "sample" is a pretty fucking dick move. I don't even care if it's a duplicate or not, and I'm not trying to defend the uploader; but if you want to delete it because it's a duplicate then delete it because it's a duplicate.

So this has led to two scenarios which I'd like to know how we should handle.

Inferior image posted after tumblr raw sample

parent:2696847 status:any
The tumblr sample was posted before the Pixiv, which is lesser quality than the tumblr post after it's been replaced. Should the Pixiv post therefore be flagged or both left active?

MD5 match of unsampled picture was uploaded after tumblr raw sample

parent:2744773 status:any
The Pixiv post was uploaded after the sampled tumblr post, however it is an exact match of the unsampled tumblr post. Should either of these be flagged?

I don't think this is a marginal matter. As we may end up with hundreds of such situations, I think we should have a rule to apply when they pop up.

chinatsu said:

So this has led to two scenarios which I'd like to know how we should handle.

Inferior image posted after tumblr raw sample

parent:2696847 status:any
The tumblr sample was posted before the Pixiv, which is lesser quality than the tumblr post after it's been replaced. Should the Pixiv post therefore be flagged or both left active?

The tumblr one should be deleted. RaisingK's already starting to do that with detected tumblr samples (and it's why I now have 3 sample deletions under my name, yay! Here's one of them: parent:2750944 status:any)

MD5 match of unsampled picture was uploaded after tumblr raw sample

parent:2744773 status:any
The Pixiv post was uploaded after the sampled tumblr post, however it is an exact match of the unsampled tumblr post. Should either of these be flagged?

Exact match aside from resolution. Both stay, in this case. Currently, you can't upload an image with the same md5 -- you have to replace one to free the md5 of the original, then replace the post you want with the original.

Gollgagh said:

He did a deceptive thing by changing it to a sample. That's the part I take issue with.

It didn't need to happen, and I would consider it an abuse of your powers.

It's not deceptive, it's not abuse. Because of how the replacement system works, it's the only way to resolve the issue. You can't have two identical pictures in the database. The way around that is to replace the post with a lesser quality image and then, optionally, with the original sampled image.
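The swap described above can be sketched with a toy model. This is only an illustration of why a direct replacement is blocked when MD5s must be unique; `PostStore`, `DuplicateMD5`, and the byte strings are hypothetical stand-ins and not Danbooru's actual code.

```python
import hashlib


class DuplicateMD5(Exception):
    """Raised when a file with the same MD5 already belongs to another post."""


class PostStore:
    """Toy model (assumption): one file per post, MD5s unique across posts."""

    def __init__(self):
        self.md5_to_post = {}  # md5 hex digest -> post id
        self.post_to_md5 = {}  # post id -> md5 hex digest
        self.next_id = 1

    def upload(self, data: bytes) -> int:
        digest = hashlib.md5(data).hexdigest()
        if digest in self.md5_to_post:
            raise DuplicateMD5(digest)
        post_id = self.next_id
        self.next_id += 1
        self.md5_to_post[digest] = post_id
        self.post_to_md5[post_id] = digest
        return post_id

    def replace(self, post_id: int, data: bytes) -> None:
        digest = hashlib.md5(data).hexdigest()
        # Refuse if a different post already owns this MD5.
        if self.md5_to_post.get(digest, post_id) != post_id:
            raise DuplicateMD5(digest)
        # Free the old MD5, then claim the new one for the same post.
        del self.md5_to_post[self.post_to_md5[post_id]]
        self.md5_to_post[digest] = post_id
        self.post_to_md5[post_id] = digest


store = PostStore()
older = store.upload(b"sample bytes")     # the sampled post we want to keep
newer = store.upload(b"full-size bytes")  # the later upload of the original

blocked = False
try:
    store.replace(older, b"full-size bytes")  # MD5 is still held by `newer`
except DuplicateMD5:
    blocked = True

store.replace(newer, b"lesser-quality bytes")  # frees the full-size MD5
store.replace(older, b"full-size bytes")       # now allowed
store.replace(newer, b"sample bytes")          # optional: park the old sample
```

In this model the first replacement fails, and only the detour through a lesser-quality file lets the older post take over the full-size image.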

The policy regarding image samples as documented many times over in the wiki and discussed and clarified in the forum is that since we have the ability to replace images, you should not manually upload an image and make the sampled image the parent, however that is what you did. And hence, your posts were deleted for being duplicates. You learned about the _raw files from the very same thread where this was explicitly stated, so who do you have to blame but yourself for not paying close attention?

Gollgagh said:

He did a deceptive thing by changing it to a sample. That's the part I take issue with.

It didn't need to happen, and I would consider it an abuse of your powers.

I have made it especially clear that nobody is to replace _raw images right now so we don't have to play this game of hooky moving favorites over when we know people don't like it. I value them much more than I value some poor shmuck who thought it'd be okay to bypass those warnings and do it anyway.

We handle samples in a new way given the feature. Most samples that can be replaced have already been replaced, and currently there are scripts running to replace them in-place so people don't cheat the system. If someone does, well ain't that unfortunate. Just suck it up and move on, because it happened.

chinatsu said:

And hence, your posts were deleted for being duplicates. You learned about the _raw files from the very same thread where this was explicitly stated, so who do you have to blame but yourself for not paying close attention?

I am not the person that posted the duplicates.

Currently, you can't upload an image with the same md5

Did not realize this. I apologize.

I saw red after this comment and didn't check to see if there was a technical reason.

Please carry on.


@☆♪ said:

Is there a thread to request post replacements if you don't have the privilege? I went ahead and tracked down my own bad_id tumblr uploads (assuming the still live ones will be automated). Ah well, since I'm posting anyway, might as well just request here, but it'd be nice to know for the future if there's a thread.

Please replace post #2276553 with https://68.media.tumblr.com/b578fc702ce0bd407cee71824e701e68/tumblr_nutqb3lzDO1uv6edpo1_raw.png

For the one other bad_id from tumblr of my uploads, I found the _raw but it's byte-wise identical to its pixiv parent. I'm going to flag it, since it is a redundant sample, but mentioning that here in case someone wants to say that we shouldn't be doing that. (There will be plenty more posts like that, as Mikaeri said on page 1.)

topic #14156

And by the way these are tumblr samples not tumblr bad ids. Important distinction.

@Cold_Crime It's unfortunate, but next time read the forum before you start uploading duplicates, and consider using replaceme instead.

This goes for anyone else that uploads "larger" posts from yande.re. Right now, DON'T. Request a replacement instead. Let me repeat that again.

Do NOT repost highres Tumblr stuff from Yande.re. It is duplicate content with inferior metadata (they strip it).

Wait for one of us to replace them.


Had an idea regarding byte-wise duplicates, though I'm not sure if I actually like it. If we wanted to keep two duplicate images, we could add bogus metadata (like an extra text chunk for PNGs) or even a byte after the end of the file to make the md5 different without changing the image. It would be better to just have one of each image, but if that's problematic for political reasons this might be a lower-resistance middle ground, not sure.
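A minimal sketch of the trailing-byte idea, assuming a decoder that ignores data after the end of the image (not verified here). The helper names and the stand-in payload are hypothetical; the payload is not a valid PNG, it only stands in for file contents.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 of a byte string as a hex digest."""
    return hashlib.md5(data).hexdigest()


def with_trailing_byte(data: bytes, filler: bytes = b"\x00") -> bytes:
    """Append filler after the end of the file; the original bytes are untouched."""
    return data + filler


# Stand-in for an image file's contents (assumption, not a real image).
original = b"\x89PNG\r\n\x1a\n" + b"stand-in image payload"
padded = with_trailing_byte(original)

same_prefix = padded[: len(original)] == original
different_md5 = md5_hex(original) != md5_hex(padded)
```

The padded copy hashes differently while every original byte is preserved, which is all the uniqueness check would look at.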

chinatsu said:

topic #14156

And by the way these are tumblr samples not tumblr bad ids. Important distinction.

They were both. I've uploaded other tumblr samples, but I only hunted down the ones that were also bad ids because the rest can be automated.

☆♪ said:

Had an idea regarding byte-wise duplicates, though I'm not sure if I actually like it. If we wanted to keep two duplicate images, we could add bogus metadata (like an extra text chunk for PNGs) or even a byte after the end of the file to make the md5 different without changing the image. It would be better to just have one of each image, but if that's problematic for political reasons this might be a lower-resistance middle ground, not sure.

It's a decent idea, but I think it's better that we still aim for fewer duplicates in general. My idea is this:

I think approvers should have the ability to offload image samples/inferior duplicates to some sort of dummy account for permanent deletion later by an Admin or something. This would be by way of changing the uploader name, and I think it's useful because of a few things I mentioned back in forum #132889:

  • Users wouldn't be able to 'fake' their upload ratio by tagging their deletions as samples if it were deemed that samples wouldn't count against you
  • Cleans up the problem of deletion count including samples to begin with, and would sweep a lot of salt away.

They were both. I've uploaded other tumblr samples, but I only hunted down the ones that were also bad ids because the rest can be automated.

Yup. As was mentioned in forum #132797 and forum #132798, if you have the original link to the image then just request a replacement in the thread. Tumblr never 'truly' deletes images, even if posts may disappear from the internet, so you can always find the full-size again through reblogs or something.

During the scans I've been doing, I've seen that some approvers haven't been switching the source back after reuploading...

After this whole fiasco is over, it'll probably be required to go back with a script or something and change all of the sources back.

Luckily, this can be done with undo, however only if the reuploader did not also make any concurrent tag changes... :/

It can't be helped for existing posts, but I should mention there is an upcoming change that will make it easier for approvers to change the source back going forward:

issue #3181:

  • Adds a Final Source option to the replacement dialog. If present, the post's source field will be set to this value after replacement. This makes changing the source field back to the HTML page after replacement easier.

That's what the "Final Source" field is for. Put the final value that the source field should be set to after replacement there.

The proper value for the source field depends on the site: for Tumblr it's the html page, but for Pixiv it's the direct image. That's why it isn't automatic.


Also, this was stated in the Danbooru 2 Issues thread, but I should repeat it here: if you try to upload a tumblr sample now, Danbooru will automatically correct it to the best available version. This is usually the _raw version, but there are cases where the _raw isn't available but _1280 or _500 is.

This also works during replacement, meaning you don't have to change the "Replacement URL" field yourself. You can leave it as _1280 and Danbooru will automatically fix it during the replacement process. If you do this, be sure to clear the "Final Source" field.
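As a rough illustration of that correction, the sketch below rewrites the size suffix of a Tumblr-style image URL into a list of candidates to try. The function name, the regular expression, the candidate order, and the example URL are all assumptions; how Danbooru actually picks the best available version is not described here, and the sketch does not check whether any candidate exists.

```python
import re
from typing import List

# Matches a trailing size suffix such as "_500.png", "_1280.jpg" or "_raw.png".
SIZE_SUFFIX = re.compile(r"_(raw|\d+)(\.\w+)$")

# Assumed preference order, based on the sizes mentioned in this thread.
PREFERRED_SIZES = ("raw", "1280", "500")


def candidate_urls(url: str) -> List[str]:
    """Return URLs to try, best first; leave unrecognized URLs unchanged."""
    if not SIZE_SUFFIX.search(url):
        return [url]
    return [SIZE_SUFFIX.sub(rf"_{size}\g<2>", url) for size in PREFERRED_SIZES]


# Hypothetical sample URL, for illustration only.
candidates = candidate_urls("https://example.com/abc/tumblr_abc123_500.png")
```

A caller would request each candidate in turn and keep the first one that resolves.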

tapnek said:

Finding it annoying that the source field isn't changed back to the proper link automatically. Gonna be hell to revert all the tag changes.

I was planning on doing a smart mass undo once this whole fiasco is over, if nobody else gets to it.
