I was not sure where to ask this so I'm posting this here.
I've developed a script to search for pixiv revisions to upload, and in the process I ended up using PIL to detect file differences pixel by pixel (link). Diff files created with that library end up being solid black boxes if the two images are identical pixel by pixel, which makes it very easy (albeit heavy on CPU and a bit slow) to detect whether an image and its pixiv source are visually completely identical.
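For reference, here is a minimal sketch of this kind of check. It is not the actual script, only an illustration assuming Pillow's ImageChops.difference is used to build the diff. The function name images_identical is made up for the example. Both images are converted to RGBA first (an assumption, so that files saved in different modes still compare fairly), and each channel of the diff is tested separately, since an all-black channel has no bounding box.

```python
from PIL import Image, ImageChops


def images_identical(img_a, img_b):
    """Return True if two PIL images match pixel by pixel (hypothetical helper)."""
    # Images with different dimensions can never be pixel-identical.
    if img_a.size != img_b.size:
        return False

    # Assumption: normalize both to RGBA so palette/RGB/RGBA files compare fairly.
    a = img_a.convert("RGBA")
    b = img_b.convert("RGBA")

    # Per-pixel absolute difference; identical images give an all-zero (black) diff.
    diff = ImageChops.difference(a, b)

    # getbbox() on a single-channel image returns None when every pixel is zero,
    # so check each channel (R, G, B, A) on its own.
    return all(band.getbbox() is None for band in diff.split())
```

In practice the two arguments would come from something like Image.open("danbooru_file.png") and Image.open("pixiv_file.png"), where both file names are placeholders.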
I was thinking of automatically adding replaceme to these posts, but I then noticed that most of the files tagged with it haven't really been given much attention lately, most likely due to the restructuring of replacement privileges that made them mod-only. (Admittedly, replaceme source:*pixiv* is empty but I'm not sure if it's because nobody used it that way in months or because they are frequently taken care of).
Should I still do that? Or would it be better for me to just collect all of the posts with this property and list them all here once I'm done, for someone to replace in batch? (That'll probably take me months at this pace.)
I realize an effortless alternative would be to just leave them alone, since there's effectively no difference in them. However, the md5_mismatch tag is already pretty bloated, and removing some hundreds of posts from it would make it cleaner and make it easier to check for revisions/files with real differences. These identical files follow no pattern, so unless someone is manually keeping track of each of them, they must be analyzed one by one.
Pinging @RaisingK since I can't think of anyone better to ask about this.