
[Prototype] User Report Ver 6.3

Posted under General

Type-kun said:

I actually wanted to see a weekly count distribution - a table for week 1, week 2 etc. That said, we can safely assume that there'll be about 6000 rows per week. That's 310k per year, not a small amount, but not that large either. Given its nature, it would be a well-balanced tree if indexed by date+user and should work fast.
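For a sense of scale, 6000 rows per week over 52 weeks comes to about 312,000 rows per year. Below is a minimal sketch of the date+user index and the weekly row distribution, using SQLite purely for illustration. The table and column names (user_activity, day, user_id, edits) are assumptions, not the actual schema.

```python
import sqlite3


def estimated_rows_per_year(rows_per_week=6000):
    """Scale the weekly estimate quoted above to a 52-week year."""
    return rows_per_week * 52


def build_db(rows):
    """Create an in-memory table with a composite (day, user_id) index.

    rows: iterable of (iso_date, user_id, edits) tuples.
    All table and column names here are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_activity ("
        "day TEXT NOT NULL, user_id INTEGER NOT NULL, edits INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX idx_activity_day_user ON user_activity (day, user_id)"
    )
    conn.executemany("INSERT INTO user_activity VALUES (?, ?, ?)", rows)
    return conn


def weekly_row_counts(conn):
    """Return [(year-week, row_count), ...] ordered by week (weeks start on Monday)."""
    return conn.execute(
        "SELECT strftime('%Y-%W', day) AS week, COUNT(*) "
        "FROM user_activity GROUP BY week ORDER BY week"
    ).fetchall()
```

Calling weekly_row_counts on a populated table gives the "week 1, week 2, etc." distribution as one row per week.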

Yeah, I just gave you what I had off hand. I could compile a weekly distribution of rows once I get home if that's still something that would be useful...?

Type-kun said:

Albert's busy ironing out saved searches. I'm currently catching up on RoR and Git to be able to contribute directly rather than with pseudocode and ideas/issues, but it'll take some time until I'm ready. The remainder of issue #2640 is quite simple to fix, but it's something that needs testing, so I'm not going to fix it blindly. Once my dev environment is set up properly, small bugs will be squashed faster, hopefully.

Ah good, I just wanted to make sure that it doesn't fall through the cracks since that issue has already been closed.

BrokenEagle98 said:
They're technically wiki edits, but should those be attributed as artist edits then...?

I would make a column for artist wiki. I think that when you edit an artist body it counts as both artist edit and wiki edit. I will check on this matter when I'm back home.

Yes, it does count. I just checked the following:

mei_(maysroom) [wiki] (updated at 2013-06-12T12:11:06.058Z)
mei_(maysroom) [artist] (updated at 2013-06-12T12:11:06.133Z)

On 2013-06-12 12:11, henmere changed only the body for artist #69566. The artist version info shows no changes, which would show up as an Other change in the Artist table. The wiki version info shows a change to the body, which would show up as a Body Edit change in the Wiki table.

So, what I'll do is if an artist type wiki page shows up in the version list when I'm doing the wiki page data collection and it's just a Body change, I'll just disregard it (the wiki Other Names cannot be edited from the Artist page). If it's a Title or an Other change, I'll check the corresponding artist page to see if there were any changes, and if so disregard that edit.

When I'm going through the artist version list, I'll also check the corresponding wiki page to see if there was a change, and add it to a "Wiki Page" column in the Artist table.
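The wiki-side rule described above could be sketched roughly like this. It is only an illustration of the decision logic; the WikiVersion record and its field names are hypothetical and do not come from the actual collection script.

```python
from dataclasses import dataclass


@dataclass
class WikiVersion:
    # All fields are hypothetical stand-ins for whatever the version list provides.
    is_artist_page: bool   # the wiki page belongs to an artist entry
    changes: frozenset     # subset of {"Body", "Title", "Other"}
    artist_changed: bool   # the matching artist version shows a change


def counts_as_wiki_edit(version):
    """Decide whether a wiki version should be tallied in the Wiki table."""
    if not version.is_artist_page:
        return True  # ordinary wiki pages always count
    if version.changes == frozenset({"Body"}):
        return False  # body-only edits on artist wikis are disregarded
    if version.changes & {"Title", "Other"}:
        # Disregard the edit if the artist page changed at the same time.
        return not version.artist_changed
    return True
```

For example, a body-only change on an artist-type wiki page returns False, while an Other change with no matching artist change returns True.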

These sorts of reports are no longer hard to generate (although data for stuff outside of posts may require some additional summary tables).

I'm more interested in the processes for actually paying attention to this stuff. If the purpose is to identify users suitable for promotion (or demotion), then who is going to check these reports? How do they get notified? On what schedule? Should there be any process for promotion/demoting someone?

Exposing the per-user privileges is something I plan on adding to the user json soon, just haven't gotten around to it yet.

Well, @albert, as you can see, Nitrogen09 and I reported some users (most often Translators) to Wypatroszony (and a few others to memento mori and NWF Renim) according to this feedback page (except two users who were promoted in the middle of the month).
The idea is to keep track of every category that way and to report users to one moderator at the end of the month, when such a list is generated.

As for the last question: What do you mean by "process"^^?

Beyond the administrative use that Provence mentioned, I think the reports could also act as a morale boost and a measuring stick.

As a morale boost, it recognizes those users who may not upload a lot, but help out in other ways. It may even encourage some users to help out more so that they can get on a list.

As a measuring stick, it can let you know where you stand in relation to the top users for each category.

Besides those, it could just be something that's interesting to follow...

My question is: who is responsible for looking at the reports? Is it you and Wypatroszony and others at the end of every month? Is it something you remember to do every month, or do you need some sort of notification? Or is the report more a thing you need to reference every once in a while to verify a user's history?

By process I mean is there a checklist of things to do in order to get someone a promotion.

The point I'm trying to make is whether or not the report itself (automatically generated once a month) is sufficient or if additional tools are needed.

Allow me to give some concrete examples of what I mean by additional tools. These are things that could be implemented in a second phase and don't need to block any work on reporting, but I'm interested in what else can be done so that this report doesn't end up being something used by only a handful of people.

Based on the report, Danbooru could post a new topic in the forum highlighting users who have contributed a large number of notes and are perhaps worthy of promotion. This would give not only mods but other users a platform to investigate and praise a user.

A simpler tool might be just sending a dmail to all mods every month that they should check out the report and see if anyone is worthy of a promotion. In order to reduce the workload maybe a specific random subset can be highlighted for each mod.
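Highlighting a random subset for each mod could look something like the sketch below. The function name and the round-robin split are assumptions made for illustration; nothing like this exists in the report yet.

```python
import random


def assign_candidates(candidates, mods, seed=None):
    """Shuffle promotion candidates and deal them out evenly among mods.

    Returns {mod: [candidates...]}; each candidate goes to exactly one mod.
    """
    if not mods:
        raise ValueError("at least one mod is required")
    rng = random.Random(seed)  # a seed makes the split reproducible
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    assignment = {mod: [] for mod in mods}
    for index, user in enumerate(shuffled):
        assignment[mods[index % len(mods)]].append(user)
    return assignment
```

Dealing round-robin after the shuffle keeps the workload per mod within one user of everyone else's.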

Obviously some sort of balance needs to be struck between frequency and usefulness. But these are the sorts of ideas I'm interested in hearing since they could affect how the report is designed.

I started this in the month OniTea and Shallie were promoted, and I've done it at the end of each month for the last few months. So I didn't need a notification for that.

And every user I reported was also looked through manually, to see whether their translations or uploads really are worth a promotion. I mean, a table is good, but it's only an indicator of numbers. But the mods who receive those reports are doing a manual check anyway... I think. It's like a second check (at least when I reported users for promotion).

So I think that the reports each month per se are sufficient.

albert said:

These sorts of reports are no longer hard to generate (although data for stuff outside of posts may require some additional summary tables).

I presume that reportbooru is capable of storing the data for all past months (weeks), not just the last one? I'm asking this because I can immediately see another use for this data - a monthly (weekly) histogram for user activity accessible from user profile. Not sure exactly how useful that would be, but I'm pretty sure it would be a welcome change.

As some people often comment on users who routinely undertag their images, here's a look at the top and bottom upload taggers of the top uploader lists.

top 10 general tag users per upload:
1. zeparoh 47.420
2. Qpax 43.308
3. KazuyaRazuKazama 39.077
4. Provence 38.490
5. Nitrogen09 34.569
6. BlindSargent 33.080
7. Shallie 31.966
8. Sacriven 31.633
9. Lannihan 30.882
10. dereyoruk 30.403

bottom 10 general tag users per upload:
1. cutemi2 7.387
2. beltman 13.459
3. Christianlush 14.607
4. akp47 14.736
5. Blue_Stuff 15.214
6. nanami 15.319
7. magenta-crimson 15.365
8. RazingK 16.468
9. kars41 17.093
10. DeusExCalamus 17.735

top 10 (total tags less tag errors and removes) per upload:
1. zeparoh 50.917
2. Qpax 46.455
3. KazuyaRazuKazama 42.481
4. Provence 41.452
5. Nitrogen09 37.747
6. BlindSargent 36.912
7. Sacriven 34.875
8. Shallie 34.801
9. Lannihan 34.355
10. Jarlath 34.186

bottom 10 (total tags less tag errors and removes) per upload:
1. cutemi2 9.774
2. beltman 17.786
3. akp47 18.292
4. Cristianlush 18.567
5. magenta-crimson 18.669
6. nanami 18.844
7. Blue_Stuff 19.384
8. DeusExCalamus 20.915
9. RazingK 20.921
10. kars41 21.740

Obviously, different images will justify a different number of tags - for some images fewer than 10 tags may be a full tagging, whereas for others the tally will be in the hundreds. But for someone uploading 100+ images it should more-or-less balance out, unless a user uploads only images that have lots or few tags applicable to them.

The best tagger on the list out of those without unlimited uploads is GiantCaveMushroom, in 13th out of 40 for both rankings.

It's also perhaps worth mentioning that 2 of the users on the bottom 10 rankings are currently banned.

The average figure is 25.081 for the first list and 28.515 for the second. The average tags per image for both lists is slightly higher, as those who upload more also tend to add more tags (5 of the top 10 for tags/image have >500 uploads, whereas none of the bottom 10 do, although the next person off the list has well over 500).
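To illustrate why the two averages differ, here is a small sketch assuming the first figure is a plain mean of each user's tags/upload and the second pools all images together. The two users and their numbers are made up for the example.

```python
def per_user_average(stats):
    """Mean of each user's tags-per-upload rate; every user weighs the same."""
    rates = [tags / uploads for uploads, tags in stats.values()]
    return sum(rates) / len(rates)


def per_image_average(stats):
    """Total tags over total uploads; heavy uploaders weigh more."""
    total_uploads = sum(uploads for uploads, _ in stats.values())
    total_tags = sum(tags for _, tags in stats.values())
    return total_tags / total_uploads


# Hypothetical data: user -> (uploads, tags)
example = {
    "heavy_uploader": (1000, 40000),  # 40 tags per upload
    "light_uploader": (100, 2000),    # 20 tags per upload
}
```

With this data the per-user average is 30.0, while the per-image average is about 38.2, because the heavy uploader's images dominate the pool.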

It might be interesting to have a top (and bottom) tags/upload list for everyone - if possible with errors and removes subtracted from the list. For obvious reasons there would need to be a threshold for the number of uploads to stop both lists being flooded with people with 1 or 2 uploads. 100 is more than enough for this. 50 would probably be fine, too.
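A rough sketch of how such a list could be computed, with errors and removes subtracted and an upload threshold applied. The field names ("uploads", "tags", "errors", "removes") are assumptions about how the data might be laid out, not the report's actual format.

```python
def tags_per_upload_rankings(stats, min_uploads=100, size=10):
    """Return (top, bottom) lists of (user, net tags per upload).

    stats: {user: {"uploads": int, "tags": int, "errors": int, "removes": int}}
    Users below min_uploads are skipped so one-off uploaders don't flood the lists.
    """
    rates = {}
    for user, s in stats.items():
        if s["uploads"] < min_uploads:
            continue
        net_tags = s["tags"] - s["errors"] - s["removes"]
        rates[user] = net_tags / s["uploads"]
    ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:size]
    bottom = ranked[::-1][:size]  # lowest rate first
    return top, bottom
```

Passing min_uploads=50 instead would give the looser threshold mentioned above.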

While we're at it:

Lowest proportion of tags removed or errors: (remember, this is a good thing)
1. GiantCaveMushroom 0.119%
2. tapnek 0.155%
3. zeparoh 0.160%
4. Nitrogen09 0.169%
5. BlindSargent 0.205%

If people want me to then I'll remove the negative stuff, but this is all calculable from the information in the OP anyway...


The only one of those lists that request removal is likely to have a big impact on is the last pair of rankings, which I just threw in because I had the data. It's still a good thing to have a low figure for that, though, unless it means that you're not bothering to add request tags when you should. Even having low request usage is a good sign (again, unless it's you neglecting to add them), as using the artist/copyright/source/character requests or tagme inherently means that you aren't fully tagging the image, and if the tag is deleted then it probably means that someone else was able to*. Of course, your inability to add the tags may be entirely justified (e.g. weird and nigh-untranslatable names and so forth).
Use of copyright request and translation request** is a different matter, of course.

Given that the 6 people with the lowest figures on these rankings are 1st, 2nd, 5th, 6th, 13th and 24th out of 40 for the total tags ranking, I don't think it's likely that this is a sign that they're routinely not bothering to add request tags when they should be.
The "highest removals rate", of course, is much more likely to be influenced by these things - for instance if someone uploads lots of comics then they are more likely to have more translation requests, which may then be removed if someone goes through translating them.***

*so you're not talking about ridiculous situations like the character request on post #2438041. Which will probably stay there forever.

**or worse, if you start translating the image and stick a "partially translated" and/or "check translation" on the image as well, in which case the translation request can suddenly cause 2 or even 3 deletions for the image.

***thinking through this, I've decided to get rid of the "highest removal/error %" list. Only the top one on the rankings looks like dodgy tagging - for instance andalus (in second) does loads of commentary requests and translation requests, which would thus make up the majority of their removals. And the top person on the list is banned anyway.

I personally consider tagme on a post with more than 10 tags an error in tagging, especially since 90% of what it covers can be handled by various request tags. For posts with fewer than 10 tags, tagme is just lazy as hell and is something I think should be weighted at a -10 for tagging report purposes, if not more. I've seen users throw tagme into a post to skip tagging it on a few occasions.
