Danbooru 2

Posted under General

Can the "Logout" option on the "My Account" page be moved to the bottom? I'm sick of accidentally logging out when I miss the "My Profile" link by a few pixels.

Soljashy said:
Even if we only split the tags for a given post between, say, "major" and "minor", it would make the system so much more useful.

This would be awesome. And then single-tag searches could give more priority to images where that tag is marked "Major". So a search for hoshii_miki would return things like post #559165 before post #584969. Or you could just have a checkbox by the search box that said "Sort by relevance" or something.

Suiseiseki said:
EDIT: Perhaps a neutral level wouldn't be a bad idea to add either, but I only really see that being useful on pictures with high amounts of small details.

Yes, all tags should default to neutral.

Suiseiseki said:
...what about tags having different levels of relevance?

I'd say this is an essential feature to have. The ability to sort results by relevance to your query is one of the most basic features of any search engine. The question is how you determine tag relevance. There are lots of possible ways you can approach this.

There's Shinjidude's method, which uses the physical area a tag occupies as a proxy for the tag's relevance. There are a few problems I see with this approach. First, it would require a lot of work to spatially tag every tag in an image. I imagine most people wouldn't see the point in bothering with it.

Second, it only really works with concrete objects. More abstract tags, like sketch or crossover, for example, can't really be spatially tagged.

Finally, the assumption that physical area is proportional to relevance isn't true in many cases. A pantyshot may not occupy much physical space, but it can still be a prominent part of the image.

Another approach someone suggested is to have users manually rate tag relevance, for example by rating prominent tags as major and insignificant tags as minor. The problem here is what happens when two people disagree on whether a tag is major or minor. Who gets to decide?

A bigger problem is comparing tag relevance between posts. Posts A and B may both feature a prominent pantyshot, but the pantyshot in post A may be somehow more prominent than the one in post B. This approach doesn't capture this information, which makes ranking search results less useful.

A third proposal is to have users vote on tag relevance. This was proposed in the context of having people vote on subjective tags, but there's really no reason why it can't be used for all tags in general. It's similar to the previous idea, but instead of marking tags as either major/minor, you either upvote or downvote them.

The big advantage here is it provides an easy way to compare tag relevance across posts. If a tag gets 20 upvotes on post A but only 2 upvotes on post B, then the tag on post A is more relevant and therefore post A should be ranked more highly in searches for that tag.
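
As a rough illustration of that comparison, here is a minimal Python sketch of ranking posts by a tag's vote score. The `tag_votes` mapping and the `rank_posts` helper are made up for this example; nothing in the thread says how Danbooru would actually store votes.

```python
def rank_posts(tag_votes, tag):
    """Return the ids of posts carrying `tag`, highest vote score first.

    tag_votes is a hypothetical mapping {(post_id, tag): net_score}.
    Ties are broken by post id so the order is deterministic.
    """
    scored = [
        (score, post_id)
        for (post_id, voted_tag), score in tag_votes.items()
        if voted_tag == tag
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [post_id for _, post_id in scored]
```

With 20 upvotes on post A and 2 on post B for the same tag, `rank_posts` would list A before B, which is the ordering described above.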

Another approach I've been thinking about would be adopting the Delicious tagging model. Basically, users would be able to add their own personal set of tags to posts. Your personal tags wouldn't interfere with the main set of tags, or with other users' tags. This would be useful in itself, because it would allow you to organize posts in whatever way is personally useful to you. When you think about it, this is basically a generalization of the idea of allowing users to have multiple favorites lists.

And as for tag relevance, in this approach it could be inferred from tag popularity. That is, if 50 people have tagged a post with cute as one of their personal tags, then that post is more relevant and should be ranked more highly than posts that only a few people have tagged as cute.
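
A small Python sketch of how that inference might look, assuming personal tags are kept per user and never touch the main tag set. The `personal_tags` layout and both function names are assumptions for illustration only.

```python
from collections import Counter


def tag_popularity(personal_tags, tag):
    """Count how many users applied `tag` to each post.

    personal_tags is a hypothetical mapping {user: {post_id: set_of_tags}}.
    """
    counts = Counter()
    for posts in personal_tags.values():
        for post_id, tags in posts.items():
            if tag in tags:
                counts[post_id] += 1
    return counts


def rank_by_popularity(personal_tags, tag):
    """Post ids ordered by how many users personally tagged them with `tag`."""
    counts = tag_popularity(personal_tags, tag)
    return sorted(counts, key=lambda post_id: (-counts[post_id], post_id))
```

Each user counts at most once per post here, because their tags for a post are stored as a set.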

evazion said:
Another approach someone suggested is to have users manually rate tag relevance, for example by rating prominent tags as major and insignificant tags as minor. The problem here is what happens when two people disagree on whether a tag is major or minor. Who gets to decide?

Granted, this could be a problem, but disagreements like these are happening already (e.g. ratings, whether a given post gets a tag or not, etc) and I think we're doing all right despite it all.

evazion said:
A bigger problem is comparing tag relevance between posts. Posts A and B may both feature a prominent pantyshot, but the pantyshot in post A may be somehow more prominent than the one in post B. This approach doesn't capture this information, which makes ranking search results less useful.

This is why I would suggest we start out with only two settings, one for "prominent" and one for "obscure", if you will. That way you don't really have to deal with different levels of prominence, and it would solve many of the problems discussed in forum #37442.

Hmm, I like and dislike some things about that system.

First, I see it potentially taking a lot of DB space and resources. That was one of the reasons why we never moved towards multiple favorite lists, and were so slow to restrict multiple votes, right? (Given my recent requests, though, I should be the last to point that out.)

On the other hand, I do like how it could naturally produce relevance metrics for the tags. We could even go so far as to give higher ranked or more established users higher weight when computing this, since they are more likely to know how the tags "should" be used, and would mitigate any potential abuse from new accounts.
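
For what it's worth, weighting could be as simple as the Python sketch below. The level names and the weight values are invented for the example; the thread doesn't propose any concrete numbers.

```python
# Hypothetical weights per user level (not actual Danbooru levels or values).
LEVEL_WEIGHTS = {"new": 0.5, "regular": 1.0, "trusted": 2.0}


def weighted_relevance(votes, user_levels):
    """Sum +1/-1 votes, scaled by each voter's level weight.

    votes:       {user: +1 or -1}
    user_levels: {user: level name}; unknown users count as "new".
    """
    total = 0.0
    for user, vote in votes.items():
        level = user_levels.get(user, "new")
        total += LEVEL_WEIGHTS[level] * vote
    return total
```

Treating unknown accounts as the lowest level is one way to blunt abuse from throwaway accounts, but that default is a design choice, not something settled here.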

I'm not sure how effective it would be in practice, though. First, I don't foresee many people re-tagging already tagged posts with the same tags when those posts already show up in the search results they use. See Google's analysis on 5-star ranking; I can see the same user apathy causing this system to not be useful.

We could make it easier to use by having a "thumbs-up" and a "thumbs-down" next to every tag, and more efficient by pairing each tag in the list with a single integer which is incremented or decremented. That implementation wouldn't prevent multiple votes, though; keeping track of who voted for what would be the biggest resource hog with this.

Also even if we did that, I still don't see people incrementing tags for characters or random items on a regular basis. It might work much better for subjective tags where people actually have a strong opinion one way or the other. In that situation though, it makes less sense, and seems less fair to give ranked members more clout.

I didn't really mean it as a method to sort by relevance, as much as relevance within the picture itself. An actual method for sorting images by relevance would probably be complicated and invite drama.

A picture generally has a focus point, say a character, and things directly related to that character, such as a pantyshot, would be major tags. Items in the background, which someone might search for in order to find a particular picture but which someone searching for that specific item probably wouldn't want in their results, would be minor tags.

In a case like jjj14's example post #584969, the idolmaster tag would be major, and each character could be neutral, since it's not really about any of them, but is definitely about Idolm@ster.

Actually I would say if we developed a system where a tag's relevance could be computed or defined, the ability to sort by relevance is too useful not to implement. I wouldn't let drama stand in the way of that. An item's prominence typically isn't that subjective anyway and could usually be worked out in the comments or on the forum.

I think I actually like your system best of the ones brought forward because it will work with a single tagger and doesn't require mass user participation (like Evazion's), and it isn't tedious or cumbersome or bound by heuristic (like my system would be).

Also a very similar system seems to be working for The Doujinshi DB Project where they use a 5-bar cell-phone signal metric to indicate the prominence of copyrights, characters, and themes in various doujin.

I'm not sure from experience how well it's working for them though since I don't frequent that site. I'm also not sure how well it would translate there since they are a lot more rigid with their editing style than we are. But it seems to be working fairly well.

Overall I think each of the proposed systems has strengths and weaknesses, but yours seems the most balanced.

-----

My system would be a pretty natural addition and would add utility to a spatial tagging system, if that was something we were thinking about adding anyway. It would also be completely objective, whereas the other systems aren't.

On the downside, it would probably be very cumbersome to require every user to use it for every tag. That would limit people using it, and hence its effectiveness. Also, as Evazion points out, it is useless for intangible tags and the heuristic breaks down sometimes where small things can be quite relevant and vice-versa.

-----

Evazion's system would work very well for subjective tags, and is a very natural way of gauging what is prominent in the users' view as a whole. If handled correctly, it's also the one least sensitive to abuse.

Like I state above though, I question whether it would be used by enough users to make it useful for non-subjective tags. It also would prevent new or less popular posts from having relevance set properly.

Regardless of whether we use this system for general use, I would support it as some sort of replacement for subjective pools/tags; and think that private tags / multiple favorite lists are a great idea on their own.

-----

On the downside, Suiseiseki's system is actually the most subjective since it requires and allows a single person to set the prominence (where the other systems calculate it automatically). The same degree of subjectivity is already present with the current tagging system though, so it's probably not a big deal except with naturally subjective tags.

It also has the least granularity of the three, regardless of whether we go with 2, 3, or 5 levels (both Evazion's system and mine allow for basically continuous gradation). That could be a good or bad thing.

On the upside, it works with fewer users participating (only one is required), and is probably the least resource intensive of the three. It's also the one that demands the least effort in general, and as such is probably the most likely to be used widely.

Shinjidude said:
First, I see it potentially taking a lot of DB space and resources. That was one of the reasons why we never moved towards multiple favorite lists, and were so slow to restrict multiple votes, right? (Given my recent requests, though, I should be the last to point that out.)

It will probably require one number to be stored for each tag for each post. (Small Integer? Byte? Not sure what database engine Danbooru uses.) Yes, it will increase the size of the DB, but then again, each one of the proposed methods will. I don't see why it should have any noticeable effect on performance, however.

Yeah, I mentioned using one int per tag in that post. The thing I think will take a lot of space is the list of users who voted on each tag, which we would need to prevent people from voting multiple times.

Since we are expecting many users to vote on each tag to make this system useful, that's potentially the equivalent of at least an integer per each user ID who voted per tag per post, and that could add up to be a lot. If we did without this additional bookkeeping, it would be difficult to prevent someone from spamming tag votes with impunity.
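
To show where the space goes, here is a minimal in-memory Python sketch of that bookkeeping. The `TagVotes` class is purely illustrative; a real implementation would live in the database, and nothing here reflects Danbooru's actual schema.

```python
class TagVotes:
    """Hypothetical per-tag vote store with one vote per user."""

    def __init__(self):
        self.score = {}   # (post_id, tag) -> net score: one int per tag per post
        self.voters = {}  # (post_id, tag) -> {user_id: +1 or -1}: the costly part

    def vote(self, post_id, tag, user_id, value):
        """Record a +1/-1 vote; return False if this user already voted."""
        if value not in (1, -1):
            raise ValueError("value must be +1 or -1")
        key = (post_id, tag)
        seen = self.voters.setdefault(key, {})
        if user_id in seen:
            return False
        seen[user_id] = value
        self.score[key] = self.score.get(key, 0) + value
        return True
```

The `score` table stays at one integer per tag per post, while `voters` grows with every user who votes, which is the overhead being discussed. Dropping `voters` would save the space but leave nothing to stop repeat votes.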

As for performance, I said that with the thought that the relevance metrics would be worked into the regular search function. It may be lighter than I think, especially if it's only one int per tag per post, but it wouldn't work directly with the current text-search system that is used now, and I have to think it would add at least some overhead because of that.

My system would likely introduce the same sort of overhead.

Suiseiseki's system (a flat {major,minor,neutral} per tag) might be a bit lighter, though. It would only require two bits per tag for storage, and as for performance, I can see it working fairly well with our current metatag search functions.
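
As a sanity check on the "two bits per tag" figure, here is a short Python sketch that packs a post's relevance levels into a single integer. The numeric codes and the `pack`/`unpack` helpers are assumptions for illustration; a real database would more likely use a small integer or enum column.

```python
# Three states fit in two bits; the fourth value (3) is simply unused.
NEUTRAL, MINOR, MAJOR = 0, 1, 2


def pack(levels):
    """Pack a list of relevance levels into one int, two bits per tag."""
    packed = 0
    for i, level in enumerate(levels):
        packed |= (level & 0b11) << (2 * i)
    return packed


def unpack(packed, count):
    """Recover `count` relevance levels from a packed int."""
    return [(packed >> (2 * i)) & 0b11 for i in range(count)]
```

Four tags fit in a single byte this way, so even a heavily tagged post would only need a handful of bytes for relevance.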

Shinjidude said:
The thing I think will take a lot of space is the list of users who voted on each tag, which we would need to prevent people from voting multiple times.

That's why I suggested a hard limit on the number of "votable" tags, so the posts don't end up dragging a clusterfuck of data along, and with no control.

Shinjidude said:
I still don't see people incrementing tags for characters or random items on a regular basis. It might work much better for subjective tags where people actually have a strong opinion one way or the other.

And that is the other reason why subjective and objective tags should be treated with separate systems.

I've been writing enough walls of text lately (ugh...) so I just second Shinjidude overall.

I like the major/minor system the best because it's instantly fully effective, and instantly fully reversible if the value was wrong.

I question the implementation process, though.
My subjective tags worked separately, with a limited selection system similar to the current "add to pool" option, in their own part of the interface.
But usual tags are not selected, they're written, so you're basically free to do anything. That makes me question how you input the relevance data (can it be done in the upload form, or only after the post is validated?), and how, as you're inputting this data, the system differentiates tags that use the relevance feature from perma-neutral tags which won't (artists, copyrights). In other words, how the system reads your input and avoids breaking if you wrote crap.
I assume it's more easily implemented if you only deal with the relevance part after the site has processed your standard tag list and identified the metatags.
Well, just trying to figure out how that would work.

I also question the need for a third/neutral state, for two reasons:

  • It becomes harder to use. More room for judgement means less consistency, and more subjectivity (hello fetishes). Cases like post #493888 are already difficult enough to tag with only two depth levels.
  • When searching for konpaku_youmu, I don't know how right it is to find post #591553 before post #570208 or the one above, and I don't know if we even want to ask ourselves the question in the first place.

So I believe we should only use two. Something along the lines of {strongly incidental, everything else}.

I think we need a neutral state for primarily the same reason we need a questionable rating. No matter what system we implement, people will be lazy and not use the system properly.

We need an ambivalent state to default to rather than force everything to default to one or the other concrete poles. If it sounds better, we could default to null (relevance not set), and require people to set the tag to major or minor, but that's effectively the same thing. Similarly, we would need a state to default all pre-existing tags to; arbitrarily calling them major or minor would be wrong.

I don't see null/neutral being chosen often for the reasons you state, but I can foresee it being defaulted to a lot.

---

As for interface implementation details, I would suggest that setting tag relevance be done after the post has been validated, and the tags added to the list. We could then do something like provide a symbol or set of symbols to the right of each tag, and allow the user to click on them to toggle / cycle through the relevance settings.

Here are some ideas:

1-Symbol: cycle through three symbols {minor, neutral, major}: {○,◑,●}, or {˩,˧,˥} for example, defaulting to neutral, then cycling high, low, neutral again. Three colors could also be used; greying out neutral's color is probably a good idea.

2-Symbols: provide two symbols side-by-side to indicate minor and major, such as {○,●} or {▼,▲}. Both symbols would default to being greyed out to indicate neutral. Clicking one or the other would toggle to that setting; clicking an active setting again would disable it back to neutral. This method would avoid issues with misrating things by accidentally cycling too far.
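
The click behavior of both ideas can be modeled in a few lines. This is only a Python sketch of the state logic (the real thing would be site JavaScript), and the names `cycle` and `toggle` are made up for the example.

```python
NEUTRAL, MINOR, MAJOR = "neutral", "minor", "major"

# 1-symbol idea: start at neutral, then cycle high, low, neutral again.
NEXT_STATE = {NEUTRAL: MAJOR, MAJOR: MINOR, MINOR: NEUTRAL}


def cycle(state):
    """One click on the single symbol moves to the next state."""
    return NEXT_STATE[state]


def toggle(state, clicked):
    """2-symbol idea: `clicked` is MINOR or MAJOR.

    Clicking the already active symbol resets to neutral;
    clicking the other one switches straight to it.
    """
    return NEUTRAL if state == clicked else clicked
```

With `toggle`, no state is ever more than one click away from the one you want, which is why it avoids the "cycled too far" problem.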

Erm, I was writing that I couldn't get the point of the neutral/major distinction, when I finally figured out that you can filter your searches rather than ordering them (duh), thus allowing post #591553 and post #570208 to appear separately on top of their own search types (respectively konpaku_youmu:major and konpaku_youmu:neutral), which sounds indeed magnificent.
So I scrapped everything.

So yes, three levels.
I don't like the subjectivity opportunities it conveys ("Mokou looks better than Kaguya here so she has to be major.") but it's still too valuable despite that.
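
To spell out the filtering idea, here is a rough Python sketch of a `tag:level` search over toy data. The `posts` layout and the `parse_term` and `search` helpers are hypothetical; only the `konpaku_youmu:major` style of query comes from this discussion.

```python
LEVELS = {"major", "neutral", "minor"}


def parse_term(term):
    """Split 'tag:level' into (tag, level); level is None for a plain tag."""
    tag, sep, level = term.rpartition(":")
    if sep and tag and level in LEVELS:
        return tag, level
    return term, None


def search(posts, term):
    """Return sorted ids of posts matching the term.

    posts is a hypothetical mapping {post_id: {tag: level}}.
    A plain tag matches every level; 'tag:level' matches only that level.
    """
    tag, level = parse_term(term)
    return sorted(
        post_id
        for post_id, tags in posts.items()
        if tag in tags and (level is None or tags[tag] == level)
    )
```

A plain `konpaku_youmu` search still returns everything, so people who ignore relevance lose nothing.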

Shinjidude said:
No matter what system we implement, people will be lazy and not use the system properly.

We need an ambivalent state to default to rather than force everything to default to one or the other concrete poles.

I was meaning to use the "everything else" state as default (named "standard" I suppose), so you'd just have needed to tick the "minor" box for the least prominent ones.
But we don't care anymore now anyway.

About the interface, I was hoping we could have the taglist sorted by categories (daydreaming?) and moving the tags from one to another with +/- buttons next to them.
Something like:

  • artists:
    • ? artist1
  • copyrights:
    • ? copyright1
    • ? copyright2
  • characters:
    • major:
      • ? + - character1
    • neutral:
      • ? + - character2
      • ? + - character3
    • minor:
  • general:
    • major:
      • ? + - general1
      • ? + - general2
    • neutral:
      • ? + - general3
      • ? + - general4
      • ? + - general5
      • ? + - general6
    • minor:
      • ? + - general7

Maybe the metatags split is overkill and takes too much space (without the interline spacings *cough*), I don't know. It would be useful to have them sorted on long taglists (which has probably been suggested zillions of times already anyway).

Hmm, that tag list looks too expansive to me. If we are going to start doing a tree, I hope the reason is because we are getting hierarchical tags.

Using ? + - to the left for relevance is a very bad idea because that's exactly what we use for the wiki and adding / removing tags from a query currently.

As for sorting by tagtype, Evazion already wrote a greasemonkey script that lets you do that, and if Albert likes the idea it could be worked into the site's javascript. The same style rules could be used for sorting by relevance.

I don't see that the types and relevances need headers, since the type is already obvious by color, and the relevance could be indicated by the selector on the right.

My idea would leave us with a taglist looking something like:

Sort by: A-Z | Type | Relevance | Count (this could be a drop-down)
? + - artist1 ◑ 1234
? + - copyright1 ◑ 1234
? + - copyright2 ◑ 1234
? + - character1 ● 1234
? + - character2 ◑ 1234
? + - character3 ◑ 1234
? + - general1 ● 1234
? + - general2 ● 1234
? + - general3 ◑ 1234
? + - general4 ◑ 1234
? + - general5 ◑ 1234
? + - general6 ◑ 1234
? + - general7 ○ 1234

Picture this where the sort options at the top work dynamically, and color is used to define tag types and potentially the relevance (which could also be the 2-symbol approach I noted above, or image based symbols)
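
The sort options could work roughly like this Python sketch (the real thing would be JavaScript on the page). The `TagEntry` fields, the ordering tables, and `sort_tags` are all assumptions made for the example; the type order just follows the list above.

```python
from dataclasses import dataclass

# Assumed display orders, taken from the mock-up list rather than from the site.
TYPE_ORDER = {"artist": 0, "copyright": 1, "character": 2, "general": 3}
RELEVANCE_ORDER = {"major": 0, "neutral": 1, "minor": 2}


@dataclass
class TagEntry:
    name: str
    category: str   # artist / copyright / character / general
    relevance: str  # major / neutral / minor
    count: int      # number of posts with this tag


# One sort key per option in the "Sort by" bar; names break ties.
SORT_KEYS = {
    "A-Z": lambda t: t.name,
    "Type": lambda t: (TYPE_ORDER[t.category], t.name),
    "Relevance": lambda t: (RELEVANCE_ORDER[t.relevance], t.name),
    "Count": lambda t: (-t.count, t.name),
}


def sort_tags(tags, mode):
    """Return a new list of tags ordered by the chosen sort option."""
    return sorted(tags, key=SORT_KEYS[mode])
```

Because every option is just a different key over the same flat list, no tree or headers are needed to switch between them.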

I think that would look a lot more compact and cleaner, and would free up the tree semantics for something else down the road *hint* *hint*.

I don't know about you, but for me this thing is mind-bogglingly complex. I don't even want to start figuring out what it could possibly mean, and the chance the average user will bother is approximately -11.76%. I linked to the second system effect for a reason. Yes, it'd be nice to have tags tell us more, but with the above seventy-page proposal, all you're gonna get is a more telling lack of tags, because nobody will give a flying fuck about something this complicated.

How complex is it to have a three-way prominence metric per tag that gracefully degrades for people too lazy to use it, and only adds one symbol per tag to the existing interface?

Most of the "70 page proposal" was back-and-forth discussion and alternative systems. This is a heck of a lot less complex than the original thing you chided me for (which I'll admit was much too cumbersome for general use).

I'll acknowledge general users are lazy, and many still don't tag things properly, but power-users usually step in and take care of it for them. I don't see how this would be any different.

You don't need the "+ -" part with this system right? So I suppose you meant "? artist1 ◑ 1234" and so on.

Well, I thought of the move up/down idea because it's easy enough for people to understand.
You don't need the tree if you use colors instead, yeah, but that would mean using 6 or 8 colors then (dark blue, medium blue, light blue, gets difficult with a 4th level...). Well, maybe that can work too.

I like your "1234" idea as you can use multiple types of sorting with it.
And the symbol is kind of redundant I guess. I'd rather put the selected number in bold or bigger, like "1 2 [b]3[/b] 4".
Doesn't look that cool with the artist and copyright tags not using the system though.

I don't know. I like the color idea anyway.

Hazuki > Assuming most of danbooru's uploads come from regular users who know how to tag properly and all, I don't think clicking a few buttons on some of your uploads is a lot more work.
Plus it would be a little easier to get participation from members who can't bother to edit tags because it's tiresome. Clicking a button here and there is quicker and might receive some more involvement.

Er... the ?, +, -, and 1234 have nothing to do with what we are discussing. They are simply copies of what's already in the interface.

? is for the wiki
+ is to add a tag to a query
- is to add a negated tag to a query
1234 was meant to indicate the tag count which is also already there

I added the sorting (which, like I said, has already been implemented in a script), so that isn't original either.

Other than that, though, the only thing relevant to the relevance/prominence metric we are talking about is the symbol (◑) to the right, which works as both an indicator and an interface for setting the relevance (by cycling, like I said above).

Absolutely everything else already exists.

Also the colors stay the same set we already have (red=artists, green=character, purple=copyright, blue=general). No need for 6 or 8, that'd be very confusing and is the reason we don't have more tag types than we do.

The only addition I proposed color-wise would be to grey out neutral relevance symbols so they don't stand out as much.

Aw, sorry, fine then the symbol selection is nice if it's clear enough I guess (is that +/- "add tag to query" feature really that useful by the way?).

Shinjidude said:
Also the colors stay the same set we already have (red=artists, green=character, purple=copyright, blue=general). No need for 6, that'd be very confusing and is the reason we don't have more tag types than we do.

Well, I read that you wrote "and color is used to define tag types and potentially the relevance".
I thought you were meaning to use multiple color levels for relevance.
Alright I follow you. Simple enough.
