Donmai

Accents should die - discuss

Posted under General

This topic has been locked.

Gargamuza said:
So, Robert Garcia's real name is Roberto Garcia? Stop ignoring my arguments.

Besides, I've read LaC's posts. They are irrelevant to the discussion, if you haven't realized it already.

As far as I know, the rules of this site say that we should go by the name the character has in the anime/game the character appears in. Since no KOF has Angel named as Ángel, using any other name is not correct and is just a personal interpretation of the name, which is impractical for this site, and not accurate.

If you fucking read the arguments instead of being so busy being right, you'd have fucking realised that her name is transcribed into katakana as アンヘル, which is NOT how you'd transcribe Angel, as that's エンジェル, but incidentally it's *exactly* how you'd transcribe the Spanish pronunciation of Ángel. So far the one ignoring arguments is you, and that's exactly why you got banned.

Gargamuza said:
Calm down that rage, pal.

http://en.wikipedia.org/wiki/Hypocrisy. You're well on your way to earning another record. I'm getting tired of arguing with someone whose entire argument is that they're right.

The issue for me isn't about whether the name in Spanish (or whatever) should have an accent or not. I frankly *do not care*, at all, but it seems the discussion is still stuck back in that quagmire.

My issue with accents is the fact that they aren't user-friendly, and using them imparts little to no worthwhile information to a user. I personally do not think tags should include any characters outside of a-z, 0-9, and the standard symbols that require nothing more complicated than the shift key. That is the basis of this thread.

I mentioned in option 1) that an alias would be called for if accents were decided to be utterly, drop-dead necessary, and to be quite honest I *do not care* what is decided for this character. You can continue to battle that out amongst yourselves.

I am more interested in what we do *in general*. Forest, not one Spanish tree.

Do we go through the database and crack out the foreign dictionaries and add cedillas and umlauts and all the rest, alias every one we come across, and be satisfied that we're going to miss something somewhere and have more debates like this?

Or do we just drop accents as a rule and decide that the information that's important is conveyed in a simpler and yet still wholly sufficient manner if we use 'c' rather than ::opens charmap:: 'ç'?

Tree post:

Gargamuza said:
Not only is her name not Ángel, KOF fans know her as Angel.

How do you pronounce it, btw?

Gargamuza said:
It's not unrelated, because the only reason why some people want to use Ángel instead of Angel is that Angel is Mexican, hence she should use an accent since it's a Spanish speaking country.

I, for one, had no idea that she was supposed to be Mexican.

But if we go by this logic, Robert Garcia should be named Roberto since Robert isn't an Italian name and could probably be the result of a mistranslation.

Roberto Garcia still sounds Spanish to me.

Gargamuza said:
Why do you consider Angel, without an accent, a Spanish name?

But what name is it, then? If it's an English name, it should be エンジェル or something instead of アンヘル.

Forest post:

jxh2154 said:
We don't use a cedilla in ZnT Louise's name (per albert's decision, no less), because nobody would ever think to type it, and even if by some miracle they did you should NOT have to memorize an alt+#### sequence or start>run>charmap to type a tag.

Nobody is going to type "louise_francoise_le_blanc_de_la_valliere" anyway: you reach those posts using aliases or the disambiguation feature (case in point: I just searched for "louise" to get there). Writing the name properly wouldn't make things harder for anyone.

2) Similarly, if danbooru can be programmed to automagically search ấ, ầ, ẩ, ẫ, ậ, ắ, ẳ, ả, ā, ă, à, á, â, ã, ä, å, and whatever else as well when a user types in 'a'.

Oh, that's not a bad idea. Actually, it's a very good idea.

albert said:
There's presumably a reason why we make such a desperate effort to romanize artist names (often getting it wrong in the process): because most of us don't know Japanese and can't be bothered to struggle with Japanese input. Similarly, inputting accents is a hassle and unnecessary as long as a tag is unique.

There is, however, an important difference: most people have no idea what 葉月 is, but everyone can read "Françoise", even if they might have trouble typing it. And for that, we could use the awesome power of computers to let "francoise" match "françoise" as well.

jxh2154 said:
Do we go through the database and crack out the foreign dictionaries and add cedillas and umlauts and all the rest, alias every one we come across, and be satisfied that we're going to miss something somewhere and have more debates like this?

"We wouldn't be having this debate if everyone just agreed with me" is not a very compelling argument, I'm afraid. ^^

LaC said:
Nobody is going to type "louise_francoise_le_blanc_de_la_valliere" anyway

Eh, this is still a tree reply. The fact that her name is stupid long is a different issue, already resolved elsewhere. It just happened to be the first example on my mind since we'd discussed it before. But it could apply equally to a very short tag people *would* type.

"We wouldn't have this argument if everyone just agreed with me" is sort of a piss-poor argument, I'm afraid.

That's somewhat disingenuous. What I said was true enough: if we drop accents, we won't have to decide if a name should be accented or not. Maybe that would be "piss-poor" if it were my entire argument, but it's not.

It's true that I do think it's a perfectly valid consideration however, given the rancor and animosity that one tag with **two** posts has generated. Is it worth it? Really? Do we want to see it happen again? This place is already bordering on unnecessarily hostile nowadays, and I've personally seen it turn people off.

(To preemptively squash the counterargument I just set you up for, no I am not advocating throwing in the towel every time an issue gets people riled up. Instead it's simple cost/benefit thinking.)

But even if it never generated an argument again (or if we decided we like cursing each other out over a tag), there are still issues of doing it consistently and accurately. Again it all boils down to, is it worth it? Really?

---------

More constructively, as for the idea of automatically mapping special characters to their standard ASCII equivalent:

Oh, that's not a bad idea. Actually, it's a very good idea.

I just don't know if it's worth albert's time. If he can implement it easily, sure whatever go ahead, so long as he decides whatever potential costs (in time or otherwise) are offset by what it gains us. My position is that it gains us quite nearly nothing over replacing accents with plain ASCII counterparts, so the costs have to be exceptionally low. Unless someone wants to write the code for albert, which would be nice. Does anyone know what's involved in this sort of programming? Is it easy? Hard?

jxh2154 said:
More constructively, as for the idea of automatically mapping special characters to their standard ASCII equivalent:
I just don't know if it's worth albert's time. If he can implement it easily, sure whatever go ahead, so long as he decides whatever potential costs (in time or otherwise) are offset by what it gains us. My position is that it gains us quite nearly nothing over replacing accents with plain ASCII counterparts, so the costs have to be exceptionally low. Unless someone wants to write the code for albert, which would be nice. Does anyone know what's involved in this sort of programming? Is it easy? Hard?

Assuming Ruby has sane libraries for dealing with Unicode, it's easy, thanks to the thoughtful design behind the standard. There's something called canonical decomposition, which splits characters into their base character and any combining marks. So for å (U+00E5 LATIN SMALL LETTER A WITH RING ABOVE) for instance, the canonical decomposition is U+0061 LATIN SMALL LETTER A + U+030A COMBINING RING ABOVE.
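To make the decomposition concrete, here is a minimal sketch. It is in Python rather than Ruby only because Python's standard library ships `unicodedata`; the helper name `strip_accents` is made up for illustration and is not anything in Danbooru.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Canonically decompose (NFD), then drop the combining marks.

    Hypothetical helper: only the base characters are kept.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# å (U+00E5) decomposes into its base letter plus a combining ring.
print([f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", "\u00e5")])
# → ['U+0061', 'U+030A']

print(strip_accents("françoise"))  # → francoise
```

Applying the same folding to both the stored tag and the search term would let "francoise" find "françoise", at the cost of one normalization pass per string.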

So if you basically tell it "Also search all characters that are 'U+0061 plus something else' combinations when 'a' is pressed", and this can be done easily without increasing server load (because it would impact every search but matter in a very small number of cases) or whatever, then I won't protest much if it's implemented, even if I still don't want to see accents in tags.

I guess the question then is what the software would do if it comes across an ambiguity. What would it return if you searched 'foo' and there were tags for 'foo' and 'foö'?
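One way to look at that ambiguity is to fold every tag to an accent-free key and see which keys end up shared. A rough sketch in Python follows; `fold` and `find_clashes` are made-up names and the tag list is invented, so none of this reflects how Danbooru actually stores or searches tags.

```python
import unicodedata
from collections import defaultdict

def fold(tag: str) -> str:
    """Hypothetical folding step: NFD-decompose and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", tag)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def find_clashes(tags):
    """Group tags by folded key; return only keys shared by several tags."""
    groups = defaultdict(list)
    for tag in tags:
        groups[fold(tag)].append(tag)
    return {key: names for key, names in groups.items() if len(names) > 1}

print(find_clashes(["foo", "foö", "angel"]))
# → {'foo': ['foo', 'foö']}
```

What to do with a clash (return both tags, or prefer the exact match) is a policy choice the code does not answer.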

For the Angel issue:

Are SNK idiots? Yes. But the rule is use whatever the source material uses. If SNK uses Angel, and fans use Angel, then Danbooru will use Angel.

Concerning non-ASCII characters in tags:

I will look into Unicode normalization but I'm not expecting much. Ruby 1.8's Unicode support is entirely library-dependent, and I'm wary of the overhead brought on by Unicode processing for something as basic as tag searching. And I think normalization would bring up additional issues with name clashes. But I will look into it.

For now, I would like to stick with ASCII characters for tags. I will revisit the issue of whether aliasing would solve any problems after I add a note field to aliases and implications.

If you have additional strong opinions about this matter, please PM me. But I think this thread has run its course.
