Wikipedia talk:Usage of diacritics/Archive 2
This is an archive of past discussions about Wikipedia:Usage of diacritics. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Summary of the discussion
Since the argument seems to continue going round in the same eternal circles, let's try to summarise the two main positions (this is my attempt, with the disclaimer that I support the proposal, so someone opposing it should probably rewrite/expand the arguments against):
- For the proposed guideline
- We have the technology to use diacritics, so we should use them where they add information
- WP already uses diacritics far more widely than the majority of English sources do, so to a large extent the proposal documents current practice (although this may not apply to non-diacritic extended characters, which are used much more widely than the proposal would imply)
- Readers know that foreign diacritics are optional in English, so are not misled by seeing them
- In the vast majority of cases, the spelling of a foreign name with the original diacritics is acceptable in good English, and does not make it less recognisable to readers used to seeing it without diacritics
- Although the use of diacritic-less forms is also acceptable in English, use of such forms in WP means subtracting information and thus making the encyclopedia worse for no particular gain
- Special cases where the above arguments do not apply are by and large handled through the exceptions stated in the proposal
- Implementing a policy of "doing what reliable sources do", though superficially attractive, leaves open the major questions of which sources are to be considered reliable in this matter, what to do where sources clash over a particular name, and how to handle situations where such a policy leads to a confusing clash of styles between different names
- Against the proposed guideline
- There is already a naming convention guideline that covers this issue called Wikipedia:Naming conventions (use English) (WP:UE) which depends on current Wikipedia policies (and is in harmony with them) to guide on the choice of name to be used and whether it is appropriate to use or not use diacritics.
- Adopting hard rules about diacritics leads to forms which are against English idiom, and breaches at least two Wikipedia policies WP:NC "article naming should prefer what the greatest number of English speakers would most easily recognize, ..." and WP:V "Articles should rely on reliable, third-party published sources with a reputation for fact-checking and accuracy."
- Use of diacritics where the majority of English sources do not use them is likely to mislead readers into thinking that they are most commonly used in English
- Use of diacritics when they are never used in English is not writing in English (This would be equally true of extensions which are always used in English, and which this convention would exclude, but there are fewer of those.) This is the English Wikipedia.
- We get into trouble when a foreign name contains both diacritics (which the proposal says should remain) and extended characters (which the proposal says should be transcribed)
- Common usage is usually easy to determine if editors are willing to look at reliable third party sources using good faith. Where it is not WP:UE describes how to minimise conflicts: see the sections WP:UE#Divided usage and WP:UE#No established usage.
- Most cases where the use of diacritics is reviewed by a large pool of uninvolved editors either move them or provide clear evidence that the diacritics are common in English usage.
- Many cases are in fact decided by the standard of common usage; where they are not, they are non-consensus due to a plea that the diacritics are "correct" in some other language, which would also oppose this proposal.
- Just as differences between national varieties of English can distract readers familiar with another dialect from the information in an article (hence our rules in the MOS on national varieties of English; the evidence is the number of times English is "fixed" by altering spelling and grammar from one variety of English to another, and requested moves for articles such as Orange (colour)), so too can diacritics on words that do not usually have them in English, or their absence from words that usually have them (evidence of this can be seen in requested moves for articles such as Zürich). Using reliable third-party sources to indicate common usage reduces the annoyance factor of the "wrong" spelling for the largest number of people.
- --Kotniski (talk) 10:10, 27 June 2008 (UTC)
- Modified by -- Philip Baird Shearer (talk) 16:06, 27 June 2008 (UTC)
- It's the last of the "against" arguments I find common and faintly ridiculous. There is no evidence whatsoever of "distraction" being caused by the presence of other national forms - people may change between forms from ignorance, habit or any other reason - and likewise for diacritics. Our rules are there to keep the number of changes down no matter what the reason - and there is absolutely nothing in policy about "distraction", being a completely subjective and wildly variable matter between readers and not even a simple function of relative usage.
- Comprehensibility is what matters, and any argument which claims understanding (or "annoyance", or "distraction") is a binary-valued quantity based on a simple majority vote on usage is obviously over-simplistic - just look at the example of Schröder given above. The same goes for Zurich/Zürich - English speakers come across both spellings a reasonable amount of the time without massive misunderstanding or incomprehension, just as we assume they are familiar with both color and colour - and that's a nearly 80/20 split on the internet. Saying that merely because a spelling is used by less than 49% of the sites on Google it is suddenly more annoying or incomprehensible to more people is unfounded, unlikely and simplistic. When picking a spelling, if comprehensibility to all is not at risk, then we are justified in starting to look at whether one of the spellings has other possible extra benefits, such as diacritics. Our overriding aim is that we don't use spellings not well-established in English - any talk of majorities is an over-restrictive blunt tool, and talk of "annoyance" and "distraction" are obscurantist on the par of "I like diacritics cos they look exotic" or "it's distracting not to have diacritics cos it's not the right spelling in Foo-ian". The third "against" argument is also equally speculative for much the same reasons. Knepflerle (talk) 16:59, 27 June 2008 (UTC)
- I think the main sticking point here for me is whether ć is the same as c is the same as ç. I "see" (pun intended) them as three different letters, and so I would avoid using ć the same way I would avoid using π in an article title for an English encyclopedia, probably because in the Spanish ("Castellano" as I learned it) alphabet n and ñ are different letters, not one letter and a modified version (the "tilde" isn't really a diacritic at all). There are cases where using the non-English character is the sensible thing to do (q.v. El Niño), but it should be avoided when there is an obvious and widely-used alternative.
- I agree that "majority use" is a blunt instrument, but "fringe use" (q.v. Milosevic, see my comments above) is easier to identify and should not be encouraged. Somedumbyankee (talk) 17:31, 27 June 2008 (UTC)
- If that's a "sticking point", you're focusing on the wrong issues. It is neither here nor there whether these are diacritics or separate letters in a different language - what matters is whether, in a particular word, they have a detrimental effect on comprehensibility to English-speakers or not. Needless to say there is a world of difference between the comprehensibility of ç and that of Greek/Cyrillic letters such as π. But for Göttingen, Schröder, Zürich, François Mitterrand, Dvořák, ångström and El Niño the loss to comprehensibility is practically zero because these forms are seen with quite decent (if not >50%) regularity in English-language texts. Knepflerle (talk) 23:44, 27 June 2008 (UTC)
- "Focusing on the wrong issues" is a bit misleading. I'm focusing on different issues, which is probably why we don't agree. My point is mostly that ć isn't an English letter any more than ñ or θ are, and the English Wikipedia shouldn't be using foreign spellings when most reliable sources aren't using them. When reliable sources do, I see no problem using them, but the reality is that this policy would force us to use them when our sources do not. Verifiability, not truth is the "wrong" issue that bothers me. Somedumbyankee (talk) 00:14, 28 June 2008 (UTC)
- The current proposed guideline does not necessarily require any English-language sources use what you call the "foreign spelling"*, and for this reason I am not in favour of it. Verifiability requires that some reliable English-language sources use the "foreign spelling". Comprehensibility requires that a decent proportion of reliable English-language sources use the "foreign spelling". Our current guideline requires a majority use the "foreign spelling". The middle path which I favour, is that we could (but of course do not have to) accept a spelling as long as it is comprehensible, and this is a stricter requirement than and implies verifiability. It is, however, a slight slackening of the current UE which requires simple majority and which means occasionally we may be not using more information-rich spelling which is just as comprehensible to English-speakers. (* PS: I contend that a spelling which is used in a decent proportion of English-language texts is as English as any other, no matter perceived "foreignness" - what is more English than what a decent amount of English speakers write? Café is certainly English usage, and Zalaegerszeg and Pilisszentkereszt are just as foreign as Hódmezővásárhely!) Knepflerle (talk) 00:51, 28 June 2008 (UTC)
- To quote the current policy, "If there is a consensus on spelling in the sources used for the article, this will normally represent a consensus of English usage." That policy isn't looking for a majority, it's looking for a consensus of how reliable sources spell it. It defers judgment to people who know better than Wikipedia editors instead of instruction creep that "fixes" "incorrect usage" by "half-literate" people like the UN.
- For the record, café is really more of a foreign branding, intentionally taken raw from the French to sound foreign and gastronomically appealing. It's not really an English spelling, it's a foreign word routinely used in English. Then again, that accounts for the majority of the language depending on how you slice it. Somedumbyankee (talk) 01:30, 28 June 2008 (UTC)
Café is not an English word, just "foreign branding" because of your unfounded and irrelevant speculations on why it entered English use? Or just a case that no true English word would look that foreign? "That policy isn't looking for a majority" - just read the first line: "Use the most commonly used English version of the name of the subject" Knepflerle (talk) 11:24, 28 June 2008 (UTC)
- This is more of an WP:ENGVAR issue. Café is far more natural in British English; it would be more speculation to point out that the British spend the most time in France of any English-speaking nation. But even in Britain the frequency of café may well be declining, and using the accent has been controverted at WT:MOS. (The discussion may well have been archived by now.)
- WP:UE has never been held to mean a 51% majority, except in cases driven by some other motive; wording to clarify this, if the section on Divided Usage does not, would be welcome. Septentrionalis PMAnderson 18:49, 28 June 2008 (UTC)
Couple of my comments
- Point number 1 is the absolute basis for personal names (excluding royalty)!
Point number 1 in this proposal should apply to all personal names in Latin script, unless it conflicts with point number 5.
- Point number 3 is nonsense
There are no national conventions for transliteration! So there is no convention for the transliteration of letters like "Đ, đ" either. The only convention that exists for transliteration is in German: transliteration of Ä, Ö, Ü and ß into AE, OE, UE and SS where the use of these characters is prevented by software problems (URLs, e-mail addresses...). I remind you that Wikipedia has no such problem! What is the difference between Tuđman and Ngô Đình Diệm?
- I support point 5 ("When a person has changed his or her name (for example, in the process of naturalization as a citizen of another country), the new form of the name is used."). This applies only to legal documents (passports etc., not football fan membership cards!).
So these rules would apply to Edward George de Valera, who became Éamon de Valera; Wilhelm Oberdank, who became Guglielmo Oberdan; and Ivan Vučetić, who became Juan Vucetich.
I will add these 3 examples to point 5 of the proposal. --Áñtò | Ãňţõ (talk) 20:28, 27 June 2008 (UTC) 20:12, 27 June 2008 (UTC)
- Of course there is "evidence of distraction". It is strongest for ß and đ, where editors have said so; see Talk:Meissen and its archives. But it is clear that TennisExpert finds even single letter variants from the names he is accustomed to distracting; so do I, although I think less so. Septentrionalis PMAnderson 20:25, 27 June 2008 (UTC)
- If that will make things easier, we can always make REDIRECTS, i.e. Tudman to Tuđman, Dokovic to Đoković. Perhaps also a little tutorial for English monoglots, explaining to them that "There are some things that are not written in English. There are some names written with strange letters! etc. ...." There are always "copy-paste" methods. --Áñtò | Ãňţõ (talk) 20:37, 27 June 2008 (UTC)
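A minimal sketch of how such plain-letter redirect titles could be derived, assuming Python and its standard unicodedata module. The names redirect_title and EXTENDED are hypothetical, chosen for illustration only, and this does not describe how MediaWiki itself handles redirects. Letters such as đ have no Unicode decomposition, so stripping combining marks alone leaves them untouched and a small hand-written table is needed as well.

```python
import unicodedata

# Hypothetical table for letters that have no Unicode decomposition.
# The choice of plain-letter replacements is an assumption, not a standard.
EXTENDED = {"đ": "d", "Đ": "D", "ø": "o", "Ø": "O", "ł": "l", "Ł": "L", "ß": "ss"}


def redirect_title(title: str) -> str:
    """Return a diacritic-free form of a title, suitable as a redirect source."""
    # NFD splits letters like "ć" into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Map the remaining extended letters through the hand-written table.
    return "".join(EXTENDED.get(ch, ch) for ch in stripped)


if __name__ == "__main__":
    for name in ("Tuđman", "Đoković", "Ngô Đình Diệm"):
        print(redirect_title(name), "->", name)
```

The article would stay at the title with diacritics, with the plain form pointing to it.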
- Not only are diacritics visually distracting, but they're a barrier to editing on English-language Wikipedia, as I have explained before. Those are important considerations, as is preserving the fundamental principle that this is an English-language encyclopedia and everything in it should be based on reliable English-language sources and not on original research, personal opinions, or emotional appeals to nationalism. Tennis expert (talk) 20:50, 27 June 2008 (UTC)
- Personally, I'd never expect of anyone to enter any diacritics while editing. (This is an important issue, and we've missed it in the discussion.) As for "emotional appeals to nationalism", well: turnabout is fair play. GregorB (talk) 21:08, 27 June 2008 (UTC)
- @Tennis expert . If you are not familiar with diacritics then don't use them. You can write an article without them and later native speakers might help you with name spelling. And problem solved. --Áñtò | Ãňţõ (talk) 10:47, 28 June 2008 (UTC)
- Comment I generally support diacritics, but agree with con #4 "when a foreign name contains both diacritics ... and extended characters". If we're going to use one, we should use both.
- As an encyclopedia, I think we should retain as much reference information as possible. Newspapers and magazines may not care, since the individual will be disambiguated by being topical at the time of coverage, but that is not the case for us.
- Also, please let's not get into the straw man of common English words (we're discussing proper names), or the debate about whether to use an English vs. foreign form for a proper name (this would only apply once it's decided to use the foreign form). kwami (talk) 20:33, 27 June 2008 (UTC)
- This proposal proposes that we include misinformation; less than some users would like, I admit, but more than we should. Septentrionalis PMAnderson 20:36, 27 June 2008 (UTC)
- What is the misinformation in this proposal? Explain, please! --Áñtò | Ãňţõ (talk) 20:46, 27 June 2008 (UTC)
- Yes, I am also curious as to what "misinformation" you're talking about. I thought this debate was merely a matter of opinion, not of fact.
- One other comment. Using an encyclopedia is part of a person's education. By removing diacritics in order to dumb down the encyclopedia (because they're "distracting", "unfamiliar", "difficult", etc.—in other words, because our readers aren't educated or intelligent enough to handle them), we're not doing our readers any favors. Of course, removing diacritics because the names have been assimilated into English is an entirely different matter. kwami (talk) 21:02, 27 June 2008 (UTC)
- Con point 3: Use of diacritics where the majority of English sources do not use them is likely to mislead readers into thinking that they are most commonly used in English. That will do as a summary, but it is too weak. It does mislead, and in some cases (Tudjman is the most obvious, but I'm sure there are others) it misinforms about what is ever used in English (excluding the hopeless pedant and the terminally illiterate). Septentrionalis PMAnderson 21:10, 27 June 2008 (UTC)
- Such cases are indeed genuinely dumbing Wikipedia down, where following common usage would not. Septentrionalis PMAnderson 21:10, 27 June 2008 (UTC)
- That's hardly "misinformation". All we would need to say is "commonly spelled 'Tudjman'", unless of course in that case we decide "Tudjman" is an assimilated English name and go with 'Use English'. kwami (talk) 21:31, 27 June 2008 (UTC)
- Try and include "commonly spelled 'Tudjman'" in the article, and see what happens. But of course "Tudjman" is an assimilated English name; that doesn't stop our nationalists from arguing about it. (Some of these tennis players seem to have had their names assimilated to English, or at least Western European, usage, also; but that is a question of fact, to be decided by evidence.) Septentrionalis PMAnderson 21:36, 27 June 2008 (UTC)
- Part of the problem is that many of the comments seem to want to reject the entire WP:UE guideline. I'm hearing many of the comments on this page as "modern assimilated English names are not English usage, they're just misspellings, and using them insults the reader." If there is a reasonable consensus among many prominent and reliable English language sources on how to spell a name, I would take that as clear evidence that there is an assimilated English name and it should be treated the same way as Munich or Napoleon or Venice. If editors really have a problem with the way that established English usage handles a word, they should bring it up with the reliable sources and not try to "correct the world" through Wikipedia. Somedumbyankee (talk) 22:40, 27 June 2008 (UTC)
Pointing out just a few flaws in changing ö->oe, ä->ae in my native Finnish. Former Prime Minister Anneli Jäätteenmäki, huh, Jaeaetteenmaeki? Or what about the following sentence, Mosquitos on Lake Onega's ice: Saeaeskiae Aeaenisjaerven jaeaellae. Great.. When removing "diacritics" only, Willow Tit fi:Hömötiainen turns into Homo Tit. It just isn't acceptable to cripple the letters. As I see it, all WikiProjects of countries which use diacritics should quit, as I don't see any point in creating articles under names that are incorrect. At least I wouldn't create / edit any. --Pudeo⺮ 23:38, 27 June 2008 (UTC)
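The two treatments contrasted in this comment can be sketched as follows, assuming Python and its standard unicodedata module. The names to_digraphs, strip_marks and DIGRAPHS are hypothetical, and the table covers only the ö->oe and ä->ae substitutions mentioned here; it is an illustration, not a proposed rule.

```python
import unicodedata

# Only the two substitutions named in the comment (plus capitals); assumption:
# no other letters are handled.
DIGRAPHS = {"ä": "ae", "ö": "oe", "Ä": "Ae", "Ö": "Oe"}


def to_digraphs(text: str) -> str:
    """German-style substitution: replace ä/ö with two-letter sequences."""
    # NFC makes sure "ä" is a single code point before the table lookup.
    composed = unicodedata.normalize("NFC", text)
    return "".join(DIGRAPHS.get(ch, ch) for ch in composed)


def strip_marks(text: str) -> str:
    """Plain stripping: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for word in ("Jäätteenmäki", "Hömötiainen", "Sääskiä Äänisjärven jäällä"):
        print(word, "|", to_digraphs(word), "|", strip_marks(word))
```

Neither output is a form a Finnish reader would recognise as the name itself.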
- The name may be incorrect in Finnish, but if a name is spelt a different way to the Finnish in the majority of reliable English language sources, then the spelling is not incorrect in English. "Napoleon Bonaparte" is not an incorrect English spelling of Napoléon Bonaparte. For that matter neither is Wikipédia an incorrect spelling in French. --Philip Baird Shearer (talk) 12:43, 28 June 2008 (UTC)
- Again : I have told this to PMAnderson and I will tell to you as well:Some thing can be "correct" if it is against the system , against the rules! And so far:
- there are no rules in English language for spelling of foreign names.
- There are no rules for transliteration of names from the Latin script.
- If there are such rules, provide me with some sources. Those sources can only be English grammar/ortography books from anglistic studies at some university. They are the only experts on English grammar/ortography rules. Not NYT, CNN, BBC, nor Playboy, Cosmopolitan or FHM, because they are not in charge of regulating the English language. --Áñtò | Ãňţõ (talk) 15:53, 28 June 2008 (UTC)
- There are no rules for English but usage. Anto does not know this; then again, he misspells "orthography" consistently - ortography is not usage. Enough. Septentrionalis PMAnderson 18:41, 28 June 2008 (UTC)
- Point 1. your insinuations about my spelling are pathetic! (A Bushmen is making jokes with Chinese how short he is)Considering the fact you are monoglot and unproven "expert" for any issue. who wants to show himself great "expert". So far-Mission failed!! Point 2. ortography or orthography -whatever.Find me that manual , Mr. Big Expert! I am waiting entire month! --Áñtò | Ãňţõ (talk) 07:32, 29 June 2008 (UTC)
- For me it is not important which solution will win, but it is only important that we will have 1 rule for all languages. If there will be vote for any universal solution please "call me"--Rjecina (talk) 20:54, 28 June 2008 (UTC)
- There shouldn't be one rule for all languages; each language handles things differently.--Prosfilaes (talk) 12:36, 1 July 2008 (UTC)
Diacritics are in fact different letters in some languages
Hi, I would again like to point out a certain thing about alphabets such as the Swedish, Finnish, Norwegian and Danish ones. There are letters like Å, Ö, Ä, Ø there. The fact is, they certainly can't be considered just "accented versions" of the other ones. That is like implying Q is an accented version of O because it just has an additional dash there. Their appearance has nothing to do with the way they are pronounced.
Swedish perhaps provides the best example. If a word has both Å and Ä, they would both be rendered as A here. Very wrong. They are completely different letters whose pronunciation is different. Ä is more like "/ee/" and Å "/oo/".
The article Kimi Räikkönen was some time ago moved to Kimi Raikkonen. I opposed this move with the fact his official name in all papers is Räikkönen. You can't change other person's name here in Wikipedia if his name is other in all legal documents. Again the fact they are different letters, Räikkönen is a last name of 936 persons and Raikkonen of 16 persons ([1]). So it is a different lastname, you can't change it.
I'm sure the problem exists in other languages as well, since in Vietnamese d and đ are different letters as well, but I'm most familiar with these languages I brought examples of. Fully supporting the usage of "diacritics" and proper rendering of names as we have Unicode and redirects here. --Pudeo⺮ 12:07, 25 June 2008 (UTC)
- The Finnish and Swedish Wikipedias should of course differentiate as they in fact do; but we are the English Wikipedia. We should, in such cases, include the foreign spelling as information, and differentiate when reliable English sources find it necessary to do so. (Quite often they do: Åland Islands is the conventional spelling.)
- Nor are Scandinavian languages alone in this; the Os in Orion represent ω; the O in Odysseus represents ο: different letters, with different sounds, in Greek. But English does not distinguish; we, the English Wikipedia, need not, and should not. Septentrionalis PMAnderson 12:38, 25 June 2008 (UTC)
- en.wiki is Wikipedia in English, not an anglophone-POV Wikipedia. Are you able at all to distinguish those two phrases?
- No, different Wikipedias should not be different. Unfortunately, a lot of articles (related to politics/history) are de facto the POV of certain nations. But we should make an effort to eliminate them.
--Anto (talk) 15:21, 25 June 2008 (UTC)
Yes, Septentrionalis, your comments on Orion and Odysseus are completely off the mark: these are completely assimilated English names. Of course we don't and shouldn't use diacritics, except in their etymology. Personal names which are not assimilated into English are an entirely different matter. It's like the difference between writing an English-derived word in kana or rōmaji in Japanese. I don't understand why people insist on confusing these concepts. kwami (talk) 18:53, 25 June 2008 (UTC)
- So are many tennis players. What evidence can there be that a name is fully assimilated, and perhaps altered in the process, but usage? (I pass by, as inconsequential, the detail that the assimilated form of the second name is Ulysses; Odysseus is a nineteenth-century learned correction.) Septentrionalis PMAnderson 15:40, 26 June 2008 (UTC)
Indeed this is the English Wikipedia, but not every name in the world is in English. That's why different letters are used as well (of the Latin alphabet) because there simply can't be any substitutive letters. --Pudeo⺮ 19:30, 25 June 2008 (UTC)
- Most names, including many with diacritics, are spelled the same in English as in the original language. Some are not. Whether a given name is is a question of fact; the way to answer it is to look at what English does with the name, not at which letters are involved (as this proposal would do). Septentrionalis PMAnderson 15:52, 26 June 2008 (UTC)
- A personal name is a fixed thing you get at birth. If it has diacritics, they may be lost through emigration in a country ignoring them as in the case of the current French president. The name’s owner may change it her/himself, like taking an artistic name or a pseudonym for public appearance. In all other cases, i.e. nearly all, a personal name is neither alterable nor translatable.
- About sources, do not forget that, although Unicode has been technically available here for over a decade now, keyboard drivers have not followed (e.g. many of us have a key for an acute accent but it works only on vowels). I do not think that Wikipedia should reproduce the sloppiness of others who could not render (or did not bother rendering) a personal name in its original form. I disagree as well with the idea that because people may have been used to see a name without diacritics, they should be served that form. It is like putting up wrong beliefs just because they have been frequently quoted. REDIRECTs are definitely needed but the title page should be simply accurate.
- The issue may be different with loanwords and places - this discussion might be easier if split in 3 parts... Clpda (talk) 17:03, 26 June 2008 (UTC)
- Clpda, do you have any evidence that stripping accent marks is "sloppiness", if so how do you explain the process of anglicisation of words like hotel or should English speaking people still write "hôtel" because they are sloppy? Usage for whatever reason governs English, and if the majority of reliable English language sources strip the accent marks off a word then we should follow their example (see WP:V, WP:OR and WP:NC)
- Names are just the same. Napoleon Bonaparte is usually written that way in English. The name is not usually written in English as it is in French, "Napoléon Bonaparte". Even Encarta strips the diacritic, something they do not do for Lech Wałęsa even though most reliable sources do. I do not think that your position is credible if we are to keep within Wikipedia content and naming policies, which is to use what most reliable English language sources use. --Philip Baird Shearer (talk) 18:15, 26 June 2008 (UTC)
- I explicitly restricted my comment to personal names, so your example of 'hôtel' is not in my line of discussion. I'm perfectly fine with 'hotel'. By the way, I'm not sure that any other language having imported 'hôtel' from French has taken the circumflex with it, even the languages which, contrary to English, are used to diacritics.
- I disagree with 'names are just the same'. However, I fully admit that the name of historical people such as Napoléon could be rendered without its diacritic(s). Other names that were debated above on this page, such as the one of an author of Azerbaijan, are not (yet) historical, whatever his fame within the English speaking world, and should be kept in its original spelling. I understand that drawing the line may be occasionally difficult (what is 'historical' enough?) but the discussion would then be left to their individual pages. If a consensus can be reached for over 95% of the pages concerned, that's already a good result... Clpda (talk) 19:05, 26 June 2008 (UTC)
- What's the line between historical and not historical? Is it the point at which a conventional spelling (which may be either with diacritics or without) becomes most common in English? If not, what is it? and how do we determine it without original research? Septentrionalis PMAnderson 19:20, 26 June 2008 (UTC)
- When they have been adopted or naturalized to English in historical texts. Almost all languages have their own variants for European monarchs. That's okay, but you can't change Formula One World Champion's name without his permission. :-). Hotel is an English word, with French origin. It is fully adopted, thus naturally acceptable without the diacritic as it has been. Not all words are adopted, like Norse mythology Óðr. Then I don't see any point in crippling the word trying to be "English" using the classical Roman alphabet. --Pudeo⺮ 19:30, 26 June 2008 (UTC)
- [Adapting the name of a monarch to one's own language is] okay, but you can't change Formula One World Champion's name without his permission. Says who? On the contrary, we anglophones do both all the time; we always have. (Not Formula One, of course; but the spelling of foreign jousters was much more erratic.) You may prefer more moral languages, in which spelling is regulated by government edict; you are free to do so. If so, do leave us in our sloth and heathen folly. Septentrionalis PMAnderson 19:44, 26 June 2008 (UTC)
I absolutely agree with Pudeo. —Nightstallion 19:40, 26 June 2008 (UTC)
- Then go ahead and establish MoralWiki, where you can impose any commandment that seems good to you. We have a policy to write in English, and a preference for communicating with our readers. Septentrionalis PMAnderson 19:48, 26 June 2008 (UTC)
Oh my God! Aren't all these (more than 2 million articles) written in English??? Have you been thinking they were in Hungarian???
Perhaps we should change this article about English language:
From this
Regulated by: no official regulation
into this:
Regulated by: User:Pmanderson on wikipedia
LOL
--Anto (talk) 19:50, 27 June 2008 (UTC)
- Anderson, your conceit is getting annoying. If you feel you need to insult or make fun of other editors' opinions, I must assume you don't feel your own opinions can stand on their merits. Your habit of repeatedly and evidently purposefully misrepresenting others' arguments is also less than impressive; again, it appears you are unable to address the issue at hand. The more you write, the better you make your opponents look, even when I don't agree with them. kwami (talk) 20:08, 26 June 2008 (UTC)
- I quoted an argument and responded to it; I did not knowingly distort it. If I have, please explain. If Pudeo is not asserting a moral imperative to write most current persons with their birth name, whether it is ever so used in English, I do not understand his position at all. If I do understand it, I see no basis for its binding force.
- That position would, it seems to me, require a nineteenth century WP to use Napoleón; it would require us to use Franjo Tuđman now. The first is an idiom violation; the second is contrary to the explicit wording of this proposal. What are the three of you defending? There is certainly no consensus to always use diacritics; this is, in its way, a compromise proposal.
- I await information. Septentrionalis PMAnderson 01:21, 27 June 2008 (UTC)
- What I meant is: there's Charles XIV John of Sweden, although in Swedish it is Karl Johan, Henry IV of France although in French it is Henri. Okay, it's so in almost all languages due to acceptable historical texts. However, you are not allowed to change my name for example without my consent. It is what it is in legal documents. Kimi Räikkönen is Räikkönen, and in fact removing äö results in a different last name used by 16 people! (see earlier link to Name Register Centre). This is an encyclopedia: while removing diacritics improves nothing (we have redirects), removing them erases the only proper way to call them and factuality. Welcome to Unicode age and an encyclopedia that covers the subjects of the whole world. --Pudeo⺮ 11:49, 27 June 2008 (UTC)
- (Warning, this reply is silly!) "Self-identifying usage" is actually the second criterion used in WP:NCGN when a common English usage cannot be determined. It's definitely a useful tiebreaker, but a common English usage, if it exists, is always preferred as res judicata. A fortiori, many English language publications will leave off the diacritics simply because English basically doesn't use any, so an anglicization will automatically leave them off without being malum in se. Excessive use of foreign words or forms is frowned upon as pretentious and silly, and I would give a lawyer example but res ipsa loquitur. Somedumbyankee (talk) 15:00, 27 June 2008 (UTC)
- The legal distinction here seems particularly weak, because it is as true of the kings as of tennis players. Didn't Bernadotte sign legislation Karl Johan? Then, if this argument were binding, we would have to call the article that, which hardly anyone else does.
Septentrionalis PMAnderson 20:21, 27 June 2008 (UTC)
- As for the main comment there's Charles XIV John of Sweden, although in Swedish it is Karl Johan, Henry IV of France although in French it is Henri. Okay, it's so in almost all languages due to acceptable historical texts. Yes, of course; and it is also true for non-royals, and people within their lifetime. Henry Fuseli was so spelled in his own time, and still is; more recent examples are Stanislaw Ulam and Waldemar Matuska; Novak Djokovic is so spelled by the Britannica Book of the Year, which is as close to a historical text as he is likely to get. Septentrionalis PMAnderson 21:18, 27 June 2008 (UTC)
We're not speaking about the names of persons, whose names were adjusted to the respective languages. Adjusting the names of kings, bishops, popes, patriarchs etc. is the thing that's being done in other languages also.
We're speaking about the names of persons that do not belong to that category.
I don't know about you, but on Croatian Wikipedia, we use redirects to the proper form of original names - in short: we practice what we preach.
None of us here on en.wiki knows the grammar and orthography of all possible languages, or how someone's name is properly written.
E.g.: Lech Walesa, Sissel Kirkjebo, Voros Lobogo, Szekesfehervar, Tirgu Mures, Constanta, Sibenik, Besancon, Guimaraes, Rascane, Kizilkoy, Citroen, Hotel des Invalides, Leixoes, Uniao Leiria.
But we have redirects (what is the purpose of redirects, if not for this?), so we can know the proper name. Proper, correct information. Isn't that one of the necessary conditions of Wikipedia?
Insisting on not using the diacritics is insisting on illiteracy. Kubura (talk) 07:24, 1 July 2008 (UTC)
- If foreign usage is the rule editors on the Croatian Wikipedia have agreed to, all well and good; it may well be what people do when writing Croatian (you tell me), but it is not what people do when writing English, and our Wikipedia policies do not support the use of foreign names that are commonly spelt a different way in English. BTW I note that the article hr:Kristofor Kolumbo is not under hr:Cristoforo Colombo, so you had better change it if ordinary people are under their foreign spellings on Croatian Wikipedia, or do you make allowances for common usage as well over at Croatian Wikipedia? --Philip Baird Shearer (talk) 13:36, 1 July 2008 (UTC)
- Precisely. Also hr:Luj August., regent Francuske, with its odd punctuation, which could also profit from collation with the French or English Wikipedia; this is the Duke of Maine, whose regency was not confirmed; if it existed, it lasted for twenty-four hours. Septentrionalis PMAnderson 13:53, 1 July 2008 (UTC)
We need diacritics like a hole in the head
We need diacritics like a hole in the head. They should be avoided except where absolutely necessary. --Anticipation of a New Lover's Arrival, The 21:19, 22 June 2008 (UTC)
- Well, that was certainly a helpful addition to the dialogue. Unschool (talk) 22:26, 22 June 2008 (UTC)
- Actually, it was exactly as dogmatic as some of the "use diacritics, because they're correct" posts we've had. Let the two extreme positions cancel each other out. Septentrionalis PMAnderson 22:34, 22 June 2008 (UTC)
- As usual, Pmanderson is misinterpreting other people's statements. Nobody has said that diacritics must always be used! That is just one of your fabrications! We insist that personal names have to be spelled as the persons spelled their names themselves. If persons have anglicized their names, they use those forms. So we have John Malkovich, George Radanovich, Gary Gabelich, George Chuvalo etc. The extremism here would be to insist that their names have to be spelled in their original forms (Malković, Radanović, Gabelić, Čuvalo), which makes no sense because they don't use those name forms. So nobody insists on the usage of Croatian forms!! --Anto (talk) 10:44, 25 June 2008 (UTC)
- I don't believe there's a lot to add. We've got diacritics even on city names like Zurich, though the umlaut in that city's name is dropped by the native French-speaking Swiss, and the umlaut is seldom used in contemporary English references to Zurich. It's just daft, and only makes it more difficult to search for names in Wikipedia. --Anticipation of a New Lover's Arrival, The 23:06, 22 June 2008 (UTC)
- Not if appropriate redirects are used. ···日本穣? · Talk to Nihonjoe 02:11, 23 June 2008 (UTC)
- Well, my problem is not with finding articles, it's with reading them. The whole purpose of WP:UE is to make this English-language encyclopedia comfortably readable for all persons who read English, not just the small percentage (yet overrepresented on Wikipedia talk pages) of readers who are comfortable with non-English characters. No one who speaks languages other than English is hurt by Vossstrasse (they already know about how English traditionally handles eszett), but many readers will be thrown off by a spelling that their best guess tells them is VoBstraBe. Those redirects do come in handy—and they should be used to help the person who types in Voßstraße. Unschool (talk) 02:45, 23 June 2008 (UTC)
- Can you read them properly with transliteration? No, you can't. I can bet that you pronounce 99% of the foreign names improperly. So what is the big deal??
--Anto (talk) 17:04, 23 June 2008 (UTC)
- Although Diacritic does mention it (not sourced), ß is not what I consider a diacritic. There is no mark to modify the S; it is a base character. While it is of the Latin script, it probably should not be dealt with by this proposal. 207.46.92.16 (talk) 04:07, 23 June 2008 (UTC)
- I think that's a semantic quibble. Eszett (ß) poses almost identical problems for readers of English not familiar with German. If ß does not fit the proposal as named, I think it would be better to rename the proposal so that it covers ß to your satisfaction. --Anticipation of a New Lover's Arrival, The 09:35, 23 June 2008 (UTC)
- Surely not identical problems - eszett doesn't look anything like double "s", whereas letters with diacritics do look like their alternative forms (readers quickly get used to "not seeing" diacritics they want no truck with). The distinction between the two cases is made in the proposal. Actually, though, much current practice on WP seems to be to include the non-diacritic extended characters like the eszett, Icelandic thorns, Croatian crossed d's and so on. Just to make it quite clear, implementation of the proposal as it stands would actually lead to significantly fewer foreign squiggles in Wikipedia, not more.--Kotniski (talk) 10:00, 23 June 2008 (UTC)
'ß' is indeed not a diacritic, but a ligature. Until about the beginning of the 19th century, in Britain and France as well, a single 's' had the form of, more or less, an 'f' without its horizontal bar. This explains the left side of the letter and then, when you ignore the top right (the ligaturing part), the bottom of the right side looks like a modern 's'. This ligature was in use in other European languages too at the time, but disappeared, except in Germany and (partly) in Austria. Switzerland - another country where German is spoken by a substantial part of the population - abandoned it at the same time as 'Fraktur', often called 'Gothic script'. It is absent from a Swiss keyboard and was (probably - that's so far back that I don't remember) already absent from typewriters. In addition, 'ß' is always converted to 'ss' in indexes, search arguments and so on. So it is almost an aesthetic display choice, like 'œ' vs. 'oe'. Clpda (talk) 22:23, 29 June 2008 (UTC) (additions/corrections Clpda (talk) 12:34, 30 June 2008 (UTC))
- So would implementation of present guidance; the problem is not guidance, it is a handful of nationalist editors who will run to forms familiar to them whatever English does. The downside is that this proposal would, since it relies on common usage for the character, not each individual word, ban diacritics where we should use them, and require diacritics where we do not. Septentrionalis PMAnderson
- Haha . Can you read any foreign name which is not English??? --Anto (talk) 17:02, 23 June 2008 (UTC)
- I think that's kind of the point of WP:UE, you only have to be able to read English to use it. Somedumbyankee (talk) 17:13, 23 June 2008 (UTC)
- It is, after all, the English Wikipedia. Where there is a well established English usage, we should follow it. --Anticipation of a New Lover's Arrival, The 18:21, 23 June 2008 (UTC)
- Sorry, guys, but you have missed the point. There are some things that you won't be able to understand no matter how good your English is, such as Differential (calculus), which is not easy to understand. And sorry, we can't simplify it for you with addition and subtraction.
For simplified issues, search the books for children. --Anto (talk) 04:56, 24 June 2008 (UTC)
- See Wikipedia:Make technical articles accessible. This isn't technical, but the same logic applies. I would like to believe that you shouldn't need a PhD to use wikipedia, but I am, after all, an American of no particular intelligence. Somedumbyankee (talk) 15:21, 24 June 2008 (UTC)
- This is not a technical issue of whether to use diacritics for some names. We can use them: the wiki software allows it. For Latin-script names there is no need for transliteration. One thing you have to admit: some things will never be clear to you. Some things you will never understand. Same thing for me. But you cannot distort facts in order to make them intelligible to you. Reality is as complex as it is. Some things might be complex. What is the cure for that?? LEARNING, LEARNING, LEARNING! Don't blame Newton because you don't understand calculus! Do not blame Bruce Lee because you don't know karate. And don't blame Germans because you can't pronounce German names. It is only YOUR fault. If you don't understand something and don't want to know, go away from it. Don't be destructive. --Anto (talk) 17:41, 24 June 2008 (UTC)
- Indeed - and good point. Orderinchaos 17:55, 24 June 2008 (UTC)
- (dropping indent count) Actually, I think he missed my argument entirely. The policy I pointed at was about making the articles accessible to the "average user" and has nothing to do with "technical problems". I am neither ignorant nor anti-intellectual and I pronounce German reasonably well from many years of singing it (the average jelly donut would probably find my pronunciation stilted, though). Wikipedia is an encyclopedia built on reliable sources, not the place to push proper spelling because "all of the terrorists / communists / fascists / methodists / Hedy (HEDLEY!) Lamar's band of thugs / other people are doing it wrong." Somedumbyankee (talk) 23:00, 24 June 2008 (UTC)
- You have missed my point entirely! We insist on using the name forms that are used by the persons themselves. And these are the facts! How some half-literate journalists call him/her is a secondary, less relevant issue, precisely because [2] everybody can write a book or build a website in which he can claim whatever he wants. --Anto (talk) 10:58, 25 June 2008 (UTC)
- The New York Times, the BBC, the US Department of State, and the United Nations are "half literate"? Uh... yeah. These are all reliable sources about Slobodan Milosevic, and to kick it off, the Serbian Embassy to the United States uses no diacritics. Clearly they're a bunch of idiots who haven't given the topic much thought. Somedumbyankee (talk) 19:38, 25 June 2008 (UTC)
- All verifiable, but hardly reliable. Again, here is where Encarta shows its professionalism and accuracy: Slobodan Milošević. Wikipedia too gets it right, but you never know for how long... 124.102.8.155 (talk) 21:32, 25 June 2008 (UTC)
- So the Serbian Embassy can't spell his name properly, but Microsoft can. Excuse me while I have a good chuckle at that comment. Somedumbyankee (talk) 21:41, 25 June 2008 (UTC)
- Some sources are reliable for certain issues. But not for all. Including the spelling!!! I believe there was never a man in Serbia with name Slobodan Milosevic! --Anto (talk) 17:52, 26 June 2008 (UTC)
- Please read some reliable English language sources such as the ICTY website MILOSEVIC Case Information Sheet(IT-02-54) "Bosnia and Herzegovina" then you will be aware that such a person existed. --Philip Baird Shearer (talk) 19:18, 26 June 2008 (UTC)
- Between some random guy on the internet and the Serbian Embassy, I think it's obvious which one I find more credible. Somedumbyankee (talk) 18:01, 26 June 2008 (UTC)
- It does not change the fact that a man named "Milosevic" did not exist. ICTY might be reliable for the data from his history, but not for the spelling of his name. Just take a look at the book of Carla del Ponte in which she calls "Serbs and Croats sons of bitches" here and here
Are we going to put this statement somewhere (in the articles Croats, Serbs) as a statement from a "reliable source"?? --Áñtò | Ãňţõ (talk) 10:23, 28 June 2008 (UTC)
- By that logic, there never was a man named ar:سلوبودان ميلوسيفيتش, el:Σλόμπονταν Μιλόσεβιτς, ko:슬로보단 밀로셰비치, he:סלובודן מילושביץ', ja:ソロボダン・ミロシェビッチ, or th:สโลโบดัน มิโลเชวิช either. All those people also obviously stupid because they can't spell his name right!!!!one!! English is a distinct language and it has its own customs and traditions and ways to spell foreign words. It has just as much "right" as Hebrew to spell it differently. Somedumbyankee (talk) 14:36, 1 July 2008 (UTC)
If we apply either guideline correctly, because of redirects finding articles should not be a problem. As for the arguments pertaining to difficulty of reading the articles - are we really saying that this article in the New Statesman is fundamentally "less difficult to read" than this one from nine months later, because it doesn't use ö in Schröder? Or is the Guardian's football reporting inherently confusing to English-speakers compared to The Independent's? Are the Economist's articles on Czech subjects easier to understand than those on French ones due to the vagaries of their style guide? When we look at the cost-benefit analysis of using diacritics, diacritics do have benefit to those who understand them; the extent of the cost to readers who do not understand them has not yet been demonstrated. I'm not saying "all diacritics should be allowed because they do less harm than good" - however, the discussion here might benefit from explicit demonstration of how diacritics negatively affect articles, so that we can better focus discussion on specific issues and cases and begin looking at how these problems could be addressed. Knepflerle (talk) 19:06, 23 June 2008 (UTC)
- The Economist's style guide, as a crude rule of thumb, is doing more or less what current guidance would encourage. Françoise is English usage, and we should use it (so would the Economist); Plzeň is not; we use Pilsen, the Economist makes do with an easier method and plumps for Plzen. Septentrionalis PMAnderson 19:48, 23 June 2008 (UTC)
- Do these differences have any impact on our readers' understanding compared to theirs? Do readers cope with the small differences, perhaps in a similar way readers cope with differences in orthography between UK and US publications, or the way The Times uses Lyons and other press uses Lyon, or is there something fundamentally different in the case of diacritics? Are the negative effects provably worse in names from languages outside the French-German-Spanish-Italian axis? What if the form of the diacritics in other languages is the same as these ones - (eg Jana Novotná) - is it the form of the diacritic or the language of origin which is crucial? Knepflerle (talk) 20:18, 23 June 2008 (UTC)
- Is this a demand to adopt this policy unless we can do a scientific study on our readership's comprehension? If so, why should this rhetorical device be confined to this proposal? First show me that our readers benefit from Ngô Đình Diệm; I certainly don't, and I doubt many of the readers of the histories of Vietnam which don't use those diacritics do. Septentrionalis PMAnderson 20:27, 23 June 2008 (UTC)
- And above all, the key question: what's wrong with writing this English Wikipedia in English? Septentrionalis PMAnderson 20:29, 23 June 2008 (UTC)
- I appear to have hit a nerve. I explicitly pointed out above there is no implicit demand in my questions. This simple cost-benefit analysis is what the major pro-diacritic argument boils down to - small positive benefit to small number of readers and no negative effect on the rest is still a small positive effect overall; I am just highlighting this in the hope that focusing on the details of this argument on both sides leads to new directions instead of infinite facile restatements of old chestnuts.
- We've been focusing on your broad-brush "key question" for years now, but there's no agreement yet for a variety of reasons. And yes, readers' comprehension is the correct yardstick against which we should be developing new ideas. What the use of diacritics can add or detract from readers' comprehension is precisely what we should focus on, and this is an invitation for people to do just that. Knepflerle (talk) 20:57, 23 June 2008 (UTC)
- PS: in a discussion on the generalities of using diacritics and in a subsection titled like this one, I think it's important to highlight some inherent contradictions and implementation problems with the eradication of diacritics (I wrote about this in more detail in this post to WT:NC(UE), but note the specific wording of the proposal there was quite different to this). However, I am quite happy in both the ideological consistency and practice of the current WP:UE's usage-based rules, sitting comfortably as it does with the core policy of WP:V - verifiability of orthography, not "truth" being the deciding factor. I support what we've got already over any alternative proposed so far. Hope this clarifies things somewhat. Knepflerle (talk) 21:20, 23 June 2008 (UTC)
- Using Đoković when our readers are accustomed to Djokovic, or Schroder when they are accustomed to Schroeder or Schröder are both barriers to comprehension. How high they are we cannot tell, so the cost-benefit analysis is unperformable, but both should be avoided as far as practicable. We should not assume away real costs, nor should we claim to know what we cannot; both are all-too-common problems with cost-benefit analyses. Septentrionalis PMAnderson 21:17, 23 June 2008 (UTC)
- The first case I might well personally believe. The second I cannot - all three spellings are widespread in English-language literature, just as we expect English-language speakers are accustomed to color/colour and -ise/-ize. Just say, merely for sake of argument, that Schroder were the predominant form seen in English-language sources, say 70/30 over Schröder, I still strongly doubt seeing Schröder would be a "barrier to comprehension", any more than the town of Zzyzx is hard to comprehend because it does not obey standard English orthographical rules. It is hard to believe comprehension is an issue when non-technical commonly-read English-language reliable sources such as the newspapers and magazines highlighted above use the diacritic regularly. It is hard to believe comprehension is impaired to any significant level at the Schröder article when other words might commonly use the same diacritic, say at Göttingen, unless there is a form of transient word-blindness I am unaware of. And yet, in that case we would not use Schröder because of the predominance clause in WP:UE, even though it would be a verifiable, reliably-sourced spelling which might offer extra information to some of our readers without impairing the others. I'm not saying we should change our stance, but it's that kind of situation that has meant the discussion on this topic is still ongoing. Knepflerle (talk) 21:42, 23 June 2008 (UTC)
- It's Schroder and Gottingen that would be barriers to comprehension to a reader who expected the umlauts; majority usage works both ways. Septentrionalis PMAnderson 23:56, 23 June 2008 (UTC)
- You have somewhat missed my point: if the usage split as we measure it is say 60/40, then I doubt readers expect either version, or find either a barrier to comprehension, as they will be regularly exposed to both. Our readers' expectations and understanding are not measured in the black-and-white of our majority decisions, which is why there may be a case for analysing other benefits of particular spellings. Knepflerle (talk) 13:09, 26 June 2008 (UTC)
- I agree that in the case of a 60/40 split (i.e. no obvious or consistent English usage) we should stick to the more complete spelling. My problem with this proposed guideline is that it recommends we use that spelling in 95/5 split cases where the spelling with diacritics is obviously not the common English usage. Somedumbyankee (talk) 14:36, 26 June 2008 (UTC)
- Instead of "no obvious/consistent usage" I would rather call it "parallel dual usage" - just like -ise and -ize spellings are both widespread in the English language canon taken as a whole, and we expect readers to see both here on en.wp just like they see both in English-language world. I agree that the current proposal goes too far, and that a little tweak to the existing WP:UE could account for this case - a sufficiently common level of use is what is important, not whether Google gives 46% or 52% of the results to one spelling (especially given Google's inherent biases, useless optical recognition of diacritics, patchy coverage, poor counting algorithms which lead to incorrect totals...). But that is a discussion for another day, at WP:UE. Knepflerle (talk) 16:32, 26 June 2008 (UTC)
- I also agree. Anyone who argues that we must take the 52% side of a 52/48 split is ignoring the basic justification of WP:UE; our readers will have seen the 48% usage. They are also ignoring the problem that all our search engines result in samples (and samples with unknown biases) of all English usage. What language would you propose? Septentrionalis PMAnderson 16:58, 26 June 2008 (UTC)
- As always, it'll be hard to convey the spirit precisely whilst eliminating ambiguity - I'll have a think, and post to here and UE. Knepflerle (talk) 17:40, 26 June 2008 (UTC)
I think the point about Plzeň vs. Pilsen is off the mark: Plzeň is Czech, Pilsen is English. That's an entirely different issue than whether we write the Czech name Plzeň or "Plzen". I don't care whether we use the Czech or English name. However, I do object to the faux-Czech name "Plzen" because it is imprecise. Such easy-to-avoid imprecision is not appropriate in an encyclopedia. True, many readers won't know the difference and won't care. A few will know enough to supply the diacritic themselves. But there are a large number of us who appreciate seeing the actual name, and who don't know enough to fill in the gaps. Take Ngô Đình Diệm: Readers who don't know how to pronounce that won't be able to pronounce "Ngo Dinh Diem" either; however, those who know enough to work out Ngô Đình Diệm will get it completely wrong if we leave off the diacritics. As for Schroder, Schroeder, or Schröder, are we really going to get into an edit war with every Wikipedia article over which spelling is "most familiar" to which groups of people? Why not just write his name as it's spelled and leave it at that?
We don't need to dumb down Wikipedia on these matters. Any encyclopedia worth its salt shouldn't be dumbed down. Are we interested in emulating the EB here, or have we given up on ever achieving any respectability and are willing to settle for World Book? kwami (talk) 21:35, 23 June 2008 (UTC)
- Most English speakers can write, and have heard, "Ngo Dinh Diem"; they probably would pronounce it with a vile accent, but diacritics will not solve that; it is even less likely to be fixed by an unsourced spelling, which most English speakers have never seen and will learn nothing from (we can include it in parentheses, if any would). It is mere pedantry, and interferes with comprehension, to use spellings our readers have not seen.
- Inventing diacriticed spellings without authority is not merely dumbing down, it is being dumb; this proposal would mandate ignoring the authority of our actual sources to invent terms like Catherine of Aragón.
- As you can probably tell, I have had enough. If any editor supports this who actually has English as his native tongue, do let me know; until then, I utterly oppose this effort by aliens to rewrite the English language for their own convenience. Septentrionalis PMAnderson 21:51, 23 June 2008 (UTC)
- I gladly forgo any added validity of my arguments that depends only on the language of my parents. Knepflerle (talk) 22:13, 23 June 2008 (UTC)
- My native language is English. I'm also not talking about "inventing" spellings. Just the opposite: I'm suggesting that we don't invent spellings by deleting diacritics. I would never write "Catherine of Aragón" because that is English, and the long-assimilated English name of the province is "Aragon". In English, use English. However, when we have a name that is not established in English usage, I think that we should use the actual name. E.g. the provinces of Vietnam, which for the most part are completely foreign to English speakers, and have official Latin spellings, but which are presented here in bastardized form. Just because newspapers are sloppy and drop off the diacritics is no reason for us to be sloppy too. kwami (talk) 00:28, 24 June 2008 (UTC)
- See WP:UE#No established usage. Who are we to say when a word becomes established in English? It sounds like original research. If we see what reliable English language sources use and copy those then we will be following WP:V and WP:NC. For example, should we name the country Romania, Rumania or Roumania? Before WWII it would probably have been Rumania, but current sources suggest Romania is most common. It may be that in the future names like Aragón become the norm, in which case we can change our page name, but until then we should follow the lead of reliable English language sources. --Philip Baird Shearer (talk) 18:30, 26 June 2008 (UTC)
- I also take exception: I'm most definitely not "alien" - English is my first, and only, language. I can handle the alphabets of about two dozen other languages without being able to speak them, and I think most educated people in Australia can as well, as can many who are not but are exposed to them in other ways (in particular South Slavic languages, which are the third largest language minority in my country after Italian and Chinese). Orderinchaos 16:06, 24 June 2008 (UTC)
- They aren't our inventions, they're other people's inventions. My (not so) humble opinion is that if the consensus of authoritative sources is wrong, Wikipedia should be wrong too. It's really the same concern as WP:TRUTH and WP:FRINGE. Somedumbyankee (talk) 01:29, 24 June 2008 (UTC)
- Yes, they are your inventions! You (a couple of users here on en.wiki) are trying to make some non-existent "law" about the English language by imposing rules that do not exist in any university English grammar tutorial. --Áñtò | Ãňţõ (talk) 10:51, 28 June 2008 (UTC)
- We are, after all, relying upon the authority of the best sources in English, not World Book: the New Cambridge Modern History uses Ho Chi-minh (XII, p. 325); Oxford DNB uses Ho Chi Minh (Kingsley Amis); so do our competitors. Who uses "Hồ Chí Minh"? If it is common usage in comparable sources to spell the provinces with diacritics, fine. But we should not redesign English. Septentrionalis PMAnderson 01:41, 24 June 2008 (UTC)
- (I didn't say we were relying on World Book.)
- Should we at least retain all diacritics if we place a name in italics as a foreign name? kwami (talk) 02:06, 24 June 2008 (UTC)
- Yes, as we should represent any foreign word correctly. But we should only do so when necessary; unnecessary foreign words are showing off, like the travel writers who displayed their German by using Bahnhof where "railway station" would have done just fine. For one thing, foreign letters, even single letters with diacritics, can render as little square boxes; I can testify that the same is true of accented Greek, and therefore the FA to which I largely contributed alternates on using the smooth breathing. Septentrionalis PMAnderson 02:47, 24 June 2008 (UTC)
- "are we really going to get into an edit war" - well according to WP:UE we should research the usage at every talk page and use that as a binding decision. The edit wars are the unfortunate occasional consequence of conflict with editors' opinions. Normally if one spelling is predominant then using the other would impede understanding, but in cases like the one I mention above the predominance might not give any extra clarity but still cause loss of information useful for others. Whether we can develop a guideline that eliminates this possibility and still satisfies WP:V by not using spellings undocumented in English-language texts is an open question. Knepflerle (talk) 21:53, 23 June 2008 (UTC)
- In most cases it is clear what reliable English language sources indicate as common usage. That leaves two categories: WP:UE#Divided usage, "When there is evenly divided usage and other guidelines do not apply, leave the article name at the latest stable version. If it is unclear whether an article's name has been stable, defer to the name used by the first major contributor after the article ceased to be a stub", and WP:UE#No established usage, "...follow the conventions of the language in which the entity is most often talked about (German for German politicians, Turkish for Turkish rivers, Portuguese for Brazilian towns etc.)." --Philip Baird Shearer (talk) 19:18, 26 June 2008 (UTC)
Wow. This discussion must have been going in circles for fully four years now, without moving an inch forward in terms of reason or common sense. The only guideline we need is "check usage in English language WP:RS", end of debate. A good example of a case where diacritics are actually useful is Pāṇini (not an Italian sandwich). There are lots of English language sources that give Sanskrit terms in full IAST, no debate there. Catherine of Aragón, on the other hand, is an excellent example of what not to do. WP:RS, WP:UCS, all further debate on a case-by-case basis please. dab (𒁳) 11:49, 28 June 2008 (UTC)