Wikipedia talk:Naming conventions (Vietnamese)/Archive 1
This is an archive of past discussions on Wikipedia:Naming conventions (Vietnamese). Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Diacritical marks
The main Wikipedia:Naming conventions (use English) page says that diacritical marks should not be used unless they are familiar to English readers. The whole idea behind transliteration is that an English-keyboard user should not be required to figure out how to type the Vietnamese diacritical marks. Yellowtailshark 02:45, 29 May 2007 (UTC)
- The "Đ" is not pronounced like an English "D" in either northern or southern Vietnam. This creates a problem for words with this letter. For cities with well known romanizations like Saigon and Hanoi, we'd probably want to use our normal WP rule, like we do for Milan or Rome--just use the English version, without diacritics. But for names, it seems that using diacritics is good. Badagnani 03:53, 19 September 2007 (UTC)
- Quite honestly, I see no place for diacritics at all in any of the articles in English Wikipedia except as a gloss to illustrate the Vietnamese spelling of a place name or proper name after the first occurrence of the name in its normal, unaccented English spelling. This is not a question of political correctness (respect for how others spell names), it's a question of simple convenience for the vast majority of readers who neither know nor care about how Vietnamese names are accented and are not interested in obtaining the fancy software to be able to type in Vietnamese. I find it difficult to locate the articles I wrote recently on the 1860s Cochinchina Campaign because place names like Vinh Long and Bien Hoa (their normal spelling in English) have been given accents. This is supposed to be an encyclopedia written in English, not Vietnamese. Having said that, we should gloss all Vietnamese place and personal names with their accented versions on their first appearance in an article.
- This is how I personally have been dealing with the problem, in the lead sentence of my articles:
- The Capture of Bien Hoa (Vietnamese: Biên Hòa) on 16 December 1861 was an important allied victory in the Cochinchina campaign (1858–62).
- Diacriticals are used for all the European languages. See Gerhard Schröder, Horst Köhler, Hermann Göring, Göttingen, Lübeck, Finistère, or Lech Wałęsa. Finding the articles? That's what redirects are for. Kauffner (talk) 19:09, 25 July 2009 (UTC)
- Diacritics are used for European languages when English uses them. This is sometimes but not always; the most obvious example being George Frideric Handel, not Händel. Usage should prevail; some of these examples should be changed. Septentrionalis PMAnderson 16:05, 12 July 2010 (UTC)
Knowing the history of Vietnam and its relations with other countries, especially English-speaking countries, since we're talking about an English Wikipedia, I'd guess that there are two major bodies of literature that talk about Vietnam: literature written during the Indochina Wars, and literature written after the US embargo was lifted and Vietnam's relations with other countries became normalized. So when I weigh in on the "common usage" argument for having diacritics, I suspect that literature written during the war would omit the diacritics, and that if there is diacritic usage in English media, then that is really a more recent phenomenon. It would take exhaustive statistical research to count proper nouns in all articles and books written about Vietnam during the war to see what names and geographical locations were often mentioned. My Lai, Ngo Dinh Diem, Bien Hoa and Lam Son seem like names that were mentioned often enough in media and books to establish the non-diacritic forms as common usage. But with, let's say, a hamlet in North Vietnam, where media coverage during the war was limited, it's not so clear-cut. Spelling conventions for cases like those will likely come from more recent sources (past 20 years), and it is quite possible that the diacritic convention would dominate. yellowtailshark (talk) 12:18, 30 July 2010 (UTC)
Article title
I looked at a few government sites that use áo dài and ao dai. The US Embassy in Canberra, Australia uses ao dai. So does the HCMC People's Committee for its English-language pages. The site uses áo dài for its Vietnamese-language sections. It seems to me that omitting diacritics will become the norm in official sources. Yellowtailshark 05:18, 19 September 2007 (UTC)
- "ao dai" appears in various English-language dictionaries, so this is a common usage name. Kauffner (talk) 07:55, 13 July 2010 (UTC)
- The official website of the government of Vietnam has recently been writing names with diacritics in its English section. DHN 05:45, 19 September 2007 (UTC)
In the case of Cả River when I googled for it, the only reference was the Wikipedia article itself (hah! go figure). Ca River is mentioned in the Encyclopedia Britannica as well as this paper from the National University of Laos. Yellowtailshark (talk) 04:33, 14 May 2008 (UTC)
- As a model, scholarly usage beats official usage or newspaper usage. In scholarly writing, you put the diacriticals in as long as there are no technical barriers. Wikipedia doesn't have the technical barriers that prevented people from putting in diacriticals historically. For, say, German, the diacriticals go in, period. It doesn't depend on sources or official Web sites. It's Göttingen, Lübeck, and so forth. You can write good German without diacriticals, but Vietnamese without diacriticals is just gibberish. Kauffner (talk) 04:39, 30 June 2008 (UTC)
- We are not discussing the usage of German in English; and if we were, this would be wrong on both counts. English does not always use diacritics, and scholarly usage is not our model; our article titles are chosen for lay readers, not for specialists. Septentrionalis PMAnderson 19:11, 12 July 2010 (UTC)
- Let us, therefore, consider actual parallels. We do not - and should not - include Greek accents, ancient or modern, in article titles; we do not - and should not - include pinyin tones. In both cases, we indicate the marks once in a transcription of the Greek or Chinese characters. So here. Septentrionalis PMAnderson 22:00, 12 July 2010 (UTC)
- You do realize that Vietnamese speakers use the Latin alphabet with diacritics? The situation is therefore parallel to German and Polish, but unlike Greek or Chinese, which have their own characters. If a publication doesn't use diacritics for technical reasons, it cannot be accepted as a model with respect to this issue. Of course English does not always use diacritics. Names that are common usage in English -- Hanoi, Saigon, Vietnam, etc -- should remain unaccented. Kauffner (talk) 07:55, 13 July 2010 (UTC)
- Yes, I do. That's why I'm comparing it to Pinyin (which has two sets of optional marks for the tones), not to Chinese characters. But even Latin alphabetic languages - perhaps especially they - are respelt on adoption into English; Novak Djokovic, the Djoker, and Handel are two clear examples of this. It may be that in a few centuries or even decades the diacritics of Vietnamese will be as familiar, and as widely adopted, as those of French; but I don't believe, and see no evidence, that that time has come. Wikipedia is not a crystal ball. Septentrionalis PMAnderson 01:50, 14 July 2010 (UTC)
Proper nouns
Names
I wonder how useful these templates would be? Template:Vietnamese name and Template:Vietnamese name2 Yellowtailshark 03:30, 29 May 2007 (UTC)
- I think these are very useful, firstly because most people would not know that it is the custom to use given names in the ensuing text, and because many of those who are not Asian specialists but who run categorizing and standardizing campaigns will need to know these things, so as to not list someone alphabetically by their given name, or to make other sorts of category/template/standards related mistakes. LordAmeth 14:33, 29 May 2007 (UTC)
Family-name, middle-name, given-name order? Or given-name, middle-name, family-name order? It seems for Vietnamese within Vietnam, the family-name is given first. But for those outside of Vietnam, you will also see the given-name first. Perhaps we should stick with the name order that the person is most commonly referred to as. Yellowtailshark 03:53, 30 May 2007 (UTC)
Locations
I see that the names of cities are sometimes spelled joined and sometimes not (e.g. Hà Nội → Hanoi; but Đà Nẵng → sometimes Danang, sometimes Da Nang). Any thoughts on this? Yellowtailshark 03:34, 29 May 2007 (UTC)
- I am deeply interested, but not very knowledgeable or experienced in Vietnamese history, so I don't presume to speak from knowledge of what the scholarly standards may be. But on Wikipedia, I believe that we should try as much as possible to place things in the format most recognizable to the average English speaker. Our average reader is likely to have heard of Hanoi and Tonkin and Saigon, and so these places should be represented in the spelling most common in English; other places like Can Tho and Hai Phong I at least have not heard of, and so perhaps these (and the multitude of more obscure places) should be represented however is most proper in Vietnamese. Since Vietnamese is written in Roman letters (with diacritics, but not in Chinese characters or another writing system), I would imagine there ought to be standards within the Vietnamese language as to this issue, no? LordAmeth 14:33, 29 May 2007 (UTC)
- Perhaps what might help is to use the spelling from the city's official website. With regard to Danang, they consistently use it without the space. Yellowtailshark 02:58, 30 May 2007 (UTC)
- Proper Vietnamese is with diacritical marks and spaces between the syllables, e.g. Đà Nẵng, not Danang or Da Nang. English-language usage isn't created by the city's official Web site. If there is a well-known English-language spelling, for example "Saigon" or "Hanoi," that should be used. Otherwise, we should follow Vietnamese usage. Kauffner (talk) 04:54, 30 June 2008 (UTC)
- There is a whole Vietnamese Wikipedia, for which proper Vietnamese spelling matters. Otherwise, this is an encyclopedic fact, which should indeed be mentioned - once per article. Septentrionalis PMAnderson 16:34, 12 July 2010 (UTC)
Monarchs
Sometimes I see King An Dương Vương used, even though vương means "king". Would this be redundant? Should it just be King An Duong?
- The standard on Wikipedia is to not include titles in article titles. The article should thus be listed at An Duong or An Duong Vuong but not at King... anything. As for how he is referred to later in the article, I'd vote for king only because it's a term widely understood and recognized in English, and because it is widely accepted as the term used to refer to these rulers. There are plenty of other terms (shogun, Opperhoofd, Shah) which do not easily translate to a single term like "king", and those I think can certainly be used as is. However, the more obscure a term is, the more necessary it is to translate or explain it briefly in parens whenever used. In other words, if we are going to start articles with "So-and-so was a vuong of Annam in X year", then there really needs to be a "(King)" right after vuong. LordAmeth 14:33, 29 May 2007 (UTC)
- Although Wikipedia:Naming conventions (names and titles) says not to use titles, it also mentions that it does not apply to East Asian monarchs. According to Wikipedia:Manual of Style (Japan-related articles)#Names of emperors, "Emperor" is an integral part of the name. However, Vua (Emperor) Bao Dai doesn't have any sort of honorific titles in the article name. Then again, Bao Dai isn't the real name, but an imperial title for the era of reign. Likewise, it seems Vương (King) is an integral part of the imperial title. Which would suggest that we translate An Dương Vương as the An Duong King or King of An Duong (his real name was Thục Phán). An Duong, it seems, was a toponym. Yellowtailshark 18:57, 30 May 2007 (UTC)
- Oh, I see. And, of course, now that you mention it, I should have realized that all the Japanese, Chinese, and Korean monarchs do have the title included. Sorry. LordAmeth 22:36, 30 May 2007 (UTC)
- Vương and vua both mean "king". "Emperor" is hoàng đế. Vietnamese generally say vua. It's Vua Bao Dai (King Bao Dai), Các Vua Nguyễn (Nguyen dynasty) and so forth. Kauffner (talk) 03:36, 30 June 2008 (UTC)
Chinese and Pinyin transliterations
I do a lot of work on the Chinese/Vietnamese prehistory and ancient history articles. Because the two modern societies share quite a bit in common in terms of their ancient pasts, I use the Template:CJKV to standardize transliterations. One issue is, however, which name to use, Zhao Tuo or Trieu Da for the main article title? Yellowtailshark 03:30, 29 May 2007 (UTC)
- Vietnamese person -> Vietnamese name. Chinese person -> Chinese name. Zhao Tuo was Chinese. DHN 03:36, 29 May 2007 (UTC)
Copied over from /Tasks
- Comment - For articles about Vietnamese Americans who don't use diacritics when spelling their name, that's easy--we do the article titles without diacritics (though we can include them in the first paragraph). But for province names, for example, we have some with and some without. It might be best for these if we arrive at a consensus regarding one way of doing it. Badagnani 07:36, 18 July 2007 (UTC)
- Working on putting together Wikipedia:Naming conventions (Vietnamese). Yellowtailshark 03:45, 19 September 2007 (UTC)
Biography standards
I've been working on the biographies lately, so I have thought of some standards:
- Title is the name in non-diacritical form. Reasons: For any given Vietnamese name, published English-language usage will be nearly 100 percent without diacritics. This includes not just the news media, but also sources that you might think would include diacritics: Britannica, Columbia, Vietnam News Agency, and National Geographic. The title establishes normative use: It tells the reader that this is an acceptable English-language form. There is no significant use of Vietnamese diacritics in published English and we should not mislead the reader in this regard. The non-diacritical form of the name should certainly appear somewhere in the article. The title is the most logical place for it since a typeable title makes searching and linking easier. Unlike printed encyclopedia articles, Wiki articles often function as stand-alone works. This makes our article titles analogous to book titles. Book titles rarely use special characters, and certainly don't use Vietnamese diacritics.
- The version of the name with diacritics goes in the opening and is put in boldface, per WP:MOSBIO. This avoids opening the essay with an awkward construction along the lines of "Le Quy Don (Vietnamese: Lê Quý Đôn)".
- The name on top of the box should be given in the opposite style from the title, i.e. normally with diacritics.
- Running text should be free of diacritics that are merely decorative and do not serve an instructive purpose. For example, there is no need to repeat the diacritic version of the subject's name if this has already been given in the opening. Excessive diacritics in the running text create clutter and strain the eye. Kauffner (talk) 15:06, 25 July 2011 (UTC)
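For anyone wanting to check the title forms described in the list above mechanically, here is a minimal sketch, assuming Python and its standard unicodedata module. The function name strip_vietnamese_diacritics is hypothetical and is not part of any Wikipedia tool. It only removes tone and vowel marks and maps đ/Đ to d/D; it does not decide whether syllables should be joined (Hanoi versus Ha Noi) or whether an established English name exists.

```python
import unicodedata


def strip_vietnamese_diacritics(name: str) -> str:
    """Return a plain-ASCII-letter form of a Vietnamese name (illustrative helper only)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks (Unicode category "Mn"): tones, circumflex, breve, horn.
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # The barred d has no Unicode decomposition, so it is mapped explicitly.
    return without_marks.replace("đ", "d").replace("Đ", "D")


if __name__ == "__main__":
    for full_name in ["Lê Quý Đôn", "Biên Hòa", "Trương Tấn Sang"]:
        print(full_name, "->", strip_vietnamese_diacritics(full_name))
```

A check like this could flag articles whose title and boldfaced opening name do not correspond, but the choice of title itself still rests on usage in sources.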
Updating the guidelines on diacritics
This page has yet to be updated to reflect the vote that was taken back in July. It's still basically a list of reasons not to use diacritics. There was a unanimous vote in favor of using anglicized forms when they are "in common use" (i.e. Hanoi, Ho Chi Minh, Vietnam, etc.). (See Question 1). But for other cases the vote was 3-2 in favor of using the Vietnamese form, or at least that is how I interpret Question 9. Kauffner (talk) 14:04, 2 February 2011 (UTC)
- I have taken this discussion as an indication of Wiki community views and made appropriate changes in the guidelines. Article titles are read by a broader group of people than the actual articles, so our use of diacritics should reflect that. Kauffner (talk) 05:47, 25 June 2011 (UTC)
- The link appears to be broken. Can you repair it and add the key sentence where it mentions Vietnamese please? In ictu oculi (talk) 03:00, 28 June 2012 (UTC)
- fixed now. --KarlB (talk) 19:01, 17 July 2012 (UTC)
I see the discussion above, and I have slightly more sympathy for removing tones in Vietnamese than for removing accents from French people, but there needs to be a WP:common sense line drawn somewhere. It is very odd to remove the tone from a major cultural item like ca trù. As I just mentioned to Kauffner I have never seen ca trù without the accent before - it is, as far as I have seen, always used on CD covers: CD1 CD2. The tone is also used by UNESCO. ... In this case I don't think ca + tru actually produces a new meaning, but tru without the tone looks like the verb "stay" which is a bit offputting. In ictu oculi (talk) 12:59, 24 June 2012 (UTC)
- Also what is going on with Traditional Vietnamese musical instruments and category:Vietnamese musical instruments? Should Dan Tinh be at đàn tính, or tính tẩu? In ictu oculi (talk) 14:04, 24 June 2012 (UTC)
- If I wanted to create an article on a subject related to Vietnamese music, the sources I would check are VietnamNet Bridge ("ca tru" site:english.vietnamnet.vn), Viet Nam News ("ca tru" site:vietnamnews.vnagency.com.vn), and VOV ("ca tru" site:english.vov.vn). These sites have hundreds of stories where "ca tru" is spelled without diacritics. All three are based in Vietnam, so there is no technical barrier that would prevent them from using diacritics. The general policy of Wikipedia is to "follow the sources" and to use the best sources available, preferably English language. We don't "correct" our sources by adding marks that they don't have. As for CDs, Amazon's top-selling Vietnamese music CDs are here and here. The most authoritative academic source on music is New Grove. I can't be sure how they spell "ca tru". But as they don't use diacritics for Polish, I assume that their policy is to use Latin-1 diacritics only. Kauffner (talk) 04:42, 18 July 2012 (UTC)
- Anyone will tell you that New Grove is absolutely not the most authoritative source for World Music, Garland is. I wasn't aware that New Grove wasn't a reliable source for Polish names until you mentioned that, but they certainly are a reliable source for Polish composers, spelling of names apart.
- Anyway, you say "The general policy of Wikipedia is to "follow the sources" and to use the best sources available," so on what basis is a diacritic-disabled website like Viet Nam News, which doesn't spell the new French president's name correctly, a more reliable source for the spelling of Vietnamese musical instruments and other terms in South-East Asian musicology than The Garland Handbook of Southeast Asian Music?
- (Perhaps this essay needs a tag on the header saying it is under discussion). In ictu oculi (talk) 07:48, 18 July 2012 (UTC)
- Umm?? New Grove has Đàn ty bà and Charles Bodham. “Witold Lutosławski.” The New Grove. Why do you say New Grove doesn't? In ictu oculi (talk) 07:52, 18 July 2012 (UTC)
Vietnamese diacritics on the main page
Squeamish readers shield your eyes but there are Vietnamese diacritics on today's main page (even though the article title does not include them). What gives? — AjaxSmack 02:35, 24 July 2012 (UTC)
- The article uses non-diacritic forms all through the text: Ngo Dinh Nhu, Nguyen Khanh, Ngo Dinh Thuc, etc. Full name boldfaced in the opening is the style I proposed above. Kauffner (talk) 05:58, 24 July 2012 (UTC)
Christina Schwenkel preface
This is from Christina Schwenkel, The American War in Contemporary Vietnam: Transnational Remembrance and Representation. Bloomington: Indiana University Press, 2009, and may be helpful:
NOTE ON USE OF DIACRITICS
Vietnamese is a tonal language written in an adapted version of the Latin alphabet with additional diacritical marks to signify particular tones and vowel qualities. Without these diacritics, the meaning of a Vietnamese word is ambiguous. For this reason I have chosen to include diacritical marks in this book to most accurately represent terms, locations, and people's names.
However, at the same time, I recognize that diacritics may prove distracting to those unfamiliar with the conventions of the language. Taking into concern both specialists and generalists who may read this book, I opted to keep all Vietnamese diacritical marks with the exception of widely known geographical names such as Vietnam, Hanoi, Ho Chi Minh City, and Saigon. I also removed the diacritics for familiar Americanized phrases, such as “Viet Cong” or the “Ho Chi Minh Trail.” Vietnamese who have migrated to other countries often drop the diacritics from their proper names. I thus refer to individuals according to their own practice, and according to their choice in name order (in Vietnam surnames are placed first).
While I recognize potential inconsistencies in my own practice here (for example, Ho Chi Minh City versus President Hồ Chí Minh), I feel this is the most reliable solution for making the text accessible to all audiences.
In ictu oculi (talk) 00:28, 20 July 2012 (UTC)
- Amazon lists almost 14,000 books that discuss the Vietnam War. This is one of them. Kauffner (talk) 01:18, 20 July 2012 (UTC)
- The point is that this is one with a page explaining why English exonyms like Vietnam, Hanoi, Ho Chi Minh City, and Saigon are not given full Vietnamese spelling like Hồ Chí Minh or Ngô Đình Diệm which are not exonyms. In ictu oculi (talk) 01:54, 20 July 2012 (UTC)
- She admits that this is "my own practice". There is no claim that what she is doing is representative of any category of English-language material. GBooks has far more examples of "Ho Chi Minh", the allegedly no-established-usage dude, than for the "familiar Americanized phrases" Ho Chi Minh City and Ho Chi Minh trail, as you can see here. Kauffner (talk) 03:45, 20 July 2012 (UTC)
- This is a very well-considered piece about how to resolve the diacritics issue - I heartily endorse it and hope this approach will be adopted by Wikipedia. Colonies Chris (talk) 17:26, 23 July 2012 (UTC)
Kauffner, did you read the information that was posted? Please stop being so dismissive of what other people are contributing; this source demonstrates the need for the diacritics in the proper pronunciation of the word. Sure, you might not come to wikipedia looking for that, but I can guarantee you that someone out there is. Keeping it in the article title will make their life that much simpler, and this practice is standard pretty much everywhere at this point. Zaldax (talk) 19:19, 27 July 2012 (UTC)
- It is not about "keeping it in the article title". The Vietnamese titles are predominantly at non-diacritic forms now. Not many readers will be familiar with the Vietnamese alphabet, and it's tricky to figure out pronunciation from the spelling even if you do know it. Pronunciation can be explained in the opening of the article. Not all relevant information needs to be in the title. The title should tell the reader what the subject is commonly called in English, not multitask. As for the above, "very well-considered piece," it is simply an author's explanation of her own usage. This book is most obscure and its usage far out of the mainstream. English-language GBooks has tens of thousands of examples of "Ho Chi Minh", two or three for "President Hồ Chí Minh." Britannica and Columbia both use diacritics for European languages, but not for Vietnamese. Vietnamese has far more diacritics than any European language, and not even National Geographic puts them in. Joe Wikipedia editor does not know better than the people who deal with this issue professionally. Kauffner (talk) 00:44, 28 July 2012 (UTC)
- Kauffner, I only saw this this morning when P.T. Aufrette linked to it: "I can boast of moving the Vietnamese bios and geography to non-diacritic titles -- It's hundreds of titles and took me several months to do. Kauffner (talk) 12:52, 14 November 2011 (UTC)". Given that you have only actually created 1x Vietnam article (which ironically you argued for Trương Tấn Sang), what gives you the license to move 950+ articles under your own name and as many hundreds again misusing the dbmove tag with "uncontroversial moves" requests to admins (which shows in the dbmove log), and then lock the undiscussed moves by editing the redirects? And what is so terrible about Hồ Chí Minh (3 diacritics) compared to Cúán úa Lothcháin (4 diacritics), Lech Wałęsa (2 diacritics), Antonín Dvořák (3 diacritics), Ľudovít Štúr (4 diacritics), József Eötvös (4 diacritics), Andris Bērziņš (Latvian President) (3 diacritics) or this lot? Why should all English, Irish, West European, East European, Scandinavian, Turkish, Albanian, Hawaiian bios have the fully-spelled name according to reliable-for-statement-being-made sources in title, lede and where possible in text, but not Vietnamese, after your 2,000 or so undiscussed moves? In ictu oculi (talk) 01:22, 28 July 2012 (UTC)
- Re: National Geographic, there is a fairly obvious reason why they chose to adopt a manual of style that omits Vietnamese diacritics: an untrained person with no knowledge of a European language can still add and proofread diacritics for that language, by visual inspection and using a tool like Character Map in Windows; however, you really can't do this for Vietnamese unless you have some knowledge of the language and its keyboard input methods. A commercial publication like National Geographic can't commit to having at least one Vietnamese-speaking staff member in perpetuity, so the simplest solution for them is to make an exception and omit diacritics for Vietnamese even though they don't omit them for European languages. However, a crowdsourced publication like Wikipedia does not have this problem: there are thousands of contributors, including native speakers of many different languages. If "Joe Wikipedia editor" left out the diacritics, someone else who cares about Vietnamese diacritics and is knowledgeable about them can come by at any time later and add them. We don't have the same constraints as a commercial print publication with a small staff, tight publication deadlines, and a final frozen published text that can never be modified later or corrected. So there is no need for us to adopt their manual-of-style practices either. — P.T. Aufrette (talk) 06:10, 29 July 2012 (UTC)
- Let's not forget that our WP:AT policy clearly mentions that in article titles we serve the general audience rather than the specialists. That's of course also the basis for WP:COMMONNAME. This Schwenkel note states that she is concerned with both the specialists and the generalists, so it is not very applicable as a model for our article titles. Catering for the general audience in article titles can only mean that we use the term they are most likely to search for and that is most recognizable to them. That purpose is generally not served by the diacritics in Vietnamese names. If there is an existing anglicized version that is common in our sources, then that's the best title from the perspective of the general audience. And it's easy enough to figure out what rendering the audience is searching for. Take the case of Lech Wałęsa: even on a worldwide basis more people look up "Lech Walesa": [1]. And if you look at search in the UK [2] or the USA [3], then search for the diacritics rendering is virtually non-existent. Never mind the Vietnamese diacritics. You could move all Vietnamese names to diacritics spelling, but it wouldn't serve the general audience in any way. All it does is slow wp servers down, because the redirects get hit all the time. While P. T. Aufrette is right that we don't have the constraints of a print publication, we also don't have the constraints of existing in only one language. We can have all the Vietnamese diacritics we want in the Vietnamese version of wikipedia, so there is no need to fill up en.wp with them. MakeSense64 (talk) 11:16, 29 July 2012 (UTC)
- Yeah. It wouldn't help the readers in any way whatsoever, except for, I dunno, telling them the proper spelling of someone's name. I don't see why you don't find that useful, but I certainly do. We're not here to pander to our readers either.--Obi-Wan Kenobi (talk) 18:42, 29 July 2012 (UTC)
- I see the purpose of a title a bit differently. In my opinion, it should tell the reader what the subject is commonly called in English, per WP:UE and so forth. To provide marks that are extremely rare in published English-language sources does not provide information, but rather misleads. Kauffner (talk) 05:22, 30 July 2012 (UTC)
- @Obiwankenobi. That's a moot point. We always show the native spelling(s) in the opening paragraph. So if an article is kept at a common name title, it doesn't mean that readers cannot find the native spelling. Have a look at our article Ho Chi Minh City and you will see what I mean. Per our current AT policy we serve the general audience in article titles, and we serve both the general audience and the specialist in the article body. MakeSense64 (talk) 05:36, 30 July 2012 (UTC)
- No, it's a disservice to the general audience, to present a low-fidelity reproduction of someone's name (or the name of a city) based on a few specious (and likely inaccurate) google searches.--Obi-Wan Kenobi (talk) 06:10, 30 July 2012 (UTC)
- The cities too? I don't see how that can be justified in terms of WP:NCGN. If the Vietnamese government wants anglicized spellings and uses them in VNA and the other English-language material they put out, are we going to say, "Sorry, Vietnam, but you're not Vietnamese enough"? There is an RM for cities here, if anyone reading has an opinion. Well, anyway, I guess Thành phố Hồ Chí Minh here we come. Kauffner (talk) 06:36, 30 July 2012 (UTC)
- Kauffner, the sarcasm doesn't help and I think there is a more pressing question you need to give a clear answer on, rather than expending bytes here.
- However, all Wikipedia:Naming conventions (geographic names) WP:NCGN says is: "Vietnam - A naming convention is under discussion at Wikipedia talk:WikiProject Vietnam" unquote.
- As for arguments like "If the Vietnamese government.. Sorry, Vietnam, but you're not Vietnamese enough.", that evidently is (i) emotive and (ii) illogical, given your own arguments when you supported diacritics at the Talk:Truong Tan Sang RM, where you say the main English newspaper in VN does use the diacritics on towns. In ictu oculi (talk) 03:28, 1 August 2012 (UTC)
Other contributors, Would everyone here agree that there is a pyramid for exonyms, as in most languages:
- Toponyms - some English exonyms#Vietnam
- Personal names - no exonyms
- Things, category:Vietnamese words and phrases - by default in Vietnamese In ictu oculi (talk) 03:28, 1 August 2012 (UTC)
- I think you may have a general trend there, but no clear lines. There are very obviously exonyms in English for foreign names - but this is not simple diacritic stripping. And things, again it depends on whether the thing is so common that it has entered the language as native - I don't know of any Vietnamese phrases myself, but there may be some French ones that we use regularly in English like c'est la vie that are widely understood.--Obi-Wan Kenobi (talk) 19:21, 1 August 2012 (UTC)
Gallery
I thought this might be at least good to think about while we pause. In ictu oculi (talk) 18:39, 1 August 2012 (UTC)
- Renée Geyer, Australia
- Severina Vučković, Croatia
- Như Quỳnh, Vietnam
- Jurga Šeduikytė, Lithuania
- Phương Vy, Vietnam
- Hélène Ségara, France
- Róisín Murphy, Ireland
- Hiền Thục, Vietnam
- Sıla Gençoğlu, Turkey
- Björk Guðmundsdóttir, Iceland
- Mỹ Linh, Vietnam
- Ági Szalóki, Hungary
- Élodie Frégé, France
- Laura Samojłowicz, Poland
- Ximena Sariñana, Mexico
- Gülseren Yıldırım, Turkey
- Lương Bích Hữu, Vietnam
- Zoë Skoulding, England
- Mairéad Ní Mhaonaigh, Ireland
- Hồ Ngọc Hà, Vietnam
- Michèle van der Aa, Netherlands
- Joanna Jabłczyńska, Poland
- Marija Šestić, Bosnia
- Thanh Hà, Vietnam
- ahh.. hurts my eyes... all those lovely singers, with such non-english names. too many squiggly lines! What were their parents thinking? Shall we anglicize all of their names? I bet they don't even sing in English... Obi-Wan Kenobi (talk) 19:18, 1 August 2012 (UTC)
- Indeed. My purpose in posting this gallery was to make the point that on en.wp French, Irish, Czech, Lithuanian, Serbian, Turkish BLPs all retain the Latin alphabet diacritics. Why then is Vietnamese Latin alphabet to be an exception?
- So far in the discussion following the original (no good faith in my view - witness the scare tactic "and Sài Gòn") and canvassed RfC, the following editors (A) expressed support for treating Vietnamese names as any other Latin alphabet. For other editors it isn't clear whether opposition (B) includes acceptance of en.wp practice re East European and Turkish BLPs, or (C) includes disagreement with use of diacritics in East European and Turkish names. I don't know about others but I for one would be interested in seeing how that breaks down. So I have made the little Census box below. (The reason for specifying "East European and Turkish" is that some editors accept French/Spanish/German since NY Times uses French/Spanish/German, so the point of "East European and Turkish" is that en.wp uses these but the majority of non-specialist English Google Books and Google News sources, such as NY Times, don't.) In ictu oculi (talk) 02:40, 4 August 2012 (UTC)
Even More Other options
Comment Having watched this ongoing diacritics saga, it has become clear to me that rewriting our policies or guidelines will not solve this. We simply have people who are and remain on opposite sides of this. We need a solution that satisfies both sides. So let me suggest a few other options that are technically possible:
5) Change the WP code to allow for multiple article titles. Readers who access the page through "Dang Huu Phuc" see that as the title, readers who came searching for "Đặng Hữu Phúc" get the page rendered with diacritics in the title. Everybody happy, and diacritics wars become a thing of the past.
6) Create an "English wikipedia" and an "English(international) wikipedia". The English wp uses anglicized names whenever they exist in any reliable sources, the "international" version uses native spelling of names as much as possible. Everybody happy, and our English readers with a visual impairment can read WP again (right now they struggle to do so).
7) Embed a "diacritics stripping" script into WP, which people can turn on or off in their own user settings. Such scripts already exist in Perl and Javascript: [4] and [5].
Why continue to argue about an issue that technology can solve with a few lines of code? Let everyone make their own choice wherever we can. User customization is the way of the future anyway, so there is no need for WP to be a laggard in that regard. MakeSense64 (talk) 06:01, 23 July 2012 (UTC)
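As a rough illustration of what the "diacritics stripping" script in option 7 could look like, here is a minimal Python sketch. It is not the Perl or Javascript code linked above, and it is not anything MediaWiki provides; the function name strip_diacritics and the small fallback table are assumptions made for this example. It relies on Unicode decomposition (NFD) to split accented letters into a base letter plus combining marks, then drops the marks.

```python
import unicodedata

# Assumption for this sketch: a hand-written fallback table for letters that
# have no Unicode decomposition. The Vietnamese "Đ"/"đ" is one such letter.
_NO_DECOMPOSITION = str.maketrans({"Đ": "D", "đ": "d"})


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD splits e.g. "ặ" into "a" + dot below + breve.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    base_letters = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Handle letters that NFD leaves untouched.
    return base_letters.translate(_NO_DECOMPOSITION)


print(strip_diacritics("Đặng Hữu Phúc"))  # prints "Dang Huu Phuc"
```

Note that the fallback table matters: letters such as "Đ" (or the Polish "ł") are separate code points rather than a base letter plus an accent, so a purely decomposition-based stripper would leave them unchanged. Any real implementation would need a fuller table than the two entries shown here.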
- MakeSense64 The diacritics saga is over - all en.wp articles are now spelled correctly except for Ana Ivanovic, who we can leave there as a monument to what Joy (Shallot) calls it. That only leaves Vietnamese. You may want to take your technical request to Village Pump. In ictu oculi (talk) 06:35, 23 July 2012 (UTC)
- Btw - sorry about cutting and pasting this chunk inadvertently just now, got caught in an edit conflict and uploaded wrong version. Restored. Actually your proposal 7 isn't that bad, Prokonsul Piotrus looked into this last year. In ictu oculi (talk) 06:44, 23 July 2012 (UTC)
- Reply No problem, I have become used to that kind of thing. And I am not interested in the village pump, if there is some use in my idea it will get picked up sooner or later.
- You think the diacritics saga is over, and I think it only gets started once you have moved almost every name to foreign spelling. Some people hate to see names with diacritics stripped on WP, others don't like to have foreign spelling being pushed in their face and see it as a form of "reverse-colonialism". That's not going to change. If WP itself doesn't offer a "plain English" version without diacritics for those readers who want a "normal" read (for their English eyes), then not only will WP continue to lose editors, but it is only a matter of time till somebody else offers a more "readable" version of English WP, because the technology to strip diacritics from pages already exists. It is that simple: you serve your customers or you lose them (sooner or later). MakeSense64 (talk) 07:25, 23 July 2012 (UTC)
- Okay, but it is Prokonsul Piotrus' idea, not yours, and he already looked into it. It won't be reality unless people really want it.
- Yes I think the European diacritics saga is over. I would even say that your WP:TENNISNAMES essay was the thing that led to the end. But anyway, we're here to discuss Vietnamese. In ictu oculi (talk) 07:39, 23 July 2012 (UTC)
- I don't know Prokonsul Piotrus. I think you are dreaming that the diacritics saga is over just by having moved (nearly) all names to native spelling titles. As you know, some countries like Macedonia seem to be dropping diacritics, and all over the world young people are discovering that they can easily read and write without diacritics when they send an SMS. Spelling is something that will remain in flux for a long time to come. The best way to deal with it is for WP to make space for all existing variations, because that will satisfy most of the users.
- Essays are meant to stimulate discussion, and WP:TENNISNAMES has succeeded beyond expectation in doing so. You are still talking about it. What more can one ask for?
- MakeSense64 (talk) 09:36, 23 July 2012 (UTC)
- Hi MakeSense64
- Prokonsul Piotrus' page is User:Piotrus.
- Macedonian isn't a Latin alphabet language, it is a dialect of Bulgarian and both are Cyrillic, it is only natural that after independence they will probably eventually gravitate to using Bulgarian romanization rather than their traditional Serbian romanization. That has zero to do with why en.wp articles, except for Ana Ivanovic, are at European Latin-alphabet names. And again it has zero to do with Vietnamese. Sure, I give your essay full credit, as above. But we are here to talk about Vietnamese. :) In ictu oculi (talk) 10:38, 23 July 2012 (UTC)
- You digressed by starting to talk about "European" diacritics saga and WP:TENNISNAMES, not me. Credit where it belongs. MakeSense64 (talk) 11:07, 23 July 2012 (UTC)
- Please look at your first sentence "Having watched this ongoing diacritics saga," - you clearly meant foreign names in general not just Vietnamese ones. Everyone else is beyond this. In ictu oculi (talk) 12:54, 23 July 2012 (UTC)
- Whom are you trying to fool, IIO? Ever since you commented in this topic last week you have been talking about "European names", "Czech accents", Turkish, Polish, and so on... but others should stick to Vietnamese only. heheh?
- It's fairly obvious that the "other options" I floated would affect all diacritics on WP... But please read your own posts before asking others to discuss Vietnamese only. No double standards, please. MakeSense64 (talk) 13:42, 23 July 2012 (UTC)
- Whatever. In ictu oculi (talk) 14:01, 23 July 2012 (UTC)
Without wanting to step into the same hellish discussions as ever, I really like MakeSense's option 7 a lot. The Serbian Wikipedia does a similar thing enabling users to switch between Latin and Cyrillic. The basic Latin letters only version could even transcribe all text not in {{lang}} templates in article text as well as in titles. - filelakeshoe 14:27, 24 July 2012 (UTC)
- It's an interesting idea, but the articles still have to be at a single title. If we could implement option 7, would people agree to leaving these titles with the correct diacritics, along with an ascii-redirect? Then users who don't want to see diacritics can turn them off, which would strip them from titles and elsewhere. One problem would be, what do you do in cases where accents are used to disambiguate (ex: Hue and Huế)?--Obi-Wan Kenobi (talk) 14:47, 24 July 2012 (UTC)
- I think such cases are pretty rare, and there is probably a way to avoid that kind of problem. If such a "switch option" system already exists for Serbian wp, then it cannot be much of a problem to create a diacritics-stripper for en.wp. It never hurts to give people choice. I am even thinking of creating a diacritics-stripper as a Firefox plugin. MakeSense64 (talk) 14:57, 24 July 2012 (UTC)
- The second article would still be at Huế in the database and its URL would be /wiki/Huế, but the title would appear as "Hue" for people with diacritics stripper enabled. - filelakeshoe 15:07, 24 July 2012 (UTC)
- That's right. The display text would be stripped of diacritics, but the hyperlinks to other pages would be unaffected and thus work normally. MakeSense64 (talk) 15:19, 24 July 2012 (UTC)
- I'd like to see a citation for the claim that diacritics cause people with visual impairment to "struggle" to read text. If that were the case, then surely accessibility tools for French and German speakers would strip diacritics for native speakers of those languages who have a visual impairment. I have not heard of such a thing. But I wouldn't oppose a diacritic-stripping option in Preferences, purely as a customization feature, as long as it's not the default. — P.T. Aufrette (talk) 06:24, 29 July 2012 (UTC)
- I have no sources for you (although they may exist). Visually impaired people generally have no problem reading the diacritics that are common in their native language, because they are used to them. It is when the text starts being full of diacritics that are completely alien to them (like Vietnamese diacritics) that it becomes a hindrance. Heck, they are even a hindrance for people without visual impairment. Here is an online diacritics-stripper I have tried: [6]. E.g. paste the text from our Ho Chi Minh article into it. The diacritics-stripped version is a much smoother read, reading the diacritics version makes me tired before I reach the end of the article. But maybe that's just me.
- I agree that diacritic-stripping should not be made the default option. But I have noticed that more and more notable websites customize texts (and even language) based on the user-locale, and I think that's an interesting possibility too. When a reader from the UK or USA accesses a wp page then it could default to diacritics-stripped, when a reader from Eastern Europe or Vietnam accesses a page it could default to diacritics-on....(almost) everybody happy. WP does try to "serve" its readers, doesn't it? MakeSense64 (talk) 07:23, 29 July 2012 (UTC)
Just to clear up a misunderstanding: the "switch option" in the Serbian wikipedia is not stripping diacritics but is converting between two equally valid script versions of the Serbian language, one Latin and one Cyrillic. Agathoclea (talk) 06:21, 17 August 2012 (UTC)
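The display-only behaviour described in this thread (the article stays at Huế in the database and in its URL, while the shown text becomes "Hue") and the Hue/Huế disambiguation worry can both be sketched in a few lines of Python. This is only an illustration under stated assumptions: render_link, find_collisions and strip_diacritics are hypothetical names invented here, and the /wiki/ URL layout is simplified; none of this is MediaWiki code.

```python
import unicodedata
from collections import defaultdict
from html import escape
from urllib.parse import quote


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: drop combining marks and map Đ/đ to D/d."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.replace("Đ", "D").replace("đ", "d")


def render_link(stored_title: str, strip: bool) -> str:
    """Build a link that always targets the stored title.

    Only the visible label depends on the reader's (assumed) preference.
    """
    href = "/wiki/" + quote(stored_title.replace(" ", "_"))
    label = strip_diacritics(stored_title) if strip else stored_title
    return f'<a href="{href}">{escape(label)}</a>'


def find_collisions(titles):
    """Group stored titles that would look identical once stripped."""
    groups = defaultdict(list)
    for title in titles:
        groups[strip_diacritics(title)].append(title)
    return {shown: stored for shown, stored in groups.items() if len(stored) > 1}


print(render_link("Huế", strip=True))
print(find_collisions(["Hue", "Huế", "Hanoi"]))
```

In this sketch the link target is computed from the stored title before any stripping, so hyperlinks keep working whichever preference is set. The collision check shows where the approach needs extra care: "Hue" and "Huế" end up with the same visible label, so a real feature would need some way to tell such pages apart for readers who have stripping turned on.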
Summary of discussion
Use diacritics
- “Use of diacritics is educational, which is one of our goals.” Br'er Rabbit (talk) 00:57, 30 July 2012 (UTC)
- “This blog entry explains for example why the Economist doesn't use VN diacritics - bottom line it is a staffing issue. Wikipedia has no such issues.” KarlB/Obiwankenobi
- "I think the thing I really notice is that Vietnamese English-publication sources are beginning to use the accents - like Vietweek, Baomoi, scholarly western books - at exactly the same time en.wp has been pulled in the opposite direction." In ictu oculi (talk) 02:16, 23 July 2012 (UTC)
- “We have the technology, so let's use it!” Mjroots (talk) 22:54, 22 July 2012 (UTC)
- “When many English sources still use the diacritics, there is no reason for WP not to as well.” Dicklyon (talk) 15:34, 22 July 2012 (UTC)
- “exactly per In ictu oculi's rationale.” — Dmitrij D. Czarkoff
- “Removing the diacritics causes LOSS of information”, Zaldax (talk) 19:16, 27 July 2012 (UTC)
- “Skipping the diacritics would be failing in our duty as a modern online encyclopaedia, with Unicode at our disposal, to represent facts accurately, even if they are difficult.” Colonies Chris (talk) 21:10, 22 July 2012 (UTC)
- "for the less known words, please keep the diacritics intact and don't just cite Britannica for killing them (diacritics), since you will kill the meaning accompanying the words as well." Grenouille vert (talk) 15:03, 22 July 2012 (UTC)
- "We don't have the same constraints as a commercial print publication with a small staff, tight publication deadlines, and a final frozen published text that can never be modified later or corrected. So there is no need for us to adopt their manual-of-style practices either." — P.T. Aufrette (talk) 06:10, 29 July 2012 (UTC)
- I'd looked at those pictures of people above, a lot of other languages that use the Latin alphabet are able to keep their diacritics so why not Vietnamese (except for the common words such as Hanoi, Saigon); per Mjroots. ༆ (talk) 23:30, 5 August 2012 (UTC)
- We are an encyclopedia. Our goal is to educate. We should have the proper names as the title. -DJSasso (talk) 16:45, 7 August 2012 (UTC)
- Oppose unnecessary removal of information. And what Itsmejudith said. —Kusma (t·c) 19:11, 7 August 2012 (UTC)
- The diacriticals: add information, do not detract sufficiently from the underlying characters, are handled by redirects (in titles), educate readers, and may even spark an interest in their effect. GFHandel ♬ 23:03, 7 August 2012 (UTC)
- There are many words in Vietnamese that look exactly the same without diacritics, and that can cause a lot of misinformation when identifying Vietnamese names, both personal and geographic. Sholokhov (talk) 15:22, 14 August 2012 (UTC) — Preceding unsigned comment added by Sholokhov (talk • contribs)
- Stripping diacritics is not the same as anglicizing the name. If diacritics are removed, the debate will change from diacritics usage to proper spelling sans diacritics, and will result in a much higher chance of Wikipedia hosting incorrect information. Zaldax (talk) 13:48, 15 August 2012 (UTC)
Follow English-language sources [disputed summary title, see below]
- “we, ‘follow the general usage of English-language reliable sources.’ This is something that WP:DIACRITICS already stipulates.” Kauffner (talk) 19:08, 18 July 2012 (UTC)
- “English readers, particularly those who read no other language, are accustomed to some French and Spanish diacritical marks through long proximity…. Vietnamese is far more peripheral to the typical native English reader.” Yopienso (talk) 16:47, 20 July 2012 (UTC)
- “Use the diacritic-free English alphabet to title Vietnamese subjects. This convention seems to be used universally among English-language sources: even by English-language publications in Vietnam!” Shrigley (talk) 20:51, 20 July 2012 (UTC)
- “Having those diacritics means that it's not an English article title,” User:Benlisquare
- “I would say that we can retain diacritics for infoboxes only” Colipon+(Talk) 18:46, 21 July 2012 (UTC)
- "We can have all the Vietnamese diacritics we want in the Vietnamese version of wikipedia, so there is no need to fill up en.wp with them." MakeSense64
- "It is not normal for English language sources to use Vietnamese diacritics" Formerip (talk) 14:12, 26 July 2012 (UTC)
- "We are writing an English encyclopaedia. Where English has a significant impact in the sources, or where the object is known with an English word, we use the English word (pho)." Fifelfoo
- “I believe the article names should be without diacritics, but the articles themselves should have them as needed.” User:Rms125a@hotmail.com
- “I think that is more likely to be consistent [to follow the spelling used by other encyclopedias],” Eraserhead1 <talk> 19:13, 22 July 2012 (UTC)
- “The rendering of text in the majority of reliable English sources is without diacritics, which is how titles and in-prose usage should be rendered here.” – NULL ‹talk› ‹edits› 22:50, 22 July 2012 (UTC)
- “Page titles should not use diacritics in the English wiki because people can't type them in (although obviously redirects should exist using them), but the article should employ diacritics.” Ogress smash!
- “I believe wikipedia should try to use the English commonname, if there are sources that do not use the diacritics, then nor should wikipedia.” BritishWatcher (talk) 13:55, 1 August 2012 (UTC)
- Seems like a no-brainer for Vietnamese names to follow the general usage of English-language reliable sources. Fyunck(click) (talk) 07:21, 12 August 2012 (UTC)
- There is also good reason to use the most common name WP:COMMONNAME, because doing otherwise means that a lot of page searches are sent through a redirect. It is also inappropriate to simply add diacritics to shortened romanized names that sportspeople use internationally, as the Manuel Sanchez (tennis) example shows, without a little research (like on a foreign-language Wikipedia) to confirm whether such usage (of the shortened name with diacritics) is considered acceptable in the person's home country. LittleBen (talk) 11:36, 12 August 2012 (UTC)
Other
- "We stick with current practice, which is to apply the general MoS on diacritics" Itsmejudith (talk) 16:35, 23 July 2012 (UTC) [I note that MOS:FOREIGN says, "adopt the spellings most commonly used in English-language references for the article."]
After looking at the comments above about Vietnam's English-language press, I took a walk around to the local newsstands and discovered that Viet Nam News has been largely replaced by the diacritic-free Saigon Times. So I am certainly not seeing any kind of trend in favor of increased use of diacritics in English-language publications. Kauffner (talk) 18:00, 4 August 2012 (UTC)
- As above (1) this RfC is somewhat compromised by the way you sent out selective invitations, but (2) it would still be interesting to hear from those opposed to Vietnamese diacritics as to how this fits into the broader en.wp context in relation to other Latin-alphabet languages. Beyond that I don't see this exercise achieving anything greatly useful, other than generating smoke. In ictu oculi (talk) 23:27, 4 August 2012 (UTC)
More Talk page deletions: Neither I nor anyone else should have to follow an editor to 2RR to restore my own talk on a talk page. This RfC is already something of a charade given the context in which it was launched and the way the leading question was phrased. But it would have been useful, as above, to understand how those opposing the same treatment for Vietnamese as the de facto treatment of Czech etc. contextualise Vietnamese in relation to other East European languages. I would prefer it if another editor restored my talk here which Wolbo has again deleted.
In ictu oculi (talk) 00:05, 5 August 2012 (UTC)
- If one rigged summary is allowed then another rigged summary should be allowed as well. The above summary is flawed as it contrasts the use of diacritics with "use English" when it has long been established that stripping diacritics does not make something English, just incomplete. As such I am for using diacritics even for Vietnamese as it is a Latin script as long as there are no established alternatives akin to Hanoi. Agathoclea (talk) 16:02, 5 August 2012 (UTC)
- I agree. The "Use English" heading should be changed to "don't use diacritics" or something like that (so I changed it). Many of the points listed there are nuts, in my opinion, like "titles should not use diacritics in the English wiki because people can't type them in" (irrelevant, as there's no need for anyone to type them in when the article is titled with them), and "if there are sources that do not use the diacritics, then nor should wikipedia" (appeal to the bottom), and "It is not normal for English language sources to use Vietnamese diacritics" (a bold claim, considering the diversity of actual practice). Dicklyon (talk) 19:25, 5 August 2012 (UTC)
- Agathoclea, following the above 2 posts I've restored the box and text that Wolbo twice deleted (I've never seen text deleted like that off a Talk page before). As regards "rigged" - I'd genuinely hoped it is not in any way rigged, the purpose is to contextualise the Vietnamese diacritics with the convenient broad MOS types per Prolog's page - (A) Academic, Schwenkel example, (B) National Geo, (C) NY Times/Economist (D) tabloid/html. The rigging in this RfC is more that this is a survey of those Kauffner selectively invited to answer a misleading and overstated (Sai Gon) question. But having got this far we might at least find out whether there is a scale. I thought User:Quis seperabit's comment interesting and clear enough to add in National Geo MOS. There's always the (E) "Other" box, for other. In ictu oculi (talk) 02:12, 6 August 2012 (UTC)
- Kauffner has reverted Don't use diacritics back to follow English sources - this is WP:GAME again, setting up an "RfC" on a slanted comment. At best this should say "follow the majority of English sources" - since counting sources is the basic rationale in so many of the "Don't use diacritics" discussions. The heading "follow English sources" is misleading since a minority (academic sources, Viet News etc.) always do use Vietnamese diacritics, same as for French, Czech, whatever. Let's have at least one thing here that is not GAMED please. In ictu oculi (talk) 02:18, 6 August 2012 (UTC)
- Folks, please don't interfere with the consensus building process by altering other editors' comments on the talk page. In ictu oculi should not have to revert other editors just to make something he posted visible. It doesn't matter if you think it is inappropriate, or if you think it doesn't conform to Wikipedia policy (personally, I think that the posted argument summary is fine and in no way violates policy); rather than deleting his post, please explain why you consider the post inappropriate. I'm sorry, but reverting an editor's comments on a talk page is, quite frankly, rude. As is reverting the discussion title currently summarizing the RFC to reflect a contested point, Kauffner. I'm sorry to call you out directly, but honestly given the controversy I think you should have known better; making this change anyway opens you up to accusations of violating WP:GAME. This is not a political debate; both sides can't style themselves "Pro-This" and "Pro-That". The positions are "Use Diacritics" and "Do not use Diacritics." (Your new title, "Follow English Language Sources", isn't even correct for those arguments -- not all of them argue to do that!) That's all there is to it; in accordance with that, I have changed the discussion heading to the proper title. Please do not change it again. Zaldax (talk) 14:19, 7 August 2012 (UTC)
Also, given Kauffner's ongoing belligerence (seriously dude, take a step back), I just thought it would be interesting to point out this comment from near the top of the page:
- "Diacriticals are used for all the European languages. See Gerhard Schröder, Horst Köhler, Hermann Göring, Göttingen, Lübeck, Finistère, or Lech Wałęsa. Finding the articles? That's what redirects are for. Kauffner (talk) 19:09, 25 July 2009 (UTC)"
So, uh, when did that change? Zaldax (talk) 22:03, 7 August 2012 (UTC)
- I did answer this question in my 19 July and 24 July posts above, you know. It seems that you are on auto-fulminate -- the setting where you don't have to read someone's writing before denouncing it. Kauffner (talk) 06:07, 12 August 2012 (UTC)
- I'm not even going to dignify that with a response, Kauffner. If you aren't interested in discussion, that's fine, but please don't accuse other editors of harboring some sort of vendetta against you. I came to this issue via the RFC board, and was invited by neither side; I'm not involved in any of the disputes surrounding you. So please, refrain from personal attacks; they can't possibly help anything, or anyone. With regards to the comment, in an RFC of this size anyone is bound to miss something; it's important to strike through a comment to show other editors that you retract it, so as to avoid misunderstandings such as this one. Cheers, Zaldax (talk) 15:53, 14 August 2012 (UTC)
- Obiwan/KarlB sent invitations to everyone I missed, plus others. So I don't think the "Use Vietnamese" side can claim an invitation disadvantage. Has anyone seen another RfC that consists of kilobyte after kilobyte of accusations and opposition research against a particular user? I would like to clarify something in case anyone is confused: I am not running for president, U.S. senate, city council, dogcatcher, admin, or rollbacker. Kauffner (talk) 14:37, 11 August 2012 (UTC)
- I've changed the heading back to the proper title again. Kauffner, do NOT change it again. Also, stop trying to play the victim card; you're pretty seriously violating WP:GAME now...So discuss this in a proper manner, and support your arguments properly, rather than deflect attention and belittle other users comments. I'm getting tired of it, as is everyone else. Cheers, Zaldax (talk) 16:18, 11 August 2012 (UTC)
- @Zaldax, Kauffner put "Follow English-language sources" back in. Given that it's a distorted title in a distorted summary it probably doesn't make any difference to leave it. But I've added in "[disputed summary title, see below]" to note that it has been objected to. I think we all, including Kauffner, know very well that it means "Follow majority of English-language sources", which even with French and Czech inevitably = "follow diacritic use in non-diacritic enabled English-language sources", which en.wp doesn't do for other languages.
- @Kauffner, as far as "consists of kilobyte after kilobyte of personal accusations" - the two particular complaints about this RfC relating to (a) slanted wording about "English sources," scare tactics about "Sài Gòn," etc. - (b) selectively notifying first only Users who'd opposed Vietnamese in the past, then quite bizarre canvassing, are behaviours which any RfC would complain about. As for other issues regarding your behaviour, you are the one who directly mentioned the Sockpuppet Investigation. That doesn't have any direct bearing on your behaviour here.
- Back to the actual RfC, to the extent that any result here is usable it indicates that, even following extensive selective canvassing including for example to WikiProject Conservatism etc. (what was that for?) the above shows:
- A slender majority here favour the use of diacritics, i.e. treating Vietnamese as French or Czech
- A sizable minority here are against use of diacritics in Vietnamese, without indication of how this relates to French and Czech. In ictu oculi (talk) 04:25, 12 August 2012 (UTC)
- It's possible that the bash-Kauffner aspect of this RfC has affected the vote count as well. Various editors seem to be here more for the purpose of putting together vicious phrases and threats than to discuss diacritics, like this is the new alt.flame. I would also appreciate it if you did not describe my position as "do not use diacritics." I have responded to this accusation before: I support the use of the native-language full name boldface in the opening, per WP:FULLNAME. Kauffner (talk) 15:13, 12 August 2012 (UTC)
- (1) I see no bash-Kauffner here. Other than that Users are not awarding you barnstars for your behaviour, neither your behaviour in relation to the set-up/canvass of this RfC, nor in relation to your edits/undiscussed moves/admin proxies/redirect locks of articles, nor in relation to the Sock Puppet Investigation you introduced above (which you appear to be deliberately prolonging). Why would anyone expect to be rewarded or complimented for these behaviours? Each of these behaviours individually would be a concern.
- (2) Obviously "I support the use of the native-language full name boldface in the opening" is against the use of diacritics in the title and the text of article. Is there anyone who opposes one mention of the Vietnamese spelling in the lede? Your view is clear on edits to Âu Cơ, (plus usual pattern of G6 proxying of an uninvolved admin and redirect locked)
- Since you complain about "bash-Kauffner", and since also you introduced the subject of the Sockpuppet investigation above, then it seems appropriate to repeat again here the question you have been asked before: Are the IPs you? In ictu oculi (talk) 05:10, 13 August 2012 (UTC)
- There's no need for a rule like, "ignore COMMONNAME, ignore COMMONSENSE, and use diacritics universally in English titles for articles relating to all Latin-alphabet languages". It makes more sense to use the COMMONNAME (without diacritics, if this is more common) plus context, e.g. "(tennis)" as discussed in my post at the end of the Survey above. Even just the example of people named Manuel Sanchez on English Wikipedia looks sloppy and inconsistent; surely the correct COMMONNAMEs for the other tennis articles have also not been adequately researched. It doesn't matter if the diacritics are right if they are inappropriate, e.g. if the name used with diacritics in the article title would be considered wrong or incomplete in the native language (Spanish, Czech or whatever). That's "low fidelity reproduction of a person's name", even more so than omitting diacritics. LittleBen (talk) 16:15, 12 August 2012 (UTC)
Anyone who supports abandoning diacritics, please remember GV's example of "Nguyên" and "Nguyễn". Many Vietnamese words look exactly the same without diacritics, and that can cause a great deal of misinformation when identifying Vietnamese names, both people's names and geographic names. Михаил Александрович Шолохов (talk) 15:34, 14 August 2012 (UTC)
- I understand the discussion as being whether it's appropriate to always use diacritics in titles. Nobody seems to object to showing versions of romanized names (that are in article titles) both with and without diacritics in the head of the body of articles. LittleBen (talk) 12:35, 15 August 2012 (UTC)
Moving forward? (Continuing Discussion)
- I believe the arguments in favor of using diacritics when there is not an established, commonly used English name omitting them (i.e. Saigon, Ho Chi Minh) are the stronger ones, and there appears to be a rough consensus in favor of that proposal. Given the length and vehemence of the discussion so far, I'm not sure where this discussion could possibly go from here; it seems that positions are becoming rather fixed. What do we need to do to move this issue forward? If the rough consensus is indeed in favor of limited diacritics use (and again, I believe that those arguments are indeed stronger), how do we go about formulating the guideline? If there is no rough consensus, how can we go about attracting fresh voices to this discussion, in hopes of achieving one? This discussion has lasted for ages; I think it's time we move towards some sort of resolution, especially because this is the sort of RFC where "no consensus" isn't really a viable outcome. Cheers, Zaldax (talk) 15:50, 14 August 2012 (UTC)
- This discussion isn't getting anywhere. Both sides have a point. But it looks like Vietnamese people support the use of diacritics. You guys should let the Vietnamese people decide on this issue. It is, after all, their language, and I think they have the right for their voices to be heard. Or let's just hold a vote. Majority won. That is the only way to solve the problem here. 65.128.144.159 (talk) 00:54, 15 August 2012 (UTC)
- @Zaldax, I believe that RfCs are supposed to run for 30 days. I don't know where that's written but User:Joedecker mentioned it to me a couple of days ago. The conclusion would at this point be treat Vietnamese entirely as Czech and French, but there may be some other options.
- @65.128.144.159, on a simple "majority won" it's evident that there's a visible majority for treating Vietnamese as any other Latin alphabet language. Which is not too surprising since that is where these articles were created prior to the undiscussed moves. Having said that, it isn't a large majority, and also it would be really very helpful if those indicating opposition to Vietnamese titles could please put that in context of whether it is objection to Vietnamese only or to diacritics per se (Czech, French - which en.wp uses). At this point we don't know and going forward it would be helpful to know. In ictu oculi (talk) 12:57, 15 August 2012 (UTC)
- Just took a look at the WP:RFC page; the default length for a RFC discussion is 30 days, but they can be ended earlier or extended if consensus is or isn't achieved by that date. That being said, if a consensus exists right now, it's definitely only a rough one. Since there's only a few more days until the 30-day base length, we might as well keep going at least until then.
- That being said, to move this conversation forward I think we need to do two things. First, if those in opposition to diacritics use would state and clarify their positions, that would help a great deal; y'all seem to have disappeared from this thread over the past couple of days, but I'd appreciate hearing your continued input as events progress. Second, if the rough consensus is indeed in favor of using Vietnamese diacritics, we need to firmly establish how we plan on doing so; as French and Czech, as per some other language, or whatever. (I think it's indisputably agreed that we aren't using diacritics in a case where an unquestionably widely-used name sans diacritics exists, i.e. Ho Chi Minh and Saigon. We need to decide if there are any other articles that fall under this scope; other possible examples are Viet Cong and Hanoi.) I suggest that we draft the naming guideline on diacritics here, before we insert it into the MOS, once we have determined our course of action. Also, if someone wants to start a new section soon, that might make editing easier for everyone... Cheers, Zaldax (talk) 14:04, 15 August 2012 (UTC)
- @Zaldax, this makes sense and I for one would be more than happy with you starting a new section. But perhaps wait a day or two until one or two who are in the ballpark of the majority consensus above surface again.
- Otherwise sounds good. In ictu oculi (talk) 15:38, 15 August 2012 (UTC)
- My idea of an ideal for the handling of foreign languages is the way that Japanese is handled: (1) Japanese is NOT used in English Wikipedia article titles. (2) Template:Nihongo can be used to show both the Japanese equivalent of an English word, and the Japanese pronunciation: "{{Nihongo|English|英語|eigo}}" gives "English (英語, eigo)". You can omit the English meaning and use this template to show just the Japanese and romanized versions of a word, so "{{Nihongo||東京|Tokyo}}" gives "Tokyo (東京)". It is not useful to stuff Wikipedia full of foreign words without giving either or both their meaning and/or pronunciation: this is like having *either* an English or Japanese map rather than a bilingual English / Japanese map. If it's a Japanese-only map then maybe you can't read it, and if it's English-only then it's no use showing it to a Japanese who doesn't read English. It's only useful if it's bilingual. LittleBen (talk) 14:15, 15 August 2012 (UTC)
- Chinese articles like the Pearl S. Buck article illustrate another similarly educational approach, linking of "words" or Chinese characters to Wiktionary. Quote: Pearl Sydenstricker Buck (June 26, 1892 – March 6, 1973), also known by her Chinese name Sai Zhenzhu (Chinese: 賽珍珠; pinyin: Sài Zhēnzhū)... LittleBen (talk) 14:51, 15 August 2012 (UTC)
- Isn't Vietnamese an Asian language rather than a European language? LittleBen (talk) 14:56, 15 August 2012 (UTC)
- LittleBenW, this comment I think puts the finger on part of the problem.
- The deciding factor according to WP:AT is whether a script is Latin-alphabet or not, not whether the person speaking it or the language is "Asian" or "European."
- Hawaiian, Tagalog, Indonesian and Vietnamese are Latin-alphabet.
- Russian, Georgian, Armenian and Greek are not Latin-alphabet.
- These are the criteria we use to decide whether to adopt a romanization. Not race. In ictu oculi (talk) 15:18, 15 August 2012 (UTC)
- But even without this, you appear to be opposed to Spanish names too. So why are you raising that Vietnamese is primarily spoken in Asia? In ictu oculi (talk) 15:19, 15 August 2012 (UTC)
- I am simply pointing out that blindly applying "rules" without properly researching if they are appropriate is neither "intelligent" nor "educational". Arbitrary rules will never be a substitute for adequate research and commonsense. For example—as mentioned above—it is inappropriate to simply add diacritics to shortened romanized names that sportspeople use internationally, as the Manuel Sanchez (tennis) example shows, without a little research (like on a foreign-language Wikipedia) to confirm whether such usage (of the shortened name with diacritics) is considered acceptable in the person's home country. If you search for Manuel Sanchez articles on English Wikipedia you will see what an inconsistent shambles things have become. A consistent naming scheme like "Manuel Sanchez (tennis)" as the article title, and the formal full name—including diacritics—in the head of the article, would make a lot more sense to me. "Always use diacritics in article titles and article body, and remove the romanized form, regardless of national customs and commonsense" is a big problem, rather than a panacea. The reverse—"Always remove diacritics"—is equally stupid. LittleBen (talk) 15:52, 15 August 2012 (UTC)
- LittleBenW, you may not realise this but you are advancing a version of the view which was thoroughly rejected at WP:TENNISNAMES RfC. I'm sorry but this view is out of the ballpark here. If you object to Spanish names, you are not, I'm sorry, going to add value to a discussion on Vietnamese. In ictu oculi (talk) 16:43, 15 August 2012 (UTC)
Dear Ben, I guess you already know that the Latin-based Vietnamese alphabet relies heavily on diacritics. In most cases, diacritics are one of the key elements which determine the pronunciation, and in many cases, two words can be told apart only by their diacritics - in other words they look exactly the same without diacritics (e.g. Nguyễn & Nguyên, Tiên & Tiền, Đan & Dân, Phố & Phở, Trấn & Trần,...). I understand that English users may face great difficulties with diacritics, and some names are so well-known without the use of diacritics; but, as a Vietnamese, I believe that, in most cases, removing the diacritics in Vietnamese names is inadvisable. (Well, you also said that "The reverse—"Always remove diacritics"—is equally stupid") Михаил Александрович Шолохов (talk) 16:10, 15 August 2012 (UTC)
- @Михаил. English speakers don't face any difficulties with accents on Renée Zellweger, so no reason that any English speaker should face difficulty with Thanh Hà. I mean they will just read "Thanh Ha", but the actual accent doesn't cause difficulty. In ictu oculi (talk) 16:43, 15 August 2012 (UTC)
- Names and terms that are frequently used with simple diacritics in English sources, and so are recognizable—like "Pelé" and "Bête noire"—are surely not a problem in article titles. Foreign names and foreign terms that the great majority of English Wikipedia users cannot read, write, remember, or pronounce should NEVER be used in article titles: that's surely little different from using Chinese, Japanese, or Korean names in article titles. The great majority of English Wikipedia users will need a romanized title, name, or term in order to be able to search for it in Wikipedia—they should not be forced to try to guess one (or read the body of an article to find one). The whole idea of a title or slogan is to make it as easy as possible for the majority to read and remember. By "majority" we should be thinking "majority of users", not "majority of the few editors who spend too much time playing politics and trying to force their POV on the rest of the world".
- If you are going to use foreign names with diacritics—like Manuel Sánchez (tennis)—in article titles, then surely the absolute least that you should be doing first is to check the corresponding article title in the foreign Wikipedia to confirm that such usage is considered acceptable in his or her native country.
- When romanized words are used in titles, then it will be educational for some people to see the foreign and romanized equivalents together in the body of an article—like Tsunami (津波). Note that the Japanese article on Tsunami explains that "tsunami" used to be called tidal wave in English, which is inappropriate because they are earthquake induced—the Japanese article even has a bilingual diagram to explain the term—and the shorter, simpler, and more appropriate term "tsunami" has now been widely adopted in English, it has become the COMMONNAME. The English Wikipedia article Tsunami does not explain in the first sentence of the article that earthquake-induced "tsunami" used to be called "tidal wave" in English. The English Wikipedia DAB page Tidal wave has a misleading or wrong explanation of "tsunami", and doesn't link to the English Wikipedia Tsunami page. This looks sloppy and unprofessional.
- In English Wikipedia, the only acceptable usage of little-recognized foreign words that the great majority of English Wikipedia users cannot read, write, remember, or pronounce (like 津波) is surely NOT to use them in article titles (use the romanized equivalent) but rather to show the foreign and romanized equivalents together in the head of the article body.
- Template:Lang is the preferred method for tagging foreign-language words in English Wikipedia (it inserts a language tag in the HTML markup), and surely the "Nihongo" template above also has the same function. LittleBen (talk) 01:57, 16 August 2012 (UTC)
- We use the "Ř" for Czech names and accents, and as a native English speaker who speaks some Czech I can tell you that it is near-impossible for most English speakers to pronounce, let alone read correctly. Most think it's pronounced similarly to the letter "R", when in fact it is pronounced "Rzh"; the "R" is rolled, as well. That's only the best possible transcription; a closer explanation would be to say it's pronounced as if you're saying a rolled "R" and "zh" at the same time. There is no direct equivalent of the sound in the English language, and it is indisputably the hardest letter for English speakers to learn. We still use it in English for that very reason. A good example of why it might help to use diacritics even in this case is the name "Jiří". If spelled "Jiri", an English speaker would not think anything special of it; if correctly spelled "Jiří", when pronounced it becomes much clearer that "Jiří" is the Czech equivalent of "George". Cheers, Zaldax (talk) 14:05, 17 August 2012 (UTC)
- National Geographic advises against the use of Vietnamese diacritics as they are distracting.[7] Perhaps they know more than someone who thinks that the Vietnamese alphabet is based on French. Kauffner (talk) 00:31, 16 August 2012 (UTC)
- Kauffner, as before P.T. Aufrette already addressed your National Geographic argument. As for "based on French" since no one has said that, why should anyone respond? The French colonial administration made the Latin alphabet official - but the script was based, in part, on Portuguese. But we all know that don't we, you're gaming, as usual. Can you not say anything straight?
- When does the Project community get a word of explanation or apology from you regarding your use of IPs? In ictu oculi (talk) 01:20, 16 August 2012 (UTC)
- So the French colonial authorities were the real Vietnamese nationalists? I had no idea. BTW, alphabetic script became official under Bao Dai in 1945. The French authorities wanted to create a French-speaking Indochine, so the plan was to deprive Vietnam and Vietnamese of all official status. Kauffner (talk) 05:39, 16 August 2012 (UTC)
Dear Kauffner:
- According to my knowledge, the first officially recognized creator of Vietnamese alphabet is a Portuguese priest. And, Latinization of Vietnamese was a long process which had the participation of a lot of people and generations, many of them were not French and probably did not have any connection with France.
- Every coin has two sides. Vietnamese relies heavily on diacritics, so many diacritics appear on Vietnamese words, and this may inconvenience English speakers, as Kauffner said. HOWEVER, also due to this heavy reliance on diacritics, as I have already said, you have a really high chance of misreading and misidentifying Vietnamese names with the diacritics omitted. Even we Vietnamese, who are really familiar with our names' patterns, in some cases also misread names in our own language in a very painful way.
- Let us have a fun example. Mr Joseph Cao's Vietnamese name is Cao Quang Ánh; without diacritics it becomes Cao Quang Anh. The problem is, both Cao Quang Anh and Cao Quang Ánh are viable names in Vietnamese, and both their patterns are equally common. Another viable, and not uncommon, one is Cao Quang Ảnh. Now assume that we did not know anything about Joseph Cao's real Vietnamese name, and imagine the difficulties that we would face when trying to identify this name, and also to pronounce it (diacritics in Vietnamese are also the key element which determines the pronunciation). In my experience, as a Vietnamese, I would be 99% likely to think his name is Cao Quang Anh, not Cao Quang Ánh.
- About the cases non-diacritic names which are very common (such as Ho Chi Minh, Saigon, Hanoi, Vietnam), of course I agree that we should not add diacritics to these names. Михаил Александрович Шолохов (talk) 01:28, 16 August 2012 (UTC)
In short, well, Kauffner, your concern is reasonable. But please take what I said in mind when you consider this problem. Михаил Александрович Шолохов (talk) 01:28, 16 August 2012 (UTC)
- As a reference work, Wiki should follow the usage of other reference works, or at least that's what guidelines like WP:DIACRITICS and WP:EN suggest. Published reference works do not use Vietnamese diacritics, not even the references intended for specialists. See Britannica, Encyclopedia of Modern Asia (2002), Southeast Asia: A Historical Encyclopedia (2000), The Cambridge History of Southeast Asia (2004), or Corfield's The History of Vietnam (2008). It should not matter why the marks are not used. But the fact that news sources based in Vietnam, including Saigon Times and Voice of Vietnam, do not use them either suggests that the reason does not relate to staffing or technology, but rather to the readability issue cited by National Geographic. The diacritics will still be given in the opening, whatever happens with the titles. Kauffner (talk) 02:39, 16 August 2012 (UTC)
- Well, in light of the arguments put forth so far, especially by Михаил Александрович Шолохов, maybe those reliable sources are, in fact, wrong in this case. Maybe it is a staffing and technology issue (it takes a good bit more effort to type Vietnamese accents on an English keyboard than, say, French or Spanish). Maybe Wikipedia can actually do it better than those other sources. That's what I think. Cheers, Zaldax (talk) 13:09, 16 August 2012 (UTC)
- You think that writers at English-language publications in Vietnam may have trouble typing the necessary diacritics into their computer keyboards? I'm not sure whether this post is meant to be taken seriously or not, but I'll play it straight. It's certainly a subject I know a thing or two about. There is a program called UniKey which is pretty standard for computers in Vietnam. Vietnamese all learn to type with it. Here's how I do it for publication. There is a Vietnamese word in the article that's italicized and given with diacritics. But place names and so forth are all anglicized. Kauffner (talk) 14:57, 16 August 2012 (UTC)
- It was a serious comment, actually. While it's quite interesting to learn how they've worked around the problem (thanks for the program link; maybe that will come in handy someday), my statement was primarily referring to individuals reporting on Vietnam from outside the country. News organizations such as the BBC, National Geographic, etc. may or may not use diacritics when discussing Vietnam-related topics, but much of their coverage is written on English-language keyboards outside the country. Assuming that they, as well as our other references omitting diacritics, are doing some portion of their work in a standard word processor such as Microsoft Word, I would not be surprised if the decision to omit diacritics was solely a labor-saving one. Compare the amount of time it takes to type "D" and "Đ" (one key vs. three; ctrl+'+D), or "O" and "Ơ" (one key vs. at least three). I literally just spent ten minutes trying to figure out the first, and wasn't able to figure out the latter. Given that the shortcut for Spanish, French, Italian, etc. accents is slightly more intuitive (ctrl+`+letter, ctrl+'+letter, ctrl+,+letter, etc.), it's no surprise that news organizations and other sources use them while omitting them for Vietnamese topics. Furthermore, along the same lines, Czech accents also have less intuitive shortcuts, and yet we use those on Wikipedia. In any case, those who don't write solely on Vietnamese-based topics can't be expected to learn the keyboard shortcuts, and since those who don't know them have to go through "Insert Special Character" to type them, their omission is unsurprising. Fortunately, this isn't a problem for the reader, as that's what redirects are for.
- Kauffner, I'm interested in hearing your thoughts on Михаил Александрович Шолохов's recent comments, particularly the issue of possible "false friends" with diacritic-stripped names; i.e. the given example of "Cao Quang Anh" vs. "Cao Quang Ánh". Hypothetically, if we were to strip all diacritics from article titles, how would we handle these possible redundancies? Would we create a disambiguation page for each conflict, with a summary of each entry? It seems like doing so might create a great deal of unnecessary confusion, but I'm interested in hearing what your solution to this problem might be. Cheers, Zaldax (talk) 15:55, 16 August 2012 (UTC)
- The BBC has a Vietnamese edition, so they obviously have the technical ability to present diacritics. For sites based in Vietnam, the original version of the story is usually written in Vietnamese, and the diacritics are removed by the translator. They mean nothing to the vast majority of English-speaking readers. No one is going to learn Vietnamese from English-language news reports (or from Wikipedia). Title clashes are a common problem, and there are various workarounds. The long-term solution is software that handles more than one instance of a title. Kauffner (talk) 04:23, 17 August 2012 (UTC)
Continuing Discussion (Arbitrary Break)
- I guess after all it is a matter of choice: convenience or correctness? If the diacritics are not omitted, these "annoying marks" may cause discomfort to readers who are still not familiar with Vietnamese names. If they are omitted, there is a high chance of misreading and misinterpreting the names. To me, sacrificing some convenience to prevent the high chance of misinterpretation and to preserve the full information of the name is still worthwhile. For more information, Kauffner, I believe that most English and Vietnamese newspapers, both online and hard-copy ones, omit the diacritics in French, Czech, Spanish, German,... names, too. But, if my memory is not wrong, many of these names with diacritics still exist in en.wikipedia. Михаил Александрович Шолохов (talk) 07:38, 17 August 2012 (UTC)
- I strongly agree with the above sentiment. I want to particularly emphasize that you can't anglicize a word just by stripping the diacritics; think of a letter with diacritic as an entirely separate letter, not just a letter with a funny symbol attached to it. Not only does a diacritical mark change the pronunciation, but it changes the meaning as well. One of the first examples I learned in Czech was "byt = flat (apartment)" whereas "být = 'to be'". Thousands more examples exist in languages worldwide. My question is, would it really be worth all of the extra trouble of disambiguation pages, debating WP:PRIMARYTOPIC, confusion on the part of those familiar with Vietnamese names, debating the proper anglicization of the name, etc. when we could solve the entire issue just by including the diacritics? Zaldax (talk) 13:31, 17 August 2012 (UTC)
- I don't think anybody is saying that diacritics should be stripped, just that (1) If they are not frequently used and well established in English sources then they should NOT be used in article titles, (2) Both versions with and without diacritics should be used together in the article body, using a language template that properly adds semantic tags to foreign words. LittleBen (talk) 05:34, 18 August 2012 (UTC)
- Foreign names and terms that are frequently used with simple diacritics in English sources, and so are recognizable—like "Pelé" and "Bête noire"—are surely not a problem in article titles of English Wikipedia. But foreign names and foreign terms that the great majority of English Wikipedia users cannot read, write, remember, or pronounce should NEVER be used in article titles. In choosing a title or slogan, the whole idea is to make it as easy as possible for the majority to read, recognize, remember—and to type in and search for—it.
- Terms and names that are not unique (like Manuel Sanchez) can always be disambiguated. LittleBen (talk) 14:19, 16 August 2012 (UTC)
- The entire issue could still be avoided in many cases just by including the diacritics, which will not only properly inform but will also reduce unnecessary disambiguation pages. Zaldax (talk) 13:31, 17 August 2012 (UTC)
- Kauffner, this is going in circles, every one of these semi-correct statements has been made before and answered before. At this point instead of trying to dominate discussion by repeating the same arguments which have already been responded to, you should instead be using bytes to explain your use of IPs, your canvassing, your 1,700 or so undiscussed moves, and your use of redirect edits to lock your moves from being reverted. In ictu oculi (talk) 02:54, 16 August 2012 (UTC)
- To which Kauffner's response was to move and strip Vietnamese text from another biography and to resume editing redirects. It's pretty evident that this User is only interested in getting his own way - whether it's by use of undiscussed moves, G6 proxies, IPs, edit redirects, whatever, and the majority here for treating Vietnamese like any other Latin-alphabet language don't count for anything. In ictu oculi (talk) 09:06, 17 August 2012 (UTC)
- I thought there was a gentleman's agreement to refrain from changing article titles while this conversation was ongoing? Since I don't think it has been said explicitly before, and since any implicit arrangement has broken down, I'll go ahead and say it: Please refrain from making any edits to article titles involving diacritics until this discussion has come to a close. I would say "controversial edits", but seeing as how they all are, no sense being redundant. Continued attempts to change article titles (by anyone, in any way) will probably be seen as attempts to circumvent consensus. I don't care if the edit is to add diacritics, or remove diacritics*, continued attempts to circumvent this discussion may result in escalation to WP:WQA or WP:DRN. *Unless the edit is to restore an article title that has been changed since this message was posted. I recognize that I can't stop anyone from doing this, but in the interest of allowing the process to do its job I'm putting my stern hat on for a moment.
- With regards to the conversation "going in circles", perhaps it's time we get some more eyes on it? I recommend posting another message at the Village pump, to attract editors with no previous relation to the topic to provide a more moderating voice. Otherwise, I'm not sure if we'll end up with an ironclad result. Cheers, Zaldax (talk) 13:31, 17 August 2012 (UTC)
- Zaldax, not trying to be an ass, but posting to village pump again won't make a difference. The diacritic war on the wiki has been going on for the entire 8 years I have been around the wiki and I am sure longer. It isn't likely to end any time soon. These discussions happen at least once a year if not multiple times a year (I have been in at least 3 RfCs on the matter this year alone), and there is never really any true consensus in any of them. Posting won't get more people to show up to the discussion because most people on the wiki are totally sick of the discussion. There have been entire site-wide RfCs on the matter with hundreds of participants and the result is always the same with about a 50-50 split. If it's an ironclad result you are looking for, this isn't a topic area that will have one. -DJSasso (talk) 14:35, 17 August 2012 (UTC)
- No worries, you're not being an ass! I didn't realize that they'd been raging for that long, damn. Well, in any case, it looks to me that the "diacritic wars" are mostly settled for the time being; this appears to be the last major holdout left. I'd argue that there is a rough consensus (and stronger arguments) so far in favor of using diacritics in circumstances where there is no commonly-used name without them (the frequently-stated Hanoi, Saigon, etc). If more people aren't coming, then we need to draft the proposal and close the issue, as this RFC is starting to come to a close; otherwise, this war will just drag on and on and on. Cheers, Zaldax (talk) 15:00, 17 August 2012 (UTC)
- Hey, Zaldax, would you like to be forced to learn to read Japanese and Chinese in order to read English Wikipedia? If not, don't you think that a majority of users would not want to be forced to learn to read Vietnamese and other diacritics in order to read English Wikipedia? Should everybody on Wikipedia be forced to adopt your religion as well? LittleBen (talk) 15:09, 17 August 2012 (UTC)
- That is a strawman, not even remotely similar. These are both Latin-based alphabets. The diacritics in most cases do not change the readability of the name. You don't need to learn another language to read them. If you choose not to pay attention to them you can in most cases still read the word, albeit in a misspelled/mispronounced way. Japanese and Chinese etc use a completely different set of characters and would not be readable by English readers. Vietnamese words with diacritics on the other hand are. -DJSasso (talk) 15:29, 17 August 2012 (UTC)
- Take it easy there, buddy. There's a big difference between a Latin alphabet such as Vietnamese and other scripts such as Japanese and Chinese. While I might not know how a letter like "Đ" is pronounced, I can certainly recognize it as a D, as I'm sure everyone else does. However, I, and everyone else, would also know that it isn't the same thing; maybe that'll inspire me to go and look up "Vietnamese alphabet" and learn something. Keep in mind that Vietnamese isn't just based on the Latin alphabet (see Cherokee alphabet for an example of one that is); it uses the Latin alphabet, and modifies it with diacritics. So, perhaps a better question would be "Should everybody on Wikipedia be forced to adopt the 26 unmodified letters of the English alphabet?" My religion has nothing to do with it. Cheers, Zaldax (talk) 15:29, 17 August 2012 (UTC)
@Ben: In my opinion, you should replace "Japanese" and "Chinese" with "French", "Spanish", "German", etc. We have to note that, just as in the Western European countries, our Vietnamese alphabet is now a Latin-based one. At the moment, Vietnamese people do not use kanji, kana, joseongul or whatever "weird" characters; we use Latin characters such as a, b, c, d, e,... together with several diacritics, just like some European languages. There are some people who suggest that Vietnam should add the letters "z", "j" and "w" to our alphabet. So, I believe it is fairer to replace "Japanese" and "Chinese" with some languages of European countries, which use Latin characters together with diacritics. Yours faithfully, Михаил Александрович Шолохов (talk) 15:25, 17 August 2012 (UTC)
- Most English Wikipedia users can ignore simple diacritics if they want to, but most English Wikipedia users cannot read, write, or pronounce complex diacritics—just as they cannot read, write, or pronounce Chinese, Japanese, Korean or Vietnamese. So (1) it is not beneficial to the majority of Wikipedia users to embed Chinese, Japanese, Korean, or complex diacritics in Wikipedia article titles, and (2) it is not beneficial to use them in the body of articles without using a language template such as Template:CJKV or a link to Wiktionary to properly explain them. Forcing readers to try to read lots of complex diacritics is surely as annoying and time-wasting for the majority of users as trying to read articles that are written in ALL CAPITALS, or an article that is stuffed with lots of Chinese, Japanese or Korean—if there are no language templates or links to Wiktionary to explain the foreign words. People who prefer foreign languages can read the corresponding foreign-language Wikipedia. LittleBen (talk) 16:10, 17 August 2012 (UTC)
- I totally agree with the points made by LittleBen. This is the English-language Wikipedia, and clearly English-language users should be taken into account. BritishWatcher (talk) 18:47, 17 August 2012 (UTC)
- I also agree with LittleBen on this. Fyunck(click) (talk) 19:13, 17 August 2012 (UTC)
- Once again, stripping the letters of diacritical marks means you are actually changing the spelling of the word. In other words, not only is the word potentially confused with others, and stripped of information, it's actually spelled wrong. I know that some readers may feel a tad uncomfortable seeing unusual diacritics, but they'll still be able to just ignore them. Besides, this is an encyclopedia; our mission is to record information, not "dumb things down." Cheers, Zaldax (talk) 02:00, 18 August 2012 (UTC)
- I'm not doing anything since I go with English sources. If they have diacritics that's what we should be using. If they don't then we don't. Makes it easy to find and add the proper English sources. And it's one thing to let readers know how a name is spelled in a foreign language, it's another to thrust upon us a foreign spelling if it isn't used in English sources. Fyunck(click) (talk) 05:26, 18 August 2012 (UTC)
- Did you seriously just go Canvass a user who you know is very anti-diacritics? -DJSasso (talk) 19:23, 17 August 2012 (UTC)
- This is in fact canvassing; after a bit of digging, the only individuals he has notified are known for opposition to diacritical use. I have left a notice on Ben's talk page here. If you notify editors, be sure to notify all potentially interested parties. Cheers, Zaldax (talk) 01:55, 18 August 2012 (UTC)
- The Wikitravel romanization guide (Wikitravel:Romanization) seems pretty sensible to me: much simpler and clearer than Wikipedia. LittleBen (talk) 19:19, 17 August 2012 (UTC)
- That page doesn't exist. Were you trying to make a point there, or was that a mistake? Zaldax (talk) 02:01, 18 August 2012 (UTC)
- Try the Wikitravel:Romanization link that I just added. LittleBen (talk) 02:59, 18 August 2012 (UTC)
- @BritishWatcher, Fyunck. You seem to be proposing (D) Daily Express MOS. Is that correct? If it is correct would you mind please signing in box D to help get a more representative capture of the variety of views. I take it what you want to see is
- Example = title: Truong Tan Sang, lede: "Truong Tan Sang (Vietnamese Trương Tấn Sang)" body text: Truong Tan Sang
- Is this correct?
- @LittleBenW, thanks for signing in the census box. Do you also mean
- Example = title: Truong Tan Sang, lede: "Truong Tan Sang (Vietnamese Trương Tấn Sang)" body text: Truong Tan Sang
- Or did you mean something else?
- Quote: "Did you mean Truong Tan Sang, lede: "Truong Tan Sang (Vietnamese Trương Tấn Sang)" body text: Truong Tan Sang?" See the Wikitravel romanization guide. CJKV languages (Chinese, Japanese, Korean and Vietnamese) have long-established standards for romanization. Use them! Also, when embedding any foreign language in English Wikipedia, please use templates like Template:CJKV or Template:Lang that add the correct language-related semantic markup to the page and ensure that the browser uses fonts appropriate for the language (fonts that look professional). For example, if English embedded in a Japanese page is not marked up as English, then the browser displays it using a Japanese font, which looks really ugly. Also, Chinese and Japanese share some so-called "common characters" that look quite different depending on whether they are displayed in Chinese or Japanese fonts. Web standards are there for a reason. As explained by no less an authority than Wikipedia: in many countries, organizations may be prosecuted for not abiding by such web accessibility standards. LittleBen (talk) 03:12, 18 August 2012 (UTC)
- Quote: "@LittleBenW, thanks for signing in the census box". @IIO, thanks for fraudulently modifying the contents of the census after the fact from the "Diacritics in Article titles or not" RfC subject to "ALL DIACRITICS, NO ENGLISH (in article title and body) or vice versa". LittleBen (talk) 04:24, 18 August 2012 (UTC)
Okay LittleBen, you have one last chance to start behaving in a civil way before I request admin assistance for personal attacks here. You have repeatedly had it explained to you that I clarified my own census form for your benefit, among others, since you seem to have communication difficulties. Let me tell you that I, unlike some other anti-foreign name editors whom I see with no moral or personal standards at all, do not do things "fraudulently". What I am trying to do is modify my census form again in a way that suits LittleBenW's dictates, if you will respond in a clear and coherent way when being asked a straight question. Now the link you gave had "Xilin Pagoda (西林塔 Xīlíntǎ)" which seems to confirm the previous. So, now I'm going to repaste the question again in good faith, and this time instead of launching a personal attack, look at the question and try and answer:- Actually I'm done with saying anything to this user. If he can't see the difference between "Manuel Sánchez" and "西林塔" he's out of the ballpark, it's a waste of time engaging with him, with or without the personal attacks. We'll be going through Zoë Baird again next. In ictu oculi (talk) 05:19, 18 August 2012 (UTC)
To those who continue to object to diacritic use in article titles, are you willing to come up with a proposal for transliterating those names into the standard 26 letters of the English alphabet? Simply stripping the article title of diacritics and pretending that the letters are the same is not an encyclopedic standard, in my opinion. Cheers, Zaldax (talk) 02:14, 18 August 2012 (UTC)
- There's no need; plus, it would be original research to just plop up a transliteration without it being backed up by English sources. If most usage/sourcing in common English strips the diacritics, for whatever reason, that's what the title should be sitting at. That would be the common English rendering. Again, you make sure the foreign spelling is in there too, so our readers are informed of all spellings. That would only make sense since we don't want to censor it...it exists whether we like a spelling or not. And Encyclopedia Britannica sometimes strips out the diacritics... but they then usually show the foreign and English spelling next to each other. Fyunck(click) (talk) 05:39, 18 August 2012 (UTC)
- Try the Wikitravel romanization link that I just added above. LittleBen (talk) 02:59, 18 August 2012 (UTC)
- In response to the personal attacks from LittleBenW I have further detailed my census form - which no one among the oppose section but Yopienso and Quis/RMS has actually given a clear answer to - in the hope that it may still generate some clarity. For those who've been round this subject before with Spanish and Czech, the usual default of an accentless title is this, and the example is presented in good faith. For those who still need "Other", the other box is still there.
- Example:
- Title = Truong Tan Sang
- Lede = Truong Tan Sang (Vietnamese Trương Tấn Sang)
- Body = Truong Tan Sang
- If this isn't clear I don't know what will be. LittleBenW, think very, very carefully before responding with another personal attack. In ictu oculi (talk) 05:34, 18 August 2012 (UTC)