
Template talk:Lang/Archive 10


Language articles

Can someone add a parameter for languages that don't have the word 'language' in their article name like Afrikaans, Amharic, Bislama, Bokmål, Church Slavonic, Dzongkha, Esperanto, Haitian Creole, Hindi, Interlingua, Kannada, Kinyarwanda, Kirundi, Latin, Lingala, Luganda, Malayalam, Nynorsk, Old Church Slavonic, Pali, Pashto, Sanskrit, Scottish Gaelic, Standard Tibetan, Tok Pisin, Twi, Urdu and Volapük? -- PK2 (talk) 05:33, 9 June 2020 (UTC)

Why? Are there any {{lang-??}} templates for the languages you name that do not link to the correct en.wiki article?
Trappist the monk (talk) 10:38, 9 June 2020 (UTC)
Are you able to edit the {{language with name}} template to produce the sequence [[x]] instead of [[x language|x]] in the cases of languages like Latin and Old English, whose article names do not end in language, or failing that, templates like {{lang-ang}}, {{lang-am}}, {{lang-bi}}, {{lang-nb}}, {{lang-cu}}, {{lang-dz}}, {{lang-eo}}, {{lang-ht}}, {{lang-hi}}, {{lang-ia}}, {{lang-kn}}, {{lang-rw}}, {{lang-rn}}, {{lang-la}}, {{lang-ln}}, {{lang-lg}}, {{lang-ml}}, {{lang-nn}}, {{lang-pi}}, {{lang-ps}}, {{lang-sa}}, {{lang-gd}}, {{lang-bo}}, {{lang-tpi}}, {{lang-tw}}, {{lang-ur}}, {{lang-vo}}, and {{lang-??}} for languages whose article names don't end in language? -- PK2 (talk) 11:05, 10 June 2020 (UTC)
Perhaps I can, but you haven't answered the two questions I posed: Why? Are there any {{lang-??}} templates for the languages you name that do not link to the correct en.wiki article? Here is your second list rewritten as the various template calls. Rephrasing my second question: Do any of the language labels rendered by the templates in this list, link to some place that they should not?
  • {{lang-ang|lang-ang}} → {{lang-ang|lang-ang}}
  • {{lang-am|lang-am}} → {{lang-am|lang-am}}
  • {{lang-bi|lang-bi}} → {{lang-bi|lang-bi}}
  • {{lang-nb|lang-nb}} → {{lang-nb|lang-nb}}
  • {{lang-cu|lang-cu}} → {{lang-cu|lang-cu}}
  • {{lang-dz|lang-dz}} → {{lang-dz|lang-dz}}
  • {{lang-eo|lang-eo}} → {{lang-eo|lang-eo}}
  • {{lang-ht|lang-ht}} → {{lang-ht|lang-ht}}
  • {{lang-hi|lang-hi}} → {{lang-hi|lang-hi}}
  • {{lang-ia|lang-ia}} → {{lang-ia|lang-ia}}
  • {{lang-kn|lang-kn}} → {{lang-kn|lang-kn}}
  • {{lang-rw|lang-rw}} → {{lang-rw|lang-rw}}
  • {{lang-rn|lang-rn}} → {{lang-rn|lang-rn}}
  • {{lang-la|lang-la}} → {{lang-la|lang-la}}
  • {{lang-ln|lang-ln}} → {{lang-ln|lang-ln}}
  • {{lang-lg|lang-lg}} → {{lang-lg|lang-lg}}
  • {{lang-ml|lang-ml}} → {{lang-ml|lang-ml}}
  • {{lang-nn|lang-nn}} → {{lang-nn|lang-nn}}
  • {{lang-pi|lang-pi}} → {{lang-pi|lang-pi}}
  • {{lang-ps|lang-ps}} → {{lang-ps|lang-ps}}
  • {{lang-sa|lang-sa}} → {{lang-sa|lang-sa}}
  • {{lang-gd|lang-gd}} → {{lang-gd|lang-gd}}
  • {{lang-bo|lang-bo}} → {{lang-bo|lang-bo}}
  • {{lang-tpi|lang-tpi}} → {{lang-tpi|lang-tpi}}
  • {{lang-tw|lang-tw}} → {{lang-tw|lang-tw}}
  • {{lang-ur|lang-ur}} → {{lang-ur|lang-ur}}
  • {{lang-vo|lang-vo}} → {{lang-vo|lang-vo}}
Though included in your second list, this one is a member of the set-of-language-article-names-ending-in-'language' so is not included in the list above:
  • {{lang-ka|lang-ka}} → {{lang-ka|lang-ka}}
Trappist the monk (talk) 12:09, 10 June 2020 (UTC)
Sorry, I accidentally forgot to correct {{lang-ka}} (for Georgian) to {{lang-kn}} (for Kannada) before, and as for your questions:
1. Because I think it's better for templates and articles to link directly to the main links, where the article's or template's main content is, rather than to their redirects or even disambiguation pages (formerly the case with {{lang-lij}}, for Ligurian (Romance language)).
2. The {{lang-??}} templates that I listed above, and more: they correctly link to their respective articles, but via redirects rather than directly.
Anyway, I'm sorry for not answering your two questions initially. -- PK2 (talk) 03:24, 11 June 2020 (UTC)
I agree that these templates should not link to disambiguation pages. If you know of any that do, please report them here so that they can be fixed.
Redirects are not bad. Wikipedia uses them a lot and has an efficient mechanism for handling them. The templates named in this discussion rely on that mechanism to link to target language-articles via redirects. This works regardless of how those target articles are actually named. Is there a more substantive reason for your request than what appears to be personal preference? Are there technical or reader-facing issues that will be resolved were we to somehow code-around the use of redirects?
Trappist the monk (talk) 11:38, 11 June 2020 (UTC)

bugfix request: problem displaying Meitei script (mni) in Chrome on Windows

The font settings for the Meitei script (mni) need updating. I've noticed a bug displaying the script on both en.wikipedia and Meta Wikimedia. I have fonts installed that contain the script (Noto Sans Meetei Mayek and the Microsoft font family Nirmala UI), but the text is displayed in some other font, which gives me tofu. I get the problem using Chrome on Windows, a very common setup, so it's probably not just me. If the fix requires readers to change settings, perhaps there should be a note explaining how.

example | output format | the output I see on: Chrome on meta.wikimedia | Chrome on en.wikipedia | Edge | Firefox
ꯏꯟꯗꯤꯌꯥ | font-family:'Noto Sans Meetei Mayek','Nirmala UI','Noto Sans';
ꯏꯟꯗꯤꯌꯥ | {{lang|mni|ꯏꯟꯗꯤꯌꯥ}}
ꯏꯟꯗꯤꯌꯥ | {{Script|Mtei|ꯏꯟꯗꯤꯌꯥ}}
ꯏꯟꯗꯤꯌꯥ | none

Irtapil (talk) 16:11, 9 June 2020 (UTC)

Sandboxed:
{{lang/sandbox|mni|ꯏꯟꯗꯤꯌꯥ}} → {{lang/sandbox|mni|ꯏꯟꯗꯤꯌꯥ}}
{{lang/sandbox|mni|ꯏꯟꯗꯤꯌꯥ}}
{{lang-mni/sandbox|ꯏꯟꯗꯤꯌꯥ}} → {{lang-mni/sandbox|ꯏꯟꯗꯤꯌꯥ}}
{{lang-mni/sandbox|ꯏꯟꯗꯤꯌꯥ}}
Trappist the monk (talk) 22:51, 13 June 2020 (UTC)
Related discussion at Template talk:Lang-he-n § Correct markup for Niqud with transliteration?.
Trappist the monk (talk) 23:20, 13 June 2020 (UTC)

Error in documentation

There is an error in the documentation. Near the end of the "Applying styles" section it shows {{lang|de|Victor jagt zwölf Boxkämpfer quer über den Sylter Deich}} displayed in blackletter/fraktur. This is wrong, as de defaults to de-Latn, and blackletter requires de-Latf markup (on systems that support this script). See ISO 15924 for the script codes Latn and Latf, and specifically the IANA Language Subtag Registry which explicitly stipulates that the "Suppress-Script" code for subtag de is Latn. Love —LiliCharlie (talk) 01:27, 16 June 2020 (UTC)
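
The Suppress-Script point can be illustrated with a minimal Python sketch. It is not taken from Module:Lang: the names `SUPPRESS_SCRIPT` and `effective_script` are hypothetical, and the table holds only the single entry stated in the comment above (de → Latn).

```python
# Hypothetical illustration; not part of Module:Lang.
# Only the Suppress-Script value stated above (de -> Latn) is included.
SUPPRESS_SCRIPT = {"de": "Latn"}


def effective_script(tag):
    """Return the script implied by a language tag, or None if unknown.

    An explicit four-letter script subtag wins; otherwise fall back to
    the Suppress-Script value of the primary language subtag.
    """
    subtags = tag.split("-")
    for subtag in subtags[1:]:
        # Script subtags are exactly four letters (ISO 15924).
        if len(subtag) == 4 and subtag.isalpha():
            return subtag.title()
    return SUPPRESS_SCRIPT.get(subtags[0].lower())


print(effective_script("de"))       # Latn
print(effective_script("de-Latf"))  # Latf
```

Under this reading, plain `de` markup implies Latin script, so Fraktur rendering would need the explicit `de-Latf` tag.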

Documentation can almost always be improved. If you can improve this template's documentation, please do.
Trappist the monk (talk) 11:06, 16 June 2020 (UTC)

most lang-?? templates switched to the module (revisited)

Because of these:

Template talk:Lang § bugfix request: problem displaying Meitei script (mni) in Chrome on Windows
Template talk:Lang-he-n § Correct markup for Niqud with transliteration?

I am minded to revisit this list of {{lang-??}} templates that were not switched to use Module:Lang for the stated reasons. Here is the reorganized and updated list, more-or-less grouped according to the reason they were not included in the original conversion:

Unique:

Specialized styling support created as a result of the above-mentioned discussions should allow these to be converted. I suspect that the best mechanism for those that can be converted is to create private-use-tag versions of these templates ({{lang-he-n}} → {{lang-he-x-styled}} or some such standardized tag), replace transclusions in article space, and then update the replaced templates to use Module:lang ({{lang-ku}} and {{lang-su}} already exist so {{Lang-ku-Arab}} and {{Lang-su-fonts}} could be deleted).

  • {{Lang-he-n}} – special version of {{lang-he}} to use {{script/Hebrew}} to render Hebrew text with Niqqud diacritical marks; not sure what to do with this one – 4990 transclusions (previously 3521)
    Module:Lang/sandbox supports this:
    {{lang/sandbox|he-x-niq|אֵם קְרִיאָה}} → [אֵם קְרִיאָה] Error: {{Lang}}: unrecognized private tag: niq (help)
    to be done is to settle on the standardized private use tag name and create the appropriate {{lang-he-x-??}} template
  • {{Lang-khb}} – calls {{script|Talu|{{{1}}}}} which calls {{Script/New Tai Lue}} to wrap {{{1}}} in <span>...</span> tags with several fonts – 7 article transclusions (previously 1)
  • {{Lang-ksw}} – calls {{Script/ksw-Mymr}} to wrap {{{1}}} in <span>...</span> tags with several fonts – 31 ksw transclusions
  • {{Lang-ku-Arab}} – calls {{Script/Arabic}} to wrap {{{1}}} in <span>...</span> tags with several fonts – 11 transclusions
  • {{Lang-mnw}} – calls {{Script/mnw-Mymr}} to wrap {{{1}}} in <span>...</span> tags with several fonts – 77 transclusions (previously 50)
  • {{Lang-nod}} – calls {{Script/Tai Tham}} to wrap {{{1}}} in <span>...</span> tags with several fonts – 29 transclusions (previously 25)
  • {{Lang-shn}} – calls {{Script/shn-Mymr}} to wrap {{{1}}} in <span>...</span> tags with several fonts – 42 transclusions (previously 20)
  • {{Lang-su-fonts}} – wraps {{{1}}} in a <span>...</span> tag that applies special fonts and sizing; does not provide labeling in the manner of most other {{lang-??}} templates – 67 transclusions (previously 39)
  • {{Lang-vi-hantu}} – calls {{vi-nom}} which calls {{lang}} with text wrapped in <span>...</span> tags with several fonts; uses Hán tự as a label (a redirect to History of writing in Vietnam) which is inconsistent with the style of {{lang-??}} templates; a non-English label seems inappropriate at en.wiki – 152 transclusions (previously 23)

The special handling mentioned for these templates has been added so these should be converted

  • Green tickY {{Lang-pra}} – IANA/ISO 639 define code pra as 'Prakrit languages', a collective of individual languages; special handling in Module:lang is required for collections – 9 article transclusions (previously 2)
    {{#invoke:lang|lang_xx_inherit|code=pra|text=text}} → Prakrit languages: text
  • Green tickY {{Lang-sal}} – IANA/ISO 639 define code sal as 'Salishan languages', a collective of individual languages; special handling in Module:lang is required for collections – 4 article transclusions (previously 1)
    {{#invoke:lang|lang_xx_inherit|code=sal|text=text}} → Salishan languages: text
  • Green tickY {{Lang-sla}} – IANA/ISO 639 define code sla as 'Slavic languages', a collective of individual languages; special handling in Module:lang is required for collections – 8 article transclusions (previously 4)
    {{#invoke:lang|lang_xx_inherit|code=sla|text=text}} → Slavic languages: text

The special handling mentioned for these templates has been added. These have been converted:

  • {{Lang-roa}} – IANA/ISO 639 define code roa as 'Romance languages', a collective of individual languages; special handling in Module:lang is required for collections – no article transclusions; delete?
    {{lang-roa|text}} → {{lang-roa|text}}
  • {{Lang-son}} – IANA/ISO 639 define code son as 'Songhai languages', a collective of individual languages; special handling in Module:lang is required for collections – no article transclusions; delete?
    {{lang-son|text}} → {{lang-son|text}}
  • {{Lang-wen}} – IANA/ISO 639 define code wen as 'Sorbian languages', a collective of individual languages; special handling in Module:lang is required for collections – 8 article transclusions
    {{lang-wen|text}} → {{lang-wen|text}}

Transliterations and other rendering not standard to the {{lang-??}} templates:

  • {{Lang-ka}} – has support for automatic transliteration when {{{2}}} is set to tr; an insource search finds 83 instances of the template that use this functionality; not sure what to do with this one – 4588 transclusions (previously 3819)
  • {{Lang-mnc}} – has support for two simultaneous transliteration renderings – 79 transclusions (previously 47)
  • {{Lang-os}} – has support for IPA rendering plus transliteration, none of which is documented and may only be used in a very few articles – 197 transclusions – converted
  • {{Lang-rus}} – has support for IPA rendering plus transliteration, none of which is documented and may only be used in a very few articles – 2729 transclusions (previously 2073)
  • {{Lang-tt}} – provides labeling for simultaneous rendering of Cyrillic, Latin, and Arabic scripts; this functionality apparently never documented – 488 transclusions (previously 402)
  • {{Lang-ug}} – provides for simultaneous rendering of multiple transliterations – 261 transclusions (previously 235)

Templates with language code issues:

These miscellaneous templates have been resolved or deleted:

  • {{Lang-lij}} – one of two Ligurian languages; officially 'Ligurian' but the en.wiki article is at Ligurian (Romance language) (the other officially is 'Ligurian (Ancient)' and its article is at Ligurian language (ancient) – there is no {{lang-xlg}}); may require article renaming or the creation of suitable redirects to make this template work with Module:lang – 26 transclusions – resolved
  • {{lang-sh2}} – has support for automatic transliteration when {{{2}}}; mechanism is different from that used in {{lang-ka}} – 3 article transclusions – deleted

Opinions? Comments?

Trappist the monk (talk) 15:22, 14 June 2020 (UTC)

In the above list, those marked with Green tickY have been converted.
Trappist the monk (talk) 13:06, 31 August 2020 (UTC)

Turn off link?

Is there a way to have a parameter that turns off the link to the language's article? For example, I use {{lang-ka}} a number of times in Dali (goddess), which creates a link to Georgian language every time, causing a bit of an overlinking issue. It would be nice to have a link=no parameter so I could turn that off. ♠PMC(talk) 22:40, 1 September 2020 (UTC)

{{Lang-ka}} is not one of the {{lang-??}} templates supported by Module:Lang. It is not supported because it has peculiar functionality, so it was never converted (see Template talk:Lang § most lang-?? templates switched to the module (revisited)). So, you can edit that template to add |link= pass-through to {{Language with name}} or you can spoof {{lang}}, which does support |link=no:
{{lang|fn=lang_xx_inherit|code=ka|text=დალი|link=no}} → Georgian: დალი
|fn=lang_xx_inherit is the Lua function in Module:Lang that would be called were {{lang-ka}} converted to use the module.
Trappist the monk (talk) 23:02, 1 September 2020 (UTC)
Hi Trappist the monk, sorry about the long delay in responding - I got busy and forgot about this. I apologize, I don't know what you mean by editing the template to add |link= pass-through. I'm a bit wary of editing hi-vis templates like that, and I don't want to mess it up. ♠PMC(talk) 04:40, 12 September 2020 (UTC)
That's why I suggested using:
{{lang|fn=lang_xx_inherit|code=ka|text=დალი|link=no}} → Georgian: დალი
with that, no need to edit a protected template with all that that entails...
Trappist the monk (talk) 10:17, 12 September 2020 (UTC)
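
The effect of |link=no discussed in this thread can be sketched in a few lines of Python. This is an illustration only, not the Lua in Module:Lang: the function name `language_prefix` is hypothetical, and the link target follows the [[x language|x]] pattern described earlier on this page.

```python
# Hypothetical sketch of the |link= behaviour; names are illustrative only.
def language_prefix(name, text, link=True):
    """Build the '<language>: <text>' wikitext prefix of lang-?? style output."""
    if link:
        # Piped link to the '<name> language' article, labelled with the name.
        label = "[[{0} language|{0}]]".format(name)
    else:
        # link=no: plain language name, no wikilink.
        label = name
    return "{}: {}".format(label, text)


print(language_prefix("Georgian", "დალი"))
print(language_prefix("Georgian", "დალი", link=False))  # Georgian: დალი
```

Linking the first occurrence in an article and passing link=no (here `link=False`) for the rest would avoid the overlinking described above.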

update to the live modules

The primary purpose of {{lang}} and the {{lang-??}} templates is to create properly formed html markup for non-English text in Wikipedia articles.

According to html, the lang= attribute must be an IETF language tag. The components of language tags are defined in the IANA language-subtag-registry file. The subtag-registry file excludes three-character language codes (ISO 639-2, -3, -5) that are synonyms of the two-character ISO 639-1 language codes. The subtag-registry file also excludes the deprecated ISO 639-3 code hbs (Serbo-Croatian) and excludes all of the ISO 639-2B language codes.

The current Module:Lang uses language code data from four distinct sources:

Module:Language/data/iana languages – derived from the IANA subtag-registry file
Module:Language/data/ISO 639-3 – derived from data available from the ISO 639-3 custodian
Module:Language/data/wp languages – hand curated data used to override standard names to en.wiki preferred names; unclear provenance and ofttimes invalid IETF tags
Module:Lang/data – hand curated data used to override standard names to en.wiki preferred names and to override the invalid IETF-like tags in ~/wp languages

A legacy module, Module:Language/name/data, assembles the first three of these data modules into a single data module at runtime. This is an unnecessary duplication of data.

Because ~/iana languages already has the ISO 639-3 data and because of the flaws in ~/wp languages, I have changed Module:Lang/sandbox to use only the language data from ~/iana languages and from ~/data. Some of the overrides in ~/wp languages are valid so I have added those data into ~/data. Here is a list of those data that were not added:

Module:Language/data/wp languages tags / names not added to Module:Lang/data/sandbox
tag | ~/wp languages name | iana name
these ~/wp languages names are identical to the iana names; redundant
an | Aragonese | Aragonese
kk | Kazakh | Kazakh
km | Khmer | Khmer
mzn | Mazanderani | Mazanderani
naq | Khoekhoe | Khoekhoe
nci | Classical Nahuatl | Classical Nahuatl
pi | Pali | Pali
ro | Romanian | Romanian
rw | Kinyarwanda | Kinyarwanda
si | Sinhalese | Sinhala
skr | Saraiki | Saraiki
za | Zhuang | Zhuang
names associated with collective codes include the word 'languages'; presumably in ~/wp languages to return only the language name portion
Module:Lang handles these automatically
ber | Berber | Berber languages
bh | Bihari | Bihari languages
cel | Proto-Celtic | Celtic languages
gem | Proto-Germanic | Germanic languages
myn | Mayan | Mayan languages
nah | Nahuatl | Nahuatl languages
pra | Prakrit | Prakrit languages
sal | Salish | Salishan languages
sla | Slavic | Slavic languages
son | Songhay | Songhai languages
wen | Sorbian | Sorbian languages
these ~/wp languages names were (presumably) included to quash parenthetical disambiguators
Module:Lang handles parenthetical disambiguators automatically
ain | Ainu | Ainu (Japan)
ang | Old English | Old English (ca. 450-1100)
brx | Bodo | Bodo (India)
chm | Mari | Mari (Russia)
enm | Middle English | Middle English (1100-1500)
frm | Middle French | Middle French (ca. 1400-1600)
fro | Old French | Old French (842-ca. 1400)
gmh | Middle High German | Middle High German (ca. 1050-1500)
goh | Old High German | Old High German (ca. 750-1050)
grc | Ancient Greek | Ancient Greek (to 1453)
ia | Interlingua | Interlingua (International Auxiliary Language Association)
knn | Konkani | Konkani (individual language)
kok | Konkani | Konkani (macrolanguage)
mga | Middle Irish | Middle Irish (900-1200)
ms | Malay | Malay (macrolanguage)
ne | Nepali | Nepali (macrolanguage)
oc | Occitan | Occitan (post 1500)
ota | Ottoman Turkish | Ottoman Turkish (1500-1928)
peo | Old Persian | Old Persian (ca. 600-400 B.C.)
sga | Old Irish | Old Irish (to 900)
sw | Swahili | Swahili (macrolanguage)
war | Waray | Waray (Philippines)
these ~/wp languages names were (presumably) included to override synonymous ISO 639-2, -3 code/name mapping
Module:Lang automatically promotes synonymous ISO 639-2, -3, -5 codes to equivalent ISO 639-1 codes
abk | Abkhaz | promotes to ab
deu | early German | promotes to de – misuse of ISO 639-3 code
ell | Modern Greek | promotes to el
fry | West Frisian | promotes to fy
mol | Moldovan | deprecated
oci | Provençal | promotes to oc – misuse of ISO 639-3 code; added to ~/data/sandbox as IETF tag oc-provenc
rus | Russian | promotes to ru
Module:Lang handles most IETF tags automatically
sr-cyrl | Serbian Cyrillic |
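
Two of the automatic behaviours named in the table (promotion of synonymous three-letter codes, and dropping parenthetical disambiguators from labels) can be sketched as follows. This is a Python illustration under stated assumptions, not the Lua in Module:Lang: the function names `promote` and `label` are hypothetical, and the data are a handful of rows copied from the table above.

```python
import re

# Sample rows copied from the table above; the real data live in the data modules.
SYNONYMS = {"abk": "ab", "deu": "de", "ell": "el", "fry": "fy", "oci": "oc", "rus": "ru"}
IANA_NAMES = {
    "grc": "Ancient Greek (to 1453)",
    "ms": "Malay (macrolanguage)",
    "kk": "Kazakh",
}


def promote(code):
    """Map a synonymous three-letter code to its ISO 639-1 equivalent."""
    code = code.lower()
    return SYNONYMS.get(code, code)


def label(code):
    """Return the IANA name for a code with any trailing '(...)' removed."""
    name = IANA_NAMES[promote(code)]
    return re.sub(r"\s*\([^)]*\)$", "", name)


print(promote("rus"))  # ru
print(label("grc"))    # Ancient Greek
```

With both steps done in code, the corresponding ~/wp languages overrides become redundant, which is the argument made above for not copying them.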

It is my intention to update the live modules:

Module:Lang
abandons use of Module:Language/name/data
refines explicitly cited English-language category naming to use names from the override data
abandons use of variant tag descriptions when creating language labels, tool tips and category names:
{{lang|fn=category_from_tag|ja-latn-hepburn}} → Category:Articles containing Japanese-language text
{{lang/sandbox|fn=category_from_tag|ja-latn-hepburn}} → Category:Articles containing Japanese-language text
Module:Lang/data
adds data module loading
adds override data described above
removes overrides applied to override the overrides applied in ~/wp languages
Module:Lang/name to tag (to be moved to Module:Lang/tag from name to match the name of the function that uses the data)
Module:Lang/name to tag is rewritten to abandon use of Module:Language/name/data
Module:Lang/name to tag is rewritten because:
  • There are a number of ISO 639 language names that have parenthetical disambiguators. In normal use, Module:Lang discards the disambiguator when creating a language-link label (the {{lang-??}} templates) and retains the disambiguators for tool-tips and category names. The module function tag_from_name() uses the reverse-look-up data created by Module:Lang/name to tag to fetch the tag associated with a language name. When creating the reverse-look-up tables, Module:Lang/name to tag creates both disambiguated and de-disambiguated entries:
    ["ancient greek (to 1453)"] = "grc",
    ["ancient greek"] = "grc",
    
  • there are occasions when it is not possible to create both disambiguated and de-disambiguated entries because multiple language codes refer to language names using the same base name:
    ["yaka (central african republic)"] = "axk",
    ["yaka (congo)"] = "iyx",
    ["yaka (democratic republic of congo)"] = "yaf",
    
  • a de-disambiguated language name can only refer to one language tag so in the above case, would be wrong for two of the languages. There are base language tags that refer to a language name without disambiguator and other tags that refer to the same base language name with disambiguator:
    ["Marwari"] = "mwr",
    ["Marwari (India)"] = "rwr",
    ["Marwari (Pakistan)"] = "mve",
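
The three cases above can be sketched in Python. This is one possible policy, not the actual Lua of Module:Lang/name to tag: the function name `build_tag_from_name` is hypothetical, and the choice to omit an ambiguous base name entirely (rather than pick one tag) is an assumption.

```python
import re
from collections import defaultdict

# Sample tag -> name pairs taken from the examples above.
NAMES = {
    "grc": "Ancient Greek (to 1453)",
    "axk": "Yaka (Central African Republic)",
    "iyx": "Yaka (Congo)",
    "yaf": "Yaka (Democratic Republic of Congo)",
    "mwr": "Marwari",
    "rwr": "Marwari (India)",
    "mve": "Marwari (Pakistan)",
}


def build_tag_from_name(names):
    """Build a lowercase name -> tag reverse look-up table."""
    lookup = {}
    candidates = defaultdict(list)  # base name -> tags whose name was disambiguated
    for tag, name in names.items():
        key = name.lower()
        lookup[key] = tag  # the full name is always unambiguous
        base = re.sub(r"\s*\([^)]*\)$", "", key)
        if base != key:
            candidates[base].append(tag)
    for base, tags in candidates.items():
        # Add a de-disambiguated entry only when exactly one tag claims the
        # base name and no tag already owns it without a disambiguator.
        if base not in lookup and len(tags) == 1:
            lookup[base] = tags[0]
    return lookup


table = build_tag_from_name(NAMES)
print(table["ancient greek"])  # grc
print("yaka" in table)         # False
print(table["marwari"])        # mwr
```

Here 'ancient greek' resolves to grc, the bare name 'yaka' is left out because three tags share it, and 'marwari' stays with mwr because that tag's name has no disambiguator.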
    

Without objection, I shall update the live modules to make these changes.

Trappist the monk (talk) 13:10, 25 September 2020 (UTC)

deprecated ISO 639 language codes

There was a TfD about {{lang-eml}}. eml is a deprecated ISO 639-3 code. As it currently exists, {{lang-eml}} renders:

{{lang-eml|text}} → {{lang-eml|text}}

The peculiar error messaging occurs because {{lang-eml}} uses a mixture of {{ISO 639 name}} and {{Language with name}} (which calls {{lang}}). {{lang-eml}} was not migrated as most of the other {{lang-??}} templates because eml is a deprecated code.

I have tweaked Module:Lang/sandbox and Module:lang/data/sandbox to add support for deprecated ISO 639 codes:

{{lang/sandbox|fn=lang_xx_italic|code=eml|text=text}} → [text] Error: {{Lang-xx}}: unrecognized language code: eml (help)
{{lang/sandbox|eml|text}} → [text] Error: {{Lang}}: unrecognized language code: eml (help)

I think that some sort of error messaging is required per the TfD though it isn't clear to me what that messaging should look like. In the examples, the text string '(deprecated)' is made part of the language prefix for {{lang-??}} templates and part of the tool-tip for {{lang}} templates. I anticipate adding category support for these templates when deprecated ISO 639 language codes are used.

Since some sort of error messaging is apparently required for deprecated ISO 639 codes, what should that messaging look like?

Trappist the monk (talk) 13:20, 17 September 2020 (UTC)

There are three things people might want to do with lang-xxx templates that use deprecated codes:
1. Continue using them if there are contexts in which the most appropriate label is the language name associated with the former code. In such cases, the visible output of the template should be no different from other templates: for example, it shouldn't append (deprecated) after the language name, as the sandbox version currently does, because the name is not deprecated, it's the code that's deprecated and that only matters under the hood.
2. Deprecate their use and delete them.
3. Deprecate their use, but leave behind a template that outputs an error message along with suggestions for correct templates that can be used instead. That's currently done, in a somewhat crude way, by {{lang-pun}}.
Of these I guess only #1 is really relevant for the set-up of {{lang}}. The third scenario, in the rare cases where it happens, can be handled by the individual templates. – Uanfala (talk) 17:25, 17 September 2020 (UTC)
I am having second thoughts about this whole idea of deprecated ISO 639 code support in {{lang}} and {{lang-??}}. The purpose of {{lang}} and {{lang-??}} is to create proper html for browsers and screen readers. These user agents expect html that uses language codes defined in the IANA language-subtag-registry file. That file has (as of this writing):
all of ISO 639-1
all of ISO 639-2T except ISO 639-1 synonyms
none of ISO 639-2B (all synonymous with ISO 639-2T and ISO 639-1)
all of ISO 639-3 except ISO 639-1 synonyms and iso639-3:hbs
none of ISO 639-3 (dep)
all of ISO 639-5 except ISO 639-1 synonyms
Because html constrains language code use to those codes specified in the language-subtag-registry file and proper html is the purpose of {{lang}} and {{lang-??}}, it is improper to support deprecated language codes in these templates.
There is no such constraint for Module:ISO 639 name. While it doesn't presently support deprecated codes, there is no reason why it cannot. Given this, as a viable alternative, I am going to remove deprecated code support from Module:lang/sandbox and add it to Module:ISO 639 name.
Trappist the monk (talk) 22:23, 17 September 2020 (UTC)
Since they are used for html browsers, continuing to use them would not be helpful, and since deleting one of them failed at TfD, it is safe to assume that the others won't be deleted. That leaves us with option #3, which is to deprecate their use and leave some kind of error message. I agree that Module:ISO 639 name could use them, as they might have valid uses there. --Gonnym (talk) 12:28, 18 September 2020 (UTC)
I am not at all sure that I understand your first sentence. Which they? If by they you mean 'deprecated ISO 639 codes', we cannot expect browser or other user agents to recognize the deprecated codes. If by they you mean templates that would use the deprecated codes to create underlying html, such templates should not continue to be used as {{lang-??}} templates because they will not produce valid html. The templates should be recreated with a name not related to {{lang-??}}. Recreated templates can use Module:ISO 639 name but must not produce html output that includes the lang= attribute. The recreated templates replace transclusions of the invalid {{lang-??}} templates. The invalid {{lang-??}} templates go back to TfD.
Trappist the monk (talk) 13:33, 18 September 2020 (UTC)
They meaning the deprecated templates. --Gonnym (talk) 13:36, 18 September 2020 (UTC)
Somewhat related to this, are the templates in Category:Lang-x templates with other than ISO 639 supported by web browsers? If not, what is the difference between the deprecated codes and these? --Gonnym (talk) 13:38, 18 September 2020 (UTC)
Some might be converted to use Module:Lang
  • {{Lang-1bd}} – 1bd not an ISO 639 code
  • {{Lang-1ca}} – 1ca not an ISO 639 code
  • {{Lang-ast-leo}} – leo not a valid IETF extlang
  • {{Lang-az-Arab}} – can be converted to use Module:lang; or deleted because {{lang-az|text|script=Arab}}
  • {{Lang-est-sea}} – sea not a valid IETF extlang; inside uses vro (Võro)
  • {{Lang-fra-frc}} – frc not a valid IETF extlang
  • {{Lang-fra-que}} – que not a valid IETF extlang
  • {{Lang-gsw-als}} – als not a valid IETF extlang
  • {{Lang-ku-Cyrl}} – can be converted to use Module:lang; or deleted because {{lang-ku|text|script=Cyrl}}
  • {{Lang-lat-med}} – med not a valid IETF extlang
  • {{Lang-lom-ber}} – lom is Loma (Liberia) (perhaps lmo Lombard); ber not a valid IETF extlang
  • {{Lang-nds-NL}} – can be converted to use Module:lang; or deleted because {{lang-nds|text|region=NL}} ([text] Error: {{Langx}}: invalid parameter: |region= (help)) (Hmm; bug? language name should be Dutch Low Saxon)
  • {{Lang-nl-BE}} – can be converted to use Module:lang
  • {{Lang-roa-nor}} – nor not a valid IETF extlang
  • {{Lang-sco-smi}} – smi not a valid IETF extlang
  • {{Lang-sq-definite}} – definite not a valid IETF variant
  • {{Lang-uniturk}} – uniturk not an ISO 639 code
  • {{Lang-vi-hantu}} – hantu not a valid IETF variant
We can, if required, create private use IETF tags for those that can't be handled in another way.
Trappist the monk (talk) 15:06, 18 September 2020 (UTC)
The bug might be related to the changes between sandbox and live maybe? Module talk:Lang/testcases has mostly failed tests. --Gonnym (talk) 15:23, 18 September 2020 (UTC)
Fixed that. The problem with |region= is tied to the data set and how we search for tag matches. What we really need to do is simplify the data set. Above, I showed that all of the standard ISO 639 language codes that we need are already in the IANA data. We have two 'override' modules: Module:Language/data/wp languages and Module:Lang/data; we only need one; ~/wp languages is a mishmash of stuff that has unknown provenance. We can abandon Module:Language/name/data and copy what we need from ~/wp languages into ~/data.
I think I'll do that ... after I remove the support for deprecated codes.
Trappist the monk (talk) 16:32, 18 September 2020 (UTC)
That sounds good, you don't need to convince me of that. I'd like for us to eventually be able to reduce the amount of modules that have an override for ISO languages and make it much clearer. --Gonnym (talk) 16:41, 18 September 2020 (UTC)
Gonnym, #3 is not the only option: a different TfD may well result in deletion (#2), and #1 is independent of the status of the codes. What option #1 entails is that editors have decided to use – as the label in the visible template output, as well as in the name of the template – a given deprecated code. It does not entail that this same code will be passed on as a lang attribute. A lang attribute can use the nearest superordinate code available: for example, if {{lang-pmu}} is created, then the label may use the language name associated with the former ISO code, but format the string itself with {{lang|phr|...}} using the code that pmu was merged into [1]. – Uanfala (talk) 13:31, 19 September 2020 (UTC)
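
The separation described in option #1 (visible label from the former code, lang attribute from the code it was merged into) could look roughly like this in Python. It is a sketch only: `SUPERSEDED_BY` and `render` are hypothetical names, the table holds just the pmu → phr merge mentioned above, and the label string is a placeholder rather than a real language name.

```python
from html import escape

# Merge stated in the comment above: pmu was merged into phr.
SUPERSEDED_BY = {"pmu": "phr"}


def render(code, label, text):
    """Return 'label: <span lang=...>text</span>' using a valid lang attribute.

    The label is whatever name editors chose for the former code; the
    lang attribute falls back to the superseding code when one is known.
    """
    lang_attr = SUPERSEDED_BY.get(code, code)
    return '{}: <span lang="{}">{}</span>'.format(
        escape(label), escape(lang_attr), escape(text)
    )


# 'Former-code name' is a placeholder label, not a real language name.
print(render("pmu", "Former-code name", "text"))
```

The deprecated code never reaches the html output, so the markup stays valid for browsers and screen readers while the label remains an editorial choice.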
Deprecated ISO 639 code support removed from the sandbox and live module updated to reflect other changes. Module:ISO 639 name now supports deprecated ISO 639 language codes:
{{ISO 639 name|eml|link=yes}} → Emiliano-Romagnolo
so templates like {{lang-eml}} can be recreated under a non-lang associated name and their {{lang-??}} predecessors deleted.
Trappist the monk (talk) 14:52, 19 September 2020 (UTC)

deprecated ISO 639 language codes (v.2)

So I misspoke (no!, really?) Yeah, really. I wrote that none of ISO 639-3 (dep) were in the IANA language-subtag-registry file. That was wrong. IANA does include some deprecated language codes. There are 351 deprecated ISO 639-3 codes. Of those, 162 are not included in the subtag registry. There are three deprecated ISO 639-2B codes (jaw, scc, scr); none are in the subtag registry. There is one deprecated ISO 639-2T code (mol); it is not in the subtag registry. There are six deprecated ISO 639-1 codes (in, iw, ji, jw, mo, sh); all are in the subtag registry.

Because we can legitimately support a subset of the deprecated ISO 639 codes, I have tweaked the sandbox to do so. The most common deprecated codes used at en.wiki appear to be mo and mol:

{{lang/sandbox|mo|text}} → text
{{lang/sandbox|mol|text}} → [text] Error: {{Lang}}: unrecognized language code: mol (help)
{{lang/sandbox|fn=name_from_tag|mo}} → Moldovan
{{lang/sandbox|fn=category_from_tag|mo}} → Category:Articles containing Moldovan-language text
{{lang/sandbox|fn=tag_from_name|Moldavian}} → Error: language: Moldavian not found

This does not fix the {{lang-eml}} problem because code eml is not in the subtag registry.

{{lang}} and {{lang-??}} templates using deprecated codes will emit a hidden maintenance message, code: $1 is deprecated, and will categorize into Category:Lang and lang-xx using deprecated ISO 639-1 codes.
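
The behaviour described here (accept registry-known deprecated codes, flag them with a maintenance message, reject everything else) can be sketched in Python. This is not the module's Lua: the function name `lookup` and the table names are hypothetical, and the tables hold only sample entries taken from this page.

```python
# Sample entries only; the real data come from the IANA subtag registry.
ACTIVE = {"ro": "Romanian"}
DEPRECATED = {"mo": "Moldovan"}  # deprecated, but present in the registry


def lookup(code):
    """Return (name, maintenance_message) for a language code.

    maintenance_message is None for active codes. Codes absent from both
    tables (for example mol or eml) are rejected.
    """
    if code in ACTIVE:
        return ACTIVE[code], None
    if code in DEPRECATED:
        return DEPRECATED[code], "code: {} is deprecated".format(code)
    raise ValueError("unrecognized language code: {}".format(code))


print(lookup("mo"))  # ('Moldovan', 'code: mo is deprecated')
```

Keeping deprecated codes in a separate table makes it cheap to attach the hidden message and the tracking category only where they apply.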

Trappist the monk (talk) 19:53, 1 October 2020 (UTC)

Looks good. Can you give me the source of the ones lang uses so I can create the /testcases? --Gonnym (talk) 20:15, 1 October 2020 (UTC)
Module:Language/data/iana languages/sandbox now has two tables: active{} and deprecated{}
Trappist the monk (talk) 20:46, 1 October 2020 (UTC)
Live modules updated to accept the IANA-known deprecated ISO 639 language subtags.
Trappist the monk (talk) 14:15, 3 October 2020 (UTC)
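The two-table lookup described in this thread can be sketched roughly as follows. This is an illustration in Python, not the module's Lua code: the `ACTIVE` and `DEPRECATED` dictionaries are hypothetical stand-ins for the active{} and deprecated{} tables and hold only sample entries, and the function name simply borrows the `name_from_tag` label used in the examples above.

```python
# Hypothetical stand-ins for the active{} and deprecated{} tables; the
# entries are illustrative samples, not the module's real data.
ACTIVE = {"fr": "French", "ro": "Romanian"}
DEPRECATED = {"mo": "Moldovan"}


def name_from_tag(code):
    """Return (language name, maintenance message or None) for a code.

    Raises KeyError when the code is in neither table.
    """
    code = code.lower()
    if code in ACTIVE:
        return ACTIVE[code], None
    if code in DEPRECATED:
        # Deprecated codes still resolve but carry a maintenance note.
        return DEPRECATED[code], f"code: {code} is deprecated"
    raise KeyError(f"unrecognized language code: {code}")
```

Under these assumptions `mo` resolves with a maintenance message, while `mol` and `eml`, which are absent from both sample tables, raise an error, matching the behaviour reported above.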

Template:Lang-ikt

The {{Lang-ikt}} currently redirects to Inuvialuktun by way of Inuvialuk language. However, the reader is presented with Inuvialuk (as in {{Lang-ikt|[[Angakkuq]]}}) which is the singular for Inuvialuit, the people who speak Inuvialuktun. The template needs changing to show Inuvialuktun to the reader. By the way Inuvialuk language is a phrase made up for Wikipedia. Trappist the monk pinging you as you seem to be active here. CambridgeBayWeather, Uqaqtuq (talk), Huliva 20:35, 30 September 2020 (UTC)

History:
At ISO 639-3 custodian, the language name associated with ikt was changed from Inuktitut to Inuinnaqtun February 2012; see https://iso639-3.sil.org/request/2011-168
ikt with the name Inuvialuk was added to Module:Language/data/wp languages with this edit
ikt with the name Inuvialuk was copied from Module:Language/data/wp languages to Module:Lang/data today with this edit
The source of the code / name association is not known.
The infobox at Inuvialuktun lists:
ISO 639-1: iu – ISO 639-1 name: Inuktitut
ISO 639-2: iku – ISO 639-2 name: Inuktitut
ISO 639-3: ikt – ISO 639-3 name: Inuinnaqtun
Not listed is:
ISO 639-3: iku – ISO 639-3 name: Inuktitut
So it would seem that the ikt reference at the Inuvialuktun article may be incorrect. Regardless, Module:Lang should not associate Inuvialuk with ikt.
We have an article: Inuinnaqtun. For me, the correct target for {{lang-ikt}} should be that article, not to Inuvialuktun.
Trappist the monk (talk) 22:41, 30 September 2020 (UTC)
Trappist the monk I had forgotten about that. See Talk:Inuvialuktun#ISO codes. I agree with you. CambridgeBayWeather, Uqaqtuq (talk), Huliva 01:30, 1 October 2020 (UTC)
I'm not going to insert myself into that argument. Linguists and editors who are deeply immersed in languages apparently can discern subtleties between 'integer' codes that are lost on me. The ISO 639 codes have assigned names so Module:Lang should report those names or the en.wiki-preferred synonymous names. Certainly Module:Lang should not point to someplace else.
I have disabled the override so that {{lang-ikt}} uses Inuinnaqtun from the IANA registry data when creating the link-label:
{{Lang-ikt|[[Angakkuq]]}} → {{Lang-ikt|[[Angakkuq]]}}
Trappist the monk (talk) 10:35, 1 October 2020 (UTC)

I don't know if this helps, but: The ISO 639-3 standard, which can be downloaded from https://iso639-3.sil.org/code_tables/download_tables, consists of four files. According to those, iku (=iu in 639-1) is the macrolanguage "Inuktitut" that comprises the individual living languages ike ("Eastern Canadian Inuktitut") and ikt ("Inuinnaqtun"), and file iso-639-3_Name_Index_202000515.tab says that ikt has multiple names and is also known as "Western Canadian Inuktitut". Love —LiliCharlie (talk) 09:41, 1 October 2020 (UTC)

I'll point out that {{ISO 639 name|ikt|link=yes}} links to Inuinnaqtun, so this should probably be consistent. --Gonnym (talk) 10:02, 1 October 2020 (UTC)
IANA get their data from the ISO 639 custodians. Module:Lang gets its data from the IANA subtag-registry file. Here is the IANA subtag-registry file record for ikt

Type: language
Subtag: ikt
Description: Inuinnaqtun
Description: Western Canadian Inuktitut
Added: 2009-07-29
Macrolanguage: iu

and for iu (IANA promotes 3-character codes to their synonymous 2-character, ISO 639-1, equivalents so there is no iku in the subtag-registry file):

Type: language
Subtag: iu
Description: Inuktitut
Added: 2005-10-16
Scope: macrolanguage

Trappist the monk (talk) 10:35, 1 October 2020 (UTC)
Trappist the monk.. Thanks. CambridgeBayWeather, Uqaqtuq (talk), Huliva 18:44, 1 October 2020 (UTC)
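For readers unfamiliar with the registry format, here is a minimal Python sketch of reading one record such as the ikt entry quoted above. It assumes one `Field: value` pair per line and ignores the folded continuation lines that the real registry file can contain; `parse_record` is a hypothetical helper, and treating the first Description as the primary name is an assumption of this sketch rather than something stated in the thread.

```python
def parse_record(text):
    """Parse one subtag-registry record into a dict of field -> list of values."""
    record = {}
    for line in text.strip().splitlines():
        field, _, value = line.partition(":")
        # A field such as Description may repeat, so collect values in a list.
        record.setdefault(field.strip(), []).append(value.strip())
    return record


IKT = """\
Type: language
Subtag: ikt
Description: Inuinnaqtun
Description: Western Canadian Inuktitut
Added: 2009-07-29
Macrolanguage: iu
"""

record = parse_record(IKT)
primary_name = record["Description"][0]  # assumption: first Description wins
```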
Note that the IANA database for BCP 47 subtags prefers registering 2-letter ISO 639-1 codes to the 3-letter ISO 639 in many cases, but the ISO 639-1 standard is broken in many cases, not stable. Still the ISO 639-3 standard fixed that for ISO 639-1 (also for ISO 639-2) for some codes, by formalizing the inclusive/exclusive codes and categorizing macrolanguages and remapping some ISO 639-1 and ISO 639-2 as collection/family codes (in ISO 639-5). BCP 47 is not limited to just the IANA database, you have to read the RFCs (and their updates), where it is stated that even for languages that are mapped with 2-letter codes, their 3-letter codes and possible ISO 639-2/B codes (for bibliographic purpose) can be handled in BCP 47 as aliases mapped to the recommended shorter code (or technical codes for mapping bibliographic codes): the IANA database then does not contain these aliases (you should know that ISO 639-1 is permanently frozen, as it cannot be fixed). If you look at the CLDR project, all these aliases are implemented, so it is not invalid to use ISO 639-3 codes for all languages or macrolanguages or even language families (however the legacy short 2-letter codes have to use their historic exclusive semantic and not the inclusive semantic in ISO 639-3, which are still VERY useful for language fallbacks in BCP 47). And so for example, you can then use "fre" (ISO 639-2/B) or "fra" (ISO 639-2/T) instead of "fr" (legacy ISO 639-1, still used in BCP 47 and kept for compatibility because of its stability requirements).
Unfortunately Mediawiki is still not conforming to BCP47 and does not implement fallbacks and code aliases as it should (its fallback lists are defective, you cannot just use them and need to add a module to properly extend it; the module:Fallback in English wikipedia, also imported in Meta-Wiki and Mediawiki-Wiki, is still wrong, it is only correct in Commons where it uses the Mediawiki fallbacks and extends them for full BCP 47 compliance: this was done in Commons a long time ago, reported to Mediawiki developers that have still not fixed it; we know that Mediawiki and Wikimedia in general are very slow to adopt the BCP47 rules even though they were fully published and standardized long ago; only CLDR complies with BCP47, but Mediawiki still does not use CLDR for returning compliant fallbacks: it is still wrong for Chinese, wrong with many legacy incorrect codes; and I'm not speaking at all about the interwiki codes which are used for another purpose, and certainly not about domain names for projects or internal wiki database names, which do not need to comply with BCP47, but only speaking about the use of language tags inside HTML with lang="", or inside CSS with "lang()" selectors, or inside Wikidata for language tagging of translated labels and properties with multiple monolingual text values: all these should be fully conforming to BCP47 but there's a lot of garbage there assuming that these tags would be valid in BCP47). However nothing is invalid with codes that are aliased between parts of the ISO639 standard. verdy_p (talk) 00:32, 6 October 2020 (UTC)
I am at a loss to understand how this relates to {{lang-ikt}} specifically or to Module:Lang more generally. If you have issues with MediaWiki, Meta, or any other entity, it would be best if you took up those issues with those entities at their appropriate talk pages.
Trappist the monk (talk) 21:44, 6 October 2020 (UTC)
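The alias idea raised in this exchange amounts to a lookup that folds 3-letter codes onto their 2-letter equivalents before consulting the registry data. The Python sketch below is only an illustration: the `ALIASES` table is a hypothetical hand-written sample, and nothing in this thread indicates that Module:Lang performs such a mapping.

```python
# Hypothetical sample of 3-letter codes and the 2-letter subtags they alias.
ALIASES = {
    "fra": "fr",  # ISO 639-2/T
    "fre": "fr",  # ISO 639-2/B
    "iku": "iu",
    "may": "ms",  # ISO 639-2/B
    "msa": "ms",  # ISO 639-2/T
}


def canonical_code(code):
    """Return the preferred subtag for a code, or the code itself if unaliased."""
    code = code.lower()
    return ALIASES.get(code, code)
```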

question

What is the template or module that does:

{{some template | lang=fr}} → French

?. -DePiep (talk) 19:11, 28 October 2020 (UTC)

{{in lang|fr}} → (in French)
or:
{{ISO 639 name|fr}} → French
or:
{{lang|fn=name_from_tag|fr}} → French
Trappist the monk (talk) 19:51, 28 October 2020 (UTC)
Thankss! (as in: plural). -DePiep (talk) 20:04, 28 October 2020 (UTC)

A few "collective" categories not populating correctly?

I don't know what the discussion is really about, but there is an implication in this VPT discussion that this template/module somehow needs a wee adjustment as a result of one or more CFDs. – Jonesey95 (talk) 04:06, 3 November 2020 (UTC)

(moments later...) I think I got it sorted at the old category level and at the article level, but the newish "in XXXic languages" categories are in need of some improvement. They all fail to show a description, and show an obscure error message. – Jonesey95 (talk) 04:23, 3 November 2020 (UTC)
The language categories should be fixed. --Gonnym (talk) 08:07, 3 November 2020 (UTC)

Something is broken

{{Lang-uk}} in Holodomor is broken. Sławobóg (talk) 14:33, 9 November 2020 (UTC)

Urdu-Nastaliq proposed changes

Request to include {{Nastaliq}} in {{Lang-ur}} and {{Lang|ur}} by default so that users don’t have to wrap Urdu text in the Nastaliq template, as in {{lang-ur|{{Nastaliq|text}}}}; just {{lang-ur|text}} or {{lang|ur|text}} should do the job. It was previously proposed in November 2013 and the reasons for denying the request seem to have been exhausted. Urdu is normally always written in Nastaliq, and users on Wikipedia have to manually add {{nq}} inside {{Lang-ur}}, {{Lang|ur}} etc. every time. Idell (talk) 13:55, 24 August 2020 (UTC)

 Not done for now: please establish a consensus for this alteration before using the {{edit template-protected}} template. Izno (talk) 14:06, 24 August 2020 (UTC)
Reopened Idell (talk) 12:05, 31 August 2020 (UTC)
Reopened: The changes to both the templates {{Lang|ur|...}} and {{Lang-ur|...}} should be applied, as an example, to reflect this markup per example #4 in the first comment below:
{{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}} ← {{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}}
Idell (talk) 11:32, 3 November 2020 (UTC)

Discussion

Editors are requested to say whether and why they support or oppose the changes proposed above:

  1. آزاد جموں و کشمیر ← plain text
  2. {{lang-ur|آزاد جموں و کشمیر}} ← {{lang-ur|آزاد جموں و کشمیر}}
  3. {{lang-ur/sandbox|آزاد جموں و کشمیر}} ← {{lang-ur/sandbox|آزاد جموں و کشمیر}} → {{code|{{lang-ur/sandbox|آزاد جموں و کشمیر}}}}
  4. {{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}} ← {{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}} → {{code|{{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}}}}

To me, 3 and 4 above look the same but that's with this browser and a complete ignorance of how Nastaliq is supposed to look.

Trappist the monk (talk) 00:42, 25 August 2020 (UTC)

I am not sure if this is what the {{lang}} template is there for. My understanding is that it should not override browser settings for the language in question, but only add language markup using valid IETF language codes. Also note that our language templates do not only serve representational purposes. For example, our bots also use them to detect Latin script text that is exempted from typo correction, as {{lang}} can also be applied to a user-friendly romanized version of a language that is usually/natively written in a different script. That is, {{lang|ur|...}} does not necessarily imply {{lang|ur-Arab|...}} (Arabic script), but it can also be used for {{lang|ur-Latn|...}} (Latin script). Love —LiliCharlie (talk) 01:22, 25 August 2020 (UTC)
P.S.: The clean way to tell browsers to select a Nastaliq font is to use the ISO 15924 script code Aran for "Arabic (Nastaliq variant)" rather than to pass them a list of fonts. That is to say, if we choose to prescribe a script type at all we should use {{lang|ur-Aran|...}}. One drawback of a font list is that browsers will always select the first one listed that is available, even if that is a bad choice. For example, my system uses Noto fonts by default, but the above CSS code prevents my browser from selecting Noto Nastaliq Urdu because another of my Nastaliq fonts happens to be higher on the list. This results in a strange font mix that isn't uniform when it could and should be. Another drawback is that such a list can never be exhaustive, so not all users who have a Nastaliq font installed on their system will see what we expect them to see. Besides new Nastaliq fonts keep being released, and I don't think it is possible to keep track of them. Love —LiliCharlie (talk) 05:15, 25 August 2020 (UTC)
All four of these versions "look" the same on Safari mobile website, except for their sizes that seem to be increasing successively. On the desktop version of the website, all of them look the same to me. So in my case, it doesn’t override browser settings for the language. I suppose that is why the font size is set to 110% in the fourth example to make up for something in {{Nastaliq}} that makes the text smaller. However, like LiliCharlie, it does make me question the purpose of this template. Idell (talk) 04:35, 25 August 2020 (UTC)
  • Support I have little knowledge of the technical limitations regarding this, but it would be a very useful feature to add if possible. While Nastaliq is not necessarily compulsory, it is very commonly used on Wikipedia across most articles with Urdu text, as the "real" Urdu is always rendered in Nastaliq font right to left, which the Template:Lang-ur does not currently support on its own (it displays just the Perso-Arabic characters/text). The Nastaliq style is the conventional writing style used in Urdu newspapers, media, books, poetry, literature, calligraphy, and everyday language, so it would be good if Wikipedia could incorporate it somehow for the benefit of its Urdu readers. Mar4d (talk) 02:17, 25 August 2020 (UTC)
  • Support Urdu is mostly written in Nastaliq, however because of less recognition outside the same Naskh script used for Arabic mostly was used to render Urdu as well. As of now Nastaliq is being used in digital space and it would be better to use Nastaliq instead of Naskh for Urdu all over the Wikipedia. USaamo (t@lk) 19:10, 5 September 2020 (UTC)
  • There seems to be consensus for this request. @Trappist the monk: could you action it because you seem familiar with the template? Thanks — Martin (MSGJ · talk) 12:52, 11 September 2020 (UTC)
    Umm, if you reread this discussion you will see that the test change that I wrote was rejected by one editor and the proposer declared it ineffectual. So I reverted those changes in the sandbox. The next day, proposer reopened this request.
    Above it is suggested that the inclusion of the ISO 15924 script code Aran (|script=Aran) is the correct way to markup Urdu text so that browsers select a Nastaliq font. To my untrained eye on my browser, doing that changes the rendering but since I don't read Urdu, I cannot say if the browser is doing the correct thing or not:
    1. Urdu: آزاد جموں و کشمیر ← plain text [[Urdu language|Urdu]]: <span style="font-size:110%">آزاد جموں و کشمیر</span>
    2. {{lang-ur|آزاد جموں و کشمیر|script=Aran}} ← {{lang-ur|آزاد جموں و کشمیر|script=Aran}}
    3. {{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}} ← {{lang-ur|{{Nastaliq|آزاد جموں و کشمیر}}}}
    I'm sure that you are correct that there is a general agreement that Urdu is supposed to be rendered with a Nastaliq font. How to make Module:Lang do that is the question. If my earlier test wasn't correct and using |script=Aran isn't correct then if there is a way, I don't know what it is.
    Trappist the monk (talk) 13:34, 11 September 2020 (UTC)
    Okay, thanks for confirming the situation. I admit the above is quite confusing for those unfamiliar. I have disabled the request, as more discussion is needed. — Martin (MSGJ · talk) 09:32, 14 September 2020 (UTC)
    Trappist the monk, out of these three, number 3 works the best. I've looked at them using
    i. Google Chrome browser on an Android device
    ii. Samsung internet browser on another Android device
    iii. Safari browser on an iOS mobile device
    iv. Google Chrome browser on a laptop running Windows 10
    In cases i and ii, all of the versions look similar but none of them look exactly as Nastaliq text should. So, I'm going to rule that out as a general issue, as that's just how Urdu text seems to be rendered there. In case iii, all of the markups result in proper Nastaliq text with very minute differences. Case iv is the deciding factor for me, where markup 1 and 2 look like monospaced fonts or Naskh (script) that is used for Arabic etc., whereas markup 3 is the only one which looks like proper Nastaliq. Idell (talk) 11:16, 22 September 2020 (UTC)
    @Trappist the monk: can number 3 be implemented in the module somehow? @LiliCharlie: do you have any comments on the technical implementation? Just trying to push this towards a resolution because the request has been open a long time — Martin (MSGJ · talk) 08:45, 18 November 2020 (UTC)
    Number three is essentially what I initially suggested – including the font styling from {{Script/Nastaliq}} in Module:Lang/sandbox/styles.css. That was rejected or determined to be ineffectual.
    Trappist the monk (talk) 11:33, 18 November 2020 (UTC)
    I give up. Closing the request (again) ... — Martin (MSGJ · talk) 12:22, 19 November 2020 (UTC)
  • Support per what USaamo said above. Nastaliq is something we find convenient as well. ─ The Aafī (talk) 12:28, 3 October 2020 (UTC)
  • I wonder if it wouldn't be better to submit a task upstream if in all cases the particular font family shouldn't be preferred. --Izno (talk) 13:13, 3 October 2020 (UTC)

Wikt-lang, wt, and language tags

{{Wikt-lang}} and {{wt}} recently went through TFD with the intention to merge the latter into the former. The overarching problem, though, is that {{wikt-lang}} puts a lang= tag around the entry, while {{wt}} doesn't. The issue comes up when either template is placed in a {{lang}} template, as the HTML attributes get duplicated if {{wikt-lang}} is used. See User:Bradv/sandbox/wt for a breakdown of the issues.

Bradv and I have been discussing this, and we think the only way to fix the issue is to have Module:Lang not add the lang= tag if the value being passed to it already has one. Neither of us are sure how to do that (or even if we can do it), hence this post. Any and all help would be appreciated. Primefac (talk) 21:09, 30 November 2020 (UTC)

I'm not sure if this ever happens, but using this solution, how would the template handle cases where text tagged as one language includes text tagged as another language (potentially nested multiple times)? Suppose French text contains Arabic text that itself contains French text, each level tagged using {{lang}}: {{lang|fr|...{{lang|ar|...{{lang|fr|...}}...}}...}}. All three levels should apply language tagging. For {{lang|ar}}, the module could check for specifically lang="ar" in the text (rather than just any lang attribute) and apply tagging, but the outer {{lang|fr}} would see lang="fr" and fail to apply tagging. The module could do more complex analysis of the start and ends of tags containing lang attributes, but that's inefficient and best avoided. (This could be done in a subst phase, but that's tedious and wouldn't work in cases where {{lang}} is applied by another template.) Perhaps this could be fixed by using a parameter that tells the module to skip the check and apply language tagging anyway. — Eru·tuon 21:56, 30 November 2020 (UTC)
But actually the solution is technically not correct, though I'm not sure how often it matters. It only works when the outer and inner text is the same. But say you want to link a single word in a language-tagged text: {{lang|la|Hic est {{wikt-lang|la|verbum}}.}} {{lang}} should language-tag; {{wikt-lang}} should not. This is an uncommon case, but in general it's the inner template that should turn off its language tagging. When the text in the outer template and inner template is the same ({{lang|la|{{wikt-lang|la|verbum}}}}), then it happens that turning off tagging in the outer template has the same effect as turning it off in the inner template. Granted this type of case is probably more common than cases where the outer and inner text is different, as in my silly example above. — Eru·tuon 22:32, 30 November 2020 (UTC)
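The proposal and the objection to it can be made concrete with a small Python sketch. It assumes the simplest possible check, a search for a matching `lang="…"` attribute anywhere in the text, plus a hypothetical `force` flag standing in for the "skip the check" parameter suggested above; `tag_lang` is an illustrative name, not a function in Module:Lang.

```python
import re


def tag_lang(code, text, force=False):
    """Wrap text in a lang-tagged span unless that lang attribute is already present."""
    already_tagged = re.search(r'lang="%s"' % re.escape(code), text) is not None
    if already_tagged and not force:
        # Naive rule: assume the inner template has done the tagging.
        return text
    return f'<span lang="{code}">{text}</span>'


inner = tag_lang("la", "verbum")        # stands in for the wikt-lang output
same_text = tag_lang("la", inner)       # no duplicate wrapper is added
sentence = tag_lang("la", "Hic est " + inner + ".")  # "Hic est" is left untagged
```

When the outer and inner text are identical the naive rule avoids the duplicate attribute, but in the sentence case the words outside the inner span lose their language tag, which is the failure described above. Passing `force=True` restores the outer wrapper at the cost of nesting.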
Your later example is a reasonable one, and if that's a use-case we should provide for (where the whole text is in another language, but only one word should link to wiktionary), then the solution is simple: Overturn the TfD, and go through hundreds of articles to make sure {{wt}} is only ever used inside {{lang}}, and {{wikt-lang}} is only ever used outside of it. That appears to have been the point of the two different templates, although they have since been used interchangeably. – bradv🍁 23:42, 30 November 2020 (UTC)
Technically, the close doesn't have to be overturned, just updated; if the issues cannot be resolved, then the "after" clause never comes into effect and the templates stay separate. I suppose I should have said "if" rather than "when" in the original close, but such is life (I guess I figured the problem was solvable). Primefac (talk) 23:44, 30 November 2020 (UTC)
Alternately, if we aren't concerned about the duplicate nested tags, we could just redirect {{wt}} to {{wikt-lang}}. I'm not sure how important the problem of nested tags is, but I don't believe it affects the functionality in the slightest. – bradv🍁 23:47, 30 November 2020 (UTC)
I'm sure I could wrangle some ACCESS/MOS folks to comment... but yeah, on a "list of problems" this seems pretty far down the priority list (and if someone does complain, then we can fix it). Primefac (talk) 23:58, 30 November 2020 (UTC)
I don't see {{lang|la|Hic est {{wikt-lang|la|verbum}}.}} as a valid use-case. It should be {{lang|la|Hic est}} {{wikt-lang|la|verbum}}., which then makes it a non-issue. — Preceding unsigned comment added by Gonnym (talk • contribs) 06:53, December 1, 2020 (UTC)
Why should one run of Latin text not be enclosed in just one HTML tag with lang attribute? The equivalent of your code with italics syntax instead of language tagging and a plain link instead of a Wiktionary link would be ''before link ''<nowiki/>''[[link]]''<nowiki/>'' after link''. That seems super weird to me and not how HTML typically works. Why split a run of text that's logically one unit into multiple sister HTML tags that don't have a unique parent not shared by the text around them? (I wonder how adjoining tags with the same lang attribute would affect a screen reader. But maybe not at all if it only cares what language the words are in.) — Eru·tuon 20:27, 2 December 2020 (UTC)
HTML has no issue with content with a lang having a sub-element with its own lang, much less if that inner lang is the same as the outer.
The larger question is the italics one that {{wt}} presents, but that looks like a non-issue if all it does is match its parent content. --Izno (talk) 00:28, 1 December 2020 (UTC)
Izno, redirecting {{wt}} to {{wikt-lang}} will result in Latin words getting italicized, but isn't that what the MOS recommends anyway? – bradv🍁 01:37, 1 December 2020 (UTC)
The ones absorbed into common English use, no (technically; for example, "et al" is not italicized), though if it's already in a block of other-language text I don't think anyone should/will go out of their way to unitalicize the text for some rule like that. But maybe that is just the particular language you picked as an example. ;) No, I don't see issue now with simply redirecting. --Izno (talk) 03:50, 1 December 2020 (UTC)
If an italics tag inside of an italics tag is ever supposed to have different CSS properties from the top-level tag, for instance if it is meant to un-italicize, the nesting of tags would have to be fixed. But I haven't heard of such a CSS rule on Wikipedia. (I dislike the inelegance of nested tags, but that's not compelling reasoning.) — Eru·tuon 04:00, 1 December 2020 (UTC)
It doesn't exist today, but I've been thinking it maybe should (to fix an issue I've observed with the citation modules). Still, lang related styling can get around that by presence of the lang tag probably... --Izno (talk) 05:36, 1 December 2020 (UTC)
I don't know what you mean exactly (italicizing only if there's not a lang attribute in the nested italics tag?), but the trouble is that some nested italics tags should not be italicized while others should. Say you've got an italicized quotation from a German grammar of a Latin-script language, say Latin. And it has Latin words that are italicized in the original (or, if the German is in Fraktur, presented in roman type, equivalent to italicization) so they should be un-italicized in the quotation. And some of the German words, not italicized in the original, are linked to Wiktionary (maybe they're unusual or something), and they should not be unitalicized. Now both the Latin words and the German words use the italics tag, and I don't see how CSS can figure out (so to speak) that it needs to un-italicize one and not the other. CSS doesn't have a "different value for lang attribute than closest parent lang attribute", but even that wouldn't work in all cases because a German word could also be italicized in a German text such that it should be un-italicized in an italicized quotation. (Say a German text on historical phonology is comparing the stop consonants in High German and Low German words.) So in theory a nested language tag, in any Latin-script language, could go either way as to un-italicization. — Eru·tuon 08:17, 1 December 2020 (UTC)
Right now, the citation modules italicize like so: ''WorkTitle'' producing <i>WorkTitle</i>. When the work title contains something typically italicized (such as the name of a species or a ship), as an editor you are expected to do something like |title=Part A ''Ship Name'' Part B. However, this formulation produces HTML like <i>Part A </i>Ship Name<i> Part B</i>, when the appropriate HTML would be <i>Part A <i>Ship Name</i> Part B</i> where the inner <i> would be styled like normal text. An inverse of the comment you made above regarding splitting the text for a Wikisource link, as it were. --Izno (talk) 21:30, 2 December 2020 (UTC)
Regarding the initial comment. I've been working the past few weeks on streamlining Module:Language/sandbox and Module:Lang/sandbox so many of the functions Module:Language duplicated are now instead used via Module:Lang, including handling of the language tags. It should be almost complete (I think one of the last issues is to add the private-use language tags to the lang/data module). --Gonnym (talk) 12:53, 1 December 2020 (UTC)
Glad to hear it. Primefac (talk) 14:29, 1 December 2020 (UTC)
Could we add the |italic=no option from Module:Lang into Module:Language/sandbox? Then code like this would be possible: {{wikt-lang|fr|bonjour|italic=no}}. – bradv🍁 14:42, 1 December 2020 (UTC)
The module already has an option to handle italics (wasn't added by me), at lines 152-153:
local italics = args.italics or args.i
italics = not (italics == "n" or italics == "-")
Which is documented at Template:Wikt-lang. So setting |i=n, |i=-, |italics=n, or |italics=- will disable italics. But it's probably better to use the same parameter syntax as Module:Lang so it will be easier for editors. --Gonnym (talk) 15:59, 1 December 2020 (UTC)
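As a rough illustration of that last suggestion, the Python sketch below mirrors the two quoted Lua lines and adds a hypothetical extra check for a Module:Lang-style |italic=no. The `italics_enabled` name and the treatment of |italic= are assumptions made for this example, not existing module behaviour.

```python
def italics_enabled(args):
    """Decide whether text should be italicized from a dict of template arguments."""
    # Mirror of `args.italics or args.i`: fall back to |i= only when
    # |italics= is absent (Lua treats an empty string as true).
    italics = args["italics"] if args.get("italics") is not None else args.get("i")
    if italics in ("n", "-"):
        return False
    # Hypothetical addition: also accept the Module:Lang spelling |italic=no.
    if args.get("italic") == "no":
        return False
    return True
```

With this sketch `{"i": "n"}`, `{"italics": "-"}` and `{"italic": "no"}` all disable italics, and an empty argument table leaves them on.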

Lang templates in opening paragraph when multiple languages need transliteration

How do I properly use Lang in an article when I want to transliterate two languages? See my hacks in Water roux to get the Chinese transliteration to appear. KarenJoyce (talk) 16:28, 8 December 2020 (UTC)

{{lang-zh}}, alas, is not one of the templates that is supported by Module:Lang and that template does not support |translit=. To get round that, you can write this:
{{lang|fn=lang_xx_inherit|code=zh|text=湯種|translit=tāngzhǒng}}
Chinese: 湯種, romanized: tāngzhǒng
Trappist the monk (talk) 16:38, 8 December 2020 (UTC)

Mongolian Unicode on iOS 14.2 and Safari

Had an issue at Inner Mongolia University of Finance and Economics with the Mongolian text appearing as a single unbroken line of characters. Posted to talk page of Mongolian Unicode template, and mentioning here in case it is useful to someone. I am told it looks fine on other platforms, so this doesn’t require troubleshooting; I figure I probably need to download some browser extensions or something. Elinruby (talk) 15:45, 13 December 2020 (UTC)

@Elinruby: Maybe you only need to install another font, one that is capable of vertical display. See the note on Noto Sans Mongolian version 1 (later versions are okay) at Help:Multilingual support#Mongolian, but that's not the only Mongolian font that causes trouble. There are also plenty of download links on that page. I also recommend you save the currently installed font(s) to disk and uninstall them, so your browser won't use them any longer. Love —LiliCharlie (talk) 17:32, 13 December 2020 (UTC)

ms-Latn should be supported

The Malay language has at least three writing systems (Latin and Arabic in wide use, and Thai script in Thailand). This should also affect the three-letter codes may-Latn (which is synonymous), and msa-Latn (a superset). This would not affect the subset in AKA ind (Indonesian language, written only in the Latin alphabet).  — SMcCandlish ¢ 😼  05:39, 14 December 2020 (UTC)

IANA disagree. Because the purpose of these templates is to render correct html, they disallow certain language code / script pairs as the subtag registry directs. Here is the entry for subtag ms:
Type: language
Subtag: ms
Description: Malay (macrolanguage)
Added: 2005-10-16
Suppress-Script: Latn
Scope: macrolanguage
Subtag may is an ISO 639-2B language tag that is not supported by IANA (where there is a synonymous ISO 639-1 language tag, IANA uses that so these templates use the shorter form). Similarly subtag msa is an ISO 639-2T language tag that has an ISO 639-1 synonym so is not supported by IANA and these templates.
If you disagree with IANA's determination, the place to discuss that is at IANA.
Trappist the monk (talk) 12:02, 14 December 2020 (UTC)
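The Suppress-Script rule cited in this reply can be sketched in Python as a simple lookup. The `SUPPRESS_SCRIPT` table here is a one-entry sample taken from the ms record quoted above, and `validate_lang_script` is a hypothetical name; the real templates may report the problem differently.

```python
# Sample data from the registry record for ms; a real table would be
# built from every record that carries a Suppress-Script field.
SUPPRESS_SCRIPT = {"ms": "Latn"}


def validate_lang_script(lang, script):
    """Return 'lang-Script', rejecting a script the registry says to suppress."""
    lang = lang.lower()
    script = script.title()  # script subtags are conventionally title case
    if SUPPRESS_SCRIPT.get(lang) == script:
        raise ValueError(f"script {script} is suppressed for language {lang}")
    return f"{lang}-{script}"
```

Under these assumptions `ms-Latn` is rejected while `ms-Arab` is returned as a usable tag.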

Western Abnaki

{{Lang-abe}} (Western Abnaki) currently does not put things in italics (for example, {{Lang-abe|[[Quebec City|Kephek]]}}). Esszet (talk) 01:36, 26 December 2020 (UTC)

lang code from script code?

In overview. In a template, I have (I know) the ISO 15924 script code, like "Arab", "Cyrl". (Nice overview here; there are ~200 script codes). A script can be used in many languages. I want to use that known script code to add {{lang}}, as a wrapper, so as to help all browsers.

Question now: is there a template or module that can translate "When the script is Xxxx, most common lang is yy; so use {{lang|yy}}".

Same question in detail: {{Unichar}}, now in redevelopment/reconstruction, uses Module:Unicode data. It handles non-Latin characters by essence. So to add the enveloping {{lang}}, we need lang. Then, instead of requiring a manual input edit |lang=ar (to auto add {{lang|ar}}), WP could deduce a lang (e.g., the most helpful, most commonly used lang for that script) from the script I think. -DePiep (talk) 00:54, 15 January 2021 (UTC)

  • It might be nice to have a preview notice and/or tracking category for uses of {{lang}} where the script supplied is the presumed script for that language. Which is what DePiep is basically asking for. --Izno (talk) 19:04, 15 January 2021 (UTC)
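No such template or module is identified in this thread. One way the idea could work is a lookup table from script code to a default language, as in the Python sketch below. The `DEFAULT_LANG_FOR_SCRIPT` entries are editorial guesses made up for this example, and the fallback to `und-Xxxx` (undetermined language plus script) is an assumption of the sketch.

```python
# Hypothetical defaults; which language counts as "most common" for a
# script is an editorial choice, not something taken from a standard.
DEFAULT_LANG_FOR_SCRIPT = {"Arab": "ar", "Cyrl": "ru", "Grek": "el", "Hebr": "he"}


def lang_tag_for_script(script, lang=None):
    """Pick a language tag for text whose ISO 15924 script code is known."""
    script = script.title()
    if lang:
        # An explicit |lang= value always wins over the guess.
        return lang
    return DEFAULT_LANG_FOR_SCRIPT.get(script, f"und-{script}")
```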

Change request

I made some changes to the make_text_html() function to make use of mw.html. I believe it makes everything easier to read and harder to break by mis-concatenating a string. The new version is here and only breaks some tests due to the order of the attributes. josecurioso ❯❯❯ Tell me! 00:49, 21 January 2021 (UTC)
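The general point about building tags from structured attributes rather than concatenating strings can be illustrated outside Lua. The Python sketch below is not the mw.html API and not the proposed change; `make_span` is a hypothetical helper that shows why attribute order becomes a property of the builder instead of the string template.

```python
from html import escape


def make_span(text, attrs):
    """Build a <span> from an ordered mapping of attribute names to values."""
    # Empty attributes are skipped; values are escaped so a stray quote
    # cannot break the tag. The text is passed through unchanged because
    # it may already contain markup.
    parts = [
        f'{name}="{escape(value, quote=True)}"'
        for name, value in attrs.items()
        if value
    ]
    open_tag = "<span " + " ".join(parts) + ">" if parts else "<span>"
    return f"{open_tag}{text}</span>"
```

Attributes are emitted in insertion order, so a test that compares output strings only fails when that order differs from the one the expected string was written with.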

Consistent hyphenation in tooltip text

Compound adjectives need hyphens. The module’s output is inconsistent. Examples:

Markup Renders as
{{lang-uk|Українська|transl=Ukrainska}}

{{lang-uk|Українська|translit=Ukrainska}}

{{transl|uk|Ukrainska}}

Ukrainska

{{transl|Cyrl|Ukrainska}}

Ukrainska

{{lang|uk|Українська}}

Українська

The tooltips displayed are:

  1. Ukrainian-language romanization
  2. Ukrainian-language romanization
  3. Cyrillic-script transliteration
  4. Ukrainian language text

The last needs a hyphen, to make it read as a compound adjective plus noun: “Ukrainian-language text.” Right now it is adjective noun noun, and doesn’t really make sense.

Safer if someone familiar with the code attempts this, but I’d be willing to give it a shot myself.

(Also odd that italicization is unexplainably different in nos. 1/2 and 3.) —Michael Z. 01:14, 21 January 2021 (UTC)

Also also odd that 1 and 2 are “romanization” but 3 “transliteration.” They are all the same (transliterations and also more specifically romanizations), so why are the labels different? —Michael Z. 01:16, 21 January 2021 (UTC)
And 1 lacks a tooltip but 4 has one, when they are functionally the same. —Michael Z. 01:21, 21 January 2021 (UTC)
{{lang-uk|Українська|translit=Ukrainska}} → {{lang-uk|Українська|translit=Ukrainska}}
Українська does not have a tool tip because it has a link to Ukrainian language
|translit=Ukrainska uses the same code as {{transl}} so has a tool tip: 'Ukrainian-language romanization'
{{transl|uk|Ukrainska}} → Ukrainska
Ukrainska is a romanization so has a tool tip: 'Ukrainian-language romanization'
{{transl|Cyrl|Ukrainska}} → Ukrainska
example is malformed; Cyrillic script cannot be a romanization so 'transliteration' is more appropriate for correctly formed templates (a 'Cyrillization' of Latin-script text ...)
{{lang|uk|Українська}} → Українська
does not render a link to Ukrainian language so has a tool tip: 'Ukrainian language text' but needs a hyphen. I can attend to that tomorrow.
Trappist the monk (talk) 02:08, 21 January 2021 (UTC)
Hyphen added.
Trappist the monk (talk) 13:42, 21 January 2021 (UTC)
Thank you!
Markup → Renders as
{{lang-uk|Українська|transl=Ukrainska|label=none}}
{{lang-uk|Українська|translit=Ukrainska|label=none}}
{{transl|Cyrl|Ukrainska}} → Ukrainska

Example 1: the lang-xx templates should add the “XX-language text” tooltip when the label is hidden, for consistency.
Example 2 is not malformed. The script code indicates the source script for the purpose of composing the tooltip (just like a language code indicates source), and is not used to add an HTML language tag like lang="und-Cyrl". It can be used for passages with non-linguistic content (e.g., codes or nonsense, I guess), of undetermined language, or with multiple languages. The rendered HTML is only <span title="Cyrillic-script transliteration">Ukrainska</span>; perhaps it would be clearer if it were “romanization from Cyrillic,” but that should be part of another conversation.
Do we even have templates usable for transliteration into anything other than Latin script, i.e., Arabification, Cyrillization, etcetera? —Michael Z. 16:03, 21 January 2021 (UTC)
Example 1 fixed.
Not really sure why {{transl}} is used with script-only tags. If we are writing about nonlinguistic content, why do we need to wrap that content in a template? The Module:Lang version of {{transl}} is an enhanced version of the original wikisource template in that it adds the title= and lang= attributes and italicizes non-English romanizations. When given just an ISO 15924 script tag, the module version does more-or-less what the wikitext version did except that it also creates a title= attribute for whatever that is worth. I suppose that we could have the module create a lang=und-<script tag> (undetermined) or lang=zxx-<script tag> (No linguistic content) attribute when given only a script tag. I don't recall if I thought about this when I converted the wikitext to Lua; if I did I suspect that it got pushed off the queue by more important stuff.
Still, why do we need this template for nonlinguistic content? If we are to keep it, I agree that something like romanization from [<script>] might be a better choice for a title= attribute.
I do not know of any transcription templates for Arabification, Cyrillization, etcetera.
Trappist the monk (talk) 00:18, 22 January 2021 (UTC)
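The tooltip wording discussed in this thread can be summarized in a short sketch. This is illustrative Python, not the Lua in Module:Lang; the name tables and the functions `text_title`, `transl_title`, and `transl_span` are hypothetical. It reproduces only the title strings and the script-only HTML quoted above, and leaves the proposed lang=und-<script tag> attribute out because that was only floated as a possibility.

```python
from html import escape

# Tiny illustrative name tables; the real data is far larger.
LANGUAGE_NAMES = {"uk": "Ukrainian"}
SCRIPT_NAMES = {"Cyrl": "Cyrillic"}


def text_title(lang_code):
    """Tooltip for native-script text, with the hyphenated compound adjective."""
    return LANGUAGE_NAMES[lang_code] + "-language text"


def transl_title(code):
    """Tooltip for {{transl}}: language codes and script codes are worded differently."""
    if code in SCRIPT_NAMES:
        return SCRIPT_NAMES[code] + "-script transliteration"
    if code in LANGUAGE_NAMES:
        return LANGUAGE_NAMES[code] + "-language romanization"
    raise ValueError("unknown language or script code: " + code)


def transl_span(code, text):
    """Wrap the transliterated text in a span carrying only a title attribute."""
    return '<span title="{}">{}</span>'.format(
        escape(transl_title(code), quote=True), escape(text)
    )


print(text_title("uk"))
print(transl_title("uk"))
print(transl_span("Cyrl", "Ukrainska"))
```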

Error message

https://wiki.riteme.site/wiki/Category:Articles_containing_traditional_Chinese-language_text

It has an error message: Error: traditional Chinese is not a valid ISO 639 or IETF language name. Please see Template talk:Lang for assistance.

How do you fix the error — Preceding unsigned comment added by 64.222.180.90 (talk) 18:29, 15 March 2021 (UTC)

The error happens because the module can't find the corresponding code for "traditional chinese". As I understand it the solution would be to add an entry in the override table in Module:Lang/data like ["zh-hant"] = {"Traditional Chinese"}, but the page can only be edited by admins and template editors. josecurioso ❯❯❯ Tell me! 19:46, 15 March 2021 (UTC)
Just checked and Category:Articles containing simplified Chinese-language text has the same problem so probably add another one with ["zh-hans"] = {"Simplified Chinese"},. josecurioso ❯❯❯ Tell me! 19:48, 15 March 2021 (UTC)
I have replaced the inappropriate {{Non-English-language text category}} (because these categories are not populated by Module:Lang) with more-or-less appropriate wikitext.
Trappist the monk (talk) 21:32, 15 March 2021 (UTC)
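The name-to-code lookup that fails in the error above can be pictured as follows. This is a Python sketch under stated assumptions: the two entries mirror the ones suggested in this thread, but the `OVERRIDE` dictionary and the `code_for_name` function are hypothetical and do not reflect the actual structure of Module:Lang/data.

```python
# Hypothetical override table: language code -> list of names for that code.
OVERRIDE = {
    "zh-hant": ["Traditional Chinese"],
    "zh-hans": ["Simplified Chinese"],
}


def code_for_name(name, table=OVERRIDE):
    """Find the code whose name matches, ignoring letter case."""
    wanted = name.casefold()
    for code, names in table.items():
        if any(candidate.casefold() == wanted for candidate in names):
            return code
    # Mirrors the wording of the error message quoted in this thread.
    raise LookupError(name + " is not a valid ISO 639 or IETF language name")


print(code_for_name("traditional Chinese"))
```

Without a matching entry the lookup has nothing to return, which is why a category page that passes the name "traditional Chinese" ends up showing the error.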

Type of space used after "lit."

In {{Lang-??}} templates, change the space or non-breaking space following "lit." (|lit=) to a thin space. The value is output inside quotation marks by default and there's a full stop on the other side of the space; it takes up too much space as it is. Idell (talk) 17:32, 23 March 2021 (UTC)

Not done for now: please establish a consensus for this alteration before using the {{edit template-protected}} template. The current code renders <small>[[Literal translation|lit.]]&nbsp;</small>&#39;translation&#39; (the &#39; HTML entity is a single quote mark), which looks like: lit. 'translation'.
Is there a MOS section or other guideline that advises the use of thin spaces in cases like this? – Jonesey95 (talk) 19:52, 23 March 2021 (UTC)
MOS:ACRO1STUSE recommends the use of thin spaces where "... the awkwardness of the abbreviation ... can be reduced by replacing the full-width spaces with thin-space characters." That is, to better group the letters into a unit in the example there. meta:Help:Newlines and spaces recognizes the use, giving SI units and values as an example: "depending on the language- and typographic-specific rules, which sometimes require a non-breaking thin space instead (half of a normal space)." Finally, both {{circa}} and {{floruit}} use thin spaces in situations very similar to what we have here. Idell (talk) 21:47, 23 March 2021 (UTC)
I find the consistency with c. and fl. a persuasive argument. Circa has used thinsp since 2009, AFAICT. I have no objection to extending that usage to this template. – Jonesey95 (talk) 00:04, 24 March 2021 (UTC)
(edit conflict)
Here are some examples using the markup from this template instance:
{{lang-es|casa|lit=house}} → {{lang-es|casa|lit=house}}
The template output is:
[[Spanish language|Spanish]]: <i lang="es">casa</i>, <small>[[Literal translation|lit.]]&thinsp;</small>&#39;house&#39;
and the examples:
Spanish: casa, lit. 'house' ← uses &nbsp; (template output as currently written)
Spanish: casa, lit.'house' ← uses &thinsp;
Spanish: casa, lit.'house' ← uses &#8201; (thin space)
Spanish: casa, lit.'house' ← uses &#8239; (narrow no-break space)
Spanish: casa, lit.'house' ← uses U+2009 Unicode thin space character
Spanish: casa, lit. 'house' ← uses U+0020 Unicode space character
Spanish: casa, lit.'house' ← uses U+202F Unicode narrow no-break space
Spanish: casa, lit. 'house' ← uses width:0.1em css
In my browser and OS (latest Chrome, Windows 10), no matter how far I zoom in, I see no difference among the above examples. Here are the same examples with the various spaces moved outside the <small>...</small> tags:
Spanish: casa, lit. 'house' ← uses &nbsp;
Spanish: casa, lit. 'house' ← uses &thinsp;
Spanish: casa, lit. 'house' ← uses &#8201; (thin space)
Spanish: casa, lit. 'house' ← uses &#8239; (narrow no-break space)
Spanish: casa, lit. 'house' ← uses U+2009 Unicode thin space character
Spanish: casa, lit. 'house' ← uses U+0020 Unicode space character
Spanish: casa, lit. 'house' ← uses U+202F Unicode narrow no-break space
Spanish: casa, lit. 'house' ← uses width:0.1em css
Again, for me, no visible differences.
Trappist the monk (talk) 00:28, 24 March 2021 (UTC)
On Safari, &thinsp;, &#8201; and U+2009 give the exact same result within or outside <small>...</small> tags. &nbsp; and the Unicode space character both produce a wider gap, which is likewise the same whether within or outside <small> tags.
How are you going to prevent wrapping? Idell (talk) 14:04, 24 March 2021 (UTC)
I've added U+202F Unicode narrow no-break space to the above.
Trappist the monk (talk) 14:30, 24 March 2021 (UTC)
Added css alternative.
Trappist the monk (talk) 14:58, 24 March 2021 (UTC)
Amount of whitespace widths as they appear to me: &#8239;=U+202F(NNBSP) < &#8202;=&hairsp; < width:0.1em < &#8201;=U+2009=&thinsp;
If being closest to thin space is the goal, then I recommend the CSS code. I suppose that would be consistent across devices as well, avoiding the rendering differences. Idell (talk) 15:35, 24 March 2021 (UTC)
On my screen, the full-size spaces look bigger than the narrow/thin spaces. The CSS is just slightly narrower. I prefer the thin space options. – Jonesey95 (talk) 16:33, 24 March 2021 (UTC)

Then to prevent wrapping, I'll recommend the code: <span style="white-space:nowrap">&thinsp;</span>. But I still prefer a narrower character as I've suggested above. There appears to be a lot of space particularly because there is a single-quotation mark after the space, in addition to the full stop before it (lit. 'house'). Idell (talk) 17:28, 24 March 2021 (UTC)

Having compared the results on different screens, I’ve decided we should use &thinsp;. It seems to be the common practice, and it is one of the most consistent options across different devices. To prevent line breaking, we should use white-space:nowrap as I've suggested above; it's better than nothing. On many devices, <small> tags have no effect on some spacing characters, including the thin space, so let it remain inside; there is no reason to put the thin space outside them. Idell (talk) 14:39, 25 March 2021 (UTC)
It looks like this change has been implemented, but there is a minor bug when some wikilinks are used in the |lit= parameter. The closing span tag is put in the wrong place. Example: {{Lang-ru|Феникс|lit=[[Phoenix (mythology)|phoenix]]}}. The closing span tag is placed after "Phoenix" and before " (mythology)", in the middle of a wikilink. – Jonesey95 (talk) 05:31, 29 March 2021 (UTC)
Withdrawn.
Trappist the monk (talk) 13:32, 29 March 2021 (UTC)
Sticking to the basic point of the edit request: replace the space with a simple &thinsp;. Idell (talk) 14:07, 29 March 2021 (UTC)
I have reinserted the thinsp without attempting to modify the wrapping. – Jonesey95 (talk) 15:57, 29 March 2021 (UTC)