Module talk:WikidataIB/Archive 3
This is an archive of past discussions about Module:WikidataIB. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
New parameter for getValue sought to avoid attempt to resolve redirects
Lines 371 and 372 perform expensive operations that record transclusions to pages. This occasionally results in the module outputting a plain label when an article exists but is not a redirect (the module assumes it is a DAB page). However, because of artitle.id and artitle.isRedirect, even though no clickable link is produced, a page link is still registered, causing the page to pop up on Disambiguation pages with links. I would like a parameter to pass to getValue that will disable lines 371/372. If no valid sitelink exists, go ahead and link to Wikidata, without checking for a local redirect. Terribly verbose, but something like "noRedirectResolution". Alternatively, disable this section entirely until such time as we can test for page existence without registering a page link. -- ferret (talk) 00:32, 30 May 2018 (UTC)
- The reason for that code is that Wikidata refuses to allow an entry like archaeologist (Q3621491) to have a sitelink to Archeologist, merely because the latter is a redirect (the English Wikipedia discusses 'archaeologist' as part of the 'Archaeology' article). If it were not for that code, biographical infoboxes would not have links to many of the occupations, for example. Who would make use of a parameter that disabled the code? Would any infobox designer use a parameter that disabled linking for an unpredictable number of values or fields? I seriously doubt it. There is no problem with expensive calls in the module – the call mw.title.new(id) is not used. The problem is that mw.title.new(label, namespace) – which is called when there is a label, but no sitelink – incorrectly makes an entry in the table used for "What links here". That affects every Lua module using that call, not just WikidataIB, and that is the problem that needs to be fixed. I'll investigate alternate methods of deciding whether a label represents a linkable article on the local Wikipedia. In the meantime, anybody can bypass the problem by supplying a local value. --RexxS (talk) 15:07, 30 May 2018 (UTC)
- @Ferret: I've been trying to minimise the use of mw.title.new(label, namespace) in the sandbox, but I'm having difficulty testing because I can't find examples of the effect you are referring to. Could you list perhaps half-a-dozen articles where the problem occurs, please? and I'll see if I can improve matters at all. --RexxS (talk) 15:50, 30 May 2018 (UTC)
- @RexxS: mw.title.new itself is not an issue, it is calling .id or .isRedirect on the resulting object. These are the ones that register page links. Even if the resulting output does not contain a link, as soon as .id/.isRedirect are used, a transclusion is registered. I have not been able to find any way to test for page existence that does NOT cause a page link to register. The source of complaint resulting in this request is at Template talk:Infobox video game#The series field and Template talk:Infobox video game#Skate (video game) and Template:Infobox video game (series= field). In both cases, editors who are attempting to resolve DABLINK reports have come across cases where the series parameter is pulling from Wikidata, and the linked item does not have an enwiki article. Lines 371 and 372 take the linked item's label and attempt to discover if it's a redirect (neither case was), and then output a plain label (assuming the article to be a DAB). On the surface, this is exactly the behavior you would want. Unfortunately, calling .id and .isRedirect registers a page link anyway, and the article is pushed into the DABLINK reports. See the back and forth edit history on Skate (video game). As a test case, you can look at Module:Sandbox/Ferret. If you check What links here on User:Ferret/sandbox, you'll see that User:Ferret/sandbox2's use of the module and artitle.isRedirect (well, redirectTarget, but same result) causes it to register as a linked page. I used makeTitle but new works the same way. It's the call to the title object that actually makes the link, not mw.title.new. -- ferret (talk) 16:42, 30 May 2018 (UTC)
- @Ferret: Sure, that's true. But as soon as I create the object, I test for its id and isRedirect properties, otherwise there's no point in creating it, so we end up with the same result. I've re-written the sandbox code so that if there is no sitelink, I check whether the value itself (which is a wikibase-item) has a property instance of (P31) equal to Wikimedia disambiguation page (Q4167410). If so, I give the plain label and skip checking for existence of an article and whether it's a redirect. Now, if a wikibase entity exists which has no sitelink to enwiki, has an English label, is not an instance of a dab page, has an article title on enwiki the same as the label, but is not a redirect, then I don't know what it is, but it will turn up in Disambiguation pages with links. However, I hope there won't be many of those once the sandbox code is implemented.
- As for your sandbox and the rest, I'm aware of the spurious linking, and I understand how it happens, but I'm still waiting for somebody to give me a few examples of actual articles using WikidataIB that provoke the problem. Then I can test whether the sandbox code fixes the problem. I'm not keen to modify the main module until I can show that the sandbox code does what I'm expecting it to do. Can you help with that? --RexxS (talk) 18:26, 30 May 2018 (UTC)
- Skate (video game) does it. It is currently not doing it because the wikidata pull has been suppressed. Same for Eye of the Beholder (video game). These are the two that started the discussions of spurious links. Removing |series from either should do it. -- ferret (talk) 21:03, 30 May 2018 (UTC)
- @Ferret: Thanks. That let me test other possibilities. I agree that the call mw.title.new(label, namespace) alone doesn't cause the spurious link. I thought I could circumvent the issue by replacing the test using the title object's id and isRedirect properties with one that uses the title object's redirectTarget property. Unfortunately, despite the documentation not indicating it, it also creates a spurious link. I'll have another think. --RexxS (talk) 13:32, 31 May 2018 (UTC)
- I found the same issue. Several of the title object properties are not explicitly labelled as creating a site link but do. It seems to be any of the expensive ones that cause a DB retrieval though. -- ferret (talk) 13:36, 31 May 2018 (UTC)
- That's right. It's because the devs used a short-cut to record the change of state in the same table that records links. And I'm now having an argument with Anomie on this very issue at mw:Extension talk:Scribunto/Lua reference manual #Title object and spurious links because I dared to update his documentation. If you have a moment to spare, you might pitch in to emphasise just how important it is to fix this bug, because otherwise the title library is pretty useless in Wikipedia. Cheers --RexxS (talk) 17:51, 31 May 2018 (UTC)
- Anomie appears to be correct in the strictest sense. isRedirect is a link, redirectTarget is a transclusion. Both appear in "What Links Here". I do not know if DABLINK stuff tells the two apart or not. If so, it is worth trying redirectTarget to find if a redirect works without tripping DABLINK. -- ferret (talk) 19:06, 31 May 2018 (UTC)
- Well, Anomie is insistent that redirectTarget doesn't create a link, so I'll happily refer any future complaints to him to sort out. The sandbox code does indeed only produce a transclusion, so I'm happy that it's an improvement, but I'd be willing to bet that the gnomes who look for links to dab pages won't see it that way. I'm waiting for the RfC on using Wikidata in infoboxes to be concluded before rolling out the updates currently in the sandbox, but hopefully those will lay to rest many of the current issues that folks have with how we fetch Wikidata. Cheers --RexxS (talk) 20:25, 31 May 2018 (UTC)
- I've also re-enabled my test code in User:RexxS/sandbox/Wikidata so we can check whether that page shows up as a link to the dab page Skate in the other reports. --RexxS (talk) 20:29, 31 May 2018 (UTC)
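The sandbox approach described in this thread (check for a sitelink first, then check instance of (P31) for Wikimedia disambiguation page (Q4167410), and only then fall back to the title test that registers an entry in "What links here") can be illustrated with a minimal sketch. This is not the module's actual code: linkForValue, the lookups table and its four functions are hypothetical stand-ins for the Wikibase and title calls, and the item IDs and labels in the stub are made up so that the logic can run outside MediaWiki.

```lua
-- Hypothetical sketch of the decision order discussed above; not WikidataIB's code.
local DAB_CLASS = "Q4167410" -- Wikimedia disambiguation page

-- `lookups` is an assumed table of functions standing in for Wikibase/title calls.
local function linkForValue(qid, lookups)
    -- 1. A sitelink to the local wiki wins: link to it directly.
    local sitelink = lookups.getSitelink(qid)
    if sitelink then
        return "[[" .. sitelink .. "]]"
    end
    -- Assumption: fall back to the bare item ID when there is no label.
    local label = lookups.getLabel(qid) or qid
    -- 2. No sitelink: if the item is an instance of a dab page, return the
    --    plain label without touching any title object.
    for _, class in ipairs(lookups.getInstanceOf(qid)) do
        if class == DAB_CLASS then
            return label
        end
    end
    -- 3. Only now perform the title test (the step that records a page link
    --    or transclusion in a real wiki).
    if lookups.isLocalRedirect(label) then
        return "[[" .. label .. "]]"
    end
    return label
end

-- Stub data (entirely made up) to exercise the three branches.
local titleChecks = 0
local stub = {
    getSitelink = function(qid) return ({ Q101 = "Example article" })[qid] end,
    getLabel = function(qid)
        return ({ Q101 = "example", Q102 = "Ambiguous name", Q103 = "Example redirect" })[qid]
    end,
    getInstanceOf = function(qid) return ({ Q102 = { DAB_CLASS } })[qid] or {} end,
    isLocalRedirect = function(title)
        titleChecks = titleChecks + 1 -- count how often the costly test runs
        return title == "Example redirect"
    end,
}

print(linkForValue("Q101", stub)) -- sitelink branch
print(linkForValue("Q102", stub)) -- dab branch, no title test
print(linkForValue("Q103", stub)) -- title-test branch
```

The counter in the stub makes the point of the ordering visible: the title test is reached only for items that have neither a sitelink nor a dab classification.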
Major update, June 2018
While the Wikipedia:Wikidata/2018 Infobox RfC has been running, I've not updated the main module, but left developments in the sandbox. Now that the RfC has ended, I've updated the main module from the sandbox. This has brought some improvements both in performance and functionality.
- Performance gains will occur because the module now only loads the part of the Wikidata entry that it needs for the call, rather than the whole Wikidata entry (which was formerly the only way of accessing the data). The calls to other Wikidata entries which are not immediately associated with the page are no longer expensive.
- The getValue call now supports ranks via a parameter, offering more flexibility than using getPreferredValue (which is now just a call to getValue with rank set to "best").
- The getValue call now supports returning qualifiers as values in parentheses after the property value.
- The getValue call now allows a number of extra parameters to provide extra functionality such as limiting the number of values returned, or auto-collapsing the list of returned values if the number of values exceeds a given number. Details are in the documentation at Module:WikidataIB/doc #Parameters to getValue.
- There is a new call, getValueByQual, which works similarly to getValue, but only returns value(s) that have a particular qualifier with a given value. Details are in the documentation at Module:WikidataIB/doc #Function getValueByQual.
- There is a new call, getValueByLang, which works similarly to getValue, but only returns value(s) that have the qualifier language of work or name (P407) with a given language code as its value. Details are in the documentation at Module:WikidataIB/doc #Parameters to getValueByLang.
- The getValue call now displays its results a little differently when the value returned has a sitelink available. In those cases, the link remains to the sitelink, but the displayed text uses the site link (with disambiguation text removed) instead of the label. This is a response to the vulnerability of labels to vandalism on Wikidata.
- The wrapper template {{wdib}} is a convenient shortcut for {{#invoke:WikidataIB |getValue | ...}} using the same parameters. For example, the spouse (P26) of Douglas Adams (Q42): {{wdib |P26 |qid=Q42 |fwd=ALL |qual=DATES}} gives Jane Belson (1991–2001)
There are numerous test cases/examples at Module talk:WikidataIB/testing. Please ping me if problems arise. --RexxS (talk) 18:27, 13 June 2018 (UTC)
- This module is getting ridiculously bloated. Pppery (talk) 21:24, 13 June 2018 (UTC)
- Don't be so rude. Nevertheless, you should feel free to make constructive suggestions about which parts of the functionality you would remove to improve the module. --RexxS (talk) 23:25, 13 June 2018 (UTC)
getPreferredValue causing problem on page without a WD item ?
I know, the simple answer is "So make the item."
{{Infobox video game}} on use at The Sinking City is not displaying the locally defined value for image=. Currently the article has no WD item linked. Substituting a random QID for testing, the image will then appear. Looks like something isn't gracefully failing when there is no QID. -- ferret (talk) 02:28, 14 June 2018 (UTC)
- I believe this is now fixed. Pppery (talk) 02:38, 14 June 2018 (UTC)
- Confirmed, is fixed, thanks. -- ferret (talk) 02:40, 14 June 2018 (UTC)
Utility functions
One of the important consequences of developing ways of importing Wikidata into other Wikimedia projects is that work done in one project can be put to use in other projects. This module is used on 40 other projects at present, including Commons. The utility functions such as emptyor, getLang, formatNumber, examine, etc. will find use in debugging, constructing infoboxes, or developing functionality. Several of those could be moved into one or more different modules, but that simply means that any other project that wants to use them will end up importing multiple modules for no gain. Keeping all of the utility functions with the main module also means that if I'm asked to debug code on another project, I can be reasonably sure that I have these tools available, even if I can't read the language there.
I therefore disagree with Pppery removing utility functions with no other justification than "Remove unused (outside of doc page examples) function that has nothing to do with Wikidata". I use them quite often simply in preview mode just to check or debug something, so their absence from saved pages does not correlate with their usefulness. Frankly, I also don't accept the premise that only functions that are directly related to Wikidata belong in this module. It was created to implement mechanisms for including Wikidata in infoboxes. Functions like emptyor that are useful in constructing infoboxes belong in the module, even if they don't connect directly to Wikidata. Similarly, the ability to see how a number ought to be formatted in a given language, or checking the current content language code or name, becomes valuable in any multi-lingual wiki, even if it seems superfluous on enwiki. Leaving them in place, like the code that detects which project the module is on, allows one set of code to be used across multiple projects (with local internationalisation, of course).
If there were any evidence that shortening modules by removing functionality improves performance, I'd reconsider the goal of keeping one set of code across projects. But for the present, I think that better reasons need to be given for removing functionality. --RexxS (talk) 01:18, 15 June 2018 (UTC)
- My desire to keep this module from getting overly bloated isn't because I'm trying to make my module more performant, but rather a goal to keep this module from becoming a monolith of unrelated tools. Each module should only have the things that it directly needs in the place it is directly used, not random other miscellany, not things that are only useful in slightly different contexts (like other Wikis), thus things that don't contribute to this purpose are not necessary and should be purged. The English Wikipedia isn't multi-lingual, so "the ability to see how a number ought to be formatted in a given language, or checking the current content language code or name, becomes valuable in any multi-lingual wiki" is irrelevant. As I said before, this is Module:WikidataIB, and should contain only functions that actually involve wikidata and infoboxes, not other random miscellany. Pppery (talk) 01:41, 15 June 2018 (UTC)
- Everyone would consider it ridiculous to have a template that did something like {{#switch:{{{1|}}}|foo = <code foo>|bar = <unrelated code bar> | baz = <unrelated code baz> | quux = <unrelated code quux>}}. How is this module somehow different? Pppery (talk) 01:45, 15 June 2018 (UTC)
- (ec) I'll try to sort out what's going on later. I was thinking of reverting all Pppery's adjustments because they were often unnecessary and abrasive, but there might have been some points that should be retained. In particular the procedure of fiddling in a live module with many thousands of transclusions takes bold in a bad direction. The template editor right should be retained only by editors with the right temperament for collaboration in a technical area. Meaningless optimization of this module for use on enwiki in a manner that complicates its use on other projects is a bad idea. There might be some point if there were test cases that showed the optimization saved significant time or resources, but of course there aren't because there is no observable improvement. Johnuniq (talk) 01:46, 15 June 2018 (UTC)
- @Pppery: I'm not going to play edit-war games with you, but I disagree that the module is becoming "a monolith of unrelated tools". I use all of those tools when working with Wikidata integration into infoboxes, and I'm not impressed by your suggestion that they are "unrelated". They are related for me by the fact that I use them when working with this module. "Each module should only have the things that it directly needs in the place it is directly used". Says who? You? Modules are fundamentally different from templates. Templates have a single point of entry, but modules are designed to have multiple independent functions; they resemble shared libraries more than a stand-alone program. If I can collect together a group of functions that are useful to me (and presumably useful to others), then why should you have a veto because of some philosophical objection, or misunderstanding of what modules are?
- There is no measurable performance hit by having many independent functions in a single module, and there is no problem with maintaining such code in a well documented code source. There is no downside in having these functions in the module, and I'd like to have them available.
- Your other argument, "The English Wikipedia isn't multi-lingual, so 'the ability to see how a number ought to be formatted in a given language, or checking the current content language code or name, becomes valuable in any multi-lingual wiki' is irrelevant", is simply blinkered. Having multilingual capability is vital for multilingual wikis and useful for reducing the amount of internationalisation needed when transferring a module to another Wikipedia. The work we do is not just for English Wikipedia, but for all of the projects who want to use it. There is therefore considerable advantage in maximising the amount of common code in terms of updates and maintenance, so I really don't want to have code forks beyond the internationalisation at the start of the module (or from a local sub-module). I'm not seeing any advantage for anyone in removing those functions and I'm going to ask you to revert yourself now. --RexxS (talk) 19:20, 15 June 2018 (UTC)
- Modules are not fundamentally different from templates in the way you claim. Like I said above, one could begin a template with a #switch statement, and have it do unrelated things depending on the first parameter, but no one would ever code such a template. You are suggesting the exact same thing. You want this module to be equivalent to {{#switch:{{{1|}}}|getValue={{#property:{{{2|}}}|from={{{3|}}}}}|formatNum={{formatnum:{{{2|}}}}}...}}, which is not how one codes templates in MediaWiki. The utility functions are unrelated in the sense that if one were to move them to a separate module, no line of code other than boilerplate like local p = {} ... return p would need to be included in both modules, and in the sense that, unlike the rest of the module, they would work in the world in which mw:Extension:Wikibase Client wasn't installed. I'm not misunderstanding what modules are -- we simply have different ideas about what they are.
- I don't dispute that having common code is a good goal, but that does not mean that every single line of code except for some predefined section needs to be exactly the same. Multi-lingual wikis can fork the code to stick appropriate {{int:lang}} in places, Commons can fork it to use labels instead of sitelinks etc. Trying to have every single line of code centralized is a misguided goal that leads only to monsters like Module:Cycling race littered with code like if wiki == "mk" or wiki == "ja" or wiki == "ru" then ..., and thus I will not be self-reverting. Pppery (talk) 19:51, 15 June 2018 (UTC)
- Modules are fundamentally different from templates in the way I explained. Your example would be analogous to writing a module containing one function that used a parameter to switch between different unrelated things. No one would write a module like that, either. However, the ability to collect together a group of functions into a single module as desired is a fundamental feature of the way modules are implemented on MediaWiki. I have a practical reason to make a particular collection as I explained above. You have offered no reason why your removal of functions improves the module in terms of performance, convenience, portability or any other objective consideration.
- Every single line of code except for some predefined section does need to be the same, as it allows simple updating by a single copy-and-paste. For code that is still growing and developing to meet editors' needs, and hence requires frequent updates, that is a compelling reason. There is no problem whatsoever with a few key sections having code that switches between two options based on the requirements of different wikis. This is far more than just a question of {{int:lang}}, as wikis like Commons have completely different page titles from enwiki, which makes the substitution of sitelinks for labels undesirable there. There is simply no advantage in forking code for different wikis when one piece of code can do the job for all of them. Your fear of "bloat" is completely unfounded. Are you or anyone else having any problems following the program flow in getValue? No? Then explain what problems having wiki-sensitive switches create.
- I think it's time to ask for a third opinion, as I don't believe you are able to grasp the points I'm making, and I don't intend to have you make my work in maintaining this module any harder merely to satisfy your desire to meddle, rather than improve. --RexxS (talk) 20:47, 15 June 2018 (UTC)
- There's definite value in keeping the same version between the different wikis rather than having different versions around that are difficult to sync with each other, and that's worth more than having a bit of redundant code here. It might be worth considering having a separate utility module at some point that this one calls functions from (in the same way that {{convert}} uses a few different sub-modules), but only when there's a clear benefit to doing so. Thanks. Mike Peel (talk) 21:36, 15 June 2018 (UTC)
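The arrangement suggested in the last comment (a separate utility module that the main module calls functions from) could look roughly like the following sketch. Everything here is an assumption for illustration: the module name "example_utils", the helper blankToNil (a simplified whitespace-only check, not the module's emptyor) and the function pick are hypothetical. Plain Lua's package.preload is used so that both "modules" fit in one runnable file; on a wiki the utility code would live on its own module page and be loaded with require.

```lua
-- Hypothetical utility module, registered under an assumed name so that
-- require() can find it without a separate file.
package.preload["example_utils"] = function()
    local u = {}

    -- Return nil for nil or whitespace-only strings, otherwise the trimmed string.
    function u.blankToNil(s)
        if s == nil then
            return nil
        end
        local trimmed = s:match("^%s*(.-)%s*$")
        if trimmed == "" then
            return nil
        end
        return trimmed
    end

    return u
end

-- Hypothetical main module: several independent entry points on one table,
-- with the generic helper pulled in from the utility module.
local utils = require("example_utils")
local p = {}

-- Prefer a locally supplied value; otherwise use the value fetched elsewhere.
function p.pick(localValue, fetchedValue)
    return utils.blankToNil(localValue) or fetchedValue
end

print(p.pick("  ", "value from Wikidata"))
print(p.pick(" local value ", "value from Wikidata"))
```

The split keeps the helper usable without any Wikibase dependency, at the cost of every consuming project having to import two modules instead of one, which is the trade-off argued over in this thread.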
Fetching from Wikidata no longer working
@RexxS: I noticed you recently made a change to the module. @DBD: has recently brought to my attention that {{Ordination}} is no longer fetching Wikidata values properly. Do you know if this is related to the recent edit? Ergo Sum 01:33, 26 June 2018 (UTC)
- Discuss at Template talk:Ordination#Wikidata fault to avoid fragmentation, perhaps? --RexxS (talk) 02:25, 26 June 2018 (UTC)
Meaning of the code fragment for century
I've been trying to see how various things that pull data from Wikidata are interpreting centuries, and I came across this code fragment:
    elseif dateprecision == 7 then -- century
        local century = math.floor((fpvdate - 1) / 100) + 1
        fdate = makeOrdinal(century) .. " " .. i18n["century"]
I'm very new at Lua, but if I understand this correctly, before the operation, the 4 digits of the year have been parsed out (it would fail for 5 or more digits in the year; might be nice to document the working range) and put in a number, fpvdate. Then subtract 1 from fpvdate and divide by 100 and round down. So 2001 becomes 20. Then add 1, so it's 21st century.
Suppose the year is 2000. floor(1999/100) = floor(19.99) = 19. Add 1 and it's 20th century. But as I read mediawikiwiki:Wikibase/Indexing/RDF Dump Format and mediawikiwiki:Wikibase/DataModel/JSON the precision is treated as an order to disregard certain parts of the time stamp. Precision 7 orders you to disregard all the digits lower than the hundreds digit of the year in the timestamp, so +2000-01-01T00:00:00Z is treated as +20??-??-??T??:??:??Z and the range of years that satisfies it is 2000 to and including 2099. (By the way, are the timestamps you receive JSON, RDF, or something else?)
Indeed, a common interpretation of century (for years >= 100) is a period starting in a year whose last 2 digits are 01 and ending on Dec 31 of a year whose last 2 digits are 00. Since this common understanding of "century" disagrees with the Wikidata models, maybe it would be better not to use the word "century" at all. Maybe something like "2000 to 2099 inclusive". Jc3s5h (talk) 05:23, 18 July 2018 (UTC)
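The arithmetic walked through above can be checked with a small stand-alone sketch. centuryOf reproduces the expression from the quoted fragment; truncatedBlockStart is a hypothetical helper added here to show the "disregard the low digits" reading for comparison. Only positive years are considered; negative (BCE) years are outside what this sketch covers.

```lua
-- Century number as computed in the quoted fragment ("01 to 00" convention).
local function centuryOf(year)
    return math.floor((year - 1) / 100) + 1
end

-- Hypothetical helper: first year of the 100-year block obtained by ignoring
-- the tens and units digits of the year (the truncation reading).
local function truncatedBlockStart(year)
    return math.floor(year / 100) * 100
end

for _, year in ipairs({ 1925, 2000, 2001 }) do
    print(year, centuryOf(year), truncatedBlockStart(year))
end
-- 1925: century 20, block starts 1900
-- 2000: century 20, block starts 2000
-- 2001: century 21, block starts 2000
```

The year 2000 is where the two readings part ways: the fragment places it in the 20th century, while truncation places it in the block beginning at 2000.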
- The Battle of Actium (Q160387) has a property point in time (P585) which was 2 September 31 BCE. That is stored as:
table#1 {
  table#2 {
    ["id"] = "Q160387$e2a5cc8e-4b09-83ec-21a8-cb01a5a092d9",
    ["mainsnak"] = table#3 {
      ["datatype"] = "time",
      ["datavalue"] = table#4 {
        ["type"] = "time",
        ["value"] = table#5 {
          ["after"] = 0,
          ["before"] = 0,
          ["calendarmodel"] = "http://www.wikidata.org/entity/Q1985786",
          ["precision"] = 11,
          ["time"] = "-0031-09-02T00:00:00Z",
          ["timezone"] = 0,
        },
      },
      ["property"] = "P585",
      ["snaktype"] = "value",
    },
    ["rank"] = "normal",
    ["references"] = table#6 {
      table#7 {
        ["hash"] = "07d030994052a5fb9a9e5f636e3f6b6be2f015d6",
        ["snaks"] = table#8 {
          ["P248"] = table#9 {
            table#10 {
              ["datatype"] = "wikibase-item",
              ["datavalue"] = table#11 {
                ["type"] = "wikibase-entityid",
                ["value"] = table#12 {
                  ["entity-type"] = "item",
                  ["id"] = "Q24329384",
                  ["numeric-id"] = 24329384,
                },
              },
              ["property"] = "P248",
              ["snaktype"] = "value",
            },
          },
        },
        ["snaks-order"] = table#13 {
          "P248",
        },
      },
      table#14 {
        ["hash"] = "f8f8b2ec64c447e7f2f498666b111182114d82c6",
        ["snaks"] = table#15 {
          ["P248"] = table#16 {
            table#17 {
              ["datatype"] = "wikibase-item",
              ["datavalue"] = table#18 {
                ["type"] = "wikibase-entityid",
                ["value"] = table#19 {
                  ["entity-type"] = "item",
                  ["id"] = "Q25369108",
                  ["numeric-id"] = 25369108,
                },
              },
              ["property"] = "P248",
              ["snaktype"] = "value",
            },
          },
        },
        ["snaks-order"] = table#20 {
          "P248",
        },
      },
    },
    ["type"] = "statement",
  },
}
- So knowing that should bypass any worries about JSON or RDF. I'm told that the database is held as RDF, but the format you see is XSD 1.0 (ISO 8601:1988): i.e. 2 September 31 BCE is returned as "-0031-09-02T00:00:00Z", with precision=11.
- When you enter a century manually in the Wikidata interface, you have to set precision to "century" and enter a year. Any year from 1901 to 2000 is displayed as "20. Century" in the viewing interface on Wikidata. If I enter 1925 for the year, it stores: ["precision"] = 7, ["time"] = "+1925-00-00T00:00:00Z". If I enter 2000 for the year, it stores: ["precision"] = 7, ["time"] = "+2000-00-00T00:00:00Z". So the documentation is its usual load of crap, and you really need to work these things out by checking for yourself directly.
- I wrote the code the way I did, because I wanted to be able at some point to support the display of any date as within the appropriate century, for example "Richard Burton was born in the 20th Century". I can't do that if I obey an instruction to "disregard all the digits lower than the hundreds digit of the year in the timestamp", because both 1925 and 2000 are in the 20th Century. The documentation on MediaWiki is very poor and is incapable of being improved, so you'll have to sort things out for yourself. --RexxS (talk) 13:53, 18 July 2018 (UTC)
- I think we should keep in mind that we're not dealing with the English language, we're dealing with several APIs and one interactive user interface. I would most heavily weight the API that produces the information in your example with the "examine" parameter, since that's the one we're using, but I don't know which API that is. It might be mediawikiwiki:Wikibase/Indexing/RDF Dump Format and mediawikiwiki:Wikibase/DataModel/JSON. Both of those have reasonable documentation, so I would tend to weight them heavily. If the interactive user interface has any documentation, it is well-hidden from users, and has had many bugs, some fixed and some not, and some of which you have just pointed out. How it works has to be determined by experimenting with it. So I weight that one lightly. Another consideration is that people, inside and outside Wikimedia, may have read and believed the documentation and written code to obey it. Certainly the user interface is only suitable for reading or writing small amounts of information, and most of the information is read or written through other interfaces. But it is hard to know whether the developers of the other interfaces read the documentation, experimented with the user interface, or it just seemed so blindingly obvious to them what to do that they followed whichever convention seemed natural to them without further thought.
- The JSON API gives a clear example for decade that the precision is an order to ignore certain digits. That example should be applied to 100 year precision too, which necessitates, for example, 200 falling into the 200-299 uncertainty range, because 0200 with precision 7 should be interpreted as 02??. The RDF API is less explicit, but the documents it references, ISO 8601, express precision by truncating digits, so lead to the same result.
- There is a wikidata:Wikidata:Bot requests#Normalize dates with low precision bot request to go through the database every day and convert all ambiguous dates (AD years ending in 00 for centuries or 000 for millennia) to a date near the middle of the period the bot thinks the date should resolve to (which convention to follow is still under discussion). For example, if the 2000-2099 convention is followed, the bot would convert 2000 precision 7 to 2050 precision 7. If that goes through, the problem is taken off your shoulders.
- There is an issue with how dates are written in sources. If an academic source says "Jones was born between 1925 and 1933", 1950 precision 7 will be perfect. If the source says "Smith was born in the 1700's" and the context makes it clear it's 1700 - 1799, not 1700-1709, and the JSON document is right, 1750 precision 7 is perfect. If the source says "Brown was born in the 16th century in Rome" and JSON is right, then no simple date is absolutely correct. Academic sources usually follow the XX01 - XX00 convention, so Brown could have been born in 1700 but 1650 precision 7 wouldn't include 1700. Not only that, the beginning of the period, in the academic source, is Julian while the end of the period is Gregorian.
- But if the source's author meant 1601 to 1700 and a new source showed it was really 1600, I wouldn't expect to see the author in sackcloth and ashes over it; the author probably meant it as a fuzzy number anyway. Jc3s5h (talk) 14:57, 18 July 2018 (UTC)
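The two conventions discussed above can be compared with a minimal sketch. It assumes AD years only, and the function names are hypothetical illustrations, not part of any Wikibase API. The first function treats precision as an instruction to ignore the low digits (the JSON reading, so 0200 with precision 7 becomes 02??); the second follows the XX01 - XX00 convention used by many academic sources.

```python
def truncation_range(year: int, digits_ignored: int) -> tuple[int, int]:
    """Range implied by ignoring the lowest `digits_ignored` digits of an AD year."""
    span = 10 ** digits_ignored
    start = (year // span) * span
    return start, start + span - 1


def ordinal_century_range(century: int) -> tuple[int, int]:
    """Range of the Nth century AD under the XX01 - XX00 convention."""
    return (century - 1) * 100 + 1, century * 100


print(truncation_range(200, 2))    # (200, 299)
print(truncation_range(2000, 2))   # (2000, 2099)
print(truncation_range(1925, 2))   # (1900, 1999)
print(ordinal_century_range(20))   # (1901, 2000)
```

Under truncation, 1925 and 2000 fall into different ranges, whereas the ordinal convention puts both in the 20th century, which is the disagreement described in this thread.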
Date formatting
I'm thinking about porting this module to Cantonese Wikipedia but am feeling unsure because I'm not sure how to do date format localisation. Is it possible to specify a set of default date formats in the i18n submodule, like one can do on Module:Wikidata? Wikidata is currently atrocious at handling non-dmy date formatting so we must get date format localisation to work for this to be useful in East Asian languages at all. (Please ping in reply) Deryck C. 15:20, 20 August 2018 (UTC)
- @Deryck: I'm actively maintaining this module on both English Wikipedia and Commons, so I'd be interested in writing a Cantonese localisation. Unfortunately I don't speak Cantonese and have no idea how dates are formatted in Cantonese. If you could give me examples of how dates like 1 April 2018, 20 August 30 BC, and 17th century are formatted, I can try to implement them in one of the sandboxes. We can refine that until it does what you want. --RexxS (talk) 15:38, 20 August 2018 (UTC)
- @RexxS::
- 2018年4月1號
- 公元前30年8月20號
- 17世紀
- See yue:Module:Wikidata/i18n for the existing format strings on Module:Wikidata. Deryck C. 15:51, 20 August 2018 (UTC)
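The three example formats above could be produced along the following lines. This is only a sketch in Python rather than the module's Lua: the function names are hypothetical, and the year-only and year-month forms are extrapolated from the examples rather than confirmed.

```python
def format_yue_date(year, month=None, day=None, bc=False):
    """Build a Cantonese-style date string: [公元前]Y年[M月[D號]]."""
    text = ("公元前" if bc else "") + f"{year}年"
    if month is not None:
        text += f"{month}月"
        if day is not None:
            text += f"{day}號"
    return text


def format_yue_century(century):
    """Build a Cantonese-style century string, e.g. 17世紀."""
    return f"{century}世紀"


print(format_yue_date(2018, 4, 1))          # 2018年4月1號
print(format_yue_date(30, 8, 20, bc=True))  # 公元前30年8月20號
print(format_yue_century(17))               # 17世紀
```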
- @Deryck: see WP:INDENTGAP. Back to the dates, I've made an attempt in Module:WikidataIB/sandbox2 to implement Cantonese date formatting (as I'm guessing at it!). The module has a public function that exposes the private format_Date function for testing. So:
{{#invoke:WikidataIB/sandbox2 |formatDate | 1 April 2018}}
→ Script error: No such module "WikidataIB/sandbox2".
{{#invoke:WikidataIB/sandbox2 |formatDate | 1 April 2018 |df=y}}
→ Script error: No such module "WikidataIB/sandbox2".
{{#invoke:WikidataIB/sandbox2 |formatDate | 20 August 30 BC}}
→ Script error: No such module "WikidataIB/sandbox2".
{{#invoke:WikidataIB/sandbox2 |formatDate | 20 August 30 BC |df=y}}
→ Script error: No such module "WikidataIB/sandbox2".
{{#invoke:WikidataIB/sandbox2 |formatDate | January 2018}}
→ Script error: No such module "WikidataIB/sandbox2".
{{#invoke:WikidataIB/sandbox2 |formatDate | 753 BCE}}
→ Script error: No such module "WikidataIB/sandbox2".
- The date of birth (P569) for Richard Burton (Q151973)
{{#invoke:WikidataIB/sandbox2 |getValue |P569 |qid=Q151973 |fwd=ALL |osd=no}}
→ Script error: No such module "WikidataIB/sandbox2".
- The date of death (P570) for Julius Caesar (Q1048)
{{#invoke:WikidataIB/sandbox2 |getValue |P570 |qid=Q1048 |fwd=ALL |osd=no}}
→ Script error: No such module "WikidataIB/sandbox2".
- I'm still looking for some examples of century handling as dates in Wikidata, but I have found an exception where it's not a date: the category combines topics (P971) for Category:10th-century establishments in Egypt (Q7213033)
{{#invoke:WikidataIB/sandbox2 |getValue |P971 |qid=Q7213033 |fwd=ALL |osd=no}}
→ Script error: No such module "WikidataIB/sandbox2".
- You might want to try out more tests and let me know how it goes. Cheers --RexxS (talk) 18:19, 20 August 2018 (UTC)
- I might copy the entire sandbox version as you've written it into the Cantonese Wikipedia later and test it there. I saw there was another function which builds in exceptions where site names don't match up with ISO-639 with suffixes stripped; Cantonese is one of those (site name zh_yuewiki / zh-yue.wikipedia.org ; ISO-639 yue). So we might have to tinker with that. Deryck C. 14:41, 21 August 2018 (UTC)
- Sure, Deryck. The siteID (or projID) function is there for when you need the root code name - just use an elseif to supply further exceptions. The code can be used inside a template as needed. Let me know if you need any help. --RexxS (talk) 17:07, 21 August 2018 (UTC)
- I've started testing over on the Cantonese Wikipedia. Non-date values seem to work (onlysourced and rank are useful already) but the Lua script gives up when it's run on the Cantonese Wikipedia. Let's carry on this discussion at yue:Module talk:WikidataIB. Deryck C. 14:34, 23 August 2018 (UTC)
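The exception handling mentioned above (a site name whose stripped form is not the language code) might look roughly like this. It is a sketch under assumptions: the function name and the exceptions table are hypothetical, and only the zh_yuewiki case comes from this discussion.

```python
# Site IDs whose language code cannot be derived by stripping the suffix.
# Only the Cantonese entry is taken from the discussion; add others as needed.
SITE_EXCEPTIONS = {"zh_yuewiki": "yue"}


def root_language_code(site_id: str) -> str:
    """Return the language code for a site ID such as 'enwiki'."""
    if site_id in SITE_EXCEPTIONS:
        return SITE_EXCEPTIONS[site_id]
    if site_id.endswith("wiki"):
        return site_id[: -len("wiki")]
    return site_id


print(root_language_code("enwiki"))      # en
print(root_language_code("zh_yuewiki"))  # yue
```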
Prefix/postfix on wikilink items?
@RexxS: Documentation suggests that prefix/postfix can be used on wikibase items, but I am not seeing that behavior, and linkedItem() seems to only cover linkprefix/linkpostfix. Am I missing something? In my case, for a particular infobox field, I want to wrap the resulting items in italics. -- ferret (talk) 19:30, 11 September 2018 (UTC)
- @Ferret: the original use was only with Wikidata items of type string (urls, etc.), so most of the documentation specifies using the link and normal prefixes and suffixes only with string types. Eventually, somebody (probably Mike) wanted the link pre/postfix functionality on Wikibase items, so I implemented that (but not the normal pre/postfix functionality). However, I was sloppy in updating the documentation and gave the wrong impression. I apologise for that. I've now upgraded the code to implement
|prefix= and |postfix= on all wikibase items as well as strings:
{{#invoke:WikidataIB |getValue |P106 |fwd=ALL |osd=no |qid=Q42 |prefix="<i>" |postfix="</i>"}}
→ playwright, screenwriter, novelist, children's writer, science fiction writer, comedian, writer, musician
{{#invoke:WikidataIB |getValue |P106 |fwd=ALL |osd=no |qid=Q42 |prefix="<span style='font-variant: small-caps;'>" |postfix="</span>"}}
→ playwright, screenwriter, novelist, children's writer, science fiction writer, comedian, writer, musician
- Let me know if you find any problems. (Don't forget that the code strips double quotes from the parameters intentionally). Cheers --RexxS (talk) 22:10, 13 September 2018 (UTC)
- @RexxS: That got it. Thanks! -- ferret (talk) 22:17, 13 September 2018 (UTC)
Module Complex date
This module has now been updated to use the functionality of Module:Complex date.
There is a nomination for deletion for the Module:Formatnum, which is required by Module:Complex date and hence by this module. You can comment on the nomination at Wikipedia:Templates for discussion/Log/2018 September 21 #Module:Formatnum. --RexxS (talk) 20:27, 21 September 2018 (UTC)
- Also see Wikipedia:Templates_for_discussion/Log/2018_September_25#Module:Linguistic. Thanks. Mike Peel (talk) 18:45, 25 September 2018 (UTC)
Wikilinks for redirects
{{Infobox person/Wikidata}} invokes getPreferredValue in this module, which indirectly calls linkedItem. If the Wikidata item has no associated enwiki page then linkedItem looks for a page whose title matches the Wikidata label. If that page is a redirect then it is returned as a wikilink. For example, Marie L. Shedlock displays "occupation" as the redirect Story-teller, whose title matches the label of Q16023925. Unfortunately, the target of that redirect is not an article but a disambiguation page, Storyteller. Had Story-teller been an actual page rather than a redirect, then linkedItem would correctly display the plain text "Story-teller", unlinked. Is it really necessary to link redirects, even those which lead to irrelevant articles or other pages? I understand the urge to link to a Wikipedia article with a matching title in the hope that it might be relevant, but the authors of the module have resisted that temptation for actual pages, and it would be helpful if we could adopt the same policy for redirects.
I realise that Wikipedia redirects cannot be linked to Wikidata items, and therefore there are cases where the best link can't be added and has to be guessed. For that reason, I've refrained from boldly changing the module. However, I think that the harm may outweigh the good and we should reconsider giving special treatment to redirects. Certes (talk) 19:31, 23 September 2018 (UTC)
- @Certes: Many occupations (for example) have no English sitelink on Wikidata. If linkedItem gets a value of archaeologist (Q3621491) for the occupation (P106) of Howard Carter (Q133682), what is it to do with it? There's no English sitelink for that entity, even though the English Wikipedia has a perfectly good redirect at Archeologist that redirects to Archeology. That is a common case for thousands of items, so I decided to create the link from the label if it's a redirect, which allows this:
{{wdib |qid=Q133682 |P106 |fwd=ALL |osd=no}}
→ anthropologist, archaeologist, egyptologist, necropolis scholar
- You are contending that we should simply display
- Anthropologist, archaeologist, egyptologist
- But who benefits from that? In the tiny number of cases where a redirect actually leads to a dab page, it can be overridden at the article level by
|occupation=[[Storytelling|Storyteller]]
in the infobox, just as we would do whenever an automated result is sub-optimal. I simply don't agree that reducing the functionality in a vast number of cases is a good solution to what is very much an edge case. You could also solve this particular issue at a stroke by changing the English label at storyteller (Q16023925) from "story-teller" (which matches a redirect) to "storyteller" (which only matches a dab page, so won't be linked).
- You are wrong in your assumption "Had Story-teller been an actual page rather than a redirect, then linkedItem would correctly display the plain text "Story-teller", unlinked." It would actually display the English sitelink, linked, because if it were an actual page (not a dab or redirect), then Wikidata would have the sitelink. --RexxS (talk) 20:53, 23 September 2018 (UTC)
- @RexxS: Editing the Wikidata as you suggest seems to have worked. Thank you. As for your last point: I agree that if there were an article then Wikidata should have the sitelink and it would correctly be displayed as a link. I was referring to cases such as Storyteller where there is no article to link, just a disambiguation page. I hope that Wikidata will one day allow sitelinks to redirects such as Archaeologist, which could make it fit for this purpose. Certes (talk) 21:42, 23 September 2018 (UTC)
- User:DPL bot correctly reported the link to storyteller in Marie L. Shedlock as a WP:INTDABLINK error.
- It took me a good 5 minutes to work out what was causing the error, and to post asking for help; despite having seen such Wikidata-created errors before. When I first came across things like this, it took me 30 or 40 minutes before I gave up in despair.
- "In the tiny number of cases where a redirect actually leads to a dab page, it can be overridden at the article level by
|occupation=[[Storytelling|Storyteller]]
in the infobox, just as we would do whenever an automated result is sub-optimal." Are you serious? I know tricks to solve the problems created by at least 20 Wikipedia templates, and you're suggesting that I learn yet another wholly unnecessary one?
- It is all about the readers. Bad links like that one are bad for the project. I estimate that this error has generated at least 30 minutes of work for 3 editors who could have been doing more useful things. Narky Blert (talk) 22:06, 23 September 2018 (UTC)
"Are you serious?"
Yes, of course I'm serious. If you find a problem, you fix it or ask for help to fix it. The fix in this case was trivial and I'm astonished that you can't grasp that. Nothing on Wikipedia is necessary and nobody is forcing you to edit. In any infobox, if you don't like what is displayed, you change it. That's not rocket science, and all of these templates have documentation to help anyone new to using them. If you want the box to display [[Storytelling|Storyteller]] for the occupation, you type |occupation=[[Storytelling|Storyteller]] into the infobox. That's the same solution no matter what infobox you're using.
- These infoboxes have been in use since 2013 and this is the first time anybody has found a redirect to a dab page being displayed. You might be better off spending your time asking why we have redirects to dab pages? Every search engine I know treats "Story-teller" the same as "Storyteller" and we clearly won't link it in running prose, because we don't link to dab pages. So what is the purpose of the Story-teller redirect, other than to waste productive editors' time? Maybe you should be directing your anger at Neelix who pointed it to the dabpage as "more appropriate"? --RexxS (talk) 22:26, 23 September 2018 (UTC)
- Story-teller is there to help Wikipedia readers typing into the search box to find the article they're looking for. Neelix' edit was unquestionably correct.
- This may or may not have been the first time I have encountered this problem with infobox person specifically. It is most certainly not the first time I have encountered this type of infobox/Wikidata problem. I can recall at least one case where, after I had asked for help, another editor solved one by an inventive kludge - ugly, but effective.
- If an editor in the 100,000 Club cannot easily solve a problem like this, I submit that it is wrong to dismiss it as trivial. Narky Blert (talk) 20:48, 24 September 2018 (UTC)
"Story-teller is there to help Wikipedia readers typing into the search box to find the article they're looking for."
Complete and utter nonsense. Typing "Story-teller" into any search box will find Story-teller, Story Teller (magazine), The Story-Teller, and Storyteller without any help from the redirect at all. Creating a redirect to dab page helps nobody and is a thoughtless, worthless action.
- If you really have previously encountered a problem caused by a redirect to a dab page, then give the diff, otherwise it's simply not believable – they are rarer than rocking-horse shit.
- I could train an orangutan to solve these sort of problems. If an editor in the 100,000 Club can't do it easily, then I submit they need to take the cotton wool out of their ears and put it in their mouth. --RexxS (talk) 22:16, 24 September 2018 (UTC)
- @RexxS: I suggest that you read and inwardly digest WP:AGF and WP:PERSONAL. Narky Blert (talk) 20:14, 25 September 2018 (UTC)
- @Narky Blert: Thanks for the suggestions. I'll take them under advisement. I've always found the first one to be pretty useful, and perhaps you might also find it useful, if you were to read it. The second one, unfortunately, is all too often just used as an excuse for divas to get upset over, but I'll check it out to see if it's improved lately. --RexxS (talk) 22:28, 25 September 2018 (UTC)
- (edit conflict) @Certes: Apologies, I misunderstood what you meant by "actual page", but I understand you now. You're quite right: if there is no sitelink and the label represents a dab page, then the plain unlinked label is displayed.
- Annoyingly, there is already a majority of opinion that sitelinks to redirects should be allowed at d:Wikidata:Requests for comment/Allow the creation of links to redirects in Wikidata, but the folks that run the place aren't interested in delivering what folks want, just what they think they should want. --RexxS (talk) 22:09, 23 September 2018 (UTC)
- Great. So if a Wikipedia page is moved, that will create a WP:2REDIR error; or perhaps, after a WP:RM, a link to a redirect to the wrong target. Narky Blert (talk) 20:48, 24 September 2018 (UTC)
- Have you any idea about how redirects work? If you move a page that is the target of a redirect, it is the first redirect that no longer works. It has to be fixed whether or not a Wikidata entity points to it. Your argument is completely and totally irrelevant to how sitelinks from Wikidata work and you really need to think things through before spouting this sort of nonsense. --RexxS (talk) 22:16, 24 September 2018 (UTC)
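For readers following the thread, the linking behaviour described in this section can be summarised as a small decision function. This is a sketch of the behaviour as discussed here, not the module's actual Lua code; the function name, its arguments and the page-kind strings are hypothetical.

```python
def linked_item(label, sitelink, local_page_kind):
    """Decide how a Wikidata value is displayed, as described in this thread.

    sitelink: title of the local article linked from Wikidata, or None.
    local_page_kind: what a local page titled `label` is, one of
        "redirect", "disambiguation", "article" or "missing" (assumed names).
    """
    if sitelink:
        # A sitelink always wins and is displayed linked.
        return f"[[{sitelink}]]"
    if local_page_kind == "redirect":
        # No sitelink, but the label matches a local redirect: link it.
        return f"[[{label}]]"
    # Dab page, unrelated page or no page at all: plain, unlinked label.
    return label


print(linked_item("archaeologist", None, "redirect"))      # [[archaeologist]]
print(linked_item("storyteller", None, "disambiguation"))  # storyteller
```

The Story-teller case arises in the second branch: the label matched a redirect, and the check does not look at where that redirect leads.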
Move to mediawiki
Since the module seems language independent now after recent (amazing) updates by RexxS, would it be a good idea to move the source code to mediawiki and automate updates across other wikis? See Module:TNT for example. Capankajsmilyo(Talk | Infobox assistance) 17:44, 1 November 2018 (UTC)
- Thank you for the kind words, Capankajsmilyo. However, I'm not yet convinced that the module is fully language-independent, so I'd want to give some time for folks to feedback any issues before making that claim. Naturally I'd be pleased to hear of any failings in the code to internationalise fully, but I suspect most of that feedback will come from Commons. Anyway, there's no reason not to put a copy of the code into a central location if you think it is helpful to others. Cheers --RexxS (talk) 18:46, 1 November 2018 (UTC)
Retrieve value by source
Is it possible to retrieve a value according to a specific source? If we have several statements using the same property and I want to extract the value of that property given by a specific source, is there a function for that? The French WP developed a function using P248 as a possible criterion for filtering different value candidates according to their source. Snipre (talk) 10:14, 30 October 2018 (UTC)
- @Snipre: If you look at Module:WikidataIB#Function_getValueByQual you can see how we can fetch the value of a property that has a particular value for the qualifier, as long as the qualifier value is a wikibase-entity, so the logic framework already exists. The problem for references is that there are so many different ways of stating the source: stated in (P248), inferred from (P3452), imported from Wikimedia project (P143), described by source (P1343), reference URL (P854), title (P1476) and so on (a SPARQL query for Wikidata property to indicate a source (Q18608359) shows 32 ways at present). I'm loath to write code that checks each property value for over two dozen possible means of sourcing a reference, particularly when the datatype of the source might be item or url or monolingual text, each of which requires different code to retrieve the value to match.
- Nevertheless, if you have a specific case in mind, or could share examples of what you want to do more concretely, it may be that the job turns out to be much smaller and I could knock up a function for you.
- BTW, in WikidataIB, any reference containing the word "Wikipedia" is discounted as if unsourced, and that would be easy to build on if you're interested in filtering out values. You might also find
{{#invoke:Sandbox/RexxS/WdRefs|seeRefs}}
(which can be pasted into any article section – I use References section – and previewed to show the properties available on Wikidata and their references) to be useful when examining articles for their reference sources. --RexxS (talk) 22:49, 30 October 2018 (UTC)
- @RexxS: There are only two different ways to describe sources in a statement; these ways are described in Help:Sources. The other ways are really specific cases, cases where people don't want to follow WD rules or misunderstand what a source is. Using the property P248 as the main property to filter references will cover the majority of the cases. Often using 80% of the data is enough to model the main characteristics of a situation; trying to use the rest is a waste of time. Snipre (talk) 23:37, 30 October 2018 (UTC)
- @Snipre: If you look at d:Property:P248 #P2559 you'll see "Using this property indicates that the information is contained in the entity (a book, an article, etc) that's represented by the linked information source. If you want to state that the information comes from the information contained in another Wikidata item, use P3452.", which makes it pretty clear that WD "rules" include inferred from (P3452) as an obvious instance of a Wikidata property to indicate a source (Q18608359) that d:Help:Sources didn't include when it states, "Typically the property used for sources is one of two options". So my point stands.
- Nevertheless, if you just want a function to retrieve the value of a property that has a reference "stated in" a particular item, then that's feasible. I've had a look at what will be needed and I'll do the additional code tomorrow. --RexxS (talk) 01:25, 31 October 2018 (UTC)
- @RexxS: Here we have a problem of source definition: a Wikimedia project or the inferred data are not a source according to any guidelines in the Wikimedia world dealing with secondary and reference sources. You are right that a contributor could filter any dataset with any possible criteria. But the most common case is the following: you have several values for one statement and you want to extract the value according to a specific source you trust or was validated by being authoritative. Or you want to extract values from an unique source to get a homogeneous dataset. I think we should focus on that first and see what happens. So P248 is the correct starting point.
- My request is following a question from a contributor asking the method to extract value when several values are available (see here).
- Thank you in advance if you can add the getValueByRef function using P248.
- @Datawiki30: Please get in touch with RexxS to know which will be the appropriate form of the request for your data extraction. Snipre (talk) 13:05, 31 October 2018 (UTC)
- @Snipre and Datawiki30: Having written the code, I'm not so sure that stated in (P248) is as common as we thought. I'm having a lot of difficulty finding examples to test with. Anyway here's a first attempt. For England (Q21), finding the value of ISNI (P213) that is stated in (P248) BnF authorities (Q20666306), we get:
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q21 |P213 |match=Q20666306 |fwd=ALL |osd=n}}
→
- Let me know if you have more examples that I can try out, or if you run into problems. --RexxS (talk) 17:45, 31 October 2018 (UTC)
- [Further:] Having now been pinged to the discussion at d:Wikidata:Project chat #Some fundamental questions about modeling properties for statistical data.
- For Wikidata Sandbox (Q4115189), finding the value of nominal GDP (P2131) that is stated in (P248) World Bank Open Data (Q21540096):
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |fwd=ALL |osd=n |rank=best}}
→
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |fwd=ALL |osd=n |rank=preferred normal}}
→
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |fwd=ALL |osd=n |rank=preferred normal deprecated}}
→
- For Wikidata Sandbox (Q4115189), finding the value of nominal GDP (P2131) that is stated in (P248) International Monetary Fund (Q7804):
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |fwd=ALL |osd=n |rank=b}}
→
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |fwd=ALL |osd=n |rank=p n}}
→
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |fwd=ALL |osd=n |rank=p n d}}
→
- It looks as if it's working, but probably needs more testing. --RexxS (talk) 18:10, 31 October 2018 (UTC)
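The filtering idea behind getValueByRefSource can be illustrated with a sketch that works on statements shaped like the Wikibase JSON data model. This is not the module's Lua implementation: the function name is hypothetical, the sample amounts are invented for illustration, and the "best" rank shortcut is not handled.

```python
def values_by_ref_source(statements, source_qid, ranks=("preferred", "normal")):
    """Return main values of statements having a reference 'stated in' source_qid."""
    matches = []
    for statement in statements:
        if statement.get("rank") not in ranks:
            continue
        for reference in statement.get("references", []):
            stated_in = reference.get("snaks", {}).get("P248", [])
            ids = {s["datavalue"]["value"]["id"] for s in stated_in if "datavalue" in s}
            if source_qid in ids:
                matches.append(statement["mainsnak"]["datavalue"]["value"])
                break  # one matching reference is enough for this statement
    return matches


def _statement(amount, rank, source_qid):
    """Build a minimal quantity statement with one 'stated in' reference."""
    ref_snak = {"datavalue": {"value": {"id": source_qid}}}
    return {
        "mainsnak": {"datavalue": {"value": {"amount": amount}}},
        "rank": rank,
        "references": [{"snaks": {"P248": [ref_snak]}}],
    }


# Invented sample data: two sources, one deprecated statement.
SAMPLE = [
    _statement("+100", "preferred", "Q21540096"),
    _statement("+200", "normal", "Q7804"),
    _statement("+300", "deprecated", "Q21540096"),
]

print(values_by_ref_source(SAMPLE, "Q21540096"))  # [{'amount': '+100'}]
```

Passing ranks=("preferred", "normal", "deprecated") corresponds to the third invocation in each group above and would also return the deprecated value.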
- @RexxS: Hi and thank you very much for your engagement. That's brilliant that we can access the value of the property using the source "stated in". It seems that the "stated in" property for the source "World Bank database" is in this case not sufficient, because we have values in US Dollar and in Euro for the same "stated in" source. The only systematic difference is here the URL: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD?locations=FR stands for the value in US Dollar and https://data.worldbank.org/indicator/NY.GDP.MKTP.CN?locations=FR for the value in Euro. What would be the best way if we want to retrieve each of these values separately? Do we need in this case more information on each edit (for example some id as qualifier or something like this)? Besides:
- - Is it possible to retrieve the value in millions or billions of the unit? I think that the decimal value would be too long for the infobox in Wikipedia.
- - It seems that the second value after the decimal point has been cut for some of the retrieved values from the sandbox. Cheers! Datawiki30 (talk) 23:28, 31 October 2018 (UTC)
- @Datawiki30: I try to keep the functions in WikidataIB as general as possible, so that they can be used in varied scenarios. If you want to further refine the routine to distinguish between dollars and euros, then I think you're moving away from a general-purpose function to a bespoke one, and you'll need to find someone who will devote the time to doing that for one single application.
- Unfortunately, Wikidata doesn't have a "currency" datatype. Consequently, values like "3 dollars 50 cents" are stored as "3.5 dollars", so that is what the functions return. As I don't want to create a huge lookup table of every known currency unit simply to turn that into "3.50 dollars", I think it is best for the specific application to handle nuanced displays like that using standard string formatting functions.
- The question of displaying in millions or billions of the unit is more complex. How should the routine decide for the general case whether to show "£1,000,000" or "£1 million"? If the quantity is "$1,234,567", then displaying "$1.234567 million" actually takes up more room. Of course, I could add another parameter to switch the displays between the two formats, but as there are already over 30 documented parameters (and some undocumented ones), I really need a convincing use case for adding yet another one, or a robust algorithm to make the decision automatically for all uses. --RexxS (talk) 18:32, 1 November 2018 (UTC)
- @RexxS: The use cases that I have in mind are in first place the infoboxes for all the countries, but also cities and regions. For example for UK on the English Wikipedia we could use Wikidata for Population, GDP (PPP) total and per capita, GDP nominal total and per capita, Gini and HDI. For GDP PPP and nominal we would have different currencies in Wikidata with different sources. For Birmingham we have also GDP estimate in USD.
- I don't favour complex solutions. What if we add a standard title property for each statement in Wikidata? This should conform with the description here: https://www.wikidata.org/wiki/Help:Sources#Databases. I have now added the proper title property to the sandbox statements in Wikidata. Can you retrieve each nominal GDP value with preferred rank using a combination of "stated in" and "title" as a filter? This combination should usually be atomic enough to identify different values.
- About cutting the values: all the values here should have two digits after the decimal point (check the Wikidata sandbox). For nominal GDP such precision is not at all important, but it may be for other properties (does the function round the value?):
- *
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |fwd=ALL |osd=n |rank=preferred normal}}
→
- I have an example about the values in millions etc.: maybe we could use two parameters: the power of ten (10^x) by which the value has to be divided, and the decimal place after which the value has to be cut. The function retrieves only the value, not the unit (for example USD). In the infobox we then use the function to retrieve the value and add, for example, "in billions USD". This is done once, and the value updates automatically when we have new values in Wikidata. Example: function(123,456,789,012.34567 US Dollar, 10^9, 1) = 123.4. What do you think about this? Cheers! --Datawiki30 (talk) 21:54, 1 November 2018 (UTC)
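The two-parameter proposal above could be sketched as follows. The function name is hypothetical and this is Python rather than the module's Lua; it cuts (truncates) rather than rounds, which is what the worked example 123.4 implies.

```python
from decimal import Decimal, ROUND_DOWN


def scale_and_cut(amount: str, power_of_ten: int, places: int) -> Decimal:
    """Divide amount by 10**power_of_ten and cut after `places` decimal places."""
    scaled = Decimal(amount) / (Decimal(10) ** power_of_ten)
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.1 for places=1
    return scaled.quantize(quantum, rounding=ROUND_DOWN)


print(scale_and_cut("123456789012.34567", 9, 1))  # 123.4
```

The unit label ("in billions USD") would still be written once in the infobox, as suggested in the comment above.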
- @Datawiki30: Just one issue for now: Looking at Wikidata Sandbox (Q4115189). I can now see the truncation that you spotted. I've just added three values for Sandbox-Quantity (P1106): USD 465,134,297,438.91, USD 12,465,134,297,438.91 and USD 112,465,134,297,438.91 - it seems that the function does indeed round values after 14 digits, which I can only assume happens in the mw.language formatNum() method:
{{wdib |ps=1 |qid=Q4115189 |P1106 |uabbr=y}}
→
- I can investigate further, but I doubt that there's much I can sensibly do about that. Personally, I wouldn't worry about losing up to 10 cents in 2 trillion dollars. --RexxS (talk) 00:13, 2 November 2018 (UTC)
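One plausible explanation for the 14-digit limit, offered as an assumption rather than a confirmed diagnosis of formatNum, is that the number passes through Lua 5.1's default number-to-string conversion, which uses the C format "%.14g". Python's % operator follows the same C rules, so the effect on the three sandbox values can be sketched like this (the function name is hypothetical):

```python
def fourteen_digit_string(value: float) -> str:
    """Format a float with 14 significant digits, as C's "%.14g" does."""
    return "%.14g" % value


for amount in (465134297438.91, 12465134297438.91, 112465134297438.91):
    print(fourteen_digit_string(amount))
# 465134297438.91
# 12465134297439
# 1.1246513429744e+14
```

The first value has exactly 14 significant digits and survives intact; the longer ones lose their cents, which matches the truncation seen in the sandbox.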
- @RexxS: There is no currency datatype, but the mentioned data are stored in a numerical datatype. The numerical datatype is defined with a value and a unit, so you can use the unit value to filter and extract the dollar value and not the euro value. But you are right: performing several filterings is not possible with general functions, and an infobox in Lua has to be coded to perform complex extractions with special data formatting. Snipre (talk) 20:20, 2 November 2018 (UTC)
- @Datawiki30: Same as above: when you want to perform complex data extraction and special data formatting, better switch to an infobox coded in Lua, which offers more possibilities than simple Lua functions. Snipre (talk) 20:20, 2 November 2018 (UTC)
- @Snipre: Yes, I mentioned above that there's no currency datatype. I also pointed out that the value was stored as a value plus a unit, so I'm aware that I could filter it so that it returned the dollar value and not the euro one. But what should it return if there is no value in dollars? What if the next application wants the amount in pounds? or euros? or rupees? or kilogrammes, or any other unit that datatype "quantity" can be in? If the request is now for a bespoke function that just fetches dollar values for nominal GDP (P2131) and formats them in a specific way, then it's perfectly possible to write that function, but that wasn't what the original request in this thread asked for. --RexxS (talk) 20:49, 2 November 2018 (UTC)
- @Snipre and Datawiki30: Having written the code, I'm not so sure that stated in (P248) is as common as we thought. I'm having a lot of difficulty finding examples to test with. Anyway, here's a first attempt. For England (Q21), finding the value of ISNI (P213) that is stated in (P248) BnF authorities (Q20666306), we get:
- @RexxS: There are only two different ways to describe sources in a statement; these ways are described in Help:Sources. The other ways are really specific cases, cases where people don't want to follow WD rules or misunderstand what a source is. Using the property P248 as the main property to filter references will cover the majority of the cases. Often using 80% of the data is enough to model the main characteristics of a situation; trying to use the rest is just a waste of time. Snipre (talk) 23:37, 30 October 2018 (UTC)
@Datawiki30: This is what I have so far - World Bank Open Data (Q21540096), International Monetary Fund (Q7804), United States dollar (Q4917), Euro (Q4916):
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |unit=Q4917 |fwd=ALL |rank=p n}} →
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |unit=Q4917 |fwd=ALL |rank=p n}} →
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |unit=Q4916 |fwd=ALL |rank=p n}} →
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |unit=Q4916 |fwd=ALL |rank=p n}} →
Is that closer to what's wanted? --RexxS (talk) 23:01, 2 November 2018 (UTC)
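For readers unfamiliar with how a reference match such as |match=Q21540096 could work, here is a minimal sketch. It is not the sandbox implementation of getValueByRefSource: isStatedIn is a hypothetical helper and the sample statement is made up. It assumes only the usual Wikibase layout, in which each reference holds its snaks keyed by property ID.

```lua
-- Hypothetical helper: true if any reference of the statement has a
-- "stated in" (P248) snak pointing at sourceQid.
local function isStatedIn(statement, sourceQid)
  for _, ref in ipairs(statement.references or {}) do
    for _, snak in ipairs((ref.snaks or {}).P248 or {}) do
      local dv = snak.datavalue
      if dv and dv.value and dv.value.id == sourceQid then
        return true
      end
    end
  end
  return false
end

-- Made-up statement with one reference stated in Q21540096.
local statement = {
  references = {
    { snaks = { P248 = {
        { snaktype = "value",
          datavalue = { type = "wikibase-entityid", value = { id = "Q21540096" } } },
    } } },
  },
}

print(isStatedIn(statement, "Q21540096"), isStatedIn(statement, "Q7804"))
```

A statement with no references at all simply fails the match, so it would be skipped, not reported as an error.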
I'll try now with best rank:
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |unit=Q4917 |fwd=ALL |osd=n |rank=best}} →
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |unit=Q4917 |fwd=ALL |osd=n |rank=best}} →
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q21540096 |unit=Q4916 |fwd=ALL |osd=n |rank=best}} →
{{#invoke:WikidataIB/sandbox |getValueByRefSource |qid=Q4115189 |P2131 |match=Q7804 |unit=Q4916 |fwd=ALL |osd=n |rank=best}} →
--Datawiki30 (talk) 23:17, 2 November 2018 (UTC)
- @Datawiki30: one more parameter, |scale=, added. Either a numeric value 3–12 or "auto"/"a" to enable scaling to thousands, millions, billions, trillions, or automatic. For examples, see Module talk:WikidataIB/sandbox/testing #Scaling quantities. --RexxS (talk) 15:22, 3 November 2018 (UTC)
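To illustrate the idea behind |scale=, here is a rough sketch of how such scaling might be done. Everything in it is an assumption made for illustration: scaleAmount and the SCALES table are hypothetical, only the exponents 3, 6, 9 and 12 are handled, and the module's real behaviour for other values is not described in this thread.

```lua
-- Hypothetical scale table, largest first so that "auto" picks the
-- biggest unit that fits.
local SCALES = {
  { exp = 12, divisor = 1e12, name = "trillion" },
  { exp = 9,  divisor = 1e9,  name = "billion" },
  { exp = 6,  divisor = 1e6,  name = "million" },
  { exp = 3,  divisor = 1e3,  name = "thousand" },
}

-- Hypothetical helper: returns the scaled amount and the scale name,
-- or the unchanged amount and nil when no scaling applies.
local function scaleAmount(amount, scale)
  for _, s in ipairs(SCALES) do
    local wanted
    if scale == "auto" or scale == "a" then
      wanted = math.abs(amount) >= s.divisor
    else
      wanted = tonumber(scale) == s.exp
    end
    if wanted then
      return amount / s.divisor, s.name
    end
  end
  return amount, nil
end

print(scaleAmount(2500000, 6))
print(scaleAmount(3000000000, "a"))
print(scaleAmount(42, "a"))
```

Scaling before formatting also shortens the number, which may sidestep the 14-digit rounding seen earlier in this thread for very large amounts.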
Test datavalue?
@RexxS: I don't know how much check code you feel like building into the module but Appendix (anatomy) is currently in Category:Pages with script errors ("Lua error in Module:WikidataIB at line 426: attempt to index field 'datavalue' (a nil value)") because (I assume) Mike Peel just edited {{Commons category}} and appendix (Q9656) has a broken topic's main category (P910). The code in the article which triggers the error (not displayed in the rendered page) is {{Commons category|Vermiform appendix|Appendix (anatomy)}}. Should an error like this be taken as a feature which will lead to the Wikidata problem being noticed and fixed, or should the module treat a nil datavalue in the same way that it does for a non-existent P910? I don't know, but if the former, a nicer error message would be good. Perhaps a function in the module could be used to call getBestStatements and get the wanted property, returning nil if datavalue is nil? Or, throwing an error with a clearer reason if invalid? BTW, I use a script to check for global variables and you might like to think about the following: bestval checkLanguage dsep dv found out pd props v1. Johnuniq (talk) 09:08, 15 November 2018 (UTC)
- Thanks for spotting that, John. I'm often unsure what's best – to catch errors and work around them; or to let the error throw and pick up the broken Wikidata entry that caused it? In this case, the entry for topic's main category (P910) in appendix (Q9656) should never have been set to "no value", so I'm reluctant to fix the code to ignore that (I could just use if prop910 and prop.mainsnak.datavalue then). There isn't any way of sending a nice error message at that point without a considerable re-write. But I can't see any conceivable reason for a "no value" or "some value" entry for any topic's main category (P910) - and the same for Commons category (P373), so I think my preference in this case is to fix the Wikidata entry, which I've now done. I'll run a WDQS search for entries with P910="no value" later and see how big the potential problem is. For now, script errors in ns0 is showing just three entries. That's a useful script. I've been through and I think I've cleaned out all of the globals you found now. The code seems to be clean - no errors showing up in the test pages. Thanks again! --RexxS (talk) 15:57, 15 November 2018 (UTC)
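The guard discussed above can be sketched as a small accessor. This is an illustration only, not the code at line 426 of the module: getMainValue is a hypothetical name and the sample statements are made up. It relies on the fact that Wikibase snaks of type "novalue" or "somevalue" carry no datavalue field.

```lua
-- Hypothetical helper: return the main value of a statement, or nil when
-- the statement is missing or is a "no value" / "some value" entry.
local function getMainValue(statement)
  local snak = statement and statement.mainsnak
  if not snak or snak.snaktype ~= "value" or not snak.datavalue then
    return nil
  end
  return snak.datavalue.value
end

-- Made-up statements: one ordinary value, one set to "no value".
local good = { mainsnak = { snaktype = "value",
  datavalue = { type = "string", value = "Vermiform appendix" } } }
local broken = { mainsnak = { snaktype = "novalue" } }

print(getMainValue(good), getMainValue(broken), getMainValue(nil))
```

Returning nil treats a "no value" entry like a missing property, which hides the bad Wikidata entry; letting the error surface, as preferred above, makes it visible instead. Which is better depends on whether the broken data should be noticed by readers of the error category.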
updating on cywiki
Re: my new Wikidata Infobox for all geotagged articles.
I've updated Module:WikidataIB on cywiki up till 21 September, then errors creep in. Any ideas or help please? Please edit module directly (no sandboxes / tests needed). Llywelyn2000 (talk) 15:14, 15 November 2018 (UTC)
- @Llywelyn2000: In this edit to cy:Modiwl:I18n/date, you changed it completely from its original version that is used by the "Complex date" series of modules to one that creates timelines. Can you remember why? As far as I can see, the module isn't used anywhere else on cy-wiki, so I've restored the prior version. Let me know if that causes other problems and I'll try to find a different solution. I've updated the version of cy:Modiwl:I18n/complex date as the old one was causing errors. I think it's working now (and should continue to do so as you continue updating), but please let me know if you run into problems. --RexxS (talk) 18:05, 15 November 2018 (UTC)
- Hi Doug, and thanks for your help! I think I changed it as it didn't work, and opted for the one on Commons or WD itself, as I was working at that time on the species wd infobox. But I might be wrong! But a good spot, and thanks again! One quicky: the grey headers seem to be ok on the Template:Gwybodlen lle eg Guatemala (bottom left). Yet in the same infobox on the w:cy:Gwatemala article, there is but one: 'Daearyddiaeth' (Geography). Any idea why the others don't show up? Best regards... Llywelyn2000 (talk) 19:21, 15 November 2018 (UTC)
- @RexxS: Bore da Doug! The update is now affecting the w:cy:Nodyn:Gwybodlen person/Wikidata (biog template) as seen here. Llywelyn2000 (talk) 05:25, 16 November 2018 (UTC)
- Bore da, Robin! The problem was that the template w:cy:Nodyn:Gwybodlen person/Wikidata is only half-finished. Eighteen of the fields, like Diflanodd, had just a placeholder ("P0") for their Wikidata property. They need somebody to work out what the proper number should be. I've managed 8 of them (I think), but I can't translate these well enough to figure out what P-values to use for most of them, or I can't find a Wikidata property that matches what I think my translation should be. You perhaps may have more luck with the remaining 10. --RexxS (talk) 11:57, 16 November 2018 (UTC)
- Thanks! Yes, I didn't switch these on, as there was hardly anything on WD at the time. I'll take a look asap, and once again - thanks Doug! Your work on WD will bear fruit on many wikipedias for years to come! Llywelyn2000 (talk) 12:06, 16 November 2018 (UTC)
- I should have said that the update uses newer calls to the Wikibase client that are more efficient, but raise an error if an invalid property number is supplied. I've amended the code now to silently reject invalid property numbers, so there's no problem with leaving in the "P0" as placeholders in the template again. Cheers! --RexxS (talk) 12:10, 16 November 2018 (UTC)
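One way to reject placeholders such as "P0" before they reach the Wikibase client is a simple pattern check. The sketch below is an assumption about what "invalid" could mean here, not the module's actual test: isValidPropertyId and valueOrNil are hypothetical names, and lookup stands in for whatever function really fetches the statements.

```lua
-- Hypothetical check: a property ID is "P" followed by a positive integer
-- with no leading zero, so "P0" and malformed strings are rejected.
local function isValidPropertyId(pid)
  return type(pid) == "string" and pid:match("^P[1-9]%d*$") ~= nil
end

-- Hypothetical wrapper: call lookup only for well-formed IDs and
-- silently return nil otherwise.
local function valueOrNil(pid, lookup)
  if not isValidPropertyId(pid) then
    return nil
  end
  return lookup(pid)
end

-- Stand-in lookup used only for this illustration.
local function fakeLookup(pid)
  return "value of " .. pid
end

print(valueOrNil("P910", fakeLookup), valueOrNil("P0", fakeLookup))
```

With a check like this, a half-finished template can keep "P0" placeholders and the affected fields simply stay empty.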
- @Llywelyn2000: the reason why the article cy:Gwatemala shows a different infobox from that in the template documentation is that in the template documentation you have fetchwikidata=ALL, but in the article you don't. I've added ALL as the default for {{{fetchwikidata|}}} (as {{{fetchwikidata|ALL}}}) so that the sub-headers will now show up even if you miss out fetchwikidata=ALL from the template call. --RexxS (talk) 22:40, 16 November 2018 (UTC)
- And they're back! Thanks Doug! Llywelyn2000 (talk) 05:09, 17 November 2018 (UTC)