User talk:MER-C/Wiki.java
This page is kept for historical reasons. Urgent stuff goes to User talk:MER-C, while everything else goes to the bug tracker. MER-C 03:32, 10 September 2012 (UTC)
Changelog
Version | Diff | Comment |
---|---|---|
0.01 | diff | Initial. |
0.02 | diff | Add default constructor, getCategoryMembers(String name). |
0.03 | diff | Added namespace support. |
0.04 | diff | Moved to use the mediawiki api. Added category intersection. License -> GPL 3. |
0.05 | diff | Added logging, sketchy user support. Worked around silly api limitation of 500/5000 elements returned per query. |
0.06 | diff | Fields for the various mediawiki logs. Added spamsearch, getDomain() (should have done this earlier). |
0.07 | diff | Optimized for bandwidth, add userRights() caching. Debug. |
0.08 | diff | Log support. Added a few utility methods. |
0.09 | diff | Add listPages(), editing throttle, better cookies. Now uses GZIP compression. Various other fixes. |
0.10 | diff | Add persistence, getImage(), whatLinksHere(), imageUsage(), getCurrentDatabaseLag(), getRenderedText(), getTalkPage(), getProtectionLevel(), pageExists(). We now check whether a page is protected before editing it. Various fixes, including ones below. |
0.11 | diff | Add upload(), parseList(), hasNewMessages(), assertions, maxlag. Rewrite login(), intersection(). |
0.12 | diff | Add ip block list, transclusions. Exception overhaul. Various optimizations. |
0.13 | diff | Add random page, thumbnails, ability to parse arbitrary wikitext. |
0.14 | diff | Added arbitrary scriptpath support, search, statistics, some other stuff. |
0.15 | diff | Short/long pages, bug fixes. |
0.16 | diff | Added API edit, move, edit counter, various "stuff about this page" methods. |
0.17 | diff | rm screen-scrape edit; add contribs(), Revision, section editing, purge() |
0.18 | diff | More "stuff about this page" methods, condense, status check, better error handling |
0.19 | diff | Rollback, page history, bug fixes |
0.20 | diff | Image history, old images, undo, new pages, revdelete bug fixes |
0.21 | diff | diff, attempt at upload API, various bug fixes |
0.22 | diff | quick user agent fix |
Special page equivalents
See Special:Specialpages for a list of special pages. The text on special pages may be edited by editing the appropriate system message.
Special page | Equivalent code |
---|---|
Special:Allmessages | listPages("MediaWiki:", Wiki.FULL_PROTECTION, Wiki.ALL_NAMESPACES) |
Special:Allpages | listPages() |
Special:Contributions | contribs() (excludes Special:Contributions/newbies) |
Special:Ipblocklist | getIPBlockList() |
Special:Linksearch | spamsearch() |
Special:Listusers | allUsers() |
Special:Log | getLogEntries() |
Special:Longpages | longPages() |
Special:Movepage | move() |
Special:Mypage | String title = "User:" + wiki.getCurrentUser().getUsername(); |
Special:Mytalk | String title = "User talk:" + wiki.getCurrentUser().getUsername(); |
Special:Newimages | getLogEntries(int amount, Wiki.UPLOAD_LOG) or newPages(int amount, Wiki.IMAGE_NAMESPACE) |
Special:Newpages | newPages() |
Special:Prefixindex | listPages() |
Special:Protectedpages | listPages() |
Special:Random | random() |
Special:Search | search() |
Special:Shortpages | shortPages() |
Special:Statistics | getSiteStatistics() |
Special:Upload | upload() |
Special:Userlogin | login() |
Special:Userlogout | logoutServerSide() |
Special:Whatlinkshere | whatLinksHere() |
Two Errors
Hello, there are two small errors in your code:
First:
In the method "getPageText(String title)", the line
text.append(line);
should be
text.append(line + "\n");
Second:
The method "login" doesn't work on the German Wikipedia. The bot logs in correctly, but the function returns false because the text "Login successful" doesn't exist on the German login page.
--88.72.43.131 11:05, 14 November 2007 (UTC) I hope you can understand me. I know my English isn't very good ;)
- Fixed both, but it will be some time before they are live - the todo list for 0.10 is quite long. (If you can't wait, the fix for the second one is to replace "Login successful" with "wgUserName = \"" + username + "\"".) MER-C 05:50, 16 November 2007 (UTC)
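The first fix above is needed because BufferedReader.readLine() strips the line terminator, so appending the raw lines glues them together. A minimal, standalone sketch of the corrected loop follows; the class and method names (LineJoinSketch, readAllLines) are made up for illustration and are not part of Wiki.java.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineJoinSketch
{
    // Hypothetical helper: reads every line and re-appends the '\n'
    // that readLine() removes, so the line structure survives.
    static String readAllLines(Reader source)
    {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source))
        {
            String line;
            while ((line = in.readLine()) != null)
                text.append(line).append('\n');
        }
        catch (IOException e)
        {
            // wrapped only to keep the sketch free of checked exceptions
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args)
    {
        System.out.print(readAllLines(new StringReader("== Heading ==\nSome wikitext")));
    }
}
```

Note that this always ends the result with a newline, even if the source had none after its last line.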
getPageText() can use API
public String getPageText(String title) throws IOException
{
// pitfall check
if (namespace(title) < 0)
throw new UnsupportedOperationException("Cannot retrieve Special: or Media: pages!");
// go for it
String URL = query + "prop=revisions&rvprop=content&titles="+URLEncoder.encode(title, "UTF-8");
logurl(URL, "getPageText");
checkLag("getPageText");
URLConnection connection = new URL(URL).openConnection();
setCookies(connection, cookies);
connection.connect();
BufferedReader in = new BufferedReader(new InputStreamReader(new GZIPInputStream(connection.getInputStream()), "UTF-8"));
String result = "";
String content = "";
// get the text
String line = "";
while ((line = in.readLine()) != null)
result += line+"\n";
if (result.indexOf("missing=\"\"") != -1)
content = "(not yet written)";
else if (result.indexOf("invalid=\"\"") != -1)
content = "(Bad title)";
else if (result.indexOf("<rev />") != -1)
content = "(empty)";
else
content = result.substring(result.indexOf("<rev>")+5,result.indexOf("</rev>"));
in.close();
log(Level.INFO, "Successfully retrieved text of " + title, "getPageText");
return decode(content);
}
— Preceding unsigned comment added by 80.143.120.164 (talk • contribs)
Sorry about the wait - I only check this page when I release a new version. The current way avoids parsing any XML. Sometimes it's harder and slower to use the API - rollback is another example. Won't fix. (I did, however, tweak the docs to detail what happens when exists(title)[0] == false). MER-C 10:56, 22 August 2008 (UTC)
- I'm having second thoughts about WONTFIXing this, the API's resolve redirects functionality could be handy here. MER-C 12:55, 22 August 2008 (UTC)
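For context on the redirect remark: the MediaWiki query API accepts a redirects parameter that makes it follow redirects for the requested titles. The sketch below only builds such a URL, reusing the parameters from the code posted above. The class name, the method name and the api.php prefix in main are assumptions for illustration; the real value of Wiki.java's query field is not shown on this page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class RedirectQuerySketch
{
    // Hypothetical helper: apiQueryPrefix stands in for the "query" field
    // used in the posted getPageText(); its real value is an assumption here.
    static String pageTextUrl(String apiQueryPrefix, String title)
    {
        try
        {
            return apiQueryPrefix
                + "prop=revisions&rvprop=content&redirects=1&titles="
                + URLEncoder.encode(title, "UTF-8");
        }
        catch (UnsupportedEncodingException e)
        {
            // UTF-8 support is mandatory on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        String prefix = "https://en.wikipedia.org/w/api.php?format=xml&action=query&";
        System.out.println(pageTextUrl(prefix, "Main Page"));
    }
}
```

The response then describes the redirect target rather than the redirect page itself, so any code that matches on the requested title would need to account for that.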
using rights and not groups for "apihighlimits"
Use rights, not groups, to choose the high limit. 'BOT' and 'ADMIN' are groups (see query("meta=userinfo&uiprop=rights|groups")), but you call them rights (User.userRights()):
int limit = 500; // 500 by default
String result = query("meta=userinfo&uiprop=rights");
if (result.indexOf("apihighlimits") != -1)
limit = 5000;
- This adds a query for no real reason because the result of User.userRights() is cached. (Just tweak the source if the default doesn't apply to you.) The method was named after Special:Userrights before I realized they were groups. Implementing the whole permissions model would result in lots of public static final long (ints aren't good enough) spam and take 500+ lines. Later. MER-C 14:01, 23 August 2008 (UTC)
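One way to reconcile the two positions above is to fetch the rights once, cache them as a set of names, and derive the limit from that set. The following is only a sketch under that assumption: QueryLimitSketch and maxResultsPerQuery are invented names, and a set of strings is merely one possible alternative to the bit-flag constants mentioned in the reply.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class QueryLimitSketch
{
    // Hypothetical helper: picks the per-query result limit from an already
    // fetched (and cached) set of right names, so no extra query is needed.
    static int maxResultsPerQuery(Set<String> rights)
    {
        return rights.contains("apihighlimits") ? 5000 : 500;
    }

    public static void main(String[] args)
    {
        Set<String> ordinaryUser = new HashSet<>(Arrays.asList("read", "edit"));
        Set<String> botOrAdmin = new HashSet<>(Arrays.asList("read", "edit", "apihighlimits"));
        System.out.println(maxResultsPerQuery(ordinaryUser));
        System.out.println(maxResultsPerQuery(botOrAdmin));
    }
}
```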
Upload bug?
Hi, I'm trying your code (great BTW) to upload files. There seems to be a problem with "special" chars in the destination filename and the description (see for example http://commons.wikimedia.org/wiki/File:Test%2Bkgoiyfyktgkggukgku.jpg):
- Spaces in the dest filename will turn into "+"
- Upload will say "Successfully uploaded" but fail when the dest filename contains a German Umlaut (äöüÄÜÖ)
- Upload will say "Successfully uploaded" but fail when the dest filename contains a comma (,)
- If upload succeeds, special characters in the wikitext will turn into gibberish
I tried to add "Content-Type:text/plain; charset=utf-8;" to the upload description and/or the wpDestFile (both with and without the content-type), but no luck. Do you know a quick fix? Cheers, --Magnus Manske (talk) 23:34, 7 August 2009 (UTC)
- Update: I've managed to clean up the contents by encoding it as iso-8859-1:
try {
contents = new String(contents.getBytes("UTF-8"), "iso-8859-1");
} catch (UnsupportedEncodingException ex) {
Logger.getLogger(BArchangleView.class.getName()).log(Level.SEVERE, null, ex);
}
No luck with the dest filename yet, though. I suppose the entire request should rather be utf-8 instead of these ugly hacks... --Magnus Manske (talk) 13:18, 8 August 2009 (UTC)
- Update 2: Got it working now! Here's the code of the entire function:
public synchronized void upload(File file, String filename, String contents) throws IOException, LoginException
{
// TODO: API upload? Still in the pipeline, unfortunately.
// throttle
long start = System.currentTimeMillis();
statusCheck();
// check for log in
if (user == null)
{
CredentialNotFoundException ex = new CredentialNotFoundException("Permission denied: you need to be registered to upload files.");
logger.logp(Level.SEVERE, "Wiki", "upload()", "[" + domain + "] Cannot upload - permission denied.", ex);
throw ex;
}
// UTF-8 voodoo
try {
contents = new String(contents.getBytes("UTF-8"), "iso-8859-1");
} catch (UnsupportedEncodingException ex) {
Logger.getLogger(BArchangleView.class.getName()).log(Level.SEVERE, null, ex);
}
// check if the page is protected, and if we can upload (incorporates lag check)
String filename2 = filename.replaceAll(" ", "_");
// String filename2 = URLEncoder.encode(filename.replaceAll(" ", "_"), "UTF-8");
try {
filename2 = new String(filename2.getBytes("UTF-8"), "iso-8859-1");
} catch (UnsupportedEncodingException ex) {
Logger.getLogger(BArchangleView.class.getName()).log(Level.SEVERE, null, ex);
}
String fname = "File:" + filename2;
if (!checkRights(getProtectionLevel(fname), false))
{
CredentialException ex = new CredentialException("Permission denied: image is protected.");
logger.logp(Level.WARNING, "Wiki", "upload()", "[" + domain + "] Cannot upload - permission denied.", ex);
throw ex;
}
// prepare MIME type
String extension = filename2.substring(filename2.length() - 3).toLowerCase();
if (extension.equals("jpg"))
extension = "jpeg";
else if (extension.equals("svg"))
extension += "+xml";
// upload the image
// this is how we do multipart post requests, by the way
// see also: http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.2
String url = base + "Special:Upload";
logurl(url, "upload");
URLConnection connection = new URL(url).openConnection();
String boundary = "----------NEXT PART----------";
connection.setRequestProperty("Accept-Charset", "iso-8859-1,*,utf-8");
connection.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
setCookies(connection, cookies);
connection.setDoOutput(true);
connection.connect();
// send data
boundary = "--" + boundary + "\r\n";
DataOutputStream out = new DataOutputStream(connection.getOutputStream());
// DataOutputStream out = new DataOutputStream(System.out); // debug version
out.writeBytes(boundary);
out.writeBytes("Content-Disposition: form-data; name=\"wpIgnoreWarning\"\r\n\r\n");
out.writeBytes("true\r\n");
out.writeBytes(boundary);
out.writeBytes("Content-Disposition: form-data; name=\"wpDestFile\"\r\n");
out.writeBytes("Content-Type: text/plain; charset=utf-8\r\n\r\n");
out.writeBytes(filename2);
out.writeBytes("\r\n");
out.writeBytes(boundary);
out.writeBytes("Content-Disposition: form-data; name=\"wpUploadFile\"; filename=\"");
out.writeBytes(filename);
out.writeBytes("\"\r\n");
out.writeBytes("Content-Type: image/");
out.writeBytes(extension);
out.writeBytes("\r\n\r\n");
// write image
FileInputStream fi = new FileInputStream(file);
byte[] b = new byte[fi.available()];
fi.read(b);
out.write(b);
fi.close();
// write the rest
out.writeBytes("\r\n");
out.writeBytes(boundary);
out.writeBytes("Content-Disposition: form-data; name=\"wpUploadDescription\"\r\n");
out.writeBytes("Content-Type: text/plain\r\n\r\n");
out.writeBytes(contents);
out.writeBytes("\r\n");
out.writeBytes(boundary);
out.writeBytes("Content-Disposition: form-data; name=\"wpUpload\"\r\n\r\n");
out.writeBytes("Upload file\r\n");
out.writeBytes(boundary.substring(0, boundary.length() - 2) + "--\r\n");
out.close();
// done
BufferedReader in;
try
{
// it's somewhat strange that the edit only sticks when you start reading the response...
String line ;
// in = new BufferedReader(new InputStreamReader(new GZIPInputStream(connection.getInputStream()), "UTF-8"));
in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
line = in.readLine();
// while ((line = in.readLine()) != null) System.out.println(line);
in.close();
}
catch (IOException e)
{
// retry once
if (retry)
{
retry = false;
log(Level.WARNING, "Exception: " + e.getMessage() + " Retrying...", "upload");
upload(file, filename, contents);
}
else
{
logger.logp(Level.SEVERE, "Wiki", "upload()", "[" + domain + "] EXCEPTION: ", e);
throw e;
}
}
if (retry)
log(Level.INFO, "Successfully uploaded " + filename, "upload");
retry = true;
// throttle
try
{
long z = throttle - System.currentTimeMillis() + start;
if (z > 0)
Thread.sleep(z);
}
catch (InterruptedException e)
{
// nobody cares
}
}
I still think the iso-hack is ugly, though... --Magnus Manske (talk) 16:07, 8 August 2009 (UTC)
- Yeah. I need to rewrite it for the upload API anyway, which will be with us on the next scap (Wikimania, perhaps?). Hopefully things will be saner then. MER-C 06:51, 9 August 2009 (UTC)
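Some background on why the iso-8859-1 hack in this thread has any effect: DataOutputStream.writeBytes(String) writes one byte per character and discards the high eight bits, so non-ASCII text is mangled unless each character already stands for exactly one UTF-8 byte. Writing the UTF-8 bytes directly with write(byte[]) should produce the same output without the detour. The sketch below compares the three variants in memory; the class and method names are illustrative and it is not a drop-in patch for upload().

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8FieldSketch
{
    interface FieldWriter { void writeTo(DataOutputStream out) throws IOException; }

    // Captures whatever the writer sends to the stream as a byte array.
    static byte[] capture(FieldWriter writer)
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer))
        {
            writer.writeTo(out);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // What the original upload() did: high byte of every char is dropped.
    static byte[] viaPlainWriteBytes(String s)
    {
        return capture(out -> out.writeBytes(s));
    }

    // The hack from this thread: disguise UTF-8 bytes as Latin-1 characters.
    static byte[] viaLatin1Hack(String s)
    {
        String disguised = new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        return capture(out -> out.writeBytes(disguised));
    }

    // The direct alternative: write the UTF-8 bytes themselves.
    static byte[] viaExplicitUtf8(String s)
    {
        return capture(out -> out.write(s.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args)
    {
        String name = "T\u00e4st \u20ac.jpg";
        System.out.println(Arrays.equals(viaLatin1Hack(name), viaExplicitUtf8(name)));
        System.out.println(Arrays.equals(viaPlainWriteBytes(name), viaExplicitUtf8(name)));
    }
}
```

This only covers the character encoding of the form fields. It says nothing about why filenames containing a comma failed.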
Bug in move()?
// success
if (temp.contains("move from"))
in.close();
// failure
checkErrors(temp, "move");
Should be:
// success
if (temp.contains("move from"))
in.close();
else
// failure
checkErrors(temp, "move");
? --Nat3738 (talk) 03:16, 8 October 2009 (UTC)
Issue with the APIs returning blank lines before actual response
This may occur in several places; I found the problem in login and edit.
These are the changes I made to make it work
in login:
String line = in.readLine();
boolean success = line.contains("result=\"Success\"");
in.close();
becomes
String line;
boolean success = false;
while ((line = in.readLine()) != null){
if (line.contains("result=\"Success\"")) {
success = true;
break;
}
}
in.close();
in edit the call to checkErrors causes an Exception if the first returned line is blank even though subsequent lines exist with the success message; you need to loop through the returned lines to check for success.
Glen.mccormick (talk) 13:56, 12 January 2010 (UTC)
- Works for me at least on WMF sites. You're probably thinking of the XML pretty-print format. MER-C 05:35, 12 February 2010 (UTC)
Small corrections
Hello MER-C,
I took the liberty to make 2 modifications on your code:
- I corrected a bug when getCategories() is called on non existing page or page without category
- I corrected some javadoc
- I corrected a bug when getImagesOnPage() is called on non existing page or page without images
But I did not modify the changelog.
I hope you don't mind.
In all cases, thanks a lot for your library and have a happy new year.
Best regards, Liné1 (talk) 07:46, 2 January 2011 (UTC)
- Thanks for the bug fixes. MER-C 09:33, 14 February 2011 (UTC)
Android compatible
I'm using your code for some Android apps I'm writing at the moment. I had to change some things, as Android's Java is missing some functions that standard Java has, e.g. isEmpty() on Strings had to be replaced with equals("").
So that I don't have to maintain the whole thing on my own: is there any chance I could maintain Android compatibility in your repo? My mail is at Freakolowsky. 10x. -- Preceding undated comment added 15:13, 23 May 2011 (UTC).
checkRights() bug
Hi, I'm developing Commons:VicuñaUploader and I found a bug related to cookies. If someone logs in without capitalizing the first letter (e.g. "myaccount"), the method user.getUsername() will return "myaccount", but the cookies contain "Myaccount" as received from the server. As a result, a CredentialExpiredException is thrown when it shouldn't be. The same happens with spaces and underscores: the server returns a plus sign instead.
Fix below:
protected boolean checkRights(int level, boolean move) throws IOException, CredentialException
{
// check if we are logged out
String s = user.getUsername();
s = s.substring(0,1).toUpperCase() + s.substring(1); //first to upper
s = s.replace(" ", "+").replace("_", "+"); //spc to plus
if (!cookies.containsValue(s))
{
logger.log(Level.SEVERE, "Cookies have expired");
logout();
throw new CredentialExpiredException("Cookies have expired.");
}
//(...)
Cheers, Yarl ✉ 14:00, 8 September 2012 (UTC)
- Noted. MER-C 03:35, 10 September 2012 (UTC)
- OK, and is there an easy way to check upload progress? Yarl ✉ 12:59, 10 September 2012 (UTC)
- The MW API is blocking serverside, so you will need to edit upload to update whatever progress bar you have. It is not possible to monitor single chunk uploads. MER-C 08:01, 17 September 2012 (UTC)
- Might be fixed in r89 (not tested). MER-C 08:27, 17 September 2012 (UTC)
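On the upload progress question above: since the library offers no hook, one possible approach is to replace the single out.write(b) call in a local copy of upload() with a chunked copy loop that reports how many bytes have been handed to the stream. The sketch below shows such a loop on in-memory streams. The names copyWithProgress and UploadProgressSketch are hypothetical and nothing like this exists in Wiki.java as far as this page shows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.LongConsumer;

class UploadProgressSketch
{
    // Hypothetical helper: copies in chunks of at most chunkSize bytes and
    // calls onProgress with the running total after each chunk is written.
    static long copyWithProgress(InputStream in, OutputStream out, int chunkSize, LongConsumer onProgress)
    {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        try
        {
            int read;
            while ((read = in.read(buffer)) != -1)
            {
                out.write(buffer, 0, read);
                total += read;
                onProgress.accept(total);
            }
        }
        catch (IOException e)
        {
            // wrapped only to keep the sketch free of checked exceptions
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args)
    {
        byte[] fakeFile = new byte[10000];
        copyWithProgress(new ByteArrayInputStream(fakeFile), new ByteArrayOutputStream(), 4096,
            sent -> System.out.println(sent + " of " + fakeFile.length + " bytes written"));
    }
}
```

One caveat: HttpURLConnection buffers the request body in memory by default, so the callback would track writes into that buffer rather than bytes on the wire unless a streaming mode such as setFixedLengthStreamingMode() is enabled. Server-side processing after the last byte is sent remains invisible either way, as the reply above notes.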