
Wikipedia:AutoWikiBrowser/Typos/to do

From Wikipedia, the free encyclopedia

Here are some tasks awaiting attention:
  • Cleanup : Keep lists sorted alphabetically by root word; e.g., put "(Un)Equal" just before "(In)Equality" among the "E" words. Don't sort by, say, ASCII character value.
  • Deletion sorting : Remove duplicates.
  • Maintain : Identify and improve rules to avoid false positives.
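
The cleanup and duplicate-removal tasks above can be sketched in Python. This is only an illustration, not part of AutoWikiBrowser: the helper names `root_key` and `tidy` are hypothetical, and the sketch assumes that the root word is whatever remains after dropping parenthesized optional parts such as "(Un)".

```python
import re


def root_key(word: str) -> str:
    """Sort key (assumption): drop parenthesized parts, then case-fold."""
    return re.sub(r"\([^)]*\)", "", word).casefold()


def tidy(words):
    """Remove exact duplicates, then sort by root word."""
    unique = dict.fromkeys(words)  # keeps the first occurrence of each entry
    return sorted(unique, key=root_key)


entries = ["(In)Equality", "Zebra", "(Un)Equal", "apple", "Zebra"]
result = tidy(entries)
# "(Un)Equal" (root "equal") now sorts just before "(In)Equality" (root "equality").
```

A plain `sorted(entries)` would order by character value instead, placing both parenthesized entries first and "apple" after "Zebra", which is the ordering the cleanup task asks editors to avoid.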


  • Improve efficiency
    • (s)? is faster than (s?), (s|), or (|s)
    • ([ab]) takes fewer steps than (a|b); prefer a character class to an alternation of single characters
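    • Factor out a common prefix: this|that -> th(?:is|at) [1]
    • Order matters, so list alternatives from most to least likely to occur [2]
      • For example, prefer [aA] to [Aa], since a lowercase letter is more likely than an uppercase one. Find \(\[([A-Z])([a-z])\]\) and replace with ([$2$1])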
  • Expand rules to accept more suffixes (e.g., "-ing", "-ed", "-able") and prefixes.
    • Note that some regular expressions purposely correct only certain versions of a word to avoid false positives. These should be marked with an underscore character "_" at the beginning or end of the word= field.
  • Remove rules for rare words. Note that finding no matches today does not mean a rule is rarely needed, since another user may have used the rule to fix many articles yesterday.
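
The rewrites listed under "Improve efficiency" are meant to change speed, not behavior, so it can help to confirm that a rewritten pattern still accepts the same strings. The following is a minimal sketch using Python's `re` module, which is an assumption made for illustration: step counts and timings depend on the regex engine the rules actually run in, so this checks equivalence only. The sample rule string is hypothetical, and Python writes backreferences in the replacement as `\2\1` where the find-and-replace above uses `$2$1`.

```python
import re


def same_language(pattern_a: str, pattern_b: str, samples) -> bool:
    """True if both patterns accept or reject every sample identically."""
    a, b = re.compile(pattern_a), re.compile(pattern_b)
    return all(bool(a.fullmatch(s)) == bool(b.fullmatch(s)) for s in samples)


# Common-prefix factoring: this|that -> th(?:is|at)
prefix_ok = same_language(r"this|that", r"th(?:is|at)",
                          ["this", "that", "then", "thisat", "th", "at"])

# Optional suffix: (s)? versus (s|)
optional_ok = same_language(r"cat(s)?", r"cat(s|)",
                            ["cat", "cats", "catss", "ca"])

# Single-character alternation versus a character class
class_ok = same_language(r"(a|b)", r"([ab])", ["a", "b", "ab", "c", ""])

# Reorder two-letter classes such as ([Aa]) to ([aA]) in a rule's source text.
reorder = re.compile(r"\(\[([A-Z])([a-z])\]\)")
rule = r"\b([Aa])ccomodat(e|ion)\b"  # hypothetical rule, for illustration only
reordered = reorder.sub(r"([\2\1])", rule)
```

Here `reordered` holds the same rule with `([Aa])` rewritten as `([aA])`; the rest of the rule text is left untouched.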

References

  1. ^ "Regular Expression Optimizations". www.rexegg.com. Retrieved 2018-11-10.
  2. ^ "Five Invaluable Techniques to Improve Regex Performance". Loggly. 2015-06-30. Retrieved 2018-11-14.