User:Chris the speller/BioRegEx
Suggested "Find and replace" settings for AWB when working on biographical articles, especially those about political and military people. Splice the rules into your .xml settings file (make a backup of the file first; see the "How to splice" section below), then use "File/Open settings..." within AWB.
These have been well tested.
What the rules do
There are eight Find & Replace rules:
1. Changes a spaced hyphen to a spaced en dash after a US-style date, as in ≈May 1, 1888 - June 1889≈ → ≈May 1, 1888 – June 1889≈
2. Changes ≈He served from 1923-1926.≈ → ≈He served from 1923 to 1926.≈ Also matches an en dash character or "&ndash;" in place of the hyphen.
3. Changes ≈He served from 1923-26.≈ → ≈He served from 1923 to 1926.≈ Also matches an en dash character or "&ndash;" in place of the hyphen.
4. Changes ≈to 1926 and 1928-1929.≈ → ≈to 1926 and 1928 to 1929.≈ Also matches an en dash character or "&ndash;" in place of the hyphen.
5. Changes ≈to 1926 and 1928-29.≈ → ≈to 1926 and 1928 to 1929.≈ Also matches an en dash character or "&ndash;" in place of the hyphen.
6. Changes ≈She was inactive between 1933-1941.≈ → ≈She was inactive between 1933 and 1941.≈ Also matches an en dash character or "&ndash;" in place of the hyphen.
7. Changes ≈She was inactive between 1933-41.≈ → ≈She was inactive between 1933 and 1941.≈ Also matches an en dash character or "&ndash;" in place of the hyphen.
8. Changes an unspaced hyphen to an en dash in a 4-digit year range within parentheses: ≈(1857-1904)≈ → ≈(1857–1904)≈ Note that this does not change links (piped or not) where a year range is at the end of the page title, as in [[Cuthbert P. Hipplethwaite (1640-1717)|Skippy Hipplethwaite]].
Find & replace rules 2 through 7 are governed by WP:YEAR, which says: "Ranges expressed using prepositions (from 1881 to 1886 or between 1881 and 1886) should not use en dashes (not from 1881–1886 or between 1881–1886)."
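To make the behaviour of the "from" and "between" rules concrete, here is a minimal sketch that replays rules 2, 3, 6 and 7 with Python's `re` module. This is only an illustration: AWB runs its rules on the .NET regex engine, so the replacement strings below use Python's `\1` syntax instead of `$1`, and the names `DASH`, `RULES` and `apply_rules` are made up for this example.

```python
import re

# Any of the three dash spellings the rules accept between the years.
DASH = r"(?:-|–|&ndash;)"

# (pattern, replacement) pairs, applied in order. Assumed equivalents of
# rules 2, 3, 6 and 7, rewritten for Python's replacement syntax.
RULES = [
    # Rule 2: from 1923-1926 -> from 1923 to 1926
    (r"\b(F|f)rom\x20(\d{4})" + DASH + r"(\d{4})\b", r"\1rom \2 to \3"),
    # Rule 3: from 1923-26 -> from 1923 to 1926 (century digits are reused)
    (r"\b(F|f)rom\x20(\d{2})(\d{2})" + DASH + r"(\d{2})\b", r"\1rom \2\3 to \2\4"),
    # Rule 6: between 1933-1941 -> between 1933 and 1941
    (r"\b(B|b)etween\x20(\d{4})" + DASH + r"(\d{4})\b", r"\1etween \2 and \3"),
    # Rule 7: between 1933-41 -> between 1933 and 1941
    (r"\b(B|b)etween\x20(\d{2})(\d{2})" + DASH + r"(\d{2})\b", r"\1etween \2\3 and \2\4"),
]


def apply_rules(text):
    """Apply every rule in turn and return the rewritten text."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


print(apply_rules("He served from 1923-26."))
print(apply_rules("She was inactive between 1933–1941."))
```

The two-digit variants capture the century separately so that it can be repeated in front of the abbreviated end year.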
The code
<Replacement>
  <Find>\b(January|February|March|April|May|June|July|August|September|October|November|December)\x20(\d{1,2},\x20\d{4}\x20)-\x20</Find>
  <Replace>$1 $2– </Replace>
  <Comment>May 1, 1888 - June 1889 (hyphen to en dash)</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\b(F|f)rom\x20(\d{4})(?:-|–|&amp;ndash;)(\d{4})\b</Find>
  <Replace>$1rom $2 to $3</Replace>
  <Comment>from 1991-1997</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\b(F|f)rom\x20(\d{2})(\d{2})(?:-|–|&amp;ndash;)(\d{2})\b</Find>
  <Replace>$1rom $2$3 to $2$4</Replace>
  <Comment>from 1991-97</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\bto\x20(\d{4})(,?)\x20and\x20(\d{4})(?:-|–|&amp;ndash;)(\d{4})\b</Find>
  <Replace>to $1$2 and $3 to $4</Replace>
  <Comment>to 1997 and 2001-2004</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\bto\x20(\d{4})(,?)\x20and\x20(\d{2})(\d{2})(?:-|–|&amp;ndash;)(\d{2})\b</Find>
  <Replace>to $1$2 and $3$4 to $3$5</Replace>
  <Comment>to 1997 and 2001-04</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\b(B|b)etween\x20(\d{4})(?:-|–|&amp;ndash;)(\d{4})\b</Find>
  <Replace>$1etween $2 and $3</Replace>
  <Comment>between 1992-1998</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\b(B|b)etween\x20(\d{2})(\d{2})(?:-|–|&amp;ndash;)(\d{2})\b</Find>
  <Replace>$1etween $2$3 and $2$4</Replace>
  <Comment>between 1992-98</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
<Replacement>
  <Find>\((\d{4})-(\d{4})\)(?![\]|#])</Find>
  <Replace>($1–$2)</Replace>
  <Comment>en dash in year range</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>
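The first and last rules are the ones that insert an en dash rather than a preposition. The sketch below replays them with Python's `re` module to show how the negative lookahead `(?![\]|#])` in the last rule leaves link targets alone. It is an approximation for illustration only: AWB uses the .NET regex engine, the replacements are rewritten in Python's `\1` syntax, and the names `RULE_1`, `RULE_8` and `fix_dashes` are hypothetical.

```python
import re

# Rule 1: spaced hyphen after a US-style date becomes a spaced en dash.
RULE_1 = (
    r"\b(January|February|March|April|May|June|July|August|September"
    r"|October|November|December)\x20(\d{1,2},\x20\d{4}\x20)-\x20",
    r"\1 \2– ",
)

# Rule 8: hyphen to en dash in a parenthesised 4-digit year range, unless
# the closing parenthesis is followed by "]", "|" or "#" (inside a link).
RULE_8 = (r"\((\d{4})-(\d{4})\)(?![\]|#])", r"(\1–\2)")


def fix_dashes(text):
    """Apply rule 1 and then rule 8 to the given wikitext."""
    for pattern, replacement in (RULE_1, RULE_8):
        text = re.sub(pattern, replacement, text)
    return text


print(fix_dashes("John Doe (1857-1904) served May 1, 1888 - June 1889."))
print(fix_dashes("[[Cuthbert P. Hipplethwaite (1640-1717)|Skippy Hipplethwaite]]"))
```

In the second call the range is followed by `|`, so the lookahead fails and the page title in the link is left unchanged.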
How to splice
I use the foxe XML editor from firstobject.com, which is a free download of less than a megabyte and is easy to install and use. Open the .xml file that holds your saved AWB settings, and double-click on "FindAndReplace" in the foxe tree view window (left side). Copy the above code to the clipboard, then paste it into the foxe text window (right side) directly below the opening <Replacements> tag. Optionally, you can delete existing replacement rules (before or after you paste the new rules): expand "Replacements" in the tree view, expand each replacement, click on the one you want to remove and press the "delete" key. Save the .xml settings file, then in AWB use "File/Open settings..."
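If you would rather script the splice than use an XML editor, the same steps can be sketched with Python's standard `xml.etree.ElementTree`. This is a rough sketch under assumptions: the `SETTINGS` string is a made-up, stripped-down stand-in for a real AWB settings file (only the "FindAndReplace" and "Replacements" elements are taken from the description above), the `splice` function and the sample rule are illustrative, and a real file should still be backed up first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal stand-in for an AWB settings file.
SETTINGS = """<AutoWikiBrowserPreferences>
  <FindAndReplace>
    <Replacements>
      <Replacement><Find>teh</Find><Replace>the</Replace></Replacement>
    </Replacements>
  </FindAndReplace>
</AutoWikiBrowserPreferences>"""

# One rule to splice in. Inside XML the literal text "&ndash;" has to be
# written as "&amp;ndash;" so that the file stays well-formed.
NEW_RULES = r"""<Replacement>
  <Find>\b(F|f)rom\x20(\d{4})(?:-|–|&amp;ndash;)(\d{4})\b</Find>
  <Replace>$1rom $2 to $3</Replace>
  <Comment>from 1991-1997</Comment>
  <IsRegex>true</IsRegex>
  <Enabled>true</Enabled>
  <Minor>true</Minor>
  <RegularExpressionOptions>None</RegularExpressionOptions>
</Replacement>"""


def splice(settings_xml, rules_xml):
    """Insert the rules directly below the opening <Replacements> tag."""
    root = ET.fromstring(settings_xml)
    replacements = root.find(".//Replacements")
    if replacements is None:
        raise ValueError("no <Replacements> element found in settings")
    # Wrap the fragment so that several sibling rules parse as one document.
    wrapper = ET.fromstring("<rules>" + rules_xml + "</rules>")
    for position, rule in enumerate(list(wrapper)):
        replacements.insert(position, rule)
    return ET.tostring(root, encoding="unicode")


print(splice(SETTINGS, NEW_RULES))
```

Inserting at the front mirrors pasting below the opening tag, so the new rules come before any rules already in the file.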