Jump to content

Talk:2015 Queen's Birthday Honours (Australia)

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Extracting

[edit]

For AOs, I used the following method to extract names and citations from "Officer (AO) in the General Division of the Order of Australia" at the GG's website: Open that PDF in Adobe Acrobat, save as Word document, and run the following macro:

Option Explicit

Sub Getem()

Dim rng As Range
Dim docSource As Document
Dim docTarget As Document
Dim str As String

Set docSource = ActiveDocument
Set rng = docSource.Range(0)

Documents.Add
Set docTarget = ActiveDocument

Do While True
  Set rng = GetBoldText(rng)
  If rng Is Nothing Then Exit Do
  'Debug.Print rng.Text
  With rng
    str = Right(.Text, Len(.Text) - InStr(1, .Text, " ")) ' strip 1st word
  End With
  If Left(str, 11) = "Honourable " Then str = Right(str, Len(str) - 11)
  If Left(str, 10) = "Professor " Then str = Right(str, Len(str) - 10)
  str = "*[[" & Trim(StrConv(str, vbProperCase))
  If Right(str, 1) = "," Then str = Left(str, Len(str) - 1) ' remove trailing comma
  str = str & "]] – "
  docTarget.Range.InsertAfter str
  rng.Start = rng.End
  
  Set rng = GetUnderlinedText(rng)
  If rng Is Nothing Then Exit Do ' should not happen
  'Debug.Print rng.Text
  str = Trim(rng.Text) & vbCr
  docTarget.Range.InsertAfter str
  rng.Start = rng.End
Loop
MsgBox "Done.", vbInformation + vbOKOnly, "GetEm"
End Sub

Function GetBoldText(rng) As Range
'Debug.Print "Start rng (0): " & rng.Start
With rng.Find
  .ClearFormatting
  .Format = True
  .Font.Bold = True
  If .Execute Then
    'Debug.Print "Start rng(1): " & rng.Start
    Set GetBoldText = rng
    'Debug.Print "Start GetBoldText: " & GetBoldText.Start
  Else
    Set GetBoldText = Nothing
  End If
End With
End Function

Function GetUnderlinedText(rng) As Range
'Debug.Print "Start rng (0): " & rng.Start
With rng.Find
  .ClearFormatting
  .Format = True
  .Font.Underline = wdUnderlineSingle
  If .Execute Then
    'Debug.Print "Start rng(1): " & rng.Start
    Set GetUnderlinedText = rng
    'Debug.Print "Start GetBoldText: " & GetBoldText.Start
  Else
    Set GetUnderlinedText = Nothing
  End If
End With
End Function

The result required some manual tweaking in finding the proper links for the subjects' articles. I suppose a similar approach could work for other sections. -- Michael Bednarek (talk) 12:41, 5 August 2015 (UTC)[reply]