Module talk:Strip to numbers

From Wikipedia, the free encyclopedia

Current problems

 Fixed

Obvious known issues:

  • Matches all numerals, hyphen-minuses, and dots anywhere in the input, and will:
    • produce an invalid number from input like font-size: -10% (resolves to --10) or approx. 53.5 (resolves to .53.5)
    • produce the opposite of the desired number for cases like margin-left: 2.5em (resolves to -2.5)
    • produce a probably-undesired concatenation from input like 42 chickens on 12 farms (resolves to 4212)
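
The failure modes listed above can be reproduced with a few lines. This is a minimal sketch in Python, used only for illustration (the module itself is Lua), and it assumes the original behaviour amounted to "delete every character that is not a digit, hyphen-minus, or dot". The name naive_strip is hypothetical.

```python
import re


def naive_strip(text: str) -> str:
    """Keep every digit, hyphen-minus, and dot; drop everything else.

    Assumed approximation of the behaviour described in the list above,
    not the module's actual code.
    """
    return re.sub(r"[^0-9.\-]", "", text)


samples = (
    "font-size: -10%",
    "approx. 53.5",
    "margin-left: 2.5em",
    "42 chickens on 12 farms",
)
for sample in samples:
    print(repr(sample), "->", repr(naive_strip(sample)))
```

Each sample yields the bad result quoted in the list: --10, .53.5, -2.5, and 4212.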

It needs to match - and . only when they precede a numeral, and to stop parsing and return a value as soon as it reaches the end of the first run of numeric data.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  10:35, 18 July 2015 (UTC)
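
One way to express that rule is a single anchored-by-content pattern: an optional hyphen-minus, then either digits with an optional fractional part, or a dot that is immediately followed by digits. The sketch below is in Python for illustration only and is not WOSlinker's rewrite; NUMBER and first_number are hypothetical names.

```python
import re

# A "-" counts only when a number follows it; a "." counts only when a digit follows it.
NUMBER = re.compile(r"-?(?:\d+(?:\.\d+)?|\.\d+)")


def first_number(text: str) -> str:
    """Return the first number found in text, or "" if there is none."""
    match = NUMBER.search(text)
    return match.group(0) if match else ""


for sample in ("font-size: -10%", "approx. 53.5",
               "margin-left: 2.5em", "42 chickens on 12 farms"):
    print(repr(sample), "->", repr(first_number(sample)))
```

Because the search returns only the leftmost match, the hyphens in font-size and margin-left and the dot in approx. are skipped, and 42 chickens on 12 farms stops at 42.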

WOSlinker's rewrite fixed all this.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  16:53, 18 July 2015 (UTC)

For later development


A more advanced version, which could be invoked directly or required by other modules, might do something like this:

  • For any arbitrary input, trim all leading material until it hits any of:
    • a numeral; or
    • a . followed by a numeral; or
    • a - or the proper negative/minus glyph followed by either:
      • a numeral, or
      • a . followed by a numeral;
  • then retain that character;
  • proceed to the next character, and retain it if it matches either:
    • a numeral, or
    • a ., unless one was matched earlier;
  • repeat until that fails (i.e. a second . is found, or any other non-numeral is found);
  • then trim everything after that;
  • and do all this for multiple values passed,
  • and even do it for multiple numbers found in the same value (or discard any found after the first, or something)
  • In the division function:
    • Accept arbitrary divisors;
    • Round the results to arbitrary decimal places if necessary (default: two, as in 23.48).
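
The steps above can be sketched as a character-by-character scanner plus a small division helper. This is Python for illustration only (the module is Lua), and every name here (MINUS_SIGNS, extract_number, divide) is hypothetical. Two assumptions are made: the "proper minus glyph" is taken to be U+2212 and is normalised to a hyphen-minus in the output, and the list is followed literally, so a trailing dot is retained (7. next gives 7.).

```python
from typing import Optional

MINUS_SIGNS = ("-", "\u2212")  # hyphen-minus and the Unicode minus sign (assumed)


def extract_number(text: str) -> str:
    """Return the first number in text as a string, or "" if none is found."""
    n = len(text)

    def digit_at(i: int) -> bool:
        return i < n and text[i] in "0123456789"

    def number_starts_at(i: int) -> bool:
        # A numeral, or a dot immediately followed by a numeral.
        return digit_at(i) or (i < n and text[i] == "." and digit_at(i + 1))

    # Trim leading material until a valid starting character is reached.
    start = 0
    while start < n:
        if number_starts_at(start):
            break
        if text[start] in MINUS_SIGNS and number_starts_at(start + 1):
            break
        start += 1
    if start == n:
        return ""

    out = []
    i = start
    if text[i] in MINUS_SIGNS:
        out.append("-")  # normalise either glyph to hyphen-minus (assumption)
        i += 1

    # Retain numerals and at most one dot; stop at anything else.
    seen_dot = False
    while i < n:
        ch = text[i]
        if digit_at(i):
            out.append(ch)
        elif ch == "." and not seen_dot:
            seen_dot = True
            out.append(ch)
        else:
            break
        i += 1
    return "".join(out)


def divide(text: str, divisor: float, places: int = 2) -> Optional[float]:
    """Divide the first number in text by divisor, rounded to `places` decimals."""
    number = extract_number(text)
    if not number or divisor == 0:
        return None  # nothing to divide, or division by zero
    return round(float(number) / divisor, places)


print(extract_number("margin-left: 2.5em"))
print(divide("width: 46.96px", 2))
```

Handling multiple values passed at once would then be a matter of calling extract_number on each value in turn.
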
A later super-badass version could also:
  • recognize ^ or e, and x or × or * when found in a context that indicates an exponent (23.5x10^8), and a few other such cases (e.g. characters used to indicate a truncated long/endless decimal)
  • find multiple numbers per input string, and separate them by something (e.g. a space or a comma) depending on which function is invoked
  • recognize simple formulae
  • recognize English words for numbers and convert them to digits

 — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  10:35, 18 July 2015 (UTC)
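
As a rough illustration of the first two "later" items (exponent notation and multiple numbers per string), here is a Python sketch, again for illustration only since the module is Lua. It assumes exponents appear either as e8 or as x10^8, ×10^8, or *10^8 directly after the mantissa; find_numbers and join_numbers are hypothetical names. Formulae and English number words are not attempted.

```python
import re
from typing import List

MANTISSA = r"[-\u2212]?(?:\d+(?:\.\d+)?|\.\d+)"
# Either "e8" / "E-8", or "x10^8" / "×10^8" / "*10^8" (assumed notations).
EXPONENT = (r"(?:[eE](?P<e1>[-+\u2212]?\d+)"
            r"|\s*[x×*]\s*10\^(?P<e2>[-+\u2212]?\d+))")
NUMBER = re.compile(rf"(?P<mantissa>{MANTISSA}){EXPONENT}?")


def find_numbers(text: str) -> List[float]:
    """Return every number found in text, applying any exponent that follows it."""
    values = []
    for m in NUMBER.finditer(text):
        mantissa = m.group("mantissa").replace("\u2212", "-")
        exp = m.group("e1") if m.group("e1") is not None else m.group("e2")
        if exp is None:
            values.append(float(mantissa))
        else:
            exp = exp.replace("\u2212", "-")
            values.append(float(f"{mantissa}e{exp}"))
    return values


def join_numbers(text: str, sep: str = " ") -> str:
    """Join the numbers found in text with sep, printing whole values without '.0'."""
    parts = []
    for value in find_numbers(text):
        parts.append(str(int(value)) if value.is_integer() else str(value))
    return sep.join(parts)


print(find_numbers("23.5x10^8 and 1.2e-3"))
print(join_numbers("42 chickens on 12 farms", ", "))
```

A bare e or x is only treated as an exponent marker when digits follow in the expected position, so 2.5em still yields 2.5 and 3 x 4 yields two separate numbers.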