
User talk:Lupin/badwords

From Wikipedia, the free encyclopedia

A curious question


I may, perhaps, be harder to offend than the average American, but how is "all the pies" considered a "bad word"? :) - JustinWick 08:34, 31 January 2006 (UTC)

See Who Ate All the Pies? - it's a "classic" playground football insult. Lupin|talk|popups 16:19, 31 January 2006 (UTC)[reply]
Wow, an informative response! Thanks, I learned something! - JustinWick 05:41, 12 February 2006 (UTC)[reply]

Suggestion


You should add more variations of the bad words. I can think of some you may have missed. Evan Robidoux 09:11, 24 February 2006 (UTC)[reply]

What are you thinking of? I can add them in. -Mysekurity 09:25, 24 February 2006 (UTC)[reply]
  • Image:Human_feces.jpg
  • suks
  • kill, especially with exclamation points.
  • Variations of the word "die," especially with exclamation marks (e.g.: "Die!")

That's all I can think of right now. Evan Robidoux 09:42, 24 February 2006 (UTC)[reply]

Another suggestion


Terms you've missed are permutations of a, s, d, and f. On a QWERTY keyboard, if you mash the keys, most people end up writing "asdasdasdf" or similar. Vandal edits usually give an edit summary of mashed keys.-- Alfakim --  talk  18:02, 13 April 2006 (UTC)

Actually, vandals usually give no edit summary, or only a section edit summary. This is probably because most of them are new users who haven't noticed the summary box.--Reverting 02:49, 6 June 2006 (UTC)[reply]
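
As a rough illustration of the key-mashing idea above, the pattern below flags long runs made up only of the letters a, s, d and f. The minimum run length of six and the name `mashed` are assumptions chosen for this sketch; neither comes from the badwords list itself.

```javascript
// Illustrative only: a run of six or more characters drawn from a, s, d, f.
// The minimum length of 6 is an arbitrary choice to limit false positives.
const mashed = /[asdf]{6,}/i;

console.log(mashed.test("asdasdasdf"));         // true
console.log(mashed.test("He adds two salads")); // false
```

Ordinary words rarely contain six such letters in a row, but a shorter threshold would start catching words like "adds".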

Regexp


Is it possible for this to support regexps? It seems to me that a good number of these edits and such could be used for good (see this diff, where the word vegan was added...)? -Mysekurity [m!] 21:17, 27 April 2006 (UTC)

Yes, I've had a go at this. Note that ( is replaced with (?: - the idea is that all paren groups are transformed into non-capturing parens so that it doesn't mess up script internals. This means that backreferences aren't possible and also that you should avoid opening parens apart from using them for grouping at the moment. Also, each regexp is treated as if it's surrounded by word boundary markers, it is made case-insensitive, and flags aren't supported. To add a regexp, surround it by forward slashes and add it to badwords. I haven't tested this much, so let me know how you fare... Lupin|talk|popups 02:13, 28 April 2006 (UTC)[reply]
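
The rules Lupin describes can be sketched roughly as follows. This is not the tool's actual code: `compileBadword` is a hypothetical name, and the literal handling of plain (non-slash) entries is an assumption made for the sake of a complete example.

```javascript
// Hypothetical sketch, not the tool's real code: it only mirrors the rules
// described above for turning one badwords entry into a RegExp.
function compileBadword(entry) {
  const asRegexp = /^\/(.*)\/$/.exec(entry);
  let body;
  if (asRegexp) {
    // Rewrite every "(" as "(?:" so that no group captures. This is why
    // backreferences and literal "\(" cannot be used in an entry.
    body = asRegexp[1].replace(/\(/g, "(?:");
  } else {
    // Assumption: plain entries are matched literally, so escape them.
    body = entry.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  }
  // Surround with word boundaries and ignore case.
  return new RegExp("\\b(?:" + body + ")\\b", "i");
}

const chicken = compileBadword("/(fried)?chicken/");
console.log(chicken.source);                // \b(?:(?:fried)?chicken)\b
console.log(chicken.test("Fried Chicken")); // true
console.log(chicken.test("chickens"));      // false
```

Because the match is wrapped in word boundaries, "chickens" is not caught unless the entry itself allows a trailing "s".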

More Bad Words


This is just a suggestion, I didn't add any of these.
REDIRECT--Maybe this will work against WoW, or redirect vandals.
chicks--as in "I like hot chicks."
stupid--"article is stupid"--I'm surprised you don't already have this.
Also, many vandals like to type in ALL CAPS, so maybe you can do something about this.

I disagree with REDIRECT, as it will give a huge number of false-positives for every time someone moves a page, or creates a redirect. It's broad words like this that make the tool much less useful. I'm going to remove it. -Mysekurity [m!] 01:20, 10 May 2006 (UTC)

Wang


How is Wang a bad word? It is a common Chinese family name. Andrew_pmk | Talk 02:37, 2 May 2006 (UTC)[reply]

It is also slang for penis, along with about a million other words to refer to genitalia (there's a certain stigma attached to private parts, as I understand). This is the type of situation where I think regexps (see above) would work well. Unfortunately, I'm not too good with them, so if you have any suggestions based on Lupin's post above, feel free to tell me and I'll change the page. -Mysekurity [m!] 02:45, 8 May 2006 (UTC)

I couldn't think of a title for this...


What about ____ on Wheels? And they aren't all bad words. Just words vandals like to use.-Gangsta-Easter-Bunny 20:09, 5 May 2006 (UTC)[reply]

It's already there (see "On Wheels"). -Mysekurity [m!] 01:19, 10 May 2006 (UTC)

Case sensitive?


Are the "badwords" listed here case sensitive? By that I mean will a word, say "bitch", still be detected if it is written "BITCH", for example, without a separate entry for an all-caps version of the word having to exist?--Conrad Devonshire Talk 01:39, 9 May 2006 (UTC)

They're all case-insensitive, so the answer to your second question is "yes". Lupin|talk|popups 02:32, 17 May 2006 (UTC)[reply]
Here's the thing though... I've seen more than a few vandal edits where the entire edit was done all in caps. Is there any way that we can filter for an "all caps" edit? Fbarton 00:19, 8 December 2006 (UTC)
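
An "all caps" check of the kind Fbarton asks about cannot be written as a case-insensitive badwords entry, but it could be sketched as a separate test on the added text. The helper name `looksAllCaps` and both thresholds below are assumptions for illustration; nothing like this is confirmed to exist in the tool.

```javascript
// Hypothetical helper: flags text whose letters are almost all upper case.
// minLetters and ratio are arbitrary illustrative thresholds.
function looksAllCaps(text, minLetters = 10, ratio = 0.9) {
  const letters = text.match(/[A-Za-z]/g) || [];
  if (letters.length < minLetters) return false; // too short to judge
  const upper = letters.filter((c) => c >= "A" && c <= "Z").length;
  return upper / letters.length >= ratio;
}

console.log(looksAllCaps("THIS ARTICLE IS TOTALLY WRONG")); // true
console.log(looksAllCaps("NASA launched a probe"));         // false
console.log(looksAllCaps("USA"));                           // false
```

The minimum length keeps short acronyms from being flagged on their own.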

Removed "fist"


I decided to remove "fist" from the list, but if anyone disagrees with this decision, feel free to undo it.--Conrad Devonshire Talk 21:37, 9 May 2006 (UTC)[reply]

Removed "woody"


I have decided to remove "woody" from the list of vandal terms.--Conrad Devonshire Talk 01:37, 17 May 2006 (UTC)[reply]

Moravia?


Why is "Moravia" on the list of vandal terms?--Conrad Devonshire Talk 21:54, 28 May 2006 (UTC)[reply]

No idea :) Here's the diff. Lupin|talk|popups 01:39, 30 May 2006 (UTC)[reply]

Linkspam


I've added three links to the list. I don't think they should be banned from Wikipedia outright, but they have been added a lot recently and I'd like to keep an eye on them. If this is not the kind of thing we want on this list, feel free to remove them. Tom Harrison Talk 14:50, 3 June 2006 (UTC)[reply]

Badwords fork


Rather than ask for consensus every time I wanted to remove a false positive, I've split off my own badwords list which is slightly more optimized. Anyone who is interested is welcome to use it: http://wiki.riteme.site/wiki/User:Can%27t_sleep%2C_clown_will_eat_me/badwords -- Can't sleep, clown will eat me 02:32, 5 June 2006 (UTC)[reply]

Forking is fine of course, but I'd rather people were bold and changed the page as they saw false positives or missing bad words crop up instead of trying to come to some sort of consensus in advance. If there's controversy there can be discussion, but I don't want anyone to think that there's a requirement to discuss before making changes. Lupin|talk|popups 06:51, 6 June 2006 (UTC)[reply]

gabenwell.com and churnedfortaste.com


I have added these two sites to the list. If you see a link to either one of them posted, DO NOT CLICK IT. It will cause a window with an offensive image to appear and will attempt to open tons of Outlook Express and Instant Messenger windows and try to send e-mail to the GNAA. They were posted by now-blocked user Churnedfortaste. Another mirror of this site, hentai.net, has also been spammed according to the Spam Blacklist but has since been blacklisted.--Conrad Devonshire Talk 03:06, 11 July 2006 (UTC)

Ho


Could someone please remove "ho" from the list? I looked for it myself, but couldn't find it.--The Count of Monte Cristo Parley 10:13, 1 August 2006 (UTC)[reply]

Done. I couldn't find it either, so I wrote a script which I've included below for reference. Lupin|talk|popups 01:17, 2 August 2006 (UTC)[reply]
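#!/usr/bin/env perl
# usage: findbad.pl testword < badwords
# Prints every regexp entry (a line of the form /.../) that matches testword.
use strict;
use warnings;

my $test = $ARGV[0];    # scalar element, not the one-element slice @ARGV[0]
die "usage: $0 testword < badwords\n" unless defined $test;
while (<STDIN>) {
  next unless m!^/(.*)/$!;    # skip entries that are not regexps
  my $re = $1;
  if ($test =~ /$re/i) {      # entries are matched case-insensitively
    print "$.: $_";           # input line number, then the matching entry
  }
}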

Triple


I have removed "triple", as it was giving lots of false-positives, and I can't imagine any bad use of it. -Goldom ‽‽‽ 11:50, 5 August 2006 (UTC)[reply]

Apparently, I haven't, cause it's still showing up. Not sure what I actually did there, in that case, so I reverted in case it was something bad. If someone else could remove it properly, unless there's a reason to keep it, that'd be great. -Goldom ‽‽‽ 11:54, 5 August 2006 (UTC)[reply]
The motivation was that Colbert vandals are saying that various populations have tripled, apparently. I have removed the line, though, and have added instructions on getting the change to take hold at the top of the page. Lupin|talk|popups 13:57, 5 August 2006 (UTC)[reply]

TTT


Why is TTT flagged as a bad word? -- Selmo 04:33, 18 August 2006 (UTC)[reply]

nigger


What do you think of the idea of adding nigger(s) to the black list? I saw it twice tonight Lucasbfr 02:18, 20 August 2006 (UTC)[reply]

I'm sorry, racial slurs are terrible things, etc, but that's a fairly amusing (hopefully unintentional) pun. Yes, I am that insensitive.- JustinWick 09:32, 7 December 2006 (UTC)[reply]
It's K, I was thinking the same thing. -Patstuarttalk|edits 10:05, 7 December 2006 (UTC)[reply]

queer


I've been using your tool (which I LOVE) and a few times "queer" came up because the TV show "Queer eye for the straight guy" was mentioned. Is it possible to make that an exception to the scan for that word? Lauren 18:56, 20 August 2006 (UTC)[reply]

Regular expression idioms


Wherever a space appears in a regular expression, it could be replaced with \s* to allow any amount of whitespace (including none) to match; use \s+ to require at least one. Also useful: (e?s|[e']?d|in[g']?|ers?)? to catch verb paradigms such as pick, picks, picked, picker, pick'd, picking, pickin', and so on. Peter O. (Talk) 02:53, 23 August 2006 (UTC)
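
The two idioms above can be checked with a short sketch. The stem "pick" comes from the comment; the phrase "made out" is borrowed from an existing list entry, and the variable names are just for illustration.

```javascript
// Suffix idiom quoted above, attached to the stem "pick".
const suffix = "(e?s|[e']?d|in[g']?|ers?)?";
const pick = new RegExp("^pick" + suffix + "$", "i");
const forms = ["pick", "picks", "picked", "picker", "pick'd", "picking", "pickin'"];
const allMatch = forms.every((w) => pick.test(w));
console.log(allMatch);             // true
console.log(pick.test("pickle"));  // false

// \s* between words tolerates any amount of whitespace, including none.
const madeOut = /made\s*out/i;
console.log(madeOut.test("made   out")); // true
console.log(madeOut.test("madeout"));    // true
```

The anchors `^` and `$` are only there to test whole words in isolation; in the list itself the word-boundary wrapping would play that role.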

Noxious SPAMmer


Since "datasheet4u.com" has done NOTHING but SPAM datasheet, could someone add this to the list to prevent sneaky insertions (It's already on the SPAM blacklist, but they just don't link it instead)? Thanx. 68.39.174.238 23:26, 5 September 2006 (UTC)[reply]

Regex


How come these two rules I made to match vandalism which often involves the use of more than 2 ?'s and !'s don't seem to work? What is wrong with them, and what's the right way of matching multiple question marks and multiple exclamation marks?

/!{2,}/

/\?{2,}/

Sir Vicious 01:34, 1 November 2006 (UTC)[reply]

Regular expressions are awful. They never do what you expect them to do (or what documentation says they should do); they work differently on each system, and what's more, the huge amount of the aforementioned documentation never seems to solve the problem. -Patstuart(talk)(contribs) 03:04, 1 November 2006 (UTC)
Thanks for the comment. So, are there better ways of matching them? I've tried /!!+/ too but it did not seem to have the desired effect, it matched a single "!" too, weird. Sir Vicious 03:50, 1 November 2006 (UTC)[reply]
Come to think of it, maybe I don't need to use regexp at all, I can just match ?? and !!, any case where more than 2 marks is used will also automatically be matched. Sir Vicious 03:59, 1 November 2006 (UTC)[reply]
I've tried some stuff in the sandbox; it's picking up Niger (I actually added that as a regex to pick up nigar), but it's not picking up n00b, which is also on the list and which I could have sworn it would pick up. *Sigh*. Patstuart(talk)(contribs) 04:08, 1 November 2006 (UTC)
Ha! As I typed this, look at this edit: [1]. and I thought picking up niger was bad! Patstuart(talk)(contribs) 04:09, 1 November 2006 (UTC)[reply]
Hehe, yes, there is always an idiot out there who can't even vandalize right =) Sir Vicious 04:13, 1 November 2006 (UTC)[reply]
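
One possible explanation for the punctuation rules misbehaving, assuming the word-boundary wrapping described in the Regexp section applies to them: `\b` only matches between a word character and a non-word character, so a run of "!" at the end of a sentence has no boundary after it. The sketch below compares the rule as written with a wrapped version; whether the tool really builds the second form is an assumption.

```javascript
// Assumption: the tool wraps each entry in \b...\b, as described for regexps.
const asWritten = /!{2,}/;
const wrapped = /\b!{2,}\b/;

console.log(asWritten.test("This is great!!")); // true
console.log(wrapped.test("This is great!!"));   // false: no \b after the last "!"
console.log(wrapped.test("so!!cool"));          // true: letters on both sides
```

This would not account for a single "!" being matched, so it is at best part of the story.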


Possible or impossible


I don't know if this would be possible, but I've seen a lot of vandalism today where the user put their own username into an article. I found them through the badwords filter, but I wonder how much "Graffiti" we're missing because of this. Is there a way to check if the added text is equivalent to the editor's username? Fbarton 19:01, 8 December 2006 (UTC)[reply]
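
A check along these lines could be sketched as below, assuming the tool has both the added text and the editor's username available. The function name `mentionsEditor` is hypothetical, and nothing here is taken from the tool's code.

```javascript
// Hypothetical check: does the added text contain the editor's username
// as a whole word? The username is escaped so it is matched literally.
function mentionsEditor(addedText, username) {
  const escaped = username.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp("(^|\\W)" + escaped + "($|\\W)", "i");
  return re.test(addedText);
}

console.log(mentionsEditor("Fbarton was here", "Fbarton"));  // true
console.log(mentionsEditor("The Bartons moved", "Barton"));  // false
```

Usernames that are ordinary words would still produce false positives, so this could only ever be one signal among several.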

Innovative vandalism


Just came across this. Not sure how to add <nowiki> and </nowiki> to this list. —Dylan Lake 02:00, 13 December 2006 (UTC)[reply]

"Chicken" and "Cum laude"

  1. Why is "Chicken" a bad word? The vandal tool has been flagging a lot of harmless edits about KFC recently.
  2. I think that "Cum laude" should not be considered a bad word, even though "cum" is obviously one.

repetitions of hi


I've had several vandals recently doing repetitions of hi, e.g. hihihihihi. Can this be added? BlankVerse 00:33, 11 January 2007 (UTC)[reply]

Done! Lupin|talk|popups 22:37, 11 January 2007 (UTC)[reply]

Roland?


Why is "Roland" on the list... --Catz [TC] 14:25, 13 January 2007 (UTC)[reply]

Another bad word?


MMM Commentaries - I've seen it inserted onto several pages (think petitiononline): 1 2 3 4 5 6 --science4sail talkcon 01:25, 23 January 2007 (UTC)[reply]

Sorted list


Folks, I am trying to use this list to scan for entries in the WP CD release - see Wikipedia talk:Version 0.5. To try to optimise this list, I sorted it by the longest embedded string and put the results at User_talk:Lupin/sorted_badwords. Could this please replace the parent page? Can people optimise the list? Wizzy 10:17, 7 February 2007 (UTC)

Out of a list of 1991 articles, the following regular expressions hit most often (and thus could use the most tailoring); the first column is the number of hits:
102     /(fried)?chicken/
94      /rap(e[sd]?|ers?|ing)/
53      /monkeys?/
53      /dumb?(ass|arse|o|m?y)?/
51      /fat(ty|ass)/
49      /lesbian(s|ism)?/
48      /sex(e[dr]?)s?/
44      /chi(ck|x)s? ?(with ?di(ck|x)s?)?/
40      /ma(de|ke[ds]?|king) out/
37      /s?su(c?k|x)(a|ing|e[rd]|y)?s?/
32      /stupid(ity|ness|er|head|ly)?s?/
32      /loo?sers?/
30      /s?su(c?k|x)(a|ing|e[rd]|y)?s? (my|your|his|her|its|their|our|each other|peter)?s?/
29      /[a@]([s$][s$]+|rse?|zz)(ban(ned)?|s?e|fuc?k|h[0o][l1][e3]|head|hat|juice|lick(e[rd])?|ram(mer|ma)?|raper?|rapper|wiper?)?[sz]?/
29      /cum(bucket|dumpster|felch(er|ing|ed)?)?s?/
26      /rect(al|ums?)/
26      /retard(s|ed(ly)?)?/
24      /sodom(i[zst](e[rd]|ing)|y)s?/
23      /butt-?(|breath|crack|fuck(e[dr]|ing)?|head|hole|lick(er|ing)|pirate|rape|sex|secks|wiper?)s?/
22      /vagina(l|s)?/
21      /an(us|al)(hole|tova|es)?/
20      /r[ai]m(job|me[dr]|ming)s?/
20      /c[o0]ck-?(|ass|bag|biter?|goggle|fucker|smok(a|e[dr]|ing|in|in')|head|face|nose|hole|suck(|a|e[dr]|ing|in|in')|thirsty?)?s?/
19      /fetish(es|ism)?/
18      /junk(ies?)?/
18      /jerk(ing|ed|y|wad)?([- ]?off)?s?/
17      /n[i1]gg?([e3]r|ar?|uh)(lover|ass)?[sz]?( stole)?/
17      /w[au][sz] here/
17      /d[a4]m[nm](it)?/
15      /beaver(juice|lick|suck|fuck)?(er|ing|ed|a)?s?/
15      /lam[eo](brain|er)?s?/
14      /testicles?/
14      /crackers?/
13      /p[3ei]n[1!iu]s(bit|lick|suck|head|fuck|face|hole(e|er|ing)?)?s?/
13      /Amerik+an?'?s?/
12      /sex(y|ier|iest) ?(babe|cunt|beast|bitche?|whore)?s?/
12      /(yo)+/
12      /nuk(e([dr])?s?|ing)/
11      /nipples?/
10      /bu(m|ng)(hole|lick(e[rd])?|wipe[rd]?|ming|chum)?s?/
10      /Japs?/
10      /((is a|are|is) )?homo(phobe)?s?/
10      /(f|ph)u(kc|c+k*|c*k+|x)(a|ass|e[rd]|ie|y|bitch|erino|head|hole|arse|arsed|face|queer|wit|in[g']?|inghell|[o0]r?|o|off|tard|wad)?s?/
10      /finger(ing|ed|pull(a|er)s?)/
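
A tally like the one above could be produced along the following lines. This is only a sketch: `countHits` is a hypothetical name, the sample texts are made up, and it is not known how the original counts were actually generated.

```javascript
// Hypothetical tally: for each pattern, count how many texts it matches.
// Patterns must not use the "g" flag, because test() on a global RegExp
// keeps state between calls.
function countHits(patterns, texts) {
  return patterns
    .map((re) => ({
      pattern: re.source,
      hits: texts.filter((text) => re.test(text)).length,
    }))
    .sort((a, b) => b.hits - a.hits);
}

// Made-up sample texts, for illustration only.
const texts = ["Fried chicken recipe", "Monkeys of Borneo", "A chicken farm"];
const patterns = [/\bmonkeys?\b/i, /\b(?:fried)?chicken\b/i];
const tally = countHits(patterns, texts);
console.log(tally);
```

Sorting by hit count puts the entries most in need of tailoring at the top, as in the list above.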

Going to remove 'the'


I don't understand why 'the' is a 'bad word'.. it just floods the tool. SgeoTC 05:18, 11 February 2007 (UTC)[reply]

Major overhaul


Spent some time working on the list (as you can tell from the edit summary). Basically, instead of a straight alphabetical list, I made an attempt to categorize and prioritize it by level of offensiveness so that the most egregious vandals are more apparent when using the 'recent changes' tool. Also added quite a few phrases and sentence fragments based on the vandal patterns that I've been seeing. Hope it works out for everyone, and please let me know if I've either helped out or jacked something up. RJASE1 Talk 20:16, 18 February 2007 (UTC)[reply]

Punk


The punk string appears to me to be generating huge numbers of false positives, and I have yet to see it generate a true positive. IMHO the expression should be modified to only match punk with "asse" and perhaps "buc" (I'm not sure what the buc bit is for), so that fewer articles that are genuinely about punk rock are picked up. I don't know how the regular expressions work so I'm not sure what would be best. --Jon186 13:23, 4 March 2007 (UTC)[reply]

Fixed. RJASE1 Talk 17:16, 4 March 2007 (UTC)[reply]
Thanks for that :o) --Jon186 20:19, 11 March 2007 (UTC)[reply]

What regex does this use?


The syntax for regular expressions varies depending on the implementation used. Which regex is used here? Is there any documentation? -- kenb215 talk 21:35, 13 March 2007 (UTC)[reply]

It is the syntax used by your browser's javascript engine, which is generally something like PCRE (see the ECMAScript spec for details). There are further restrictions, however, as parens (...) are replaced internally by (?:...) which means you can't use literal parens, \1, \2 etc. Lupin|talk|popups 22:35, 13 March 2007 (UTC)[reply]

What's the best way to test a regular expression that I wish to add to the list? Is there a way to test a portion of text against the existing list to see if the vandalism is already being caught? --  callred

In theory: Make a user subpage, uncheck "Ignore my edits," open the "filter recent changes" page, add your test to the subpage, and see if it shows up (make sure you do everything in that order... except maybe the first one) In practice: There's probably a much better way to do this... maybe with the javascript: URI or something... --Thinboy00 @145, i.e. 02:29, 14 February 2008 (UTC)[reply]

April Fool?


Should "April Fool" be added to this list? A lot of users have already started making April Fools' Day edits and a lot of them contain the text "April Fool". -Mschel 21:25, 31 March 2007 (UTC)

Be bold? Too late now though. It was probably a good idea. GofG ||| Talk 13:58, 8 April 2007 (UTC)[reply]

Bot


I'm curious, would it be allowable for a bot to use this as a secondary source for badwords when the bot is doing a different job? (e.g. newpage monitoring) Thanks! TheFearow 05:37, 15 May 2007 (UTC)[reply]

"Learn english"?


To counteract any Stephen Colbert-related vandalism, does it make sense to add "Learn English" (just like "librarians are hiding something" was added to the list) -- Amazins490 (talk) 20:37, 25 May 2007 (UTC)[reply]

I agree, you should add that to the list. Make sure learn and english are capitalized though, there are probably a lot of instances in Wikipedia where it says "learn english".--eskimospy (talkcontribs) 03:05, 26 May 2007 (UTC)[reply]

Signatures


Hi! I don't know much about scripting, but would it be possible to stop filtering ~~~~ and ~~~ from the list of repeated characters? It's showing up a lot in my filter. Thanks. Smaug123 06:25, 25 June 2007 (UTC)[reply]

Jimmy Wales


What is Jimmy Wales doing on the list? I mean, just because he is the founder of wikipedia, doesn't mean that any vandal would type it in.... Coastergeekperson04 06:56, 5 July 2007 (UTC)[reply]

It may be that this is part of the MO of one or more vandals. I just added an e-mail address for the same reason - this specific e-mail address seems to have been used twice by the same vandal. The address I'm talking about is ignoreallrules@walla.com. Od Mishehu 09:27, 2 August 2007 (UTC)

Cum laude


I don't really know how this list works, but is there a way to make "exceptions", or a "whitelist"? The filter just showed a page with the words "cum laude" because it matched the word "cum". Melsaran 11:30, 17 August 2007 (UTC)[reply]
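
One way a whitelist could work, sketched under the assumption that it runs as a separate step before the badwords test: blank out known-good phrases first, then test what is left. The function name and the exception list are hypothetical. A lookahead such as `cum(?! laude)` might not survive the tool's rewriting of "(" into "(?:", which is why the sketch avoids one.

```javascript
// Hypothetical whitelist step: blank out known-good phrases, then test
// whatever text remains against the badword pattern.
function matchesOutsideExceptions(text, badword, exceptions) {
  let cleaned = text;
  for (const ok of exceptions) {
    cleaned = cleaned.replace(ok, " ");
  }
  return badword.test(cleaned);
}

const cum = /\bcum\b/i;
// Exception patterns use the "g" flag so every occurrence is blanked.
const exceptions = [/\b(?:magna |summa )?cum laude\b/gi];

console.log(matchesOutsideExceptions("She graduated magna cum laude in 1990", cum, exceptions)); // false
console.log(matchesOutsideExceptions("cum everywhere", cum, exceptions));                        // true
```

The same approach would cover titles such as "Queer Eye for the Straight Guy" by adding them to the exception list.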

!!


!! is wikisyntax for tables, if you want to list headers one after another. I'm not sure how to edit this list, but it would kill a lot of false positives. :-) Stwalkerster talk 14:40, 17 August 2007 (UTC)[reply]

hi


Another point: it is picking up his, white, history etc. because they contain 'hi'. :-) Stwalkerster talk 15:39, 17 August 2007 (UTC)[reply]

I'm not sure that's quite right. The diff still has to contain the identifiable word 'hi' for this to happen. If it does, then all occurrences of the string 'hi' are highlighted. The false positives come from things like '.hi.' (which occurs inside some URLs) and 'hi:', the Hindi language tag. Philip Trueman 14:27, 24 August 2007 (UTC)[reply]
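
Philip Trueman's point can be illustrated with a word-boundary match, assuming the entry is tested as `\bhi\b` as described in the Regexp section. The URL below is a made-up example.

```javascript
// Assumed form of the "hi" entry once wrapped in word boundaries.
const hi = /\bhi\b/i;

console.log(hi.test("history"));                 // false
console.log(hi.test("white"));                   // false
console.log(hi.test("http://www.hi.example/"));  // true: "." is not a word character
console.log(hi.test("[[hi:Example]]"));          // true: "[" and ":" are not word characters
```

So the detection is not triggered by "his" or "white" themselves; it takes a standalone "hi" somewhere in the diff, and punctuation on either side counts as a boundary.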

New Word


I have seen "FOOKIN" used once or twice now that hasn't been picked up. DoyleyTalk 19:18, 9 October 2007 (UTC)[reply]

Recent alterations


Some recent alterations made to the word list broke AVT's filter recent changes page. I'm not sure which specific change broke it (though I suspect it was the fairly major changes by Rocket000 (t c)); but reverting to the Sept 29th version fixed the tool, and that's the important part. If you make changes to the word list, please double check that your changes didn't break the script—there are instructions at the top of the word list for forcing your browser to use the changes immediately. I'd suggest taking the time to make sure the script still works normally if you make a change, especially if you change a large number of entries all at once. --Darkwind (talk) 01:16, 21 October 2007 (UTC)[reply]

Jews did WTC


I think this is rather anti-semitic; how is "Jews did WTC" considered a "vandal term"? --Blake3522 03:35, 3 November 2007 (UTC)

Entries on this list are things we usually DON'T want in Wikipedia articles. It means vandals were writing "Jews did WTC" on Wikipedia pages, and because it's now on this list the Anti-Vandal Tool will catch that and help us remove it. --Darkwind (talk) 17:42, 3 November 2007 (UTC)
And this makes reference to the redirect: 9/11 conspiracy claims regarding Jews or Israel. --Blake3522 04:00, 10 November 2007 (UTC)[reply]
Jews did Judaism! 192.12.88.7 (talk) 02:59, 28 March 2009 (UTC)[reply]
Here, I was pointing out a potentially acceptable use of "Jews did xxxx", and this is one problem I have with filters. Also, suppose someone's kidding around and says "User such-and-such is a moron (j/k)", and clearly indicate he or she is making a joke, wouldn't that run into the filters? 192.12.88.7 (talk) 03:01, 28 March 2009 (UTC)[reply]
See there, that triggered a filter! 192.12.88.7 (talk) 03:02, 28 March 2009 (UTC)[reply]

"Ethiopian" string


The "ethiopian" string seems to be having big numbers of false positives. Even when the article matches this string, so it is not right. --Blake3522 (talk) 07:09, 24 November 2007 (UTC)[reply]

False positive: Rotten - Rotten Tomatoes


I don't know how to change the code, but could someone remove the false positive Rotten Tomatoes hits from the word "rotten"? Thanks! :) ~Eliz81(C)

Rotten Tomatoes is a website, and we don't want spam, do we? —Coastergeekperson04's talk@12/09/2007 01:50
I've seen it used as a source in movie articles. --Thinboy00 @759, i.e. 17:12, 21 January 2008 (UTC)[reply]
It's a really popular website for reference, and I've gotten this false positive too. I would remove it, if I could find where it was. The Evil Spartan (talk) 07:25, 22 January 2008 (UTC)[reply]
The line is /rotten[- ]?(ass|crotch)?e?s?/, which would evaluate as true for "rotten". It probably could be modified to evaluate as false for "rotten", but I would have to ask someone more informed about regex than me - i.e. User:Gracenotes >_> --Iamunknown 07:28, 22 January 2008 (UTC)[reply]
Thanks. I've changed it so the second phrase must be part of the filter. No reason to go chiming off every time we get the word rotten. The Evil Spartan (talk) 08:37, 22 January 2008 (UTC)[reply]
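
The difference between the two versions can be sketched as follows. The "before" pattern is the line quoted above with word boundaries added; the "after" pattern is a guess at the fix described (dropping the "?" after the group), since the exact replacement is not quoted here.

```javascript
// Before: the (ass|crotch) group is optional, so plain "rotten" matches.
const before = /\brotten[- ]?(?:ass|crotch)?e?s?\b/i;
// After (assumed fix): the group is required.
const after = /\brotten[- ]?(?:ass|crotch)e?s?\b/i;

console.log(before.test("Rotten Tomatoes")); // true
console.log(after.test("Rotten Tomatoes"));  // false
console.log(after.test("rotten-ass"));       // true
```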

Cummings


The last name Cummings seems to be coming up a lot as a false positive. If anyone with knowledge would be able to fix this. Thanks. The Evil Spartan (talk) 02:05, 20 January 2008 (UTC)[reply]

Is that as in e e cummings? --Thinboy00 @639, i.e. 14:19, 24 January 2008 (UTC)[reply]

repeated dashes


I see a lot of <!-- this -------------> (with trailing dashes) in front of or above infoboxes. Since the repeated dash filter kept finding them, I removed it. --Thinboy00 @757, i.e. 17:10, 21 January 2008 (UTC)[reply]

Jig


I do not know why Jig is flagged as a bad word, since it can mean a lively traditional Celtic dance commonly used in Baroque music called Gigue. Johnny Au (talk) 21:28, 27 January 2008 (UTC)[reply]

no spaces


any better way to catch [2]? Right now the only thing that catches that is the !!! filter. We need something to catch bad words without spaces. --Thinboy00 @914, i.e. 20:55, 23 February 2008 (UTC)[reply]

In the house


This is always something like 'in the house of commons' or similar; I've never seen it used for vandalism. Keep 'in da house' though. Thought I'd better bring it up here first. George D. Watson (Dendodge).TalkHelp 18:37, 20 March 2008 (UTC)

repeated braces: }}}}}}}}}}}}}}}}}


Repeated curly braces are often used in templates, is there any way to remove them from the list without removing all repeated characters? George D. Watson (Dendodge).TalkHelp 13:54, 21 March 2008 (UTC)[reply]

One more suggestion


I suggest adding "Sieg Heil" to the list of bad words; I fear that some might use it in Israel-related or Nazism-related vandalism. Alexius08 is welcome to talk about his contributions. 01:16, 21 April 2008 (UTC)

Filter for "ard"


This filter is matching parts of ordinary words. Is this in fact a real mark of a vandal? In the meantime, I'm enclosing it in /s so it only matches at word boundaries. --Thinboy00's sockpuppet alternate account 23:33, 9 June 2008 (UTC)[reply]

swallow filter

/swal+ow(a|e[rd]|in[g']?)?[sz]?/

Why? --Thinboy00 @120, i.e. 01:53, 14 June 2008 (UTC)[reply]

Jug?


Which line is blocking "jug"? grammatical error intentional --Thinboy00 @170, i.e. 03:04, 29 June 2008 (UTC)[reply]

jkl;


Can you add "jkl;" to this list? If you mash those keys, people will end up writing "jkl;jkl;jk;ljk;l" or similar, or they give an edit summary of mashed keys. --58.178.142.64 (talk) 13:24, 8 July 2008 (UTC)[reply]

Waca


What specific code is blocking the word Waca? Ctrl+F-ing the code didn't turn anything up, so I'm asking here. I'm not doubting why it's being blocked, but rather how. -- Preceding unsigned comment added by Ilikepie2221 (talkcontribs) 13:37, 30 August 2008

Unable to save


I'm unable to save: I get the error message: "The following link has triggered our spam protection filter: SPELLED OUT BELOW IN PHONETICS:

Dot Oscar November Dot November India Mike Papa Dot Oscar Romeo Golf. What do I need to do?

I want to add these delightful Hindi terms:

/be?hen ?chod/
/bhosdh?i/
/chuth?/
/chod/
/chooth?/
lund
/madar ?chod/
/yon(i|ee|ey)/

=Nichalp «Talk»= 08:57, 7 November 2008 (UTC)[reply]

Thanks :) =Nichalp «Talk»= 16:16, 8 November 2008 (UTC)[reply]

Polish bad words from pl wiki


Here [3] you have the directory of bad words that the anti-vandal bots like "bugbot" running on Polish wikipedia use. Hope they help. Mieciu K (talk) 22:54, 8 November 2008 (UTC)[reply]

"hat"


I don't have the energy to vgrep for the filter that's catching the word "hat". Could someone else do it? --Thinboy00 @002, i.e. 23:03, 7 December 2008 (UTC)[reply]

Fixed, thanks to the beautiful perl script above. --Thinboy00 @009, i.e. 23:13, 7 December 2008 (UTC)[reply]

suggestions


I know some words I've seen. I don't know if they're on the list or not, but they should be: faggot;lying faggot;JEWS DID;SEE, IT'S TRUE;this is all a big lie;boner;SHE'S A;HE'S A;THERE GONNA;LATINOS(when next to another word); and everything on the title blacklist.--Ipatrol (talk) 19:41, 13 December 2008 (UTC)

hmmm


Anything that's a slur should presumably be watched out for. That being said, I consider filters to be a gross violation of WP:NOT. 192.12.88.7 (talk) 02:57, 28 March 2009 (UTC)[reply]

Hmm, any specific part of WP:NOT that you're referring to? Tiggerjay (talk) 05:24, 5 September 2009 (UTC)[reply]

Wake up


I keep finding this filter picking up "wake up" or some variant, and I can't seem to find the filter term to remove it. Could someone tell me what's with the phrase "wake up"? Overthinkingly (talk) 14:23, 5 June 2009 (UTC)

Nazi, heil hitler


These words always appear, and always turn out to be 2nd-world-war articles. Kayau Don't be too CNN I'LL DO MY JOB uprising! uprising! 02:42, 15 February 2010 (UTC)[reply]

Word suggestions


Sorry for not writing out most the translations, but I figure the Wikipedia spamfilter would cause problems since I'm an IP. Put any of these you don't already have into the list:

  • Pajero (Spanish for w nker)
  • Puta (Spanish for wh re)
  • Putana (Italian for wh re)
  • Bîte (French for d ck)
  • Impedido (Spanish for disabled/retarded)
  • Wichser (German for w nker)
  • Wichs... (German for w nk, I would recommend putting wildcards in front of and behind "wichs")
  • Behindi (German pejorative for disabled/retarded)
  • Schwuchtel (German for f ggot)
  • Tunte (German for f ggot)
  • Spasti (German pejorative for someone suffering Tourette's syndrome)
  • Fick... (German version of f ck, I would recommend putting wildcards in front of and behind "fick")
  • Schwanz (German for d ck)
  • Titten (German for bo bs)
  • Möpse (German for bo bs)
  • Möse (German for c nt)
  • Fotze (German for c nt)
  • Muschi (German for c nt)
  • Pimmel (German for d ck)
  • Bumsen (German for f ck)
  • Rammeln (German for f ck)
  • Runterholen (German for w nk)
  • Nutte (German for wh re)
  • Flittchen (German for wh re)
  • Schlampe (German for wh re/sl t)
  • Hure (German for wh re)
  • Hurensohn (German for son of a b tch)
  • Hundesohn (German for son of a dog)
  • Kack... (German for sh t, I would recommend putting wildcards in front of and behind "kack")
  • Scheiß... (German for sh t, I would recommend putting wildcards in front of and behind "scheiß")
  • Scheiss... (German for sh t, I would recommend putting wildcards in front of and behind "scheiss")
  • Diao (Chinese for f ck)
 —Preceding unsigned comment added by 213.168.118.150 (talk) 22:47, 18 February 2010 (UTC)[reply] 

I've got some suggestions too, from Chinese; they've got a lot of attention on HK:

  • 囧 (variants: gwing, jiong, 冏)
  • 不該 (variants: bugai, but goi, bu gai)
  • 升呢 (variants: 平呢, 降呢)

Kayau Voting IS evil 07:27, 21 February 2010 (UTC)[reply]

Suggestion


{{editsemiprotect}} I'm not comfortable editing code. Please add "b!tch". I've caught a couple of these only because the vandalism also included other terms. --N419BH (talk) 16:01, 21 April 2010 (UTC)[reply]

I've seen users use a "-" rather than an "=" in reference to the already existing /8=+(>?D)/ rule Cit helper (talk) 03:15, 10 June 2010 (UTC)[reply]

Not done: Edit request by an autoconfirmed user. SpigotMap 12:36, 10 June 2010 (UTC)[reply]

"Fuck off and die"


/(m[ou]th[ae]r?)?(f+|ph)\W*(u+\W*(kc|[c\(]+\W*k*|c*\W*k+|x)|cuk)(a|in[g']?|e[rd]|y|)?[sz]?(m(e|y)|(yo)?ur?|his|her|it|their|our|each other)?[- ]?(ass|all|ie|y|bitch|erino|head|hole|arse|face|queer|w?it|[o0]r?|off|tard|wad|(yo)?u|me|her|him|them)?(a|e+[rsd]|in[g']?)?(a| hell| and die|him|her|up)?[sz]?/

In the filter above, I see what I think is intended to match "fuck off and die." However, when I test this regex, it's not matched (it only matches up to "fuck o"). I'm not sure what's causing this. Would someone care to point it out for me? Gawaxay (talk contribs count) 20:47, 11 June 2010 (UTC)[reply]

It does match, test it on debugex.com. All the best: Rich Farmbrough, 22:00, 19 July 2015 (UTC).[reply]

added tosser and sraka


I added sraka, a Russian term for a chocolate starfish/ass [4] (though apparently the Slovene term means "Magpie"), and Tosser, a common English variant of "wanker" (... is a tosser) Chaosdruid (talk) 11:48, 28 June 2011 (UTC)[reply]

KFC


Why is KFC a 'bad word'? Could someone care to explain this to me? Skunkman3118 (talk) 09:24, 31 May 2013 (UTC)[reply]

Punctuation false positives galore


Been using this tool for a few days now -- it's great, by the way -- but I wonder if I'm the only one who finds punctuation-based hits to be almost entirely false positives. The various combinations of apostrophes and curly braces seem to fit too many templates and wikitext terms and only turn up actual vandalism in the case of emoticons. Would these be better to include in the spellchecker (as wikimarkup correction, say)? --Rhododendrites (talk) 19:45, 4 June 2013 (UTC)[reply]

"You are a" false positives


When I'm on the tool, "You are a" generates a lot of false positives from AfC talk page messages, namely by saying "Note that because you are a logged-in user, you can create articles yourself, and don't have to post a request. However, you may continue submitting work to Articles for Creation if you prefer". Can someone add an exception or something so that this goes away? Thanks. kikichugirl inquire 21:49, 19 July 2013 (UTC)[reply]

"Rama" false positive


The tool is catching "Rama" as a badword. I do not really understand how this works. Please configure it so that it no longer gives this false positive. Thanks.OrangesRyellow (talk) 12:15, 27 December 2014 (UTC)[reply]

F*cked


I am looking to combine the entries f***, f*cked, and f***ed into a single regex that also covers f**k and any reasonable conjugations of the same. The difficulty lies in encoding the * as a literal rather than markup. My motivation for doing this is that f**k is not covered. –LaundryPizza03 (d) 08:02, 12 June 2019 (UTC)[reply]

@LaundryPizza03: You can always just do something like f\* which will match any occurrence of the letter f followed by an asterisk; otherwise, if you want something that matches the whole word, you can use f(\*)+(cked|ck|k|ed)?, or f(\*|u)+(cked|ck|k|ed)? if you also want it to match non-censored versions, although in that case you have to set it to match word boundaries or else you'll get every article talking about fudge or fuchsia or fungus. If you also want it to catch words where the asterisks are omitted, you can use f(\*|u|ck)+(cked|ck|k|ed)? (also requires word boundary matching). f(\*|u|k|ck|ed)+ does something similar to the previous, but might have slightly more false positives. LittlePuppers (talk) 21:02, 12 June 2019 (UTC)
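
A quick sketch of the second pattern with word boundaries shows one wrinkle: `\b` cannot match after a trailing asterisk, so a fully censored "f***" slips through. The lookahead variant below is one possible workaround and is an assumption of this sketch; whether the tool's rewriting of "(" would let a lookahead through is not confirmed.

```javascript
// LittlePuppers' second pattern, with groups made non-capturing.
const core = "f(?:\\*|u)+(?:cked|ck|k|ed)?";
// Wrapped in word boundaries on both sides.
const wrapped = new RegExp("\\b" + core + "\\b", "i");
// Assumed alternative: end with "not followed by a word character or *".
const lookahead = new RegExp("\\b" + core + "(?![\\w*])", "i");

console.log(wrapped.test("f**k off"));        // true
console.log(wrapped.test("fudge"));           // false
console.log(wrapped.test("what the f***"));   // false: no \b after "*"
console.log(lookahead.test("what the f***")); // true
console.log(lookahead.test("fungus"));        // false
```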