Wikipedia:Overcategorization: Difference between revisions

From Wikipedia, the free encyclopedia



Revision as of 03:17, 5 October 2011

Categorization is a useful tool for grouping articles for ease of navigation and for correlating similar information. However, not every verifiable fact (or the intersection of two or more such facts) in an article requires an associated category. For lengthy articles, categorizing by every such fact could result in hundreds of categories, most of which aren't particularly relevant. This may also make it more difficult to find any particular category for a specific article. Such overcategorization is also known as "category clutter".

To address these concerns, this page lists types of categories that should generally be avoided. Based on existing guidelines and precedent at Wikipedia:Categories for discussion, such categories, if created, are likely to be deleted.

Non-defining characteristics

One of the central goals of the categorization system is to categorize articles by their defining characteristics. Categorization by non-defining characteristics should be avoided. It is sometimes difficult to know whether or not a particular characteristic is "defining" for any given topic, and there is no one definition that can apply to all situations. However, the following suggestions or rules-of-thumb may be helpful:

  • if the characteristic would not be appropriate to mention in the lead portion of an article, it is probably not defining;
  • if the characteristic falls within any of the forms of overcategorization mentioned on this page, it is probably not defining.

Often, users can become confused between the standards of notability, verifiability, and "definingness". Notability is the test that is used to determine if a topic should have its own article. This test, combined with the test of verifiability, is used to determine if particular information should be included in an article about a topic. Definingness is the test that is used to determine if a category should be created for a particular attribute of a topic. In general, it is much easier to verifiably demonstrate that a particular characteristic is notable than to prove that it is a defining characteristic of the topic. In cases where a particular attribute about a topic is verifiable and notable but not defining, or where doubt exists, creation of a list article is often the preferred alternative.

In disputed cases, the categories for discussion process may be used to determine whether a particular characteristic is defining or not.

Trivial characteristics

Examples: Bald People, Famous redheads, Age of death

Avoid categorizing topics by characteristics that are wholly peripheral to the topic's notability. For biographical articles, it is usual to categorize by such aspects as the person's career, origins, and major accomplishments. In contrast, someone's tastes in food, their favorite holiday destination, or the number of tattoos they have would be considered trivial. Such things may be interesting information to include in an article, but they are not useful for categorization. If something could easily be left out of a biography, it is likely a trivial characteristic.

Note that this form of overcategorization also applies to grouping people by trivial circumstances of their deaths, such as categorizing people by the age at which they died or the place of death or by whether they still had unreleased or unpublished work at the time of their death. Even though such categories may be interesting to some people, they aren't particularly encyclopedic.

Opinion about a question or issue

Examples: Cat lovers, Iraq liberation opposition, Star Trek fans

Avoid categorizing people by their personal opinions, even if a reliable source can be found for the opinions. This includes supporters or critics of an issue, personal preferences (such as liking or disliking green beans), and opinions or allegations about the person by other people (e.g. "alleged criminals"). Please note, however, the distinction between holding an opinion and being an activist, the latter of which may be a defining characteristic (see Category:Activists).

Subjective inclusion criterion

Examples: Obese people, Cult actors, Mysterious musicians, Outstanding Canadians

Adjectives that imply a subjective inclusion criterion should not be used in naming or defining a category. Examples include such subjective words as: famous, notable, great, etc.; any reference to size: large, small, tall, short, etc.; or distance: near, far, etc.; or character trait: beautiful, evil, friendly, greedy, honest, intelligent, old, popular, ugly, young, etc.

Arbitrary inclusion criterion

Examples: School districts at the top 7% on Pennsylvania standardized tests, Locations with incomes over $30,000, Category:100th episodes

There is no particular reason for choosing "7%", "$30,000", or the 100th episode as cutoff points in these cases. Likewise, a district with 3,800 students is not meaningfully different from one with 4,100 students. A better way of representing this kind of information is to put it in an article such as "List of school districts in (region) by size". Note that Wikipedia allows a table to be made sortable by any column.

An exception to this is categorizing by year, since making a category for each year is not arbitrary.

Trivial intersection

Examples: Celebrity Gamers, Red haired kings

Avoid intersections of two traits that are unrelated, even if some person can be found who has both traits. For example, celebrities are usually notable for reasons other than being gamers.

Intersection by location

Examples: Roman Catholic Bishops from Ohio, Quarterbacks from Louisiana, Male models from Dallas

Geographical boundaries may be useful for dividing subjects into regions that are directly related to the subjects' characteristics (for example, Roman Catholic Bishops of the Diocese of Columbus, Ohio or New Orleans Saints quarterbacks).

In general, avoid subcategorizing subjects by geographical boundary if that boundary does not have any relevant bearing on the subjects' other characteristics. For example, quarterbacks' careers are not defined by the specific state that they once lived in (unless they played for a team within that state).

However, location may be used as a way to split a large category into subcategories. For example, Category:American writers by state.

Non-notable intersections by ethnicity, religion, or sexual orientation

Examples: Jewish mathematicians, LGBT murderers, Sportspeople by religion

Dedicated group-subject subcategories, such as Category:LGBT writers or Category:African American musicians, should only be created where that combination is itself recognized as a distinct and unique cultural topic in its own right. If a substantial and encyclopedic head article (not just a list) cannot be written for such a category, then the category should not be created. Please note that this does not mean that the head article must already exist before a category may be created, but that it must at least be reasonable to create one.

Likewise, people should only be categorized by ethnicity or religion if this has significant bearing on their career. For instance, in sports, a Roman Catholic athlete is not treated differently from a Lutheran or Methodist. Similarly, in criminology, a person's actions are more important than their race or sexual orientation. While "LGBT literature" is a specific genre and useful categorisation, "LGBT quantum physics" is not.

Narrow intersection

Example: Pre-1933 two-digit Virginia state highways

If an article is in "category A" and "category B", it does not follow that a "category A and B" has to be created for this article. Such intersections tend to be very narrow, and clutter up the page's category list. Even worse, an article in categories A, B and C might be put in four such categories "A and B", "B and C", "A and C" as well as "A, B and C", which clearly isn't helpful.
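The count of four in the A, B and C case can be checked by enumerating every combination of two or more parent categories. The following is only an illustrative sketch: the function name `intersection_categories` and the placeholder labels are assumptions made for this example, not part of any Wikipedia tool.

```python
from itertools import combinations


def intersection_categories(categories):
    """Return every possible intersection of two or more of the given categories."""
    names = sorted(categories)
    return [
        " and ".join(combo)
        for size in range(2, len(names) + 1)
        for combo in combinations(names, size)
    ]


# An article in categories A, B and C has four possible intersection categories.
for name in intersection_categories(["A", "B", "C"]):
    print(name)

# The number of possible intersections grows as 2**n - n - 1
# for n parent categories (hypothetical labels C0, C1, ...).
for n in range(2, 7):
    labels = [f"C{i}" for i in range(n)]
    print(n, len(intersection_categories(labels)))
```

With five parent categories there are already 26 possible intersections, which illustrates how quickly such categories could clutter a page's category list.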

In general, intersection categories should only be created when both parent categories are very large and similar intersections can be made for related categories.

Small with no potential for growth

Examples: The Beatles' wives, Husbands of Elizabeth Taylor, Catalan-speaking countries

Avoid categories that, by their very definition, will never have more than a few members, unless such categories are part of a large overall accepted sub-categorization scheme, such as subdividing songs in Category:Songs by artist or flags in Category:Flags by country.

Note that this exception does not apply to eponymous categories: one should not create entire categorization schemes whose categories each contain only a few articles. For instance, it would be inappropriate to create eponymous categories for each individual in Category:Alternative rock musicians, each containing only one or two articles, and then invoke the exception for a larger scheme.

Mostly overlapping categories

Examples: 1971 National League All-Stars, 1852 religious leaders

If two or more categories have a large overlap (e.g. because many athletes participate in multiple all-star games, and religious leadership does not radically change from year to year), it is generally better to merge the subjects into a single category and create lists to detail the multiple instances.

Unrelated subjects with shared names

Examples: Ice-named rappers, Churches named for St. Dunstan

Avoid categorising by a subject's name when it is a non-defining characteristic of the subject, or by characteristics of the name rather than the subject itself. For example, a category for unrelated people who happen to be named "Jackson" is not useful. However, a category may be useful if the people, objects, or places are directly related—for example, a category grouping subarticles directly related to a specific Jackson family, such as Category:Jackson musical family.

Eponymous categories for people

Examples: John Wayne, Barbra Streisand, ZZ Top, Eponymous fashion model categories, Sports broadcasting families.

In general, avoid creating categories named after individual people or groupings of people (such as families or musical groups). Articles directly related to the subject (which would thus be potential members of such categories) are typically already linked from the eponymous article in question. If these links are not present, they should be added before proposing such a category for deletion. Sometimes, renaming the category to reflect the topic, rather than the person, is a good alternative to deletion. Category:Shakespearean scholarship and Category:Tolkien studies are two such examples. Note that articles on works, etc., by the person can be placed in categories like Category:Novels by Agatha Christie.

However, there are sometimes good reasons to have an eponymous category. Most examples are either collections of subarticles (see Wikipedia:Summary style) or collections of articles on a topic about the named person. Category:William Shakespeare and Category:J. R. R. Tolkien (sub-categories of which were noted as examples above) are two such examples. Another example is Category:Alexander the Great, which includes subarticles as well as topic articles such as Alexander (film), Alexander Mosaic, Alexander Romance, Alexander in the Qur'an, Alexander the Great (1956 film), and Alexander the Great (song).

Candidates and nominees

Example: Potential 2008 Republican U.S. Presidential Candidates (deleted in November 2006)

Wikipedia is not a crystal ball. A candidate for public office, the possible next CEO of a certain corporation, a potential member of a sports team, an actor on the "short list" to play a role, or an award nominee (just to name a few examples) should not be grouped by category. Lists may sometimes be appropriate for such groupings, especially after the passage of the events to which they relate.

Award recipients

Examples: Category:MTV Movie Award winners, Category:Honorary citizens of Berlin, Category:People who have received honorary degrees from Harvard University

People can and do receive awards and/or honors throughout their lives. In general (though there are a few exceptions to this), recipients of an award should be grouped in a list rather than a category.

Exceptions include Category:Nobel laureates and Category:Academy Award winners. See also Category:Award winners.

Published list

Example: Rolling Stone's 500 Greatest Albums

Magazines and books regularly publish lists of the "top 10" (or some other number) in any particular field. Such lists tend to be subjective and somewhat arbitrary. Some particularly well-known and unique lists such as the Billboard charts may constitute exceptions, although creating categories for them may risk violating the publisher's copyright or trademark.

Venues by event

Examples: WrestleMania venues, Republican National Convention venues, Democratic National Convention venues

There is no encyclopedic value in categorizing locations by the events or event types that have been held there, such as arenas that have hosted specific sports events or concerts, convention centers that have hosted specific conventions or meetings, or cities featured in specific television shows that film at multiple locations.

Likewise, avoid categorizing events by their hosting locations. Many notable locations (e.g. Madison Square Garden) have hosted so many sports events and conventions over time that categories listing all such events would not be readable.

However, categories that indicate how a specific facility is regularly used in a specific and notable way for some or all of the year (such as Category:National Basketball Association venues) may sometimes be appropriate.

See also #Performers by performance venue.

Performers by performance

Avoid categorizing performers by their performances. Examples of "performers" include (but are not limited to) actors/actresses (including pornographic actors), comedians, dancers, models, orators, singers, etc.

Performers by action or appearance

Examples: Actresses who have appeared veiled, Anal porn actress, Musicians who play left-handed, Saxophonists who are capable of circular breathing

Avoid categorising performers by some action they may have performed (such as a "pirouette", a "runway walk", a "spit take", a "pratfall", a "sword fight", "anal sex", etc.); some method of performance (such as while standing on their head, left-handed, etc.); or how they may have chosen to appear (such as bald, veiled, etc.).

Performers by role or composition

  • Performers who have portrayed <character name>
  • Performers who have portrayed <a type of character>
  • Performers who have performed <a specific work>

Examples: Fictional characters by actor and subcategories, American dramatic actors, Actors that portrayed heroes or villains, Jim Steinman artists, Actors & Actresses who portrayed, Actors who have played serial killers, Actors who have played gay characters, Actors who played HIV-positive characters, and Actors who have played the President of the United States.

Avoid categories which categorise performers by their portrayal of a role. This includes portraying a specific character (such as Darth Vader, or Hamlet). This also includes voicing animated characters (such as Donald Duck), or doing "impressions"; portraying a "type" of character (such as wealthy, poor, religious, homeless, gay, female, politician, Scottish, dead, etc.); or performing a specific work (such as Amazing Grace, "Waltz of the swans" from Swan Lake, "To be or not to be" from Hamlet (the play), "Why did the chicken cross the road?" (a joke), etc.).

Similarly, avoid categorizing artists based on producers, film directors or other artists they have worked with (such as "George Martin musicians" or "Steven Spielberg actors"). Performers are defined by their body of work, not by the people they have associated with professionally. For example, Tom Hanks is distinguished by his performances as an actor, not by the fact that he has appeared in Steven Spielberg's films.

Performers by performance venue

Examples: Artists who played Coachella, Saturday Night Live musical guests, Ozzfest performers, Celebrity Poker Showdown players, Entertainers who performed for troops during the Vietnam War, and Actors by series

Avoid categorising performers by an appearance at an event or other performance venue. This also includes categorization by performance in any specific radio, television, film, or theatrical production (such as The Jack Benny Program, M*A*S*H, Star Wars, or Phantom of the Opera).

Note also that performers should not be categorized into a general category which groups topics about a particular performance venue or production (e.g. Category:Star Trek), when the specific performance category would be deleted (e.g. Category:Star Trek script writers).

See also #Venues by event.

People associated with

Examples: People associated with John McCain, People associated with Pope Pius XI, People associated with Madonna, People associated with the hippie movement

The problem with vaguely named categories such as these is determining what degree or nature of "association" is necessary to qualify a person for inclusion in the category. The inclusion criteria for these "associated with X" categories are usually left unstated, which fails WP:OC#SUBJECTIVE; but applying some threshold of association fails WP:OC#ARBITRARY.

However, it may be appropriate to have categories whose title clearly conveys a specific and defined relationship to another person, such as Category:Obama family or Category:Obama Administration personnel.

Miscellaneous categories

Examples: People of the Moravian Church miscellaneous, Brass bands of other countries

Do not categorize articles into "miscellaneous", "other", "not otherwise specified" or "remainder" categories. It is not necessary to completely empty every parent category into its subcategories. If there are some articles that don't fit appropriately into any of the standard subcategories, leave the articles in the parent category. The articles categorized together as "other" or "miscellaneous" generally will have little in common and therefore should not be categorized together in a dedicated "miscellaneous" category.

See also