Talk:Smith set

Mistakes, May 2006

The article currently begins:

In voting systems, the Smith set is the smallest set of candidates in a particular election such that each member beats every other candidate outside the set in a pairwise election.

The smallest such set is clearly the empty set. Does the writer mean the smallest non-empty set? The largest non-universal set? What?

The article also states that

If we contract each of these cliques to a single vertex representing a set of candidates, we have a directed acyclic graph, which necessarily has a vertex with zero in-degree, and the set of candidates this vertex corresponds to is the Smith set.

The antecedent of "this vertex" is not clear. After all, it is quite possible for a directed acyclic graph to have more than one vertex with zero in-degree. Can an election have more than one Smith set? If not, then we need either an algorithm for deciding which vertex corresponds to the Smith set, or a proof that this particular graph has only one vertex with zero in-degree. Saying it's directed and acyclic is not enough.

I hope someone can clean up this article; I know just enough about the subject to know it's wrong, not enough to fix it. :) --Quuxplusone 00:06, 9 May 2006 (UTC)[reply]

Ah, I figured out what was intended. Suppose we have two sources (vertices with zero in-degree). These sources are necessarily part of the same clique, and should be contracted together. This could happen to everything (in which case the DAG is just a point), which corresponds to the case that the Smith set is the set of all candidates (A>B>C>A, for example). I don't think I'll return that the article, though; that's more confusing than it's worth. CRGreathouse 22:34, 21 July 2006 (UTC)[reply]

It should be the smallest non-empty set. The last paragraph that tries to "clearly" demonstrate that the Smith set always exists by showing how to construct it is wrong, it is unreferenced, wasn't first properly reviewed, and certainly isn't clear. But it is fixable. For now I'm deleting it, but will come back with fixes in 3-6 weeks with a revision for that content. --DCary 15:50, 3 July 2006 (UTC)[reply]

Ok, 3-4 months, not weeks, later I finally made my changes. The paragraph I deleted earlier at least triggered a look at using Kosaraju's algorithm to calculate both the Smith and Schwartz sets faster than Floyd-Warshall.

With this change I deleted the problematic paragraph on clones. It had problems with accuracy ("2 members is impossible"), vagueness and/or consistency ("clone-like (strictly worse than one of the other 3)"), and NPOV ("appear deceptively large"). Someone may want to try to fix that paragraph, but it didn't seem worthwhile to me. DCary 21:17, 18 October 2006 (UTC)[reply]

Smith criterion vs Smith set

The Smith criterion article has essentially all of the information here, and a fair amount not in this article. I think we should delete this article and redirect to the other. CRGreathouse 03:55, 20 July 2006 (UTC)[reply]

I plan to add some material about the Smith set, and also about the Schwartz set in its article. It is taking me longer than I had expected to get it prepared. Now I'm estimating another 1-3 months. It might be worthwhile to defer the decision about deleting/merging this article until then. DCary 22:02, 20 July 2006 (UTC)[reply]

I would have thought it better to add it all in one place, since I can't imagine the conceptual difference between Smith set and Smith criterion, but very well. I withdraw my suggestion until that time. In any case, Schwartz set could certainly use some work! Thanks. CRGreathouse 23:35, 20 July 2006 (UTC)[reply]

Thanks for waiting. I've added my material and cleaned up the article for what I think would be a good separate article. If we keep them separate, some of the redundancy in the Smith Criterion article could be cleaned out. I'm neutral on merging this into the Smith Criterion. I see some value in keeping the two separate, one focused on the definition and structure of the Smith set, the other looking at the interrelation of the Smith criterion to other criterion. But I also recognize some value to merging this into the Smith criterion article. I'll volunteer to do the merge if there is still some interest in that and no objections otherwise. Or if someone else wants to do it, go for it. DCary 21:17, 18 October 2006 (UTC)[reply]

Original terminology

What did Smith call the Smith set originally (say, in his 1973 paper)? CRGreathouse 22:10, 21 July 2006 (UTC)[reply]

Heh, that's funny -- Smith's paper doesn't actually give the so-called "Smith criterion" at all, but a stronger condition he calls the "Condorcet criterion" (p. 1038, although he acknowledges that it's a generalization of Condorcet's original). CRGreathouse (talk | contribs) 04:06, 16 August 2006 (UTC)[reply]

As far as I know, the terms "Smith set" and "Schwartz set" have been introduced by Fishburn in his paper "Condorcet Social Choice Functions" (SIAM Journal on Applied Mathematics, vol. 33, p. 469--489, 1977). Markus Schulze 13:10, 16 August 2006 (UTC)[reply]

Thanks! I actually have that one, but I haven't gotten around to reading it yet. CRGreathouse (talk | contribs) 16:22, 17 August 2006 (UTC)[reply]

OK, I read the paper. Fishburn does use the term (actually "Smith's Condorcet Principle"), but he uses it in the same strong sense as Smith. CRGreathouse (talk | contribs) 04:51, 18 August 2006 (UTC)[reply]

That will be great in the article. --Homunq 22:53, 18 August 2006 (UTC)[reply]

Article omissions

The article uses the jargon term "beat" without defining it, and never mentions majorities. It neglects the synonyms "top cycle" and "GETCHA set" used in some of the literature of social choice theory. The article doesn't include any helpful examples. (I suggest a 3-candidate example in which Instant Runoff defeats a Condorcet winner and a 4-candidate example in which Minmax defeats the entire Smith set.) The article could compare the Smith set to other subsets of the Smith set besides the Schwartz set, such as "uncovered" sets and the (Jeffrey) Banks Set (which is the set of alternatives that can win assuming strategically sophisticated voting using the sequential pairwise elimination voting method of Robert's Rules) as in the paper The Banks Set and the Uncovered Set in Budget Allocation Problems by Dutta, Jackson and Le Breton. And it would help if the article distinguishes between the sincere Smith set and the voted Smith set, since electing a candidate in the sincere Smith set is considered more desirable than electing a candidate in a set manipulated by strategic voting. SEppley (talk) 15:36, 22 November 2012 (UTC)[reply]

Even Google can't tell you what GETCHA and GOCHA stand for. John Moser (talk) 13:39, 28 August 2018 (UTC)[reply]

Algorithms

The article has always mentioned some (graph-theoretic?) algorithms which can be used to find the Smith set, all with quadratic or cubic work factors. Recently GreekApple123 added a direct algorithm starting from the Copeland score, whose work factor so far as I can see is quadratic. I made GreekApple’s account more precise, relabelling the candidates to make the need for a sorting step explicit; and I relegated mention of the alternatives, which don’t seem to serve any real purpose.

At the same time, boldly/rashly, I deleted the short section on ‘alternative formulations’ since I couldn’t see the interest of it. Maybe it’s linked from somewhere else... but I hope I haven’t introduced any errors. Colin.champion (talk) 20:08, 1 March 2021 (UTC)[reply]

I put in a snippet of rather telegraphic C code to illustrate calculation of the Smith set by the direct quadratic algorithm. It’s not unknown for programmers to make errors in this sort of thing. Here’s a simple-minded equivalent (cubic rather than quadratic in time, but does anyone care?).

 int smith(int **r,int n)
 { int i,j,m,all0 ; 
   for(m=1;m<n;m++)
   { for(all0=1,i=m;i<n;i++) for(j=0;j<m;j++) if(r[i][j]) all0 = 0 ; 
     if(all0) return m ; 
   }
   return n ; 
 }

I ran a validation comparing results of the two versions on 100 million randomly generated results matrices, which took 20s; there were no discrepancies. Colin.champion (talk) 14:43, 6 March 2021 (UTC)[reply]

Algorithms again

@Bluefoxicy: has added a ‘hand algorithm’ to the ‘Copeland score algorithm’ previously present. (I’m afraid I missed this change at the time through not receiving a notification.) I can see his reasoning but I think his addition is misleading: he has provided a vaguer and more digestible description of a slight variant of the Copeland method rather than a separate algorithm.

The algorithm works because of a property proved earlier in the article: dominating sets are nested by Copeland score. It follows that by adjusting the Copeland threshold you can work through the nested sets in increasing order of size until you come to a dominating set; and this set is necessarily the Smith set. There’s a nice description at the top of p 10 of Darlington’s Are Condorcet and minimax voting systems the best? (v8).

You can avoid mentioning the Copeland score by using other statistics which work as well; but unless you prove the necessary property, you’ve reduced the intelligibility of the algorithm. And you can avoid sorting an array by scanning it repeatedly rather than working through it once sequentially; but this is no real gain either.

Now it seems to me that the ‘Copeland algorithm’ goes into so much detail that it conceals the simplicity of the underlying idea – and still doesn’t make clear the connection with the logical properties proven earlier. I wanted to go into detail (a) so that the reader would feel confident that he or she could implement the algorithm by hand or software, and (b) to make its quadratic cost explicit to avoid the need to discuss alternatives on efficiency grounds.

So I suggest that rather than offering two supposedly different algorithms, the Copeland algorithm should be preceded by a short description explaining the principles, and saying that a precise algorithmic specification follows. And I propose deleting reference to fancy algorithms such as Floyd-Marshall which don’t add anything. Colin.champion (talk) 08:43, 25 June 2021 (UTC)[reply]

I have done this. If my explanation is too short, by all means amplify it. Colin.champion (talk) 09:50, 1 July 2021 (UTC)[reply]

Merge into Smith set

"Smith set, but without counting ties" doesn't seem to be notable enough to deserve its own wiki article. –Maximum Limelihood Estimator 03:38, 23 April 2024 (UTC)[reply]

Done. – Closed Limelike Curves (talk) 02:04, 24 October 2024 (UTC)[reply]

Rename to top-cycle?

"Top cycle" is a more descriptive and intuitive name. I'd like to propose renaming this article to top-cycle. – Closed Limelike Curves (talk) 02:05, 24 October 2024 (UTC)[reply]

Which one is more commonly recognized? If you can show that there is a majority of reliable sources that refer to the concept as the term top-cycle instead of Smith set, then sure, I suppose, but otherwise I'd rather keep as is. 180 Degree Open Angedre (talk) 04:11, 24 October 2024 (UTC)[reply]

Which one is more commonly recognized?

Like most terms in social choice: 100% recognition from experts, 0% recognition from normal people. Both have substantial use but very little in the way of public recognition.

The reason I suggest "Top-cycle" is it's going to be more useful and memorable to a normal person, or someone with only a passing familiarity with social choice, to remember or understand that name. "Top-cycle" is self-explanatory (it's the top Condorcet cycle), whereas "Smith" is not a very descriptive name. ~~Although really, all of this is just covering up how we missed a huge opportunity to call it the Condor-set.~~ – Closed Limelike Curves (talk) 16:15, 25 October 2024 (UTC)[reply]

180 asks for reliable sources that use top-cycle. You say "well, it makes sense to normal people". This is not how we do things here. If all the reliable sources use "Smith set", that is the appropriate name for the article. --SarekOfVulcan (talk) 16:24, 25 October 2024 (UTC)[reply]

They asked for "a majority of reliable sources"—if you want reliable sources that use this name, I can provide plenty of them. (A quick Google scholar search confirms that it's a very common name.) However, I cannot demonstrate (per @180 Degree Open Angedre's request) that these make up more than half of all references to the top-cycle. – Closed Limelike Curves (talk) 16:36, 25 October 2024 (UTC)[reply]

I don't make page moves unless there are lots of reliable sources using a name. However, in many cases there are multiple names for the same thing that are commonly-used in the literature. Are we not allowed to include "this term makes more sense to normal people" as a consideration at all, when choosing between multiple similarly-notable names? That would certainly have changed my behavior with regard to most of the moves I've made, since generally that's the justification I've used—in all these situations, the page move was from one common name in the literature to another, similarly-common name that I think is more intuitive or memorable to the average person. – Closed Limelike Curves (talk) 16:47, 25 October 2024 (UTC)[reply]

(edit conflict) Ok, that's useful. https://onlinelibrary.wiley.com/doi/full/10.3982/TE5120 says: The top cycle has been reinvented several times and is known under various names such as Good set (Good (1971)), Smith set (Smith (1973)), weak closure maximality (Sen (1977)), and GETCHA (Schwartz (1986)). All the other sources refer to the top cycle, but without drawing the equivalence to the Smith set. If I search for "smith set" voting, I get some 1700 results, roughly comparable to your 2500 results. However, if I search "top cycle" voting or voters, I only get about 800 results. So, I'm not seeing that it's clear enough to move a long-established article.--SarekOfVulcan (talk) 17:03, 25 October 2024 (UTC)[reply]

I am speaking slightly informally here according to my own experience & familiarity with the literature, and don't intend to go dredge up a bunch of papers, but my take is this: many voting algorithms are special cases of graph algorithms, and the two fields (social choice vs graph theory) are liable to 1. reinvent things many times and 2. not necessarily use the same terminology for the same construct. Top-cycle is more the graph theory term (although idt this is actually discussed much in that field; I suppose there is the Closure problem referring to the same thing) and Smith set is more the social choice term. Given that this article is clearly more geared toward the latter demographic I would suggest retaining the name as Smith set as that is the most common I have seen

there is some discussion in https://arxiv.org/abs/2004.02350 on subtleties and terminology, particularly section 5.2.3 and appendix C.6 Affinepplan (talk) 17:47, 25 October 2024 (UTC)[reply]

Did you mean the other way around? Searching for the exact term "Smith set" on Google Scholar, I get many references to some kind of Smith set from group theory, but nothing about social choice. (Which actually suggests moving this just to avoid ambiguity with this other, unrelated "Smith set".) On the other hand, the search query ""top-cycle" set" immediately shows 10 results specifically using the term in social choice/economics. – Closed Limelike Curves (talk) 18:06, 25 October 2024 (UTC)[reply]

both can be used I suppose. the terms are pretty interchangeable anyway and I do/did not mean to imply there is a strong dichotomoy. but no, I did not mean the other way around.

Smith set is named after John Smith, from

John H. Smith. Aggregation of preferences with variable electorate. Econometrica, 41(6):1027–1041, 1973.doi:10.2307/1914033.

which of course is a social choice paper about elections published in an economics journal. not a graph theory paper published in a combinatorics journal.

in any case, there is certainly not grounds to change the title of this article. Affinepplan (talk) 18:22, 25 October 2024 (UTC)[reply]

@SarekOfVulcan: based on this apparently-unrelated Smith set from group theory, we might want to look for alternative titles. If there's complaints about top-cycle, something like "Smith set (voting)" works. – Closed Limelike Curves (talk) 19:34, 25 October 2024 (UTC)[reply]

Why? If there's no article, there's no need to disambiguate. And if this turns out to be primary, there's no need to move. SarekOfVulcan (talk) 19:43, 25 October 2024 (UTC)[reply]

I seriously seriously seriously doubt that anybody will ever get confused between Smith sets of finite groups and the Smith set of this article. I don't see any need at this time to disambiguate, especially given that the former does not even have a wikipage. Affinepplan (talk) 19:47, 25 October 2024 (UTC)[reply]

Actually, scratch that. Based on this article's longstanding use of the title "Smith set", I assumed "top-cycle" and "Smith set" would be similarly-popular, even though I'd personally only ever come across "Top-cycle" and, every once in a while, "Schwartz set". But it seems that describing this as the Smith set is genuinely rare in reliable sources? – Closed Limelike Curves (talk) 17:02, 25 October 2024 (UTC)[reply]

it is not rare. Affinepplan (talk) 17:48, 25 October 2024 (UTC)[reply]

For what it's worth, a Google Scholar search for "Smith set" voting gives 1750 results. A search for "top cycle" voting returns 776 results. While "top cycle" is more likely to return a paper about voting, "Smith set" is more often used in voting method papers as such. Searching for just "top cycle" also returns physics papers about thermodynamic cycles, so I wouldn't say the presence of unrelated papers singles out "Smith set" as needing disambiguation. I don't think there's a need to change the page name when there's already a redirect from top cycle to it. Wotwotwoot (talk) 20:00, 25 October 2024 (UTC)[reply]