Talk:Type punning
Dispute
The C standard includes the following footnote: "If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”). This might be a trap representation" (C17, footnote 97). So type punning through unions is supposed to be legal, and the footnote has survived multiple revisions of the standard. In contrast, the section of the Annex (which was only ever informative, not normative) that lists type punning through unions as unspecified behaviour was updated in C11: as already quoted in the article, it now reads "The values of bytes that correspond to union members other than the one last stored into" instead of "The value of a union member other than the last one stored into". The reasoning here is that only the bytes that do not simultaneously belong to both union members (the one last stored into, and the one used for access) take unspecified values. The behaviour only becomes unspecified if the storing member is too short (or if you create a trap representation). 92.78.25.98 (talk) 11:49, 24 January 2022 (UTC)
- Broadly agreed. Footnotes are also only informative, though, so it is rather unfortunate that the authors of the standard chose a footnote to state their apparent intent explicitly. Ewx (talk) 22:50, 26 January 2022 (UTC)
- Fast inverse square root § Avoiding undefined behavior claims that type punning with union is legal C but undefined behavior in C++. If that's wrong, it should be updated to be consistent with this page's section Type punning § Use of union. 2001:56A:7A59:1E00:4D4:BE1C:EBC0:21E8 (talk) 05:01, 27 October 2022 (UTC)
- Agreed that footnotes and Annex J are both informative, not normative, but there is (since C11) no conflict here between the annex, the footnote, and the normative text. That leaves no plausible reason to interpret the standard differently than the footnote indicates. 2601:3C1:100:B80:5DC6:142A:8C44:7BC7 (talk) 16:45, 4 December 2022 (UTC)
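For anyone following the thread, the reading that the footnote describes can be illustrated with a minimal C sketch. It is not taken from the article. The helper name `float_bits` is hypothetical, and the expected bit patterns assume that `float` is an IEEE 754 binary32 value with the same size and byte order as `uint32_t`, which the standard itself does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float and uint32_t have the same size on the target. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Hypothetical helper: return the object representation of a float as a uint32_t. */
uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;

    pun.f = f;     /* store through one member */
    return pun.u;  /* read through the other: the same bytes, reinterpreted */
}
```

Under those assumptions `float_bits(1.0f)` yields `0x3F800000`. Both members have the same size, so no bytes are left with unspecified values, and `uint32_t` has no padding bits, so the read cannot produce a trap representation.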
GNU C strict-aliasing option
The -fstrict-aliasing option explicitly does not defeat type punning directly via unions; see the GCC documentation for -fstrict-aliasing. Ewx 10:14, 30 August 2007 (UTC)
Missing underscore
It's true, it doesn't show up in IE, even though other underscores do. How bizarre. Anyone got any ideas? Preceding unsigned comment added by Ewx (talk • contribs) 08:36, 6 September 2007 (UTC)
Unspecified behavior when reading union members
"For example, reading from a different union member than the last one written invokes unspecified behavior,[1] but the effect in practice is usually to permit type punning."
I believe this is a misunderstanding. The behaviour is unspecified only because the representations of the members are unspecified [edit: actually implementation-defined]. As Clive Feather notes in DR257, "... one of the changes from C90 to C99 was to remove any restriction on accessing one member of a union when the last store was to a different one. The rationale was that the behaviour would then depend on the representations of the values." In other words, this is required to allow type punning -- or at least was intended to when the change was made. Also see DR283, which would clarify this point.
(There are actually some good reasons why the standard shouldn't have allowed type punning via unions -- in particular, that would make it easier to produce a memory-safe implementation of Standard C. Note that punning via arrays of character type has "always" been allowed.)
Also, the article should not be citing a committee draft of C99 TC2; it should be citing the published version. David-Sarah Hopwood (talk) 16:41, 11 January 2009 (UTC)
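The punning via arrays of character type mentioned above can be sketched as follows. This is an illustration only, not code from the article: the helper name `float_bits_via_bytes` is hypothetical, and the sketch assumes that `float` and `uint32_t` have the same size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on the target. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Hypothetical helper: reinterpret a float's bytes by going through unsigned char. */
uint32_t float_bits_via_bytes(float f)
{
    unsigned char bytes[sizeof f];
    uint32_t u;

    memcpy(bytes, &f, sizeof f);  /* copy the object representation out as bytes */
    memcpy(&u, bytes, sizeof u);  /* copy the same bytes into an object of the other type */
    return u;
}
```

No union is involved here, so the question of which member was last stored never arises. The result still depends on the representations of the two types, which is the point made in DR257.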
I agree. The text as it stands is atrocious. Annex J is non-normative and at least in this instance misleading. Assuming nobody comes up with a convincing counterargument to any of the above I'll try to update the text to something more accurate. Ewx (talk) 18:11, 18 July 2011 (UTC)
- I have now done so. Language lawyers might want to check if I've got it right. Ewx (talk) 10:24, 13 August 2011 (UTC)
- Yep, that's fine (if we allow casting through unions; strictly speaking I think that's undefined as well, though every implementation I know of allows it even with strict aliasing), although it may not result in the most performant code. C99-style initializers would be better (at least gcc 4.0 generates suboptimal code there; see "Casting through a union (1)" here for those without one at hand [1]), so I changed that. As far as I can see, the sockets example violates the strict aliasing rules as well. While I'm not familiar with the library in question, the standard is clear that pointers to aggregate types with differing tags do not alias. So if one struct isn't just a typedef of the other (unlikely), that's illegal as well and should at least be mentioned (not that I'm comfortable with listing code that's undefined according to the standard anyhow, even if it may work). Voo42 (talk) 21:40, 14 August 2011 (UTC)
- The article is about type-punning, not about cycle-stealing for a particular implementation, so attempting to optimize it is, quite frankly, a rather bizarre thing to do. As it happens, recent GCC generates identical code for both approaches. And the union approach, making the assumptions given in the article, does not yield undefined behaviour; that is what this whole thread is about. Ewx (talk) 07:37, 15 August 2011 (UTC)
The article's discussion of type-punning through a union entirely misses the point when it diverges into the strict-aliasing rule (§ 6.5/7 of the C language spec). The definition of unions (paragraph 6.2.5/20 in C17) and the rules for lvalue conversion (paragraph 6.3.2.1/2 in C17) always provide for accessing any member of an existing union object. The question was never whether accessing a different member than was last written is permitted, but rather whether it has the intended type-punning effect (or indeed, any specified effect at all). However, the intended interpretation has long been indicated by a footnote to the specifications for union member access via the . and -> operators. In C17 this is footnote 97, which says: "If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called 'type punning'). This might be a trap representation." That gets a bit hairy if the last-written member is smaller than the one being read, but that's not what the article is talking about. — Preceding unsigned comment added by 2601:3C1:100:B80:5DC6:142A:8C44:7BC7 (talk) 14:22, 4 December 2022 (UTC)
Non-Type Punning?
In the tutorial for Google's Go programming language, the author refers to a common static array declaration in C:
int foo[3];
...as being "punned". Specifically, the author states that the "index value" (int literal) is "punned" when used as the dimension of the array (also int literal). This definition clearly differs from type-punning, as the types are the same. I'd consider this "overloaded syntax", but I hadn't heard of non-type punning before seeing this.
Is this actually a common usage of "pun" in CS? If so, I think it should be added to the page (and if common enough, perhaps move the page itself from Type punning to Punning). TricksterWolf (talk) 04:10, 14 November 2013 (UTC)
- It's true that there's a correspondence between the syntax of declarators and expressions in C. It seems like something quite different from the contents of this article though. At any rate if it's a widespread usage of "pun" then surely it will be straightforward to find lots of examples of it.Ewx (talk) 09:22, 14 November 2013 (UTC)
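For reference, the correspondence between declarators and expressions mentioned in the reply above can be sketched in C as below. This is only an illustration of "declaration mirrors use", not a claim about what the Go tutorial meant. The function name `declaration_mirrors_use` and the variables `ptrs` and `whole` are hypothetical.

```c
#include <assert.h>

/* Hypothetical function; it exists only to exercise the declarations. */
int declaration_mirrors_use(void)
{
    int foo[3] = { 10, 20, 30 };                   /* the expression foo[i] is an int */
    int *ptrs[3] = { &foo[0], &foo[1], &foo[2] };  /* the expression *ptrs[i] is an int */
    int (*whole)[3] = &foo;                        /* the expression (*whole)[i] is an int */

    /* In the declaration, 3 is the element count; in an expression, valid indices are 0..2. */
    static_assert(sizeof foo == 3 * sizeof(int), "foo holds three ints");

    return foo[0] + *ptrs[1] + (*whole)[2];
}
```

In each declaration the declarator has the same shape as an expression that yields the base type. The integer inside the brackets is a count in the declaration and an index in the expression, which may be the double meaning the tutorial had in mind.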
Union type punning in C++
This page declares that union type punning is legal in C++03, without any citation. As near as I can tell, it is not in fact legal in C++03. I have not checked the C++11 standard, but the C++98 standard (which C++03 is a minor variant of) seems to indicate that accessing a data member of a union other than the one last stored is only legal if a) the union is POD, b) both data members are POD structs, c) the data members share a common initial sequence, and d) the field being accessed through the data member is part of that common initial sequence. `unsigned int` and `float` do not match these rules, and therefore I believe this is an illegal access. Kevin Ballard (talk) 21:06, 18 March 2014 (UTC)
Origin of the term "punning" in type punning
This is never explained. The definition of "pun" is "a play on words". How does this apply to type-punning? It should be explained, because the originator of the term obviously thought that type-punning was analogous to word-punning, and that realizing this might help the reader more deeply understand the concept. Preceding unsigned comment added by 216.206.138.44 (talk) 15:36, 11 June 2015 (UTC)
- Agreed, this is what I came here to try to figure out. My guess is that it's because a pun involves a word having two meanings or being a homophone for another word, which is analogous to reading an object of one datatype as if it was another. No idea where you'd find a citation for that though. 130.63.110.250 (talk) 19:16, 13 October 2022 (UTC)