Talk:Assembly language/Archive 4

This is an archive of past discussions about Assembly language. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

To add a paragraph on the origin (the first machine an assembler was made for, date and authors), the reason it was named assembler, etc.

I want to add a short description of the origins of the assembler: (a) the first digital computer an assembler was written for, and what the assembler was like; (b) its author; (c) why it is named assembler/assembly language, by whom, and when; (d) whether it originally had another name.

Unfortunately, the sources I found are somewhat contradictory. A number of different views can be found at https://www.quora.com/Why-is-the-assembly-language-called-so In particular, it has the following statements: 1. The inventor actually called what we now call an assembler (which converts the mnemonics of Assembly into machine code) a “converter.” 2. The programmer would convert each symbolic instruction to its binary equivalent, which became known as “assembling” the program. It wasn’t long before someone wrote a program to do the job, and naturally named it the “assembler.” In a backward way, the symbolic instructions became known as “assembler” or “assembly” code. The most reliable source I found is the IEEE Computer Society article on David Wheeler's 1985 Computer Pioneer Award "For assembly language programming". https://www.computer.org/web/awards/pioneer-david-wheeler Wheeler's "initial orders" allowed Edsac instructions to be provided in a simple language rather than by writing binary numbers, and made it possible for non-specialists to begin to write programs. This was the first "assembly language" and was the direct precursor of every modern programming language, all of which derive from the desire to let programmers write instructions in a legible form that can then be translated into computer-readable binary. --P.maistrenko (talk) 14:26, 4 October 2018 (UTC)

Kathleen B, 1947, Assembly language

The second paragraph of the paper written by the Booths begins:

"The non-original ideas, contained in the following text, have been derived from a number of sources, ... It is felt, however, that acknowledgement should be made to Prof. John von Neumann and to Dr. Herman Goldstein for many fruitful discussions ..."

Kathleen Booth's 1947 contribution to the field began with a 1946 trip by her future husband, Andrew Booth, to the USA, spending time at Princeton, gaining insight into the field from Prof. von Neumann. The following year they both came, for a longer (6 month) visit.

Their 1947 paper envisioned parallel arithmetic units and/or a large memory. Each parallel unit would possibly do 100 calculations per second, and a large memory would be 1,000 to 10,000 "numbers" of "approximately 40 binary digits." After talking theoretics, they labeled this "quite impracticable."

Over a decade later the word sizes of 36 bits (e.g. IBM 7094) were the high end, and while 32K did exist by that time, earlier machines were typically in the 4K range.

In short, since programming in those days meant flipping switches, Kathleen Booth's work was not the writing of an assembler, with or without a symbol table. It was really about not having to flip switches over and over, but rather recording binary values on paper tape.

One someone or several someones, together or at similar times but isolated from one another, developed assembly language. The "is credited" wording seems short and to the point. The Von Neumann reference isn't really necessary - it would require splitting all of this into sections to cover

Andrew Booth's 1946 trip
The six month follow up trip by both Kathleen and Andrew
Kathleen's flipping of switches - early day "programming"
Their review of possible memory types, including paper tape, magnetic tape, magnetic drum
Her theoretical work, which didn't produce an actual assembler program, since symbolic characters were not even part of the system - there were no character strings.

It should be understood that even Project Whirlwind didn't meet all of the goals envisioned by the Booths. To recap, even "is credited" may be an overstatement, but to say that she actually wrote an assembler, on a machine that didn't deal in character data, is untrue. Pi314m (talk) 07:34, 10 February 2019 (UTC)

Misplaced text

The last three paragraphs of Assembly language#Assembly directives have nothing to do with Assembly directives

Symbolic assemblers let programmers associate arbitrary names (labels or symbols) with memory locations and various constants. Usually, every constant and variable is given a name so instructions can reference those locations by name, thus promoting self-documenting code. In executable code, the name of each subroutine is associated with its entry point, so any calls to a subroutine can use its name. Inside subroutines, GOTO destinations are given labels. Some assemblers support local symbols which are lexically distinct from normal symbols (e.g., the use of "10$" as a GOTO destination).
Some assemblers, such as NASM, provide flexible symbol management, letting programmers manage different namespaces, automatically calculate offsets within data structures, and assign labels that refer to literal values or the result of simple computations performed by the assembler. Labels can also be used to initialize constants and variables with relocatable addresses.
Assembly languages, like most other computer languages, allow comments to be added to program source code that will be ignored during assembly. Judicious commenting is essential in assembly language programs, as the meaning and purpose of a sequence of binary machine instructions can be difficult to determine. The "raw" (uncommented) assembly language generated by compilers or disassemblers is quite difficult to read when changes must be made.

I'd probably move the text to new subsections of Assembly language#Key concepts or Assembly language#Language design.

Also, some compilers generate assembly language with comments, or a pseudo-assembly listing with comments, e.g., many of IBM's PL/I compilers. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:10, 14 April 2019 (UTC)

"First appeared 1949" -- did it?

The infobox claims that "assembly language" first appeared in 1949, and the article also has a number of categories relating to that year. However, I can't find any text or citations in the article justifying that date, and in fact the section on "Historical perspective" claims that "The first assembly language was developed in 1947 by Kathleen Booth for the ARC2". Seems like an error.

I suspect that, as usual, the exact date depends on the exact definition. One important property of assemblers now is symbolic addressing. It might be that early assemblers (random guess) allowed symbolic opcodes, but not symbolic addressing. You then have to decide if that counts or not. I suspect it took some years to get to the modern definition. Gah4 (talk) 20:17, 20 April 2020 (UTC)

Pronunciation

Can you add the pronunciation of assembly? In particular, assembly as in light or as in literature? And where is the stress? Assembly or assembly? Thanks--Stemby (talk) 10:43, 17 June 2008 (UTC)

It's IPA: /əˈsɛmb.lɪi/. See Wiktionary. —Ben FrantzDale (talk) 12:09, 17 June 2008 (UTC)

Also, IBM documentation pretty much all calls it Assembler Language, even though everyone I know calls the program an assembler, but the language assembly. DEC called theirs names like Macro-11 in the titles of manuals, but assembly in the detailed description. I am not sure that there is a standard as to how to accent the pronunciation. Gah4 (talk) 21:14, 20 April 2020 (UTC)

two-pass assemblers

Some editorial comment indicated that it was unclear that the listing and code-generation would occur during the second pass. That's expected, since forward-references to symbols meant that the required values wouldn't be available until the whole source was read/parsed. And yes, reading a card deck or paper tape more than once was a little inconvenient TEDickey (talk) 23:16, 20 April 2020 (UTC)
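
To illustrate why code generation waits for the second pass, here is a minimal sketch in Python. The instruction set (LOAD, ADD, JMP, HALT), the one-word-per-instruction layout, and the function name are all invented for this example and do not correspond to any real assembler. Pass 1 only assigns addresses to labels; pass 2 emits code once every symbol, including forward references such as `end`, is known.

```python
# Toy two-pass assembler. The mnemonics and encoding are hypothetical.
OPCODES = {"HALT": 0, "LOAD": 1, "ADD": 2, "JMP": 3}


def assemble_two_pass(source):
    """Return a list of (opcode, operand) words for the toy source text."""
    lines = [ln.split() for ln in source.splitlines() if ln.strip()]

    # Pass 1: walk the source and record the address of every label.
    symbols = {}
    address = 0
    for tokens in lines:
        if tokens[0].endswith(":"):
            symbols[tokens[0][:-1]] = address
            tokens = tokens[1:]
        if tokens:  # a label may stand alone on a line
            address += 1  # every instruction occupies one word here

    # Pass 2: all addresses are known, so output can be generated.
    code = []
    for tokens in lines:
        if tokens[0].endswith(":"):
            tokens = tokens[1:]
        if not tokens:
            continue
        operand = tokens[1] if len(tokens) > 1 else "0"
        value = symbols[operand] if operand in symbols else int(operand)
        code.append((OPCODES[tokens[0]], value))
    return code


PROGRAM = """
start: LOAD 7
       JMP end
       ADD 1
end:   HALT
"""

if __name__ == "__main__":
    print(assemble_two_pass(PROGRAM))
```

Here `JMP end` refers to a label defined two lines later, so its operand cannot be filled in until the whole source has been read once.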

one pass

There is recent discussion in edit summaries on one-pass assembly. Many (most? all?) assemblers allow forward references, such that they might not be able to completely assemble something on the first pass. It is, then, usual for a first pass to determine the address of each item (instruction or data), and for the second pass, knowing all addresses, to generate the actual output. However, the object format used for OS/360, and possibly for others, has an address on each output card (that is, 80-byte record), such that output does not have to be in sequential address order. That makes it easier to write a one-pass assembler. On the other hand, you can write things where the length is determined by a symbolic name. I am not sure what they do about that. Note, though, that the out-of-order object code just moves the problem to the linker. The OS/360 linkage editor is famous for being slower than compilers. (An original design goal, and the reason for the name linkage editor, is to reduce the need for complete recompilation.) A multi-pass assembler that reads the input card deck more than once is pretty inconvenient. Later, they just used temporary disk files. Early assemblers had to run in small memory, though so did the linkers. Gah4 (talk) 21:05, 20 April 2020 (UTC)

A couple of notes:

On the IBM 650 the Symbolic Optimal Assembly Program (SOAP) assigned addresses to symbols at the time of first use in a fashion intended to reduce rotational delay.
On the S/360, everything except BPS/360 required tape or disk; the compilers had no need to read the physical cards once per pass. In particular, on OS/360 both Assembler (E) and Assembler (F) used work files. There was no single pass assembler for DOS or OS. Shmuel (Seymour J.) Metz Username:Chatul (talk) 00:49, 21 April 2020 (UTC)

I don't know if any are still around, but there was SPASM, a Single Pass assembler for OS/360, referenced here. I don't remember ever using it, though. Gah4 (talk) 01:12, 21 April 2020 (UTC)

one pass / two pass assemblers

as per this edit that was reverted [1], I think it would add something to have a brief description of one pass vs two pass assemblers. I know what they are, but don't have any references at the moment (although I'm sure I could find some). However, I have not heard of "Jove" pass assembly? Does anyone know what this is? --stmrlbs|talk 18:30, 3 June 2009 (UTC)

I've never heard of a Jove pass assembler either. A Google search does not turn up anything likely. Shouldn't be too hard to find sources on one and two pass assemblers, though. —Preceding unsigned comment added by Yworo (talk • contribs) 19:22, 3 June 2009 (UTC)

Well, it was a bit harder than I thought to find a reference. I found a lot of interesting class notes, each with a slightly different interpretation. However, I don't think class notes are RS. But I put in a brief description of the basic types of each, and the main differences. I think currently the two kinds blend into each other because of more sophisticated one-pass assemblers, which build tables that allow them to plug in addresses that are forward referenced. But, imo, it is hazy whether this is one-pass, two-pass, or something in between. Plus, there are multi-pass assemblers, but that is just more passes to do more sophisticated processing of the source. I left that out as I think just a basic definition will do for this article.

However, if anyone feels that they can improve it.. be bold!! --stmrlbs|talk 02:37, 4 June 2009 (UTC)
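
As a rough illustration of the "tables which allow them to plug in addresses" idea, here is a sketch of a one-pass assembler with backpatching, written in Python. Everything in it is an assumption made for the example: the toy mnemonics, the one-word instruction size, and the `fixups` table name. It is not taken from any particular assembler.

```python
# Toy one-pass assembler with backpatching. The mnemonics are hypothetical.
OPCODES = {"HALT": 0, "LOAD": 1, "ADD": 2, "JMP": 3}


def assemble_one_pass(source):
    """Assemble in a single scan, patching forward references later."""
    symbols = {}  # label -> address, filled in as labels are seen
    fixups = {}   # undefined label -> indexes of words awaiting its address
    code = []     # list of [opcode, operand] words

    for line in source.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].endswith(":"):
            label = tokens[0][:-1]
            symbols[label] = len(code)
            # Plug the now-known address into every earlier reference.
            for index in fixups.pop(label, []):
                code[index][1] = symbols[label]
            tokens = tokens[1:]
        if not tokens:
            continue
        operand = tokens[1] if len(tokens) > 1 else "0"
        if operand.isdigit():
            value = int(operand)
        elif operand in symbols:
            value = symbols[operand]
        else:
            # Forward reference: emit a placeholder and remember where.
            value = None
            fixups.setdefault(operand, []).append(len(code))
        code.append([OPCODES[tokens[0]], value])

    if fixups:
        raise ValueError("undefined symbols: " + ", ".join(sorted(fixups)))
    return [tuple(word) for word in code]


PROGRAM = """
start: LOAD 7
       JMP end
       ADD 1
end:   HALT
"""

if __name__ == "__main__":
    print(assemble_one_pass(PROGRAM))
```

The source is read only once, but the generated code has to stay in memory (or be patchable later, for example by the loader or linker) until all labels are resolved, which is why the one-pass/two-pass boundary is hazy.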

I believe that historically it was more common to have 3 or more passes than it was to have only one. I've revised the text to reflect that, and also briefly mentioned the possible need for an extra pass when doing peephole optimization. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:14, 11 December 2011 (UTC)

I believe, from the small memory days, assemblers (and compilers) had many phases, usually separate sets of code read from disk. OS/360 Assembler F^[1] seems to have eight phases. It might be that all macro expansion is done before the first pass as described here. I believe that conditional assembly is done at the same time. Also, there is a phase for writing out error messages, which shouldn't count as a pass. It uses (up to) three temporary files, reading and writing as it goes through phases. Gah4 (talk) 01:37, 21 April 2020 (UTC)

It varied all over the landscape, but prior to S/360 the number of passes, even for macro-assemblers, was usually small. The FORTRAN II Assembly Program (FAP) had only two passes, with a symbol-table sort in between. 7070/7074 Autocoder had three phases. Lots of others had only two.

References

^ Program Logic IBM System/360 Operating System Assembler (F) (PDF). Program Logic (Third ed.). IBM. December 1970. GY26-3700-2. Retrieved 21 April 2020 – via bitsavers.

Current usage - IBM mainframes

There is a lot of IBM 360 family legacy code that is a mix of Cobol and assembly. The historical reason is that the database access methods, such as ISAM (Indexed Sequential Access Method) (it might have been BDAM?) were implemented as assembly based macros, and as long as some assembly was needed, some optimized code was also implemented in assembly. For current usage, few companies would be willing to take the risk or time it would take to port huge libraries of working assembly code to higher level languages. Rcgldr (talk) 14:38, 11 September 2020 (UTC)

COBOL compilers in both DOS and OS supported ISAM. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:26, 11 September 2020 (UTC)

As posted below, maybe it was BDAM or some feature for ISAM. The key point is that there is still a legacy mix of Cobol and assembler for IBM mainframes. [IBM Cobol and assembly] Rcgldr (talk) 22:22, 11 September 2020 (UTC)

I suspect that ISAM was designed for COBOL. I believe PL/I supports it, but then it seems to be designed to support many COBOL features. It might still be that some called assembly routines to do I/O, or other operations. Gah4 (talk) 19:40, 11 September 2020 (UTC)

Maybe BDAM. I used ISAM in PL/I and COBOL beginning in 1970 and never needed assembler for any of it. I don’t think HLLs supported all DAM options. I finally realized you could process all members of a BPAM dataset in PL/I (and presumably COBOL) with only an assembler routine to put the member names in the JFCB. Peter Flass (talk) 21:20, 11 September 2020 (UTC)

I suppose that there might be some feature of ISAM that wasn't available in COBOL, and might need a special routine. This article is, at least somewhat, supposed to be system independent. I do remember with the PDP-10/Fortran-10, which does know about direct access files, but I also needed record locking. Someone wrote two Macro-10 programs, just a few instructions each, which did that. That is, more than one copy of the program could run, and access the file, at the same time. I suspect that there are enough times when just a little assembly program can make things easier, but also complicate porting when needed later. My favorite for IA32 is a two instruction program to RDTSC (read time stamp counter) and return. Nice for accurate timing to find bottlenecks. So, maybe the article can say something general about the use of small assembly programs called from high-level languages. In all my years of assembly programming, just about all is routines called from high-level languages. Can we write something general about that? Gah4 (talk) 22:09, 11 September 2020 (UTC)

"system independent" - the issue here is current usage of assembler by mainframes is mostly due to IBM mainframe legacy code, still very popular, as it is used by banks, insurance companies, government, ... .Rcgldr (talk) 01:45, 12 September 2020 (UTC)

@Chatul: @Peter Flass: Assembly macros are still in use. IBM's migration to the current version of Cobol includes the changes needed for the assembly code, but doesn't mention porting that assembly code into Cobol, so apparently some aspects of the access methods still require assembly macros: DFSMS macro instructions for data sets pdf Rcgldr (talk) 07:54, 12 September 2020 (UTC)

every assembly language is designed for exactly one specific computer architecture

The article says: every assembly language is designed for exactly one specific computer architecture. While this should mostly be true, it doesn't seem quite so obvious. For one, when an architecture is extended, consider S/360, S/370, XA/370, ESA/370, ESA/390, z/, the new assembler is usually backwards compatible. (As long as you don't use new instructions.) Also, pretty often the first try for new instructions is done with macros in the old assembler. That won't work for new addressing modes, though. But also it depends on what you mean by assembly language. If it means other than the specific machine instructions, then some assemblers for 8 bit microprocessors could be used for more than one. A look-up table for the machine instructions was used, where the assembler instructions (see discussion above) were the same. Then there is GNU gas, which is multi-architecture, though usually not using the syntax of the one designed for each architecture, and often different opcode mnemonics. Gah4 (talk) 04:19, 12 September 2020 (UTC)

Half a century ago there were assemblers from UNIVAC and SDS targeted to multiple architectures, long before gas. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:44, 13 September 2020 (UTC)

The article may say that but it isn't the case. An assembly language is designed to be used with a specific processor, or processor family, not with a specific machine.

216.152.18.132 (talk) 02:06, 3 August 2021 (UTC)

Except when it isn't; some assemblers are table driven and can handle multiple architectures. Meta-Symbol^[1]^[2] and Meta-Assembler^[3] (MASM) go back to the 1960s, and more recently there is gas. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 09:41, 3 August 2021 (UTC)

References

^ SYMBOL and META-SYMBOL Reference Manual for 900 Series/9300 Computers (PDF). Scientific Data Systems. March 1969. 90 05 06G. Retrieved August 3, 2021.
^ Xerox Meta-Symbol Sigma 5-9 Computers Language and Operations Reference Manual (PDF). Xerox Data Systems. Oct 1975. 90 09 52G. Retrieved August 3, 2021.
^ Sperry Univac Computer 1100 Series Meta-Assembler (MASM) Programmer Reference (PDF). Revision 1. Sperry Univac Computer Systems. 1977. UP-8453. Retrieved August 3, 2021.
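
For readers unfamiliar with the table-driven idea mentioned above, here is a minimal Python sketch. The two "architectures" and their opcode values are made up for illustration; they do not describe Meta-Symbol, MASM, gas, or any real machine. The point is only that the same assembler logic can target different machines by swapping the opcode table.

```python
# Toy table-driven assembler. Both opcode tables are hypothetical.
OPCODE_TABLES = {
    "arch_a": {"LOAD": 0x10, "ADD": 0x20, "HALT": 0xFF},
    "arch_b": {"LOAD": 0x01, "ADD": 0x02, "HALT": 0x00},
}


def assemble(source, arch):
    """Translate mnemonics using the opcode table selected by `arch`."""
    table = OPCODE_TABLES[arch]
    words = []
    for line in source.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        operand = int(tokens[1]) if len(tokens) > 1 else 0
        words.append((table[tokens[0]], operand))
    return words


if __name__ == "__main__":
    text = "LOAD 5\nADD 3\nHALT"
    print(assemble(text, "arch_a"))
    print(assemble(text, "arch_b"))
```

A real table-driven assembler would also need per-architecture operand formats and addressing modes in its tables, which this sketch leaves out.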

GA Review

This review is transcluded from Talk:Assembly language/GA1. The edit link for this section can be used to add comments to the review.

Reviewer: Wasted Time R (talk · contribs) 13:20, 17 September 2020 (UTC)

This looks to have been a drive-by nomination made by an erratic editor who has since been indef-blocked for incompetence. The article has large swaths of unsourced material, not just explanatory material but historical and analytical as well. In some cases there are whole sections without any citations. So this has to be a fail.

But the article is not bad at all. Content-wise, my main suggestion for improvement is that the use of assembly language for IBM mainframes needs to be given more attention. It is mentioned here and there, but back in the heyday of the IBM 360/370, when it was the dominant computing platform in the industry, assembly language was everywhere, not just for high-performance system software components but for run-of-the-mill business applications as well. Learning 360/370 Assembly was part of the standard education that commercial programmers had to get and there were a lot of textbooks and commercial courses available for it. For instance, in the textbook Kevin McQuillen, System/360–370 Assembler Language (OS) (Mike Murach & Associates, 1975), the example programs that the text develops concern a batch inventory control and reorder application, and later parts of a batch payroll application are constructed. This whole aspect of historical assembly language use is counter-intuitive to today's reader and part of the value that this article can bring is to describe it. Wasted Time R (talk) 13:20, 17 September 2020 (UTC)

Other Platforms

I would suggest adding more information on the use of assembly language on platforms from other vendors, not just on other IBM platforms. I know for a fact that CDC and RCA used assemblers as implementation languages on their operating systems, and I'm confident that many others did as well. Similarly, there was a lot of customer use of assembly languages on non-IBM platforms. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:21, 10 September 2021 (UTC)

Historical perspective - Atari St - Commodore Amiga

C was more popular than assembly for these 68000 based home computer systems. Rcgldr (talk) 14:34, 11 September 2020 (UTC)

Most games and demos were written in Assembly, but strategy/RPG/adventure games (ported from PC & Mac), OSs, and utilities were written in C. So it's not easy to say which language was more popular; both were very important. Hobbyists also used a lot of BASIC (and STOS/AMOS). --84.248.217.94 (talk) 10:22, 3 August 2021 (UTC)

Macro facilities in open code

@Wtshymanski: In many assemblers, pseudo-ops used inside of macro definitions can also be used in open code, and the article does not discuss this. As a start, I added the text below, which user:Wtshymanski reverted:

In addition, some of the assembler statements useful in macro definitions are also valid in open code, e.g., the HLASM statements
AGO: Transfer to specified assembler statement
AIF: Evaluate logical and transfer if true
GBLx: Define compile-time variables in a global context
LCLx: Define compile-time variables in a local context
SETx: Evaluate expressions and assign their values to compile time variables

There is a lot of code that uses these facilities outside of macro definitions, and I believe that the existing text on assembly language macros is misleading without a discussion of the use of them in open code. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:40, 10 September 2021 (UTC)

Tiobe mixing assembly language with WebAssembly ?

This article cites Tiobe reporting around 2.5% usage of assembly language. WebAssembly is not listed at all amongst 100 languages. Given the expectation that the latter is used more than the former, Tiobe may have mixed up these two. The Wikipedia article on the latter confusingly refers to the former. — Preceding unsigned comment added by Jgeer (talk • contribs) 23:27, 7 November 2021 (UTC)

Non sequitur

GliderMaven merged two sentences to read "Because assembly depends on the machine code instructions, each assembly language is specific to a particular computer architecture and sometimes to an operating system." However, the reason that FORTRAN Assembly Program (FAP) on the FORTRAN Monitor System differs from Macro Assembly Program on IBSYS/IBJOB, and Assembler D on DOS/360 differs from Assembler F on OS/360, has nothing to do with dependency on the machine code, since the machine code is identical. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:03, 11 November 2021 (UTC)

FORTRAN Monitor System ran on the 709, 7090, and 7094, as did IBSYS/IBJOB. Assembler D and F ran on System/360. Machine code for 709, 7090, and 7094 is different from System/360, because they are different computer architectures. It's reasonable to say that if there are two different sets of machine instructions, two different assemblers will be required. (Although it's possible to create one assembler program that uses nearly the same mnemonics for two or more architectures, and produces different machine code as commanded by some sort of mode setting.) Jc3s5h (talk) 18:02, 11 November 2021 (UTC)

Yes, It's reasonable to say that if there are two different sets of machine instructions, two different assemblers will be required, but in this case it's four assemblers for two architectures. I wasn't citing four distinct assemblers for a single architecture, but rather two distinct assemblers for each of the architectures.

I believe that some microprocessors have a lot more than two distinct assemblers.

There are assemblers that let you specify a different opcode table in order to support multiple architectures. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:05, 11 November 2021 (UTC)

So let's get this straight, you think that stating that each assembler is specific to a machine code and sometimes an operating system, you think that it means that each machine code and operating system has only one assembler???? GliderMaven (talk) 04:44, 12 November 2021 (UTC)

Um. No. That's very much like saying that all lions are cats, so therefore all cats are lions. It's faulty logic. One does not imply the other. At all. GliderMaven (talk) 04:44, 12 November 2021 (UTC)

You are totally misrepresenting what I wrote and what I think. The issue is not the word "specific"; the issue is the word "because". The text from "because" to the comma is incorrect for assemblers specific to an operating system. That is the obvious reason that there were originally two sentences. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:49, 12 November 2021 (UTC)

As usual, it is complicated. Some of the OS difference is actually different macro sets that could be used with the same assembler. I suspect that the assemblers and macros for Linux/390 are very different from OS/390, even for the same hardware. One of the fun things you can do with macros is to make an assembler for a completely different machine. This was usual in the early microcomputer days, where macros for OS/360 assemblers would generate 8080 or 6502 code. There was a program to reformat the object program into the usual form. Gah4 (talk) 22:23, 12 November 2021 (UTC)

Macro definitions are typically in separate libraries. The differences between FAP and MAP, or among Assembler D, Assembler F and Assembler XF include differences in the pseudo-ops. The differences among microprocessor assemblers include the order of operands in machine instructions. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:04, 14 November 2021 (UTC)

Well, for example, OS/360 and DOS/360 use completely different macros, not only the lower levels. I believe that there is Assembler D for OS and DOS, but don't know if they share code. Linux/390 uses ASCII, though some assemblers for EBCDIC systems will accept ASCII source. And then there are cross assemblers. There are just so many different things that have been tried, that it is hard to say more. Gah4 (talk) 05:15, 15 November 2021 (UTC)

Yes, DOS/360 and OS/360 have different macro libraries, but they are not part of the assemblers. Assembler D is DOS only, Assemblers E and F are in CP-67, OS/360 and TSS/360, and Assembler XF is in at least DOS/VSE, OS/VS1, OS/VS2 and VM/370. I believe that Assemblers E and F share code. All of which confirms that the "because" is incorrect. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:08, 15 November 2021 (UTC)
