This is a simple program in C.
char a;
void main(){};
It caused the following assembly to be generated, starting with
.text
.globl a
.bss
.type a, @object
.size a, 1
I would like to know how to interpret the above.
First I see .text. I believe the . is just a symbol and text means the start of the code section.
Then I see .globl. I believe the variables or functions that start right after it will be global. Or do I need to write the section name, i.e. .text, right before every variable and function? That is the question.
Then I see .bss. After the ., bss is where all uninitialized variables and functions are declared.
Finally I see something akin to what my C program had, a global variable named char a:
.type a, @object
So .type tells what it is. I assume a is of object type, as indicated by the @ and object in .type a, @object.
Next is the size, which is 1 for a char, hence this line:
.size a, 1
So I assume that if I had a global int a; then that would be
.size a, 4
since char is 1 byte and int is 4 bytes.
Moving on, I have
a:
so the first few lines become the following. Assume this is code 1:
# my comment 1
# my comment 2
.text
.globl a
.bss
.type a, @object
.size a, 1
a:
So the question is: why is a: at the bottom?
What if I do it like this? This is code 2:
a:
.text
.globl a
.bss
.type a, @object
.size a, 1
So I would like to know: are code 1 and code 2 the same way to declare or define a, given that a: appears last in code 1 and first in code 2?
From the above, my a is in .text and .globl and .bss, its .type is @object, and its size is 1 byte. This is a lot of code to define just one char variable. Is my understanding correct, or should I doubt it?
Moving on further, now it is the turn of the global main, which is in the .text section plus .globl. So I see
.zero 1
.text
.globl main
.type main, @function
main:
I really don't want to care about the .zero 1 line, but if I am wrong not to care, then tell me what it is for. So again, has my gcc placed main in .zero (some section???) and in the .text code section plus .globl, with the type @function?
So now I know the type comes after the comma, as in .type main, @function and in .type a, @object.
Then I encounter what looks like complete nonsense to me: searching for .LFB0: brought zero Google search results. Is .LFB0: some section of the program that my x86-64 processor can run?
And .cfi_startproc has to do with eh_frame. I read that .eh_frame is a section that lives in the loaded part of the program. So I would like to know: if I am coding in assembly, can I ignore the .cfi_startproc line? What is the point of it? Does it mean that after this everything is loaded into memory or registers, and is that .eh_frame?
main:
.LFB0:
.cfi_startproc
endbr64
pushq %rbp #
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp #,
.cfi_def_cfa_register 6
So if I am writing a simple assembly program similar to the above C program, do I need to write everything from .LFB0: to
movq %rsp, %rbp
.cfi_def_cfa_register 6
If it is not needed, then I assume my program will become
.text
.globl a
.bss
.type a, @object
.size a, 1
a:
.zero 1
.text
.globl main
.type main, @function
main:
.cfi_startproc
pushq %rbp
movq %rsp, %rbp
nop
popq %rbp
ret
.cfi_endproc
So my full program becomes the above. How do I compile this with nasm? Can anyone please tell me? I believe I have to save it with a .s or .S extension; which one, small s or capital S? I am coding on Ubuntu.
This is the gcc-generated code:
.file "test.c"
# GNU C17 (Ubuntu 11.2.0-7ubuntu2) version 11.2.0 (x86_64-linux-gnu)
# compiled by GNU C version 11.2.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.0, isl version isl-0.24-GMP
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed: -mtune=generic -march=x86-64 -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection -fcf-protection
.text
.globl a
.bss
.type a, @object
.size a, 1
a:
.zero 1
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
endbr64
pushq %rbp #
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp #,
.cfi_def_cfa_register 6
# test.c:2: void main(){};
nop
popq %rbp #
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 11.2.0-7ubuntu2) 11.2.0"
.section .note.GNU-stack,"",@progbits
.section .note.gnu.property,"a"
.align 8
.long 1f - 0f
.long 4f - 1f
.long 5
0:
.string "GNU"
1:
.align 8
.long 0xc0000002
.long 3f - 2f
2:
.long 0x3
3:
.align 8
4:
CodePudding user response:
.text is a directive that tells the assembler to start a program code section (the “text” section of the program, a read-only executable section containing mostly instructions to be executed). It is here because GCC without optimization always puts a .text at the top of the file, even if it's about to switch to another section (like .bss in this case) and then back to .text when it's ready to emit some bytes into that section (in your case, a definition for main). GCC does still parse the whole compilation unit before emitting any asm, though; it's not just compiling one global variable / function at a time as it goes along.
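.globl a is a directive that tells the assembler that a is a “global” symbol, so its definition should be listed as an external symbol for the linker to link with. It affects only the symbol it names, not whatever variables or functions happen to follow it, and it does not select a section.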
.bss is a directive that tells the assembler to switch to the .bss section (the name is historically short for “block started by symbol”), which will contain data that is initialized to zero or, on some systems, mostly older, is not initialized.
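To see the "initialized to zero" part from the C side, here is a minimal sketch. It assumes a hosted C compiler; the array counters and the function check_globals_start_zeroed are hypothetical names added for illustration, only char a comes from the question. C guarantees that objects with static storage and no initializer start out as zero, and on ELF targets compilers typically satisfy that by placing them in .bss.

```c
#include <assert.h>
#include <stddef.h>

/* Globals without an initializer. On ELF targets a compiler will
   typically place these in .bss, as GCC did for `a` in the listing. */
char a;
int counters[4]; /* hypothetical extra global, not in the original program */

/* Returns 1 if every global above still holds zero, 0 otherwise. */
int globals_are_zero(void) {
    if (a != 0) {
        return 0;
    }
    for (size_t i = 0; i < sizeof counters / sizeof counters[0]; i++) {
        if (counters[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Call this before anything writes to the globals. */
void check_globals_start_zeroed(void) {
    assert(globals_are_zero() == 1);
}
```

Calling check_globals_start_zeroed() at the top of main should pass silently, because nothing has written to the globals yet.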
.type a, @object and .size a, 1 are directives that describe the type and size of an object named a. The assembler adds this information to the symbol table or other information in the object file it outputs. It is useful for debuggers to know about the types of objects.
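The number after the comma in .size is just the object's size in bytes, so you can predict it from C with sizeof. A small sketch, assuming a typical x86-64 Linux toolchain; the global b and the helper functions are hypothetical names added for illustration:

```c
#include <assert.h>
#include <stddef.h>

char a; /* GCC emitted ".size a, 1" for this one */
int b;  /* hypothetical second global, not in the original program */

/* The value the compiler would be expected to write after the comma in .size. */
size_t size_of_a(void) { return sizeof a; }
size_t size_of_b(void) { return sizeof b; }

void check_sizes(void) {
    /* sizeof(char) is 1 by definition in C. */
    assert(size_of_a() == 1);
    /* int is 4 bytes on x86-64 Linux, but C itself does not fix that
       number, so the check compares against sizeof(int) instead. */
    assert(size_of_b() == sizeof(int));
}
```

On the asker's platform size_of_b() should come out as 4, which matches the guess of .size a, 4 for a global int.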
a: is a label. It acts to define the symbol. As the assembler reads assembly, it counts bytes in the section it is currently generating. Each data declaration or instruction takes up some bytes, and the assembler counts those. When it sees a label, it associates the label with the current count. (This count is usually called the location counter, and sometimes loosely the program counter, even when it is counting data bytes.) When the assembler writes information about a to the symbol table, it will include the number of bytes it is from the beginning of the section. When the program is loaded into memory, this offset is used to calculate the address where the object a will be in memory.
So the question is why a: is at the bottom
a: must be after .bss because a will be put into the section the assembler is currently working on, so that needs to be set to the desired section before declaring the label. The location of a: relative to the other directives (.globl, .type, .size) is flexible, so reordering those would have no consequence, as long as a: stays directly in front of the .zero 1 that reserves its byte.
so I like to know is code 1 and code 2 same?
No, a: must appear after .bss so that it is put into the correct section.
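If it helps, here is a toy model in C of how a label picks up its section and offset. This is only a sketch of the idea, not how GNU as is implemented: it understands just the handful of lines from the question, and every name in it (locate_label, label_info, code1, code2) is made up for illustration. I added the .zero 1 line to both orderings so the byte for a is reserved somewhere.

```c
#include <assert.h>
#include <string.h>

enum section { SEC_TEXT, SEC_BSS };

struct label_info {
    enum section section; /* section that was current when the label was seen */
    long offset;          /* bytes from the start of that section */
    int found;            /* 1 if the label appeared in the input */
};

/* Walks the lines in order and reports where `label` (written with its colon) lands. */
struct label_info locate_label(const char *const lines[], int count, const char *label) {
    long counter[2] = {0, 0};        /* one location counter per section */
    enum section current = SEC_TEXT; /* GNU as starts out in .text */
    struct label_info result = {SEC_TEXT, 0, 0};

    for (int i = 0; i < count; i++) {
        const char *line = lines[i];
        if (strcmp(line, ".text") == 0) {
            current = SEC_TEXT;
        } else if (strcmp(line, ".bss") == 0) {
            current = SEC_BSS;
        } else if (strcmp(line, ".zero 1") == 0) {
            counter[current] += 1; /* emits one zero byte */
        } else if (strcmp(line, label) == 0) {
            result.section = current;
            result.offset = counter[current];
            result.found = 1;
        }
        /* .globl, .type and .size emit no bytes, so they are skipped here. */
    }
    return result;
}

/* "code 1" and "code 2" from the question, each followed by .zero 1. */
static const char *const code1[] = {
    ".text", ".globl a", ".bss", ".type a, @object", ".size a, 1", "a:", ".zero 1"};
static const char *const code2[] = {
    "a:", ".text", ".globl a", ".bss", ".type a, @object", ".size a, 1", ".zero 1"};

void check_orderings(void) {
    struct label_info first = locate_label(code1, 7, "a:");
    struct label_info second = locate_label(code2, 7, "a:");
    assert(first.found && first.section == SEC_BSS);    /* label comes after .bss */
    assert(second.found && second.section == SEC_TEXT); /* label comes before any switch */
}
```

In this model code 1 puts a at offset 0 of .bss, while code 2 puts a at offset 0 of .text, which is where main's instructions end up, and the zero byte in .bss is left without a name.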
.zero 1 says to emit 1 zero byte in the current section. Here that byte is the storage for a, so it is not a line to ignore. Like (almost?) all directives GCC uses, it's well documented in the GNU assembler manual: https://sourceware.org/binutils/docs/as/Zero.html
so again have my gcc place main in .zero
No, .zero is not a section. .text starts (or switches back to) the code section, so main will be in the code section.
is .LFB0: a some section of program that my x86-64 processor can run
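Anything ending with a colon is a label, not a section. .LFB0 is a local label that GCC emits at the beginning of the function (paired with .LFE0 at the end) so that compiler-generated metadata such as debug information can refer to the function's bounds. Nothing in this file jumps to it, and you can leave it out of hand-written assembly.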
so I like to know if I am coding in assembly can I ignore .cfi_startproc line.
When writing assembly for simple functions without exception handling and related features, you can ignore .cfi_startproc and the other call-frame information directives, which generate metadata that goes in the .eh_frame section. (That section is not executed; it's just there as data in the file for exception handlers and debuggers to read.)
… if not needed then I can assume my program will become…
If you are omitting some of the .cfi… directives, I would omit all of them, unless you look into what they do and determine which ones can be omitted selectively.
I believe I have to save it with .s or .S extension which one s small or large S?
With GCC and Clang, assembly files ending in .S are processed by the “preprocessor” before assembly, and assembly files ending in .s are not. This is the preprocessor familiar from C, with #define, #if, and other directives. Other tools may not do this. If you are not using preprocessor features, it generally does not matter whether you use .s or .S.