Questions tagged [assembly]
Assembly language questions. Please tag the processor and/or the instruction set you are using, as well as the assembler. A valid set of tags looks like this: ([assembly] [x86] [gnu-assembler] or [att]). Use the [.net-assembly] tag instead for .NET assemblies, [cil] for .NET assembly language, [wasm] for WebAssembly, and [java-bytecode-asm] for Java bytecode.
44,334
questions
2310
votes
12
answers
235k
views
Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?
I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optimize the call pow(a,2) by compiling it into a*a, but the call pow(a,6) is not optimized and ...
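For context, the two groupings named in the title are different expressions in floating-point arithmetic: each multiplication rounds its result, so regrouping can change the last bits of the answer. The following is a minimal sketch of the two forms, not taken from the question; the function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: the product as written, evaluated left to right,
 * which takes five multiplications. */
double pow6_sequential(double a) {
    return a * a * a * a * a * a;
}

/* Hypothetical helper: the regrouped form (a*a*a)*(a*a*a), which takes
 * three multiplications but rounds at different intermediate points. */
double pow6_grouped(double a) {
    double cube = a * a * a;
    return cube * cube;
}
```

For inputs whose intermediate products are exactly representable (such as 2.0) both functions agree; for other inputs they may differ slightly, which is why a compiler following strict IEEE semantics treats them as different computations.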
1772
votes
15
answers
151k
views
Is < faster than <=?
Is if (a < 901) faster than if (a <= 900)?
Not exactly as in this simple example, but there are slight performance changes in complex loop code. I suppose this has something to do with generated ...
1638
votes
11
answers
197k
views
Replacing a 32-bit loop counter with 64-bit introduces crazy performance deviations with _mm_popcnt_u64 on Intel CPUs
I was looking for the fastest way to popcount large arrays of data. I encountered a very weird effect: Changing the loop variable from unsigned to uint64_t made the performance drop by 50% on my PC.
...
941
votes
11
answers
181k
views
Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly?
I wrote these two solutions for Project Euler Q14, in assembly and in C++. They implement the identical brute-force approach for testing the Collatz conjecture. The assembly solution was assembled with:
...
880
votes
17
answers
912k
views
What's the purpose of the LEA instruction?
For me, it just seems like a funky MOV. What's its purpose and when should I use it?
700
votes
4
answers
91k
views
How do I achieve the theoretical maximum of 4 FLOPs per cycle?
How can the theoretical peak performance of 4 floating point operations (double precision) per cycle be achieved on a modern x86-64 Intel CPU?
As far as I understand it takes three cycles for an SSE ...
532
votes
17
answers
527k
views
How do you get assembler output from C/C++ source in GCC?
How does one do this?
If I want to analyze how something is getting compiled, how would I get the emitted assembly code?
502
votes
40
answers
150k
views
When is assembly faster than C? [closed]
One of the stated reasons for knowing assembler is that, on occasion, it can be employed to write code that will be more performant than writing that code in a higher-level language, C in particular. ...
320
votes
7
answers
87k
views
Why does this code execute more slowly after strength-reducing multiplications to loop-carried additions?
I was reading Agner Fog's optimization manuals, and I came across this example:
double data[LEN];
void compute()
{
const double A = 1.1, B = 2.2, C = 3.3;
int i;
for(i=0; i<LEN; i++) {...
318
votes
16
answers
938k
views
Is it possible to "decompile" a Windows .exe? Or at least view the Assembly?
A friend of mine downloaded some malware from Facebook, and I'm curious to see what it does without infecting myself. I know that you can't really decompile an .exe, but can I at least view it in ...
315
votes
11
answers
220k
views
Using GCC to produce readable assembly?
I was wondering how to use GCC on my C source file to dump a mnemonic version of the machine code so I could see what my code was being compiled into. You can do this with Java but I haven't been able ...
312
votes
11
answers
72k
views
What does multicore assembly language look like?
Once upon a time, to write x86 assembler, for example, you would have instructions stating "load the EDX register with the value 5", "increment the EDX register", etc.
With modern CPUs that have 4 ...
311
votes
4
answers
130k
views
How to run a program without an operating system?
How do you run a program all by itself without an operating system running?
Can you create assembly programs that the computer can load and run at startup, e.g. boot the computer from a flash drive ...
287
votes
5
answers
16k
views
Why does Java switch on contiguous ints appear to run faster with added cases?
I am working on some Java code which needs to be highly optimized as it will run in hot functions that are invoked at many points in my main program logic. Part of this code involves multiplying ...
285
votes
12
answers
74k
views
Is 'switch' faster than 'if'?
Is a switch statement actually faster than an if statement?
I ran the code below on Visual Studio 2010's x64 C++ compiler with the /Ox flag:
#include <stdlib.h>
#include <stdio.h>
#include ...
284
votes
6
answers
263k
views
What exactly is the base pointer and stack pointer? To what do they point?
Using this example coming from Wikipedia, in which DrawSquare() calls DrawLine():
(Note that this diagram has high addresses at the bottom and low addresses at the top.)
Could anyone explain to me ...
283
votes
10
answers
194k
views
Assembly code vs Machine code vs Object code?
What is the difference between object code, machine code and assembly code?
Can you give a visual example of their difference?
282
votes
3
answers
98k
views
What is a retpoline and how does it work?
In order to mitigate kernel or cross-process memory disclosure (the Spectre attack), the Linux kernel will be compiled with a new option, -mindirect-branch=thunk-extern, introduced to gcc to ...
271
votes
5
answers
38k
views
Why does GCC use multiplication by a strange number in implementing integer division?
I've been reading about div and mul assembly operations, and I decided to see them in action by writing a simple program in C:
File division.c
#include <stdlib.h>
#include <stdio.h>
int ...
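The "strange number" in the title is typically a fixed-point reciprocal of the divisor. The following is a minimal sketch of the idea, assuming unsigned 32-bit division by 10 (the divisor in the question is not shown in the excerpt, so this case is an assumption); the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: computes x / 10 without a divide instruction.
 * 0xCCCCCCCD is ceil(2^35 / 10), so multiplying by it and shifting
 * right by 35 bits yields the exact quotient for every 32-bit x. */
uint32_t div10_by_multiply(uint32_t x) {
    uint64_t product = (uint64_t)x * 0xCCCCCCCDu; /* 64-bit widening multiply */
    return (uint32_t)(product >> 35);             /* drop the fractional bits */
}
```

Compilers prefer this form because an integer multiply followed by a shift is usually much cheaper than a hardware divide.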
239
votes
8
answers
351k
views
Show current assembly instruction in GDB
I'm doing some assembly-level debugging in GDB. Is there a way to get GDB to show me the current assembly instruction in the same way that it shows the current source line? The default output after ...
236
votes
4
answers
38k
views
Why would introducing useless MOV store instructions speed up a tight loop in x86_64 assembly?
Background:
While optimizing some Pascal code with embedded assembly language, I noticed an unnecessary MOV instruction, and removed it.
To my surprise, removing the unnecessary instruction caused ...
229
votes
25
answers
83k
views
Protecting executable from reverse engineering?
I've been contemplating how to protect my C/C++ code from disassembly and reverse engineering. Normally I would never condone this behavior myself in my code; however the current protocol I've been ...
217
votes
5
answers
277k
views
The point of test %eax %eax [duplicate]
Possible Duplicate:
x86 Assembly - ‘testl’ eax against eax?
I'm very very new to assembly language programming, and I'm currently trying to read the assembly language generated from a binary. I'...
216
votes
32
answers
287k
views
Why aren't programs written in Assembly more often? [closed]
It seems to be a mainstream opinion that assembly programming takes longer and is more difficult than programming in a higher-level language such as C. Therefore it seems to be recommended or assumed ...
205
votes
21
answers
73k
views
Is inline assembly language slower than native C++ code?
I tried to compare the performance of inline assembly language and C++ code, so I wrote a function that adds two arrays of size 2000, repeated 100000 times. Here's the code:
#define TIMES 100000
void calcuC(...
201
votes
4
answers
184k
views
What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64
The following links explain x86-32 system call conventions for both UNIX (BSD flavor) & Linux:
http://www.int80h.org/bsdasm/#system-calls
http://www.freebsd.org/doc/en/books/developers-handbook/x86-...
190
votes
3
answers
36k
views
Why does GCC generate such radically different assembly for nearly the same C code?
While writing an optimized ftol function I found some very odd behaviour in GCC 4.6.1. Let me show you the code first (for clarity I marked the differences):
fast_trunc_one, C:
int fast_trunc_one(...
189
votes
13
answers
25k
views
Is incrementing an int effectively atomic in specific cases?
In general, for int num, num++ (or ++num), as a read-modify-write operation, is not atomic. But I often see compilers, for example GCC, generate the following code for it (try here):
void f()
{
int ...
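Whatever instruction a compiler happens to emit for a plain int, the language only guarantees an indivisible read-modify-write when an atomic type is used. The following is a minimal sketch in C11, assuming <stdatomic.h> is available; the counter and function names are hypothetical and not from the question.

```c
#include <stdatomic.h>

/* Hypothetical shared counter. atomic_int makes each increment a single
 * indivisible read-modify-write, regardless of the generated instruction. */
static atomic_int counter = 0;

/* Adds 1 atomically and returns the value held before the addition. */
int increment_atomic(void) {
    return atomic_fetch_add(&counter, 1);
}

/* Reads the current value atomically. */
int read_counter(void) {
    return atomic_load(&counter);
}
```

Calling increment_atomic() from several threads loses no updates, which a plain num++ does not guarantee.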
189
votes
12
answers
249k
views
What is the difference between MOV and LEA?
I would like to know what the difference between these instructions is:
MOV AX, [TABLE-ADDR]
and
LEA AX, [TABLE-ADDR]
189
votes
1
answer
87k
views
What is the best way to set a register to zero in x86 assembly: xor, mov or and?
All the following instructions do the same thing: set %eax to zero. Which way is optimal (requiring fewest machine cycles)?
xorl %eax, %eax
mov $0, %eax
andl $0, %eax
183
votes
3
answers
120k
views
How do you use gcc to generate assembly code in Intel syntax?
The gcc -S option will generate assembly code in AT&T syntax. Is there a way to generate files in Intel syntax? Or is there a way to convert between the two?
183
votes
1
answer
79k
views
Why do ARM chips have an instruction with Javascript in the name (FJCVTZS)?
FJCVTZS is "Floating-point Javascript Convert to Signed fixed-point, rounding toward Zero". It is supported in Arm v8.3-A chips and later. Which is odd, because you don't expect to see ...
179
votes
4
answers
49k
views
Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?
In the x86-64 Tour of Intel Manuals, I read
Perhaps the most surprising fact is that an instruction such as MOV EAX, EBX automatically zeroes upper 32 bits of RAX register.
The Intel documentation (...
161
votes
4
answers
55k
views
What is the meaning of "non temporal" memory accesses in x86
This is a somewhat low-level question. In x86 assembly there are two SSE instructions:
MOVDQA xmmi, m128
and
MOVNTDQA xmmi, m128
The IA-32 Software Developer's Manual says that the NT in ...
160
votes
3
answers
200k
views
What does `dword ptr` mean?
Could someone explain what this means? (Intel Syntax, x86, Windows)
and dword ptr [ebp-4], 0
158
votes
14
answers
124k
views
How can I see the assembly code for a C++ program?
How can I see the assembly code for a C++ program?
What are the popular tools to do this?
156
votes
6
answers
110k
views
What is the purpose of XORing a register with itself? [duplicate]
xor eax, eax will always set eax to zero, right? So, why does MSVC++ sometimes put it in my executable's code? Is it more efficient than mov eax, 0?
012B1002 in al,dx
012B1003 push ...
155
votes
5
answers
275k
views
Purpose of ESI & EDI registers?
What is the actual purpose and use of the EDI & ESI registers in assembler?
I know they are used for string operations for one thing.
Can someone also give an example?
154
votes
13
answers
15k
views
How are everyday machines programmed? [closed]
How are everyday machines (not so much computers and mobile devices as appliances, digital watches, etc) programmed? What kind of code goes into the programming of a Coca-Cola vending machine? How ...
147
votes
5
answers
461k
views
What is the function of the push / pop instructions used on registers in x86 assembly?
When reading about assembler I often come across people writing that they push a certain register of the processor and pop it again later to restore its previous state.
How can you push a register? ...
147
votes
7
answers
44k
views
How does this milw0rm heap spraying exploit work?
I usually have no difficulty reading JavaScript code, but I can’t figure out the logic of this one. The code is from an exploit that was published 4 days ago. You can find it at milw0rm.
...
147
votes
2
answers
167k
views
What is the purpose of the RBP register in x86_64 assembler?
I'm trying to learn a little bit of assembly, because I need it for Computer Architecture class. I wrote a few programs, like printing the Fibonacci sequence.
I recognized that whenever I write a ...
143
votes
3
answers
311k
views
How can one see content of stack with GDB?
I am new to GDB, so I have some questions:
How can I look at content of the stack?
Example: to see content of register, I type info registers. For the stack, what should it be?
How can I see the ...
141
votes
11
answers
161k
views
How to view the assembly behind the code using Visual C++?
I was reading another question pertaining to the efficiency of two lines of code, and the OP said that he looked at the assembly behind the code and both lines were identical in assembly. Digression ...
138
votes
11
answers
285k
views
How to disassemble a binary executable in Linux to get the assembly code?
I was told to use a disassembler. Does gcc have anything built in? What is the easiest way to do this?
138
votes
4
answers
71k
views
What are CFI directives in Gnu Assembler (GAS) used for?
There seems to be a .CFI directive after every line, and there is a wide variety of these, e.g. .cfi_startproc, .cfi_endproc, etc. More here.
.file "temp.c"
.text
.globl main
...
138
votes
6
answers
127k
views
What is the "FS"/"GS" register intended for?
So I know what the following registers and their uses are supposed to be:
CS = Code Segment (used for IP)
DS = Data Segment (used for MOV)
ES = Destination Segment (used for MOVS, etc.)
SS = Stack ...
138
votes
4
answers
33k
views
Why does Windows64 use a different calling convention from all other OSes on x86-64?
AMD has an ABI specification that describes the calling convention to use on x86-64. All OSes follow it, except for Windows, which has its own x86-64 calling convention. Why?
Does anyone know the ...
136
votes
9
answers
189k
views
What does "int 0x80" mean in assembly code?
Can someone explain what the following assembly code does?
int 0x80
133
votes
3
answers
7k
views
Possible GCC bug when returning struct from a function
I believe I found a bug in GCC while implementing O'Neill's PCG PRNG. (Initial code on Godbolt's Compiler Explorer)
After multiplying oldstate by MULTIPLIER, (result stored in rdi), GCC doesn't add ...