Questions tagged [x86-64]
x86-64 is a 64-bit extension to the Intel x86 architecture.
7,129
questions
880
votes
17
answers
912k
views
What's the purpose of the LEA instruction?
For me, it just seems like a funky MOV. What's its purpose and when should I use it?
700
votes
4
answers
91k
views
How do I achieve the theoretical maximum of 4 FLOPs per cycle?
How can the theoretical peak performance of 4 floating point operations (double precision) per cycle be achieved on a modern x86-64 Intel CPU?
As far as I understand, it takes three cycles for an SSE ...
515
votes
6
answers
123k
views
Why does GCC generate 15-20% faster code if I optimize for size instead of speed?
I first noticed in 2009 that GCC (at least on my projects and on my machines) has a tendency to generate noticeably faster code if I optimize for size (-Os) instead of speed (-O2 or -O3), and I ...
376
votes
16
answers
208k
views
How can I determine if a .NET assembly was built for x86 or x64?
I've got an arbitrary list of .NET assemblies.
I need to programmatically check if each DLL was built for x86 (as opposed to x64 or Any CPU). Is this possible?
320
votes
7
answers
87k
views
Why does this code execute more slowly after strength-reducing multiplications to loop-carried additions?
I was reading Agner Fog's optimization manuals, and I came across this example:
double data[LEN];
void compute()
{
const double A = 1.1, B = 2.2, C = 3.3;
int i;
for(i=0; i<LEN; i++) {...
286
votes
17
answers
103k
views
Submit to App Store issues: Unsupported Architecture x86
So I am trying to use the Shopify API. When I archive the app and validate it then there are no issues but when I submit it to the app store then it gives me the following issues.
ERROR ITMS-90087: "...
271
votes
5
answers
38k
views
Why does GCC use multiplication by a strange number in implementing integer division?
I've been reading about div and mul assembly operations, and I decided to see them in action by writing a simple program in C:
File division.c
#include <stdlib.h>
#include <stdio.h>
int ...
236
votes
4
answers
38k
views
Why would introducing useless MOV store instructions speed up a tight loop in x86_64 assembly?
Background:
While optimizing some Pascal code with embedded assembly language, I noticed an unnecessary MOV instruction, and removed it.
To my surprise, removing the unnecessary instruction caused ...
201
votes
4
answers
184k
views
What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64
The following links explain x86-32 system call conventions for both UNIX (BSD flavor) & Linux:
http://www.int80h.org/bsdasm/#system-calls
http://www.freebsd.org/doc/en/books/developers-handbook/x86-...
189
votes
4
answers
34k
views
How does Rust's 128-bit integer `i128` work on a 64-bit system?
Rust has 128-bit integers, denoted with the data type i128 (and u128 for unsigned ints):
let a: i128 = 170141183460469231731687303715884105727;
How does Rust make these i128 values work on ...
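One common way to build 128-bit arithmetic from 64-bit registers is to split the value into two 64-bit halves and propagate a carry between them. The following is a minimal sketch in C, not a description of what the Rust compiler actually emits; the names u128_parts, add_u128 and add_u128_demo are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation: a 128-bit unsigned value as two 64-bit halves. */
typedef struct {
    uint64_t lo; /* bits 0..63   */
    uint64_t hi; /* bits 64..127 */
} u128_parts;

/* Add two 128-bit values, wrapping on overflow of the full 128 bits. */
u128_parts add_u128(u128_parts a, u128_parts b) {
    u128_parts r;
    r.lo = a.lo + b.lo;               /* low halves, wraps modulo 2^64 */
    uint64_t carry = (r.lo < a.lo);   /* 1 if the low addition wrapped */
    r.hi = a.hi + b.hi + carry;       /* high halves plus the carry    */
    return r;
}

/* Small usage example: (2^64 - 1) + 1 carries into the high half. */
void add_u128_demo(void) {
    u128_parts a = { UINT64_MAX, 0 };
    u128_parts b = { 1, 0 };
    u128_parts r = add_u128(a, b);
    assert(r.lo == 0 && r.hi == 1);
}
```

On x86-64 this pattern typically maps onto an add followed by an add-with-carry, which is why no 128-bit register is needed for it.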
179
votes
4
answers
49k
views
Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?
In the x86-64 Tour of Intel Manuals, I read
Perhaps the most surprising fact is that an instruction such as MOV EAX, EBX automatically zeroes upper 32 bits of RAX register.
The Intel documentation (...
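The behaviour quoted above can be modelled in plain C to make the asymmetry visible: a 32-bit write replaces the whole 64-bit register with a zero-extended value, while 16-bit and 8-bit writes leave the remaining bits untouched. This is only a sketch of the architectural rule, not real register access, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of writing EAX: the result is the 32-bit value zero-extended to 64 bits. */
uint64_t write_reg32(uint64_t rax, uint32_t value) {
    (void)rax;                 /* the old upper 32 bits are discarded */
    return (uint64_t)value;
}

/* Model of writing AX: only bits 0..15 change, bits 16..63 are preserved. */
uint64_t write_reg16(uint64_t rax, uint16_t value) {
    return (rax & ~UINT64_C(0xFFFF)) | value;
}

/* Model of writing AL: only bits 0..7 change, bits 8..63 are preserved. */
uint64_t write_reg8_low(uint64_t rax, uint8_t value) {
    return (rax & ~UINT64_C(0xFF)) | value;
}

/* Usage example starting from a register with all bits set. */
void register_model_demo(void) {
    uint64_t rax = UINT64_C(0xFFFFFFFFFFFFFFFF);
    assert(write_reg32(rax, 0x12345678u) == UINT64_C(0x0000000012345678));
    assert(write_reg16(rax, 0x1234) == UINT64_C(0xFFFFFFFFFFFF1234));
    assert(write_reg8_low(rax, 0x12) == UINT64_C(0xFFFFFFFFFFFFFF12));
}
```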
152
votes
12
answers
125k
views
How to find if a native DLL file is compiled as x64 or x86?
I want to determine if a native assembly is compiled as x64 or x86 from a managed code application (C#).
I think it must be somewhere in the PE header since the OS loader needs to know this ...
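The PE header does record the target machine: the DOS header stores the offset of the PE signature at byte 0x3C, and the two bytes after the "PE\0\0" signature are the COFF Machine field (0x014c for x86, 0x8664 for x64). A minimal sketch in C rather than C#, assuming the file has already been read into memory; pe_machine and the enum names are hypothetical helpers for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Machine values defined by the PE/COFF format. */
enum {
    PE_MACHINE_UNKNOWN = 0x0000,
    PE_MACHINE_I386    = 0x014c,
    PE_MACHINE_AMD64   = 0x8664
};

/* Returns the COFF Machine field of an in-memory image, or
   PE_MACHINE_UNKNOWN if the buffer does not look like a PE file. */
unsigned pe_machine(const unsigned char *image, size_t len) {
    assert(image != NULL);

    /* DOS header: "MZ" magic, and e_lfanew at offset 0x3C. */
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return PE_MACHINE_UNKNOWN;

    /* e_lfanew is a little-endian 32-bit offset to the PE signature. */
    uint32_t pe_off = (uint32_t)image[0x3C]
                    | ((uint32_t)image[0x3D] << 8)
                    | ((uint32_t)image[0x3E] << 16)
                    | ((uint32_t)image[0x3F] << 24);

    /* Need 4 signature bytes plus the 2-byte Machine field. */
    if (pe_off > len || len - pe_off < 6)
        return PE_MACHINE_UNKNOWN;
    if (memcmp(image + pe_off, "PE\0\0", 4) != 0)
        return PE_MACHINE_UNKNOWN;

    /* Machine is the first field of the COFF header, little-endian. */
    return (unsigned)image[pe_off + 4] | ((unsigned)image[pe_off + 5] << 8);
}
```

The same offsets apply when reading the file from managed code; only the I/O differs.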
149
votes
1
answer
32k
views
Bubble sort slower with -O3 than -O2 with GCC
I made a bubble sort implementation in C, and was testing its performance when I noticed that the -O3 flag made it run even slower than no flags at all! Meanwhile -O2 was making it run a lot faster as ...
147
votes
2
answers
167k
views
What is the purpose of the RBP register in x86_64 assembler?
I'm trying to learn a little bit of assembly, because I need it for Computer Architecture class. I wrote a few programs, like printing the Fibonacci sequence.
I noticed that whenever I write a ...
140
votes
11
answers
67k
views
Why do x86-64 systems have only a 48 bit virtual address space?
In a book I read the following:
32-bit processors have 2^32 possible addresses, while current 64-bit processors have a 48-bit address space
My expectation was that if it's a 64-bit processor, the ...
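To put the numbers in the quoted sentence side by side, here is a small sketch in C: 32 address bits give 2^32 bytes (4 GiB) and 48 bits give 2^48 bytes (256 TiB). It also models the canonical-address rule used with 48-bit virtual addresses, where bits 47 through 63 must all be equal. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of distinct byte addresses for a given address width. */
uint64_t address_space_bytes(unsigned bits) {
    assert(bits < 64);          /* 1 << 64 would not fit in uint64_t */
    return UINT64_C(1) << bits;
}

/* With 48-bit virtual addresses, a 64-bit pointer is canonical only if
   bits 47..63 are all zeros or all ones (sign extension of bit 47). */
bool is_canonical48(uint64_t addr) {
    uint64_t top = addr >> 47;  /* the 17 bits 47..63 */
    return top == 0 || top == UINT64_C(0x1FFFF);
}
```

For example, address_space_bytes(48) is 281474976710656, and 0x0000800000000000 is not canonical because bit 47 is set while bits 48 to 63 are clear.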
138
votes
4
answers
33k
views
Why does Windows64 use a different calling convention from all other OSes on x86-64?
AMD has an ABI specification that describes the calling convention to use on x86-64. All OSes follow it, except for Windows, which has its own x86-64 calling convention. Why?
Does anyone know the ...
133
votes
3
answers
7k
views
Possible GCC bug when returning struct from a function
I believe I found a bug in GCC while implementing O'Neill's PCG PRNG. (Initial code on Godbolt's Compiler Explorer)
After multiplying oldstate by MULTIPLIER, (result stored in rdi), GCC doesn't add ...
129
votes
11
answers
140k
views
Floating point vs integer calculations on modern hardware
I am doing some performance-critical work in C++, and we are currently using integer calculations for problems that are inherently floating point because "it's faster". This causes a whole ...
122
votes
19
answers
163k
views
System.BadImageFormatException: Could not load file or assembly (from installutil.exe)
I am trying to install a Windows service using InstallUtil.exe and am getting the error message
System.BadImageFormatException: Could not load file or assembly '{xxx.exe}' or one of its ...
119
votes
1
answer
138k
views
Ask GDB to list all functions in a program
How can you list all functions in a program with GDB?
117
votes
3
answers
174k
views
How to build x86 and/or x64 on Windows from command line with CMAKE?
One way to get cmake to build x86 on Windows with Visual Studio is like so:
Start Visual Studio Command prompt for x86
Run cmake: cmake -G "NMake Makefiles" \path_to_source\
nmake
One way to get ...
117
votes
5
answers
60k
views
Memory alignment : how to use alignof / alignas?
I work with shared memory right now.
I can't understand alignof and alignas.
cppreference is unclear: alignof returns "alignment", but what is "alignment"? Number of bytes to add for the next block ...
113
votes
10
answers
41k
views
Why is x86 ugly? Why is it considered inferior when compared to others? [closed]
I've been reading some SO archives and encountered statements against the x86 architecture.
Why do we need different CPU architecture for server & mini/mainframe & mixed-core? says
"PC ...
113
votes
8
answers
108k
views
Targeting both 32bit and 64bit with Visual Studio in same solution/project
I have a little dilemma on how to set up my visual studio builds for multi-targeting.
Background: c# .NET v2.0 with p/invoking into 3rd party 32 bit DLL's, SQL compact v3.5 SP1, with a Setup project. ...
105
votes
2
answers
57k
views
What does @plt mean here?
0x00000000004004b6 <main+30>: callq 0x400398 <printf@plt>
Does anyone know?
UPDATE
Why do two disas printf commands give me different results?
(gdb) disas printf
Dump of assembler code for function ...
104
votes
2
answers
65k
views
What does the endbr64 instruction actually do?
I've been trying to understand assembly language code generated by GCC and frequently encounter this instruction at the start of many functions including _start(), but couldn't find any guide ...
101
votes
3
answers
127k
views
How to run amd64 docker image on arm64 host platform?
I have an M1 Mac and I am trying to run an amd64-based Docker image on my arm64-based host platform. However, when I try to do so (with docker run) I get the following error:
WARNING: The requested ...
99
votes
6
answers
93k
views
How to detect 386, amd64, arm, or arm64 OS architecture via shell/bash
I'm looking for a POSIX shell/bash command to determine if the OS architecture is 386, amd64, arm, or arm64?
98
votes
2
answers
31k
views
What does "rep; nop;" mean in x86 assembly? Is it the same as the "pause" instruction?
What does rep; nop mean?
Is it the same as pause instruction?
Is it the same as rep nop (without the semi-colon)?
What's the difference to the simple nop instruction?
Does it behave differently on AMD ...
96
votes
2
answers
64k
views
How can objdump emit intel syntax
How can I tell objdump to emit assembly in Intel Syntax rather than the default AT&T syntax?
96
votes
1
answer
72k
views
x86_64 registers rax/eax/ax/al overwriting full register contents [duplicate]
As it is widely advertised, modern x86_64 processors have 64-bit registers that can be used in backward-compatible fashion as 32-bit registers, 16-bit registers and even 8-bit registers, for example:
...
91
votes
3
answers
43k
views
Where is the x86-64 System V ABI documented?
The x86-64 System V ABI (used on everything except Windows) used to live at http://x86-64.org/documentation/abi.pdf, but that site has now fallen off the internet.
Is there a new authoritative home ...
87
votes
3
answers
5k
views
Why can't GCC generate an optimal operator== for a struct of two int32s?
A colleague showed me code that I thought wouldn't be necessary, but sure enough, it was. I would expect most compilers would see all three of these attempts at equality tests as equivalent:
#include <...
79
votes
1
answer
6k
views
Why does my Intel Skylake / Kaby Lake CPU incur a mysterious factor 3 slowdown in a simple hash table implementation?
In short:
I have implemented a simple (multi-key) hash table with buckets (containing several elements) that exactly fit a cacheline.
Inserting into a cacheline bucket is very simple, and the critical ...
76
votes
2
answers
17k
views
What is the purpose of the "PAUSE" instruction in x86?
I am trying to create a dumb version of a spin lock. Browsing the web, I came across an assembly instruction called "PAUSE" in x86 which is used to give a hint to a processor that a spin-lock is ...
76
votes
1
answer
4k
views
C# and SIMD: High and low speedups. What is happening?
Introduction of the problem
I am trying to speed up the intersection code of a (2D) ray tracer that I am writing. I am using C# and the System.Numerics library to get the speed of SIMD instructions.
...
75
votes
1
answer
17k
views
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
int 0x80 on Linux always invokes the 32-bit ABI, regardless of what mode it's called from: args in ebx, ecx, ... and syscall numbers from /usr/include/asm/unistd_32.h. (Or crashes on 64-bit kernels ...
74
votes
3
answers
90k
views
Force gcc to compile 32 bit programs on 64 bit platform
I've got a proprietary program that I'm trying to use on a 64 bit system.
When I launch the setup it works OK, but afterwards it tries to update itself and compile some modules, and it fails to load them. ...
74
votes
6
answers
41k
views
How to use gdb with LD_PRELOAD
I run a program while LD_PRELOADing a specific library, like this:
LD_PRELOAD=./my.so ./my_program
How do I run this program with gdb?
73
votes
4
answers
6k
views
Why is the construction of std::optional<int> more expensive than a std::pair<int, bool>?
Consider these two approaches that can represent an "optional int":
using std_optional_int = std::optional<int>;
using my_optional_int = std::pair<int, bool>;
Given these two functions......
72
votes
4
answers
6k
views
Why is memcmp(a, b, 4) only sometimes optimized to a uint32 comparison?
Given this code:
#include <string.h>
int equal4(const char* a, const char* b)
{
return memcmp(a, b, 4) == 0;
}
int less4(const char* a, const char* b)
{
return memcmp(a, b, 4) < 0;
...
71
votes
4
answers
101k
views
What are the names of the new X86_64 processors registers?
Where can I find the names of the new registers for assembly on this architecture?
I am referring to registers in X86 like EAX, ESP, EBX, etc. But I'd like them in 64bit.
I don't think they are the ...
71
votes
5
answers
41k
views
To learn assembly - should I start with 32 bit or 64 bit?
I really want to learn assembly. I'm pretty good at C/C++, but want a better understanding of what's going on at a lower level.
I realize that assembly related questions have been asked before, ...
70
votes
6
answers
66k
views
What is the difference between x64 and IA-64?
I was on Microsoft's website and noticed two different installers, one for x64 and one for IA-64. Reference: Installing the .NET Framework 4.5, 4.5.1
My understanding is that IA-64 is a subclass of ...
69
votes
2
answers
198k
views
How to check if Intel Virtualization is enabled without going to BIOS in Windows 10 [closed]
I want to check if Intel virtualization is enabled on my laptop or not (Lenovo ThinkPad, Win 10 64-bit). Is there any way to check it without going to BIOS?
67
votes
4
answers
150k
views
Difference between x86, x32, and x64 architectures?
Please explain the difference between x86, x32 and x64? It's a bit confusing when it comes to x86 and x32 because most of the time 32-bit programs run on x86...
Related/possible duplicate which also ...
66
votes
3
answers
52k
views
What registers are preserved through a Linux x86-64 function call
I believe I understand how the Linux x86-64 ABI uses registers and stack to pass parameters to a function (cf. previous ABI discussion). What I'm confused about is if/what registers are expected to be ...
65
votes
2
answers
4k
views
Performance difference between Windows and Linux using Intel compiler: looking at the assembly
I am running a program on both Windows and Linux (x86-64). It has been compiled with the same compiler (Intel Parallel Studio XE 2017) with the same options, and the Windows version is 3 times faster ...
65
votes
1
answer
102k
views
Intel 64, rsi and rdi registers
In Intel x86 64-bit architecture there are the rax...rdx registers, which are simply the A...D general purpose registers.
But there are also registers called rsi and rdi which are the "source index" ...
61
votes
2
answers
17k
views
"Unexplainable" core dump
I've seen many core dumps in my life, but this one has me stumped.
Context:
multi-threaded Linux/x86_64 program running on a cluster of AMD Barcelona CPUs
the code that crashes is executed a lot
...