IDF version (git describe --tags to find it): v4.0-dev-225-g2f8b6cfc7
Compiler version (xtensa-esp32-elf-gcc --version to find it):
xtensa-esp32-elf-gcc emits wrong machine code in some situations. The particular problem I noticed is that the machine code generated for the l32r a6, 0x???? instruction not only has the wrong byte order but also lacks the immediate field.
When I was trying to study the machine code of the ESP32, I discovered a strange pattern in the disassembly that I couldn't explain. After some searching and guesswork, I roughly figured out what might have gone wrong with the compiler. See the section "Steps to reproduce" for details.
For all instructions like l32r a6, 0xabcd, the generated machine code should be 0xabcd61.
For some l32r a6, 0xabcd instructions, the machine code turned out to be 0x610000, which coincidentally encodes another instruction, xsr.lbeg a0. Interestingly, the linker silently ignores the problem during the linking and relocation stage.
I have no time to create my own program that exactly reproduces the problem. However, the binary wifi library shipped with esp-idf should give you enough information about the problem.
Here I attached a snippet of the disassembly of the function esp_wifi_get_channel before and after relocation (linking).
after_relocation.txt
before_relocation.txt
Search for 610000 or xsr.lbeg to locate the problem.
The disassembly of this function is particularly useful for showing the problem. The function is supposed to take 2 arguments (stored in registers a2 and a3). Register a6 is never written to anywhere in the function, yet it is read several times. The only place where it could be assigned a value is where 610000 resides, which looks strikingly similar to 000061, which stands for l32r a6, offset.
EDIT: Double-checked the machine code with a hex editor. The problem is indeed in the compiler rather than in objdump.

EDIT: Also note the garbage bytes cf, ff that follow the 610000 in before_relocation.txt and the garbage instruction 400e8745: fd4c movi.n a13, 79 that follows xsr.lbeg a0 in after_relocation.txt. I have no idea what is happening to the compiler and linker here.
> Search for 610000 or xsr.lbeg to locate the problem.
I've found the following:
c8: f01d retw.n
ca: 610000 xsr.lbeg a0
cc: R_XTENSA_SLOT0_OP .text.esp_wifi_get_channel+0x8
cd: cf .byte 0xcf
ce: ff .byte 0xff
cf: 0a1c movi.n a10, 16
d1: 0628 l32i.n a2, a6, 0
and I can tell two things:
first: bytes at 0xca and 0xcb are two padding zeros.
second: relocation at address 0xcc applies to a whole instruction.
The code above is fine; the disassembly is wrong because the disassembler got out of sync with the instruction stream.
A patch that fixes this objdump issue is available here: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=4b8e28c79356265b2c111e044142fb6d6d2db44e
It has been part of binutils since release 2.31.
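To illustrate how a disassembler can fall out of sync like this, here is a rough Python sketch of a linear sweep over the bytes from the listing above. It is not how objdump is implemented. The byte string is reconstructed from the listing on the assumption that objdump prints each instruction as a little-endian value, and the length rule in `insn_length` (low nibble 0 to 7 means 3 bytes, 8 to 13 means 2 bytes, anything else is shown as a lone byte) is a simplified assumption for a configuration without wide instruction formats.

```python
# Bytes of the object file from offset 0xc8, reconstructed from the listing
# (assumption: "f01d" is stored as 1d f0 and "610000" as 00 00 61).
CODE = bytes.fromhex("1df0" "0000" "61cfff" "1c0a" "2806")
BASE = 0xC8


def insn_length(first_byte):
    """Instruction length from the low nibble of the first byte (assumed rule)."""
    op0 = first_byte & 0x0F
    if op0 <= 7:
        return 3      # ordinary 24-bit instruction
    if op0 <= 13:
        return 2      # 16-bit narrow (".n") instruction
    return 1          # not decodable in this sketch; shown as a lone byte


def sweep(start):
    """Linear sweep: list of (offset, raw bytes as hex) from `start` onward."""
    out = []
    pos = start - BASE
    while pos < len(CODE):
        n = insn_length(CODE[pos])
        out.append((BASE + pos, CODE[pos:pos + n].hex()))
        pos += n
    return out


for label, start in (("from 0xca (padding decoded as code)", 0xCA),
                     ("from 0xcc (after the padding)", 0xCC)):
    print(label)
    for offset, raw in sweep(start):
        print(f"  {offset:x}: {raw}")
```

Starting at 0xca, the two padding zeros swallow the 0x61 byte and leave cf and ff stranded, which is the pattern in the listing. Starting at 0xcc, the same bytes group into a single 3-byte instruction followed by the two narrow ones.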
> xtensa-esp32-elf-gcc emits wrong machine code in some situations.
The compiler does not emit machine code; it only emits assembly code. The assembler turns it into machine code. You can invoke the compiler with -S instead of -c to get the assembly output and see what's there.
Thanks for the reply. However, note the order of the three bytes. Shouldn't the machine code at ca be 000061 instead of 610000 according to the Xtensa ISA? In other words, shouldn't the three bytes in linear order be 61 00 00 in the attached hex editor screenshot?
According to ISA, XSR.* instructions have the following form:
23 <- 0110 0001 sr t 0000 -> 0, where sr = 0000 0000 for lbeg, t = 0000 for a0.
Due to little-endianness, this should translate to 00 00 61 in a hex editor.
However, the L32R at, imm16 instruction is encoded like this:
23 <- imm16 t 0001 -> 0, where t = 0110 for a6
So the correct form should be 61 00 00 in a hex editor, which is definitely not the case currently...
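The two bit layouts above can be checked with a few lines of Python. This is only a sketch of the field packing as described in this comment; the helper names `encode_xsr`, `encode_l32r` and `memory_bytes` are made up for illustration.

```python
def encode_xsr(sr, t):
    """XSR at, sr: bits 23..16 = 0110 0001, sr in 15..8, t in 7..4, 0000 in 3..0."""
    return (0x61 << 16) | (sr << 8) | (t << 4)


def encode_l32r(t, imm16):
    """L32R at, imm16: imm16 in bits 23..8, t in 7..4, 0001 in 3..0."""
    return (imm16 << 8) | (t << 4) | 0x1


def memory_bytes(word):
    """The three bytes of a 24-bit instruction word in little-endian order."""
    return word.to_bytes(3, "little").hex(" ")


LBEG = 0x00  # special register number for lbeg, as given above
xsr = encode_xsr(LBEG, t=0)        # xsr.lbeg a0
l32r = encode_l32r(t=6, imm16=0)   # l32r a6 with an all-zero imm16 field

print(f"xsr.lbeg a0: word 0x{xsr:06x}, bytes {memory_bytes(xsr)}")
print(f"l32r a6, 0 : word 0x{l32r:06x}, bytes {memory_bytes(l32r)}")
```

Under these layouts, the word 0x610000 sits in memory as 00 00 61, while an l32r a6 with a zero offset would sit in memory as 61 00 00.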
Also, trailing zeros are usually not the convention of GCC's assembler. In my experience, it usually fills the offset with some plausible number and changes it later during the relocation phase.
EDIT: You can also verify this with other disassemblers such as:
https://onlinedisassembler.com/
Here are the screenshots:
Original buggy version:

After correction:

EDIT: __Also note that the code is still not correct even after relocation.__ after_relocation.txt shows the same piece of code after linking. It is taken from the disassembly of the full ROM image. See the end of my problem description for details. The instruction 400e8745: fd4c movi.n a13, 79 that follows the problematic xsr instruction is useless, since the a13 register is never used after that. That instruction is also not in the object file. I suspect it is part of the erroneous machine code that coincidentally resembles a movi instruction.
Ahh, I realized what went wrong here. It is indeed a problem with objdump. The code is correct.
Instead of being 0x610000, the instruction is actually 0xffcf61, which decodes to l32r a6 with an offset field of 0xffcf, that is -49 words or -0xC4 bytes. The two zero bytes are for alignment purpose since most of the branching instructions can only branch to 32bit aligned addresses.
/* There is a branching to 0xcc elsewhere */
c8: f01d retw.n /* This is a return instruction. Execution will never go past this point */
ca: 610000 xsr.lbeg a0 /* The 0x61 is at 0xcc, which is a 32bit-aligned address thanks to those zeros */
cc: R_XTENSA_SLOT0_OP .text.esp_wifi_get_channel+0x8
cd: cf .byte 0xcf
ce: ff .byte 0xff /* 0xffcf61 is the actual instruction */
cf: 0a1c movi.n a10, 16
d1: 0628 l32i.n a2, a6, 0
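As a sanity check of this reading, the three bytes at 0xcc can be decoded by hand. The sketch below assumes the usual L32R address rule (imm16 extended with ones, shifted left by two, added to the instruction address plus three with the low two bits cleared); `decode_l32r` is a hypothetical helper, not part of any toolchain.

```python
def decode_l32r(raw, pc):
    """Decode three little-endian bytes as L32R.

    Returns (register number, byte offset, literal address).
    """
    word = int.from_bytes(raw, "little")
    assert word & 0xF == 0x1, "low nibble is not the L32R opcode"
    t = (word >> 4) & 0xF
    imm16 = word >> 8
    # imm16 is a negative word count: extend with ones, then scale by 4.
    offset = (imm16 - 0x10000) * 4
    # The base is (pc + 3) rounded down to a multiple of 4.
    target = ((pc + 3) & ~3) + offset
    return t, offset, target


reg, offset, target = decode_l32r(bytes.fromhex("61cfff"), pc=0xCC)
print(f"l32r a{reg}: offset {offset:#x}, literal at {target:#x}")
```

With pc = 0xcc this gives register a6 and a literal address of 0x8, which appears to agree with the relocation target .text.esp_wifi_get_channel+0x8 in the listing.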
> The two zero bytes are for alignment purpose since most of the branching instructions can only branch to 32bit aligned addresses.
Only call{0,4,8,12} instructions require 4-byte-aligned addresses on xtensa; ordinary branches and jumps don't require any alignment for the target address. But sometimes it is a bit faster when the branch target is aligned, so the assembler does that. This behavior may be disabled with the --no-target-align option to the assembler.
There are additional alignment complexities with the loop instruction and the first instruction of the loop body.
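For the simple case in this thread, the amount of padding is just the distance to the next 4-byte boundary. The snippet below only shows that arithmetic; it is not the assembler's actual alignment policy, which also weighs the loop-related cases mentioned above, and `padding_for_alignment` is a made-up name.

```python
def padding_for_alignment(address, alignment=4):
    """Bytes of fill needed so that `address` lands on an aligned boundary."""
    return -address % alignment


# The instruction after retw.n would start at 0xca without padding.
pad = padding_for_alignment(0xCA)
print(f"{pad} padding bytes, target moves to {0xCA + pad:#x}")
```

For 0xca this yields two bytes of fill, which places the l32r at 0xcc as seen in the listing.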
A preview release of the toolchain that includes an updated binutils version is now available, see https://www.esp32.com/viewtopic.php?f=10&t=7400&p=31257#p41667.