v0.9.15 doesn't start the emulation on an M1 Mac mini. I can open the app and configure everything, but as soon as I start the emulation it stays at a blank screen. If I start the app with "open with Rosetta" checked, it works, although the emulation is way too fast (discussed in another issue for virtualC64).
Xcode output just before it stops:
2020-11-25 09:50:48.739121+0100 vAmiga[1948:75168] setup
DiskMountDialog.120::awakeFromNib()
DialogController.68::awakeFromNib()
DiskMountDialog.239::windowDidResize(_:)
DiskMountDialog.239::windowDidResize(_:)
DiskMountDialog.199::insertDiskAction(_:): insertDiskAction df0
MyDocument.439::loadScreenshots(): Seeking screenshots for disk with id 13183632172158702109
MyDocument.448::loadScreenshots(): 0 screenshots loaded
Animation.197::zoomIn(steps:): Zooming in...
v0.9.15.2 emulation does not start on Apple Silicon.
There are some loads from misaligned addresses that should probably be fixed.
Quick fix:
(vAmiga) > Edit Scheme... > Diagnostics > Enable "Thread Sanitizer" and "Undefined Behavior Sanitizer"
Replacing read16 and read32 in Serialization.h with this quick and dirty hack gets rid of the misaligned address accesses.
inline u16 read16(u8 *& buffer)
{
u8 b1 = *buffer;
buffer += 1;
u8 b2 = *buffer;
buffer += 1;
u16 result = (b1 << 8) | b2;
return result;
}
inline u32 read32(u8 *& buffer)
{
u8 b1 = *buffer;
buffer += 1;
u8 b2 = *buffer;
buffer += 1;
u8 b3 = *buffer;
buffer += 1;
u8 b4 = *buffer;
buffer += 1;
// Cast before shifting so that b1 << 24 cannot overflow a signed int
u32 result = ((u32)b1 << 24) | ((u32)b2 << 16) | ((u32)b3 << 8) | b4;
return result;
}
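As an alternative to the byte-by-byte hack above, here is a minimal sketch that copies the bytes with memcpy and converts them with ntohs/ntohl. The names readBE16 and readBE32 are hypothetical and not part of vAmiga; the sketch assumes a POSIX system that provides arpa/inet.h. memcpy is well defined at any alignment, and optimizing compilers typically turn it into a single load plus a byte swap.

```cpp
#include <arpa/inet.h>  // ntohs, ntohl (POSIX)
#include <cstdint>
#include <cstring>      // std::memcpy

// Hypothetical helper: reads a big-endian 16-bit value from an arbitrarily
// aligned buffer and advances the buffer pointer by two bytes.
inline uint16_t readBE16(const uint8_t *&buffer)
{
    uint16_t raw;
    std::memcpy(&raw, buffer, sizeof raw);  // no misaligned load
    buffer += sizeof raw;
    return ntohs(raw);                      // big-endian -> host byte order
}

// Hypothetical helper: same as above for a big-endian 32-bit value.
inline uint32_t readBE32(const uint8_t *&buffer)
{
    uint32_t raw;
    std::memcpy(&raw, buffer, sizeof raw);
    buffer += sizeof raw;
    return ntohl(raw);
}
```

Starting at an odd offset, for example `const uint8_t *p = data + 1;`, both helpers can be called back to back; the pointer ends up six bytes further on.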
However, several data races still make the emulator stop on Apple silicon. The same races occur on x86 machines, but they do not cause any problems there.
Thanks a lot for digging into that!
How did you detect the race conditions on x64 machines? Using the Sanitizer setting mentioned above?
Being able to debug these conditions on x64 Macs would be brilliant news. It would mean that I can rule out (hopefully all) Mac Silicon bugs without having such a machine in my possession.
Yes, use the settings above. And thanks for your brilliant work!
To tackle the memory alignment issue, let's do some benchmarking first. Here is my example code:
#include <arpa/inet.h>
unsigned char a[1024];
unsigned short read16(int i) {
return a[i] << 8 | a[i+1];
}
unsigned long read32(int i) {
return a[i] << 24 | a[i+1] << 16 | a[i+2] << 8 | a[i+3];
}
unsigned short read16_2(int i) {
return htons(((unsigned short *)a)[i]);
}
unsigned long read32_2(int i) {
return htonl(((unsigned long *)a)[i]);
}
Code produced by x86-64 gcc (-O3):
read16(int):
movsx rax, edi
add edi, 1
movzx edx, BYTE PTR a[rax]
movsx rdi, edi
movzx eax, BYTE PTR a[rdi]
sal edx, 8
or eax, edx
ret
read32(int):
movsx rax, edi
lea edx, [rdi+3]
movzx eax, BYTE PTR a[rax]
movsx rdx, edx
movzx edx, BYTE PTR a[rdx]
sal eax, 24
or eax, edx
lea edx, [rdi+1]
add edi, 2
movsx rdx, edx
movsx rdi, edi
movzx edx, BYTE PTR a[rdx]
sal edx, 16
or eax, edx
movzx edx, BYTE PTR a[rdi]
sal edx, 8
or eax, edx
cdqe
ret
read16_2(int):
movsx rdi, edi
movzx eax, WORD PTR a[rdi+rdi]
rol ax, 8
ret
read32_2(int):
movsx rdi, edi
mov rax, QWORD PTR a[0+rdi*8]
bswap eax
mov eax, eax
ret
a:
.zero 1024
Code produced by x86-64 clang (-O3):
read16(int): # @read16(int)
movsxd rax, edi
movzx eax, word ptr [rax + a]
rol ax, 8
ret
read32(int): # @read32(int)
movsxd rax, edi
movzx ecx, byte ptr [rax + a]
shl ecx, 24
movzx edx, byte ptr [rax + a+1]
shl rdx, 16
movsxd rcx, ecx
or rcx, rdx
movzx edx, byte ptr [rax + a+2]
shl rdx, 8
or rdx, rcx
movzx eax, byte ptr [rax + a+3]
or rax, rdx
ret
read16_2(int): # @read16_2(int)
movsxd rax, edi
movzx eax, word ptr [rax + rax + a]
rol ax, 8
ret
read32_2(int): # @read32_2(int)
movsxd rax, edi
mov eax, dword ptr [8*rax + a]
bswap eax
ret
a:
.zero 1024
Bottom line:
In v0.9.16.1, I've introduced new macros for big endian memory access:
//
// Accessing memory
//
// Reads a value in big-endian format
#define R8BE(a) (*(u8 *)(a))
#define R16BE(a) HI_LO(*(u8 *)(a), *(u8 *)((a)+1))
#define R32BE(a) HI_HI_LO_LO(*(u8 *)(a), *(u8 *)((a)+1), *(u8 *)((a)+2), *(u8 *)((a)+3))
#define R8BE_ALIGNED(a) (*(u8 *)(a))
#define R16BE_ALIGNED(a) (htons(*(u16 *)(a)))
#define R32BE_ALIGNED(a) (htonl(*(u32 *)(a)))
// Writes a value in big-endian format
#define W8BE(a,v) { *(u8 *)(a) = (v); }
#define W16BE(a,v) { *(u8 *)(a) = HI_BYTE(v); *(u8 *)((a)+1) = LO_BYTE(v); }
#define W32BE(a,v) { W16BE(a,HI_WORD(v)); W16BE((a)+2,LO_WORD(v)); }
#define W8BE_ALIGNED(a,v) { *(u8 *)(a) = (u8)(v); }
#define W16BE_ALIGNED(a,v) { *(u16 *)(a) = ntohs((u16)(v)); }
#define W32BE_ALIGNED(a,v) { *(u32 *)(a) = ntohl((u32)(v)); }
The faster _ALIGNED variants are used for accessing the Amiga memory. They will work on ARM as well, because all accesses will happen at aligned memory locations. The other variants are utilized by the Serializer since the values inside a snapshot are not aligned in general.
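Since the _ALIGNED variants rely on every access being aligned, a debug check can help catch callers that break this assumption. The following is a minimal sketch; the helper name isAlignedFor is hypothetical and not part of vAmiga.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true if p is suitably aligned for type T.
// It could be used in an assert() in front of an _ALIGNED access.
template <typename T>
inline bool isAlignedFor(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A call such as `assert(isAlignedFor<uint32_t>(addr));` placed before a 32-bit aligned access would flag a misaligned address in debug builds and compile away when NDEBUG is defined.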
v0.9.16.1: still no joy on Apple silicon. A few days ago I had it running with roger's fixes, though...
I do have a spare Macmini9,1. If you want me to, I can set up a VPN / Screen Sharing account for you. That's how roger was able to do some debugging on M1.
Hmm, I am uncertain whether the cause of the failure on M1 has been found... It is not clear to me whether it ran OK with roger's fixes on M1 or not...
If it did run OK with the above fixes, then zip the project and post it here... dirk can then run a file compare and patch it into the version on GitHub...
Unfortunately, I have already wiped the Mac mini on which roger did the debugging. I only kept the working .app, which runs slow as molasses if started without the "Thread Sanitizer" of Xcode.
Oh I see... and there also seems to be no ARM emulator on Intel Macs to test the ARM build of vAmiga or vC64 directly... Apple wants us to buy the real hardware 😬
Another way to test the ARM build of the emulator code would be to create a blank iOS project and copy only the emulator source folder into it, then instantiate the C64 or Amiga object and deploy and run the emulator on an iPhone/iPad with an Ax processor, which shares the same ABI as the M1 processor. So in theory the same problems should arise in an iPad/iPhone build of the vAmiga/C64 emulator, no?
I have a mac mini with M1. I also get the same issue that emulation doesn't seem to start.
I have an iPhone with an AppleSilicon A9 processor. And emulation seems to start on that machine... I see the DF0 drive head clicking ...

Question for the M1 owners: do you at least see the hand disk image?
Next I will try to boot Defender of the Crown on the A9 processor...
I only see a black screen when I try to boot with either Kickstart 1.3 or the built-in AROS.
The emulator itself, taken from the master branch without its GUI, loads Defender of the Crown on an iPhone with an Apple silicon A9 processor.
(Nothing happens in the first seconds of the video because Xcode is loading the code package onto the iPhone... wait a bit.)
Good news. I got it to boot :)

There seems to be a problem with this code
void
Oscillator::waitUntil(u64 deadline)
{
#ifdef __MACH__
// mach_wait_until(deadline);
#else
assert(false);
// TODO: MISSING IMPLEMENTATION
#endif
}
If I comment out the wait call above, things start running (too fast, of course). The sleep value that comes in here is way too big, so it looks like the code just sits sleeping in there.
deadline u64 515659668836899
Wow, that's great news! So it's due to the Mach conversion unit thing that was mentioned in another thread (where I replied the code would already take care of it 🙄).
Yeah. I think the code needs some extra checks :) I can run more tests if you find something.
@emoon good news that M1 is back in the game 🤩!!
The timing bug on Apple silicon and its background were mentioned in this thread, see the second post: https://github.com/dirkwhoffmann/virtualc64/issues/592
That perfectly explains why the isolated emulator on the A9 needed warp mode set to true!
At first I paid no attention to the fact that I had to set it to true to make the isolated emulator code work on the iPhone... 🙄
If I now comment out the setWarp(true) call like this
//armAmiga->setWarp(true);
DiskFile *df0file = DiskFile::makeWithFile(df0_path);
if (df0file) {
fprintf(stderr, "disk found, insert the disk %s", df0_path);
Disk *disk = Disk::makeWithFile(df0file);
armAmiga->df0.ejectDisk();
armAmiga->df0.insertDisk(disk);
armAmiga->df0.setWriteProtection(false);
}
armAmiga->run();
and run it again on the A9 chip, it no longer loads Defender of the Crown. The A9 gets stuck very early in the boot process, just like the M1, with no drive activity.

I guess setWarp(true) skips the defective wait code...
@dirkwhoffmann should I send you the xcode project with the isolated emulator code for testing on iPhone ARM chip ?
Yes, that's the way to go
PM with zipped project sent... 😎
Too bad for Apple... they wanted to sell us tons of M1 Macs and now we just use our iPhones for ARM development instead 😂
I think I got it:
This
Oscillator::waitUntil(u64 deadline)
{
#ifdef __MACH__
mach_wait_until(deadline);
#else
...
has to be replaced by that:
Oscillator::waitUntil(u64 deadline)
{
#ifdef __MACH__
mach_wait_until(nanos_to_abs(deadline));
#else
...
My nanos_to_abs function was alright, but it was not called in all the necessary places.
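To illustrate why the conversion matters, here is a minimal sketch of a nanoseconds-to-ticks conversion based on mach_timebase_info. It is not vAmiga's actual nanos_to_abs; the names nanosToTicks and waitUntilNanos are hypothetical. The timebase is commonly reported as 1/1 on Intel Macs and 125/3 on Apple silicon, which would explain why passing nanoseconds straight to mach_wait_until works on Intel but yields a deadline far in the future on the M1.

```cpp
#include <cstdint>
#ifdef __MACH__
#include <mach/mach_time.h>
#endif

// Hypothetical helper: converts nanoseconds into Mach absolute-time ticks.
// One tick lasts numer/denom nanoseconds, so ticks = nanos * denom / numer.
// Assumption: nanos * denom fits into 64 bits (no overflow handling here).
inline uint64_t nanosToTicks(uint64_t nanos, uint32_t numer, uint32_t denom)
{
    return nanos * denom / numer;
}

#ifdef __MACH__
// Hypothetical wrapper: sleeps until a deadline given in nanoseconds.
inline void waitUntilNanos(uint64_t deadlineNanos)
{
    mach_timebase_info_data_t tb;
    mach_timebase_info(&tb);  // query the tick-to-nanosecond ratio
    mach_wait_until(nanosToTicks(deadlineNanos, tb.numer, tb.denom));
}
#endif
```

With a 1/1 timebase the conversion is the identity, so a missing call goes unnoticed; with 125/3, 1000 ns correspond to only 24 ticks.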
Yes, works fine now :)

yup, confirmed! works fine now ;-)
Great news and great teamwork!
I'll upload version v0.16.2 tomorrow.
0.9.16.2 ;-)
This thread can be closed; v0.9.16.2 runs on Apple silicon.
issue resolved, v0.9.16.2 works on Apple silicon.
thanks everybody, happy new year to all!
Could you do one thing before I close the thread? Could you check if snapshot saving (and loading) works? 😬
yup: taking and reverting to a snapshot works for me.
Great 😎. I'll close it then. Please reopen if other issues arise.
What would also be interesting (only theoretically) is whether snapshots saved on M1 can be loaded on an Intel machine... but in practice it's maybe not that important...