I would like to revisit some key points before tackling porting 5.0 code :
Codebase reduction : I believe we should drop support on this repo for windows service, and keep the effort entirely on Redis core executables. Windows service could be a side effort as a contrib project, it's a cross-cutting concern.
Formatting : Porting 5.0 will be a lot easier if somehow we match the formatting back to antirez's repo.
Modules: We could port some key modules and maintain them.
Upstream: Currently our upstream is microsoft's archieve redis. Clearly that won't help, we could consider removing the upstream or forking antirez's.
Feel free to add any other topics we should discuss.
Answering your questions:
Win32_Interop project). Since Redis is a server and operates solely in the background - natural expectation (also mine) on Windows is that it will be running... in the background :) and started automatically. I myself would like to have it installed as Windows Service and do not bother to set up anything else that would start it up.src/*.{c,h} last week.1, 2 and are settled so. Ok
Regarding item 4, that's exactly what I've thought. The origin is stalled.
My suggestion would be for us to fork antirez's directly, sync our 4.14 changes with the lastest commit there and from there evolve. We can also rename the windows project and give it a github.io page. The goal would be to attract more interest and contributors.
Pros : semantically makes more sense, makes absorbing upstream commits straightforward in most cases.
Cons : work to be done to rebase the repository. Might be not worth the effort right now.
@tporadowski I've merged the 5.0.6 commits, been debugging and applying fixes but my time is up for now. Also synced the tcl tests and unit tests don't look bad.
I've been facing an issue with AOF and RDB save that I couldn't figure out yet.
=== REDIS BUG REPORT START: Cut & paste starting from here ===
Redis version: 5.0.6
[36560] 01 Oct 18:00:43.980 # --- EXCEPTION_ACCESS_VIOLATION
[36560] 01 Oct 18:00:43.980 # --- STACK TRACE
redis-server.exe!LogStackTrace(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:95)(0x100000003, 0x1403247E0, 0x00170F00, 0x00000001)
redis-server.exe!StackTraceInfo(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:154)(0x00000003, 0x140324948, 0x140324530, 0x00000001)
redis-server.exe!UnhandledExceptiontHandler(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0x0014BD80, 0x140067C00, 0x00000000, 0x00448B70)
KERNELBASE.dll!UnhandledExceptionFilter(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0x0014C500, 0x7FF86A764420, 0x00000000, 0x140000000)
ntdll.dll!memset(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0x00393E65, 0x7FF86A63D0BD, 0x0014C348, 0x0014C3D0)
ntdll.dll!_C_specific_handler(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0x00000000, 0x0014C330, 0x0014C9F0, 0x0014C9F0)
ntdll.dll!_chkstk(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0x00000001, 0x7FF86A620000, 0x00000000, 0x7FF86A78E9F0)
ntdll.dll!RtlRaiseException(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC, 0x00000000, 0xCCCCCCCCCCCCCCCC)
ntdll.dll!KiUserExceptionDispatcher(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_StackTrace.cpp:186)(0x14046D97C, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC)
redis-server.exe!dictFingerprint(C:\Repos\tporadowski-redis\src\dict.c:531)(0x00000000, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC)
redis-server.exe!dictNext(C:\Repos\tporadowski-redis\src\dict.c:588)(0x05407040, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC)
redis-server.exe!rdbSaveModulesAux(C:\Repos\tporadowski-redis\src\module.c:3388)(0x0014D490, 0x00000001, 0x0014D5A0, 0x00000009)
redis-server.exe!rdbSaveRio(C:\Repos\tporadowski-redis\src\rdb.c:1185)(0x0014D490, 0x0014D514, 0x100000000, 0x0014D5A0)
redis-server.exe!rdbSave(C:\Repos\tporadowski-redis\src\rdb.c:1318)(0x02592760, 0x0014D5A0, 0xCCCCCCCCCCCCCCCC, 0xCCCCCCCCCCCCCCCC)
redis-server.exe!do_rdbSave(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_QFork_impl.c:40)(0x02592760, 0x00000CC8, 0x02592750, 0x00000000)
redis-server.exe!QForkChildInit(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_QFork.cpp:344)(0x0000020C, 0x00009A00, 0x0000000A, 0x0014E578)
redis-server.exe!QForkStartup(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_QFork.cpp:525)(0x00000006, 0x00444E40, 0x00000000, 0x0014EEA8)
redis-server.exe!main(C:\Repos\tporadowski-redis\src\Win32_Interop\Win32_QFork.cpp:1283)(0x00000006, 0x00444E40, 0x00000000, 0x1400AADDD)
redis-server.exe!invoke_main(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:79)(0x140311700, 0x140311C10, 0x00000000, 0x00000000)
redis-server.exe!__scrt_common_main_seh(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
redis-server.exe!__scrt_common_main(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:331)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
redis-server.exe!mainCRTStartup(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:17)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
KERNEL32.DLL!BaseThreadInitThunk(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:17)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
ntdll.dll!RtlUserThreadStart(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:17)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
ntdll.dll!RtlUserThreadStart(d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:17)(0x00000000, 0x00000000, 0x00000000, 0x00000000)
If you happen to have some time, this could use a fresh pair of eyes.
@israellot I was hoping that moving on to latest version would be easier when we have 4.0.x up-to-date, but didn't expect this to be so fast :D. Great job!
I've taken a look at this unit/dump test that fails and have a solution for this which I'm testing now - it was related to module.c: static dict* modules pointer being accessed in child process which was not passed there (and was therefore NULL).
Fix pushed in db01475, but later on some other tests fail.
Thanks @tporadowski ! I'll continue to work on making tests pass.
Some more progress today. Now stuck at /unit/maxmemory.tcl test.
Expected condition '$slave_buf > 2*1024*1024' to be true (0 > 2*1024*1024)
Hand testing the counter seems fine when plugging in replicas. The test fails with a zeroed counter.
I'll be able to get back to it in a few days only probably. If you happen to take a look, let me know.
@israellot I compared all src/* and tests/* files with original 5.0.6 and for the moment decided to disable those 2 failing tests in unit/maxmemory as they require the slave process to be temporarily stopped (kill -SIGSTOP) and then resumed (kill -SIGCONT). For that we'd have to find a similar way under Windows (either through some PsSuspend from SysInternals or some other tool).
It makes sense. That being said we have a build that passes all the tests.
Out of curiosity I downloaded latest 1.0.8 tech-preview release of Memurai. It is up-to-date with Redis 5.0.3 from what I understood, so I linked their executable (memurai.exe) to src/redis-server, removed one test that was checking version string and ran the tests from our develop branch (which is now at 5.0.6). Well, I don't remember now when it failed, but AFAIR it didn't even reach half of the tests... I will try to find out what causes them to fail on my side when I have some time.
So I think we're good and could even release 5.0.6 soon. There's still #4 though which might be worth looking at as maybe there is something vital fixed/added in deps that Redis for Windows would also benefit from?
Yes, we should probably update jemalloc to 5.x
I managed to get a compiling version of the project with jemalloc 5.2.1
Being able to ./configure jemalloc was a pain though.
Redis now fails at the same point on a RDB fork while saving module data.
I'm not sure if the missing pointer is the problem anymore.
You can check branch jemalloc-5.2.1
Hi @tporadowski
as the dev manager of Memurai I would like to inform you that Memurai passes 100% of the Redis tests (unit, integration, sentinel and cluster).
Thank you and good luck with your project.
Hi @enricogior,
I wouldn't expect to be otherwise :), but as I read my previous comment now it lacked this information, sorry for that. I edited it already and wrote that I will try to find out why they fail on my side, as I was certain they pass for your releases.
@israellot pointer to modules dictionary is passed without any problems, it seems though that this memory is allocated in a different way now and is not accessible to child (forked) process.
From Jemalloc release notes
Unlike all previous jemalloc releases, this release does not use naturally aligned "chunks" for virtual memory management, and instead uses page-aligned "extents". This change has few externally visible effects, but the internal impacts are... extensive. Many other internal changes combine to make this the most cohesively designed version of jemalloc so far, with ample opportunity for further enhancements.
Maybe the new allocation scheme would need a Qfork implementation.
At this point I would suggest we stick with jemalloc 4.x for our redis 5.x and leave the allocator upgrade as further development efforts.
Any thoughts @tporadowski ?
I did try to upgrade it, but so far without success. Did you try jemalloc 4.x?
Yes, no luck with 4.5.0 as well. Maybe we are missing something obvious on jemalloc configure step.
I've tried a bunch of flags like --disable-tcache and etc, but non seemed to do the trick so far.
I was able to investigate further, the pointers are passed correctly but the child process is blocked from reading the memory address. It seems the memory cannot be shared cross process anymore when updating the allocator.
@israellot I've been thinking about that newer jemalloc version for a while and wondering if not to switch to using threads even instead of a fork()ed process... and then I reminded myself of something which I totally, TOTALLY (!) forgot - jemalloc is customized in this fork... 馃ぃ I need to put this into Wiki or README. Whenever jemalloc calls VirtualAlloc or VirtualFree - those need to go through our AllocHeapBlock/PurgePages and FreeHeapBlock functions as they record which memory regions are to be shared with child process... Now jemalloc-5.2.1 branch looks much more promising, tests on debug version mostly go through, just release version is not working properly yet for some reason.
That really explains a lot! Good job @tporadowski
Did you see any benefit of using threads instead of the pseudo-fork?
The main concern would still be the same, which is the copy-on-write shared memory between parent and child threads I believe.
Last time I ran the tests on jemalloc-5.2.1 branch they failed miserably and didn't make it till finish. Now, I've taken one more time to try them on a fresh new build - and... magic! Just one test failed:
*** [err]: Test replication partial resync: no backlog (diskless: yes, reconnect: 1) in tests/integration/replication-psync.tcl.
That looks both very promising and mysterious ;), will have to look into that in upcoming days.
Redis 5.0.9 for Windows has just been released! Thanks @israellot for all your efforts!
I'm glad to see the 5.x release !
Thanks for the last effort!
Most helpful comment
Answering your questions:
Win32_Interopproject). Since Redis is a server and operates solely in the background - natural expectation (also mine) on Windows is that it will be running... in the background :) and started automatically. I myself would like to have it installed as Windows Service and do not bother to set up anything else that would start it up.src/*.{c,h}last week.