Hi,
I am getting MSR Allocation errors on Intel Clear Linux

This warning should already be fixed (#1912). I suspect you load the msr module before the miner starts, without the allow_writes=on parameter. Anyway, it's just a warning, not an error.
Thank you.
Hi XMRIG,
I understand the MSR errors are just warnings now, but are you saying to pass --allow-writes=on on the xmrig command line?
Also, Clear Linux is having trouble allocating 1GB huge pages, as seen in the photo "failed to allocate randomx dataset using 1GB pages."
How can I fix this?
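For context on the allow_writes=on parameter: the earlier reply describes it as a parameter of the msr kernel module, so it is set when the module is loaded (or through the module's parameter file once it is loaded). A minimal sketch, assuming a kernel whose msr module has this parameter; the helper name `msr_enable_command` is hypothetical, and it only prints the command rather than running it, since the real command needs root.

```shell
#!/bin/sh
# Hypothetical helper: print the command that would enable MSR writes.
# $1 = path to the module parameter file (defaults to the usual sysfs location).
msr_enable_command() {
    param_file="${1:-/sys/module/msr/parameters/allow_writes}"
    if [ -e "$param_file" ]; then
        # Module already loaded: flip the parameter in place.
        echo "echo on > $param_file"
    else
        # Module not loaded yet: pass the parameter at load time.
        echo "modprobe msr allow_writes=on"
    fi
}

# Show what would be run; execute the printed command as root to apply it.
msr_enable_command
```

This does not address the 1GB huge pages question, which is a separate issue.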
Anyway it's just a warning not error.
For now.
I'd strongly suggest you guys stop poking at the MSRs directly from userspace but use a proper file in sysfs. Would you be willing to test a patch if we expose the hw prefetcher controls somewhere in /sys/devices/system/cpu/cpuN/... and you convert your tool to write to that file instead of using the wrmsr tool?
Thx.
@bp3tk0v Which file exactly? Please also note that there is no single value for prefetcher controls, even for CPUs from the same vendor: https://xmrig.com/docs/miner/randomx-optimization-guide/msr Also, to use masks (the advanced format), read/write access is required.
Thank you.
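To illustrate why masks need read access as well as write access: a masked update is a read-modify-write, where the current register value must be known so that bits outside the mask are preserved. A minimal sketch of the arithmetic only, assuming the usual `(current & ~mask) | (value & mask)` rule; the function name `apply_msr_mask` is hypothetical and nothing here touches a real MSR.

```shell
#!/bin/sh
# Hypothetical helper: compute the value to write back for a masked update.
# $1 = current register value (must be read first)
# $2 = desired value
# $3 = mask selecting which bits may change
apply_msr_mask() {
    # Keep bits outside the mask from the current value,
    # take bits inside the mask from the desired value.
    printf '0x%x\n' $(( ($1 & ~$3) | ($2 & $3) ))
}

# Clear the low four bits of 0xff and leave the rest alone.
apply_msr_mask 0xff 0x00 0x0f
```

Without the read step, the bits outside the mask would have to be guessed, which is what the mask is meant to avoid.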
@bp3tk0v Which file exactly? Please also note that there is no single value for prefetcher controls, even for CPUs from the same vendor: https://xmrig.com/docs/miner/randomx-optimization-guide/msr Also, to use masks (the advanced format), read/write access is required.
Well, that is the good thing about putting it in the kernel - one is free to design the abstraction as optimally as possible. Also, from reading that page a bit, you won't need to fiddle with secure boot either, once you have a sysfs interface.
And you can hide all that complexity and vendor-specific bit settings in the kernel too.
So, for example, if you only want to disable prefetchers, I think it could be something as simple as
echo "disable" > /sys/devices/system/cpu/config/prefetchers
or so. This way you can disable the prefetchers globally. The exact name and place will be the result of the usual LKML bikeshedding. :-)
The control can be also finer-grained - per CPU, disable single prefetchers only, etc, etc. It all is dictated by the requirements people would have.
So once you know what your requirements are - i.e., how you want to disable the prefetchers from your tool - just send a note to x86-at-kernel.org (replace "-at-" with @) where we can continue designing it and test patches.
Thx.
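A sketch of what the tool side of this proposal could look like. The sysfs path is the placeholder from the comment above, not an existing kernel interface, and both the path and the "disable" keyword would be decided on the mailing list; the function name `disable_prefetchers` is likewise hypothetical.

```shell
#!/bin/sh
# HYPOTHETICAL path: taken from the proposal in this thread, not from any kernel.
PREFETCHER_CTL="${PREFETCHER_CTL:-/sys/devices/system/cpu/config/prefetchers}"

# Write "disable" to the control file if it exists and is writable.
# Returns non-zero so the caller can fall back to another method or give up.
disable_prefetchers() {
    ctl="${1:-$PREFETCHER_CTL}"
    if [ -w "$ctl" ]; then
        echo "disable" > "$ctl"
    else
        echo "no prefetcher control at $ctl" >&2
        return 1
    fi
}
```

A caller would run `disable_prefetchers || echo "prefetcher control unavailable"` and decide what to do when the file is missing, for example on an older kernel.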
@bp3tk0v It would be nice if you figured it out and XMRig would happily use it, but the problem is that MSR registers that control this are not documented at all. Current patch is the result of many trials and errors and is essentially a black box. I have only a vague idea what each of the 4 registers controls and what some bits in them do. They control more than just prefetchers. You can't expose /sys/devices/system/cpu/config/prefetchers without getting some heavy NDA'd documents from AMD first.
@SChernykh that all depends on the use case, of course. I saw yesterday while browsing through those issue pages here that Intel have actually documented the prefetchers:
the gist being that for some workloads disabling the prefetchers might make sense. So I don't see why AMD won't follow there if it makes sense to do so. Like they've done in the past for other things.
What people should not do is poke at random MSRs. This is always a bad idea and we're tainting the kernel if userspace has done so so that people looking at bug reports can know that the CPU configuration has been changed in an unexpected way.
Now, do you have a writeup somewhere and perhaps even performance data to show that disabling the prefetchers for your workload makes sense?
Hard proof data always helps when talking to hw people. :)
Thx.
So I don't see why AMD won't follow there if it makes sense to do so. Like they've done in the past for other things.
They only share it with motherboard manufacturers. It's not public information. Why? Ask them.
Now, do you have a writeup somewhere and perhaps even performance data to show that disabling the prefetchers for your workload makes sense?
Silly question. No, we're tinkering with MSRs just for fun, you know. All necessary links were already quoted here.
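* Disabling raw access to MSR registers will make research or fine-tuning impossible. Do you really suggest using Windows for it in the future?
* Users do not update the kernel every time, so for example some LTS users will wait for changes in prefetchers for years until the next release.
* Even for Intel, where there is only one MSR register, it is not possible to implement a simple `disabled` value, because it is much more complicated than a single value.
@bp3tk0v I understand your position and motivation and will be happy to reach a consensus, but right now it looks like disallowing knives because people can do weird things with knives. It is not possible to provide a safe interface for every possible case.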
They only share it with motherboard manufacturers. It's not public information. Why? Ask them.
There are multiple reasons why. And sometimes, if it makes sense, they - and by they I mean all vendors - document stuff publicly.
Silly question. No, we're tinkering with MSRs just for fun, you know. All necessary links were already quoted here.
I was being serious but your call. If you wanna solve this properly, get back to me with performance proof that shows that disabling the hw prefetchers shows a difference which is not in the noise.
* Disabling raw access to MSR registers will make research or fine-tuning impossible. Do you really suggest using Windows for it in the future?
I cannot talk about the future. Right now MSR writes are not disabled - there's an innocent warning that gets issued into dmesg and the kernel gets tainted. What that means is that bug reports about such kernels might not be taken seriously. Just like bug reports when using proprietary software. IOW, users are on their own.
* Users do not update the kernel every time, so for example some LTS users will wait for changes in prefetchers for years until the next release.
LTS kernels get stable backports. If desired, that functionality can be backported there pretty easily.
* Even for Intel, where there is only one MSR register, it is not possible to implement a simple `disabled` value, because it is much more complicated than a single value.
It all depends on how/what you want to achieve. And it doesn't matter how many MSRs there are - you can write them all in one go.
@bp3tk0v I understand your position and motivation and will be happy to reach a consensus, but right now it looks like disallowing knives because people can do weird things with knives. It is not possible to provide a safe interface for every possible case.
No, we're not doing that. We're asking people to help us define a proper interface and move their tools to using it instead of poking at random MSRs. And I've converted our in-tree tools already so it is very easily doable actually.
I hope that makes sense.
Thx.
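To see whether a kernel has been tainted in the way described above, one can inspect the taint mask in /proc/sys/kernel/tainted. A minimal sketch, under the assumption that userspace MSR writes set the "out of specification" taint flag and that this flag is bit 2 (value 4) of the mask; check the kernel's tainted-kernels documentation for your version before relying on it. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the out-of-spec taint bit is set.
# ASSUMPTION: the relevant flag is bit 2 (value 4) of the taint mask.
# $1 = taint mask; defaults to the running kernel's value.
is_out_of_spec_tainted() {
    taint="${1:-$(cat /proc/sys/kernel/tainted)}"
    [ $(( taint & 4 )) -ne 0 ]
}

# Usage on a live system:
#   is_out_of_spec_tainted && echo "kernel is tainted (out-of-spec flag set)"
```

The flag is one bit among several, so a non-zero mask on its own does not mean MSR writes were the cause.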
I was being serious but your call. If you wanna solve this properly, get back to me with performance proof that shows that disabling the hw prefetchers shows a difference which is not in the noise.
I already linked above to the page https://xmrig.com/docs/miner/randomx-optimization-guide/msr which contains all the information: which registers are changed and with what values. It also contains links to early reddit discussions. You can also measure performance differences by yourself.
get back to me with performance proof that shows that disabling the hw prefetchers shows a difference which is not in the noise.
I think you are not being serious. It's been common knowledge for over a year that turning off prefetchers (using the MSR mod) increases hashrate significantly. As I said, links are in this topic; take your time and follow the reddit discussions from https://xmrig.com/docs/miner/randomx-optimization-guide/msr
What we need is the access to all MSR registers listed in https://github.com/xmrig/xmrig/blob/dev/scripts/randomx_boost.sh , all bits in them. If it doesn't work in some future Linux release, we'll find a workaround anyway.
You can also measure performance differences by yourself.
What would be a typical workload to run?
Guessing wildly here but, probably to benchmark things you'd run one of the provided benchmark scripts, with and then without the MSR "random writes" to very specific places with very specific values.
Then copy the working proven "random writes" into your kernel support for when someone writes "disable" to whatever sysfs location. I don't see how this makes the writes any less "random".
And then maintain knowledge of which CPUs like which MSRs so that an Atom without them doesn't complain, AMD MSRs are purely guesses, etc. Good luck.
You can also measure performance differences by yourself.
What would be a typical workload to run?
./xmrig --bench=1M without sudo and not under root so it can't apply MSRs. Then the same with sudo or under root. And you need to enable huge pages first to exclude that factor (preallocate 1280 huge pages).
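The comparison described above can be laid out as follows. The two benchmark commands and the huge page count come from the comment; the `sysctl` line is one common way to preallocate huge pages, and the helper `hashrate_gain_pct` is a hypothetical convenience for comparing the two hashrates you read off the benchmark output by hand.

```shell
#!/bin/sh
# Step 0 (needs root): preallocate huge pages so they are not a variable.
#   sudo sysctl -w vm.nr_hugepages=1280
# Step 1: baseline, as a normal user, so the MSR mod cannot be applied.
#   ./xmrig --bench=1M
# Step 2: same benchmark with root, so the MSR mod is applied.
#   sudo ./xmrig --bench=1M

# Hypothetical helper: relative gain in percent.
# $1 = baseline hashrate, $2 = hashrate with the MSR mod.
hashrate_gain_pct() {
    awk -v base="$1" -v tuned="$2" \
        'BEGIN { printf "%.1f\n", (tuned - base) / base * 100 }'
}
```

Call it as `hashrate_gain_pct <baseline> <with_msr_mod>`. Repeating each run a few times helps show whether the difference is outside the noise.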