Getting USB device descriptor errors on boot with an arm64 kernel built from the rpi-4.19.y branch.
All USB ports are unusable and no devices are detected.
Below is the dmesg log.
[ 0.206616] usbcore: registered new interface driver usbfs
[ 0.206663] usbcore: registered new interface driver hub
[ 0.206761] usbcore: registered new device driver usb
[ 0.451596] usbcore: registered new interface driver r8152
[ 0.451652] usbcore: registered new interface driver lan78xx
[ 0.451705] usbcore: registered new interface driver asix
[ 0.451752] usbcore: registered new interface driver smsc95xx
[ 0.455294] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[ 0.455307] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.455318] usb usb1: Product: xHCI Host Controller
[ 0.455328] usb usb1: Manufacturer: Linux 4.19.58-v8+ xhci-hcd
[ 0.455338] usb usb1: SerialNumber: 0000:01:00.0
[ 0.456670] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 4.19
[ 0.456682] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.456692] usb usb2: Product: xHCI Host Controller
[ 0.456702] usb usb2: Manufacturer: Linux 4.19.58-v8+ xhci-hcd
[ 0.456712] usb usb2: SerialNumber: 0000:01:00.0
[ 0.483608] dwc_otg fe980000.usb: base=(____ptrval____)
[ 0.790156] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[ 0.889508] dwc_otg fe980000.usb: DWC OTG Controller
[ 0.889533] dwc_otg fe980000.usb: new USB bus registered, assigned bus number 3
[ 0.889574] dwc_otg fe980000.usb: irq 21, io mem 0x00000000
[ 0.889885] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[ 0.889897] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.889908] usb usb3: Product: DWC OTG Controller
[ 0.889918] usb usb3: Manufacturer: Linux 4.19.58-v8+ dwc_otg_hcd
[ 0.889928] usb usb3: SerialNumber: fe980000.usb
[ 0.891408] usbcore: registered new interface driver usb-storage
[ 0.897443] usbcore: registered new interface driver usbhid
[ 0.897451] usbhid: USB HID core driver
[ 0.918449] usb 1-1: device descriptor read/64, error 18
[ 1.154449] usb 1-1: device descriptor read/64, error 18
[ 1.390200] usb 1-1: new high-speed USB device number 3 using xhci_hcd
[ 1.518444] usb 1-1: device descriptor read/64, error 18
[ 1.754502] usb 1-1: device descriptor read/64, error 18
[ 1.862462] usb usb1-port1: attempt power cycle
[ 2.518209] usb 1-1: new high-speed USB device number 4 using xhci_hcd
[ 2.538978] usb 1-1: device descriptor read/8, error -61
[ 2.666644] usb 1-1: device descriptor read/8, error -61
[ 2.904871] usb 1-1: new high-speed USB device number 5 using xhci_hcd
[ 2.922966] usb 1-1: device descriptor read/8, error -61
[ 3.051140] usb 1-1: device descriptor read/8, error -61
[ 3.158510] usb usb1-port1: unable to enumerate USB device
[ 3.383972] usbcore: registered new interface driver brcmfmac
It's caused by https://github.com/raspberrypi/linux/commit/d5dc848c982dff2e020f294e384447efe6ea6617.
It should be reverted.
@Strit that commit is a step forward - I see that it does not fix everything, but at least some devices work with it. You still have the old option of limiting RAM, and things will get back on track.
@pelwell do you have any leads on this? I'm directly interested in making this work.
It sounds like the DMA bounce buffer support needed by the PCIe block is either not hooked up or not working.
How does that come into play given that we now limit DMA to 1G?
My understanding (and this whole area is a bit murky and complicated) is that the DMA zone limits where memory for dma_*_alloc is placed.
PCIe is a bus master, so it is effectively also a DMA controller, but it can only get to the bottom 3GB of RAM. If there is data in the top 1GB to be read from or written to USB then something has to arrange to copy to or from an intermediate buffer before or after the transfer. This is a common requirement, and yet bounce buffers don't seem to have a common implementation - individual subsystems seem to do it differently.
That would explain it indeed. I found the implementation of the bounce buffers for the PCIe controller and saw that the threshold is set to 3G. This basically means that we need to set this to 1G as well, right? At least that is how it sounds to me. But I'm confused about why this works on 32-bit, since there we similarly only limit the zone and the driver's bounce threshold defaults to the same 3G. There is something I am still missing in this story.
[ Again - this is just what I think is going on, I can't guarantee it is correct. ]
Think about what happens when an application needs to send some data to USB: a user-space pointer will get passed to a system call. At this point the driver or framework has a choice - whether to copy the data into an intermediate buffer for transmission or whether to attempt to use it in-place. The DMA Zone helps with the first case because it can guarantee that the intermediate buffer is in a DMA-safe location. It doesn't help (directly) with the second case because those user pages could be anywhere in RAM, so the PCIe driver hooks into the DMA methods for the USB device and (transparently to the user) performs a copy at the point the buffer is mapped for DMA, returning the address of the bounce buffer rather than the user pages.
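To make the second case concrete, here is a minimal Python sketch of the decision made when a buffer is mapped for DMA. It is a toy model, not the pcie-brcmstb code: the class name `BounceMapper`, its methods, the pool base address, and the sparse-dictionary "RAM" are all hypothetical, and the 3GB limit is simply the figure quoted in this thread.

```python
GIB = 1 << 30
PCIE_LIMIT = 3 * GIB  # assumption: PCIe reaches only the bottom 3 GiB, per the thread


class BounceMapper:
    """Toy model of a DMA mapping hook that bounces unreachable buffers."""

    def __init__(self, limit=PCIE_LIMIT, pool_base=0x1000_0000):
        self.limit = limit          # first address the device cannot reach
        self.next_free = pool_base  # hypothetical bounce pool in low memory
        self.ram = {}               # sparse fake RAM: start address -> bytes
        self.active = {}            # bounce address -> original address

    def map_single(self, addr, to_device=True):
        """Return the address the device should use for the buffer at addr."""
        data = self.ram[addr]
        if addr + len(data) <= self.limit:
            return addr             # fully reachable: use the pages in place
        bounce = self.next_free     # bump allocation; space is never reused here
        self.next_free += len(data)
        # Copy the payload now if the device will read it, else reserve zeroed space.
        self.ram[bounce] = data if to_device else bytes(len(data))
        self.active[bounce] = addr
        return bounce

    def unmap_single(self, dma_addr, from_device=False):
        """Copy device-written data back to the original buffer, then forget the mapping."""
        orig = self.active.pop(dma_addr, None)
        if orig is not None:
            data = self.ram.pop(dma_addr)
            if from_device:
                self.ram[orig] = data


mapper = BounceMapper()
mapper.ram[3 * GIB + 4096] = b"payload in the top 1GB"
dma_addr = mapper.map_single(3 * GIB + 4096)
print(hex(dma_addr))  # an address inside the low bounce pool, not the original
```

The caller only ever sees the returned address, which is why the copy is transparent to user space and to the USB driver.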
That was my understanding as well. Thanks for clarifying @pelwell . So in this case we need to make sure the threshold is indeed 1G. We need to allocate bounce buffers for everything that goes beyond this limit. The first 1G is marked as DMA safe-zone so everything else needs to go through bounce buffers.
That's still not correct. For PCIe (which means for XHCI USB on the Pi 4), any memory within the first _3GB_ would be safe to use as a bounce buffer for anything in the last 1GB on a 4GB Pi4. PCIe _is_ a DMA controller, it doesn't need to use the system DMA controller, so the 1GB limit does not apply. However there seems to be only one global DMA_ZONE configuration, so it must be set to the largest value safe for _all_ DMA controllers.
But that is exactly what we do. The DMA_ZONE configuration applies to all DMA controllers (at least it should) and we set it to the first 1GB. The system controller can address that and also the PCIe should work with that limit.
What I am saying is that a 1GB limit is unnecessarily pessimistic for PCIe as a bounce buffer threshold. Yes it's fine to allocate bounce buffers from memory below 1GB for PCIe as well, but don't require the use of bounce buffers for USB/PCIe accesses to RAM above 1GB - the threshold should be 3GB, or performance will suffer.
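The difference between the two limits can be put in rough numbers. The sketch below is only illustrative: the controller names and the helper `bounced_fraction` are made up for this example, and it assumes buffers are spread evenly across a 4GB board, which real workloads will not do.

```python
GIB = 1 << 30

# Per-controller reach, using the figures quoted in this thread.
REACH = {"system DMA controller": 1 * GIB, "PCIe": 3 * GIB}

# A single global DMA zone has to be safe for every controller.
dma_zone = min(REACH.values())


def bounced_fraction(total_ram, threshold):
    """Fraction of RAM that lies above the threshold and so needs a bounce copy."""
    return max(0, total_ram - threshold) / total_ram


print(f"global DMA zone: {dma_zone // GIB}GB")
for threshold_gib in (1, 3):
    frac = bounced_fraction(4 * GIB, threshold_gib * GIB)
    print(f"bounce threshold {threshold_gib}GB: {frac:.0%} of RAM needs bouncing")
```

Under that even-spread assumption, a 1GB threshold forces a copy for three quarters of RAM, while a 3GB threshold does so only for the top quarter.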
Got it now. It makes sense. Thanks for expanding it.
@pelwell I think I "found" the problem. The bounce buffers are only compiled when CONFIG_ARM is enabled. They don't compile on arm64.
I have been bashing my head against this for a while now and it is not as easy as I thought. The current implementation uses a lot of arm_dma* identifiers (among others), all of which are ARM-specific. Does anyone know this implementation in depth? arm64 defines a couple of dma_ops structures, but I'm not sure what the way forward is here as I'm not familiar with this part of the kernel.
@agherzan Good job! Thanks!
I will have another 2G memory more ;-)
Step by step.
@agherzan @pelwell This is V5 of the brcm PCIe controller patches. It mentions the ARM64 part, so maybe it helps?
https://www.spinics.net/lists/kernel/msg2909719.html
It's always good to see the approach others have taken, but I think that the multiple buses of BCM2711 and the different memory maps they see make the per-device custom DMA ops a better technique.
@pelwell Would it be a better approach to limit the ram by setting the gpu mem to 1024? Do you foresee any issues with that?
The GPU can only address a single GB of RAM. 2711 has a register that controls which GB of RAM it uses, but a) it's only ever been the first GB so changing that will break everything, and b) persuading Linux to limit DMA allocations to a GB that isn't the first sounds like a new version of the problems we are trying to solve, so I think we should just pretend for now that the mapping is fixed. Setting GPU mem to 1024 (if it worked, and I don't think it would) would reserve the first GB (call it GB0) for the VPU. You now have an ARM complex that can't use address 0x000xxxxx, where the kernel likes to live, and that only has 2GB of usable RAM (GB1 and GB2) - GB3 is still inaccessible due to the PCIe wrapper problem we're trying to work around. Does that sound like progress?
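The accounting above can be written out as a small Python sketch. It treats RAM as four 1GB blocks numbered GB0 to GB3, as in the comment; the function name `arm_usable_gb` and its parameters are hypothetical, and the model ignores the normal (much smaller) GPU carve-out.

```python
def arm_usable_gb(total_gb=4, vpu_reserved=(), pcie_unreachable=()):
    """List the 1GB blocks left for Linux once reserved and unreachable blocks are removed."""
    blocked = set(vpu_reserved) | set(pcie_unreachable)
    return [gb for gb in range(total_gb) if gb not in blocked]


# GPU mem set to 1024, as described above: GB0 goes to the VPU, GB3 is avoided for PCIe.
print(arm_usable_gb(vpu_reserved=[0], pcie_unreachable=[3]))  # [1, 2]

# RAM capped at 3GB: GB3 is never used, and GB0 stays available for the kernel.
print(arm_usable_gb(total_gb=3))  # [0, 1, 2]
```

In this simplified view, capping RAM keeps one more usable block than handing GB0 to the VPU, and that block is the one containing address 0.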
In summary, total_mem=3072 is the best workaround I know of.
I've created a trivial patch to add bounce buffers on ARM64 for pcie-brcmstb.
pcie-brcmstb-bounce64.patch.zip
I'm sure there is a much better way to do that, but this seems to be working for me.
root@raspberrypi:~# uname -a
Linux raspberrypi 4.19.64-v8-test1+ #31 SMP PREEMPT Sun Aug 11 23:47:03 BST 2019 aarch64 GNU/Linux
root@raspberrypi:~# free -h
              total        used        free      shared  buff/cache   available
Mem:          3.7Gi       151Mi       3.4Gi       8.0Mi       186Mi       3.5Gi
Swap:          99Mi          0B        99Mi
root@raspberrypi:~# lsusb
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 002: ID 152d:0562 JMicron Technology Corp. / JMicron USA Technology Corp.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 004: ID 045e:001d Microsoft Corp. Natural Keyboard Pro
Bus 001 Device 003: ID 0451:1446 Texas Instruments, Inc. TUSB2040/2070 Hub
Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
I tried this patch, it works for me!
All 4G memory is alive now, COOL !! :-)
Thanks @yaroslavros !
Please submit this patch as a Pull Request - complete with a Signed-off-by: line - to make it easier for people to review.
Here is a PR: https://github.com/raspberrypi/linux/pull/3144
How does one apply/install this .patch?
This is fixed in apt firmware.