Home
A while back I had the idea to turn a Raspberry Pi Zero into a $5
USB to HDMI/SDTV/DSI/DPI display adapter.
This series adds a USB host driver and a device/gadget driver to achieve
that.
Version 1 of the patchset based on the downstream Raspberry Pi repo is available here (graphics support for Pi4 is missing in mainline, patches are on the ML):
https://github.com/notro/linux/tree/gud_drm-1
What decides how useful all this is, is how smooth an experience it gives. My hope was that it should not be noticeably laggy with ordinary office use at 1920x1080@RG16 (RGB565). I'm pleased to see that it's also possible to watch YouTube movies, although not in fullscreen.
Some of the main factors that affect performance:
-
Display resolution
-
Userspace providing damage reports (FB_DAMAGE_CLIPS or DRM_IOCTL_MODE_DIRTYFB)
-
Color depth (DRM_CAP_DUMB_PREFERRED_DEPTH = 16 if RGB565)
-
How well the frames compress (lz4)
-
USB2 vs. USB3
-
Gadget device memory bandwidth, CPU power for decompression
-
(Big endian hosts will have to do byte swapping on the frames)
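To give a feel for how resolution, color depth and damage reports interact, here is a small back-of-the-envelope calculation of uncompressed update sizes. The 200x100 damage rectangle is a made-up example, and the helper name is only for illustration; it is not part of the drivers.

```python
def update_bytes(width, height, bytes_per_pixel):
    """Uncompressed size in bytes of a rectangular update."""
    return width * height * bytes_per_pixel

# Full 1920x1080 frame in RG16 (RGB565), 2 bytes per pixel.
full_rgb565 = update_bytes(1920, 1080, 2)
# The same frame in a 32-bit format, 4 bytes per pixel.
full_32bpp = update_bytes(1920, 1080, 4)
# Hypothetical 200x100 damage rectangle in RGB565.
small_damage = update_bytes(200, 100, 2)

print(full_rgb565, full_32bpp, small_damage)
```

Halving the color depth halves the data before compression, and a small damage rectangle is a tiny fraction of a full frame, which is why userspace that reports damage and honours the preferred depth feels so much smoother.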
I've tested these:
-
xorg-server on Pi4. This was nice and smooth since it uses DRM_IOCTL_MODE_DIRTYFB and honours DRM_CAP_DUMB_PREFERRED_DEPTH.
-
Ubuntu 20.04 GNOME on x86. This was usable, but not so good for movies. GNOME doesn't look at DRM_CAP_DUMB_PREFERRED_DEPTH and doesn't set FB_DAMAGE_CLIPS on the page flips.
I've made a short video to show what it looks like: https://youtu.be/AhGZWwUm8JU
One main factor for tearing is the size of the transfer buffer. The device announces how much it can receive (and decompress) in one transfer. The host tries to kmalloc a buffer of this size and keeps cutting it in half until the allocation succeeds. The maximum size for this buffer is 4MB by default (KMALLOC_MAX_SIZE). 1920x1080@RG16 fits in 4MB. If an update doesn't fit in this buffer uncompressed, the host splits it into multiple updates. When showing a movie this causes a constant tearing line where the split is.
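The halving and splitting described above can be sketched like this. It is a simplified model, not the driver code: `can_allocate` stands in for kmalloc succeeding, the 16MB device limit is a made-up value, and a real implementation would presumably split on line boundaries rather than raw byte counts.

```python
KMALLOC_MAX_SIZE = 4 * 1024 * 1024  # 4MB default mentioned above

def pick_buffer_size(device_max, can_allocate):
    """Start at the size the device announces and halve until allocation succeeds."""
    size = device_max
    while size > 0 and not can_allocate(size):
        size //= 2
    return size

def split_count(update_bytes, buffer_size):
    """Number of transfers needed for an uncompressed update (ceiling division)."""
    return -(-update_bytes // buffer_size)

# Model kmalloc as succeeding only up to KMALLOC_MAX_SIZE (an assumption).
can_allocate = lambda n: n <= KMALLOC_MAX_SIZE

# Hypothetical device that announces it can take 16MB per transfer.
buffer_size = pick_buffer_size(16 * 1024 * 1024, can_allocate)

parts_rgb565 = split_count(1920 * 1080 * 2, buffer_size)  # RG16 full frame
parts_32bpp = split_count(1920 * 1080 * 4, buffer_size)   # 32-bit full frame

print(buffer_size, parts_rgb565, parts_32bpp)
```

With these numbers the RG16 frame goes out in one transfer, while the 32-bit frame needs two, and the boundary between the two is where the tearing line would show up.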
Solutions:
-
Increase CONFIG_FORCE_MAX_ZONEORDER (KMALLOC_MAX_SIZE), but it has to be done on both the host and the device (distros probably don't want to increase this just for a USB driver). I tried using a scatter/gather buffer (usb_sg_init) in the host to work around KMALLOC_MAX_SIZE, but that gave me bounce buffers (swiotlb) on Pi4, along with "swiotlb buffer is full" errors.
-
Change the protocol and let the device know that a split frame is coming so it can decompress each part into a temporary buffer and then apply it in one go.
-
The lz4 library in the kernel only supports the block format, not the LZ4 Frame format. Maybe that could have helped, I don't know, I'm a noob here.
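The protocol-change idea from the list above could look roughly like the sketch below. Everything here is hypothetical: the class, the `last` flag and the offsets are invented for illustration, and zlib is used only as a stand-in because lz4 is not in Python's standard library.

```python
import zlib  # stand-in only; the drivers use lz4

class SplitFrameReceiver:
    """Hypothetical device-side handling of a frame that arrives in several parts."""

    def __init__(self, framebuffer):
        self.framebuffer = framebuffer  # bytearray standing in for the scanout buffer
        self.pending = []               # (offset, decompressed bytes) held until the last part

    def receive(self, offset, compressed, last):
        # Decompress into temporary storage instead of the live framebuffer.
        self.pending.append((offset, zlib.decompress(compressed)))
        if last:
            # Apply all parts back to back so the visible buffer changes in one go.
            for off, data in self.pending:
                self.framebuffer[off:off + len(data)] = data
            self.pending.clear()

fb = bytearray(8)
rx = SplitFrameReceiver(fb)
rx.receive(0, zlib.compress(b"\x01" * 4), last=False)
after_first_part = bytes(fb)   # still all zeros, nothing applied yet
rx.receive(4, zlib.compress(b"\x02" * 4), last=True)
after_last_part = bytes(fb)    # both parts applied together
```

The extra copy from the temporary buffer into the framebuffer costs memory bandwidth on the device, so whether this helps in practice would need measuring.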
Device side tearing
Received updates are decompressed/copied into the framebuffer that's being scanned out, so tearing is possible. Double buffering/page flipping could be used for full frames, but well-behaved userspace doesn't send many of those, so the impact would be small. Double buffering on all updates would require memcpy between buffers locally, hampering performance. Maybe a Pi4 could get away with it without losing too much performance, but a Pi Zero would certainly not (memory bandwidth).
I have hacked together something: STM32F769I-DISCO
One use case for these drivers is reusing old tablets and cell phones as USB displays.
The Pi4 has two HDMI ports and I was asked if the driver supports that. I've concluded that it would be too much work to implement at this point. It is possible to extend the protocol and implement it later.
I have used a Pi4 as the gadget device during development since it has much better memory bandwidth (4000 vs 200 MBps) and CPU than the Pi Zero. They both have the same gadget controller (dwc2).
I've tested this series with usbip by connecting two Pi4s over a wired gigabit network. It worked fine.
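These bandwidth figures give a rough idea of why double buffering with a local memcpy looks feasible on a Pi4 but not on a Pi Zero. The estimate below is naive: it assumes "MBps" means megabytes per second and treats the quoted figure as the achievable copy rate, so take the results as orders of magnitude only.

```python
def copy_time_ms(num_bytes, bandwidth_mbps):
    """Naive time in milliseconds to move num_bytes at bandwidth_mbps megabytes per second."""
    return num_bytes / (bandwidth_mbps * 1_000_000) * 1000

frame_bytes = 1920 * 1080 * 2  # full 1920x1080 frame in RG16

pi_zero_ms = copy_time_ms(frame_bytes, 200)   # Pi Zero figure from the text
pi4_ms = copy_time_ms(frame_bytes, 4000)      # Pi4 figure from the text

print(round(pi_zero_ms, 1), round(pi4_ms, 1))
```

By this estimate a full-frame copy takes about 20 ms on the Pi Zero against about 1 ms on the Pi4. A real copy both reads and writes memory and competes with decompression and scanout, so the actual cost is likely higher.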
Please help in keeping this wiki accurate and useful. Anyone with a GitHub account can contribute.