I have reworked the XDMA procedure #338

@Prandr

The mainline Xilinx DMA driver creates descriptors strictly on page boundaries, which
limits each descriptor to only 4096 bytes. This makes very inefficient use of them.

I have reworked the DMA procedure from the ground up. It merges address ranges that are
contiguous in both the physical and the DMA (bus) address space into as few descriptors as
possible. This reduces transfer overhead, as larger transfers no longer have to be split
and interrupted to be iteratively submitted to the IP core. It is particularly optimised
for modern CPUs with an IOMMU, which may be able to map the whole data buffer into a
contiguous DMA address range.
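
The merging idea can be sketched roughly as follows. This is only an illustration, not
code from the reworked driver: `struct seg`, `coalesce_segments`, and the `max_len`
parameter (standing in for whatever per-descriptor byte limit applies) are hypothetical
names. In a real driver the input would come from the mapped scatter-gather list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for illustration: one DMA-mapped address range. */
struct seg {
    uint64_t addr; /* DMA (bus) address of the range */
    uint64_t len;  /* length of the range in bytes */
};

/*
 * Merge adjacent ranges of `in` into as few entries of `out` as possible.
 * No output entry exceeds `max_len` bytes (assumed per-descriptor limit).
 * Returns the number of entries written, or SIZE_MAX if `out` is too small.
 */
size_t coalesce_segments(const struct seg *in, size_t n_in,
                         struct seg *out, size_t out_cap, uint64_t max_len)
{
    size_t n_out = 0;

    assert(in != NULL && out != NULL && max_len > 0);

    for (size_t i = 0; i < n_in; i++) {
        uint64_t addr = in[i].addr;
        uint64_t len = in[i].len;

        /* Extend the previous entry if this range starts where it ends. */
        if (n_out > 0 && out[n_out - 1].addr + out[n_out - 1].len == addr) {
            uint64_t room = max_len - out[n_out - 1].len;
            uint64_t take = len < room ? len : room;

            out[n_out - 1].len += take;
            addr += take;
            len -= take;
        }

        /* Whatever is left starts one or more new entries. */
        while (len > 0) {
            uint64_t take = len < max_len ? len : max_len;

            if (n_out == out_cap)
                return SIZE_MAX;
            out[n_out].addr = addr;
            out[n_out].len = take;
            n_out++;
            addr += take;
            len -= take;
        }
    }
    return n_out;
}
```

With page-sized inputs, three consecutive pages followed by one distant page collapse
into two entries instead of four. When an IOMMU maps the whole buffer contiguously, the
same loop yields a single entry, up to the `max_len` limit.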

Furthermore, transfer handling has been drastically slimmed down to reduce latency as far
as possible. It aims to implement Figures 5-7 from PG195 in the most efficient manner.
Among the improvements:

  • The length of the adjacent descriptor blocks is adapted to the Max Read Request
    Size (MRRS), as PG195 (p. 24) requires.
  • The maximum number of descriptors per transfer could be adapted to the FIFO size and
    the number of channels (per p. 26 in PG195). However, due to a bug in the IP core
    (see Known Issues), a transfer is limited to a single adjacent block. That is still
    much larger than in the old driver: typically just under 4 GB, or twice that if
    the MRRS is 1024 bytes.
  • The memory for engines is allocated dynamically, which saves a little bit of
    kernel memory.

To achieve these goals, the driver has the following limitations, and because of them it
won't be submitted as a PR here:

  • Transfer queues and support for asynchronous I/O have been removed. Allegedly it is
    broken, and almost no one uses it anyway.
  • Backward compatibility extends only as far back as Linux kernel 4.12.
  • Each engine may be opened and operated by only a single thread.
    This makes it possible to eliminate most locking.

This reworked version includes my PRs (in stable form) as well as relevant PRs by others
(like alonbl's patch set) that don't concern the XDMA procedure (since it was thrown out
in full anyway).

Other features:

  • Reworked poll mode as a compile-time option. Everything is done in the same thread,
    so there are no core migrations or context switches.
  • Reworked ioctl functions. Added the ability to submit a transfer request over ioctl.
    This is primarily intended to circumvent the 2 GB limit for read and write operations
    in Linux.
  • Reworked bypass BAR. It is now properly implemented, so it is possible to transmit
    data over it. It could be useful for small transfers that require low and stable
    latency.
  • Descriptor bypass is NOT supported
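
For context on the 2 GB limit mentioned above: on Linux a single read() or write() call
transfers at most 0x7ffff000 bytes (see read(2)), so a larger transfer made through the
plain file interface has to be split into several calls. The sketch below shows such a
splitting loop. `read_all` and `LINUX_RW_MAX` are hypothetical names used for
illustration, and the ioctl interface of the reworked driver is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux caps a single read()/write() call at this many bytes (see read(2)). */
#define LINUX_RW_MAX ((size_t)0x7ffff000)

/*
 * Read up to `count` bytes from `fd` into `buf`, issuing as many read()
 * calls as needed. Returns the number of bytes read (less than `count`
 * only at end of file), or -1 on error with errno set.
 */
ssize_t read_all(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL || count == 0);

    while (done < count) {
        size_t want = count - done;
        ssize_t rc;

        if (want > LINUX_RW_MAX)
            want = LINUX_RW_MAX; /* stay below the per-call limit */

        rc = read(fd, p + done, want);
        if (rc < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, retry */
            return -1;
        }
        if (rc == 0)
            break; /* end of file */
        done += (size_t)rc;
    }
    return (ssize_t)done;
}
```

Each iteration of such a loop is a separate request to the driver, which is the overhead
that submitting the whole transfer through a single ioctl is meant to avoid.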

You can find the reworked driver in my repo on the "reworked_xdma_main" branch:
https://github.com/Prandr/dma_ip_drivers
Further discussion is best conducted there.
