Hello KAWAZOME Ichiro,
First of all, thank you for udmabuf; it is much easier to handle than the AMD/Xilinx DMA approaches.
I have a question about sync_for_cpu that I would like to clarify, as it will make debugging easier.
Right now I have a 256 MByte udmabuf buffer that is used as a ping-pong buffer: the PL IP core writes to the memory and generates an interrupt once the first half has been written. The Linux user-space app then streams that first 128 MB over TCP/IP. While it is sending, the PL IP writes to the second 128 MB and generates another interrupt at the end. This should repeat continuously, so the Linux network stack needs to be faster than the PL IP.
When I use the O_SYNC flag, reading out the buffer takes too long (the network stack becomes slower than the PL IP). Without O_SYNC the performance would be OK, but I see what I believe are cache issues. I had hoped that, because the buffer is so large, the CPU cache controller would always have to fetch the data from memory.
So when using this code sequence before each TCP send call:
    int fd;
    char attr[1024];                     /* sprintf() expects char *, not unsigned char * */
    unsigned long sync_offset = 0;
    unsigned long sync_size = 0x10000;   /* 64 KiB */
    unsigned int sync_direction = 0;
    unsigned long sync_for_cpu = 1;
    if ((fd = open("/sys/class/u-dma-buf/udmabuf0/sync_for_cpu", O_WRONLY)) != -1) {
        /* %08X takes an unsigned int, so cast the unsigned long operands */
        sprintf(attr, "0x%08X%08X", (unsigned int)(sync_offset & 0xFFFFFFFF), (unsigned int)((sync_size & 0xFFFFFFF0) | (sync_direction << 2) | sync_for_cpu));
        write(fd, attr, strlen(attr));
        close(fd);
    }
Is the whole buffer invalidated by this? Does it need to be called once, or before each TCP send call?
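For reference, the sequence above passes sync_size = 0x10000, which is 64 KiB at offset 0. A minimal sketch of how the same command string could instead describe one 128 MB half of the buffer is shown here. The bit packing is taken from the sprintf call above; the names `HALF_SIZE`, `format_sync_attr` and `sync_half_for_cpu` are hypothetical, and the direction value 2 (which I understand to mean DMA_FROM_DEVICE in the udmabuf README) is an assumption, not something confirmed in this thread.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: each ping-pong half is 128 MiB of the 256 MiB buffer. */
#define HALF_SIZE 0x08000000UL

/* Build the hex command string using the same packing as the snippet above:
 * offset in the upper 32 bits, size in bits [31:4], direction in bits [3:2],
 * and the sync_for_cpu trigger in bit 0. Returns the snprintf() result. */
int format_sync_attr(char *buf, size_t len, unsigned long offset,
                     unsigned long size, unsigned int direction)
{
    unsigned int hi = (unsigned int)(offset & 0xFFFFFFFFUL);
    unsigned int lo = (unsigned int)(size & 0xFFFFFFF0UL)
                    | ((direction & 3u) << 2) | 1u;
    return snprintf(buf, len, "0x%08X%08X", hi, lo);
}

/* Hypothetical helper: request a sync for CPU access on half 0 or half 1.
 * Direction 2 is assumed to mean DMA_FROM_DEVICE (PL writes, CPU reads).
 * Returns 0 on success and -1 on any error. */
int sync_half_for_cpu(const char *path, int half)
{
    char attr[32];
    int n = format_sync_attr(attr, sizeof attr,
                             (unsigned long)half * HALF_SIZE, HALF_SIZE, 2);
    if (n < 0 || (size_t)n >= sizeof attr)
        return -1;

    int fd = open(path, O_WRONLY);
    if (fd == -1)
        return -1;

    ssize_t written = write(fd, attr, strlen(attr));
    close(fd);
    return written == (ssize_t)n ? 0 : -1;
}
```

Under these assumptions, the app would call `sync_half_for_cpu("/sys/class/u-dma-buf/udmabuf0/sync_for_cpu", 0)` after the half-full interrupt and with `1` after the end-of-buffer interrupt, once per half rather than once per send call. Whether a single sync per half is sufficient is exactly the question asked here.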
Is this ping-pong approach with udmabuf the most effective way to stream PL data over the network?
The data rate when reading from user space is 125 MByte/s for the network and, later, 300 MByte/s for writing the data to an SSD.
Thanks
Marco