We're using DPDK (version 20.08 on Ubuntu 20.04, C application) to receive UDP packets at a high throughput (>2 Mpps).
In contrast, since we only need to send a few configuration messages, we send those through the default network stack. This way, we can use lots of readily available tools to send configuration messages; however, since all the received data is consumed by DPDK, these tools never get any replies.
The most prominent issue arises with ARP negotiation: the host tries to resolve addresses and the clients respond properly, but these responses are all consumed by DPDK, so the host cannot resolve the addresses and refuses to send the actual UDP packets.
Our idea would be to filter out the high-throughput packets in our application and somehow "forward" everything else (e.g. ARP responses) to the default network stack. Does DPDK have a built-in solution for that? Unfortunately, I couldn't find anything in the examples.
I've recently heard about the packet function, which allows injecting packets into SOCK_DGRAM sockets and may be a possible solution. However, I couldn't find a sample implementation for our use case either. Any help is greatly appreciated.
CodePudding user response:
For this use case, the best option is to make use of the DPDK TAP PMD (which is part of DPDK on Linux). You can filter the specific packets in software or in hardware and then send them to the desired TAP interface.
A simple way to demonstrate this is to use the DPDK skeleton example.
- Build the DPDK example via
cd [root folder]/examples/skeleton; make static
- Pass the desired physical DPDK PMD NIC using DPDK EAL options (note the comma between the vdev name and its iface argument)
./build/basicfwd -l 1 -w [pcie id of DPDK NIC] --vdev=net_tap0,iface=dpdkTap
- In a second terminal, execute
ifconfig dpdkTap 0.0.0.0 promisc up
- Use tcpdump to capture ingress and egress packets using
tcpdump -eni dpdkTap -Q in
and
tcpdump -eni dpdkTap -Q out
respectively.
Note: you can configure an IP address and set up TC on dpdkTap. You can also run your custom socket programs on it. You do not need to invest time in TLDP, ANS, or VPP: as per your requirement, you just need a mechanism to inject packets into the kernel network stack and receive packets from it.
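The software filtering mentioned above could look roughly like the sketch below. It is plain C with no DPDK dependency, so it only shows the classification step. The names classify_frame, TO_APP and TO_KERNEL and the data_port parameter (the UDP destination port of the high-rate stream) are assumptions made for illustration; VLAN-tagged frames and IPv6 are not handled and simply fall through to the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Where a received frame should be delivered (hypothetical names). */
enum verdict { TO_APP, TO_KERNEL };

#define FRAME_ETH_HDR_LEN   14
#define FRAME_ETHERTYPE_IP4 0x0800
#define FRAME_IPPROTO_UDP   17

/* Read a 16-bit big-endian (network order) field. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Classify one untagged Ethernet frame.
 * Only unfragmented IPv4/UDP packets addressed to data_port are kept
 * by the application; ARP and everything else goes to the kernel.
 */
enum verdict classify_frame(const uint8_t *frame, size_t len, uint16_t data_port)
{
    assert(frame != NULL);

    /* Need at least an Ethernet header and a minimal IPv4 header. */
    if (len < FRAME_ETH_HDR_LEN + 20)
        return TO_KERNEL;
    /* EtherType is at offset 12; ARP (0x0806) is rejected here. */
    if (rd16(frame + 12) != FRAME_ETHERTYPE_IP4)
        return TO_KERNEL;

    const uint8_t *ip = frame + FRAME_ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)
        return TO_KERNEL;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;   /* IPv4 header length in bytes */
    if (ihl < 20 || len < FRAME_ETH_HDR_LEN + ihl + 8)
        return TO_KERNEL;
    if (ip[9] != FRAME_IPPROTO_UDP)
        return TO_KERNEL;
    /* Non-first fragments carry no UDP header. */
    if (rd16(ip + 6) & 0x1fff)
        return TO_KERNEL;

    const uint8_t *udp = ip + ihl;
    return rd16(udp + 2) == data_port ? TO_APP : TO_KERNEL;
}
```

In a DPDK receive loop, this check would presumably be applied to each mbuf's data (rte_pktmbuf_mtod() and rte_pktmbuf_data_len()), and mbufs classified as TO_KERNEL would be sent to the TAP port with rte_eth_tx_burst(). Packets that the kernel transmits on dpdkTap would be received from the TAP port and sent out on the physical port in the same way.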
CodePudding user response:
Theoretically, if the NIC in question supports the embedded switch feature, it should be possible to make use of a virtual function (VF) created in addition to the physical function (PF).
- The PF is bound to a DPDK userspace driver: it is used by the application (UDP traffic consumption).
- The VF is bound to a Linux kernel driver (regular network stack): it is used to send and receive configuration messages. This way, there is a regular Linux network interface bound to the VF.
The DPDK application should pass the -w [pci:dbdf],representor=0 argument to EAL in order to instantiate a representor for the VF, a separate ethdev used to control traffic flows of the VF. By default, all packets sent via the VF's regular network interface go to that representor ethdev, that is, to the application. And vice versa: the VF's network interface won't see any packets except those injected into it by the DPDK application via the VF's representor.
In order to implement the desired filtering / redirection of configuration messages, the application should insert a couple of rules.
The 1st rule's properties are as follows:
- The match pattern includes item REPRESENTED_PORT with ethdev ID 0. This way, the application tells the NIC to match packets coming from the physical network port represented by the PF ethdev.
- The match pattern includes network items to look for configuration messages. For example, for ARP, one can use item ETH matching on EtherType 0x0806.
- The rule's action is REPRESENTED_PORT with ethdev ID 1. This way, matching packets (for example, ARP packets) are redirected to the port represented by the VF representor ethdev, that is, to the VF itself.
The 2nd rule's properties are inverse:
- Item REPRESENTED_PORT with ethdev ID 1
- Network items (for example, ETH with EtherType 0x0806)
- Action REPRESENTED_PORT with ethdev ID 0
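To make the effect of the two rules easier to follow, here is a toy model of the embedded switch in plain C. It does not use the rte_flow API; the enum, the rule table and switch_forward are hypothetical names invented for this sketch. It only encodes what is described above: ARP is exchanged between the physical port and the VF, and anything that matches no rule follows the default path to the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Endpoints of the toy embedded switch (hypothetical names). */
enum endpoint {
    EP_PHYS_PORT,    /* physical port, the entity represented by ethdev 0 (PF) */
    EP_VF,           /* the VF, the entity represented by ethdev 1 (representor) */
    EP_APP_ETHDEV0,  /* default path: delivered to the application via the PF ethdev */
    EP_APP_ETHDEV1   /* default path: delivered to the application via the VF representor */
};

struct switch_rule {
    enum endpoint from;   /* REPRESENTED_PORT item: where the packet comes from */
    uint16_t ether_type;  /* ETH item: EtherType to match */
    enum endpoint to;     /* REPRESENTED_PORT action: where the packet is sent */
};

#define ETHER_TYPE_ARP_MODEL 0x0806

/* The two rules from the text, for the ARP example. */
static const struct switch_rule switch_rules[] = {
    { EP_PHYS_PORT, ETHER_TYPE_ARP_MODEL, EP_VF },        /* 1st rule */
    { EP_VF,        ETHER_TYPE_ARP_MODEL, EP_PHYS_PORT }, /* 2nd rule */
};

/* Decide where a packet entering the switch from `from` ends up. */
enum endpoint switch_forward(enum endpoint from, uint16_t ether_type)
{
    assert(from == EP_PHYS_PORT || from == EP_VF);

    for (size_t i = 0; i < sizeof(switch_rules) / sizeof(switch_rules[0]); i++) {
        if (switch_rules[i].from == from &&
            switch_rules[i].ether_type == ether_type)
            return switch_rules[i].to;
    }
    /* No rule matched: traffic goes to the application, as described above. */
    return from == EP_PHYS_PORT ? EP_APP_ETHDEV0 : EP_APP_ETHDEV1;
}
```

With only the ARP rules in place, IPv4 packets sent by the kernel through the VF would still end up at the representor in this model, so the configuration messages themselves would presumably need additional rules with matching network items.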
In short, this makes it possible to "split" data plane traffic (high-rate UDP packets) and control messages. The former go to the application (via the PF ethdev), as usual. The latter go to the VF, which, in turn, is bound to the Linux kernel driver and has its regular network interface. The traffic is separated in the embedded switch, deep in the hardware, and the rules for that are inserted by the DPDK application.