On 2024-05-14 14:20:03 [+0200], Jesper Dangaard Brouer wrote:
> Trick for CPU-map to do early drop on remote CPU:
> # ./xdp-bench redirect-cpu --cpu 3 --remote-action drop ixgbe1
> I recommend using Ctrl+\ while running to show more info like CPUs being
> used and what kthread consumes. To catch issues e.g. if you are CPU
> redirecting to same CPU as RX happen to run on.
Okay. So I reworked the last two patches to make the struct part of
task_struct and then did as you suggested:
Unpatched:
|Sending:
|Show adapter(s) (eno2np1) statistics (ONLY that changed!)
|Ethtool(eno2np1 ) stat: 952102520 ( 952,102,520) <= port.tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14876602 ( 14,876,602) <= port.tx_size_64 /sec
|Ethtool(eno2np1 ) stat: 14876602 ( 14,876,602) <= port.tx_unicast /sec
|Ethtool(eno2np1 ) stat: 446045897 ( 446,045,897) <= tx-0.bytes /sec
|Ethtool(eno2np1 ) stat: 7434098 ( 7,434,098) <= tx-0.packets /sec
|Ethtool(eno2np1 ) stat: 446556042 ( 446,556,042) <= tx-1.bytes /sec
|Ethtool(eno2np1 ) stat: 7442601 ( 7,442,601) <= tx-1.packets /sec
|Ethtool(eno2np1 ) stat: 892592523 ( 892,592,523) <= tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14876542 ( 14,876,542) <= tx_packets /sec
|Ethtool(eno2np1 ) stat: 2 ( 2) <= tx_restart /sec
|Ethtool(eno2np1 ) stat: 2 ( 2) <= tx_stopped /sec
|Ethtool(eno2np1 ) stat: 14876622 ( 14,876,622) <= tx_unicast /sec
|
|Receive:
|eth1->? 8,732,508 rx/s 0 err,drop/s
| receive total 8,732,508 pkt/s 0 drop/s 0 error/s
| cpu:10 8,732,508 pkt/s 0 drop/s 0 error/s
| enqueue to cpu 3 8,732,510 pkt/s 0 drop/s 7.00 bulk-avg
| cpu:10->3 8,732,510 pkt/s 0 drop/s 7.00 bulk-avg
| kthread total 8,732,506 pkt/s 0 drop/s 205,650 sched
| cpu:3 8,732,506 pkt/s 0 drop/s 205,650 sched
| xdp_stats 0 pass/s 8,732,506 drop/s 0 redir/s
| cpu:3 0 pass/s 8,732,506 drop/s 0 redir/s
| redirect_err 0 error/s
| xdp_exception 0 hit/s
I verified that the "drop only" case hits 14M packets/s while this
redirect part reports 8M packets/s.
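For reference, the 14M figure is what a 10G link can carry with 64-byte
frames. A minimal sketch of that arithmetic, assuming standard Ethernet
framing overhead (7 byte preamble, 1 byte SFD, 12 byte inter-frame gap); the
helper name is made up for illustration and the packet rates are copied from
the unpatched output above:

```python
# Theoretical packet rate of a 10 Gbit/s Ethernet link with minimum-size
# frames, compared against the rates reported in the unpatched run.
LINK_BPS = 10_000_000_000   # 10G link
FRAME_BYTES = 64            # minimum frame incl. FCS (port.tx_size_64)
OVERHEAD_BYTES = 20         # assumption: 7 preamble + 1 SFD + 12 IFG


def line_rate_pps(link_bps, frame_bytes, overhead_bytes=OVERHEAD_BYTES):
    """Packets per second that fit on the wire for a given frame size."""
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_per_frame


max_pps = line_rate_pps(LINK_BPS, FRAME_BYTES)
sent_pps = 14_876_542       # tx_packets /sec on the sender
redirect_pps = 8_732_508    # "receive total" with redirect-cpu + remote drop

print(f"line rate : {max_pps:,.0f} pkt/s")
print(f"sender    : {sent_pps / max_pps:.2%} of line rate")
print(f"redirect  : {redirect_pps / max_pps:.2%} of line rate")
```

So the sender is at line rate, and the redirect case handles a bit under 60%
of what arrives on the wire.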
Patched:
|Sending:
|Show adapter(s) (eno2np1) statistics (ONLY that changed!)
|Ethtool(eno2np1 ) stat: 952635404 ( 952,635,404) <= port.tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14884934 ( 14,884,934) <= port.tx_size_64 /sec
|Ethtool(eno2np1 ) stat: 14884928 ( 14,884,928) <= port.tx_unicast /sec
|Ethtool(eno2np1 ) stat: 446496117 ( 446,496,117) <= tx-0.bytes /sec
|Ethtool(eno2np1 ) stat: 7441602 ( 7,441,602) <= tx-0.packets /sec
|Ethtool(eno2np1 ) stat: 446603461 ( 446,603,461) <= tx-1.bytes /sec
|Ethtool(eno2np1 ) stat: 7443391 ( 7,443,391) <= tx-1.packets /sec
|Ethtool(eno2np1 ) stat: 893086506 ( 893,086,506) <= tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14884775 ( 14,884,775) <= tx_packets /sec
|Ethtool(eno2np1 ) stat: 14 ( 14) <= tx_restart /sec
|Ethtool(eno2np1 ) stat: 14 ( 14) <= tx_stopped /sec
|Ethtool(eno2np1 ) stat: 14884937 ( 14,884,937) <= tx_unicast /sec
|
|Receive:
|eth1->? 8,735,198 rx/s 0 err,drop/s
| receive total 8,735,198 pkt/s 0 drop/s 0 error/s
| cpu:6 8,735,198 pkt/s 0 drop/s 0 error/s
| enqueue to cpu 3 8,735,193 pkt/s 0 drop/s 7.00 bulk-avg
| cpu:6->3 8,735,193 pkt/s 0 drop/s 7.00 bulk-avg
| kthread total 8,735,191 pkt/s 0 drop/s 208,054 sched
| cpu:3 8,735,191 pkt/s 0 drop/s 208,054 sched
| xdp_stats 0 pass/s 8,735,191 drop/s 0 redir/s
| cpu:3 0 pass/s 8,735,191 drop/s 0 redir/s
| redirect_err 0 error/s
| xdp_exception 0 hit/s
This looks to be in the same range/noise level. In top I see
ksoftirqd at 100% and cpumap/./map at ~60%, so I hit the CPU speed limit
on a 10G link.
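To put a number on "same range": a small sketch computing the relative
change between the two runs. The values are copied from the Receive output
above; the helper name is made up for illustration, and a single sample per
run says nothing about run-to-run variance.

```python
# Relative change between the unpatched and the patched run, in percent.
def rel_change_pct(before, after):
    return (after - before) / before * 100.0


# (unpatched, patched) pairs taken from the xdp-bench output
runs = {
    "receive total pkt/s": (8_732_508, 8_735_198),
    "kthread total pkt/s": (8_732_506, 8_735_191),
    "kthread sched/s": (205_650, 208_054),
}

for name, (unpatched, patched) in runs.items():
    print(f"{name:22s} {rel_change_pct(unpatched, patched):+.3f}%")
```

The packet rates differ by well under 0.1%, the kthread sched count by
roughly 1%.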
perf top shows
| 18.37% bpf_prog_4f0ffbb35139c187_cpumap_l4_hash [k] bpf_prog_4f0ffbb35139c187_cpumap_l4_hash
| 13.15% [kernel] [k] cpu_map_kthread_run
| 12.96% [kernel] [k] ixgbe_poll
| 6.78% [kernel] [k] page_frag_free
| 5.62% [kernel] [k] xdp_do_redirect
for the top 5. Is this something that looks reasonable?