Irq architecture for multi-core network driver.

From: David Daney
Date: Thu Oct 22 2009 - 17:41:16 EST

Next message: Sam Ravnborg: "Re: [Patch] sctp: remove deprecated SCTP_GET_*_OLD stuffs"
Previous message: Karol Lewandowski: "Re: [PATCH] SLUB: Don't drop __GFP_NOFAIL completely from allocate_slab() (was: Re: [Bug #14265] ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100)"
Next in thread: Chris Friesen: "Re: Irq architecture for multi-core network driver."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

My network controller is part of a multicore SOC family[1] with up to 32 cpu cores.

The packets-ready signal from the network controller can trigger
an interrupt on any or all cpus and is configurable on a per-cpu basis.

If more than one cpu has the interrupt enabled, every one of them gets
the interrupt, so when a single packet is ready, all of those cpus can
be interrupted and try to process it. The kernel interrupt management
functions don't seem to give me a good way to manage the interrupts.
More on this later.

My current approach is to add a NAPI instance for each cpu. I start
with the interrupt enabled on a single cpu. When the interrupt
triggers, I mask the interrupt on that cpu and schedule the
napi_poll. When the napi_poll function is entered, I look at the
packet backlog and, if it is above a threshold, I enable the interrupt
on an additional cpu. The process then iterates until the number of cpus
running the napi_poll function can keep the backlog under the
threshold. This all seems to work fairly well.
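
As a rough illustration of the scaling policy described above, here is
a user-space model in C. It is not the driver's code: the struct, the
field names and rx_maybe_add_cpu() are hypothetical, and the model
treats a cpu as "active" whether its interrupt is currently armed or it
is busy in napi_poll.

```c
#include <assert.h>
#include <stdint.h>

#define RX_NR_CPUS 32u	/* the SoC family has up to 32 cores */

/* Hypothetical model state, not a kernel structure. */
struct rx_irq_state {
	uint32_t active_mask;		/* bit n set: cpu n takes part in rx */
	unsigned int backlog_threshold;	/* add a cpu when backlog exceeds this */
};

/*
 * Called at the start of a poll. If the backlog is above the threshold,
 * bring in the lowest-numbered cpu that is not yet active.
 * Returns the cpu that was added, or -1 if nothing changed.
 */
static int rx_maybe_add_cpu(struct rx_irq_state *st, unsigned int backlog)
{
	unsigned int cpu;

	assert(st);

	if (backlog <= st->backlog_threshold)
		return -1;

	for (cpu = 0; cpu < RX_NR_CPUS; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;

		if (!(st->active_mask & bit)) {
			st->active_mask |= bit;
			return (int)cpu;
		}
	}

	return -1;	/* every cpu is already active */
}
```

In a real driver the step that sets the bit would have to unmask the
interrupt on the chosen cpu, which is exactly the part that currently
bypasses the kernel's interrupt management.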

The main problem I have encountered is how to fit the interrupt
management into the kernel framework. Currently the interrupt source
is connected to a single irq number. I request_irq, and then manage
the masking and unmasking on a per cpu basis by directly manipulating
the interrupt controller's affinity/routing registers. This goes
behind the back of all the kernel's standard interrupt management
routines. I am looking for a better approach.

One thing that comes to mind is that I could assign a different
interrupt number per cpu to the interrupt signal. So instead of
having one irq I would have 32 of them. The driver would then do
request_irq for all 32 irqs, and could call enable_irq and disable_irq
to enable and disable them. The problem with this is that there isn't
really a single packets-ready signal, but instead 16 of them. So if I
go this route I would have 16 (lines) x 32 (cpus) = 512 interrupt
numbers just for the networking hardware, which seems a bit excessive.
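
To make the 16 x 32 arithmetic concrete, one possible numbering scheme
is sketched below. The macro names, the base argument and the
line-major layout are assumptions made for illustration only; nothing
in the existing code defines them.

```c
#include <assert.h>

#define RX_LINES	16u	/* packets-ready signals */
#define RX_CPUS		32u	/* maximum cores in the SoC family */
#define RX_IRQ_COUNT	(RX_LINES * RX_CPUS)	/* 512 irq numbers */

/*
 * Hypothetical mapping from (line, cpu) to an irq number, with all
 * per-cpu irqs of one line stored contiguously starting at 'base'.
 */
static unsigned int rx_irq_number(unsigned int base, unsigned int line,
				  unsigned int cpu)
{
	assert(line < RX_LINES && cpu < RX_CPUS);
	return base + line * RX_CPUS + cpu;
}
```

With this layout the driver would need one request_irq call for each
of the RX_IRQ_COUNT numbers, which is the cost being questioned here.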

A second possibility is to add something like:

int irq_add_affinity(unsigned int irq, cpumask_t cpumask);

int irq_remove_affinity(unsigned int irq, cpumask_t cpumask);

These would atomically add and remove cpus from an irq's affinity.
This is essentially what my current driver does, but it would go
through a new, officially blessed kernel interface.
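
The intended semantics can be modelled in user space with C11 atomics.
This is only a sketch under stated assumptions: struct model_irq and
the model_irq_* functions are hypothetical stand-ins, a plain 32-bit
mask replaces cpumask_t, and a real implementation would also have to
program the interrupt controller's routing registers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one irq: bit n set means cpu n receives it. */
struct model_irq {
	_Atomic uint32_t affinity;
};

/* Atomically add the cpus in 'mask'; returns the previous affinity. */
static uint32_t model_irq_add_affinity(struct model_irq *irq, uint32_t mask)
{
	assert(irq);
	return atomic_fetch_or(&irq->affinity, mask);
}

/* Atomically remove the cpus in 'mask'; returns the previous affinity. */
static uint32_t model_irq_remove_affinity(struct model_irq *irq, uint32_t mask)
{
	assert(irq);
	return atomic_fetch_and(&irq->affinity, (uint32_t)~mask);
}
```

Because each call is a single read-modify-write, two cpus changing
their own bits at the same time cannot lose each other's update, which
a separate read followed by a write of the whole mask could.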

Any opinions about the best way forward are most welcome.

Thanks,
David Daney

[1]: See arch/mips/cavium-octeon and drivers/staging/octeon. Yes, the staging driver is ugly; I am working to improve it.