Re: [PATCH 0/3] usb: gadget: uvc: allocate requests based on frame interval length and buffersize
From: Thinh Nguyen
Date: Tue May 21 2024 - 21:42:54 EST
On Wed, May 22, 2024, Michael Grzeschik wrote:
> On Fri, May 17, 2024 at 01:44:05AM +0000, Thinh Nguyen wrote:
> > On Mon, May 13, 2024, Michael Grzeschik wrote:
> > >
> > > I think we have to discuss what is ment by resynchronization here. If
> > > the trb ring buffer did run dry and the software is aware of this
> > > (elemnt in the started and prepared list) then the interrupt handler
> > > already is calling End Stream Command.
> >
> > The driver only aware of this when the controller tells it, which may be
> > already too late.
>
> In our special case there should not be any too late any more. Since we
> ensure that all requests will be enqueued for one transfer (which will
> represent one frame) in time and we are not dependent on the complete
> handler for nothing else than telling the uvc driver that the last
> request came back or if there was some error in the current active
> frame.
>
> As already stated we also have to wait with enqueueing the next frame
> to the hardware and only are allowed to enqueue one frame at a time.
> Otherwise it is not possible to tell the host if the frame was broken or
> not.
>
> I have the following scheme in my mind. It is simplified to take frames
> of only 4 requests into account. (>80 chars warning)
>
>
> frameinterval: | 125 us | 125 us | 125 us | 125 us | 125 us | 125 us | 125 us |
> | | | | | | | |
> pump thread: queue |rqA1 rqA2 rqA3 rqA4'| | | | |rqB0 rqB1 rqB2 rqB3 |rqB4' |
> irq thread: complete | |rqA1 |rqA2 |rqA3 |rqA4' | |rqB0 | rqB1
> qbuf thread: encode |rqB1 rqB2 rqB3 rqB4'| | | | |rqA1 rqA2 rqA3 rqA4'| |
>
> dwc3 thread: Hardware < Start Transfer End Transfer > < Start Transfer ....
>
> legend:
> - rq' : last request of a frame
> - rqB0 : first request of the next transfer with no payload but the header only
> telling the host that the last frame was ok/broken
>
> assumption:
>
> - pump thread is never interrupted by a kernel thread but only by some short running irq
> - if one request comes back with -EXDEV the rest of the enqueued requests should be flushed
>
> In the no_interrupt case we would also only generate the interrupt for
> the last request and giveback all four requests in the last interval.
> This should still work fine.
>
> We also only start streaming when one frame is totally available to be
> enqueued in one run. So in case frames with rqA and rqB both did come back
> with errors the start of the next frame will only begin after the next
> frame was completely and fully encoded.
Yes. This is better.
>
> > > When the stream is stopped, what implications does this have on the bus?
> > >
> > > When the Endpoint is enabled, will the hardware then send zero-length
> > > requests on its own?
> >
> > For isoc endpoint IN, yes. If the host requests for isoc data IN while
> > no TRB is prepared, then the controller will automatically send 0-length
> > packet respond.
>
> Perfect! This will help a lot and will make active queueing of own
> zero-length requests run unnecessary.
Yes, if we rely on the current start/stop isoc transfer scheme for UVC,
then this will work.
>
> > > With the next ep_queue we start another stream and when we keep up with
> > > this stream there is no underruns, right?
> > >
> > > I picture this scenario in my mind:
> > >
> > > thread 1: uvc->queue_buf is called:
> > > - we encode the frame buffer data into all available requests
> > > and put them into the per uvc_buffer perpared list
> > > (as we precalculated the amount of requests properly to the expected
> > > frame duration and buffer size there will be enough requests
> > > available)
> > > - wake up the pump thread
> > >
> > > thread 2: pump_worker is triggered
> > > - take all requests from the prepared available buffer and enqueue them
> > > into the hardware
> > > (The pump worker is running with while(1) while it finds requests in
> > > the per buffer prepared list) and therefor will have a high chance
> > > to finish the pumping for one complete frame.
> > > - check for any errors reported from the complete handlers
> > > - on error
> > > - stop enqueing new requests from current frame
> > > - wait for the last request from errornous frame has returned
> > > - only start pumping new requests from the next buffer when the last
> > > request from the active frame has finished
> > > - In the beginning of the next frame send one extra request with
> > > EOF/ERR tag so the host knows that the last one was ok or not.
> > >
> > > thread 3: complete handler (interrupt)
> > > - give back the requests into the empty_list
> > > - report EXDEV and errors
> > > - wake up the pump thread
> > >
> > > With this method we will continously drain the hw trb stream of the dwc3
> > > controller per frame and therefor will not shoot into one window where
> > > the current stream could be missed. With the data spreading over the
> > > many requests we also avoid the missed requests when the DMA was to
> > > slow.
> > >
> >
> > This sounds good.
> >
> > As long as we can maintain more than X number of requests enqueued to
> > accomodate for the worst latency, then we can avoid underrun. The driver
> > should monitor how many requests are enqueued and hopefully can keep up
> > the queue with zero-length requests.
>
> Except I totally misunderstood something or did oversimplify to much,
> the above explanation should run this unnecessary.
>
> Although we are able to track the amount of enqueued requests in the udc
> driver and compare that with the amount of completed requests.
>
> Also we have the usb_gadget_frame_number callback to the udc controller
> to ask in which frame it is operating at the moment. This way we would
> be able to calculate not to enqueue requests into a transfer that did
> not come back yet completely but would run into missed transfers.
>
I would not depend too much on usb_gadget_frame_number(). There's not
really a hard requirement for the output. It's controller specific. For
dwc3 controller, if operate in highspeed or higher, it returns
microframe number. For fullspeed, it returns frame number.
As you noted, if you can wait and queue all the requests of a video
frame at once, then this will work also.
Thanks,
Thinh