[PLUG] high cpu utilization
bain at devslashzero.com
Fri Nov 13 20:14:38 PST 2009
> > No, you don't want to do that.
> > Passing data around through pipes is expensive; you might as well busy spin in
> > the write thread. (Yes, there is always the splice syscall for a zero-copy op,
> > but it's complicated and not needed; all we need here is a condition
> > variable.)
> How is a busy spin better than a blocking read on a pipe? Also, since
> he is streaming, he needs a 'stream' (perhaps eventually, the write
> may not be so fast?). Either he would have to implement buffer
> management to get the desired stream functionality or go for something
> like a pipe.
And who said streaming _has_ to involve copying the data multiple times?
Writing to a pipe leaves three copies of the data: one in the read thread, one
inside the pipe, and one in the write thread. For every page that has to be
written to disk we create three, which is ridiculous. Splitting the work across
separate processes is even more stupid (unless you use splice as below; with
splice the world makes sense again).
And the reason busy waiting will be faster is the cache lines.
Using any IPC mechanism to transfer huge amounts of data usually trashes
your cache lines badly enough to cause a drastic performance regression. Busy
waiting only wastes CPU cycles, not memory bus bandwidth (which is a much more
limited resource on Intel architectures), and wasted cycles are handled rather
gracefully by a multi-core CPU and the Linux scheduler.
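For comparison with the busy spin, here is a minimal sketch of the condition-variable handoff mentioned earlier in the thread. The read thread passes only a pointer to the write thread, so the payload itself is never copied. The names `slot`, `slot_put` and `slot_take` are made up for illustration, and a real streamer would probably want a ring of buffers rather than a single slot.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical single-slot mailbox: holds one buffer pointer at a time. */
struct slot {
    pthread_mutex_t lock;
    pthread_cond_t  changed;   /* signalled whenever buf changes */
    void           *buf;       /* NULL means the slot is empty */
    size_t          len;
};

#define SLOT_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL, 0 }

/* Read thread: publish a filled buffer; sleeps while the slot is occupied. */
static void slot_put(struct slot *s, void *buf, size_t len)
{
    assert(buf != NULL);       /* NULL is reserved for "empty" */
    pthread_mutex_lock(&s->lock);
    while (s->buf != NULL)
        pthread_cond_wait(&s->changed, &s->lock);
    s->buf = buf;
    s->len = len;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Write thread: take the buffer; sleeps (no spinning) while the slot is empty. */
static void *slot_take(struct slot *s, size_t *len)
{
    pthread_mutex_lock(&s->lock);
    while (s->buf == NULL)
        pthread_cond_wait(&s->changed, &s->lock);
    void *buf = s->buf;
    *len = s->len;
    s->buf = NULL;             /* mark the slot empty again */
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
    return buf;
}
```

The read thread would call `slot_put(&s, page, n)` after filling a page, and the write thread would loop on `slot_take(&s, &n)` and write each page to disk. Only the pointer and length cross between threads.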
But I understand the point you are trying to make. A pipe is a nice abstraction
for streaming data, and that's where a simple splice works
best. Somewhere around 2006 (kernel 2.6.17), Linux gained zero-copy pipes through
splice, which Jens Axboe implemented along lines Linus had described.
This essentially eliminates the read/write threads and even the first data copy.
You open a device driver that produces data, splice it to a pipe, and splice
the pipe to a consumer, who then splices it on to wherever it needs to go.
Basically a producer thread (like a video driver controller) just directs the
stream to a consumer (like an X server window), which ultimately forwards it to
the final destination (the video driver). All of this happens without a single
data copy.
This is how you can now use Unix pipes to implement zero-copy data streaming.
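A rough sketch of the producer -> pipe -> consumer chain described above, using two ordinary file descriptors in place of real drivers. The helper name `splice_stream` is hypothetical, and whether the kernel really avoids every copy depends on what sits behind the descriptors. What the sketch does show is that the data never passes through a user-space buffer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* splice() is a Linux extension */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: stream up to len bytes from in_fd to out_fd through a
 * pipe. Returns the number of bytes moved, or -1 on error. */
static ssize_t splice_stream(int in_fd, int out_fd, size_t len)
{
    int p[2];
    size_t total = 0;
    int failed = 0;

    assert(len > 0);
    if (pipe(p) < 0)
        return -1;

    while (!failed && total < len) {
        /* Producer side: source -> pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len - total, SPLICE_F_MOVE);
        if (n < 0)
            failed = 1;
        if (n <= 0)
            break;             /* error or end of input */

        /* Consumer side: pipe -> destination, until the pipe is drained. */
        for (ssize_t left = n; left > 0; ) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)left, SPLICE_F_MOVE);
            if (m <= 0) {
                failed = 1;
                break;
            }
            left -= m;
        }
        total += (size_t)n;
    }

    close(p[0]);
    close(p[1]);
    return failed ? -1 : (ssize_t)total;
}
```

Note that splice needs a pipe on at least one side of every call, which is why the pipe sits in the middle even for a plain file-to-file transfer.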
BTW: if you want to process the data in between, use vmsplice instead.
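A small sketch of the vmsplice side, assuming the data has already been processed in a user-space buffer and now needs to go into a pipe. The helper name `push_buffer` is made up. As far as I understand it, the pipe refers to the buffer's pages rather than copying them, so the buffer should be left untouched until the consumer has drained the pipe.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* vmsplice() is a Linux extension */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: feed a user-space buffer into the write end of a pipe.
 * Returns the number of bytes queued, or -1 on error. Blocks if the pipe is
 * full and nobody is reading. */
static ssize_t push_buffer(int pipe_wr, void *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        struct iovec iov = {
            .iov_base = (char *)buf + done,
            .iov_len  = len - done,
        };
        ssize_t n = vmsplice(pipe_wr, &iov, 1, 0);
        if (n < 0)
            return -1;
        done += (size_t)n;     /* vmsplice may queue only part of the buffer */
    }
    return (ssize_t)done;
}
```

From there the read end of the pipe can be handed to splice to move the data on to its destination.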
PS: I only know the general idea behind splice; Google for the exact
implementation and usage. I might have gotten the specifics wrong in the
scenario above.
A nice enough intro I found with a quick Google search is here