'Brigades and Buckets to file-descriptors'


List:       apr-dev
Subject:    Brigades and Buckets to file-descriptors
From:       Colm MacCarthaigh <colm () stdlib ! net>
Date:       2006-04-29 15:03:43
Message-ID: 20060429150343.GA10265 () dochas ! stdlib ! net


I'm currently playing with the new Linux splice() and tee() system
calls, to see what, if any, performance boost they give me. Right now
I'm just hacking in a replacement for httpd's core output filter, which
is messy, so I'd like to add what I'm doing to APR.

I'd like to add a call that will figure out the most efficient way to
read/write a bucket to/from a file-descriptor. The call would be implemented
as a function pointer for each bucket type, just like what we're used
to, and would decide internally to use read() + write(), mmap() +
write(), sendfile(), splice(), or any other clever trick the platform
supports.
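
A minimal sketch of the per-bucket-type dispatch described above might
look like the following. Every name here (sketch_bucket,
sketch_bucket_type, sketch_bucket_send) is hypothetical and not existing
APR API, and the only bucket type shown is a heap-backed one that can do
nothing smarter than write().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names throughout: none of this is existing APR API. */
typedef struct sketch_bucket sketch_bucket;

typedef struct {
    const char *name;
    /* Moves up to *len bytes of the bucket to fd and stores the number
     * of bytes actually written in *len. Returns 0 on success, -1 on
     * error. */
    int (*send)(sketch_bucket *b, int fd, size_t *len);
} sketch_bucket_type;

struct sketch_bucket {
    const sketch_bucket_type *type;
    const char *data;   /* payload, used by the heap bucket type */
    size_t length;      /* number of bytes available in data */
};

/* Heap-backed bucket: the data is already in memory, so plain write()
 * is the only option. A file bucket could plug in sendfile() or
 * splice() here instead. */
static int heap_send(sketch_bucket *b, int fd, size_t *len)
{
    size_t want = *len < b->length ? *len : b->length;
    ssize_t n = write(fd, b->data, want);

    if (n < 0) {
        *len = 0;
        return -1;
    }
    *len = (size_t)n;
    return 0;
}

static const sketch_bucket_type sketch_bucket_type_heap = {
    "HEAP", heap_send
};

/* Callers never choose the mechanism; they go through the type's
 * function pointer and let the bucket type decide. */
static int sketch_bucket_send(sketch_bucket *b, int fd, size_t *len)
{
    assert(b != NULL && b->type != NULL && len != NULL);
    return b->type->send(b, fd, len);
}
```

The point of the indirection is that the caller's code stays the same
whichever trick a given bucket type ends up using.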

sendfile can handle filesystem -> socket; splice can handle
filesystem/pipe/socket -> filesystem/pipe/socket (so, for example, it's
possible to do zero-copy proxy request handling). tee can do some really
clever stuff like give us zero-copy to a socket, while giving us
zero-copy to disk, ideal for mod_disk_cache. Of course those particular
tricks are single-platform right now, but they will probably spread.
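
As a rough illustration of "try the clever call, fall back to the plain
one", here is a Linux-specific sketch built on sendfile() rather than
splice() or tee(), since it is the simplest of the three to show in
isolation. The helper names sketch_fd_send and sketch_path_send are
hypothetical, and the choice of which errno values trigger the fallback
is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not APR API: copy up to len bytes from in_fd (a
 * regular file, starting at offset) to out_fd. Tries sendfile() first
 * and falls back to pread() + write() if the kernel rejects the fd
 * pair. Returns the number of bytes copied, or -1 on error. EINTR and
 * EAGAIN are treated as errors to keep the sketch short. */
static ssize_t sketch_fd_send(int out_fd, int in_fd, off_t offset, size_t len)
{
    ssize_t total = 0;

    assert(out_fd >= 0 && in_fd >= 0);
    while (len > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, len);

        if (n < 0 && (errno == EINVAL || errno == ENOSYS)) {
            /* Fallback path: copy through a userspace buffer. */
            char buf[8192];
            size_t want = len < sizeof buf ? len : sizeof buf;
            ssize_t r = pread(in_fd, buf, want, offset);

            if (r < 0)
                return -1;
            if (r == 0)
                break;                  /* end of input */
            n = write(out_fd, buf, (size_t)r);
            if (n < 0)
                return -1;
            offset += n;
        } else if (n < 0) {
            return -1;
        }
        if (n == 0)
            break;                      /* end of input */
        total += n;
        len -= (size_t)n;
    }
    return total;
}

/* Convenience wrapper: open path read-only and send a range of it. */
static ssize_t sketch_path_send(int out_fd, const char *path,
                                off_t offset, size_t len)
{
    int in_fd = open(path, O_RDONLY);
    ssize_t n;

    if (in_fd < 0)
        return -1;
    n = sketch_fd_send(out_fd, in_fd, offset, len);
    close(in_fd);
    return n;
}
```

A splice()-based path could presumably be slotted in ahead of sendfile()
in the same way, with the same fallback underneath it.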

The call might look like:

xxxx_bucket_send(apr_bucket *from, apr_bucket_brigade *to, apr_size_t *len,
                 apr_off_t offset, int blocking_flag);

The reason for having "to" be a brigade type is complex:

        1. The bucket containers help us avoid unnecessary calls to
           ioctl trying to figure out what type of file descriptor we're
           dealing with.

        2. It serves as a neat way to manage the parallelism. Rather
           than the traditional read-in-sequence behaviour of a brigade
           we would use the brigade as a write-in-parallel structure
           too. So if multiple files need to be written to, we create
           a bucket for each and add them to a brigade.

This of course is a new, and potentially confusing, use of both brigades
and buckets, which I'm a little concerned about, but I do like the
relative simplicity of it. The length field of the bucket container
would be used to indicate how much data we managed to write to the
bucket, or < 0 on error, just like a regular write().
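
The "one destination bucket per file descriptor, result reported in its
length field" idea could be sketched roughly as below. The brigade is
reduced to a plain array and the writes happen one after another with
write(), so this only illustrates the bookkeeping, not any zero-copy or
truly parallel behaviour. sketch_dest and sketch_send_all are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a destination bucket: one per descriptor. */
typedef struct {
    int fd;
    ssize_t length;  /* bytes written by the last send, or < 0 on error */
} sketch_dest;

/* Offers the same buffer to every destination in the "brigade" (here a
 * plain array) and records each result in that destination's length
 * field. Returns the number of destinations that reported an error. */
static int sketch_send_all(const char *data, size_t len,
                           sketch_dest *dests, size_t ndests)
{
    int failures = 0;

    assert(data != NULL && dests != NULL);
    for (size_t i = 0; i < ndests; i++) {
        dests[i].length = write(dests[i].fd, data, len);
        if (dests[i].length < 0)
            failures++;
    }
    return failures;
}
```

After the call, the caller walks the destinations and checks each length
field, much as it would check the return value of write().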

While I'm at it, I want to get the performance boost that configurable
read buffer sizes can give. So I want to spend a few bytes in the
bucket structure on a bufsize field and add an accessor/modifier.
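
One possible shape for such a field and its accessor/modifier is
sketched below. The struct, the function names and the default value are
all assumptions made for illustration; a real change would add the field
to the existing bucket structure instead.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed default for the sketch, used when no size has been set. */
#define SKETCH_DEFAULT_BUFSIZE 8000

/* Hypothetical bucket carrying a per-bucket read buffer size. */
typedef struct {
    size_t bufsize;  /* 0 means "use the default" */
} sketch_file_bucket;

/* Accessor: the effective read buffer size for this bucket. */
static size_t sketch_bucket_bufsize_get(const sketch_file_bucket *b)
{
    assert(b != NULL);
    return b->bufsize ? b->bufsize : SKETCH_DEFAULT_BUFSIZE;
}

/* Modifier: returns 0 on success, -1 if the size is rejected. */
static int sketch_bucket_bufsize_set(sketch_file_bucket *b, size_t size)
{
    assert(b != NULL);
    if (size == 0)
        return -1;
    b->bufsize = size;
    return 0;
}
```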

I'm not sure whether a setvbuf-like operation makes sense, as we did
for the regular APR file_io system, since buckets handle all of their
allocations internally and it would seem to make more sense to me to
keep the memory there logically.

Any thoughts on the above?

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net