OLD CODE
This work is old. I never got around to finishing it. I suggest you just use epoll.
See the following info:
man epoll_create
man epoll_wait
man epoll_ctl
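For readers following the advice above, here is a minimal sketch of a callback-style event loop built on epoll. It is only an illustration: `struct watcher`, `watch_readable`, `dispatch_once` and `mark_fired` are hypothetical names made up for this example, and only `epoll_ctl` and `epoll_wait` come from the man pages listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor record. epoll only stores an opaque pointer
 * per descriptor, so the callback is kept in our own struct. */
struct watcher {
    int fd;
    void (*fn)(struct watcher *);
    int fired;                      /* set by the demo handler below */
};

/* Register w->fd for "readable" events. Returns 0 on success, -1 on error. */
static int watch_readable(int epfd, struct watcher *w)
{
    struct epoll_event ev = {0};

    ev.events = EPOLLIN;
    ev.data.ptr = w;                /* handed back to us by epoll_wait() */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, w->fd, &ev);
}

/* Wait up to timeout_ms, then run the callback of every ready descriptor.
 * Returns the number of callbacks run, or -1 on error. */
static int dispatch_once(int epfd, int timeout_ms)
{
    struct epoll_event events[64];
    int n = epoll_wait(epfd, events, 64, timeout_ms);

    for (int i = 0; i < n; i++) {
        struct watcher *w = events[i].data.ptr;

        assert(w != NULL && w->fn != NULL);
        w->fn(w);                   /* only ready descriptors are visited */
    }
    return n;
}

/* Demo handler: just records that it was called. */
static void mark_fired(struct watcher *w)
{
    w->fired = 1;
}
```

To try it, create the epoll descriptor with `epoll_create1(0)`, fill in a `struct watcher` for a pipe or socket, call `watch_readable()` once, and then call `dispatch_once()` in a loop.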
What is here?
This work is based on a discussion on the linux-kernel mailing list about the poor scalability of select/poll and how it should be fixed. Linus proposed an implementation that was more scalable than poll/select and lighter on resources than RT signals. The work below is based on the discussions that followed that debate.
If you are interested in this kind of stuff you may also look at kevents for BSD; there is an article at openmagazine about it.
Why are you doing this?
It just so happened that a few months after this discussion was forgotten, there was a need for such a performance boost at my work (Chrysalis-ITS). The work you see here is being done in the hope of improving the performance of proxies that handle on the order of tens of thousands of connections, and hence tens of thousands of file descriptors. When a poll/select call returns, the array it hands back must be scanned for descriptors with data. The descriptors that fire are at most an order of magnitude fewer than the number of sockets, yet the whole array must be scanned. This leads to very poor performance.
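The scanning cost described above can be shown with a small user-space sketch. It assumes nothing beyond standard `poll()` and `pipe()`. The names `struct scan_result` and `poll_scan_demo` are made up for this illustration, and the "only the last descriptor is ready" setup is a deliberately bad case, not a measurement.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Outcome of one poll() pass. */
struct scan_result {
    int ready;      /* descriptors that reported an event */
    int scanned;    /* array entries the caller had to inspect */
};

/* Create nfds pipes, make only the last one readable, poll() all the read
 * ends, and count how many entries must be inspected to find the ready one. */
static struct scan_result poll_scan_demo(int nfds)
{
    struct scan_result r = { 0, 0 };
    struct pollfd *pfds = calloc((size_t)nfds, sizeof *pfds);
    int *wr = calloc((size_t)nfds, sizeof *wr);

    if (pfds == NULL || wr == NULL)
        abort();

    for (int i = 0; i < nfds; i++) {
        int p[2];

        if (pipe(p) != 0)
            abort();
        pfds[i].fd = p[0];
        pfds[i].events = POLLIN;
        wr[i] = p[1];
    }
    if (write(wr[nfds - 1], "x", 1) != 1)
        abort();

    int n = poll(pfds, (nfds_t)nfds, 0);
    assert(n >= 0);

    /* poll() reports how many descriptors are ready, not which ones,
     * so the caller walks the array until all of them are found. */
    for (int i = 0; i < nfds && r.ready < n; i++) {
        r.scanned++;
        if (pfds[i].revents & POLLIN)
            r.ready++;
    }

    for (int i = 0; i < nfds; i++) {
        close(pfds[i].fd);
        close(wr[i]);
    }
    free(pfds);
    free(wr);
    return r;
}
```

With 64 descriptors and one ready, `poll_scan_demo(64)` inspects all 64 entries to deliver a single event. An interface that returns only the descriptors that fired avoids that walk.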
How will I use it?
Here is a sample TCP echo server (telnet to port 9999). The package includes the automake/autoconf files, so it is a bit bloated, and you will have to download one of the patches (below) to get it to work:
- [ eventtest-20010415.005341.tar.gz ] ( 28k ) - use with >= alpha8
- [ eventtest-20010406.233704.tar.gz ] ( 30k ) - older version with POLLOUT hack
Here is the basic idea:
    #include <linux/fdevent.h>

    static void handler(bind_event *);
    ...
    int fd;
    struct fdevent ev;
    void* something;

    fd = socket(...); /* or open() */

    ev.fd = fd;
    ev.mask = FDEVENT_IN;
    ev.data = something;
    ev.fn = handler;
    sys_bind_event(&ev);
    ...
    for(;;) {
        bind_event events[DESIRED_MAX];
        num = sys_get_events( &events, DESIRED_MAX, SLEEP_IN_MS );
        for(i=0; i<num; i++)
            events[i].fn( &events[i] );
    }
What is missing?
This is my TODO list, in order of decreasing importance:
- struct fdevent should be reorganized to group items which are copied from kernel to user space and those that are not; this would use one larger copy as opposed to multiple small ones
- must decide if a fork()ed child inherits the events of the parent
- write a blurb for Configure.help
- after all that I will make a RELEASE BETA
- a few examples of how this should be used are needed; the one above is too simple
- man pages
- benchmarks are needed to prove that this is valid
- try to implement poll on top of this
- eventually... see if aio_read can be implemented using this (see I/O Event Handling Under Linux by Richard Gooch)
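As a rough illustration of the first TODO item, the sketch below groups the fields that user space sees into one contiguous sub-struct so that a single copy covers them. It is a user-space mock-up under loud assumptions: `struct fdevent_user`, `struct fdevent_sketch`, the `next` and `queued` bookkeeping fields, and `export_event` are all hypothetical, and `memcpy` merely stands in for a single kernel-to-user copy. Only the field names `fd`, `mask`, `data` and `fn` are taken from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fields that are handed to user space, kept contiguous. */
struct fdevent_user {
    int fd;
    unsigned int mask;
    void *data;
    void (*fn)(struct fdevent_user *);
};

/* Hypothetical full event record: user-visible part first,
 * kernel-only bookkeeping (invented for this sketch) after it. */
struct fdevent_sketch {
    struct fdevent_user u;
    struct fdevent_sketch *next;    /* hypothetical queue link */
    int queued;                     /* hypothetical state flag */
};

/* One block copy of the user-visible part, instead of one small copy
 * per field. memcpy() stands in for the kernel-to-user copy here. */
static void export_event(struct fdevent_user *dst,
                         const struct fdevent_sketch *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, &src->u, sizeof src->u);
}
```

The bookkeeping fields never leave `struct fdevent_sketch`, so adding or reordering them would not change what is copied out.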
So, what is done?
The patches below are in reverse order; the one at the top is the one you want, as it is most current.
- alpha8 [ April 14, 2001 ] (based on kernel 2.4.3)
Still a concept patch. TCP sockets are the only file descriptors I will use for now. Works great for POLLIN/POLLOUT-type events. The next thing to do is some cleanup and optimization, so that I can produce some benchmarks.
- [ a8.patch ] ( 26k )
- [ a8.patch.bz2 ] ( 7k )
- alpha7 [ April 6, 2001 ] (based on kernel 2.4.3)
Still a concept patch. TCP sockets are the only file descriptors I will use for now. Works great for POLLIN-type events. I still have to work out the POLLOUT-type. Wait queues work fine now.
- [ a7.patch ] ( 25k )
- [ a7.patch.bz2 ] ( 7k )
- alpha6 [ April 6, 2001 ] (based on kernel 2.4.3)
Still a concept patch. TCP sockets are the only file descriptors I will use for now. Works great for POLLIN-type events. I still have to work out the POLLOUT-type.
- [ a6.patch ] ( 18k )
- [ a6.patch.bz2 ] ( 5k )
- alpha5 [ April 4, 2001 ] (based on kernel 2.4.3)
Still a concept patch. TCP sockets are the only file descriptors I will use for now. Fixed some locking issues that I found when a forked process was released. It actually seems to work now. I have only tried it with a single read... not bad. One thing that is missing is the timeout; currently it is ignored, and I will work on this next.
- [ a5.patch ] ( 16k )
- [ a5.patch.bz2 ] ( 5k )
- alpha4 [ April 3, 2001 ] (based on kernel 2.4.3)
Still a concept patch. TCP sockets are the only file descriptors I will use for now. Removed most of the crashes caused by incorrectly initialized structure pointers. I still need to finish work on 'copy_files' before the implementation can work with fork/pthread_create (need to duplicate the event queue on the new fd). Next step is to write a sample server and try it.
- [ a4.patch ] ( 14k )
- [ a4.patch.bz2 ] ( 4k )
- alpha3 [ April 2, 2001 ] (based on kernel 2.4.3)
Still a concept patch. I have tied into tcp sockets and that is where I will test first. Pretty much all the management functions have been done. I still need to initialize current->files (the stuff I added) and need to see if I have to do anything when the task is cloned, etc. Besides that it looks good.
- [ a3.patch ] ( 11k )
- [ a3.patch.bz2 ] ( 3k )
- alpha2 [ April 2, 2001 ] (based on kernel 2.4.3)
OK, still a concept patch. I have reworked the way that the 'file' locates which queue it's supposed to be on, and fixed a bunch of typos and errors that were not caught until I tried to compile it the first time.
- [ a2.patch ] ( 7k )
- [ a2.patch.bz2 ] ( 2k )
- alpha1 [ April 1, 2001 ] (based on kernel 2.4.3)
This is just a design concept patch. The structs are defined in the headers, and user integration is done. The events are not yet fed into the queue.
- [ a1.patch ] ( 5k )
- [ a1.patch.bz2 ] ( 2k )