On Wed, May 16, 2007 at 11:44:13AM -0400, Jamie Guinan wrote:
On Wed, 16 May 2007, brad noyes wrote:

Hi,
Do you mean MB/s as in MegaBytes/second? Lower-case "b" implies "bits" to me, but I suspect you meant bytes. :)
Yes, your assumption is correct; I meant megabytes per second. I never quite understood the capitalization conventions when it comes to computers.
Anyway, if you know your total data set is going to exceed your system memory, which will largely get used as page cache, you might as well open the output with O_SYNC and write straight out. Then the big buffer in the writer thread can go away, and you can just queue chunks between the input and writer.
A good experiment would be to run the writer thread with dummy input data and see what kind of throughput you get.
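Something like this, off the top of my head, would combine both ideas: open with O_SYNC and time a run of dummy chunks. The filename, chunk size, and total size are arbitrary placeholders (link with -lrt for clock_gettime):

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    #define CHUNK   (1 << 20)   /* 1 MB per write(), arbitrary */
    #define NCHUNKS 256         /* 256 MB total, arbitrary */

    int main(void)
    {
        struct timespec t0, t1;
        double secs;
        char *buf;
        int fd, i;

        /* O_SYNC: write() blocks until the data has been flushed to
           the device, so the page cache never balloons. */
        fd = open("/tmp/writetest.bin",
                  O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        buf = malloc(CHUNK);
        if (!buf) { perror("malloc"); return 1; }
        memset(buf, 0xAA, CHUNK);            /* dummy input data */

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < NCHUNKS; i++)
            if (write(fd, buf, CHUNK) != CHUNK) { perror("write"); return 1; }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.1f MB/s\n", (double)NCHUNKS * CHUNK / (1 << 20) / secs);

        free(buf);
        close(fd);
        return 0;
    }

That number is roughly the best sustained rate the disk path will give you, independent of where the real input comes from.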
You could try ext2 (no journalling), or tweak some of the journalling options in ext3 (data=journal/ordered/writeback; see "man mount").
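For example, mounting with the loosest of the three modes (the device and mount point here are just placeholders):

    mount -t ext3 -o data=writeback /dev/sdb1 /mnt/capture

data=writeback journals only metadata, so file data isn't ordered against the journal; it's the fastest mode but the least safe after a crash.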
Great idea. I forgot about tune2fs and the mount options. Thanks.
If you have enough CPU bandwidth, and your input stream has enough redundancy, you could gzip it before writing, which might reduce your output bandwidth requirements.
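If it did compress, zlib makes the on-the-fly version pretty painless. A rough sketch (filename and chunk size are made up; link with -lz):

    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    int main(void)
    {
        char chunk[65536];
        gzFile out;

        /* "w1" = level-1 (fastest) compression, saving CPU for input */
        out = gzopen("/tmp/capture.gz", "w1");
        if (!out) { fprintf(stderr, "gzopen failed\n"); return 1; }

        memset(chunk, 'x', sizeof(chunk)); /* stand-in for one input chunk */

        if (gzwrite(out, chunk, sizeof(chunk)) != (int)sizeof(chunk)) {
            fprintf(stderr, "gzwrite failed\n");
            gzclose(out);
            return 1;
        }

        gzclose(out);
        return 0;
    }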
There isn't enough redundancy in the data. It's essentially random, so it won't compress well at all.
Hope this helps, keep us posted.
Thanks for the input.

--
brad