luwak backend and misc.

4 messages

luwak backend and misc.

Kunal Nawale
Hi,
  I am evaluating Luwak for use as a redundant file storage server, and I am trying to find out which backend best suits my purpose. I have 4 servers in total; each has sixteen 1 TB drives, 48 GB of RAM, and a 10 Gb network interface.
The file sizes that will be stored range from 1 GB to 20 GB, with an average size of 3 GB.

Here are some observations/questions I had regarding this.

1) With the bitcask backend, I tried uploading a 6 GB file. The upload and download worked fine for this file. But when I tried to upload a 17 GB file, it took a very long time (more than 20 minutes). I tried to download it but did not succeed; the download always comes back with a size of 1,000,000 bytes.

2) I also tried fs_backend, but it turned out to be quite slow; the
upload of a 6 GB file took considerably longer. The download never
succeeded: it always returned a chunk of the file, not the whole file.

3) Are there any performance measurements available for the read/write bandwidths?

4) Are there any latency numbers available? I am specifically looking at the time difference between the first byte read and the last byte written for an object.

5) Can an object be read simultaneously while it is being written, with the lag between the read and write pointers in the range of 60 MB?

Thanks
-Kunal

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: luwak backend and misc.

bryan-basho
Administrator
On Thu, Jul 28, 2011 at 12:00 PM, Kunal Nawale <[hidden email]> wrote:

> Hi,
>  I am evaluating luwak to be used as a redundant file storage server. I am
> trying to find out which backend will better suit for my purpose. Each of my
> server has sixteen 1TB drives, 4 total servers, 48GB ram each, 1x10Gb
> network interface
> The file sizes that will be stored range from 1GB-20GB, with an average size
> of 3 GB.
>
> Here are some observations/questions I had regarding this.
>
> 1) With bitcask backend, I tried uploading a 6 GB file. The upload and
> download worked fine for this file. But when I tried to upload a 17 GB file
> it took a very long time (more than 20 mins). Tried to download it but did
> not succeed, the download always come back with a size of 1,000,000 bytes.

There are a few things that can cause these troubles.  Have you
checked the logs to see if there were any errors during any of these
operations?

On the upload side, it's possible that 6GB stands on one side of a
boundary, and 17GB on the other.  I'd suggest searching the size space
in a binary fashion: does 11.5GB work?  If there is a boundary, this
is a good way to find it.  It might be worth trying this both in the
case where the Riak cluster is cleaned out after each attempt, and
where it is left running with all data for all attempts.  Does
changing the "block_size" luwak parameter (controlled by the
X-Luwak-Block-Size HTTP header when you create the file) change where
the boundary is?
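A minimal sketch of that bisection in shell (sizes are in MB; the curl line in the comment is only an illustration, since the node address, the /luwak path, and how you record success all depend on your setup):

```shell
#!/bin/sh
# Bisect the size boundary between a known-good (6 GB) and a
# known-bad (17 GB) upload.  Sizes are in MB.
lo=6144    # largest size known to upload and read back correctly
hi=17408   # smallest size known to fail
while [ $((hi - lo)) -gt 1024 ]; do   # stop at 1 GB resolution
  mid=$(( (lo + hi) / 2 ))
  echo "try ${mid} MB"
  # Hypothetical upload step (assumes a live node at riak-node:8098):
  # dd if=/dev/zero bs=1M count=$mid | \
  #   curl -X PUT --data-binary @- \
  #        http://riak-node:8098/luwak/probe-$mid
  ok=0   # placeholder: set to 1 if the upload and readback succeed
  if [ "$ok" -eq 1 ]; then lo=$mid; else hi=$mid; fi
done
echo "boundary is between ${lo} and ${hi} MB"
```

With the placeholder always reporting failure, the probe sizes shrink toward the known-good end; wiring in a real upload-and-readback check makes the loop converge on the actual boundary.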

Still on the upload side, where was the upload client running?  That
client may have been applying extra memory pressure to one of your
nodes if it was on the same machine as the cluster.

On the download side, if you haven't modified the block_size of your
luwak files, 1,000,000 indicates that there's exactly one block in the
file.  If this was an existing file of 1MB, then this just means that
your 17GB upload failed before flushing the tree for the new data.
We've also noticed that some clients (like Firefox) have trouble
parsing Luwak's chunked response, due to an error in gzip encoding -
try explicitly setting Accept-Encoding to only identity.
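For example, with curl (the node address and file name are placeholders, not a verified setup):

```shell
# Download a Luwak file while forcing an identity encoding, so the
# client does not try to gunzip the chunked response.
# "riak-node:8098" and "bigfile" are placeholders for your cluster.
curl -H "Accept-Encoding: identity" \
     -o bigfile.out \
     http://riak-node:8098/luwak/bigfile
```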

> 2) I also tried fs_backend, but it turned out to be quite slow, the
> upload of a 6 GB took considerably longer. The download never succeeded
> it always returned me a chunk of that file not the whole file.

The fs_backend was written as proof/testing code.  It is not optimized
for any variety of speed.  Best to stick with Luwak for your use case,
I think.

> 3) Are there any performance measurements available about the read/write
> bandwidths

None directly, but you should be able to estimate the write speed:
Luwak creates a Riak object for every N bytes of your file (N is known
as the "block size").  Luwak will not be able to write these objects
faster than any other Riak client.
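As a back-of-the-envelope sketch of that estimate (3 GB is your stated average file size; block_size is the 1,000,000-byte default):

```shell
# Estimate how many Riak object writes one Luwak upload costs:
# roughly one object write per block of the file.
file_bytes=$((3 * 1024 * 1024 * 1024))    # 3 GB average file
block_size=1000000                         # default Luwak block_size
blocks=$(( (file_bytes + block_size - 1) / block_size ))   # ceiling
echo "${blocks} blocks, so at least ${blocks} Riak object writes"
```

Multiplying that count by your measured per-object write latency (divided by your write concurrency) gives a rough lower bound on upload time.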

> 5) Can an object be read simultaneously while it is being written. With a
> lag between the read write pointer being in the range of 60 MBytes.

I think questions 4 and 5 are related, and it's easier to answer 5 first.

The simple answer is *new* Luwak files cannot be read while being
written.  This has to do with the fact that the HTTP interface does
not expose a way to flush the file's tree to Riak before finishing the
upload.  The data for the file is being persisted, but the root
pointer is not modified until the end.

This also means that an *existing* Luwak file *can* be read while being
modified, but modifications made after the root pointer is found will
be invisible.

> 4) Are there any latency numbers available, I am specifically looking at the
> time difference between the first byte read and the last byte write for an
> object.

I'm interpreting this question as, "After I finish writing, how long
will it be before I can begin reading what I just wrote?" because of
the tree-flushing behavior I described above.

The answer depends on the backlog to the Luwak writer process (on the
Riak/server side), and the depth of the resulting file tree.  Once the
writer has written the final block to an object, it must then flush
the tree pointing to that object.  Flushing the tree requires writing,
at least, the root node and the "tld" object (where the metadata about
the file is stored).  The block_size and tree_order parameters
(1,000,000 bytes and 250, by default) determine how many other nodes
must be written between the root and the block.  Each node is an
additional Riak object write.
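A sketch of that node count for a 17 GB file with the default parameters (the depth computation is my reading of the description above, not measured behavior):

```shell
# How many pointer levels sit between the root and a data block,
# given the default block_size (1,000,000 B) and tree_order (250).
file_bytes=$((17 * 1024 * 1024 * 1024))   # 17 GB file
block_size=1000000
tree_order=250
blocks=$(( (file_bytes + block_size - 1) / block_size ))
depth=0
span=1       # blocks reachable from one node at the current depth
while [ "$span" -lt "$blocks" ]; do
  span=$((span * tree_order))
  depth=$((depth + 1))
done
echo "${blocks} blocks, tree depth ${depth}"
```

So a 17 GB file needs on the order of two pointer levels, each adding at least one more Riak object write at flush time, on top of the root and "tld" writes.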

I hope that helps,
Bryan


Re: luwak backend and misc.

Kunal Nawale
Hi Bryan,
  I will try increasing the file size from 6 GB up and find out where it
breaks. I will also capture the logs when it fails.

I have not played with the 'block_size' parameter, will try that too.
The upload client is running on a separate machine.
I am using curl to upload and download.
Thanks for your help.
-kunal



On 08/08/2011 03:51 PM, Bryan Fink wrote:

> ...deleted...


Re: luwak backend and misc.

Drew Whitehouse
It would seem that a new file is readable (sort of) before the tree is flushed, no?


-Drew


> ...deleted...
> The simple answer is *new* Luwak files cannot be read while being
> written.  This has to do with the fact that the HTTP interface does
> not expose a way to flush the file's tree to Riak before finishing the
> upload.  The data for the file is being persisted, but the root
> pointer is not modified until the end.
>
> This also means that *existing* Luwak file *can* be read while being
> modified, but modifications made after the root pointer is found will
> be invisible.




--
Drew Whitehouse
ANU Supercomputer Facility Vizlab


