Announcing Riak Pipe (BETA)

bryan-basho
Administrator
Hello again, Community.

I'm excited to announce the opening of a new beta-status Basho project
today: Riak Pipe.

http://github.com/basho/riak_pipe

Riak Pipe is a new way to distribute work around a Riak cluster.

The README explains much more than I can here, but essentially Riak
Pipe allows you to specify work in the form of a chain of function
pairs.  One function of that pair describes how to produce output from
input, and the other describes where in the cluster an input should be
processed.  Riak Pipe handles the details of ferrying data between
workers by building atop Riak Core's distribution power.
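
To make the function-pair idea above concrete, here is a minimal single-process sketch in Python. It is not riak_pipe's API (riak_pipe is written in Erlang, and the README is the reference for it); every name here, including `run_pipeline`, `by_hash`, and the partition count of 8, is a hypothetical stand-in. Each stage pairs a processing function (how to produce output from an input) with a routing function (where that input should be processed).

```python
import zlib
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical types: (process, route) is one "function pair" per stage.
Process = Callable[[Any], Iterable[Any]]
Route = Callable[[Any, int], int]
Stage = Tuple[Process, Route]


def run_pipeline(stages: List[Stage], inputs: Iterable[Any],
                 n_partitions: int = 8):
    """Push inputs through each stage in order.

    Returns the final outputs plus a trace of (stage index, partition, input)
    showing where each input would have been processed.
    """
    trace = []
    current = list(inputs)
    for index, (process, route) in enumerate(stages):
        produced = []
        for item in current:
            partition = route(item, n_partitions)   # "where" half of the pair
            trace.append((index, partition, item))
            produced.extend(process(item))          # "how" half of the pair
        current = produced
    return current, trace


def split_words(line: str):
    return line.split()


def word_length(word: str):
    return [(word, len(word))]


def first_partition(item: Any, n: int) -> int:
    # Send everything to one place.
    return 0


def by_hash(item: Any, n: int) -> int:
    # Deterministic routing: equal inputs always land on the same partition.
    return zlib.crc32(repr(item).encode("utf-8")) % n


outputs, trace = run_pipeline(
    [(split_words, first_partition), (word_length, by_hash)],
    ["riak pipe"],
)
```

In a real cluster the routing result would select a worker on some node rather than an integer in a trace, and the ferrying between stages is the part Riak Pipe takes care of.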

At this point in time Riak Pipe is BETA-status software.  We'd like
anyone who is interested in it to take a look and send us feedback.
Please do not put it into production.  We will be continuing to
improve Riak Pipe toward a future release date.

We have two plans for Riak Pipe.  The first is to power Riak's
MapReduce system with it.  We think Riak Pipe provides a cleaner, more
manageable subsystem that will provide much easier monitoring,
debugging, and general use of MapReduce in Riak.  You can see our work
toward that goal in the "pipe" branch of Riak KV (start at
src/riak_kv_mrc_pipe.erl):

https://github.com/basho/riak_kv/tree/pipe

Our second plan for Riak Pipe is to expand Riak's MapReduce system
with more abilities (imagine a keyed-reduce phase, or additional
processing languages), possibly to the extent of providing an entirely
separate interface (new query syntax? offline/asynchronous
processing?).  But for this part, we need your help.

We have some ideas about what external client interfaces might look
like.  We also have some ideas about what an external processing
interface might look like.  We're still in the early phases of
creating these, though, so if exploring the riak_pipe repository gives
you ideas, please don't hesitate to get in touch.

And, again, Riak Pipe is BETA software.  Basho does not support
running it in production at this time.

Cheers,

Bryan Fink
Senior Software Engineer
Basho Technologies

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Announcing Riak Pipe (BETA)

Eric Fong
Hi

Can we use Riak Pipe to handle all HTTP requests, so that it acts as a security/application layer and removes the need to put something in front of Riak?

Eric

--
Best Regards,
Eric Fong


Re: Announcing Riak Pipe (BETA)

Antonio Rohman Fernandez
In reply to this post by bryan-basho
Can Riak Pipe be used with Hadoop? That would be wonderful!

Rohman


Re: Announcing Riak Pipe (BETA)

Sean Cribbs-2
You can already use Riak KV as persistence for Hadoop (although it can be a bit intensive on IO); I know of one production setup already doing this.  Riak Pipe, on the other hand, could be used to build jobs like you would in Hadoop, with greater independence and flexibility than Riak KV's standard MapReduce gives you.  It's still very new, so we're not sure of all the ins and outs of supporting it, but we encourage you to give it a try.

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/


Re: Announcing Riak Pipe (BETA)

Sean Cribbs-2
In reply to this post by Eric Fong
I think you misunderstand Riak Pipe; it's a distributed processing framework, not a security layer.

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.


Re: Announcing Riak Pipe (BETA)

Antonio Rohman Fernandez
In reply to this post by Sean Cribbs-2
> Riak Pipe on the other hand, could be used to build jobs like you would in Hadoop

Can the same job be performed by several servers distributed as a
cluster? Or will only one Pipe server do the job, with Riak KV as the
distributed cluster? The good thing about Hadoop is that you can have a
Hadoop cluster and a Riak cluster, and both work in a distributed
fashion: Hadoop performs the jobs and Riak KV delivers the data. I
wonder how Riak Pipe will work; I hope it will be like Hadoop, so that
the computation of a job can be distributed.

Rohman



Re: Announcing Riak Pipe (BETA)

Sean Cribbs-2

Riak Pipe builds on the abstractions already available in riak_core for distribution, namely vnodes and consistent hashing. The existing MapReduce in Riak KV also uses these abstractions, but Pipe is a lot smarter about how it coordinates and monitors data processing.
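
As a rough illustration of the two abstractions named above, the following Python sketch maps a bucket/key pair onto a fixed ring of partitions by hashing. It is a simplification under stated assumptions, not riak_core's implementation: the `bucket/key` string encoding, the ring size of 64, the round-robin `owner_for` assignment, and both function names are hypothetical choices made for the example.

```python
import hashlib

RING_TOP = 2 ** 160  # SHA-1 digests are 160 bits wide


def partition_for(bucket: str, key: str, ring_size: int = 64) -> int:
    """Hash a bucket/key pair to one of ring_size equal slices of the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")      # point on the ring
    return position // (RING_TOP // ring_size)    # slice containing the point


def owner_for(partition: int, nodes: list) -> str:
    """Assumed round-robin ownership: each partition (vnode) sits on one node."""
    return nodes[partition % len(nodes)]


nodes = ["node_a", "node_b", "node_c"]
p = partition_for("users", "alice")
owner = owner_for(p, nodes)
```

Because the partition depends only on the hash of the input, any node can work out where a given piece of work belongs without asking a central coordinator, which is what lets the work spread across the cluster.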

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/



Re: Announcing Riak Pipe (BETA)

Jonathan Langevin
So to be clear, will Riak Pipe negate the need for someone to use Hadoop in front of Riak KV?


Jonathan Langevin
Systems Administrator

Loom Inc.
Wilmington, NC: (910) 241-0433 - [hidden email] - www.loomlearning.com - Skype: intel352




Re: Announcing Riak Pipe (BETA)

Sean Cribbs-2
It will not negate that need necessarily (Hadoop is a large ecosystem), but it could be an alternative for run-of-the-mill batch mapred jobs.  There's still a long way to go for that to be possible, but we encourage you to push its limits and find the pain points.

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.
