Using Riak KV with Amazon ML

Using Riak KV with Amazon ML

ricardo.ekm
Hi all,
I've some events stored in Riak KV which I'd like to use as input to Amazon Machine Learning.
What I want to accomplish is to export the JSON entries from Riak, transform them into a CSV file, and upload that file to S3.
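
For the transform step described above, a minimal sketch in Python using only the standard library. It assumes each event is a flat JSON object (no nested fields); the function name `events_to_csv` and the sample field names are hypothetical and not taken from the original post.

```python
import csv
import io
import json


def events_to_csv(json_entries, out):
    """Write an iterable of JSON strings to `out` as CSV with a header row.

    Assumption: every entry is a flat JSON object. Keys missing from an
    entry are written as empty cells.
    """
    rows = [json.loads(entry) for entry in json_entries]
    # Use the union of all keys, sorted so the column order is stable.
    fieldnames = sorted({key for row in rows for key in row})
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample events, for illustration only.
sample = [
    '{"user": "a", "clicks": 3}',
    '{"user": "b", "clicks": 5, "country": "BR"}',
]
buf = io.StringIO()
events_to_csv(sample, buf)
csv_text = buf.getvalue()
```

Once written to a local file, the result could be uploaded with any S3 client, for example the `upload_file` method of a boto3 S3 client. The bucket name and object key would be yours to choose.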

What is the best way to export a full bucket (or a 2i subset) from Riak?

I've read that some improvements were made to Riak in this area in order to build the Apache Spark Connector (https://databricks.com/blog/2016/08/11/the-quest-for-hidden-treasure-an-apache-spark-connector-for-the-riak-nosql-database.html).

Is there a way to take advantage of the new full bucket read and of the distributed export?

Thanks.

--
Ricardo Mayerhofer

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Using Riak KV with Amazon ML

Matt Digan
Hi Ricardo,

Full bucket read is not supported yet by Riak KV. If you're able to upgrade, Riak TS does support that feature.

Otherwise, with Riak KV, you could use a range of 2i indexes. See https://github.com/basho/spark-riak-connector/blob/master/docs/using-connector.md#reading-data-from-kv-bucket for more info.
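
As an illustration of the 2i range approach, here is a rough sketch that goes through Riak's HTTP API rather than the Spark connector linked above, so it is single-process and not a distributed export. It assumes the default HTTP port 8098, a bucket in the default bucket type, and an integer index; the helper names, the bucket name `events`, and the index name `timestamp_int` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def index_range_url(base_url, bucket, index, start, end):
    """Build the URL for a 2i range query: /buckets/B/index/I/START/END."""
    parts = [quote(str(p), safe="") for p in (bucket, index, start, end)]
    return "{}/buckets/{}/index/{}/{}/{}".format(base_url.rstrip("/"), *parts)


def object_url(base_url, bucket, key):
    """Build the URL for fetching a single object by key."""
    return "{}/buckets/{}/keys/{}".format(
        base_url.rstrip("/"), quote(bucket, safe=""), quote(key, safe="")
    )


def export_range(base_url, bucket, index, start, end):
    """Yield the raw value of every object whose index value is in range.

    Assumption: objects have no siblings, so each fetch returns one value.
    """
    with urlopen(index_range_url(base_url, bucket, index, start, end)) as resp:
        keys = json.load(resp)["keys"]
    for key in keys:
        with urlopen(object_url(base_url, bucket, key)) as resp:
            yield resp.read().decode("utf-8")


# Hypothetical usage against a local node (not executed here):
# for value in export_range("http://localhost:8098", "events",
#                           "timestamp_int", 1, 5000):
#     print(value)
```

Fetching one object per request is slow for large ranges, which is where the connector's parallel reads would be the better fit.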

--Matt

In reply to Ricardo Mayerhofer's message of Thu, Sep 1, 2016 at 4:52 PM.

--
Matt Digan
Engineering Director
Basho Technologies