Updating data in a production setup


Updating data in a production setup

vijayakumar
Hi,
  As our application moves from one release to the next, we need to alter the existing records in Riak (say, adding a new field to the JSON with a default value). What's the ideal way to handle such schema changes, especially if indexing is required for the new fields? Is it possible to run a MapReduce job over the existing buckets and update the records? I could not find any help links for this kind of migration.
Riak version: 1.1

Thanks and Regards,
Vijayakumar.



_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Updating data in a production setup

Mark Phillips
Hi Vijayakumar, 


On Sat, Apr 14, 2012 at 2:57 AM, vijayakumar <[hidden email]> wrote:


The short answer is "yes, that's possible". That said, at the moment I'm not aware of any existing code/resources that could walk you through it. Anyone have anything they can share?

Keep in mind that running something like this over all your data is going to put a lot of load on your cluster and might lead to some timeouts and interesting debugging. Out of curiosity, how many keys do you need to update? 
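For illustration, one common shape for this kind of migration is to keep the transform itself a small pure function and drive it with a key listing plus read-modify-write from a client, since a Riak MapReduce map phase returns results to the caller rather than writing objects back. A minimal sketch, assuming a hypothetical new field name and default value:

```python
import json

# Hypothetical migration: add a "status" field with a default value
# to every JSON record that does not already have one.
NEW_FIELD = "status"
DEFAULT_VALUE = "active"

def migrate_record(record):
    """Return a copy of a decoded JSON record with the new field defaulted.

    Records that already carry the field are left unchanged, so the
    migration is safe to re-run (idempotent).
    """
    migrated = dict(record)
    migrated.setdefault(NEW_FIELD, DEFAULT_VALUE)
    return migrated

def migrate_raw(raw_json):
    """Apply the same transform to a raw JSON string as fetched from Riak."""
    return json.dumps(migrate_record(json.loads(raw_json)))
```

With the Python client of that era the driving loop might look roughly like: list the bucket's keys, then for each key fetch the object, apply `migrate_record` to its data, and store it back. A full key listing is expensive, so on a production cluster the loop should be throttled (e.g. sleep between batches) rather than run flat out.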

Mark 
 

Re: Updating data in a production setup

vijayakumar
Mark,
     Thanks for your help. The number of keys is in the range of millions, and there are 4 buckets in total.

Regards,
Vijayakumar.

On Tue, Apr 17, 2012 at 9:15 PM, Mark Phillips <[hidden email]> wrote:

Re: Updating data in a production setup

Mark Phillips
On Wed, Apr 18, 2012 at 6:43 AM, vijayakumar <[hidden email]> wrote:

What hardware are you running this on?

Mark 
 

Re: Updating data in a production setup

vijayakumar
Mark,
  We run it on the following EC2 instance types:

Extra Large instance (EC2):

15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.xlarge

Large instance (EC2):
7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.large

Thanks and Regards,
Vijayakumar.


On Wed, Apr 18, 2012 at 10:58 PM, Mark Phillips <[hidden email]> wrote: