Unit testing, Riak buckets

Toby Corkindale
Hi,
I'd like to hear how other people are approaching the problem of
cleaning Riak buckets up at the end of unit tests for their apps.

The problem I have is that multiple tests may be run at once (by
different developers, different Jenkins jobs, or even just a
parallelised test suite), so I can't really run a blanket delete-all at
the end of the test suite unless I use a randomly-named bucket each
time. Yet if I do that, I'm concerned the test suite may sometimes crash
before reaching the end, and then never delete that randomly-named bucket.


If secondary indexes aren't required, then the easy solution is to use a
randomly-named bucket on a Bitcask backend configured with a fairly
short TTL.


Otherwise, I have wondered about creating buckets with a certain name
format, perhaps "test-XXXXXX-YYYY-MM-DD" (X = random), so that a nightly
cron script can find all buckets timestamped from the previous day or
earlier and remove them. I gather listing all buckets is an expensive
operation, although it would only be running on a testing Riak cluster.


So I wondered how other developers are approaching this issue?


Cheers,
Toby

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Unit testing, Riak buckets

Jeremiah Peschka
For CorrugatedIron's integration tests, we frequently use a GUID as part of the bucket name and then destroy the bucket after tests finish. Since I'm frequently moving between different Riak builds, I destroy my data directories at the filesystem level on a regular basis. 
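
For anyone who wants the same pattern outside .NET, here is a minimal Python sketch of "GUID in the bucket name, destroy it when the test finishes". It is not CorrugatedIron's code. The temporary_bucket helper and the delete_bucket callable are hypothetical; the caller supplies whatever actually removes the data from Riak.

```python
import contextlib
import uuid


@contextlib.contextmanager
def temporary_bucket(delete_bucket, prefix="test"):
    """Yield a unique bucket name and run the cleanup callable afterwards.

    delete_bucket is a hypothetical callable supplied by the caller. It
    receives the bucket name and removes whatever the test stored there.
    """
    name = "{}-{}".format(prefix, uuid.uuid4().hex)
    try:
        yield name
    finally:
        # Runs whether the test body passed or raised. It cannot run if
        # the whole process is killed, so a periodic sweep is still useful.
        delete_bucket(name)
```

Usage would look like `with temporary_bucket(my_cleanup) as bucket: ...`, with the test body inside the block.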

Your idea of using cron jobs to delete yesterday's buckets doesn't sound like a bad idea.

Yes, listing buckets is bad in production. No, this isn't production. Therefore: LIST ALL THE THINGS!

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


In reply to Toby Corkindale's message of Sun, Oct 13, 2013 at 8:27 PM.

Re: Unit testing, Riak buckets

Sean Cribbs-2
Not sure it's the greatest way (sometimes slow), but our integration testing tool riak_test uses git to store a clean "devrel" and resets/cleans the git repository at the beginning of each test. If you're curious, I can point you to the relevant sections of code that do this.
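
As a rough illustration of the idea (not a transcription of what riak_test does), resetting a git-tracked directory to its committed state can be sketched in Python as below. The reset_devrel name and the injectable run parameter are assumptions made for the example; only the two git commands are standard.

```python
import subprocess


def reset_devrel(repo_dir, run=subprocess.check_call):
    """Restore a git-tracked directory to its last committed state."""
    # Discard any modifications to tracked files.
    run(["git", "reset", "--hard", "HEAD"], cwd=repo_dir)
    # Remove untracked files and directories left by the previous test.
    # Files matched by .gitignore are kept unless -x is added.
    run(["git", "clean", "-fd"], cwd=repo_dir)
```

Any nodes running out of that directory would need to be stopped before the reset, since their data files are what gets thrown away.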


In reply to Jeremiah Peschka's message of Sun, Oct 13, 2013 at 8:34 PM.

--
Sean Cribbs <[hidden email]>
Software Engineer
Basho Technologies, Inc.


Re: Unit testing, Riak buckets

Jeremiah Peschka

Go on.

---
sent from a tiny portion of the hive mind...
in this case, a phone

In reply to Sean Cribbs's message of Oct 13, 2013 9:19 PM.

Re: Unit testing, Riak buckets

Sean Cribbs-2
https://github.com/basho/riak_test/blob/master/src/rtdev.erl#L76 is the main function that does the resetting of the git repository (and stops nodes, etc).

https://github.com/basho/riak_test/blob/master/bin/rtdev-setup-releases.sh creates the necessary directory structure and git repo for rtdev.erl from a bunch of prebuilt Riak devrels.

https://github.com/basho/riak_test/blob/master/bin/rtdev-current.sh adds the "current" release to the existing directory structure/repo.

It also bears mentioning that the tests in the official Python client create random bucket and key names, which seems to work pretty well, but it can become troublesome when you have too many buckets and you go to list them or change their properties.


In reply to Jeremiah Peschka's message of Sun, Oct 13, 2013 at 9:19 PM.


--
Sean Cribbs <[hidden email]>
Software Engineer
Basho Technologies, Inc.


Re: Unit testing, Riak buckets

Daniil Churikov
In reply to this post by Toby Corkindale
It depends on the tests, but ours allow us to skip cleaning Riak after a test run entirely. Is that so bad? Just pick a random bucket name and leave the data there. The test environment can be nuked at any time. Also, as time goes on, more data accumulates in Riak, and that can reveal some interesting facts, such as a misconfiguration of the environment or, if a node crashes because the disk is full, how well your cluster tolerates that fault.