Read Before Writes on Distributed Counters


Read Before Writes on Distributed Counters

wjossey
In the context of using distributed counters (introduced in 1.4), is it strictly necessary to perform a read prior to issuing a write for a given key?  That is, if I want to blindly increment a value by 1, regardless of what its current value is, is it sufficient to issue the write without previously having read the object?

I ask because looking at some of the implementations for counters in the open source community, it's common to perform a read before a write, which lowers the performance ceiling on clusters with high-volume reads and writes.  I want to verify before issuing some PRs that this is in fact safe behavior.

Thank you!
-Wes Jossey
_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Read Before Writes on Distributed Counters

Sam Elliott
It is perfectly safe with Counters to "blindly" issue an update. Clients (for counters) should allow a way to blindly send updates.
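
As a rough illustration of a "blind" update, here is a minimal sketch that builds a single increment request with no preceding read. The URL layout (`/buckets/<bucket>/counters/<key>` with the increment amount as the POST body) is my understanding of the Riak 1.4 HTTP counter interface and should be treated as an assumption to check against your Riak version; the host, bucket, and key names are hypothetical.

```python
from urllib.request import Request


def build_increment_request(host, bucket, key, amount=1):
    """Build (but do not send) a blind counter increment: one POST, no prior GET."""
    # Assumed Riak 1.4 HTTP counter endpoint; verify against your Riak version.
    url = f"http://{host}/buckets/{bucket}/counters/{key}"
    # The request body is just the signed amount to add to the counter.
    return Request(url, data=str(amount).encode("ascii"), method="POST")


# Hypothetical host, bucket, and key, for illustration only.
req = build_increment_request("127.0.0.1:8098", "stats", "page_views", 1)
print(req.get_method(), req.full_url, req.data)
```

The point is only that the client makes one round trip per increment; nothing in the request depends on the counter's current value.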

You should only be aware that your updates are *not* idempotent - if you retry an update to a counter, both updates could be preserved.

Sam
--
Sam Elliott
Engineer
[hidden email]
--



Re: Read Before Writes on Distributed Counters

Russell Brown
In reply to this post by wjossey
Hi Wes,

The client application does not need to perform a read before a write. The Riak server, however, must read from disk before updating the counter, or at least it must with our current implementation.

What PRs did you have in mind? I'm curious.

Oh, it looks like Sam beat me to it. To elaborate on his "not idempotent" line: when Riak tells you "error" for some counter increment, it may only be a partial failure, and re-running the operation may lead to overcounting.
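
To make the overcounting risk concrete, here is a toy simulation (not Riak code). `FlakyCounter` and `increment_with_retry` are hypothetical names; the counter applies the first increment but reports a failure, which stands in for a partial failure where the write landed and the acknowledgement did not.

```python
class FlakyCounter:
    """Toy counter whose first increment is applied but reported as failed."""

    def __init__(self):
        self.value = 0
        self.fail_next_ack = True

    def increment(self, amount):
        self.value += amount  # the write is applied...
        if self.fail_next_ack:
            self.fail_next_ack = False
            # ...but the client only sees an error.
            raise TimeoutError("no acknowledgement")


def increment_with_retry(counter, amount, retries=1):
    """Naive retry loop: resend the same increment after any error."""
    for attempt in range(retries + 1):
        try:
            counter.increment(amount)
            return
        except TimeoutError:
            if attempt == retries:
                raise


counter = FlakyCounter()
increment_with_retry(counter, 1)
print(counter.value)  # 2: the client asked for +1, but both attempts were applied
```

Whether to retry is therefore an application decision: retrying risks counting twice, while not retrying risks losing an increment.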

Cheers

Russell


Re: Read Before Writes on Distributed Counters

wjossey
Great everyone, thank you.  

@Russell:  I specifically work with either Go (https://github.com/tpjg/goriakpbc) or Ruby (basho client).  I haven't tested the Ruby client, but I'd assume it will perform the write without the read (based on my reading of the code).  The Go library, on the other hand, currently always performs a read prior to the write.  It's an easy patch that I've already applied locally for benchmarking; I just didn't want to submit the PR till I was sure this was the correct behavior.

Somewhat off topic, but I don't want to open up another thread if it's unnecessary.  This question arose because I've been doing extensive benchmarking around distributed counters.  Are there pre-existing benchmarks out there that I can measure myself against?  I haven't stumbled across many at this point, probably because of how new the feature is.

Cheers,
Wes



Re: Read Before Writes on Distributed Counters

Russell Brown
I have some from a while back, if I can find my graphs I'll put them up somewhere.

Cheers

Russell


Re: Read Before Writes on Distributed Counters

Daniil Churikov
In reply to this post by Sam Elliott
Correct me if I'm wrong, but when you blindly do an update without a previous read, you create a sibling, which should be resolved on read. If you make a lot of increments to a counter and rarely read it, that will lead to sibling explosion.

I am not familiar with the new counter data types, so I am curious.

Re: Read Before Writes on Distributed Counters

Jeremiah Peschka
When you 'update' a counter, you send in an increment operation. That's added to an internal list in Riak. The operations are then zipped up to provide the correct counter value on read. The worst that you'll do is add a large(ish) number of values to the op list inside Riak. 

Siblings will be created, but they will not be visible to the end user who is reading from the counter.

Check out this demo of the new counter types from Sean Cribbs: https://vimeo.com/43903960

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop



Re: Read Before Writes on Distributed Counters

Sean Cribbs-2
In reply to this post by Daniil Churikov
The reasons counters are interesting are:

1) You send an "increment" or "decrement" operation rather than the new value.
2) Any conflicts that were created by that operation get resolved automatically.

So, no, sibling explosion will not occur.


--
Sean Cribbs <[hidden email]>
Software Engineer
Basho Technologies, Inc.


Re: Read Before Writes on Distributed Counters

Sean Cribbs-2
In reply to this post by Jeremiah Peschka
Since Jeremiah loves it when I'm pedantic, it bears mentioning that the list of operations is rolled up immediately (not kept around), grouped by which partition took the increment. So if I increment by 2 and then by 50, and the increments go to different replicas, my counter will look like [{a, 2}, {b, 50}], for a sum of 52.
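
The roll-up described above can be sketched in a few lines. This is an illustrative model, not Riak's implementation: `ActorCounter` is a hypothetical name, the actors "a" and "b" stand for the partitions that took each increment, and only positive increments are modelled (decrements would need separate bookkeeping).

```python
from collections import defaultdict


class ActorCounter:
    """Keeps one running total per actor instead of a log of operations."""

    def __init__(self):
        self.entries = defaultdict(int)

    def increment(self, actor, amount):
        # Roll the operation into the actor's entry immediately.
        self.entries[actor] += amount

    def value(self):
        # The counter's value is the sum over all actors.
        return sum(self.entries.values())


c = ActorCounter()
c.increment("a", 2)
c.increment("b", 50)
print(sorted(c.entries.items()), c.value())  # [('a', 2), ('b', 50)] 52

# A further increment handled by "a" updates its entry; no new entry is added.
c.increment("a", 3)
print(sorted(c.entries.items()), c.value())  # [('a', 5), ('b', 50)] 55
```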


--
Sean Cribbs <[hidden email]>
Software Engineer
Basho Technologies, Inc.


Re: Read Before Writes on Distributed Counters

Jeremiah Peschka
That's why I linked to the video - it's 60 minutes of Cribbs™ brand pedantry.

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop



Re: Read Before Writes on Distributed Counters

Russell Brown
In reply to this post by Daniil Churikov
Hi Daniil,

On 17 Oct 2013, at 16:55, Daniil Churikov <[hidden email]> wrote:

> Correct me if I wrong, but when you blindly do update without previous read,
> you create a sibling, which should be resolved on read. In case if you make
> a lot of increments for counter and rarely reads it will lead to siblings
> explosion.
>
> I am not familiar with new counters datatypes, so I am curious.

The counters in Riak 1.4 are the first of a few data types we are building. The main change, conceptually, is that Riak knows about the type of the data you're storing in a counter.
Riak already detects conflicting writes (writes that are causally concurrent), but it doesn't know how to merge your data to a single value, so it presents all the conflicting values to the client to resolve. In the case of a counter, however, Riak _does_ know the meaning of your data, and we're using a data type that can automatically merge to a correct value.

There is code running on Riak that will automatically merge counter siblings on write. And if siblings are detected on read, they are merged so that a single value is presented to the client application.
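
One common way a counter data type can merge divergent copies without client help is to take the per-actor maximum, as in a grow-only counter. The sketch below shows that general technique under the assumption that each sibling is a map from actor to running total; it is not a description of Riak's exact merge code, and the sibling values are made up.

```python
def merge(left, right):
    """Merge two counter states by taking the larger total for each actor."""
    actors = set(left) | set(right)
    return {actor: max(left.get(actor, 0), right.get(actor, 0)) for actor in actors}


# Two hypothetical siblings that have each seen increments the other has not.
sibling_1 = {"a": 2, "b": 50}
sibling_2 = {"a": 5, "c": 1}

merged = merge(sibling_1, sibling_2)
print(sorted(merged.items()), sum(merged.values()))  # [('a', 5), ('b', 50), ('c', 1)] 56
```

Because the merge is commutative and idempotent, it does not matter in which order siblings are combined or how often the same sibling is merged again.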

I think Sean Cribbs has replied faster than me this time, and he's hinted at how the data type is implemented.

Cheers

Russell

Re: Read Before Writes on Distributed Counters

Russell Brown
In reply to this post by Jeremiah Peschka

On 17 Oct 2013, at 17:21, Jeremiah Peschka <[hidden email]> wrote:

> When you 'update' a counter, you send in an increment operation. That's added to an internal list in Riak. The operations are then zipped up to provide the correct counter value on read. The worst that you'll do is add a large(ish) number of values to the op list inside Riak.

Just to borrow some Cribbs-brand pedantry here: that isn't true. We read the data from disk, increment an entry in what is essentially a version vector, and write it back (then replicate the result to N-1 vnodes). The size of the counter depends on the number of actors that have incremented it (typically N), not the number of operations.
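
A toy model of that read, increment, write-back, replicate cycle shows why the stored size tracks the number of actors rather than the number of operations. Everything here is hypothetical scaffolding (the `disk` dict, the vnode names, N = 3), and failures or concurrent coordinators are not modelled.

```python
N = 3
vnodes = [f"vnode{i}" for i in range(N)]
disk = {v: {} for v in vnodes}  # each vnode's stored copy of the counter


def increment(coordinator, amount):
    """Read the stored state, bump the coordinator's entry, write back, replicate."""
    state = dict(disk[coordinator])  # read from "disk"
    state[coordinator] = state.get(coordinator, 0) + amount
    for v in vnodes:
        # Write locally and replicate the result to the other N-1 vnodes.
        disk[v] = dict(state)


# 1000 increments, coordinated round-robin by the three vnodes.
for i in range(1000):
    increment(vnodes[i % N], 1)

state = disk["vnode0"]
print(len(state), sum(state.values()))  # 3 1000
```

After a thousand operations the stored structure still has only three entries, one per actor.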

>
> Siblings will be created, but they will not be visible to the end user who is reading from the counter.

There won't be siblings on disk (we do create a temporary one in memory, does that count?) _unless_

1. you also write an object to that same key in the normal Riak KV way (don't do that)
2. AAE or MDC cause a sibling to be created (this is because we use the operation of incrementing a counter to identify a key as a counter; to the rest of Riak it is just a Riak object)

In that last case, an increment operation to the key will resolve the sibling(s).

Cheers

Russell

Re: Read Before Writes on Distributed Counters

wjossey
And, just to close the loop, I went ahead and patched the Go library to support the above functionality.

Thanks for the help everyone.