allow_multi VS HTTP Conditional PUT

Eric Moritz
Hi, I just read "Why Vector Clocks are Easy". I am having trouble
seeing the advantage of letting a stale PUT into production and merging
afterwards vs HTTP's Conditional PUT, which never lets a stale PUT
into production.

I feel like the way that HTTP handles optimistic concurrency control
works a little better, but feel free to tell me that I am wrong.

So consider the following flow, using the same actors and actions as
the blog post:

1. Alice creates new dinner plans for Wednesday (If-None-Match: *
only succeeds if no record exists):

curl -i -X PUT -H "X-Riak-ClientId: Alice" -H "If-None-Match: *" \
  -H "Content-Type: text/plain" \
  http://localhost:8098/riak/plans/dinner --data "Wednesday"

2. Ben, Cathy and Dave all fetch the dinner plans:

curl -i http://localhost:8098/riak/plans/dinner
HTTP/1.1 200 OK
Last-Modified: Mon, 03 Jan 2011 05:47:29 GMT
ETag: 4hxy2rXigoJqGuM8ZX44GP

Wednesday

3. Ben, using the ETag from his GET, now PUTs a new date, essentially
saying that he derived his data from what he thinks is the most recent
data in the database.  Since no one has updated plans/dinner since he
fetched the dinner plans, the PUT succeeds.

curl -i -X PUT -H "X-Riak-ClientId: Ben" \
  -H "If-Match: 4hxy2rXigoJqGuM8ZX44GP" \
  -H "Content-Type: text/plain" \
  http://localhost:8098/riak/plans/dinner --data "Tuesday"

4. Dave downloads the latest dinner plans:

curl -i http://localhost:8098/riak/plans/dinner
HTTP/1.1 200 OK
Last-Modified: Mon, 03 Jan 2011 05:49:52 GMT
ETag: 3sJshP0UkUC1pTVyhdP6so
Date: Mon, 03 Jan 2011 05:50:19 GMT
Content-Type: text/plain
Content-Length: 7

Tuesday

5. Cathy took her sweet time to update the dinner plans.  When she
tries to update the database, her update fails because it
is based on stale data:

curl -i -X PUT -H "X-Riak-ClientId: Cathy" \
  -H "If-Match: 4hxy2rXigoJqGuM8ZX44GP" \
  -H "Content-Type: text/plain" \
  http://localhost:8098/riak/plans/dinner --data "Thursday"

HTTP/1.1 412 Precondition Failed

6. Cathy will now have to fetch the latest data and merge what she
wants with what is current:

curl -i http://localhost:8098/riak/plans/dinner
HTTP/1.1 200 OK
Last-Modified: Mon, 03 Jan 2011 05:49:52 GMT
ETag: 3sJshP0UkUC1pTVyhdP6so
Date: Mon, 03 Jan 2011 05:50:19 GMT
Content-Type: text/plain
Content-Length: 7

Tuesday

7. Cathy uses the new ETag and updates the record:

curl -X PUT -H "X-Riak-ClientId: Cathy" \
  -H "If-Match: 3sJshP0UkUC1pTVyhdP6so" \
  -H "Content-Type: text/plain" \
  http://localhost:8098/riak/plans/dinner --data "Thursday"

---

Thanks to webmachine's HTTP completeness, we are fully able to take
advantage of HTTP's Conditional PUTs.  This method has at least one
major advantage: unmerged changes never go into production.  The
disadvantage is that the client has to hang onto the request body being
PUT in case the precondition fails and the user or application needs to
reconcile the conflict.
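
The flow in steps 1 to 7 can be modeled in a few lines of Python. This is
a minimal in-memory sketch, not Riak's or webmachine's implementation: the
Store class, its get/put methods, update_with_retry, and the hash-based
ETag are all hypothetical stand-ins that only mimic the If-None-Match: *
and If-Match behaviour shown above.

```python
import hashlib


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class Store:
    """Hypothetical single-server key/value store with conditional PUT."""

    def __init__(self):
        self._data = {}  # key -> (etag, value)
        self._version = 0

    def get(self, key):
        """Return (etag, value) for key, like a GET that carries an ETag."""
        return self._data[key]

    def put(self, key, value, if_match=None, if_none_match=None):
        current = self._data.get(key)
        # If-None-Match: * succeeds only when no record exists.
        if if_none_match == "*" and current is not None:
            raise PreconditionFailed(key)
        # If-Match succeeds only when the stored ETag is the one the client saw.
        if if_match is not None and (current is None or current[0] != if_match):
            raise PreconditionFailed(key)
        self._version += 1
        etag = hashlib.sha1(f"{self._version}:{value}".encode()).hexdigest()[:12]
        self._data[key] = (etag, value)
        return etag


def update_with_retry(store, key, merge, max_attempts=5):
    """Fetch, merge, and conditionally PUT, refetching after each 412."""
    for _ in range(max_attempts):
        etag, current = store.get(key)
        try:
            return store.put(key, merge(current), if_match=etag)
        except PreconditionFailed:
            continue  # someone else wrote first; refetch and merge again
    raise PreconditionFailed(key)


store = Store()
store.put("plans/dinner", "Wednesday", if_none_match="*")  # step 1: Alice
stale_etag, _ = store.get("plans/dinner")                   # step 2: everyone fetches
store.put("plans/dinner", "Tuesday", if_match=stale_etag)   # step 3: Ben succeeds

try:
    store.put("plans/dinner", "Thursday", if_match=stale_etag)  # step 5: Cathy
    cathy_rejected = False
except PreconditionFailed:
    cathy_rejected = True

# Steps 6 and 7: Cathy refetches, "merges" by insisting on Thursday, and retries.
update_with_retry(store, "plans/dinner", lambda current: "Thursday")
final_value = store.get("plans/dinner")[1]
```

The merge callback is where the client-side cost shows up: the caller has
to keep its intended change around until a conditional PUT finally succeeds.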

Eric.

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: allow_multi VS HTTP Conditional PUT

Justin Sheehy
Hi, Eric.

On Mon, Jan 3, 2011 at 1:09 AM, Eric Moritz <[hidden email]> wrote:

> Hi I just read "Why Vector Clocks are Easy". I am having trouble
> seeing the advantage of letting a stale PUT into production and merge
> afterwards vs HTTP's Conditional PUT, which never let's a stale PUT
> into production.

This is an excellent question, and one that we could discuss for some time.

I am a big fan of HTTP conditional requests, but they are not always
compatible with the other operational needs imposed in the interest of
availability.

The main issue is that Riak's approach is designed for a
highly-available distributed system on the server side, while a
standard HTTP conditional PUT mostly makes sense for single-writer (or
at least single-leader) servers.

Riak is designed to accept requests even when arbitrary nodes are down
or unable to talk to each other.  Achieving that availability goal is
in conflict with the typical expectations around conditional PUT,
which are basically those of an atomic CAS operation.  Since some of the
nodes holding a copy of a given piece of data may be unreachable
during a write request, Riak cannot maintain its intended level of
availability and simultaneously ensure that you are really only
overwriting exactly the version that you specify.
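
A toy model may make the conflict concrete. The sketch below is an
illustration under loud assumptions, not how Riak is implemented: the
Replica class and its conditional_put method are hypothetical, the ETag
strings are made up, and each replica checks If-Match only against its own
local copy. When the two replicas cannot talk to each other, both
conditional writes pass their local check.

```python
class Replica:
    """Hypothetical node holding one copy of a key; checks If-Match locally."""

    def __init__(self, etag, value):
        self.etag = etag
        self.value = value

    def conditional_put(self, if_match, new_etag, new_value):
        """Apply the write only if the local ETag matches; return success."""
        if self.etag != if_match:
            return False  # would be a 412 Precondition Failed
        self.etag, self.value = new_etag, new_value
        return True


# Both replicas start in sync with Alice's plan.
node_a = Replica("etag-1", "Wednesday")
node_b = Replica("etag-1", "Wednesday")

# A network partition: Ben can reach only node_a, Cathy only node_b.
ben_ok = node_a.conditional_put("etag-1", "etag-ben", "Tuesday")
cathy_ok = node_b.conditional_put("etag-1", "etag-cathy", "Thursday")

# Both writes were accepted even though each claimed to replace "etag-1".
diverged = node_a.value != node_b.value
```

In this model, rejecting Cathy's write would require one side of the
partition to refuse writes until it can reach the other, which is exactly
the availability cost described above.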

I hope that this sheds some light on why we have made the choices that
you see in Riak.

-Justin
