list keys with key filters

list keys with key filters

Sam Lang

Hello,

I am trying to list a subset of keys in a bucket using the Python client's key filter functionality.  My objects are all binary data, with a content type of application/binary.  After adding several filters to my query, e.g.:

    filters = key_filter.tokenize("/", 1).eq("foo")
    filters = filters & key_filter.tokenize("/", 2).eq("bar")
    query.add_key_filters(filters)

I'm doing:

    res = query.map("""
        function(v) {
            return [[v.key]];
        }""").run()

    for k in res: print k

This gives the following exception:

"error":"bad_utf8_character_code"

When I use ASCII or JSON objects, I don't get the exception, and the keys are listed properly.  I assume that Riak is trying to parse the data of my binary objects and failing because the data isn't UTF-8 encoded.  Is it possible to do this without writing my own data extractor?  Is there a better way to list a subset of keys?
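
For readers unfamiliar with the filters used above, here is a rough client-side picture of what the tokenize/eq combination selects. This is only a sketch in plain Python 3: Riak evaluates key filters on the server, and its handling of edge cases (empty tokens, an index past the last token) may differ from this emulation. The helper names `tokenize` and `matches` and the sample keys are made up for illustration.

```python
def tokenize(key, sep, n):
    """Return the n-th (1-indexed) token of key split on sep, or None."""
    tokens = key.split(sep)
    return tokens[n - 1] if 1 <= n <= len(tokens) else None


def matches(key):
    # Emulates: tokenize("/", 1).eq("foo") & tokenize("/", 2).eq("bar")
    return tokenize(key, "/", 1) == "foo" and tokenize(key, "/", 2) == "bar"


# Hypothetical keys, not taken from the original post.
keys = ["foo/bar/1", "foo/baz/2", "qux/bar/3", "foo/bar/4"]
selected = [k for k in keys if matches(k)]
```

With these sample keys, only the two keys whose first token is "foo" and whose second token is "bar" are selected.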

Thanks,
-sam
_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: list keys with key filters

bryan-basho
Administrator
On Sat, Apr 21, 2012 at 12:39 PM, Sam Lang <[hidden email]> wrote:
> When I use ASCII or JSON objects, I don't get the exception, and the keys are listed properly.  I assume that Riak is trying to parse the data of my binary objects and failing because the data isn't UTF-8 encoded.  Is it possible to do this without writing my own data extractor?  Is there a better way to list a subset of keys?

Hi, Sam.  Your intuition is correct: the bad_utf8_character_code error
comes from Riak trying to encode each object as JSON for your
JavaScript map phase to process.
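
As an analogy only, the same class of failure can be reproduced in plain Python 3. Riak's own encoder is written in Erlang and is not shown here. JSON strings must be valid Unicode text, so arbitrary binary bytes have no direct JSON representation. The helper name `to_json_value` and the sample byte strings are assumptions made up for this sketch.

```python
import json


def to_json_value(raw: bytes) -> str:
    """Encode an object's raw bytes as a JSON string, assuming UTF-8 text."""
    # Decoding raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return json.dumps(raw.decode("utf-8"))


text_value = b'{"name": "baz"}'      # ASCII/JSON payload: decodes cleanly
binary_value = b"\xff\xfe\x00\x80"   # arbitrary binary: not valid UTF-8

encoded = to_json_value(text_value)

try:
    to_json_value(binary_value)
    binary_ok = True
except UnicodeDecodeError:
    binary_ok = False
```

The text payload encodes without trouble, while the binary payload fails at the decoding step, which mirrors why ASCII and JSON objects work in the map phase but application/binary objects do not.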

If all you want is to get the keys back to your client, and you're
using Riak 1.1 or newer with the latest Python client, then the
simplest workaround is to call query.run() without any phases at all:

    >>> query = client.add("foo")
    >>> query.add_key_filter("ends_with", "z")
    <riak.mapreduce.RiakMapReduce object at 0x1006675d0>
    >>> v = query.run()
    >>> v[0]._key
    u'baz'

Note that you will still end up with an encoding error if your keys
cannot be encoded as JSON.

HTH,
Bryan

P.S. If you're using an older version of Riak and/or the Python
client, you may need to fall back on a single-phase query instead of
the empty query: just one reduce phase, implemented by the Erlang
function riak_kv_mapreduce:reduce_identity.
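
A minimal sketch of the two query shapes discussed above, written as the JSON job body that Riak's HTTP MapReduce interface accepts, as far as I understand that format. It has not been verified against a running node. The function `build_job` and its `legacy` flag are hypothetical names for this example; the bucket name and filter are the ones from the session above.

```python
import json


def build_job(bucket, key_filters, legacy=False):
    """Build a MapReduce job body (as a dict) that returns only matching keys."""
    phases = []
    if legacy:
        # Fallback from the P.S.: a single Erlang reduce phase that
        # passes its inputs through unchanged.
        phases.append({
            "reduce": {
                "language": "erlang",
                "module": "riak_kv_mapreduce",
                "function": "reduce_identity",
            }
        })
    return {
        "inputs": {"bucket": bucket, "key_filters": key_filters},
        "query": phases,
    }


filters = [["ends_with", "z"]]
job = build_job("foo", filters, legacy=True)
body = json.dumps(job)  # string that would be sent as the request body
```

Calling `build_job("foo", filters)` without `legacy=True` yields the phaseless form, with an empty "query" list. Neither form includes a JavaScript map phase, so object values are never encoded as JSON.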

Re: list keys with key filters

Sam Lang

On Apr 23, 2012, at 8:29 AM, Bryan Fink wrote:

> On Sat, Apr 21, 2012 at 12:39 PM, Sam Lang <[hidden email]> wrote:
>> When I use ASCII or JSON objects, I don't get the exception, and the keys are listed properly.  I assume that Riak is trying to parse the data of my binary objects and failing because the data isn't UTF-8 encoded.  Is it possible to do this without writing my own data extractor?  Is there a better way to list a subset of keys?
>
> Hi, Sam.  Your intuition is correct: the bad_utf8_character_code error
> comes from Riak trying to encode each object as JSON for your
> JavaScript map phase to process.
>
> If all you want is to get the keys back to your client, and you're
> using Riak 1.1 or newer with the latest Python client, then the
> simplest workaround is to call query.run() without any phases at all:
>
>     >>> query = client.add("foo")
>     >>> query.add_key_filter("ends_with", "z")
>     <riak.mapreduce.RiakMapReduce object at 0x1006675d0>
>     >>> v = query.run()
>     >>> v[0]._key
>     u'baz'
>
> Note that you will still end up with an encoding error if your keys
> cannot be encoded as JSON.

That worked, thanks Bryan!   Is there any way to avoid sending the objects' values to the client as well?
-sam

Re: list keys with key filters

bryan-basho
Administrator
On Mon, Apr 23, 2012 at 4:18 PM, Sam Lang <[hidden email]> wrote:
> That worked, thanks Bryan!   Is there any way to avoid sending the objects' values to the client as well?

What I pasted shouldn't be sending the values of the objects to the
client.  It should send just the keys back.  I'm not a major user of
the Python client, but it looks like the result is a list of RiakLink
objects, which are just bucket-key-tag holders, not RiakObject
objects, which include values and metadata.

-Bryan
