Riak 2.1.3 - Multiple indexes created by Solr for the same Riak object

Weixi Yen-2
Sort of a unique case: my app was under heavy stress and one of my Riak nodes got backed up (the other 4 nodes were fine).

I think this caused Riak.update to create an extra index in Solr for the same object when users began running .update on that object.

I have basically 2 questions:

1) Is what I'm describing something that is possible?

2) Is there a way to tell Solr to re-index one single item and get rid of all other indexes of that item?

I'm considering RiakTS to resolve these issues long term, but I have to stick with Solr for at least the next 3 months, so I would appreciate any insight into how to solve this duplicate index problem.

Thanks,

Weixi


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Riak 2.1.3 - Multiple indexes created by Solr for the same Riak object

Fred Dushin
Hi Weixi,

You might have to describe your use case in more detail. Solr indices are independent of Riak objects. They are instead associated with Riak buckets (or bucket types), and an object (key/value) can only be associated with one bucket. Therefore, a Riak object can only be associated with one Solr index. A Solr index can be associated with multiple buckets, so in general the mapping from Riak objects to Solr indices is many-to-one: each object maps to exactly one index.

Is it possible that you changed the index associated with a bucket at some point in the bucket or bucket type lifecycle?
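One way to check which index a bucket currently points at is to read its bucket properties over the Riak HTTP API and look at the `search_index` property. The following is a minimal sketch, not an official client: the base URL (for example `http://localhost:8098`), the bucket type, and the bucket name are placeholders you would supply, and the function names are hypothetical helpers made up for this example.

```python
import json
import urllib.request
from urllib.parse import quote


def props_url(base_url, bucket_type, bucket):
    """Build the bucket-properties URL of the Riak HTTP API."""
    return "{}/types/{}/buckets/{}/props".format(
        base_url.rstrip("/"), quote(bucket_type, safe=""), quote(bucket, safe="")
    )


def search_index_from_props(raw_json):
    """Return the search_index bucket property, or None if it is absent."""
    return json.loads(raw_json).get("props", {}).get("search_index")


def fetch_search_index(base_url, bucket_type, bucket,
                       opener=urllib.request.urlopen):
    """Fetch the bucket properties and extract the associated index name.

    `opener` is injectable so the function can be exercised without a
    running Riak node.
    """
    with opener(props_url(base_url, bucket_type, bucket)) as resp:
        return search_index_from_props(resp.read())
```

Comparing the returned name over time, or across buckets that share a bucket type, would show whether the association was changed at some point.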

-Fred

Re: Riak 2.1.3 - Multiple indexes created by Solr for the same Riak object

Magnus Kessler
In reply to this post by Weixi Yen-2
On 11 September 2016 at 02:27, Weixi Yen <[hidden email]> wrote:
> Sort of a unique case, my app was under heavy stress and one of my riak nodes got backed up (other 4 nodes were fine).
>
> I think this caused Riak.update to create an extra index in Solr for the same object when users began running .update on that object.

Hi Weixi,

Can you please confirm what you mean by "extra index"? Do you mean that an object was indexed more than once and gets counted / returned by Solr queries? If that's the case, can you please let me know how you query Solr?

 

> I have basically 2 questions:
>
> 1) Is what I'm describing something that is possible?

Riak/Yokozuna indexes each replica of a Riak object into Solr. With the default n_val of 3, there will be 3 copies of any given object indexed in Solr. Depending on the version of Riak you are using, it's also possible that siblings of Riak objects get indexed independently. So yes, it is possible to find several additional objects in Solr for each KV object. When querying Solr through Riak/Yokozuna, the internal queries are structured in a way that only one replica is returned. Queries sent directly to the Solr nodes will typically lack these filters and may return more than one copy of an object.
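If Solr has to be queried directly, the replica copies can be collapsed on the client side. The sketch below assumes the index schema still stores the default Yokozuna fields `_yz_rt` (bucket type), `_yz_rb` (bucket) and `_yz_rk` (key), and that `docs` is the list of documents from a Solr JSON response. It is a workaround for illustration, not the filtering Riak applies internally, and `dedupe_by_riak_key` is a hypothetical helper name.

```python
def dedupe_by_riak_key(docs):
    """Keep the first Solr document seen for each Riak object.

    Replicas (and independently indexed siblings) of one object share the
    same bucket type, bucket and key, so that triple identifies the object.
    """
    seen = set()
    unique = []
    for doc in docs:
        ident = (doc.get("_yz_rt"), doc.get("_yz_rb"), doc.get("_yz_rk"))
        if ident in seen:
            continue
        seen.add(ident)
        unique.append(doc)
    return unique


# Made-up sample: three replicas of "alice" and one document for "bob".
sample_docs = [
    {"_yz_rt": "default", "_yz_rb": "users", "_yz_rk": "alice", "_yz_pn": "12"},
    {"_yz_rt": "default", "_yz_rb": "users", "_yz_rk": "alice", "_yz_pn": "13"},
    {"_yz_rt": "default", "_yz_rb": "users", "_yz_rk": "bob", "_yz_pn": "40"},
    {"_yz_rt": "default", "_yz_rb": "users", "_yz_rk": "alice", "_yz_pn": "14"},
]
unique_docs = dedupe_by_riak_key(sample_docs)
```

Querying through Riak's own search endpoint avoids the need for this step, since only one replica is returned there.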
 

> 2) Is there a way to tell Solr to re-index one single item and get rid of all other indexes of that item?

You can perform a GET/PUT cycle through Riak KV on an object. This will result in n_val copies of the object across the Solr instances, which replace the previous versions. It is not possible to have just 1 copy unless the n_val for the object is exactly 1. AFAIK, there have been some fixes to Yokozuna in 2.0.7 and the upcoming 2.2 release that deal better with indexed siblings. Discrepancies between KV objects and their Solr counterparts should be detected and resolved by active anti-entropy (AAE).
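A GET/PUT cycle over the Riak HTTP API could look roughly like the sketch below. The base URL, bucket type, bucket, and key are placeholders, and the helper names are invented for this example. It writes the fetched value back unchanged together with the vector clock from the read. It does not resolve siblings, and it does not carry over custom metadata or secondary-index headers, so treat it as a starting point only.

```python
import urllib.request
from urllib.parse import quote


def object_url(base_url, bucket_type, bucket, key):
    """Build the Riak KV HTTP URL for one object."""
    parts = [quote(p, safe="") for p in (bucket_type, bucket, key)]
    return "{}/types/{}/buckets/{}/keys/{}".format(base_url.rstrip("/"), *parts)


def build_put(url, body, content_type, vclock):
    """Build a PUT request that writes the fetched value back unchanged."""
    headers = {"Content-Type": content_type}
    if vclock:
        # Sending the fetched vector clock back marks the write as a
        # descendant of the value that was read.
        headers["X-Riak-Vclock"] = vclock
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


def reput(base_url, bucket_type, bucket, key, opener=urllib.request.urlopen):
    """GET an object and PUT it back so that it is written and indexed again.

    `opener` is injectable so the flow can be tested without a Riak node.
    """
    url = object_url(base_url, bucket_type, bucket, key)
    with opener(urllib.request.Request(url, method="GET")) as resp:
        body = resp.read()
        content_type = resp.headers.get("Content-Type", "application/octet-stream")
        vclock = resp.headers.get("X-Riak-Vclock")
    with opener(build_put(url, body, content_type, vclock)) as resp:
        return resp.status
```

A call such as `reput("http://localhost:8098", "default", "users", "alice")` would run the cycle against a local node, assuming the HTTP interface listens on that address.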
 


Regards,

Magnus

--
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431

Re: Riak 2.1.3 - Multiple indexes created by Solr for the same Riak object

Weixi Yen-2
On Tue, Sep 13, 2016 at 4:35 AM, Magnus Kessler <[hidden email]> wrote:
> So yes, it is possible to find several additional objects in Solr for each KV object. When querying Solr through Riak/Yokozuna, the internal queries are structured in a way that only one replica is returned. Querying Solr nodes directly will typically lack these filters and may return more than one copy of an object.

Got it, that's what's happening.

> You can perform a GET/PUT cycle through Riak KV on an object.

Perfect, I will do this as a fix when I notice the dupes coming in, thanks!
