Skip to content

Result order is not preserved with records adapters (Mongoid in my case). #99

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dwkoogt opened this issue May 2, 2014 · 6 comments
Closed
Labels

Comments

@dwkoogt
Copy link
Contributor

dwkoogt commented May 2, 2014

Hi.
Let say i have two records and I'm running a query such that the order of the resulting records are reversed. I'm running a simple function score query and I can see that record #2 has higher score than record #1 (in response.response). But when I get the records, the result order is back to natural order. I tried to show the records' scores with each_with_hit function and saw that the scores are switched. I looked into the code and saw two problems. In the adapter code, it's using $in query for array of ids. I believe mongo returns results in natural order unless sort is defined. It will not return the results with the ids' array order. Then in the records class, this db result ( #1, #2) is zipped with elasticsearch result (#2, #1). Hence the scores get switched.

fyi
https://jira.mongodb.org/browse/SERVER-7528

@dwkoogt
Copy link
Contributor Author

dwkoogt commented May 2, 2014

I see that you have to call to_a on the records to have it in the order elasticsearch returned. However, each_with_hit function does not call to_a before it gets zipped. So the problem still exist. And should we have to be calling to_a on records to ensure order every time?

@karmi
Copy link
Contributor

karmi commented May 3, 2014

So, paraphrasing if I got you right:

  • You're wondering whether you have to call to_a on the records to preserve the order returned from Elasticsearch: yes, you have to. Notice that in many cases -- like when you iterate over the objects in a Rails application --, the to_a method is called automatically. Same with any first etc method.
  • You're saying that neither each_with_hit nor map_with_hit will preserve the order. That would be surprising, since I imagine the zip automatically calls to_a, but it might call just each. A good way of veryfying this would be a unit or integration test test.

@dwkoogt
Copy link
Contributor Author

dwkoogt commented May 5, 2014

I've checked in the broken test.
The reason i notice this error in the first place was calling first on records returned wrong result.
So, should I be doing records.to_a.first or should records.first be automatically call to_a?

@karmi
Copy link
Contributor

karmi commented May 6, 2014

Not sure I get the problem. The test in #102 runs fine for me, and when I check it manually, it all sorts out:

> response.records.to_a
  Article Load (0.3ms)  SELECT "articles".* FROM "articles" WHERE "articles"."id" IN (3, 2)
=> [#<Article id: 3, title: "Coding", created_at: "2014-05-06 13:22:07">,
 #<Article id: 2, title: "Testing Coding", created_at: "2014-05-06 13:22:07">]

> response.results.to_a
=> [#<Elasticsearch::Model::Response::Result:0x007fa8ae9a5f48
  @result=
   {"_index"=>"articles",
    "_type"=>"article",
    "_id"=>"3",
    "_score"=>1.0,
    "_source"=>
     {"id"=>3, "title"=>"Coding", "created_at"=>"2014-05-06T13:22:07Z"}}>,
 #<Elasticsearch::Model::Response::Result:0x007fa8ab21b4b0
  @result=
   {"_index"=>"articles",
    "_type"=>"article",
    "_id"=>"2",
    "_score"=>0.625,
    "_source"=>
     {"id"=>2,
      "title"=>"Testing Coding",
      "created_at"=>"2014-05-06T13:22:07Z"}}>]

response.records.each_with_hit { |r, h| puts "#{r.id}, #{h._id}, #{r.title}, #{h._source.title}" }
  Article Load (0.2ms)  SELECT "articles".* FROM "articles" WHERE "articles"."id" IN (3, 2)
3, 3, Coding, Coding
2, 2, Testing Coding, Testing Coding

> response.records.map_with_hit { |r,h| [r.id, h._id] }
  Article Load (0.2ms)  SELECT "articles".* FROM "articles" WHERE "articles"."id" IN (3, 2)
=> [[3, "3"], [2, "2"]]

@karmi karmi added the waiting label May 6, 2014
@karmi
Copy link
Contributor

karmi commented May 6, 2014

So, there's indeed a problem with Mongoid:

> response.results.to_a                                                      
=> [#<Elasticsearch::Model::Response::Result:0x007ff76bd6acd0
  @result=
   {"_index"=>"mongoid_articles",
    "_type"=>"mongoid_article",
    "_id"=>"5368e467616c755eac7b0000",
    "_score"=>1.0,
    "_source"=>{"title"=>"Coding"}}>,
 #<Elasticsearch::Model::Response::Result:0x007ff76bd6a910
  @result=
   {"_index"=>"mongoid_articles",
    "_type"=>"mongoid_article",
    "_id"=>"5368e467616c755eac7a0000",
    "_score"=>0.625,
    "_source"=>{"title"=>"Testing Coding"}}>]

> response.records.to_a                                                         
    MOPED: 127.0.0.1:27017 QUERY        database=mongoid_articles collection=mongoid_articles selector={"_id"=>{"$in"=>[BSON::ObjectId('5368e467616c755eac7b0000'), BSON::ObjectId('5368e467616c755eac7a0000')]}} flags=[] limit=0 skip=0 batch_size=nil fields=nil runtime: 0.7450ms
=> [#<MongoidArticle _id: 5368e467616c755eac7b0000, id: nil, title: "Coding">,
 #<MongoidArticle _id: 5368e467616c755eac7a0000, id: nil, title: "Testing Coding">]

> response.records.each_with_hit { |r, h| puts "#{r.id}, #{h._id}, #{r.title}, #{h._source.title}" };
   MOPED: 127.0.0.1:27017 QUERY        database=mongoid_articles collection=mongoid_articles selector={"_id"=>{"$in"=>[BSON::ObjectId('5368e467616c755eac7b0000'), BSON::ObjectId('5368e467616c755eac7a0000')]}} flags=[] limit=0 skip=0 batch_size=nil fields=nil runtime: 0.8590ms

5368e467616c755eac7a0000, 5368e467616c755eac7b0000, Testing Coding, Coding
5368e467616c755eac7b0000, 5368e467616c755eac7a0000, Coding, Testing Coding

Notice how the results are flipped on the last line. It doesn't matter, though, in what order Mongoid returns the records, since we reorder them in the adapter: https://github.com/elasticsearch/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/adapters/mongoid.rb#L23

For whatever reason, the reordering is not working as expected.

karmi added a commit that referenced this issue May 7, 2014
…rch::Model::Response::Records` call `to_a`

The `zip` method in `records.rb` didn't call `to_a` on the collection,
thus the records were not returned in the same order as Elasticsearch's results.

Related #99
Related #102
@karmi karmi closed this as completed in 104f91f May 7, 2014
@Naomarik
Copy link

This just tripped me up a bit until I found this. Mongoid does not seem to care about the order of the IDs passed into its in query.

Might want to update the documentation here: https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model#search-results-as-database-records to be explicit that you should call to_a on response.records to preserve the sort order when using mongoid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants