Combine solr's document score with a static, indexed score in solr 7.x























I have people indexed into Solr as structured documents. For simplicity's sake, let's say they have the following schema:



{      
personName: text,
games :[ { gamerScore: int, game: text } ]
}


An example of the above would be



{     
personName: john,
games: [
{ gamerScore: 80, game: Zelda },
{ gamerScore: 20, game: Space Invader },
{ gamerScore: 60, game: Tetris},
]
}


'gamerScore' is a value between 1 and 100 that indicates how good the person is at the specified game.
Relevance matching in Solr is all done through the text field 'game'. However, I want my final result list to combine the relevance to the query, as computed by Solr, with my own gamerScore. Namely, I need to re-rank the results based on the following formula:



personFinalScore = (0.8 * solrScore) + (0.2 * gamerScore)


What I am trying to achieve is a weighted combination of two different scores in Solr. This question was asked a long time ago, and I was wondering if there is something in Solr 7.x that can tackle this.



I can change the schema around if a solution requires it.










      solr lucene information-retrieval














      edited Nov 30 at 14:02

























      asked Nov 22 at 15:27









      J S

























          1 Answer













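          In effect, your formula can be simplified to adding your gamerScore with a weight of 0.25: dividing (0.8 * solrScore) + (0.2 * gamerScore) by 0.8 gives solrScore + (0.25 * gamerScore), which orders the documents identically. The absolute value of the score is irrelevant; what matters is how much the gamerScore field affects the score of the document.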
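A quick way to check that weighting gamerScore by 0.25 preserves the ranking of the original 0.8/0.2 formula is the small sketch below. The relevance scores and player names are invented for illustration; they are not real Solr output.

```python
# Hypothetical (solrScore, gamerScore) pairs; all values are made up for illustration.
docs = {
    "player1": (4.2, 30),
    "player2": (3.9, 50),
    "player3": (5.1, 5),
}


def original(solr_score, gamer_score):
    # The formula from the question.
    return 0.8 * solr_score + 0.2 * gamer_score


def simplified(solr_score, gamer_score):
    # The same formula divided by 0.8; a positive scaling does not change the order.
    return solr_score + 0.25 * gamer_score


def rank(formula):
    # Names sorted by descending combined score.
    return sorted(docs, key=lambda name: formula(*docs[name]), reverse=True)


ranking_original = rank(original)
ranking_simplified = rank(simplified)
print(ranking_original, ranking_simplified)
```

Both rankings come out the same, which is why only the relative weight of gamerScore matters.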



          The dismax-based query parsers (dismax and edismax) support bf:




          The bf parameter specifies functions (with optional boosts) that will
          be used to construct FunctionQueries which will be added to the user’s
          main query as optional clauses that will influence the score.




          Since bf is an additive boost, you can use bf=product(gamerScore,0.25) to give gamerScore the same relative weight it has in your formula (0.2 against 0.8 for the Solr score).
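As a rough sketch of what such a request could look like, the snippet below only builds the query URL and does not contact Solr. The host, the collection name "people", the use of edismax, and qf=game are assumptions for illustration, and it further assumes gamerScore is available as a single-valued numeric field on the document being scored.

```python
from urllib.parse import urlencode

# Assumed endpoint: a local Solr with a hypothetical collection named "people".
SOLR_SELECT_URL = "http://localhost:8983/solr/people/select"


def build_query_url(user_query, boost_weight=0.25):
    """Build a select URL that adds gamerScore * boost_weight to the relevance score."""
    params = {
        "defType": "edismax",                         # bf is a (e)dismax parameter
        "q": user_query,
        "qf": "game",                                 # relevance is matched on the 'game' field
        "bf": f"product(gamerScore,{boost_weight})",  # additive boost function
        "fl": "personName,score",
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_query_url("Zelda")
print(url)
```

Adding debugQuery=true to such a request is a convenient way to inspect how much the boost function contributes to each document's score.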



























          • Thank you, I might be missing something. Here is an example: how would your suggestion work if, hypothetically, we have three players who have all played multiple games, including "Zelda", and they have the following scores for "Zelda": [Player1: 30, Player2: 50, Player3: 5]? If I search for Zelda, the results should come up as P2, P1, P3. I don't see how the function boost can work here, or how it can take only the Zelda score and compare it to the other players' Zelda scores.
            – J S
            Dec 4 at 18:26








          • In your example you don't need any boosting; a simple sort=zelda desc would suffice. You'd index the "zelda" score into a field named zelda in that case, and order by it. Boosting is useful for what you described: moving other results around based on a different factor, not straight-up ordering. To apply it as a boost to achieve your formula, index the score into a field named after the game as well, then apply that field as a boost as described.
            – MatsLindh
            Dec 4 at 21:25












          • Are you saying I should create a dynamic field called "game_*", so that at insertion time it would be "game_zelda" with a static value as above? If so, what would happen at query time, for instance if a person queries for a game that I don't have any player for (for example, game_SLKHF)? Moreover, what happens when the query contains multiple games? I understand sort will not take all the games into account and then sort; rather, it sorts on the first field, then the second, and so forth.
            – J S
            Dec 4 at 22:09












          • Correct. At query time the boost from the field will be applied if the field exists. Boosting changes the score of documents, possibly reordering them, but the selection of documents isn't affected; you'll use fq for that (or the regular query). If the query contains multiple games that you've detected, you add them as separate fields and add a boost for each game.
            – MatsLindh
            Dec 5 at 10:00










          • To add to my previous comment: the best way to actually implement this (through game detection) is to first use the query to search a collection of valid games, then use the list returned from that collection to generate field names (i.e. game_<game_id>, where game_id is unique for that game, e.g. zelda or zelda_2).
            – MatsLindh
            Dec 5 at 13:28
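The per-game field idea from the comments above could be sketched roughly as follows. The helper names, the slug rule for turning a game title into a field name, and the 0.25 weight are illustrative assumptions; the list of detected games is hard-coded here, whereas the comments suggest obtaining it from a separate lookup against a collection of valid games.

```python
import re


def game_field(game_id):
    """Map a game id such as 'Space Invader' to a field name such as 'game_space_invader'."""
    # Hypothetical naming rule: lowercase, with non-alphanumeric runs collapsed to '_'.
    slug = re.sub(r"[^a-z0-9]+", "_", game_id.lower()).strip("_")
    return f"game_{slug}"


def boost_functions(detected_games, weight=0.25):
    """Return one additive boost function per detected game."""
    return [f"product({game_field(game)},{weight})" for game in detected_games]


# Assumed result of the game-detection step for a query mentioning two games.
bf_params = boost_functions(["Zelda", "Space Invader"])
print(bf_params)
```

Each entry would then be sent as a boost function alongside the main query, so a document's score rises only for the games it actually has a score field for.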














































          answered Nov 30 at 22:10









          MatsLindh












