How would a composite variable be strongly correlated with one variable but not the other?











up vote
1
down vote

favorite












I have two variables x1 and x2 which measure relatively similar things (r ~ 0.6), with x2 slightly larger than x1 on average. I then created a new variable x3 by subtracting the two: x3 = x1 - x2.



However, when I ran the Pearson correlations, x3 is strongly negatively correlated with x2 as expected (r ~ -0.6), but x3 is not very correlated with x1 (r ~ 0.1). How is this possible?










share|cite|improve this question




















  • 1




    A scatter plot matrix should help.
    – Nick Cox
    47 mins ago






  • 1




    Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
    – sds
    16 mins ago















up vote
1
down vote

favorite












I have two variables x1 and x2 which measure relatively similar things (r ~ 0.6), with x2 slightly larger than x1 on average. I then created a new variable x3 by subtracting the two: x3 = x1 - x2.



However, when I ran the Pearson correlations, x3 is strongly negatively correlated with x2 as expected (r ~ -0.6), but x3 is not very correlated with x1 (r ~ 0.1). How is this possible?










share|cite|improve this question




















  • 1




    A scatter plot matrix should help.
    – Nick Cox
    47 mins ago






  • 1




    Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
    – sds
    16 mins ago













up vote
1
down vote

favorite









up vote
1
down vote

favorite











I have two variables x1 and x2 which measure relatively similar things (r ~ 0.6), with x2 slightly larger than x1 on average. I then created a new variable x3 by subtracting the two: x3 = x1 - x2.



However, when I ran the Pearson correlations, x3 is strongly negatively correlated with x2 as expected (r ~ -0.6), but x3 is not very correlated with x1 (r ~ 0.1). How is this possible?










share|cite|improve this question















I have two variables x1 and x2 which measure relatively similar things (r ~ 0.6), with x2 slightly larger than x1 on average. I then created a new variable x3 by subtracting the two: x3 = x1 - x2.



However, when I ran the Pearson correlations, x3 is strongly negatively correlated with x2 as expected (r ~ -0.6), but x3 is not very correlated with x1 (r ~ 0.1). How is this possible?







correlation






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited 46 mins ago









Nick Cox

37.9k480127




37.9k480127










asked 5 hours ago









hlinee

336




336








  • 1




    A scatter plot matrix should help.
    – Nick Cox
    47 mins ago






  • 1




    Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
    – sds
    16 mins ago














  • 1




    A scatter plot matrix should help.
    – Nick Cox
    47 mins ago






  • 1




    Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
    – sds
    16 mins ago








1




1




A scatter plot matrix should help.
– Nick Cox
47 mins ago




A scatter plot matrix should help.
– Nick Cox
47 mins ago




1




1




Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
– sds
16 mins ago




Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
– sds
16 mins ago










3 Answers
3






active

oldest

votes

















up vote
8
down vote













Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $tfrac{1}{sqrt{2}} approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-tfrac{1}{sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.






share|cite|improve this answer






























    up vote
    1
    down vote













    This is by construction of $x_3$. Given that $x_2$ and $x_1$ are closely related - in terms of their Pearson correlation if you subtract one from the other, you reduce correlation. The best way to see that is to consider the extreme scenario of complete correlation, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$, which is fully deterministic, i.e., $rapprox 0$.



    You can do a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation will be reduced. By how much, depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$ and their standard deviations. Everything being equal, the larger $r_{12}$, the smaller $r_{13}$.






    share|cite|improve this answer








    New contributor




    Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.


















    • By "covariation", do you mean "covariance"?
      – Kodiologist
      4 hours ago


















    up vote
    0
    down vote













    You can rewrite your equation $x_3=x_2-x_1$ as $x_2=x_3-x_1$. Then regardless of what you pick as $x_1$ and $x_3$, you will have that $x_2$ is correlated to $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated to each other. For instance, if $x_1$= number of letters in title of Best Picture Oscar winner, $x_3$= number of named hurricanes, $x_2$= number of named hurricanes - number of letters in title of Best Picture Oscar winner, then you will have that $x_3=x_2-x_1$, but that doesn't mean that $x_3$ will be correlated with $x_1$.






    share|cite|improve this answer





















      Your Answer





      StackExchange.ifUsing("editor", function () {
      return StackExchange.using("mathjaxEditing", function () {
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
      });
      });
      }, "mathjax-editing");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "65"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f381477%2fhow-would-a-composite-variable-be-strongly-correlated-with-one-variable-but-not%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      8
      down vote













      Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $tfrac{1}{sqrt{2}} approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-tfrac{1}{sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.






      share|cite|improve this answer



























        up vote
        8
        down vote













        Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $tfrac{1}{sqrt{2}} approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-tfrac{1}{sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.






        share|cite|improve this answer

























          up vote
          8
          down vote










          up vote
          8
          down vote









          Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $tfrac{1}{sqrt{2}} approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-tfrac{1}{sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.






          share|cite|improve this answer














          Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $tfrac{1}{sqrt{2}} approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-tfrac{1}{sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited 4 hours ago

























          answered 5 hours ago









          Kodiologist

          16.4k22952




          16.4k22952
























              up vote
              1
              down vote













              This is by construction of $x_3$. Given that $x_2$ and $x_1$ are closely related - in terms of their Pearson correlation if you subtract one from the other, you reduce correlation. The best way to see that is to consider the extreme scenario of complete correlation, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$, which is fully deterministic, i.e., $rapprox 0$.



              You can do a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation will be reduced. By how much, depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$ and their standard deviations. Everything being equal, the larger $r_{12}$, the smaller $r_{13}$.






              share|cite|improve this answer








              New contributor




              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.


















              • By "covariation", do you mean "covariance"?
                – Kodiologist
                4 hours ago















              up vote
              1
              down vote













              This is by construction of $x_3$. Given that $x_2$ and $x_1$ are closely related - in terms of their Pearson correlation if you subtract one from the other, you reduce correlation. The best way to see that is to consider the extreme scenario of complete correlation, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$, which is fully deterministic, i.e., $rapprox 0$.



              You can do a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation will be reduced. By how much, depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$ and their standard deviations. Everything being equal, the larger $r_{12}$, the smaller $r_{13}$.






              share|cite|improve this answer








              New contributor




              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.


















              • By "covariation", do you mean "covariance"?
                – Kodiologist
                4 hours ago













              up vote
              1
              down vote










              up vote
              1
              down vote









              This is by construction of $x_3$. Given that $x_2$ and $x_1$ are closely related - in terms of their Pearson correlation if you subtract one from the other, you reduce correlation. The best way to see that is to consider the extreme scenario of complete correlation, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$, which is fully deterministic, i.e., $rapprox 0$.



              You can do a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation will be reduced. By how much, depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$ and their standard deviations. Everything being equal, the larger $r_{12}$, the smaller $r_{13}$.






              share|cite|improve this answer








              New contributor




              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              This is by construction of $x_3$. Given that $x_2$ and $x_1$ are closely related - in terms of their Pearson correlation if you subtract one from the other, you reduce correlation. The best way to see that is to consider the extreme scenario of complete correlation, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$, which is fully deterministic, i.e., $rapprox 0$.



              You can do a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation will be reduced. By how much, depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$ and their standard deviations. Everything being equal, the larger $r_{12}$, the smaller $r_{13}$.







              share|cite|improve this answer








              New contributor




              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              share|cite|improve this answer



              share|cite|improve this answer






              New contributor




              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              answered 5 hours ago









              Gkhan Cebs

              211




              211




              New contributor




              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.





              New contributor





              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              Gkhan Cebs is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.












              • By "covariation", do you mean "covariance"?
                – Kodiologist
                4 hours ago


















              • By "covariation", do you mean "covariance"?
                – Kodiologist
                4 hours ago
















              By "covariation", do you mean "covariance"?
              – Kodiologist
              4 hours ago




              By "covariation", do you mean "covariance"?
              – Kodiologist
              4 hours ago










              up vote
              0
              down vote













              You can rewrite your equation $x_3=x_2-x_1$ as $x_2=x_3-x_1$. Then regardless of what you pick as $x_1$ and $x_3$, you will have that $x_2$ is correlated to $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated to each other. For instance, if $x_1$= number of letters in title of Best Picture Oscar winner, $x_3$= number of named hurricanes, $x_2$= number of named hurricanes - number of letters in title of Best Picture Oscar winner, then you will have that $x_3=x_2-x_1$, but that doesn't mean that $x_3$ will be correlated with $x_1$.






              share|cite|improve this answer

























                up vote
                0
                down vote













                You can rewrite your equation $x_3=x_2-x_1$ as $x_2=x_3-x_1$. Then regardless of what you pick as $x_1$ and $x_3$, you will have that $x_2$ is correlated to $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated to each other. For instance, if $x_1$= number of letters in title of Best Picture Oscar winner, $x_3$= number of named hurricanes, $x_2$= number of named hurricanes - number of letters in title of Best Picture Oscar winner, then you will have that $x_3=x_2-x_1$, but that doesn't mean that $x_3$ will be correlated with $x_1$.






                share|cite|improve this answer























                  up vote
                  0
                  down vote










                  up vote
                  0
                  down vote









                  You can rewrite your equation $x_3=x_2-x_1$ as $x_2=x_3-x_1$. Then regardless of what you pick as $x_1$ and $x_3$, you will have that $x_2$ is correlated to $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated to each other. For instance, if $x_1$= number of letters in title of Best Picture Oscar winner, $x_3$= number of named hurricanes, $x_2$= number of named hurricanes - number of letters in title of Best Picture Oscar winner, then you will have that $x_3=x_2-x_1$, but that doesn't mean that $x_3$ will be correlated with $x_1$.






                  share|cite|improve this answer












                  You can rewrite your equation $x_3=x_2-x_1$ as $x_2=x_3-x_1$. Then regardless of what you pick as $x_1$ and $x_3$, you will have that $x_2$ is correlated to $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated to each other. For instance, if $x_1$= number of letters in title of Best Picture Oscar winner, $x_3$= number of named hurricanes, $x_2$= number of named hurricanes - number of letters in title of Best Picture Oscar winner, then you will have that $x_3=x_2-x_1$, but that doesn't mean that $x_3$ will be correlated with $x_1$.







                  share|cite|improve this answer












                  share|cite|improve this answer



                  share|cite|improve this answer










                  answered 54 mins ago









                  Acccumulation

                  1,51826




                  1,51826






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Cross Validated!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      Use MathJax to format equations. MathJax reference.


                      To learn more, see our tips on writing great answers.





                      Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                      Please pay close attention to the following guidance:


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f381477%2fhow-would-a-composite-variable-be-strongly-correlated-with-one-variable-but-not%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      What visual should I use to simply compare current year value vs last year in Power BI desktop

                      How to ignore python UserWarning in pytest?

                      Alexandru Averescu