How would a composite variable be strongly correlated with one variable but not the other?
I have two variables x1 and x2 which measure relatively similar things (r ~ 0.6), with x2 slightly larger than x1 on average. I then created a new variable x3 by subtracting the two: x3 = x1 - x2.
However, when I ran the Pearson correlations, x3 is strongly negatively correlated with x2 as expected (r ~ -0.6), but x3 is not very correlated with x1 (r ~ 0.1). How is this possible?
correlation
edited 46 mins ago
Nick Cox
asked 5 hours ago
hlinee
A scatter plot matrix should help.
– Nick Cox
47 mins ago
Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
– sds
16 mins ago
3 Answers
Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $\tfrac{1}{\sqrt{2}} \approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-\tfrac{1}{\sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.
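The construction above can be checked numerically. The following is only a minimal sketch, not part of the original answer: the seed, the sample size, and the helper name `pearson` are arbitrary choices made for illustration, and the sample correlations will only approximate the theoretical values.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # arbitrary seed, used only for reproducibility
n = 20_000              # arbitrary sample size

# Two independent standard normal samples.
eps1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
eps2 = [rng.gauss(0.0, 1.0) for _ in range(n)]

x1 = eps1
x2 = [a + b for a, b in zip(x1, eps2)]  # X2 = X1 + eps2
x3 = [a - b for a, b in zip(x1, x2)]    # X3 = X1 - X2, algebraically -eps2

r12 = pearson(x1, x2)
r23 = pearson(x2, x3)
r13 = pearson(x1, x3)

print(f"r12 = {r12:.3f}, r23 = {r23:.3f}, r13 = {r13:.3f}")
```

With a sample this large, `r12` and `r23` should land close to $\pm\tfrac{1}{\sqrt{2}}$ and `r13` close to 0, up to sampling noise.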
edited 4 hours ago
answered 5 hours ago
Kodiologist
This is by construction of $x_3$. Given that $x_1$ and $x_2$ are closely related in terms of their Pearson correlation, subtracting one from the other removes much of what they share, which reduces the correlation of the difference with $x_1$. The best way to see this is to consider the extreme scenario of perfect correlation with identical values, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$ is a constant and its covariation with $x_1$ is exactly zero (loosely, $r \approx 0$; strictly, the correlation is undefined for a constant).
You can make a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation is reduced. By how much depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$, and on their standard deviations. Other things being equal (for instance, when the two standard deviations are equal), the larger $r_{12}$, the smaller $r_{13}$.
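The formal argument can be made concrete. Writing $\sigma_1, \sigma_2$ for the standard deviations and using $\operatorname{Var}(x_3) = \sigma_1^2 + \sigma_2^2 - 2 r_{12}\sigma_1\sigma_2$, the definition of the Pearson correlation gives $r_{13} = (\sigma_1 - r_{12}\sigma_2)/\sigma_3$ and $r_{23} = (r_{12}\sigma_1 - \sigma_2)/\sigma_3$. The sketch below just evaluates those two expressions; the function name `corr_with_difference` and the input values are hypothetical and chosen only for illustration, not taken from the question's data.

```python
import math


def corr_with_difference(r12, sd1, sd2):
    """Return (r13, r23) for x3 = x1 - x2.

    r12 is the correlation of x1 and x2; sd1 and sd2 are their
    standard deviations. Follows directly from Cov and Var of a difference.
    """
    var3 = sd1 ** 2 + sd2 ** 2 - 2 * r12 * sd1 * sd2
    if var3 <= 0:
        raise ValueError("x3 has zero variance; correlations are undefined")
    sd3 = math.sqrt(var3)
    r13 = (sd1 - r12 * sd2) / sd3  # Cov(x1, x3) / (sd1 * sd3)
    r23 = (r12 * sd1 - sd2) / sd3  # Cov(x2, x3) / (sd2 * sd3)
    return r13, r23


# Hypothetical inputs: equal spreads versus x2 more variable than x1.
equal_sd = corr_with_difference(0.6, 1.0, 1.0)
unequal_sd = corr_with_difference(0.6, 0.7, 1.0)

print("equal SDs:   r13 = %.3f, r23 = %.3f" % equal_sd)
print("unequal SDs: r13 = %.3f, r23 = %.3f" % unequal_sd)
```

With equal standard deviations the two correlations are equal in magnitude and opposite in sign. With the assumed ratio of 0.7 they come out at roughly 0.12 and -0.72, so an asymmetry like the one in the question is consistent with $x_2$ simply being more variable than $x_1$.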
answered 5 hours ago
Gkhan Cebs
By "covariation", do you mean "covariance"?
– Kodiologist
4 hours ago
You can rewrite your equation $x_3=x_1-x_2$ as $x_2=x_1-x_3$. Then, for almost any choice of $x_1$ and $x_3$, $x_2$ will be correlated with both $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated with each other. For instance, if $x_1$ = number of letters in the title of the Best Picture Oscar winner, $x_3$ = number of named hurricanes, and $x_2$ = number of letters in the title of the Best Picture Oscar winner - number of named hurricanes, then you will have $x_3=x_1-x_2$, but that doesn't mean that $x_3$ will be correlated with $x_1$.
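One way to check this claim is with covariances. This is a sketch that assumes $x_1$ and $x_3$ are uncorrelated with nonzero variance (an assumption added here for illustration) and takes the question's definition, so that $x_2 = x_1 - x_3$:

$$
\begin{aligned}
\operatorname{Cov}(x_1, x_2) &= \operatorname{Var}(x_1) - \operatorname{Cov}(x_1, x_3) = \operatorname{Var}(x_1) > 0,\\
\operatorname{Cov}(x_3, x_2) &= \operatorname{Cov}(x_3, x_1) - \operatorname{Var}(x_3) = -\operatorname{Var}(x_3) < 0,\\
\operatorname{Cov}(x_1, x_3) &= 0.
\end{aligned}
$$

So $x_2$ picks up a positive correlation with $x_1$ and a negative one with $x_3$ purely from how it is built, while $x_1$ and $x_3$ stay uncorrelated.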
answered 54 mins ago
Acccumulation