ELI5: The Logic Behind Coefficient Estimation in OLS Regression
Like a lot of people, I understand how to run a linear regression, I understand how to interpret its output, and I understand its limitations.
My understanding of the mathematical underpinnings of linear regression, however, is less developed. In particular, I do not understand the logic behind how we estimate beta using the following formula:
$$ \beta = (X'X)^{-1}X'Y $$
Would anyone care to offer an intuitive explanation as to why/how this process works? For example, what function each step in the equation performs and why it is necessary.
regression theory
asked 6 hours ago
Jack Bailey
How many five year olds have learned anything about algebra, let alone matrices? I don't think it's a feasible request. Better to be clear about what kind/level of explanation you realistically seek. It would also help to clarify what it is you seek (that's not especially clear); are you asking for some outline explanation of how the formula is derived, or why a formula something like that makes sense?
– Glen_b♦
5 hours ago
1 Answer
Suppose you have a model of the form:
$$X\beta = Y$$
where $X$ is a normal 2-D matrix, for ease of visualisation.
Now, if the matrix $X$ is square and invertible, then getting $\beta$ is trivial:
$$\beta = X^{-1}Y$$
And that would be the end of it.
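If this is not the case (typically because $X$ has more rows than columns, so the system has no exact solution), to get $\beta$ you'll need something that plays the role of an inverse matrix. Provided $X'X$ is invertible, which holds when the columns of $X$ are linearly independent, $X^\dagger = (X'X)^{-1}X'$ is called the left pseudoinverse, and it has some nice properties that make it useful for this application. The resulting $\beta = X^\dagger Y$ is the least-squares solution: the choice of $\beta$ that makes $X\beta$ as close as possible to $Y$ in squared error.
In particular, it is unique, and $XX^\dagger X = X$, so it works much like an inverse matrix would ($XX^{-1}X = XI = X$). Also, for a square and invertible matrix (i.e. if the inverse matrix exists), it is equal to $X^{-1}$.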
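If you want to convince yourself of these properties, here is a minimal sketch that checks them on a small made-up matrix. The helper names (`transpose`, `matmul`, `inverse`, `left_pinv`) and the numbers in `X` and `S` are my own choices for illustration, and exact fractions are used so the equalities hold without rounding error. It assumes the columns of `X` are linearly independent.

```python
from fractions import Fraction


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination, exactly."""
    n = len(a)
    # Augment the matrix with the identity: [A | I].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right-hand half of [I | A^-1] is the inverse.
    return [row[n:] for row in aug]


def left_pinv(x):
    """Left pseudoinverse (X'X)^{-1} X'."""
    xt = transpose(x)
    return matmul(inverse(matmul(xt, x)), xt)


# A tall (4 x 2) matrix: illustrative values only.
X = [[1, 1], [1, 2], [1, 3], [1, 4]]
X_dagger = left_pinv(X)

# Property 1: X X^dagger X = X.
assert matmul(matmul(X, X_dagger), X) == X

# Property 2: X^dagger X is the identity, which is why it is a *left* inverse.
assert matmul(X_dagger, X) == [[1, 0], [0, 1]]

# Property 3: for a square, invertible matrix it coincides with the inverse.
S = [[2, 1], [1, 3]]
assert left_pinv(S) == inverse(S)
```

Note that $XX^\dagger$ (a $4 \times 4$ matrix here) is not the identity in general, which is the sense in which this is only a one-sided inverse.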
Also it gets the shape of the matrix right: if $X$ has order $n \times m$, our pseudoinverse should be $m \times n$ so we can multiply it with $Y$. This is achieved by multiplying $(X'X)^{-1}$, which is square ($m \times m$), with $X'$ ($m \times n$).
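To make the shapes concrete, here is a rough sketch that fits a straight line (intercept plus slope, so $m = 2$) to $n = 4$ made-up points and tracks the order of each matrix along the way. The data and the helper names (`shape`, `inverse_2x2`, and so on) are assumptions of mine, not anything from a particular library. The last check shows that the residuals are orthogonal to the columns of $X$, i.e. $X'(Y - X\beta) = 0$, which is the condition for the squared error to be at its minimum.

```python
from fractions import Fraction


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def shape(a):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(a), len(a[0]))


def inverse_2x2(mat):
    """Exact inverse of a 2 x 2 matrix via the closed-form formula."""
    (a, b), (c, d) = mat
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative data: a column of ones (intercept) and x = 0, 1, 2, 3.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]   # n x m = 4 x 2
Y = [[1], [3], [4], [6]]               # n x 1 = 4 x 1

Xt = transpose(X)                      # m x n = 2 x 4
XtX_inv = inverse_2x2(matmul(Xt, X))   # m x m = 2 x 2
X_dagger = matmul(XtX_inv, Xt)         # (m x m)(m x n) = m x n = 2 x 4
beta = matmul(X_dagger, Y)             # (m x n)(n x 1) = m x 1 = 2 x 1

assert shape(Xt) == (2, 4)
assert shape(XtX_inv) == (2, 2)
assert shape(X_dagger) == (2, 4)
assert shape(beta) == (2, 1)

# Intercept 11/10 and slope 8/5 for this data.
assert beta == [[Fraction(11, 10)], [Fraction(8, 5)]]

# Residuals Y - X beta are orthogonal to every column of X.
fitted = matmul(X, beta)
residuals = [[y[0] - f[0]] for y, f in zip(Y, fitted)]
assert matmul(Xt, residuals) == [[0], [0]]
```

In practice you would not form $(X'X)^{-1}$ explicitly for real data; numerical libraries solve the least-squares problem with more stable factorisations, but the result is the same $\beta$ when $X$ has full column rank.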
edited 5 hours ago
answered 5 hours ago
Purple Rover
Thanks for your time. This was a great explanation and really useful.
– Jack Bailey
2 hours ago