ELI5: The Logic Behind Coefficient Estimation in OLS Regression























Like a lot of people, I understand how to run a linear regression, I understand how to interpret its output, and I understand its limitations.



My understanding of the mathematical underpinnings of linear regression, however, is less developed. In particular, I do not understand the logic behind how we estimate beta using the following formula:



$$ \beta = (X'X)^{-1}X'Y $$



Would anyone care to offer an intuitive explanation as to why/how this process works? For example, what function each step in the equation performs and why it is necessary.




























  • 3




    How many five-year-olds have learned anything about algebra, let alone matrices? I don't think it's a feasible request. Better to be clear about what kind/level of explanation you realistically seek. It would also help to clarify what it is you seek (that's not especially clear): are you asking for an outline of how the formula is derived, or for why a formula something like that makes sense?
    – Glen_b
    5 hours ago

















regression theory
















asked 6 hours ago









Jack Bailey

414





1 Answer













Suppose you have a model of the form:
$$X\beta = Y$$
where $X$ is an ordinary 2-D matrix, for ease of visualisation.
Now, if the matrix $X$ is square and invertible, then getting $\beta$ is trivial:
$$\beta = X^{-1}Y$$
And that would be the end of it.



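If this is not the case, to get $\beta$ you’ll have to find a way to “approximate” the result of an inverse matrix. In regression, $X$ typically has more rows (observations) than columns (predictors), so it has no inverse and $X\beta = Y$ usually has no exact solution. $X^\dagger = (X'X)^{-1}X'$ is called the (left) pseudoinverse, and it has some nice properties that make it useful for this application. It exists whenever $X'X$ is invertible, that is, whenever the columns of $X$ are linearly independent.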
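To see where the formula comes from, here is a sketch of the usual least-squares argument. It assumes something not stated above: that we pick $\beta$ to minimise the squared length of the residual $Y - X\beta$, written $S(\beta)$ here, and that $X'X$ is invertible.

$$
\begin{aligned}
S(\beta) &= (Y - X\beta)'(Y - X\beta) \\
\frac{\partial S}{\partial \beta} &= -2X'(Y - X\beta) = 0 \\
X'X\beta &= X'Y \\
\beta &= (X'X)^{-1}X'Y
\end{aligned}
$$

The third line is often called the normal equations; pre-multiplying both sides by $(X'X)^{-1}$ gives the formula in the question.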



In particular, it is unique, and $XX^\dagger X = X$, so it kind of works like an inverse matrix would ($XX^{-1}X = XI = X$). Also, for an invertible and square matrix (i.e. if the inverse matrix exists), it is equal to $X^{-1}$.
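These two properties can be checked numerically on small examples. The following is a minimal sketch in plain Python, not a production implementation: the helper names (`transpose`, `matmul`, `inverse_2x2`, `left_pseudoinverse`) and the example matrices are made up for illustration, and it only handles matrices with two columns so that $X'X$ is $2 \times 2$. Exact fractions are used so the equalities hold without rounding error.

```python
from fractions import Fraction


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix using the closed-form formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def left_pseudoinverse(X):
    """Compute (X'X)^{-1} X' for a matrix X with exactly two columns."""
    Xt = transpose(X)
    return matmul(inverse_2x2(matmul(Xt, X)), Xt)


def to_fractions(A):
    """Convert every entry to an exact Fraction."""
    return [[Fraction(v) for v in row] for row in A]


# A tall 3 x 2 matrix: it has no ordinary inverse, but X X^dagger X = X holds.
tall = to_fractions([[1, 0], [1, 1], [1, 2]])
tall_pinv = left_pseudoinverse(tall)
recovers_tall = matmul(matmul(tall, tall_pinv), tall) == tall

# A square invertible matrix: the pseudoinverse equals the ordinary inverse.
square = to_fractions([[2, 1], [1, 1]])
matches_inverse = left_pseudoinverse(square) == inverse_2x2(square)

print(recovers_tall, matches_inverse)
```

Both checks should print `True`.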



Also it gets the shape of the matrix right: if $X$ has order $n \times m$, our pseudoinverse should be $m \times n$ so we can multiply it with $Y$. This is achieved by multiplying $(X'X)^{-1}$, which is square ($m \times m$), with $X'$ ($m \times n$).
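As a worked illustration of the shapes, here is a small sketch that fits a straight line to four made-up data points with $n = 4$ and $m = 2$ (an intercept column and one predictor). The data, the helper functions and the variable names are all assumptions chosen for the example. It uses only the Python standard library and a closed-form $2 \times 2$ inverse, so it does not generalise beyond two columns.

```python
from fractions import Fraction


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix using the closed-form formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def shape(A):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(A), len(A[0]))


# Hypothetical data: four observations of a predictor x and a response y.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 8]

# X is n x m = 4 x 2: a column of ones (intercept) and the predictor.
X = [[Fraction(1), Fraction(x)] for x in xs]
# Y is n x 1 = 4 x 1.
Y = [[Fraction(y)] for y in ys]

Xt = transpose(X)                      # m x n = 2 x 4
XtX_inv = inverse_2x2(matmul(Xt, X))   # m x m = 2 x 2
X_dagger = matmul(XtX_inv, Xt)         # m x n = 2 x 4
beta = matmul(X_dagger, Y)             # m x 1 = 2 x 1

intercept, slope = beta[0][0], beta[1][0]
print(shape(XtX_inv), shape(Xt), shape(X_dagger), shape(beta))
print(float(intercept), float(slope))
```

For this data the shapes print as `(2, 2) (2, 4) (2, 4) (2, 1)` and the fitted line has intercept 0.8 and slope 2.3. In practice you would use a numerical library's least-squares routine rather than inverting $X'X$ by hand.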
























  • Thanks for your time. This was a great explanation and really useful.
    – Jack Bailey
    2 hours ago



















edited 5 hours ago






























answered 5 hours ago









Purple Rover

735



