Linear Relationship vs Correlation


























I am new to machine learning, and I'm trying to cover some of the basics. One of the assumptions of linear regression is a linear relationship.



However, on Reddit I was told today that no machine learning model requires a correlation between any of the predictors and the output.
My question is: is there a difference between correlation and a linear relationship?










      regression correlation














      edited 4 hours ago









      kjetil b halvorsen










      asked 4 hours ago









      Jweir136























          1 Answer















          > One of the assumptions of linear regression is a linear relationship.




          There is a fairly common confusion on this matter that makes the scope of linear regression look narrower than it actually is. In regression analysis, we model the expected value of a response variable $Y_i$ conditional on some regressors $\mathbf{x}_i$. In general, we write the response variable as:



          $$Y_i = \mathbb{E}(Y_i|\mathbf{x}_i) + \varepsilon_i \quad \quad \quad \varepsilon_i \equiv Y_i - \mathbb{E}(Y_i|\mathbf{x}_i),$$



          where the first part is the true regression function and the second part is the error term. (This model form implies that $\mathbb{E}(\varepsilon_i | \mathbf{x}_i) = 0$.) In a linear regression we assume that the true regression function is a linear function of the parameter vector $\boldsymbol{\beta} = (\beta_0,...,\beta_m)$. This gives us the model form:



          $$Y_i = \sum_{k=0}^m \beta_k x_{i,k}^* + \varepsilon_i \quad \quad \quad x_{i,k}^* \equiv f_k(\mathbf{x}_i).$$



          You can see from this model form that we can transform the original regressors $\mathbf{x}_i$ via any transform we want (including a non-linear transform). Hence, the important thing to notice is that linear regression does not necessarily assume linearity with respect to the regressor variables. The "linear" in linear regression comes from the fact that the model is linear with respect to the parameters in the regression function. Nonlinear regression occurs when the regression function has one or more parameters that cannot be linearised.
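To make this concrete, here is a minimal sketch in Python with NumPy. The simulated data, the seed, and the choice of transforms $f_0 = 1$, $f_1 = x$, $f_2 = x^2$ are assumptions made purely for illustration and are not part of the question. The response depends on $x$ only through $x^2$, so the sample correlation between $x$ and $y$ should be close to zero, yet ordinary least squares on the transformed regressors should still fit well, because the model is linear in $\boldsymbol{\beta}$.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Illustrative data: y depends on x only through x**2 over a symmetric range,
# so there is a strong relationship but almost no linear correlation.
x = np.linspace(-3.0, 3.0, 201)
y = 1.0 + 2.0 * x**2 + rng.normal(scale=0.5, size=x.size)

# Transformed regressors x*_{i,k} = f_k(x_i), with f_0 = 1, f_1 = x, f_2 = x**2.
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares: linear in beta even though quadratic in x.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta_hat
r_squared = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
corr_xy = np.corrcoef(x, y)[0, 1]

print("beta_hat:", beta_hat)
print("R^2 of the quadratic fit:", r_squared)
print("sample correlation of x and y:", corr_xy)
```

In this sketch the raw predictor is nearly uncorrelated with the output, but a linear regression on a transformed version of it still captures the relationship, which is consistent with the remark quoted from Reddit.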




          > ...is there a difference between correlation, and a linear relationship?




          Correlation is a measure of the strength of a linear relationship between two variables. It occurs as a special case of linear regression. If we use a simple linear regression model under standard assumptions then we have a single regressor $x_i$, with no transformation of this variable. The simple linear regression model is:



          $$Y_i = \beta_0 + \beta_1 x_{i} + \varepsilon_i \quad \quad \quad \varepsilon_i \sim \text{N}(0, \sigma^2).$$



          If we fit the simple linear regression model using ordinary least squares (OLS) estimation (the standard estimation method) then we get a coefficient of determination that is equal to the square of the sample correlation between the $y_i$ and $x_i$ values. This gives a close connection between simple linear regression and sample correlation analysis.
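The identity between the coefficient of determination and the squared sample correlation can be checked numerically. The following is a small sketch using NumPy; the simulated data, the seed, and the coefficients are arbitrary assumptions chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only

# Illustrative data from a simple linear model with normal errors.
x = rng.normal(size=100)
y = 0.5 + 1.5 * x + rng.normal(size=100)

# Simple linear regression by OLS: design matrix with an intercept and x.
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# Coefficient of determination and sample correlation.
r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean()) ** 2)
r = np.corrcoef(x, y)[0, 1]

print("R^2:", r_squared)
print("r^2:", r**2)
```

The two printed values should agree up to floating-point error. Note that this equality relies on the model containing an intercept and a single untransformed regressor.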
















                answered 2 hours ago









                Ben
