Why does my residuals vs. fitted plot look like this? [duplicate]

I'm fitting a Poisson regression with glm in R. After fitting the model I ran the model diagnostics, but the distribution of the residuals looks weird.
[Image: diagnostic plots for the fitted model]

This question already has an answer here:

  • Interpreting plot of residuals vs. fitted values from Poisson regression (3 answers)
  • Parallel straight lines on residual vs fitted plot (1 answer)

generalized-linear-model

asked 3 hours ago by geeh

marked as duplicate by Glen_b 1 hour ago

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

  • I find those plots difficult to use for (non-normal) GLiMs, see: interpretation of plot (glm.model).
    – gung
    1 hour ago

2 Answers

That is not unusual for a Poisson GLM. The Poisson GLM is used when the response variable is a count (a non-negative integer), and the explanatory variables are often continuous. With this type of data and model, it is common to see lines of residual points, each corresponding to one discrete response value observed at varying values of the explanatory variables. In your first residual plot, each line of residuals corresponds to one value of the response variable, and the variation along each line reflects the variation in the continuous explanatory variables. As you can see, the model has fit these lines so that the mean residual is roughly zero. That is exactly what you would expect from a Poisson GLM.
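A minimal sketch of why the lines appear, using simulated data rather than the data in the question (the sample size, coefficients and variable names are made up for illustration). For a fixed observed count, the deviance residual is a smooth, decreasing function of the fitted mean, so each count traces its own curve:

```r
# Simulated data: assumed values, chosen only for illustration
set.seed(1)
n <- 300
x <- runif(n, 0, 2)
y <- rpois(n, lambda = exp(0.2 + 0.5 * x))

fit <- glm(y ~ x, family = poisson)

mu <- fitted(fit)                        # fitted means
r  <- residuals(fit, type = "deviance")  # deviance residuals

# Colour points by the observed count: each count forms its own band
plot(mu, r, col = y + 1, pch = 19,
     xlab = "Fitted values", ylab = "Deviance residuals")
abline(h = 0, lty = 2)

# Within one band (a fixed observed count), the residual decreases
# as the fitted mean increases
band <- y == 1
ord  <- order(mu[band])
all(diff(r[band][ord]) <= 0)
```

With only a handful of distinct counts there are only a handful of bands, which is what produces the striped appearance.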



In this particular case there is no clear evidence of a model departure (though you might want to try some related models as alternatives). For a Poisson GLM with a small number of distinct response values, we do not generally expect the deviance residuals to be normally distributed. From your plots it looks like there are only 8-10 distinct outcomes for the response variable in your data, so the clear lines of residuals and the corresponding "kinky" QQ-plot are to be expected. If you want to test the fit of your model, you could fit a negative binomial GLM to generalise your analysis and see whether there is any over-dispersion.
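One way such a comparison could look, sketched on simulated data. The data-generating values and object names are assumptions for illustration, and glm.nb() comes from the MASS package, which is not mentioned above. The simulated counts are deliberately over-dispersed so that the negative binomial fit converges cleanly:

```r
library(MASS)  # provides glm.nb()

# Simulated over-dispersed counts: assumed values, for illustration only
set.seed(2)
n <- 300
x <- runif(n, 0, 2)
y <- rnbinom(n, size = 2, mu = exp(0.2 + 0.5 * x))

fit.pois <- glm(y ~ x, family = poisson)
fit.nb   <- glm.nb(y ~ x)

# The Poisson model is the limit of the negative binomial as theta -> Inf,
# so a small theta and a clearly lower AIC point towards over-dispersion
fit.nb$theta
AIC(fit.pois, fit.nb)

# Likelihood-ratio statistic; the Poisson model sits on the boundary of the
# parameter space, so the usual chi-square p-value is halved
lr <- as.numeric(2 * (logLik(fit.nb) - logLik(fit.pois)))
p.value <- pchisq(max(lr, 0), df = 1, lower.tail = FALSE) / 2
p.value
```

On data without over-dispersion, glm.nb() tends to emit iteration-limit warnings because the estimated theta drifts towards infinity; that in itself suggests the Poisson model is adequate in this respect.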







answered 2 hours ago by Ben

  • Thanks a lot for your great interpretation. My response variable is "offspring", so there are indeed only 8-10 outcomes. I think there's no over-dispersion in my data because deviance(model)/df.residual(model) = 0.68.
    – geeh
    13 mins ago

  • @geeh: Now I'm feeling very smug at being able to tell how many response outcomes you had. ;) In regard to over-dispersion, that is not something that can be checked by looking at the goodness-of-fit statistics of the Poisson GLM.
    – Ben
    5 mins ago


These standard residual plots can be difficult to make sense of. It might be easier to explore plots of standardized (scaled) residuals simulated from your fitted model. If your glm model object is called mod.glm, then:



    install.packages("DHARMa")  # only needed once
    library(DHARMa)

    # simulate scaled residuals from the fitted model
    res.sim <- simulateResiduals(mod.glm)

    plot(res.sim)  # older DHARMa releases: plotSimulatedResiduals(res.sim)


See http://www.flutterbys.com.au/stats/tut/tut10.6a.html for a detailed example.

For a correctly specified Poisson regression model, you would expect a uniform (flat) distribution of the simulated residuals (not a normal distribution).
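To see where the uniform distribution comes from, here is a rough base-R sketch of the idea behind simulated residuals. It is a simplified illustration, not DHARMa's actual implementation, and the data, model and object names are assumptions. Each observed count is located within the distribution of counts simulated from the fitted model, with random jitter to break ties between integers:

```r
# Simulated data and model: assumed values, for illustration only
set.seed(4)
n <- 500
x <- runif(n, 0, 2)
y <- rpois(n, lambda = exp(0.2 + 0.5 * x))
fit <- glm(y ~ x, family = poisson)

# Draw new responses from the fitted model; one column per simulation
nsim <- 250
sims <- as.matrix(simulate(fit, nsim = nsim))

# Scaled residual: position of each observed count within its simulated
# distribution, with uniform jitter to break ties between integer values
below  <- rowMeans(sims < y)
equal  <- rowMeans(sims == y)
scaled <- below + runif(n) * equal

hist(scaled, breaks = 10, main = "Scaled residuals", xlab = "")
ks.test(scaled, "punif")  # a large p-value is consistent with uniformity
```

If the model is correctly specified, each observed count behaves like one more draw from its simulated distribution, so its jittered rank is roughly uniform on [0, 1].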



The last command above creates a QQ-plot, which detects overall deviations from the expected (uniform) distribution, and a plot of the residuals against the fitted values. As explained at https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html, the latter plot is accompanied by fitted 0.25, 0.5 and 0.75 quantile regression lines; these lines "provide a visual aid in detecting deviations from uniformity in y-direction" (where the y-direction is the vertical axis of the plot, i.e. the scaled residuals). "These lines should be straight, horizontal, and at y-values of 0.25, 0.5 and 0.75. Note, however, that some deviations from this are to be expected by chance, even for a perfect model, especially if the sample size is small."



You could also plot the simulated residuals against each of the predictor variables in your model:

    # recent DHARMa versions; older releases used plotResiduals(YOURPREDICTOR, res.sim$scaledResiduals)
    plotResiduals(res.sim, form = YOURPREDICTOR)

In these plots of residuals against each predictor, you would expect uniformity in the y-direction (the residual axis) if the Poisson regression model is correctly specified.



You can also test for overdispersion, zero-inflation, etc., using functions available in the DHARMa package.
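A minimal sketch of those checks, assuming a recent DHARMa version. The model is called mod.glm as above, but here it is fitted to simulated data purely for illustration, so the data and coefficients are made up. testDispersion() and testZeroInflation() are the DHARMa functions for these two checks:

```r
library(DHARMa)

# Simulated Poisson data standing in for the real data (assumption)
set.seed(3)
dat <- data.frame(x = runif(200, 0, 2))
dat$y <- rpois(200, lambda = exp(0.2 + 0.5 * dat$x))
mod.glm <- glm(y ~ x, family = poisson, data = dat)

res.sim <- simulateResiduals(mod.glm, n = 250)

testDispersion(res.sim)     # simulation-based test for over/under-dispersion
testZeroInflation(res.sim)  # compares observed and simulated numbers of zeros
```

Both functions print a test result and draw a histogram of the simulated statistic with the observed value marked, so a value far out in the tail is the thing to look for.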







answered 1 hour ago by Isabella Ghement