Differences Between h2o.gbm, h2o.xgboost and h2o4gpu.gradient_boosting_regressor























I have a question about the different gradient boosting functions available through H2O in R. To compare their speed, I trained h2o.gbm, h2o.xgboost and h2o4gpu.gradient_boosting_regressor on the same training data with the same parameters. The models are shown below:



model_cpu <- h2o.gbm(x = x_col_names, y = y, training_frame = train, nfolds = 10, ntrees = 100, stopping_metric = "RMSE", max_depth = 20)  # 02:57.36
model_xgb <- h2o.xgboost(x = x_col_names, y = y, training_frame = train, nfolds = 10, ntrees = 100, stopping_metric = "RMSE", max_depth = 20, learn_rate = 0.1)  # 06:31.41
model_gpu <- h2o4gpu.gradient_boosting_regressor(n_estimators = 100, nfolds = 10, stopping_metric = "RMSE", max_depth = 20) %>% fit(x_gpu, y_gpu)  # 02:19.83
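
For reference, run times like these could be collected with base R's system.time. The following is only a minimal sketch: time_elapsed is a hypothetical helper (not part of h2o or h2o4gpu), and Sys.sleep stands in for the actual training call.

```r
# Hypothetical helper: evaluates an expression and returns the elapsed
# wall-clock time in seconds.
time_elapsed <- function(expr) {
  # expr is a lazy promise, so it is evaluated inside system.time
  timing <- system.time(expr)
  unname(timing["elapsed"])
}

# Stand-in workload (assumption); in practice this would be the
# model-training call being measured.
elapsed <- time_elapsed(Sys.sleep(0.2))
cat(sprintf("elapsed: %.2f s\n", elapsed))
```

Measuring every model the same way, on the same machine and with nothing else running, keeps the comparison between the three calls consistent.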


The time after the "#" sign on each line is the run time of that command. h2o4gpu is clearly the fastest of the three. I then trained a more detailed model with h2o4gpu and h2o.gbm only, increasing just the number of trees. The speed of h2o4gpu was impressive: h2o.gbm took approximately 18 minutes, while h2o4gpu finished in three and a half minutes. I then compared the two models on test data, and the result surprised me: there was a noticeable difference between their results.



cor_for_h2o.gbm=0.9294249, rmse_for_h2o.gbm=5.822826, mae_for_h2o.gbm=4.024654
cor_for_h2o4gpu=0.9182083, rmse_for_h2o4gpu=6.249201, mae_for_h2o4gpu=4.288272
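
To make the comparison reproducible, here is a minimal sketch of how these three metrics could be computed in base R. The helper regression_metrics and the toy vectors are assumptions for illustration; the real inputs would be the test-set target and each model's predictions.

```r
# Hypothetical helper (not part of h2o or h2o4gpu): computes correlation,
# RMSE and MAE from numeric vectors of observed and predicted values.
regression_metrics <- function(actual, predicted) {
  stopifnot(is.numeric(actual), is.numeric(predicted),
            length(actual) == length(predicted))
  residuals <- actual - predicted
  c(
    cor  = cor(actual, predicted),   # Pearson correlation
    rmse = sqrt(mean(residuals^2)),  # root mean squared error
    mae  = mean(abs(residuals))      # mean absolute error
  )
}

# Toy vectors standing in for the test-set target and model predictions.
actual    <- c(10, 12, 15, 20, 25)
predicted <- c(11, 11, 16, 18, 27)

metrics <- regression_metrics(actual, predicted)
print(round(metrics, 4))
```

Using one function for both models ensures the metrics are computed identically, so any remaining gap comes from the models themselves.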


As I understand it, the algorithms behind these two models are different even though the parameters are the same. What might be the reason for the difference? Should I continue to use h2o.gbm even though it is slower? Also, why is h2o.xgboost so much slower than the others?



By the way, because of its grid search option, I would prefer h2o.gbm to h2o4gpu even though it is slower. If, on the other hand, you think h2o4gpu is better, can you suggest an option for hyperparameter tuning in h2o4gpu?
































  • Can you let us know how big your training_frame was? Also, as a quick note: are you running h2o4gpu with GPUs or CPUs? The other algorithms can only use CPUs.
    – Lauren
    Nov 29 at 20:42















r machine-learning h2o h2o4gpu
















asked Nov 22 at 15:20









Cyric

105
































