nginx-lego and autoscaler don't play well after scaling down

I'm having trouble with nginx-lego (I know it's deprecated) and the node autoscaler. I had to scale up manually by temporarily patching the HPA's minReplicas to a high number. Everything scaled up fine, and new nodes were added to fit the extra pods.

After the traffic spike, I set minReplicas back to its normal (very low) value, and now I see a lot of 502 Bad Gateway errors. The nginx-lego pod's log shows that many requests are being sent to pods that no longer exist (Connection refused or No route to host):

2018/11/21 17:48:49 [error] 5546#5546: *6908265 connect() failed (113: No route to host) while connecting to upstream, client: 100.112.130.0, server: xxxx.com, request: "GET /public/images/social-instagram.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-instagram.png", host: "xxxx.com", referrer: "https://outlook.live.com/"

2018/11/21 17:48:49 [error] 5409#5409: *6908419 connect() failed (113: No route to host) while connecting to upstream, client: 10.5.143.204, server: xxxx.com, request: "GET /public/images/social-instagram.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-instagram.png", host: "xxxx.com"

2018/11/21 17:48:49 [error] 5546#5546: *6908420 connect() failed (111: Connection refused) while connecting to upstream, client: 10.5.143.204, server: xxxx.com, request: "GET /public/images/social-facebook.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-facebook.png", host: "xxxx.com"
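
To gauge how widespread the failures are, log lines like the ones above can be tallied by failure type. This is a minimal sketch, assuming the controller's log has been saved to a local file; the helper name count_upstream_errors and the file name nginx-error.log are hypothetical.

```shell
# Hypothetical helper: count upstream connect() failures per error type.
# $1 is an nginx error log saved to a local file, for example via
#   kubectl logs <nginx-lego-pod> > nginx-error.log
count_upstream_errors() {
  # Pull out just the "connect() failed (NNN: reason)" part of each line,
  # then count identical occurrences, most frequent first.
  grep -oE 'connect\(\) failed \([0-9]+: [^)]*\)' "$1" | sort | uniq -c | sort -rn
}
```

Called as `count_upstream_errors nginx-error.log`, it prints one line per distinct failure with its count, which makes it easy to see whether errors stop after the controller pods are restarted.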

Any idea what could be wrong?

I guess patching minReplicas probably isn't the best way to do this, but I knew a spike was coming and didn't have a better idea for pre-scaling the whole cluster.
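
For reference, the temporary pre-scaling described above might look like the following. This is only a sketch: the HPA name my-app and the replica counts are placeholders, and the minreplicas_patch helper is hypothetical. The kubectl calls are left as comments so the snippet can be read and run without a cluster.

```shell
# Hypothetical helper: build the patch body that sets an HPA's minReplicas.
minreplicas_patch() {
  printf '{"spec":{"minReplicas":%d}}' "$1"
}

# Before the spike: raise the floor so pods (and therefore nodes) are added early.
# "my-app" and 30 are placeholders, not values from the question.
#   kubectl patch hpa my-app --patch "$(minreplicas_patch 30)"

# After the spike: restore the normal floor.
#   kubectl patch hpa my-app --patch "$(minreplicas_patch 2)"
```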

edited 2 days ago

Rico

24.3k94864

asked 2 days ago

OndrejK

328

amazon-web-services kubernetes nginx-ingress

1 Answer (accepted, 2 votes)

It looks like your nginx ingress (lego) controller is not updating nginx.conf when the backend scales down. I would examine nginx.conf and check whether it still points to backends that no longer exist.

$ kubectl cp <nginx-lego-pod>:nginx.conf .

If something looks odd, you might have to delete the pod so that it gets recreated by the ReplicaSet managing your nginx ingress controller pods.

$ kubectl delete pod <nginx-controller-pod>

Then examine the nginx.conf again.

Another possibility is that Kubernetes is not updating the endpoints of your backend services, although that would not be directly related to scaling your lego HPA up or down. You can check with:

$ kubectl get ep

Then see whether any of the listed addresses belong to pods that no longer exist.
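
The two checks above can be combined: list the backend IPs nginx is configured with, list the IPs Kubernetes currently reports as endpoints, and print those that appear only in the former. This is a minimal sketch, assuming nginx.conf has been copied locally and that upstream entries appear in it as "server IP:PORT" lines; the helper name stale_upstreams and the file layout are hypothetical.

```shell
# Hypothetical helper: print upstream IPs found in nginx.conf that are not
# in the list of live endpoint IPs.
#   $1: local copy of nginx.conf
#   $2: file with one live endpoint IP per line, for example produced by
#       kubectl get endpoints <service> \
#         -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}'
stale_upstreams() {
  conf_ips=$(mktemp)
  live_ips=$(mktemp)
  # Assumption: upstream entries look like "server 10.0.0.1:3000 ...;"
  grep -oE 'server[[:space:]]+[0-9]+(\.[0-9]+){3}:[0-9]+' "$1" |
    sed -E 's/^server[[:space:]]+//; s/:[0-9]+$//' | sort -u > "$conf_ips"
  sort -u "$2" > "$live_ips"
  # comm -23 keeps lines that appear only in the first (nginx.conf) list.
  comm -23 "$conf_ips" "$live_ips"
  rm -f "$conf_ips" "$live_ips"
}
```

Any IP it prints is a candidate stale backend. Local addresses such as a default backend on 127.0.0.1 may also show up and can be ignored.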

answered 2 days ago

Rico

24.3k94864

I'm getting error: unexpected EOF when trying to cp the nginx.conf
– OndrejK
2 days ago

You can try shelling into the pod and checking where that file is: kubectl exec -it <pod-id> sh
– Rico
2 days ago

Hey Rico, I got this: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: \"sh\": executable file not found in $PATH"
– OndrejK
2 days ago

I also deleted one of the 3 pods (the one providing me with access logs), and that didn't work either
– OndrejK
2 days ago

So in the end I got it working after deleting another nginx pod (I have 3 pods). Thanks a lot for your help!
– OndrejK
2 days ago
