nginx-lego and autoscaler don't play well after scaling down
I'm having trouble with nginx-lego (I know it's deprecated) and the node autoscaler. I had to scale up manually by temporarily patching the HPA's minReplicas to a high number. Everything scaled well; new nodes were added because of the pod increase.
After the traffic spike, I set the number back to normal (which is really low) and started seeing a lot of 502 Bad Gateway errors. When I examined the nginx-lego pod's log, I could see that plenty of requests were going to pods that aren't there anymore (Connection refused or No route to host).
2018/11/21 17:48:49 [error] 5546#5546: *6908265 connect() failed (113: No route to host) while connecting to upstream, client: 100.112.130.0, server: xxxx.com, request: "GET /public/images/social-instagram.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-instagram.png", host: "xxxx.com", referrer: "https://outlook.live.com/"
2018/11/21 17:48:49 [error] 5409#5409: *6908419 connect() failed (113: No route to host) while connecting to upstream, client: 10.5.143.204, server: xxxx.com, request: "GET /public/images/social-instagram.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-instagram.png", host: "xxxx.com"
2018/11/21 17:48:49 [error] 5546#5546: *6908420 connect() failed (111: Connection refused) while connecting to upstream, client: 10.5.143.204, server: xxxx.com, request: "GET /public/images/social-facebook.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-facebook.png", host: "xxxx.com"
Any idea what could be wrong?
I guess that patching minReplicas probably isn't the best way to do it, but I knew there would be a spike and I didn't have a better idea for how to pre-scale the whole cluster.
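For reference, the pre-scaling step described above can be sketched roughly as follows. This is only an illustration: the helper name `hpa_min_patch`, the placeholder `<hpa-name>`, and the replica counts are assumptions, not the exact commands that were run.

```shell
# Sketch: build the JSON patch body that sets an HPA's minReplicas.
# hpa_min_patch is a hypothetical helper introduced for illustration.
hpa_min_patch() {
  # $1 = desired minimum replica count
  printf '{"spec":{"minReplicas":%d}}' "$1"
}

# Before the spike (placeholder HPA name, illustrative number):
#   kubectl patch hpa <hpa-name> -p "$(hpa_min_patch 20)"
# After the spike, back to the usual low value:
#   kubectl patch hpa <hpa-name> -p "$(hpa_min_patch 2)"
```

Raising minReplicas forces more pods to be scheduled, and the pending pods are what cause the node autoscaler to add nodes.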
amazon-web-services kubernetes nginx-ingress
edited 2 days ago
Rico
asked 2 days ago
OndrejK
1 Answer (accepted)
Looks like a problem with your nginx ingress (lego) controller not updating nginx.conf when scaling down. I would examine nginx.conf and see if it's pointing to backends that don't exist anymore.
$ kubectl cp <nginx-lego-pod>:nginx.conf .
If something looks odd, you might have to delete the pod so that it gets recreated by the ReplicaSet managing your nginx ingress controller pods.
$ kubectl delete pod <nginx-controller-pod>
Then examine nginx.conf again.
Another issue could be that the endpoints for your backend services are not being updated by Kubernetes, but that would not be directly related to upscaling/downscaling your lego HPA. You can check with:
$ kubectl get ep
And see if any of them point to pods that don't exist anymore.
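The two checks above can be combined into one comparison. The following is a minimal sketch, not a definitive tool: the function name `stale_upstreams` and the file names are hypothetical, and it assumes the upstream entries in nginx.conf are written as `server <ip>:<port>`. The commented kubectl lines further assume that the config lives at /etc/nginx/nginx.conf and that `cat` is available in the controller image, which may not hold for nginx-lego.

```shell
# Sketch: print upstream IPs from a saved nginx.conf that are NOT in the
# list of live endpoint IPs (one IP per line in the second file).
# stale_upstreams is a hypothetical helper; file names are illustrative.
stale_upstreams() {
  conf="$1"
  live="$2"
  # 1) pick out "server <ip>:<port>" entries, 2) keep only the IP,
  # 3) de-duplicate, 4) drop every IP that appears in the live list.
  grep -Eo 'server[[:space:]]+([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]+' "$conf" \
    | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' \
    | sort -u \
    | grep -vxF -f "$live" || true
}

# Producing the inputs (placeholders in angle brackets; path and the
# presence of cat in the image are assumptions):
#   kubectl exec <nginx-lego-pod> -- cat /etc/nginx/nginx.conf > nginx.conf
#   kubectl get ep <backend-service> \
#     -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}' > live-ips.txt
#   stale_upstreams nginx.conf live-ips.txt
```

Any IP this prints is an upstream nginx still routes to even though no endpoint backs it, which would match the "No route to host" and "Connection refused" errors in the question.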
answered 2 days ago
Rico
I'm getting "error: unexpected EOF" when trying to cp the nginx.conf
– OndrejK
2 days ago
You can try shelling into the pod and checking where that file is: kubectl exec -it <pod-id> -- sh
– Rico
2 days ago
Hey Rico, I got this: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: "sh": executable file not found in $PATH"
– OndrejK
2 days ago
I also deleted one of the 3 pods (the one providing me with access logs) and that didn't work either
– OndrejK
2 days ago
So in the end I was able to make it work after deleting another nginx pod (I have 3 pods). Thanks a lot for your help!
– OndrejK
2 days ago