Critical issue with Let's Encrypt SSL renewal

Sandeep · December 15, 2020, 7:30am

I have a certificate with 3 SANs on the same domain (the main domain and 2 subdomains), and I get an error when renewing it:

Nginx + php-fpm

root@server:~# v-update-letsencrypt-ssl
Error: Let's Encrypt validation status 400. Details: Unable to update challenge :: authorization must be pending
Error: Let's Encrypt validation status 400. Details: Unable to update challenge :: authorization must be pending
Error: Let's Encrypt validation status 400. Details: Unable to update challenge :: authorization must be pending

Raphael · December 15, 2020, 7:48am

Hi @Sandeep

We will ship a new Hestia version today. It is currently in the last testing phase, which is updating our own production infrastructure.

This version ships a slightly reworked Let’s Encrypt engine because of issues with Cloudflare in combination with 301 redirects. You can already install it with the following commands:

wget https://apt.hestiacp.com/beta/hestia_1.3.2_amd64.deb
dpkg -i hestia_1.3.2_amd64.deb

This will be the final version of 1.3.2, so there is no need to take any further steps after the “official” release to our repository.

Then please run v-update-letsencrypt-ssl again and let us know whether everything works properly.

eris · December 15, 2020, 7:48am

If not, check the contents of /var/log/hestia/LE-{user}-{domain}.{time}.log
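
A minimal sketch for picking the newest of these logs for one domain. It assumes the file naming seen later in this thread (LE-user-domain-date-time.log); latest_le_log is a hypothetical helper name, not a Hestia command.

```shell
# Print the newest Let's Encrypt log for a user/domain pair.
# Assumption: files are named LE-<user>-<domain>-<YYYYMMDD>-<HHMMSS>.log,
# so the last name in glob (sorted) order is the most recent one.
latest_le_log() {
    # $1 = user, $2 = domain, $3 = log directory (defaults to /var/log/hestia)
    le_dir=${3:-/var/log/hestia}
    le_newest=
    for le_file in "$le_dir"/LE-"$1"-"$2"-*.log; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$le_file" ] && le_newest=$le_file
    done
    # Returns non-zero when no log was found.
    [ -n "$le_newest" ] && printf '%s\n' "$le_newest"
}
```

Usage, with placeholder names: less "$(latest_le_log user domain.com)"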

Sandeep · December 15, 2020, 8:17am

@Raphael
probably you’ll solve this soon

root@server:/usr/local/src# wget https://apt.hestiacp.com/beta/hestia_1.3.2_amd64.deb
--2020-12-15 08:14:21--  https://apt.hestiacp.com/beta/hestia_1.3.2_amd64.deb
Resolving apt.hestiacp.com (apt.hestiacp.com)... 2606:4700:3031::ac43:8332, 2606:4700:3037::681c:1433, 2606:4700:3035::681c:1533, ...
Connecting to apt.hestiacp.com (apt.hestiacp.com)|2606:4700:3031::ac43:8332|:443... failed: Connection timed out.
Connecting to apt.hestiacp.com (apt.hestiacp.com)|2606:4700:3037::681c:1433|:443...

Sandeep · December 15, 2020, 8:25am

@Raphael @eris

can you confirm :

root@server:/var/log/hestia# cat LE-domain-server.domain.com-20201215-081913.log


=============================
WEB_SYSTEM: nginx
PROXY_SYSTEM:
user: domain
domain: server.domain.com


- aliases:
- proto: http-01
- wildcard:


==[Step 1]==
- status: 200
- nonce: 0103WxfS7Wudw69op_pxnAWygiv_kb2AVQKrmUTBW3R-ino
- answer: HTTP/2 200
server: nginx
date: Tue, 15 Dec 2020 08:19:45 GMT
content-type: application/json
content-length: 658
cache-control: public, max-age=0, no-cache
replay-nonce: 0103WxfS7Wudw69op_pxnAWygiv_kb2AVQKrmUTBW3R-ino
x-frame-options: DENY
strict-transport-security: max-age=604800



==[API call]==
exit status: 0


==[Step 2]==
- status: 201
- nonce: 0004akijJ_aUG5_FY2MNMF-gGkpI_7VUyDRAvvqp0wiSmqg
- authz: https://acme-v02.api.letsencrypt.org/acme/authz-v3/9325748247
- finalize: https://acme-v02.api.letsencrypt.org/acme/finalize/99337698/6752328766
- payload: {"identifiers":[{"type":"dns","value":"server.domain.com"}]}
- answer: HTTP/2 201
server: nginx
date: Tue, 15 Dec 2020 08:19:46 GMT
content-type: application/json
content-length: 351
boulder-requester: 99337698
cache-control: public, max-age=0, no-cache
link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
location: https://acme-v02.api.letsencrypt.org/acme/order/99337698/6752328766
replay-nonce: 0004akijJ_aUG5_FY2MNMF-gGkpI_7VUyDRAvvqp0wiSmqg
x-frame-options: DENY
strict-transport-security: max-age=604800

{
  "status": "pending",
  "expires": "2020-12-22T08:19:46.34688435Z",
  "identifiers": [
    {
      "type": "dns",
      "value": "server.domain.com"
    }
  ],
  "authorizations": [
    "https://acme-v02.api.letsencrypt.org/acme/authz-v3/9325748247"
  ],
  "finalize": "https://acme-v02.api.letsencrypt.org/acme/finalize/99337698/6752328766"
}


==[API call]==
exit status: 0


==[Step 3]==
- status: 200
- nonce: 0004A765b9ki4irX6YRLWzUQuTY9FWHY_8NXDlhNfjAWK1Y
- url: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9325748247/5eJn4w
- token: gdJV2k20i4IWffd3lNJS7jZpGR01NwMpz0cFYqYKsyY
- answer: HTTP/2 200
server: nginx
date: Tue, 15 Dec 2020 08:19:46 GMT
content-type: application/json
content-length: 800
boulder-requester: 99337698
cache-control: public, max-age=0, no-cache
link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
replay-nonce: 0004A765b9ki4irX6YRLWzUQuTY9FWHY_8NXDlhNfjAWK1Y
x-frame-options: DENY
strict-transport-security: max-age=604800

{
  "identifier": {
    "type": "dns",
    "value": "server.domain.com"
  },
  "status": "pending",
  "expires": "2020-12-22T08:19:46Z",
  "challenges": [
    {
      "type": "http-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/9325748247/5eJn4w",
      "token": "gdJV2k20i4IWffd3lNJS7jZpGR01NwMpz0cFYqYKsyY"
    },
    {
      "type": "dns-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/9325748247/vmOtDg",
      "token": "gdJV2k20i4IWffd3lNJS7jZpGR01NwMpz0cFYqYKsyY"
    },
    {
      "type": "tls-alpn-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/9325748247/OerpNA",
      "token": "gdJV2k20i4IWffd3lNJS7jZpGR01NwMpz0cFYqYKsyY"
    }
  ]
}


==[API call]==
exit status: 0


==[Step 5]==
- status: 200
- nonce: 0003cIxPR4lR5zy6sW82-xc_YRVhhpG4bSUX9ah8_HF2Bh4
- validation: pending
- details:
- answer: HTTP/2 200
server: nginx
date: Tue, 15 Dec 2020 08:19:52 GMT
content-type: application/json
content-length: 185
boulder-requester: 99337698
cache-control: public, max-age=0, no-cache
link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
link: <https://acme-v02.api.letsencrypt.org/acme/authz-v3/9325748247>;rel="up"
location: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9325748247/5eJn4w
replay-nonce: 0003cIxPR4lR5zy6sW82-xc_YRVhhpG4bSUX9ah8_HF2Bh4
x-frame-options: DENY
strict-transport-security: max-age=604800

{
  "type": "http-01",
  "status": "pending",
  "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/9325748247/5eJn4w",
  "token": "gdJV2k20i4IWffd3lNJS7jZpGR01NwMpz0cFYqYKsyY"
}


==[API call]==
exit status: 0


==[Step 5]==
- status: 400
- nonce: 0104m-3ky6iSRhWpojetxQNpJeHKH1TAuop8i5C0zvdk3jI
- validation:
- details: Unable to update challenge :: authorization must be pending
- answer: HTTP/2 400
server: nginx
date: Tue, 15 Dec 2020 08:19:56 GMT
content-type: application/problem+json
content-length: 144
boulder-requester: 99337698
cache-control: public, max-age=0, no-cache
link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
replay-nonce: 0104m-3ky6iSRhWpojetxQNpJeHKH1TAuop8i5C0zvdk3jI

{
  "type": "urn:ietf:params:acme:error:malformed",
  "detail": "Unable to update challenge :: authorization must be pending",
  "status": 400
}


==[Abort Step 5]==
=> Wrong status

Raphael · December 15, 2020, 8:30am

Sandeep:

@Raphael
probably you’ll solve this soon

root@server:/usr/local/src# wget https://apt.hestiacp.com/beta/hestia_1.3.2_amd64.deb
--2020-12-15 08:14:21--  https://apt.hestiacp.com/beta/hestia_1.3.2_amd64.deb
Resolving apt.hestiacp.com (apt.hestiacp.com)... 2606:4700:3031::ac43:8332, 2606:4700:3037::681c:1433, 2606:4700:3035::681c:1533, ...
Connecting to apt.hestiacp.com (apt.hestiacp.com)|2606:4700:3031::ac43:8332|:443... failed: Connection timed out.

This is an issue on your side. Hestia sits behind Cloudflare, and the download is timing out over IPv6, so IPv6 probably isn’t configured properly on your server.
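
If IPv6 on the server is the culprit, a sketch like the following could confirm it and work around it for the download. probe_families and fetch_ipv4 are hypothetical helper names; the curl and wget flags used are standard ones.

```shell
# Report whether a host answers on port 443 over IPv4 and over IPv6.
probe_families() {
    # $1 = host name, e.g. apt.hestiacp.com
    for pf_family in 4 6; do
        if curl "-$pf_family" --silent --head --max-time 10 \
                --output /dev/null "https://$1/"; then
            echo "IPv$pf_family: ok"
        else
            echo "IPv$pf_family: failed"
        fi
    done
}

# Download a file over IPv4 only, sidestepping a broken IPv6 route.
fetch_ipv4() {
    # $1 = URL
    wget --inet4-only "$1"
}
```

Usage: probe_families apt.hestiacp.com, then fetch_ipv4 https://apt.hestiacp.com/beta/hestia_1.3.2_amd64.deb if only the IPv6 line reports a failure.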

Sandeep · December 15, 2020, 8:45am

OK, I will check it with the server provider, as I don’t have any firewall beyond your default installation.

Please also check the LE log and let me know what is going on.

eris · December 15, 2020, 10:04am

Are you using Cloudflare?

Does systemctl reload nginx work fine?

Sandeep · December 15, 2020, 10:16am

Cloudflare DNS only, no proxy.

Yes, nginx works fine, and so does the reload command.

root@server:~# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

root@server:~# nginx -s reload
root@server:~#

root@server:~# systemctl reload nginx && systemctl status nginx -l
● nginx.service - nginx - high performance web server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2020-12-08 04:33:54 UTC; 1 weeks 0 days ago
Docs: http://nginx.org/en/docs/
Process: 426196 ExecReload=/bin/kill -s HUP $MAINPID (code=exited, status=0/SUCCESS)
Main PID: 3765414 (nginx)
Tasks: 7 (limit: 3488)
Memory: 95.8M
CGroup: /system.slice/nginx.service
├─ 167195 nginx: worker process is shutting down
├─ 424038 nginx: worker process is shutting down
├─ 426188 nginx: worker process
├─ 426189 nginx: worker process
├─ 426190 nginx: worker process
├─ 426191 nginx: cache manager process
└─3765414 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf

Dec 15 09:54:02 server.domain.com systemd[1]: Reloading nginx - high performance web server.
Dec 15 09:54:02 server.domain.com systemd[1]: Reloaded nginx - high performance web server.
Dec 15 10:18:08 server.domain.com systemd[1]: Reloading nginx - high performance web server.
Dec 15 10:18:08 server.domain.com systemd[1]: Reloaded nginx - high performance web server.
Dec 15 10:18:43 server.domain.com systemd[1]: Reloading nginx - high performance web server.
Dec 15 10:18:43 server.domain.com systemd[1]: Reloaded nginx - high performance web server.
Dec 15 10:18:58 server.domain.com systemd[1]: Reloading nginx - high performance web server.
Dec 15 10:18:58 server.domain.com systemd[1]: Reloaded nginx - high performance web server.
Dec 15 10:19:16 server.domain.com systemd[1]: Reloading nginx - high performance web server.
Dec 15 10:19:16 server.domain.com systemd[1]: Reloaded nginx - high performance web server.

Sandeep · December 15, 2020, 10:23am

For now I’ve installed the SSL certificate manually, as it is going to expire in a few days and renewing the cert is not working. But I would like to have a solution for this.

Raphael · December 16, 2020, 9:18am

Renewal of certs is working properly here, and there is no known issue. I can’t say what’s wrong on your side :(.

eris · December 16, 2020, 9:21am

Normally this issue is caused by Cloudflare or by problems reloading Nginx.

Sandeep · December 16, 2020, 1:05pm

Well, as I mentioned above, I don’t use the Cloudflare proxy, and reloading works fine.

Lupu · December 16, 2020, 3:54pm

Error 400 means that LE cannot verify the challenge key (.well-known/acme-challenge/…).
This can happen because of DNS (TTL caching) or a wrong response from the web server
(for the primary domain or any of the aliases).

Test DNS: you can check with dig @8.8.8.8 <domain.tld>
Test WEB:
1- service nginx configtest
2- systemctl restart nginx
3- curl --location --insecure --verbose --resolve <domain.tld>:80:<server-ip-from-dig> http://<domain.tld>:80/.well-known/acme-challenge/123
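
The curl check in step 3 can be wrapped so it prints only the final HTTP status and the URL that was actually served. This is a sketch; acme_probe is a hypothetical helper name, and the token argument is just a test file name such as the 123 used above.

```shell
# Request a challenge path on port 80 of a specific server IP and print
# "<http_code> <final_url>" after following any redirects.
acme_probe() {
    # $1 = domain, $2 = server IP as returned by dig, $3 = test token/file name
    curl --silent --location --insecure \
         --resolve "$1:80:$2" \
         --output /dev/null \
         --write-out '%{http_code} %{url_effective}\n' \
         "http://$1:80/.well-known/acme-challenge/$3"
}
```

Usage: acme_probe <domain.tld> <server-ip-from-dig> 123. Running it once per alias covers the primary domain and the aliases mentioned above. Add --verbose back in when the headers themselves are of interest.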

Sandeep · December 16, 2020, 7:31pm

Well, you know what, I found the issue: when I unchecked the auto HTTPS redirect and HSTS options, the renewal worked. So devs, please look into it.
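
To see whether the forced redirect applies to the challenge path, the first response on port 80 can be inspected without following it. The sketch below only classifies raw response headers; redirect_target is a hypothetical helper name. Note that Let's Encrypt follows redirects during HTTP-01 validation, so a redirect on its own is not necessarily the cause.

```shell
# Read raw HTTP response headers on stdin and print the status code
# followed by the Location target, or "no-redirect" for non-3xx answers.
redirect_target() {
    awk '
        { sub(/\r$/, "") }                          # strip CR from CRLF line ends
        NR == 1 { code = $2 }                       # status line: "HTTP/x <code> ..."
        tolower($1) == "location:" { loc = $2 }     # header names are case-insensitive
        END {
            if (code ~ /^3/) print code, loc
            else             print code, "no-redirect"
        }
    '
}
```

Usage: curl --silent --head http://<domain.tld>/.well-known/acme-challenge/123 | redirect_target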

Raphael · December 16, 2020, 7:54pm

I use HSTS and auto redirect on all my domains. Auto redirect was also enabled by default for a while, but that was reverted because renewal re-enabled it even for users who had disabled the redirect.

There seem to be some system-specific issues. Maybe you can follow the steps @Lupu gave and try a manual curl check.

Sandeep · December 16, 2020, 7:56pm

You probably missed that I unchecked those options and then the cert got renewed.

Raphael · December 16, 2020, 7:58pm

No, I didn’t miss that. That’s why I wrote that I use these settings as well. So please turn them back on and give us more information from a manual curl check. None of us has a system that shows the issue; otherwise we would be able to debug it (if it were reproducible).

pluto · December 17, 2020, 2:45am

Server updated to hestia 1.3.2 a day or so ago, and I received a Letsencrypt error. Server is using Cloudflare proxy, which rang a bell, so I thought I’d mention it here. The error message emailed to me was
Error: Let’s Encrypt finalize bad status 500

And thanks to the recent addition of fantastically detailed logs I was able to get this as the last entry in my log directory. /var/log/hestia/LE-user-domain.com-date-time.log

==[Step 6]==
- status: 500
- nonce: xxxxxxxxxxxxxxxxxxxxxxxxx
- payload: {"csr":"<long string removed>"}
- certificate: 
- answer: HTTP/2 500 
server: nginx
date: Wed, 16 Dec 2020 23:21:26 GMT
content-type: application/problem+json
content-length: 174
cache-control: public, max-age=0, no-cache
link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
replay-nonce: xxxxxxxxxxxxxxxxxxxxx

{
  "type": "urn:ietf:params:acme:error:serverInternal",
  "detail": "Error retrieving account \"https://acme-v02.api.letsencrypt.org/acme/acct/90961111\"",
  "status": 500
}

The internal LE certificate is set to expire on Jan 16th, which is why the renewal was triggered. However the Cloudflare cert on the proxy is still good until July. I actually don’t mind turning off Cloudflare proxy on this site, but would like to help with the resolution if I can. Let me know what information I can provide. I’m OK with sending you the whole LE log by email / PM, but don’t want to make it public for obvious reasons.

I tried the curl command above (nice trick with the --resolve flag, must remember that), and it gave the correct response of 123.aosidfasidufpawhatever. As I read the logs, it seems the problem was when communicating back to LE servers? Anyway, will investigate a little more on this end. Work has been slow recently …

pluto · December 17, 2020, 3:09am

OK, so I did some detective work and it seems that LE error is often due to some sort of server glitch at Letsencrypt. So I thought I’d run the command manually
v-update-letsencrypt-ssl domain.com
This time there were no errors in /var/log/hestia/error.log
And the certificate was renewed successfully

openssl x509 -noout -dates -in /home/user/conf/web/domain.com/ssl/domain.com.crt
notBefore=Dec 17 02:04:44 2020 GMT
notAfter=Mar 17 02:04:44 2021 GMT
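
The notAfter line can also be turned into a number of days remaining, which is easier to check at a glance. This is a sketch that assumes GNU date (for the -d option); cert_days_left is a hypothetical helper name.

```shell
# Print the whole days left until a certificate expires.
cert_days_left() {
    # $1 = a "notAfter=..." line as printed by `openssl x509 -noout -enddate`
    # $2 = reference time in epoch seconds (defaults to now)
    cd_end=$(date -u -d "${1#notAfter=}" +%s) || return 1   # GNU date parses the openssl format
    cd_now=${2:-$(date -u +%s)}
    echo $(( (cd_end - cd_now) / 86400 ))
}
```

Usage: cert_days_left "$(openssl x509 -noout -enddate -in /home/user/conf/web/domain.com/ssl/domain.com.crt)"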

So seems like it was a false alarm, but hopefully useful for people reading this forum in the future, so I’ll leave it there rather than deleting it.
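
Since a transient error like this one went away on a second run, a small retry wrapper is one option for scripted renewals. This is a generic sketch, not part of Hestia; retry is a hypothetical helper name. Let's Encrypt rate-limits failed validations, so the attempt count should stay small and the wait generous.

```shell
# Run a command until it succeeds or the attempt limit is reached.
retry() {
    # $1 = max attempts, $2 = seconds to wait between attempts, rest = command
    rt_max=$1
    rt_wait=$2
    shift 2
    rt_try=1
    while ! "$@"; do
        if [ "$rt_try" -ge "$rt_max" ]; then
            echo "giving up after $rt_try attempts" >&2
            return 1
        fi
        rt_try=$((rt_try + 1))
        sleep "$rt_wait"
    done
}
```

Usage: retry 3 600 v-update-letsencrypt-ssl domain.com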