Uptime socket timeout issues

Hub > Uptime reports socket timeout error for one of my sites. (Reporting the site is down)
The site loads fine on Pingdom and on the browser as well.
Please help me to fix this.

  • Adam Czajczyk
    • Support Gorilla

    Hi Carter

    I hope you’re well today and thank you for contacting us!

    I checked the site and the Hub, and nothing looks wrong on the site’s side. A “socket timeout” usually means that the site didn’t respond to the Uptime request in time (within the 30-second threshold). Since the site loads fine and responds to other tools (like Pingdom), it might mean that something is rejecting or breaking our requests specifically.
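
    For anyone who wants to approximate this kind of check locally, here is a minimal Python sketch. It is not the actual Uptime code: the function name `check_site` and the injectable `opener` parameter are assumptions made for illustration, and only the 30-second threshold comes from the description above.

```python
import socket
import urllib.error
import urllib.request


def check_site(url, timeout=30.0, opener=urllib.request.urlopen):
    """Return 'up', 'timeout', or 'error: ...' for a single HTTP check.

    `opener` is injectable (hypothetical design choice) so the logic can be
    exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return "up" if response.status < 400 else f"error: HTTP {response.status}"
    except socket.timeout:
        # Raised directly when the server accepts the connection but is slow to reply.
        return "timeout"
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return f"error: HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connection-phase timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return f"error: {exc.reason}"
```

    A “timeout” result from one network combined with an “up” result from another would be consistent with requests being dropped for specific source IPs.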

    I’ve asked our developers to look into it so they’ll investigate the issue more and get back to us with their findings.

    We’ll update you here as soon as we get to know more so please keep track of this ticket for further information.

    Best regards,
    Adam

  • cloudCreative
    • WPMU DEV Initiate

    We are experiencing a similar issue. Site reports as down in hub. Logged into the site and Uptime said it was up (obviously.) I disabled uptime and then tried to enable again. It reported “Your domain does not appear to be publicly accessible on the Internet. We can only monitor the uptime of public websites.” Currently, it is unable to reactivate.

  • Adam Czajczyk
    • Support Gorilla

    Hello cloudCreative

    Our developers are still looking into the original issue reported here, but it doesn’t affect all sites, and I can’t be certain that your site has the same problem without checking it. The message you’re getting also suggests that a different issue is very likely involved.

    Could you please start a separate support ticket of your own so we could check it and provide proper support for you? I’d appreciate it a lot.

    You can start a new ticket here:

    https://wpmudev.com/forums/forum/support#question

    Best regards,
    Adam

  • Predrag Dubajic
    • Support

    Hi Carter,

    Apologies for the long delay here. I had a chat with the devs and they are still looking into it. The fix will most likely require a bigger change in the backend, so it’s being handled carefully to ensure it doesn’t affect other functionality.
    We’re hoping to have a viable solution ready soon.

    Best regards,
    Predrag

    • Adam Czajczyk
      • Support Gorilla

      Hi Patrick

      There are some “ongoing” changes being made to The Hub and while the issue “looks” the same, it doesn’t necessarily have to be caused by the same thing. It may or may not be and it would be best to do some basic troubleshooting first, to make sure.

      Would you please start a separate ticket of your own (feel free to include link to this ticket for reference too) and also grant support access to some example sites of yours that are affected so we could take a closer look and check it?

      You can submit a ticket here:

      https://wpmudev.com/forums/forum/support#question

      Thank you in advance!

      Best regards,
      Adam

  • AMTRUP
    • Design Lord, Child of Thor

    Hello

    We also have 25+ websites that are not being monitored correctly. This also seems to affect other WPMU DEV plugins, even though we have whitelisted the WPMU DEV IPs.

    So they get back, so they don’t have access for a long time and are all reported OFFLINE.

    What is happening with this issue? This is a major problem…

    AMTRUP

  • Predrag Dubajic
    • Support

    Hi Carter,

    Apologies for the long delay here. Some changes in Uptime led to other reports and had to be reverted, so additional modifications were made after that.
    Is this issue still present on your sites?
    If it’s still happening for your site(s), can you please check with your hosting provider and any other firewall protection that these IPs are whitelisted:

    18.204.159.253
    34.196.51.17
    35.157.144.199
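
    When reviewing firewall rules, it can also help to confirm that none of these IPs falls inside a broader deny range. Below is a minimal Python sketch, assuming the deny rules can be exported as CIDR strings; the function name `find_blocking_rules` and the rule format are hypothetical, and only the three IPs come from the list above.

```python
import ipaddress

# The three Uptime IPs listed above.
UPTIME_IPS = ("18.204.159.253", "34.196.51.17", "35.157.144.199")


def find_blocking_rules(deny_rules, ips=UPTIME_IPS):
    """Map each IP to the deny rules (CIDR strings) that contain it.

    IPs not covered by any deny rule are omitted from the result.
    """
    networks = [ipaddress.ip_network(rule, strict=False) for rule in deny_rules]
    hits = {}
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        matching = [str(net) for net in networks if addr in net]
        if matching:
            hits[ip] = matching
    return hits
```

    An empty result means none of the supplied deny rules covers the Uptime IPs; it does not rule out blocking at another layer, such as the hosting provider’s network.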

    Best regards,
    Predrag

    • Carter
      • Digital Marketing Testers

      Predrag Dubajic, this issue is still present. It has been going on for nearly 6 months now and no further investigation has been done. As a matter of fact, it’s been forgotten and lost in the trenches. Can we please make a considerable effort to get this resolved before the end of November? It is ridiculous that it’s had to drag on for this long and, by the sounds of things and the traffic to this thread, it’s not a localised issue.

      The IPs you’ve mentioned are all unblocked on the server, so that is not the cause. Please investigate this further; this is really annoying.


  • Predrag Dubajic
    • Support

    Hi Carter,

    Apologies for the delay here. We are doing some extensive checks on our end: requests sent manually are still being denied when tested against your server, but this can’t be replicated on other installations.

    We did have other reports as well, but allowing access from our IPs usually fixes the issue, so we’re checking further whether anything on our end will shed some light on this.

    We will update you when we have further information from our end, or in case we need access to check this further on your server’s end.

    Best regards,
    Predrag

  • Predrag Dubajic
    • Support

    Hi Carter,

    Seems like there is definitely something going wrong with our service communicating with the server, and it’s quite likely that something went wrong with whitelisting our service IPs specifically.

    While debugging the service request timeout, we set up a number of replicas of our production environment and determined that requests sent from different, randomly assigned IPs succeed, but requests coming from our standard IPs always time out.
    This was happening regardless of whether the request was originating from the production instances or one of the replicas and is specific to your server, as running the same checks on a sample pool of other domains worked fine in both circumstances.

    We have also been running extensive checks on each part of the system in isolation for your domain specifically to rule out possible issues in other areas, and the issue consistently was simply reaching out to your site, which always ended up in request timeout when originating from one of our IPs.

    Our service user agent is WPMUDEV Uptime Monitor 4.0 (https://wpmudev.com), and you should be able to see at least some entries with that UA in your HTTP server access log as we were able to get a non-timeout result with requests coming from different IP ranges in our testbed setup.
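
    To look for those entries, something like the following Python sketch could be used. It assumes an access log in which the client IP is the first space-separated field, as in the common and combined log formats; the function name `uptime_hits_by_ip` is hypothetical, and the log path in the usage note is a placeholder.

```python
from collections import Counter

# Substring of the user agent quoted above.
UA_MARKER = "WPMUDEV Uptime Monitor"


def uptime_hits_by_ip(log_lines, ua_marker=UA_MARKER):
    """Count access-log entries carrying the Uptime user agent, per client IP.

    Assumes the client IP is the first space-separated field on each line.
    """
    counts = Counter()
    for line in log_lines:
        if ua_marker in line:
            client_ip = line.split(" ", 1)[0]
            counts[client_ip] += 1
    return counts
```

    An open file object can be passed directly, for example `uptime_hits_by_ip(open("access.log"))`. Entries from IPs other than our standard ones, with none from the standard IPs, would match what we saw in the testbed.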

    Best regards,
    Predrag