Intermittent subscription emails not being sent

We’re having intermittent problems with sending emails.

Version 58.7 on Docker, JAVA_OPTS -Xms4g -Xmx8g

Scheduling a restart every night appeared to fix it, but that's not always working.

It appears to be triggered when dashboards with slower queries are running (none takes more than 30 seconds).

Should I just bump up the RAM more or is there something else I should try first?

Bumping the JVM size only helps if you’re getting OOM exceptions or a lot of GC thrash during the job.

The query is most likely throwing some exception and killing the job runner thread, which keeps later jobs from running. The exception should be logged when it happens.
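For when you do pull the logs, here is a rough Python sketch of how you could pick out the exception lines and check whether any of them is an OutOfMemoryError. It assumes the log is plain text containing Java-style fully qualified exception names. The function names and the sample lines are made up for illustration and are not taken from your instance.

```python
import re
from typing import Iterable, List, Tuple

# Matches fully qualified Java exception/error class names, for example
# java.lang.OutOfMemoryError or java.sql.SQLException.
EXCEPTION_RE = re.compile(
    r"\b(?:[a-z_][\w$]*\.)+(?:[A-Z][\w$]*)?(?:Exception|Error)\b"
)


def find_exceptions(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, exception_class, full_line) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = EXCEPTION_RE.search(line)
        if match:
            hits.append((number, match.group(0), line.rstrip("\n")))
    return hits


def has_out_of_memory(hits: List[Tuple[int, str, str]]) -> bool:
    """True if any hit is an OutOfMemoryError, the case where a bigger heap helps."""
    return any(name.endswith("OutOfMemoryError") for _, name, _ in hits)


# Hypothetical sample lines; real log lines will look different.
sample = [
    "2024-01-01 10:00:00 INFO  task started",
    "2024-01-01 10:00:05 ERROR java.lang.OutOfMemoryError: Java heap space",
    "\tat com.example.Foo.bar(Foo.java:10)",
    "2024-01-01 10:01:00 ERROR java.sql.SQLException: timeout",
]
sample_hits = find_exceptions(sample)
```

To run it against a downloaded log, pass an open file handle to find_exceptions. If has_out_of_memory comes back False, the exception class names in the hits are the more useful lead.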

Thanks, I’ll download the logs next time it happens.

Given that I don’t have total control over every query that runs, is there a way to automatically restart the job runner thread if it fails?

Alternatively, is there an API call that could check its status and restart it?

I don’t know if there is a way to repair the scheduler externally. But if a scheduled job is crashing, it should log something, and we can use that to fix why it is crashing. Queries should not crash the scheduler unless there is a bug or unexpected event.

Looks like there were two things going on.

The original problem, which I think was a memory shortage or leak, was fixed by the nightly restart.

The more recent return of the problem is due to SMTP2Go having problems with sending via MXGuarddog. The lowest priority MX record was no longer valid and, for some reason, SMTP2Go wouldn't try either of the working records.
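For anyone wondering what I expected to happen: a sending server normally tries the MX hosts in order of preference (lowest number first) and moves on to the next one if a host fails. Here is a minimal Python sketch of that fallback. The hostnames and the try_host callback are hypothetical, hosts with equal preference are simply sorted by name instead of being randomised, and this is not how SMTP2Go is actually implemented.

```python
from typing import Callable, Iterable, Optional, Tuple

# (preference, hostname); a lower preference number is tried first.
MxRecord = Tuple[int, str]


def deliver_with_fallback(
    records: Iterable[MxRecord],
    try_host: Callable[[str], bool],
) -> Optional[str]:
    """Try each MX host in preference order and return the first that accepts.

    try_host is a caller-supplied function that attempts delivery to one host.
    It returns True on success, and returns False or raises OSError on failure.
    """
    for _, host in sorted(records):
        try:
            if try_host(host):
                return host
        except OSError:
            # Host unreachable or invalid: fall through to the next record.
            continue
    return None


# Hypothetical records: the first host tried is broken, the others work.
records = [
    (20, "mx2.example.com"),
    (10, "mx1.example.com"),
    (30, "mx3.example.com"),
]


def fake_try_host(host: str) -> bool:
    if host == "mx1.example.com":
        raise ConnectionRefusedError("host no longer valid")
    return True


accepted_by = deliver_with_fallback(records, fake_try_host)
```

In my case the equivalent of the loop seemed to stop at the broken record instead of continuing to the next one.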

I doubt very much that anyone else will be affected by this combination of events!!

Certainly not the combination, but the memory leak is worrying. I'm not seeing anything here (I have long-term memory pool stats for my instance), but I don't use notifications, so if there are problems there, I won't see them.