Stability of Metabase on Elastic Beanstalk


#1

I have MB up and running using this guide: https://metabase.com/docs/latest/operations-guide/running-metabase-on-elastic-beanstalk.html#runing-metabase-on-aws-elastic-beanstalk

The beanstalk instance constantly fluctuates between Severe and OK. Here’s an example alert from AWS:
“Message: Environment health has transitioned from Ok to Severe. ELB health is failing or not available for all instances.”

I had it running on t2.small, but updated to t2.medium this morning in an attempt to alleviate these issues. I’ve downloaded the logs from AWS, but honestly I’m not sure what to dig into to identify what is causing this.
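As a first pass over the downloaded logs, something like the sketch below could at least show whether errors, warnings, or memory exhaustion show up at all. It assumes the logs contain plain-text level markers such as ERROR and WARN and that a JVM memory failure would appear as OutOfMemoryError; the marker list and the function names are illustrative, not anything Metabase or AWS defines.

```python
import re
from collections import Counter

# Assumed markers; adjust them to whatever the downloaded logs actually contain.
PATTERNS = {
    "error": re.compile(r"\bERROR\b"),
    "warn": re.compile(r"\bWARN\b"),
    "out_of_memory": re.compile(r"OutOfMemoryError"),
}


def summarize_log(lines):
    """Count how many lines match each marker in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def summarize_file(path):
    """Run summarize_log over one log file, tolerating odd bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_log(handle)
```

Calling summarize_file on each file in the log bundle (the path is whatever the download produced) gives a rough count per marker. A non-zero out_of_memory count would suggest the instance size really is the limiting factor.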

Are there any debugging/support guides that can assist in maintaining/running our own instance of MB?

Thanks,
Josh


#2

Since writing, I’ve created and added an EC2 key pair to my Beanstalk instance (which seems to have generated a new EC2 instance under the hood), yet I can’t seem to SSH in using ssh -i path/to/my.pem .compute-1.amazonaws.com – following this guide: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html

Not sure if that’s necessary, but figured it would be useful to poke around things that way.


#3

Hi, have you found what’s causing the instability? I’m dealing with the same issue.


#4

I upgraded our instance to medium, which seemed entirely unnecessary, but that seems to have made the problem go away. It still seems like a Metabase issue that it isn’t stable on smaller instances.


#5

@burcturkoglu and @jsharpemr,

For future reference, it would be good to know which Metabase version you are both experiencing this with.

Also – hopefully to help both of you out – try upgrading your AWS deployments to the latest available version if you aren’t already on it (as of writing this, it should be v0.30.3). Based on when you reported this, I’m guessing you’re running either 0.29.x or an early 0.30.x version. At least one really nasty bug affecting performance (see https://github.com/metabase/metabase/issues/8312#issuecomment-417326691) was fixed in v0.30.2.


#6

I’m currently on 0.30.0 – updating to 0.30.3 as I type this. Currently on a t2.small instance.

I read through that issue description, and our setup doesn’t seem to match it: we don’t have a dashboard with a lot of cards, and our problem isn’t (so far) reproducible. Our application seemingly falls over all by itself; for example, in the middle of the night I get multiple warnings from AWS about the application missing health checks. Or maybe it’s simply a bird landing on the data center.

That said, I understand the 0.30.3 version could contain some more general perf improvements and I’ll keep an eye on things.
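To see when the health checks are actually missed, a small poller run from another machine could record each failed probe with a timestamp. This is only a sketch: it assumes the instance answers on Metabase's /api/health endpoint with HTTP 200 when healthy, and the hostname in the usage note is a placeholder. The function names, interval, and round count are made up for illustration.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone


def check_once(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (ok, detail) for a single GET against url."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
            return status == 200, f"HTTP {status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer: DNS failure, refused connection, timeout, etc.
        return False, f"error: {exc}"


def poll(url, interval=30.0, rounds=10, check=check_once, sleep=time.sleep):
    """Probe url `rounds` times and return only the failed probes."""
    failures = []
    for i in range(rounds):
        ok, detail = check(url)
        if not ok:
            stamp = datetime.now(timezone.utc).isoformat()
            failures.append((stamp, detail))
        if i < rounds - 1:
            sleep(interval)
    return failures
```

Something like poll("http://my-env.example.com/api/health", interval=60, rounds=600) left running overnight would return the UTC timestamps of the failed probes, which can then be lined up against the AWS alerts and the instance logs.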


#7

Even though I’m using the latest version (0.30.4), I’m still having the same issue.