Best practice for having RDS as part of EB environment

:wave:

In a discussion with AWS support, I was advised the following:

Please note that adding RDS instance as a part of EB environment is great for test environments, however it isn’t ideal for a production environment because it ties the lifecycle of the database instance to the lifecycle of your application’s environment which means if you terminate the environment, the database instance will be terminated as well.

Ref: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.RDS.html

But, from the Metabase documentation, I see the following:

The Metabase team runs a number of production installations on AWS using Elastic Beanstalk and currently recommend it as the preferred choice for production deployments.

Ref: https://www.metabase.com/docs/latest/operations-guide/running-metabase-on-elastic-beanstalk.html

We are currently running Metabase in a Test environment and want to move to Production as soon as possible. So, I would like to know the real-world best practices for running Metabase in production and scaling it to 100+ users.

Thank you very much!

Hi @bkowshik

The note is about the coupling of RDS and EB that happens when the RDS instance is created as part of creating the EB environment.
There’s a document about decoupling RDS and EB: https://aws.amazon.com/premiumsupport/knowledge-center/decouple-rds-from-beanstalk/
But in short, just create the RDS before creating the EB, and then they’re not tied together.

It’s not really important how many users you have. What matters is how many active sessions there are, how heavy the queries are, and which features of Metabase are being used.
That might be really difficult to answer until you go into production.


If you’re planning on running Metabase in EB for production purposes (we do), I’ve always separated the two, meaning not using the EB Configuration to create/provision the RDS instance.

Like @flamber said above, I’ve always created the RDS instances beforehand (or used existing ones), and then configured Metabase (through EB Environment config values) to connect to that RDS instance.

@jpipas
I’m trying to do the same thing right now, but I’m having trouble finding where to set the env variables for the docker image using the Beanstalk configuration.

If I want to run a standalone docker image I would do something like this:
docker run -d -p 3000:3000 \
  -e "MB_DB_TYPE=postgres" \
  -e "MB_DB_DBNAME=metabase" \
  -e "MB_DB_PORT=5432" \
  -e "MB_DB_USER=" \
  -e "MB_DB_PASS=" \
  -e "MB_DB_HOST=my-database-host" \
  --name metabase metabase/metabase

But for Beanstalk, I don’t see any entry for environment variables in the Dockerrun.aws.json file:
https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/single-container-docker-configuration.html#single-container-docker-configuration.dockerrun

Not sure what I’m missing here

Hi @fera320
What are you trying to do?
The EB setup is almost automatic if you launch it through the site:
https://metabase.com/start/aws/

I want to:

  1. Create an RDS instance outside the Beanstalk environment, so that if I terminate the Beanstalk environment the DB doesn’t get terminated.
  2. Start the Beanstalk image using this as a reference and connect to the previously created database.

But to do that I have to tell the Metabase instance where to connect. How do I do that? I’m not sure if it’s a Docker problem, a Beanstalk problem, or both.

@fera320
Okay. Just create the RDS before creating the EB environment; then they are not coupled, meaning that destroying the EB environment will not destroy the RDS.
You still select the RDS to connect to when you create the EB environment. The setup in Metabase should be done automatically, since it gets the variables from the EB setup.
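
To make the link between the two sets of variable names in this thread concrete, here is a minimal sketch. It assumes that the RDS_* properties set by Elastic Beanstalk end up as the MB_DB_* settings used in the `docker run` example above; the mapping table and the function name are illustrative assumptions, not a copy of Metabase’s actual startup script.

```python
# Hypothetical mapping from Elastic Beanstalk's RDS_* environment properties
# to the MB_DB_* variables used in the docker example. This is an assumption
# for illustration, not Metabase's real startup logic.
RDS_TO_METABASE = {
    "RDS_HOSTNAME": "MB_DB_HOST",
    "RDS_PORT": "MB_DB_PORT",
    "RDS_DB_NAME": "MB_DB_DBNAME",
    "RDS_USERNAME": "MB_DB_USER",
    "RDS_PASSWORD": "MB_DB_PASS",
}


def metabase_env_from_rds(env, db_type="postgres"):
    """Return the MB_DB_* settings implied by the RDS_* entries in env."""
    missing = [name for name in RDS_TO_METABASE if name not in env]
    if missing:
        raise KeyError("missing RDS properties: " + ", ".join(sorted(missing)))
    result = {"MB_DB_TYPE": db_type}
    for rds_name, mb_name in RDS_TO_METABASE.items():
        # Environment variables are always strings, so normalise here.
        result[mb_name] = str(env[rds_name])
    return result
```

Calling `metabase_env_from_rds(os.environ)` inside a container would fail loudly if one of the RDS_* properties is missing, which can be easier to spot than a generic 502 from the proxy.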

@flamber
I created the RDS instance, then launched the Beanstalk application after adding environment variables to the 01_metabase.config file, but it still isn’t working: when I connect, I get a 502 gateway error. I’m not sure what I’m missing. The only other thing I had to do was change the launch URL so that it doesn’t force me to set up an RDS instance; that is, when I open this link I remove all the rds% query parameters.

This is what the 01_metabase.config file looks like (I left the container_commands section as is):

option_settings:
  - namespace: aws:elasticbeanstalk:command
    option_name: Timeout
    value: 600
  - option_name: RDS_DB_NAME
    value: metabase
  - option_name: RDS_PORT
    value: 5432
  - option_name: RDS_USERNAME
    value: dbadmin
  - option_name: RDS_PASSWORD
    value: 12345678
  - option_name: RDS_HOSTNAME
    value: postgres-test-1.XXXXX.us-east-1.rds.amazonaws.com
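
As an alternative to editing the zipped config file, the same environment properties can be written in the JSON shape that the AWS CLI accepts for `aws elasticbeanstalk update-environment --option-settings`. The sketch below only builds that JSON; the helper name is hypothetical, and reading the password from a local `RDS_PASSWORD` variable is an assumption made to keep it out of the script.

```python
import json
import os

# Namespace Elastic Beanstalk uses for environment properties (env variables).
ENV_NAMESPACE = "aws:elasticbeanstalk:application:environment"


def to_option_settings(properties):
    """Convert a {name: value} dict into Elastic Beanstalk option settings."""
    return [
        {"Namespace": ENV_NAMESPACE, "OptionName": name, "Value": str(value)}
        for name, value in sorted(properties.items())
    ]


if __name__ == "__main__":
    settings = to_option_settings({
        "RDS_DB_NAME": "metabase",
        "RDS_PORT": 5432,
        "RDS_USERNAME": "dbadmin",
        # Assumption: the password is supplied by the local shell, not hard-coded.
        "RDS_PASSWORD": os.environ.get("RDS_PASSWORD", ""),
        "RDS_HOSTNAME": "postgres-test-1.XXXXX.us-east-1.rds.amazonaws.com",
    })
    print(json.dumps(settings, indent=2))
```

The printed JSON could be saved to a file such as `options.json` (a made-up name) and passed with `--option-settings file://options.json`. The values would still be visible in the environment’s configuration afterwards; this only keeps them out of the source bundle.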

@fera320
What do you see in the Metabase log (or EB log)?
But if you have already created the RDS, then you should be able to connect to that from the EB setup instead of having EB create a new instance, without modifying any config.
Otherwise, create the RDS from EB, and then decouple it following the AWS guide: https://aws.amazon.com/premiumsupport/knowledge-center/decouple-rds-from-beanstalk/

The problem was that the metabase database hadn’t been created. It looks like that isn’t automatic when the RDS instance already exists (which makes sense).

Here are the steps I followed, in case somebody needs them.

  1. Create a Postgres DB instance on RDS, then create the metabase database in it.
  2. Get the Beanstalk config file.
  3. Modify 01_metabase.config to include the DB credentials from step 1 (see the post above for a sample), then zip the files again.
  4. Launch the Beanstalk image following the Metabase documentation, but remove the query string parameters related to RDS so it doesn’t ask you to enter those values when launching the environment.
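
For step 1, one possible way to create the empty database is a single `psql` call against the new instance. The sketch below only builds that command; it assumes `psql` is installed locally, that the instance has the default `postgres` maintenance database to connect to, and that the function name is just an illustration.

```python
import shlex


def create_database_command(host, user, new_db="metabase", port=5432):
    """Build a psql invocation that creates new_db on an existing server."""
    # Keep the database name to a plain identifier so it is safe to embed.
    if not new_db.isidentifier():
        raise ValueError("unexpected database name: " + repr(new_db))
    return [
        "psql",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", "postgres",  # assumed default database to connect to
        "--command", 'CREATE DATABASE "{}";'.format(new_db),
    ]


if __name__ == "__main__":
    cmd = create_database_command(
        "postgres-test-1.XXXXX.us-east-1.rds.amazonaws.com", "dbadmin"
    )
    # Print a copy-pasteable shell command; psql prompts for the password.
    print(shlex.join(cmd))
```

The list form can also be handed to `subprocess.run(cmd, check=True)` from a machine that is allowed to reach the RDS instance.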

The only thing I don’t like about this is that the password shows up in the config values, but oh well.

@flamber Thanks for the help. If you see something that is not best practice, let me know, but this seems better than creating the RDS instance from Beanstalk, detaching it, and attaching it again.


Thanks @fera320 for posting this for those of us looking to do the same thing; it helps a lot.

Where in the Zip file are the querystring parameters in step 4?

That password issue, does it hang around in any way that could be exploited after all is said and done like in a stored rebuild or update file? If so can that be omitted or eliminated?

Any issues in doing this for MySQL rather than Postgres?

Where in the Zip file are the querystring parameters in step 4?

I’m talking about the AWS Console here. Just remove those parameters from the launch URL. I’ve been trying a lot of things, so maybe this isn’t needed and you can play around with it, but after you change that you will notice that on the next screen Beanstalk no longer asks you for the username and password of the RDS instance.

That password issue, does it hang around in any way that could be exploited after all is said and done like in a stored rebuild or update file? If so can that be omitted or eliminated?

That’s the part I don’t like about this. The password stays as a configuration variable on the Beanstalk app. To avoid this, you could probably store it encrypted on S3 and read the values from there, but you’d probably need to change the Metabase deployment code, which I really don’t want to do.

Another thing for production environments: this is mentioned in the documentation but not in the Beanstalk guide. You want to add a secret so that database connection details are encrypted. https://www.metabase.com/docs/v0.33.3/operations-guide/encrypting-database-details-at-rest.html
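
As a rough sketch of generating such a secret: as far as I understand the linked page, Metabase reads it from the `MB_ENCRYPTION_SECRET_KEY` environment variable and expects at least 16 characters. The function name and the 32-byte default below are my own choices, not something the documentation prescribes.

```python
import secrets

# Minimum length Metabase is understood to require for the secret key.
MIN_KEY_LENGTH = 16


def generate_encryption_key(num_bytes=32):
    """Return a random URL-safe string suitable for use as a secret key."""
    key = secrets.token_urlsafe(num_bytes)
    if len(key) < MIN_KEY_LENGTH:
        raise ValueError("key too short; use a larger num_bytes")
    return key


if __name__ == "__main__":
    # Set the printed value as MB_ENCRYPTION_SECRET_KEY in the environment.
    print(generate_encryption_key())
```

The value would be set as another environment property on the Beanstalk environment. Keep a copy somewhere safe, since details encrypted with the key cannot be read back without it.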

Any issues in doing this for MySQL rather than Postgres?

I haven’t tried it with MySQL.