We are finally ready to use Metabase in production.
We have run multiple tests in our dev environment on AWS without any issue.
We moved on to our prod environment and built the stack as specified here, with the latest AWS source bundle file.
I have done this step multiple times without any issue, and this time, bam, an error:
2022-12-16 10:44:53 UTC+0100
Create environment operation is complete, but with command timeouts. Try increasing the timeout period. For more information, see troubleshooting documentation.
2022-12-16 10:43:51 UTC+0100
Command execution completed on all instances. Summary: [Successful: 0, TimedOut: 1].
2022-12-16 10:43:51 UTC+0100
The following instances have not responded in the allowed command timeout time (they might still finish eventually on their own): [i-022c337c3c1e4db9c].
2022-12-16 10:38:52 UTC+0100
Environment health has transitioned from Pending to Severe. ELB processes are not healthy on all instances. Initialization in progress on 1 instance. 0 out of 1 instance completed (running for 11 minutes). None of the instances are sending data. ELB health is failing or not available for all instances.
2022-12-16 10:30:53 UTC+0100
Added instance [i-022c337c3c1e4db9c] to your environment.
It seems a timeout occurred during the build, and the app does not work afterwards: the URL returns 502 Bad Gateway.
I do not have any clue how to solve this.
Has anyone faced this problem?
The endpoint is set correctly in my config:
@newza Okay, there must be a difference between your environments if you say exactly the same setup works in your test environments.
The log you have provided indicates a health check is not working and causing it to restart.
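For anyone hitting the same symptom, here is a minimal sketch of probing the health endpoint by hand from a machine inside the same VPC. It assumes the load balancer health check points at Metabase's /api/health endpoint, which answers 200 with {"status": "ok"} once startup has finished. The function names and the example URL are made up for illustration.

```python
import json
import urllib.error
import urllib.request


def is_healthy_response(status, body_text):
    """Return True only for HTTP 200 with a JSON body of {"status": "ok"}."""
    if status != 200:
        return False
    try:
        body = json.loads(body_text)
    except ValueError:
        # Body is not JSON, e.g. an HTML error page from a proxy.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"


def metabase_is_healthy(base_url, timeout=5.0):
    """GET <base_url>/api/health and report whether Metabase says it is ready."""
    url = base_url.rstrip("/") + "/api/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy_response(resp.status, resp.read().decode("utf-8"))
    except OSError:
        # Covers connection refused, timeouts and HTTP errors such as 502/503
        # (urllib.error.URLError is a subclass of OSError).
        return False


# Example call (hypothetical private address of the instance):
# print(metabase_is_healthy("http://10.0.1.23:3000"))
```

If this returns True from inside the VPC while the environment still reports failing ELB health, the problem is more likely in routing or security groups than in Metabase itself.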
@flamber OK, I will check my VPCs for differences. I think that is the only thing that could stop the health check.
I have been able to deploy, even if I do not totally understand why, by changing the subnet from private to public for the Metabase instance. Something must be blocking the health check, but I did not find what.
But now I am facing another unexpected issue. I have created an RDS PostgreSQL instance exclusively for Metabase.
It is in the same subnet group and VPC as the Metabase instance, but when I change the software configuration, the update fails after 15 minutes and rolls back. There are no logs in CloudWatch and no logs in Metabase to explain why it is failing.
How can I find some logs to help me understand?
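One place to look when CloudWatch is empty is the environment's own event stream, the same messages as the ones pasted at the top of this thread. A minimal sketch, assuming boto3 is installed and AWS credentials are configured; the environment name "metabase-prod" and the helper name recent_events are placeholders:

```python
def recent_events(client, environment_name, max_records=20):
    """Return recent Elastic Beanstalk events as 'timestamp severity message' lines.

    `client` is expected to be a boto3 Elastic Beanstalk client (or anything
    exposing the same describe_events method).
    """
    response = client.describe_events(
        EnvironmentName=environment_name,
        MaxRecords=max_records,
    )
    lines = []
    for event in response.get("Events", []):
        lines.append(
            f"{event['EventDate']} {event['Severity']:<5} {event['Message']}"
        )
    return lines


# Usage (needs boto3 and credentials, so it is left commented out here):
# import boto3
# client = boto3.client("elasticbeanstalk")
# for line in recent_events(client, "metabase-prod"):
#     print(line)
```

The same client also exposes request_environment_info and retrieve_environment_info, which can fetch a tail of the instance logs when the instance is reachable enough to answer.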
@newza Without any logs whatsoever, then I would not know what the problem is. And I don't know what "software configuration" change you have made.
That is part of the problem: no logs at all.
I wanted to add the database information: MB_DB_DBNAME, MB_DB_HOST, MB_DB_PASS, MB_DB_PORT, MB_DB_TYPE, MB_DB_USER
I have succeeded by adding another security group to the instance.
I then allowed this new security group to access the security group of the database.
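A security group problem like this one can usually be confirmed from the Metabase instance itself with a plain TCP check against the database endpoint. A rough sketch, assuming it runs on the instance with MB_DB_HOST and MB_DB_PORT set as above; the helper names are made up and the 5432 default is just the usual PostgreSQL port:

```python
import os
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or name resolution failed.
        return False


def check_metabase_db(env=os.environ, timeout=5.0):
    """Check TCP reachability of the database named by MB_DB_HOST / MB_DB_PORT."""
    host = env.get("MB_DB_HOST")
    if not host:
        raise ValueError("MB_DB_HOST is not set")
    port = int(env.get("MB_DB_PORT", "5432"))  # assumed default: PostgreSQL port
    return can_reach(host, port, timeout=timeout)


# Example call on the instance:
# print(check_metabase_db())
```

A timeout here (rather than an immediate refusal) typically points at a security group or network ACL silently dropping the traffic, which would match an update that hangs for 15 minutes and then rolls back.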
Why was the basic setup described in step 3 not working? I do not know. I think I have some specific rules on my subnets that are messing with the config.
So I managed to set up the project, but not as expected.
I am sorry for the waste of time. VPC networks are not easy.
@newza VPC is definitely complicated, but EB just makes everything more complex, so when something isn't working, then it's really difficult to troubleshoot.
If there are no logs at all, then something is completely wrong and it almost sounds like the logging has been excluded. I would recommend trying plain EC2 instead of EB.
I have already seen you recommend plain EC2.
Why is that? Is it easier to set up? Less magical?
Do you have some kind of tutorial?
@newza Too much "magic" going on with EB. If you just have a plain server you control, then you'll follow the normal JAR or Docker installs: https://www.metabase.com/docs/latest/installation-and-operation/start