I do not believe 0.18 itself is the cause of the problem. I strongly suspect the problem is in the federation logic, specifically in how peer servers retry and reconnect to each other. Multiple servers going down in a short period to upgrade to 0.18 could be causing lemmy.ml to swarm internally within its outbound code, or to run into some other resource problem.

Lemmy.ml has hundreds of peer instances to distribute comments to.

Other instances could be having cascading problems if they have a lot of outbound messages to distribute.
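
To illustrate why retry behavior matters here, this is a toy Python sketch. It is not Lemmy's code, and I have not checked how Lemmy actually schedules retries. The function names, the 1-second interval, the 300-second cap, the 10-minute outage, and the count of 50 unreachable peers are all made-up assumptions. It only compares how many connection attempts a sender makes to peers that are down, with a fixed retry interval versus exponential backoff.

```python
def attempts_fixed(window_s, interval_s):
    """Attempts to one unreachable peer when retrying every interval_s seconds.

    One attempt at t=0, then one per interval until the outage window ends.
    """
    return int(window_s // interval_s) + 1


def attempts_backoff(window_s, base_s, cap_s):
    """Attempts to one unreachable peer with exponential backoff.

    The delay doubles after every failure and is capped at cap_s.
    """
    t = 0.0
    delay = base_s
    attempts = 0
    while t <= window_s:
        attempts += 1          # try to connect (and fail, peer is down)
        t += delay             # wait before the next try
        delay = min(cap_s, delay * 2)
    return attempts


if __name__ == "__main__":
    outage_s = 600     # hypothetical: each peer is down 10 minutes to upgrade
    peers_down = 50    # hypothetical: number of peers upgrading at once

    fixed = attempts_fixed(outage_s, interval_s=1) * peers_down
    backoff = attempts_backoff(outage_s, base_s=1, cap_s=300) * peers_down

    print("fixed-interval attempts:", fixed)
    print("backoff attempts:", backoff)
```

Under these assumed numbers the fixed-interval sender makes a few hundred attempts per peer while the backoff sender makes around ten, and that gap multiplies by every peer that is down at the same time. If the outbound code behaves more like the first case, many simultaneous upgrades could plausibly tie up workers or connections.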