Fully optimizing a quantum heat engine requires operating at high power, high efficiency, and high stability (i.e., low power fluctuations). However, these three objectives cannot be optimized simultaneously, as indicated by the so-called thermodynamic uncertainty relations, and a systematic approach to finding optimal trade-offs among them that accounts for power fluctuations has so far been elusive. Here we propose a general framework to identify Pareto-optimal cycles for driven quantum heat engines that trade off power, efficiency, and fluctuations. We then employ reinforcement learning to identify the Pareto front of a quantum-dot-based engine and find abrupt changes in the form of the optimal cycles when switching between optimizing two and three objectives. We further derive analytical results in the fast- and slow-driving regimes that accurately describe different regions of the Pareto front.
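
To make the notion of Pareto optimality concrete, the following minimal sketch filters a set of candidate cycles down to its non-dominated subset, where a cycle is characterized by the triple (power, efficiency, power fluctuations). It is only an illustration of the definition: the function names and the numerical values are hypothetical and are not taken from this work, and the brute-force filter stands in for, rather than reproduces, the reinforcement-learning procedure used here.

```python
from typing import List, Tuple

# Hypothetical representation of a cycle's performance:
# (power, efficiency, fluctuations). Power and efficiency are to be
# maximized; fluctuations are to be minimized.
Point = Tuple[float, float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    pa, ea, fa = a
    pb, eb, fb = b
    no_worse = pa >= pb and ea >= eb and fa <= fb
    strictly_better = pa > pb or ea > eb or fa < fb
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative, made-up values (arbitrary units), not results from the paper.
candidates: List[Point] = [
    (1.0, 0.30, 0.50),  # high power, low efficiency
    (0.8, 0.45, 0.40),  # balanced
    (0.7, 0.40, 0.60),  # worse than the balanced cycle in all three objectives
    (0.2, 0.60, 0.05),  # low power, high efficiency, very stable
]

front = pareto_front(candidates)
for power, efficiency, fluctuations in front:
    print(f"P={power:.2f}  eta={efficiency:.2f}  dP={fluctuations:.2f}")
```

In this toy set the third candidate is removed because the second one is at least as good in every objective; the remaining three cycles are mutually incomparable and together form the front.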