Kernel methods are a cornerstone of classical machine learning, and the idea of using quantum computers to compute kernels has recently attracted attention. Quantum embedding kernels (QEKs), constructed by embedding data into the Hilbert space of a quantum computer, are a quantum kernel technique that is particularly suitable for noisy intermediate-scale quantum devices. Unfortunately, kernel methods face three major problems: constructing the kernel matrix has quadratic computational complexity in the number of training samples, choosing the right kernel function is nontrivial, and the effects of noise are unknown. In this work, we addressed the latter two. In particular, we introduced the notion of trainable QEKs, inspired by classical model optimization methods. To train the parameters of the QEK, we proposed the use of kernel-target alignment. We verified the feasibility of this method and showed that, for our experimental setup, it reduced the training error significantly. Furthermore, we investigated the effects of device noise and finite sampling noise, and we evaluated various mitigation techniques numerically on classical hardware. We then applied the best-performing strategy to data from a real quantum processing unit and found that it increased the quality of the kernel matrix.
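For orientation, kernel-target alignment is commonly defined as the normalized Frobenius inner product between the kernel matrix K and the ideal target matrix y y^T built from the labels. The following is a minimal classical sketch of that quantity, assuming binary labels in {+1, -1}; the function name `kernel_target_alignment` is hypothetical and this is not the implementation used in this work.

```python
import numpy as np

def kernel_target_alignment(K, y):
    """Normalized Frobenius inner product between K and y y^T.

    Assumes y holds binary labels in {+1, -1} (an assumption of this sketch).
    """
    K = np.asarray(K, dtype=float)
    y = np.asarray(y, dtype=float)
    target = np.outer(y, y)          # ideal kernel: +1 for same class, -1 otherwise
    inner = np.sum(K * target)       # Frobenius inner product <K, y y^T>
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm
    return inner / (np.linalg.norm(K) * np.linalg.norm(target))

# Illustrative labels only, not data from this work
y = np.array([1, 1, -1, -1])
ideal_alignment = kernel_target_alignment(np.outer(y, y), y)   # kernel equal to the target
identity_alignment = kernel_target_alignment(np.eye(4), y)     # kernel carrying no label information
```

In a trainable QEK, the entries of K would depend on the embedding parameters, and those parameters would be adjusted to increase this alignment on the training data.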