We aimed to assess the effects of hyperparameter tuning and automatic image augmentation for deep learning-based classification of orthodontic photographs along the Angle classes. Our dataset consisted of 605 images of Angle class I, 1038 images of class II, and 408 images of class III. We trained ResNet architectures for classification of different combinations of learning rate and batch size. For the best combination, we compared the performance of models trained with and without automatic augmentation using 10-fold cross-validation. We used GradCAM to increase explainability, which can provide heat maps containing the salient areas relevant for the classification. The best combination of hyperparameters yielded a model with an accuracy of 0.63-0.64, F1-score 0.61-0.62, sensitivity 0.59-0.65, and specificity 0.80-0.81. For all metrics, it was apparent that there was an ideal corridor of batch size and learning rate combinations; smaller learning rates were associated with higher classification performance. Overall, the performance was highest for learning rates of around 1-3 x 10(-6) and a batch size of eight, respectively. Additional automatic augmentation improved all metrics by 5-10% for all metrics. Misclassifications were most common between Angle classes I and II. GradCAM showed that the models employed features relevant for human classification, too. The choice of hyperparameters drastically affected the performance of deep learning models in orthodontics, and automatic image augmentation resulted in further improvements. Our models managed to classify the dental sagittal occlusion along Angle classes based on digital intraoral photos.