Reinforcement learning decoders for fault-tolerant quantum computation