dc.contributor.author
Erdman, Paolo A.
dc.contributor.author
Czupryniak, Robert
dc.contributor.author
Bhandari, Bibek
dc.contributor.author
Jordan, Andrew N.
dc.contributor.author
Noé, Frank
dc.contributor.author
Eisert, Jens
dc.contributor.author
Guarnieri, Giacomo
dc.date.accessioned
2025-05-16T07:33:11Z
dc.date.available
2025-05-16T07:33:11Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/47676
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-47394
dc.description.abstract
Feedback control of open quantum systems is of fundamental importance for practical applications in various contexts, ranging from quantum computation to quantum error correction and quantum metrology. Its use in the context of thermodynamics further enables the study of the interplay between information and energy. However, deriving optimal feedback control strategies is highly challenging, as it involves the optimal control of open quantum systems, the stochastic nature of quantum measurement, and the inclusion of policies that maximize a long-term time- and trajectory-averaged goal. In this work, we employ a reinforcement learning approach to automate and capture the role of a quantum Maxwell's demon: the agent takes the literal role of discovering optimal feedback control strategies in qubit-based systems that maximize a trade-off between measurement-powered cooling and measurement efficiency. Considering weak or projective quantum measurements, we explore different regimes based on the ordering between the thermalization, the measurement, and the unitary feedback timescales, finding different and highly non-intuitive, yet interpretable, strategies. In the thermalization-dominated regime, we find strategies with elaborate finite-time thermalization protocols conditioned on measurement outcomes. In the measurement-dominated regime, we find that optimal strategies involve adaptively measuring different qubit observables reflecting the acquired information, and repeating multiple weak measurements until the quantum state is 'sufficiently pure', leading to random walks in state space. Finally, we study the case when all timescales are comparable, finding new feedback control strategies that considerably outperform more intuitive ones. We discuss a two-qubit example in which we explore the role of entanglement, and conclude by discussing the scaling of our results to quantum many-body systems.
en
dc.format.extent
34 pages
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
quantum thermodynamics
en
dc.subject
machine learning
en
dc.subject
optimal control theory
en
dc.subject
quantum feedback control
en
dc.subject.ddc
500 Natural sciences and mathematics::530 Physics::530 Physics
dc.title
Artificially intelligent Maxwell's demon for optimal control of open quantum systems
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
025047
dcterms.bibliographicCitation.doi
10.1088/2058-9565/adbccf
dcterms.bibliographicCitation.journaltitle
Quantum Science and Technology
dcterms.bibliographicCitation.number
2
dcterms.bibliographicCitation.volume
10
dcterms.bibliographicCitation.url
https://doi.org/10.1088/2058-9565/adbccf
refubium.affiliation
Mathematik und Informatik
refubium.affiliation
Physik
refubium.affiliation.other
Institut für Mathematik

refubium.affiliation.other
Dahlem Center für komplexe Quantensysteme

refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
2058-9565
refubium.resourceType.provider
WoS-Alert