dc.description.abstract
Intrinsic disorder in proteins is a door to an entire section of biology that had remained invisible for
many years, escaping comprehensive study. Over time, as the methods to probe disorder became better
tailored to the task, it appeared that intrinsically disordered proteins (IDPs) were just as important and
central to biological processes as any other known factor. The focus of this work revolves around two main
axes. The first is the application of Nuclear Magnetic Resonance (NMR) to the study of the clathrin
assembly protein 180 (AP180), an intrinsically disordered protein central to clathrin-mediated
endocytosis in neurons. The second axis concerns improving a conformational ensemble-building
strategy for the study of disordered proteins by incorporating data from different methods, namely NMR,
single-molecule Förster resonance energy transfer (smFRET), and small-angle X-ray scattering (SAXS).
Clathrin-mediated endocytosis (CME) is one of the most prominent ways for eukaryotic cells to
interact with their environment. This process allows them to internalize external cargo as well as membrane
and membrane-associated components into vesicles, in this case clathrin-coated vesicles (CCV).
However, the molecular fine-tuning of the process escapes exhaustive description, since many of the
actors involved bear heavily disordered regions, which are inherently challenging to study. AP180 is
involved in the early stages of neuronal CME and is one of the most disordered proteins in the system. It
does bind the membrane through a folded domain that has been extensively characterized, and yet, most of
the knowledge about its long disordered cytosolic tail is limited to having located a number of hypothetical
binding regions, i.e. short consensus sequence stretches of 3 to 15 residues, known to promote interaction
with other main actors of CME. These other actors are the clathrin protein complex, the primary scaffolding
unit of the CCV, and the heterotetrameric clathrin adaptor protein 2 complex (AP2).
Here, the ~600 residue-long disordered AP180 was expressed in different ~200 amino-acid long
and slightly overlapping constructs to facilitate protein expression as well as NMR amide backbone
assignment, which hinted at a typical disordered chain without long-range self-interactions. Numerous
partner titrations were analyzed by two-dimensional 1H-15N correlation NMR experiments; they
were conducted across all AP180 stretches as well as the full intrinsically disordered region (IDR), and
under different conditions to best accommodate the needs of the tested interaction partners. The three
most tested partners were the alpha and beta appendages of AP2 (AP2α and AP2β2), and the clathrin heavy
chain terminal domain (CHCTD), all being known targets of the consensus binding sequence motifs that
exist along AP180. These titration experiments, analyzed in terms of chemical shift perturbations,
intensity ratios, and R1ρ relaxation, first confirmed the relevance of the motifs, since the expected partner
proteins did bind to their associated target. This finding supported the idea that AP180 plays a role in early
endocytosis by allowing weak interactions to become increasingly relevant closer to the
membrane as the local gradient of existing motifs in space gets denser, increasing the number of binding
events. Binding affinities could be estimated on a residue-wise basis by using local R1ρ relaxation rates,
showing that motifs mostly shared the same binding strength. Differences could be ascribed to the nature of
the residues involved as well as the broader sequence context, but all motif affinities appeared to be of the same
order of magnitude (hundreds of micromolar). An interesting observation arose from the apparent ability of
motifs to bind the other partners as well as their known target, pushing forward the premise of motif
promiscuity, useful to endocytosis in the context of maintaining a network of interactions between all
partners.
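As a rough illustration of the per-residue quantities named above, the following sketch computes a combined 1H-15N chemical shift perturbation and a peak intensity ratio. The weighting factor applied to the 15N shift difference (n_scale) and the function names are assumptions made for illustration; the exact formula and weighting used in this work are not stated here.

```python
import math


def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.15):
    """Combined 1H-15N chemical shift perturbation, in ppm.

    delta_h_ppm and delta_n_ppm are the shift differences between the
    free and partner-bound spectra. n_scale down-weights the 15N
    dimension; its value here is an assumed, commonly used order of
    magnitude, not a value taken from this work.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def intensity_ratio(intensity_with_partner, intensity_free):
    """Ratio of peak intensities with and without partner.

    Values well below 1 point to residues broadened by binding.
    """
    return intensity_with_partner / intensity_free


# Hypothetical residue: small proton shift, larger nitrogen shift.
csp = combined_csp(0.03, 0.4, n_scale=0.1)
ratio = intensity_ratio(40.0, 100.0)
print(f"CSP = {csp:.3f} ppm, I/I0 = {ratio:.2f}")
```

Residues inside a binding motif would be expected to show larger perturbations and lower intensity ratios than the rest of the chain.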
A major finding was the discovery of a previously unknown, extended, and strong binding site for AP2β2,
dubbed the extended interaction motif (EIM). This extended site is composed of two neighboring regions
within AP180-IDR (residues 435-445 and 470-500), and shorter constructs of AP180 were designed around
this specific region for further investigation. The AP180 EIM – AP2β2 experiments yielded an exchange
rate of 662 ± 35 s−1, a bound population of 8.9 ± 0.97%, and an affinity toward the appendage
on the order of only a few micromolar, much stronger than what was observed for the individual motifs.
This affinity as well as the exchange rate were calculated for the entire binding region rather than
individual residues by acquiring and fitting NMR relaxation dispersion data, and the
affinity was also later validated by isothermal titration calorimetry, confirming that the EIM drives the
overall interaction between AP180 and AP2β2. It was also shown that upon saturation of AP180 with
excess AP2β2, the motif binding events became relevant again, suggesting a context-dependent interaction
pattern in which the EIM plays a larger role in earlier CME, i.e. in lower-concentration environments, whereas the
motifs could engage in crowd control, interacting with all available partners and tightening the network as
the vesicle forms. The first hypothesis for the role of this EIM was thus the potential for AP2 recruitment or
AP2 sensing at the nascent vesicle pits. This stronger binding region also happens to span the area within
AP180 that is the most bereft of observed clathrin binding events, suggesting that in a context where both
AP2 and clathrin are within reach of all AP180 motifs, such as the later stages of vesicle formation,
AP180 could be concertedly binding both partners without hindrance through the EIM and the motifs,
tightening the grip of the vesicle on the coat components by exploiting all the numerous binding sites.
While motifs could indeed predict binding events, it became clear through both the discovery of the EIM
and the promiscuity of the motifs that predictions alone were not enough to assess the behavior of this IDP,
supporting the importance of thorough screening of the interaction landscape, and highlighting the need for
developing tools that are able to efficiently provide these insights. This study has been published, and a
summary as well as the publication can be found in Chapter 5.
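To show how the exchange rate and bound population reported above for the EIM relate to each other, the sketch below assumes a simple two-state exchange model (free and bound), in which k_ex = k_on[L] + k_off and the bound population is p_B = k_on[L] / k_ex. The two-state model, the function names, and the free partner concentration argument are assumptions made for illustration; only the values 662 s−1 and 8.9% are taken from the text, and the fitting procedure actually used may differ.

```python
def two_state_rates(k_ex, p_bound):
    """Split an exchange rate into forward and reverse rates.

    Assumes two-state exchange: k_ex = k_on_app + k_off, with
    p_bound = k_on_app / k_ex. k_on_app is the pseudo-first-order
    association rate k_on * [L], in s^-1.
    """
    k_on_app = p_bound * k_ex
    k_off = (1.0 - p_bound) * k_ex
    return k_on_app, k_off


def kd_from_population(p_bound, free_partner_conc):
    """Dissociation constant from the bound population.

    From Kd = [P][L] / [PL] it follows that Kd = [L] * (1 - p_B) / p_B.
    free_partner_conc is the free (not total) partner concentration,
    which is not stated in the text; the result carries its units.
    """
    return free_partner_conc * (1.0 - p_bound) / p_bound


# Values reported in the text: k_ex = 662 s^-1, bound population 8.9%.
k_on_app, k_off = two_state_rates(662.0, 0.089)
print(f"k_on[L] = {k_on_app:.1f} s^-1, k_off = {k_off:.1f} s^-1")
# prints: k_on[L] = 58.9 s^-1, k_off = 603.1 s^-1
```

Under this model the dissociation rate dominates the exchange rate at low bound populations, and kd_from_population could convert the population into an affinity once the free partner concentration is known.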
As mentioned above, the second axis of this work focuses on the development aspect of a strategy
to better study disordered proteins. Describing IDPs properly can be a tedious task, and one of the best
ways to achieve this is through the use of conformational ensembles: collections of different conformers of
an IDP population, expected to accurately capture the properties and favored dynamics of the population of
interest. Building an ensemble requires the proper computational tools, but also high quality experimental
data to use as input. NMR methods perform remarkably well at providing such data:
ensembles generated from NMR inputs cover most of the relevant timescales for the dynamics of IDPs, but
lack high-quality descriptions of the longest distance regimes, with NMR hitting a limit at around
2.5 nanometers, which is too short to inform on structural parameters such as the overall shape and size
sampled by a protein. That weakness can be mitigated by using restraints from other techniques in the
generation of ensembles. One such technique is smFRET, with a sensitive distance regime of ~2.5 to ~10
nanometers, synergistic with NMR. Part of my project revolved around aligning, characterizing, and then
using a confocal single-molecule FRET setup in order to incorporate long-distance restraints into an
ensemble-building approach. The system of interest was a 110-residue-long disordered stretch of the N-terminus
of the measles virus phosphoprotein, and the incorporation of longer-range distances produced
more reliable output ensembles that provided the new insight of an overall higher compaction of the protein
than previously thought. The setup process is presented in Chapter 6, and the results of the proof-of-concept
have been published; they are summarized alongside the publication in Chapter 7.
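The distance range quoted above for smFRET follows from the steep distance dependence of the transfer efficiency, E = 1 / (1 + (r / R0)^6). The sketch below evaluates this standard Förster relation at a few distances. The Förster radius of 5 nm is an assumed, typical value for illustration and is not the value of the dye pair used in this work.

```python
def fret_efficiency(r_nm, r0_nm=5.0):
    """FRET efficiency for a donor-acceptor distance r_nm.

    r0_nm is the Förster radius, the distance at which E = 0.5.
    The default of 5 nm is an assumed typical value, not taken
    from this work.
    """
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Efficiency at the edges and the centre of the ~2.5 to ~10 nm range.
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.3f}")
# prints:
# r =  2.5 nm -> E = 0.985
# r =  5.0 nm -> E = 0.500
# r = 10.0 nm -> E = 0.015
```

With this assumed radius, the efficiency saturates near 1 below about 2.5 nm and falls close to 0 above about 10 nm, which is why the method is most informative between those distances.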
en