Molecular & Cellular Proteomics (Mol. Cell Proteomics; MCP). ISSN 1535-9476, 1535-9484. The American Society for Biochemistry and Molecular Biology. Article identifiers: 26290498; 4597156; O115.052209; doi 10.1074/mcp.O115.052209.

Technological Innovations and Resources

A High Through-put Platform for Recombinant Antibodies to Folded Proteins*

Running title: Recombinant Antibody Platform for Folded Proteins

Michael Hornsby¶¶, Marcin Paduch§,¶¶, Shane Miersch¶¶, Annika Sääf§,¶¶, Tet Matsuguchi, Brian Lee, Karolina Wypisniak, Allison Doak, Daniel King§, Svitlana Usatyuk§, Kimberly Perry§, Vince Lu§, William Thomas§, Judy Luke§, Jay Goodman§, Robert J. Hoey§, Darson Lai§, Carly Griffin, Zhijian Li, Franco J. Vizeacoumar**, Debbie Dong, Elliot Campbell, Stephen Anderson, Nan Zhong‡‡, Susanne Gräslund‡‡, Shohei Koide§, Jason Moffat, Sachdev Sidhu§§,¶¶, Anthony Kossiakoff§,§§,¶¶, James Wells§§,¶¶

From the ‡Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94158; §Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637; ¶Donnelly Center for Cellular and Biomolecular Research, Department of Molecular Genetics, University of Toronto, Toronto, MG5 1L6, Canada; ‖Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey 08854; **Saskatchewan Cancer Agency, University of Saskatchewan, Saskatoon, S7N 4H4, Canada; ‡‡Structural Genomics Consortium, Toronto, M5G Il7, Canada

§§ To whom correspondence should be addressed: Department of Pharmaceutical Chemistry, University of California, San Francisco, CA. E-mail: Jim.Wells@ucsf.edu; Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637. E-mail: koss@bsd.uchicago.edu; Donnelly Center for Cellular and Biomolecular Research, Department of Molecular Genetics, University of Toronto, Toronto, MG51L6 Canada. E-mail: sachdev.sidhu@utoronto.ca

¶¶ These authors contributed equally to the work.

Volume 14, issue 10, pages 2833–2847; issue date 10/2015. Other dates in the record: 19/8/2015, 15/6/2015, 13/8/2015. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc. Author's Choice: Final version free via Creative Commons CC-BY license.

Antibodies are key reagents in biology and medicine, but commercial sources are rarely recombinant and thus do not provide a permanent and renewable resource. Here, we describe an industrialized platform to generate antigens and validated recombinant antibodies for 346 transcription factors (TFs) and 211 epigenetic antigens. We describe an optimized automated phage display and antigen expression pipeline that in aggregate produced about 3000 sequenced fragment antigen-binding (Fab) domains that had high affinity (typically EC50 <20 nM), high stability (Tm ∼80 °C), good expression in E. coli (∼5 mg/L), and the ability to bind antigen in complex cell lysates. We evaluated a subset of Fabs generated to homologous SCAN domains for binding specificities. These Fabs were monospecific to their target SCAN antigen except in rare cases where they cross-reacted with a few highly related antigens. Remarkably, immunofluorescence experiments in six cell lines for 270 of the TF antigens, each having multiple antibodies, show that ∼70% stain predominantly in the cytosol and ∼20% stain in the nucleus, which reinforces the dominant role that translocation plays in TF biology. These cloned antibody reagents are being made available to the academic community through our web site recombinant-antibodies.org to allow a more system-wide analysis of TF and chromatin biology. We believe these platforms, infrastructure, and automated approaches will facilitate the next generation of renewable antibody reagents to the human proteome in the coming decade.

Funding: National Institutes of Health (http://dx.doi.org/10.13039/100000009); grants 1U54HG006436 and 1S10OD01237.

Antibodies are crucial reagents for biological research and therapeutics. However, reproducibility of antibody reagents is a major concern, especially for polyclonals and even for monoclonals, where genetic drift of hybridoma stocks can be problematic (1, 2). Moreover, some have estimated that fewer than half of animal-derived antibodies bind their cognate native proteins (3, 4). The systematic generation of recombinant antibodies would provide a renewable collection of cloned and highly validated antibody genes and a permanent validation database (5, 6). Recombinant antibodies also afford a biosynthetic tool kit for recombination and gene fusions to generate new sensors and functional modulators. Other efforts for renewable antibody reagents (7–9) have highlighted the need to design robotics and high-throughput platforms for antigen production, antibody selections, and characterization (10).

One area of need for renewable antibody reagents is proteins involved in chromatin biology, including transcription factors (TFs) and epigenetic antigens. According to The Human Protein Atlas (HPA; www.proteinatlas.org), there are commercially available antibodies to 362 of the estimated 1550 human TFs (11) and none are from recombinant sources (www.antibodypedia.org). Thus, validated recombinant antibodies are needed to profile specific TF interactions and their spatial distribution. As a protein class, TFs have been particularly challenging because they contain multiple domains that are often intrinsically disordered (12, 13) and are thus difficult to express as full-length proteins. Hence the NIH Common Fund funded this broad effort to generate renewable antibody reagents to this class of proteins (1U54HG006436). Such antibodies would be an important resource for biologists interested in understanding trafficking of TFs, their expression patterns in cells at the protein level, and ultimately their binding sites and partners during signaling.

Recombinant technologies for antigen production and antibody selections are poised for a large-scale effort to generate renewable antibodies to chromatin remodeling proteins. Recombinant antibody generation by phage display is not dependent on animal immunizations, where control of the target protein is relinquished to the animal's immune system. Maintaining control of the protein status in vitro allows the user to customize selection conditions such as buffer, pH, temperature, and competitor proteins. In vitro methods eliminate the antigen proteolysis, clearance, and auto-antigen antiselection that occur in an animal setting (14). These bench-scale technologies are well-honed, and we believe that automating the in vitro selection technologies can fully realize their additional advantages: reducing the processing time from months to weeks, using much less antigen, and lowering cost.

Here we present an industrialized platform (Fig. 1A and 1B) for generating high affinity renewable antibodies at large scale, exemplified here for TFs and epigenetic factors (557 chromatin remodeling targets in total). Soluble domains of TFs were expressed in multiple high-throughput expression formats to ensure a soluble and intact antigen at a sufficient quantity for antibody selections (<1 mg). We utilized a highly stable and diverse synthetic Fab scaffold, displayed on filamentous phage (15). Robotic platforms were designed to rapidly select and characterize multiple Fabs of high affinity, specificity, stability, and expression in E. coli for each of 346 human TFs (representing >18 protein domain folds) and 211 different epigenetic proteins. Remarkably, immunofluorescence with multiple antibodies per TF in six different cell lines showed that about two thirds of human transcription factors tested reside predominantly in the cytosol, but the precise distribution is cell-line dependent. These data highlight the importance of translocation in TF biology. Our studies describe the pitfalls and viable solutions for a high-throughput platform that we believe will greatly accelerate the process of producing renewable, high quality, and evolvable antibody reagents to folded proteins. These cloned antibodies are available to the academic community for research purposes through recombinant-antibodies.org.

EXPERIMENTAL PROCEDURES

Antigen Sources

An NIH funded project was launched to rapidly develop and implement a high-throughput pipeline designed to generate high quality renewable recombinant antibodies to human transcription factors (U54 HG006436). TF antigens reported here were generated by Rutgers University, epigenetic antigens were generated by the Structural Genomics Consortium (SGC), and additional antigens were produced in-house at the Recombinant Antibody Network (RAN) facilities.

Bacterial Strains

E. coli strain XL1-blue (Stratagene, Santa Clara, CA) or T1 phage resistant cells were used for phage propagation, whereas strain DH10B (Invitrogen, Grand Island, NY) or other T1 phage resistant cells were used for plasmid DNA production. For Fab expression, C43 (DE3) Pro+ (16) or BL21 (DE3) were used. XL1-blue or T1 phage resistant cells were grown in the presence of 5 μg/ml tetracycline (Tet) to ensure expression of the F′ pilus that allowed for phage infection, and DH10B cells were grown with 50 μg/ml carbenicillin (Carb) for production of plasmid DNA. Fab expression cells, C43 (DE3) Pro+, were grown with 50 μg/ml Carb and 25 μg/ml kanamycin (Kan). Where biotinylation was required (antigen or Fab), recombinant BirA was co-expressed or added during purification (Avidity, LLC., Aurora, CO).

Antigen Cloning and Expression

The genes for transcription factor domains were identified by bioinformatics (17–19), synthesized (Life Technologies, Carlsbad, CA), and provided as E. coli expression constructs. To generate constructs for antigen expression by in vitro transcription and translation (IVTT) and in E. coli, gene fragments were PCR amplified from transfer plasmids and recombined into their specific expression vectors by SLICE or Ligase Independent Cloning techniques (20, 21). All expression plasmids were sequence verified prior to expression studies.

The antigen production group at Rutgers University or at the SGC provided purified, biotinylated human transcription factor domains or chromatin remodeling antigens as N-terminal AviTag fusions. For Rutgers antigens, expression constructs were synthesized (GenScript Inc., Piscataway, NJ), cloned into an expression vector containing an N-terminal AviTag, HisTag, and TEV cleavage site, and expressed in E. coli Tuner (DE3) harboring a pLysSRARE2 helper plasmid (EMD Millipore, Billerica, MA). Cells were harvested and lysed by sonication, and clarified lysate was loaded onto HisTrap HP (GE Healthcare, Piscataway, NJ) columns for purification. If the protein of interest was predominantly found in inclusion bodies, pellets were lysed in solubilization buffer (binding buffer containing 6 M urea), bound to the HisTrap column, and subjected to a slow refolding gradient into binding buffer without urea prior to elution. Pooled elution fractions were in vitro biotinylated (Avidity LLC), concentrated, and applied to a HiLoad 16/600 Superdex 75pg (GE Healthcare, Piscataway, NJ) column to remove unreacted biotin and ATP. Fractions corresponding to the monomeric peak were pooled and concentrated before being snap frozen in liquid N2 and stored at −80 °C until shipment (Campbell, E., and Anderson, S., in preparation). SGC produced antigens were expressed in E. coli, purified by Immobilized Metal Affinity Chromatography, and tested for biotinylation prior to entry into the Fab selection pipeline (22, 23).

Three protein expression systems were compared. The baseline was TF expression in E. coli as intracellular GST fusion constructs (His-GST-Avi-TEV-Antigen) driven from the pTac promoter (supplemental Fig. S1C). E. coli based expression of soluble GST-TF fusion antigens was conducted by standard protein expression methods. Briefly, E. coli cultures were grown in Terrific Broth with antibiotics to an OD600 of 0.6–0.8, induced with 1 mM IPTG for 3–6 h at 37 °C, then harvested by centrifugation. Cells were resuspended, lysed by sonication, and clarified prior to purification with GST Sepharose (24).

As a second system we expressed the TFs as His-Avi-TEV N-terminal fusions driven from a T7 promoter using an E. coli in vitro transcription-translation system (IVTT) that was kindly provided by Sutro Biosciences (25) (supplemental Fig. S1D). The cell-free reaction mixture contained an ATP regeneration system supplemented with chaperones and redox enzymes. Briefly, 10 μg of IVTT plasmid constructs were combined with 1 ml cell free extract containing recombinant BirA protein and 50 μM biotin and shaken at 650 rpm for 2 h in an Eppendorf Thermomixer. Expressed and biotinylated protein was purified over NiNTA agarose (Qiagen, Valencia, CA) with a PBS based buffer system and eluted with PBS + 400 mM imidazole. Protein samples were buffer exchanged with Amicon spin concentrators (EMD Millipore), concentrations were determined by Bradford BSA assay (Thermo Fisher, Grand Island, NY), and purity was analyzed by SDS-PAGE prior to entering the selection pipeline (25, 26).

In the third system, we displayed the antigen for selection on the surface of yeast (yeast antigen display, YAD) (27, 28). To generate constructs for YAD, gene fragments were PCR amplified from gene synthesis transfer plasmids and then recombined into the YAD plasmid. YAD constructs utilized an N-terminal Avi-6xHis-Aga2-TEV protease fusion partner with a C-terminal V5-His tag. The YAD strain, EBY100, was engineered to overexpress agglutinin AGA1 under the galactose inducible Gal1–10 promoter; AGA1 forms disulfide bonds with the antigen-Aga2 protein in the extracellular space. Constructs were biotinylated during secretion by a cytoplasmic or ER localized BirA, which allowed for efficient capture on streptavidin magnetic beads (Promega, Madison, WI) (29–31). The TF domain was expressed with a C-terminal V5 epitope tag to monitor expression level (supplemental Fig. S1E and S4). The proteins generated from these three systems were then tested for the ability to generate antibodies by phage display.

Fab-phage Selections

All phage selections were done according to previously established protocols (24, 26) with several modifications as outlined below. Briefly, Fab-phage selections were automated, allowing for multi-parallel processing of biotinylated target antigens either generated in-house or obtained from collaborators. Up to four rounds of soluble phage selections were conducted with biotinylated target antigens bound to streptavidin magnetic beads (Promega). The antigen concentration on the bead was systematically decreased with each successive selection round as follows: 100 nM in Round 1, 50 nM in Round 2, and 10 nM in Rounds 3 and 4. To maximize throughput and reduce the amount of phage library used, up to 8 antigen-streptavidin bead complexes were combined into one selection well for Round 1. Antigen-streptavidin beads were incubated with 1 × 10^13 Fab-phage particles from either Library E or F for one hour with gentle mixing on a KingFisher Flex magnetic bead separator (Thermo Scientific), followed by three washes in phosphate buffered saline supplemented with 0.05% Tween-20 and 0.2% bovine serum albumin (PBST+BSA). To reduce the deleterious effects of nonspecifically binding phage, we employed a “catch and release” strategy, where specific antigen binding Fab-phage were selectively eluted from the magnetic beads by the addition of 2 μg/well TEV protease for 10 min. Once liberated from the streptavidin beads, Fab-phage were introduced to 5 ml (Selection Round 1) or 100 μl (Selection Rounds 2–4) of exponentially growing E. coli XL1-Blue or T1 phage resistant cells and propagated overnight at 37 °C with shaking.

Once propagated, Fab-phage were recovered from culture medium with Protein A magnetic beads (EMD Millipore) on the KingFisher Flex. Briefly, 20–50 μl of a Protein A magnetic bead slurry was incubated with up to 1 ml culture supernatant for 60 min, then beads were collected and washed prior to elution of Fab-phage with 100 μM acetic acid for 10 min. Tris-base pH 11 buffer was added to neutralize the purified Fab-phage prior to further processing. The initial Fab-phage library and amplified eluents can contain a small subpopulation of streptavidin binding Fab-phage. Therefore, prior to each selection round, Fab-phage pools were incubated with 50 μl streptavidin beads for 10 min, and the beads were then removed with the KingFisher Flex to deplete the library of any nonspecifically binding Fab-phage. Plate based selections were done by immobilization of antigen on protein binding plates as described previously (24).

Purified phage from automated selections (rounds 3 or 4) were used to infect 100 μl E. coli XL1-blue cells grown to log phase (OD600 0.6–0.8) for 20 minutes prior to plating on LB + 50 μg/ml Carb Omnitray plates (Nunc Thermo Fisher, Grand Island, NY) such that at least 200 individual colonies were grown per plate. Single colonies were then picked either by a K6–2 Colony Picker (KBiosystems, San Diego, CA) or manually into 96-deep well blocks containing 0.5 ml 2xYT + 50 μg/ml Carb and 1 × 10^9 pfu/ml KO7 helper phage (New England Biolabs, Ipswich, MA). Liquid cultures were grown for 16–18 hours at 37 °C with shaking at 600–900 rpm prior to further processing.

Two 96-well immunoassay plates (Corning, Corning, NY) coated with 50 ng/well neutravidin (Thermo Fisher) were prepared for the ELISA validation of Fab-phage binders to one target antigen. A biotinylated target antigen was diluted to 20 nM in PBST + BSA, and 50 μl/well was dispensed into the two neutravidin plates, which were then incubated for 20 min. Direct Binding: Fab-phage containing culture supernatant was added to one antigen coated plate and incubated for 15 min. Competition ELISA: In a separate plate, 20 nM antigen was mixed with phage containing culture supernatant and incubated for 15 min with shaking. After incubation, the phage-antigen complex was added to the Competition plate and incubated for 15 min. Both plates were then washed 3 times with PBST on a BioTek EL406 or similar plate washer to remove unbound Fab-phage (or Fab-phage-antigen), followed by the addition of 50 μl anti-M13 phage-HRP antibody (GE Healthcare, Piscataway, NJ; diluted 1:5000 in PBST + BSA) and incubation for 30 minutes. Plates were again washed 3 times with PBST or PBS, then developed with 50 μl TMB substrate (KPL Labs, Gaithersburg, MD) and analyzed with a Tecan M1000 or similar plate reader monitoring OD650 for 5 min. Alternatively, signal development was stopped by the addition of 1 M phosphoric acid and OD450 signals were determined. ELISA data analysis was conducted by first plotting the OD650/min (or OD450). The Direct Binding signal was plotted on the Y-axis and the ratio of Competition/Direct signals (the competition ratio) was plotted on the X-axis. To be considered passing, a Fab-phage required a competition ratio of <0.5 and a Direct Binding signal >0.005 units. Passing Fab-phage were then subjected to DNA sequence analysis to determine uniqueness of the Fab CDR sequences.
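The pass/fail triage described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline's actual software: the competition ratio is taken as competition signal over direct signal (as defined in the Results), the two thresholds come from the text, and the `CloneReading` class, the clone names, and the readings are hypothetical.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; everything else here is illustrative.
MAX_COMPETITION_RATIO = 0.5
MIN_DIRECT_SIGNAL = 0.005  # OD650/min


@dataclass
class CloneReading:
    clone_id: str
    direct: float       # OD650/min on the antigen plate, no competitor
    competition: float  # OD650/min after pre-mixing with 20 nM soluble antigen


def competition_ratio(reading):
    """Competition signal divided by direct signal (lower = better competed)."""
    if reading.direct <= 0:
        return float("inf")
    return reading.competition / reading.direct


def passes(reading):
    """A clone passes when it gives enough signal and is competed by antigen."""
    return (reading.direct > MIN_DIRECT_SIGNAL
            and competition_ratio(reading) < MAX_COMPETITION_RATIO)


# Made-up readings for three wells of a 96-well plate.
readings = [
    CloneReading("A01", direct=0.040, competition=0.004),   # strong and competed
    CloneReading("A02", direct=0.030, competition=0.027),   # binds, not competed
    CloneReading("A03", direct=0.002, competition=0.0005),  # signal too low
]
passing = [r.clone_id for r in readings if passes(r)]
print(passing)
```

Keeping the thresholds as named constants makes it straightforward to substitute the OD450 end-point cutoff when the stopped assay is used instead of the kinetic read.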

Analysis of the sequencing results was automated with several scripts that recognized the sequences flanking the CDRs and returned both the DNA sequences and amino acid translations for the 4 CDRs of interest (LC3, HC1, HC2, and HC3). Duplicate Fab-phage sequences were removed and only unique Fab-phage clones passed into the cloning pipeline.
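A script of this kind might look like the following Python sketch. The original scripts are not shown in the text, so this is only an illustration of the idea: the flanking sequences in `FLANKS` are hypothetical placeholders (and only two of the four CDRs are included for brevity), and treating two clones as duplicates when their CDR amino acid sequences match is an assumption about how uniqueness was judged.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical (left, right) flanks; real framework sequences would go here.
FLANKS = {
    "HC3": ("TGTGCTCGT", "GACTACTGG"),
    "LC3": ("TGTCAGCAA", "ACTTTCGGA"),
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def extract_cdrs(read):
    """Return {cdr_name: (dna, protein)} or None if any flank pair is missing."""
    cdrs = {}
    for name, (left, right) in FLANKS.items():
        # Lazily match whole codons between the two flanking sequences.
        match = re.search(left + r"((?:[ACGT]{3})+?)" + right, read)
        if match is None:
            return None
        dna = match.group(1)
        cdrs[name] = (dna, translate(dna))
    return cdrs


def unique_clones(reads):
    """Keep the first clone seen for each distinct set of CDR protein sequences."""
    seen, unique = set(), []
    for read in reads:
        cdrs = extract_cdrs(read.upper())
        if cdrs is None:
            continue
        key = tuple(cdrs[name][1] for name in FLANKS)
        if key not in seen:
            seen.add(key)
            unique.append(cdrs)
    return unique


# Made-up reads: the first two carry the same CDRs, the third differs in HC3.
reads = [
    "AAATGTGCTCGTGGTTACTCTGACTACTGGCCCTGTCAGCAATCTTCTTATACTTTCGGA",
    "ggTGTGCTCGTGGTTACTCTGACTACTGGCCCTGTCAGCAATCTTCTTATACTTTCGGAtt",
    "AAATGTGCTCGTGCTTGGGTTGACTACTGGCCCTGTCAGCAATCTTCTTATACTTTCGGA",
]
unique = unique_clones(reads)
print(len(unique), [u["HC3"][1] for u in unique])
```

Reads in which a flank cannot be found are skipped here; a production script would more likely log them for manual review.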

Fab Expression and Purification

C43 (DE3) Pro+ E. coli containing expression plasmids were grown in Terrific Broth supplemented with 0.5% glycerol, Carb, Kan, Chlor, and 5 μM biotin to an OD600 of 0.6–0.8 at 37 °C, then Fab expression was induced by the addition of 1 mM IPTG. At the time of induction the incubation temperature was reduced to 30 °C and cultures were shaken for 16–18 h, at which time cells were harvested by centrifugation. Recombinant Fabs were purified by Protein A chromatography and buffer exchanged into PBS buffer for subsequent storage and validation assays.

Differential Scanning Fluorimetry

To assess the stability of the recombinant Fabs, we employed differential scanning fluorimetry (DSF), in which Sypro Orange (Invitrogen) binds to hydrophobic regions of partially or fully denatured proteins, giving a high fluorescent signal (32). DSF was conducted on either a Roche LC480 Lightcycler or a similar qRT-PCR instrument in either 96- or 384-well formats. Briefly, purified recombinant Fab was diluted to 2 μM in DSF buffer containing 4x Sypro Orange and PBS, then subjected to a temperature gradient (0.5 °C/30 s) from 50 to 95 °C. Data were continuously acquired at ∼490 nm excitation and ∼575 nm emission, then processed to generate first derivative curves where the curve peak corresponds to the melting temperature of the Fab.
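The first-derivative readout can be illustrated with a short Python sketch. It is a simplified stand-in for instrument software, not the analysis used here: the melt curve is simulated as an ideal sigmoid centered at 80 °C (an assumed value chosen to resemble the reported Fab stability), and the derivative is taken by central differences without smoothing.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at which dF/dT (central difference) is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = ((fluorescence[i + 1] - fluorescence[i - 1])
                 / (temps[i + 1] - temps[i - 1]))
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Simulated melt: 50 to 95 degrees C in 0.5 degree steps, as in the protocol.
temps = [50 + 0.5 * i for i in range(91)]
# Idealized unfolding transition; the 80 degree midpoint is an assumption.
signal = [1.0 / (1.0 + math.exp(-(t - 80.0) / 1.5)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm:.1f} C")
```

Real traces are noisier and usually fall off after the transition as the dye-protein complex aggregates, so smoothing before differentiation is typically needed.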

Spiked Immunoprecipitation

To assess Fab specificity, we utilized an immunoprecipitation method where target Fabs were first bound to streptavidin magnetic beads (Dynabeads M-280, Life Technologies, Grand Island, NY). Fab loaded beads were then incubated for 30 minutes with 50 nM biotinylated target antigen in the presence or absence of a HEK293 cellular extract (OD280 2.5 and OD260 3.68). The samples were washed and labeled with a Streptavidin-Fluorophore prior to analysis by flow cytometric methods described elsewhere (Koide et al., in preparation).

ELISA Affinity Measurement

To assess the affinity or specificity of Fabs, we utilized EC50 or single point ELISA assays conducted in 96- or 384-well ELISA plates pre-coated with neutravidin. Briefly, 20 nM biotinylated antigen was bound to the ELISA plate and incubated for 20 min prior to washing with PBST. Purified Fab protein was either diluted in a two- or threefold dilution series (EC50) or used at 20 nM (specificity assay), then transferred to prepared plates for 30 min prior to washing and development with anti-Flag-HRP antibody (1:5000 dilution; Sigma Aldrich, St Louis, MO) and TMB substrate.
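How an EC50 is read from such a dilution series can be sketched as follows. The text does not state the fitting method, so this Python example uses a deliberately simple approach, log-linear interpolation at the half-maximal signal, in place of a full curve fit; the threefold series starting at 1000 nM and the simulated 10 nM binder are assumptions made for illustration.

```python
import math


def ec50_by_interpolation(concs, signals):
    """Estimate EC50 by interpolating, on a log scale, to the half-maximal signal."""
    pairs = sorted(zip(concs, signals))
    low = min(s for _, s in pairs)
    high = max(s for _, s in pairs)
    half = (low + high) / 2.0
    for (c1, s1), (c2, s2) in zip(pairs, pairs[1:]):
        if s1 <= half <= s2 and s2 > s1:
            frac = (half - s1) / (s2 - s1)
            log_c = math.log10(c1) + frac * (math.log10(c2) - math.log10(c1))
            return 10 ** log_c
    return None  # the series never crosses the half-maximal signal


# Threefold dilution series from 1000 nM (8 points); starting point is assumed.
concs = [1000.0 / 3 ** i for i in range(8)]
# Simulated single-site binding with a true EC50 of 10 nM.
signals = [c / (c + 10.0) for c in concs]

ec50 = ec50_by_interpolation(concs, signals)
print(f"Estimated EC50: {ec50:.1f} nM")
```

Because the observed plateau slightly underestimates the true maximum and minimum, the interpolated value lands near, not exactly on, the simulated 10 nM; a four-parameter logistic fit would be the more usual choice for reported values.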

Immunofluorescence

Ninety-six-well cell carrier microtiter plates (Perkin Elmer, Waltham, MA) were seeded with 5 × 10^3 A2780Cis, 3345A, A2780, HPAFII, HEPG2 or 293T cells in individual wells, after which they were allowed to adhere by incubating overnight at 37 °C. After incubation, media was removed and cells were washed 1× with ice-cold PBS pH 7.4, then fixed by incubating with ice-cold methanol at −20 °C. Fixed cells were then permeabilized with 0.25% Triton X-100 in PBS, washed 3× with 200 μl PBS pH 7.4, and blocked with 1% goat serum in PBS for 1 h at RT before the addition of a 10 μg/ml solution of primary Fab diluted in 1% goat serum in PBS. Cells were incubated with primary Fab for one hour at RT, then washed 3× with PBS prior to a 30 min incubation, protected from light, with a 1:2000 dilution of FITC-labeled anti-human Fab secondary antibody.

Immuno-stained images were collected for Fab clones against 6 cell lines with an Opera High Content Screening System (Perkin Elmer) at identical incident light intensity and photomultiplier sensitivity and gain. Analysis was performed by visual inspection to determine localization and recorded using a limited number of commonly used GO terms to facilitate analysis and prioritization of downstream testing. Scanning multiple fields of view to determine nuclear or cytoplasmic staining localization was done in order to generate a nuclear/cytoplasmic ratio that could be visualized as a heat map.

siRNA Knockdown

The day before transfection, A2780cis cells (Sigma-Aldrich) were seeded at ∼5–7 × 10^4 cells per well of a 96-well plate (Perkin Elmer) in 100 μl of RPMI with 10% FBS without antibiotics. For each transfection well, Silencer Select Pre-designed siRNA (Ambion Life Technologies, Grand Island, NY) was diluted into 25 μl Opti-MEM reduced serum medium for a final concentration of 10 nM per well. In a separate tube, 0.5 μl of Lipofectamine RNAiMAX transfection reagent per well was diluted into 25 μl Opti-MEM reduced serum medium and mixed. After a 5-min incubation the tubes were combined, then incubated at room temperature for 30 min to allow siRNA-liposome complexes to form. Once complexed, the mix was added to cells (growth media removed) and then diluted with growth media to 100 μl final volume. Cells were then incubated with siRNA for 18 h at 37 °C in a CO2 incubator, at which time the mixture was removed and replaced with complete growth medium. Cells were screened for Fab binding 48 h after siRNA addition.

RESULTS

Antigen Expression

Production of soluble intact protein is important for successful selection of antibodies to native proteins (7, 9). Unstable or mis-folded target antigens typically result in poor binding reagents from phage display. For example, in small test sets we observed that proteins that were significantly proteolyzed or expressed at low levels typically did not produce antibodies of high quality, whereas intact and soluble proteins did generate high quality antibodies (supplemental Fig. S2). Therefore, to avoid pitfalls with potentially disordered (12) multi-domain (13) TF antigens, we focused on those regions of TFs (10 to 25 kDa) that had clear domain architecture based on structural and bioinformatic considerations (17–19) and thus a good chance of expressing as a soluble protein (supplemental Table S1). Virtually all TF antigen domains were expressed in E. coli as fusion proteins to facilitate purification (His tag), antigen capture via a single biotinylated site (Avi tag), and tobacco etch viral (TEV) protease mediated cleavage to elute bound phage (supplemental Fig. S1A to S1C) (9, 22, 23, 26). Purity and integrity of the TF antigens were assessed by SDS-PAGE prior to entry into the antibody selection pipeline. Size exclusion chromatography (SEC) and thermostability measurements were occasionally conducted as a check of antigen quality; however, these metrics were not usually integrated into the antigen expression pipeline because of the large number of antigens and the relatively low throughput nature of standard SEC.

Fab-phage Library and Robotic Selections

We chose a Fab phage display format where the heavy chain constant domain-1 and variable domain are fused to the gene III protein and the light chain is co-expressed using a phagemid system (33, 34). The Fabs assemble in the periplasm of E. coli through disulfide bond formation as a fusion protein with gene III that is subsequently packaged and displayed in bivalent format on the phage coat (35). The Fab scaffold is derived from the 4D5 anti-Her2 antibody, a very stable IgG1 (Tm 80 °C) on which several approved antibody drugs are based (36–38). The Fab library is a synthetic and codon-restricted library that was designed to focus on the most important amino acids in four of the six complementarity determining region (CDR) loops (15). The CDRs were mutated in a combinatorial fashion based upon structural bioinformatic analysis of antibody–antigen structures in the PDB as previously described (15, 39, 40). This diverse library contains ∼3 × 10^10 members and has been used successfully in manual selections for scores of target antigens with varying folds (41), altered conformations (26, 42, 43), and nucleic acids (44).

To industrialize the process we designed a pipeline approach augmented by custom assembled robotics to process antigens (Fig. 1A and 1B). This pipeline can accommodate various antigen formats through phage selection and primary validation of Fab affinity, sequence, and E. coli Fab expression. We typically used a “catch and release” proteolysis process for both soluble antigens and yeast cell-displayed antigen. To begin, biotinylated antigens are bound to streptavidin coated magnetic beads and excess antigen is washed away using a KingFisher Flex magnetic bead separator (Fig. 1B) (26, 39). The bead-based immobilization method using streptavidin capture preserves the native fold of the antigen and reduces the possibility of localized antigen unfolding that sometimes occurs when antigens are adsorbed directly to plastic immunoassay plates (45). In some cases direct binding of antigen onto microtiter plates was conducted for selections, but this precludes the use of the KingFisher Flex device.

Fig. 1. Overview of the RAN recombinant Fab selection pipeline, selection process, and examples of primary validation. A, Target antigens expressed in E. coli or other formats (Left Panel) enter the Fab-phage pipeline as biotinylated or GST-fusion antigens. Three to four rounds of antibody-phage selection generate pools of Fab-phage that are subsequently validated in a single point competition Fab-phage ELISA and sequenced to identify unique binding sequences (Middle Panel). Unique Fabs that pass primary validation tests proceed into various secondary validation assays including ELISA EC50, Spiked-IP, and Immunofluorescence (Right Panel). B, UCSF robotic antibody production pipeline showing two liquid handling instruments for ELISA, KingFisher Flex magnetic bead separator for phage selections, and K6–2 colony picking robots. Similar robotics platforms are in place at all three RAN locations. C, Solution based phage selection was conducted in a seven-step process that was repeated for up to four cycles: 1. Biotinylated antigens were bound to streptavidin magnetic beads; 2. Bead-antigen complex was transferred to the Fab-phage library; 3. Fab-phage-antigen-bead complex was transferred to wash buffer, where any unbound or weakly associated Fab-phage was removed; 4. Specific Fab-phage bound to antigen was eluted from magnetic beads by TEV protease cleavage; 5. The processed magnetic beads and non-specifically bound Fab-phage were removed from the well and discarded; 6. Antigen specific phage was propagated in E. coli; 7. Propagated Fab-phage were purified and the process repeated. D, Examples of Fab-phage that pass (green box) or fail the competition ELISA (red boxes) are shown. Direct binding ELISAs (y-axis) were conducted to measure the composite of expression and binding capability, and single point competition ELISAs (x-axis) were measured to verify antigen specificity. For ELISAs, 96 individual E. coli colonies were tested for each antigen with anti-M13-HRP phage secondary antibody and TMB chromogen development monitored kinetically at OD650/min. Each Fab-phage expressing colony is represented as a single spot on the graph with the calculated Competition Ratio (x axis) and Total Fab-phage Binding (y axis). Two examples are shown for antigen selections that produced Fabs that pass validation and one for a selection that failed.

The Fab-phage library was mixed with the antigen-loaded magnetic beads and unbound phage was washed away (Fig. 1C, Steps 1–3). Beads were treated with TEV protease and eluted phage were recovered and propagated in E. coli for subsequent selection rounds (Fig. 1C, Steps 4–7). Unlike standard low pH glycine elution from bare plastic plates (46), the proteolytic catch and release format ensures clean and efficient recovery of antigen bound Fab-phage regardless of antigen affinity. The catch and release approach, which is highly amenable to automation, reduces the recovery of nonspecific phage bound to the streptavidin bead or otherwise adsorbed or aggregated onto the surface.

Several other improvements were made to the standard Fab-phage protocols. We found it useful to enrich phage particles that display functional Fab with Protein A magnetic beads because only about 10% of phagemid particles display a Fab. In addition, we exposed the phage to streptavidin magnetic beads to reduce the number of nonspecific or avidin binding Fab-phage. Lastly, the stringency of the selection was increased each round by systematically lowering the antigen concentration from 100 nM to 10 nM. From experience with hundreds of antigen selections, we found that ∼1 nmol of antigen is needed for the selection process (∼25 μg for a 25 kDa antigen). This is considerably below what is typically used for animal immunizations. We monitored enrichment of positive binding Fab-phage versus the starting library and found that after three to four rounds of selection the Fab-phage pool was optimally enriched for antigen binding clones.
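The antigen requirement quoted above follows from a simple unit conversion: 1 nmol of a protein with a molecular mass of 1 kDa weighs 1 μg. The short Python sketch below works this through for the 10 to 25 kDa domain range targeted in this study; the function name and the intermediate 15 kDa value are illustrative choices, not taken from the text.

```python
def antigen_mass_ug(amount_nmol, mw_kda):
    """Mass in micrograms: nmol x kDa = ug (1e-9 mol x 1e3 g/mol = 1e-6 g)."""
    return amount_nmol * mw_kda


# About 1 nmol per selection, for domains across the 10-25 kDa range.
for mw_kda in (10, 15, 25):
    mass = antigen_mass_ug(1.0, mw_kda)
    print(f"{mw_kda} kDa domain: {mass:.0f} ug of antigen")
```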

Fab-Phage Pool Deconvolution and Primary Validation

Usually 50–100 individual Fab-phage clones from the final selection pools were plated and grown for validation. These were tested in a high-throughput format (96- or 384-well plates) for direct and competitive binding using a robotic ELISA assay (supplemental Fig. S3). We measured the OD650/min for direct binding of the Fab-phage to biotinylated antigen bound to a neutravidin or streptavidin coated ELISA plate (or an antigen coated ELISA plate) that was then probed with an anti-M13 phage antibody (Fig. 1D, y-axis). This high throughput assay provided a composite estimate of functional Fab-phage affinity and expression. We estimated the relative affinity by a single point Fab-phage competition binding ELISA. Here the Fab-phage were mixed with 20 nM soluble antigen and then allowed to bind to the same antigen plate (Fig. 1D, x-axis), where OD650/min was monitored and used to calculate a competition ratio. The competition ratio is defined as the OD650/min of competitive binding divided by the OD650/min of the direct binding signal. Fab-phage clones were considered to have passed when they were shown to have expressed at OD650 >0.005 (or OD450 >0.1) and were competed by soluble antigen to a ratio <0.5 (Fig. 1D, upper left green quadrant). Representative graphs from ELISA validation show typical selection results for passing antigens (Fig. 1D; top right and lower left) and an unsuccessful antigen (Fig. 1D; bottom right). This simple ELISA triage step for 96 Fab-phage generally requires less than 1 nmol of antigen. The most promising clones (typically ∼48) were picked and prepared for DNA sequencing. Fab sequences were aligned and unique representative clones were chosen for subsequent isolation, expression, and purification.
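The pass criteria for this ELISA triage can be written out as a small Python sketch. The thresholds (direct signal >0.005, competition ratio <0.5) are taken from the text; treating the 0.005 cutoff as a rate in OD650/min, and the class and function names, are assumptions made for illustration and not the pipeline's actual software.

```python
from dataclasses import dataclass


@dataclass
class CloneReadout:
    name: str              # hypothetical clone identifier
    direct_rate: float     # OD650/min, Fab-phage binding to immobilized antigen
    competed_rate: float   # OD650/min, same clone pre-mixed with 20 nM soluble antigen


def competition_ratio(clone: CloneReadout) -> float:
    """Competitive binding signal divided by direct binding signal."""
    if clone.direct_rate <= 0:
        raise ValueError("direct binding signal must be positive")
    return clone.competed_rate / clone.direct_rate


def passes_primary(clone: CloneReadout,
                   min_direct: float = 0.005,
                   max_ratio: float = 0.5) -> bool:
    """True when the clone binds above background and is competed by soluble antigen."""
    if clone.direct_rate <= min_direct:
        return False
    return competition_ratio(clone) < max_ratio


if __name__ == "__main__":
    for clone in (CloneReadout("clone_A", 0.020, 0.004),
                  CloneReadout("clone_B", 0.002, 0.0005),
                  CloneReadout("clone_C", 0.020, 0.018)):
        print(clone.name, passes_primary(clone))
```

A low ratio means that soluble antigen displaced most of the Fab-phage from the plate, which is why passing clones fall in the upper left quadrant of Fig. 1D.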

For expression of soluble monovalent Fab fragments, the Fab-phage clones were PCR amplified and subcloned into standardized E. coli expression plasmids, either Avi- or non-Avi-tagged (recombinant-antibodies.org/protocols), by SLICE or ligase-independent cloning methods (20, 21). Sequence-verified plasmids were then transformed into C43 Pro+ E. coli (16) for expression studies. Expressed Fabs were purified by semi-automated Protein A chromatography that typically recovered 100–500 μg of purified Fab protein. Not all Fab frameworks bind to Protein A, but the Herceptin (4D5) framework chosen for our Fab-phage library binds efficiently to Protein A (47).

To better understand the global biophysical properties of the Fabs generated from this pipeline, we analyzed a subset of primary validated Fabs for expression level, stability, and affinity. Expression levels in E. coli of over 700 monomeric Fabs showed that more than 90% express in the range of 1 to 10 mg/L, as assessed by Bradford assay or A280 measurements (Fig. 2A). Given that commercial antibody reagents are generally provided in 100 μg aliquots, simple 50–100 mL shake flask cultures provide access to purified Fabs for many applications. To test the stability of the Fab antibodies we measured the thermostability of 96 randomly chosen Fabs by differential scanning fluorimetry (DSF) facilitated by robotic cycling PCR instruments (48). The Tm values of the Fabs varied over a narrow range, 75–85 °C (Fig. 2B). Detailed affinity measurements for ∼200 Fabs showed EC50 values that range from high pM to 50 nM with the average around 10 nM (Fig. 2C). The high expression level in E. coli, the robust Fab stability, and the high affinity encouraged us to scale up the process. To date, 537 TF antigens have been processed through the robotic Fab-phage selection pipeline and 435 antigens (81%) have produced about 3000 primary validated Fab antibodies that were moved into a secondary validation pipeline.

General parameters that affect successful Fab generation and characterization. Fabs expressed in E. coli were measured for antibody (A) expression, (B) stability, and (C) affinity. Fabs successfully passing primary validation were cloned into E. coli expression plasmids and expressed and purified for further testing. Fabs derived from Library E and F typically expressed between 1 and 10 mg/L of bacterial culture (n = 720). Fabs (n = 96) had melting points between 75 °C and 85 °C. EC50 ELISAs were conducted on Fabs to determine binding affinity values, with a subset shown here (n = 201); ∼85% of Fabs bound with an EC50 <20 nM.

Secondary Validation and Protein Properties That Correlate with Success

As a next level of antibody validation, we sought to determine how the Fabs would bind to their parent TF in a cellular lysate. Kelley and co-workers have shown that human antibodies that are rapidly cleared in serum exhibit high levels of nonspecific binding in assays containing cellular extract (49). Nonspecific binding was estimated by the reduction in the direct binding (ELISA EC50) of antibody to antigen in the presence of non-cognate protein from baculovirus extract. We developed a similar assay termed Spiked Immunoprecipitation (Spiked-IP) designed to rapidly screen all Fabs to identify any sticky or nonspecific binding Fabs (50). This allowed us to assess the ability of the Fab to bind and pull-down exogenously added TF antigen in the presence of complex human cell lysates compared with buffer alone. Briefly, biotinylated Fab is bound to streptavidin magnetic beads and mixed with antigen for binding in the presence or absence of a HEK cell lysate as monitored by flow cytometry (Fig. 3A). The assay was also run in reverse mode where biotinylated antigens were bound to beads and allowed to bind Fab in the presence of the HEK cell lysate or in buffer alone.

Overview of Spiked-IP with examples of data analysis and relative successes of two protein domains. A, Diagram of Spiked-IP showing Fabs immobilized on magnetic beads binding to biotinylated antigens (Left); after transfer to a fresh well (Middle), the antigen-Fab-bead complex was bound to fluorescently labeled streptavidin and binding was quantified by flow cytometry. Samples were analyzed for binding characteristics in both buffer and a complex HEK293T lysate, which showed specificity of the Fab to its cognate antigen. This assay was designed to run in either forward or reverse mode, in which antigen is immobilized on beads with Fabs in solution. B, An example data set that shows 3 antigens with 3–4 Fabs each, including both passing and failing Spiked-IP tests. The negative control (Blue) consists of beads without Fab in the presence of antigen; Green bars represent the assay run in the presence of buffer and Red bars in the presence of lysate. To pass the assay, the median fluorescence intensities of the buffer and lysate tests must not differ by >2x. Fabs that passed the Spiked-IP test are denoted with (*). C, Representative success of Fabs generated from SCAN (n = 39) and Zinc Finger (n = 42) domains through the secondary Spiked-IP test.

In order to pass the Spiked-IP assay, lysate to buffer binding signals had to be within 2-fold of each other. Those Fabs having lysate signals more than twofold lower than the buffer condition were considered to have too much nonspecific or sticky binding to other proteins in the lysate as shown in an example for three antigens with three to four Fabs each (Fig. 3B). Although this assay does not directly address specificity of the Fab to its target antigen, we believe this assay robustly screens for Fabs with a high probability of being specific. About 56% of the antibodies that passed the primary validation steps (affinity, stability, and expression) passed the Spiked-IP secondary validation test (Table I). The endogenous levels of TFs are typically more than 100-fold lower than we estimate on the bead so it is unlikely we would discard passing antibodies from endogenous TF competition. Nonetheless, false negatives are possible, and we did not evaluate them further.
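The twofold criterion can be expressed as a short Python sketch. It takes the median fluorescence intensities (MFI) measured in buffer and in lysate and applies the within-2-fold rule stated above. The function names and the choice not to subtract the bead-only negative control are assumptions made for illustration.

```python
def lysate_fold_change(buffer_mfi: float, lysate_mfi: float) -> float:
    """Fold difference between the two conditions, always >= 1."""
    if buffer_mfi <= 0 or lysate_mfi <= 0:
        raise ValueError("median fluorescence intensities must be positive")
    return max(buffer_mfi, lysate_mfi) / min(buffer_mfi, lysate_mfi)


def passes_spiked_ip(buffer_mfi: float, lysate_mfi: float,
                     max_fold: float = 2.0) -> bool:
    """True when buffer and lysate signals are within max_fold of each other."""
    return lysate_fold_change(buffer_mfi, lysate_mfi) <= max_fold


if __name__ == "__main__":
    # Hypothetical MFI values for three Fabs against the same antigen.
    for name, buf, lys in (("Fab_1", 1000.0, 800.0),
                           ("Fab_2", 1000.0, 300.0),
                           ("Fab_3", 1000.0, 500.0)):
        print(name, passes_spiked_ip(buf, lys))
```

A Fab whose signal drops more than twofold in lysate is the typical failure mode, since it suggests the Fab is being titrated away by other lysate proteins.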

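Ability of antigens to produce primary and secondary validated Fabs as a function of antigen domain. Of the 537 TF antigens, representing at least 18 discrete domain types, 435 (81%) successfully generated sequence unique Fabs that passed competition ELISA. For 193 antigens with Fabs entering secondary validation, 108 (56%) passed the Spiked-IP validation test. The best performing domains are the Forkhead, SCAN, JMJ, and BTB domains with success rates of 100%, 88%, 100%, and 86% respectively.

| Domain type | Primary validation: total # of antigens | Primary validation: passing antigens | Primary validation: domain success (%) | Secondary validation: total # of antigens | Secondary validation: passing antigens | Secondary validation: domain success (%) |
|---|---|---|---|---|---|---|
| BRD | 37 | 32 | 86 | 2 | 1 | 50 |
| BTB | 10 | 9 | 90 | 7 | 6 | 86 |
| bZip | 11 | 11 | 100 | 4 | 0 | 0 |
| Cupin | 3 | 3 | 100 | NT | NT | |
| DUF | 2 | 2 | 100 | NT | NT | |
| Fork Head | 5 | 4 | 80 | 3 | 3 | 100 |
| HDAC | 3 | 2 | 67 | NT | NT | |
| HMG Box | 10 | 8 | 80 | 5 | 2 | 40 |
| Homeobox | 6 | 5 | 83 | 15 | 5 | 33 |
| Hormone Recep. | 6 | 5 | 83 | 5 | 3 | 60 |
| JMJ | 12 | 11 | 92 | 3 | 3 | 100 |
| Misc | 141 | 122 | 87 | 50 | 34 | 68 |
| PHD | 15 | 12 | 80 | 3 | 0 | 0 |
| PWWP | 7 | 7 | 100 | 2 | 1 | 50 |
| SCAN | 46 | 46 | 100 | 40 | 35 | 88 |
| SET | 31 | 21 | 68 | 5 | 2 | 40 |
| SIR2 | 4 | 4 | 100 | 3 | 1 | 33 |
| ZNF | 148 | 105 | 71 | 46 | 12 | 26 |
| Sum | 537 | 435 | 81 | 193 | 108 | 56 |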

The ability to successfully generate at least one high affinity Fab that passed the primary and secondary validation depended on TF domain type (Table I). For example, antibody selections for SCAN domains succeeded 88% of the time, whereas Zinc finger domains were only successful 26% of the time (Fig. 3C). It is possible these differences reflect the stability of the domains or that antigens form complexes with a molecule in the lysate (e.g. DNA) in such a way that prevents Fab binding.
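The domain success percentages quoted here can be reproduced from the secondary validation counts in Table I. The Python sketch below does this for four domains and for the overall total. The dictionary layout and function name are illustrative, and rounding half up to a whole percent is an assumption about how the table values were rounded.

```python
# (total antigens tested, passing antigens) in secondary validation, from Table I.
SECONDARY_COUNTS = {
    "SCAN": (40, 35),
    "ZNF": (46, 12),
    "BTB": (7, 6),
    "Fork Head": (3, 3),
}


def success_percent(total: int, passing: int) -> int:
    """Percentage of passing antigens, rounded half up to a whole number."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passing <= total:
        raise ValueError("passing must lie between 0 and total")
    return int(100.0 * passing / total + 0.5)


if __name__ == "__main__":
    for domain, (total, passing) in SECONDARY_COUNTS.items():
        print(f"{domain}: {success_percent(total, passing)}%")
    # Overall Spiked-IP pass rate reported in the table's Sum row.
    print("All domains:", success_percent(193, 108), "%")
```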

We also tested a subset of Fabs (n = 18) that were generated to SCAN domain-containing antigens in a single point pairwise ELISA analysis to address the specificity of the Fabs for different SCAN domains of varying homology. In almost all cases the Fabs generated here were mono-specific (Fig. 4, left panel). In the few cases where the Fab was poly-specific, for example RAB-S181, S145, and S169, the target antigens were >93% identical and differed in amino acid composition by at most 5 residues (Fig. 4, right panel). This degree of selectivity was remarkable given that we did not impose a counter selection against the closely related antigens. We also tested a Fab that failed the Spiked-IP assay as a negative control, and this indeed showed a higher degree of poly-specificity (Fig. 4, left panel).

Assessment of the specificity of anti-SCAN domain Fabs. Left panel shows a specificity heatmap of 18 Fabs tested against 17 closely related SCAN domains using the direct ELISA method. Fabs used in the experiment were sorted according to target protein homology as shown in the phylogenetic tree (left), and colors on the heatmap represent the strength of the ELISA signal. Anti-GFP Fab and eGFP protein were used as a positive control. An antibody raised against ZNF496 that failed Spiked-IP validation was used as a negative control to show nonspecific binding. Anticipated signals on the diagonal represent interactions where Fabs recognize their intended antigens. Right panel shows a pairwise identity heatmap of the SCAN domain antigens tested. The multiple sequence alignment and phylogenetic tree of SCAN domains, as defined in the SMART database, were constructed in MAFFT using the L-INS-i and NJ methods, respectively (63, 64). All pairwise identities were calculated in Jalview (65). Phylogenetic tree and heatmap annotations were visualized using EvolView (66).

We also noted that the number of unique Fabs obtained per antigen increased as antigen size increased over 14 kDa (Fig. 5A). This is not surprising, as the surface area available for binding increases. We also found that the number of Fabs identified per antigen systematically decreased as the isoelectric point (pI) of the antigen increased from pI 4 to 10 (Fig. 5B). This could reflect the tendency of the anionic phage coat to bind cationic proteins (51). These general parameters help to define antigens with a high likelihood of success.
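These two trends suggest a simple pre-screen for candidate antigens. The Python sketch below flags antigens whose molecular weight and pI fall in the range associated with higher selection success (MW ≥14 kDa and pI <8, the cutoffs given in the Fig. 5 legend). The antigen names and values are hypothetical, and using the cutoffs as a hard filter is an assumption, since the text reports them as trends and not as rules.

```python
from dataclasses import dataclass


@dataclass
class Antigen:
    name: str       # hypothetical construct name
    mw_kda: float   # molecular weight in kDa
    pi: float       # isoelectric point


def favorable_for_selection(antigen: Antigen,
                            min_mw_kda: float = 14.0,
                            max_pi: float = 8.0) -> bool:
    """True when size and charge match the range with higher Fab yield."""
    return antigen.mw_kda >= min_mw_kda and antigen.pi < max_pi


if __name__ == "__main__":
    candidates = [
        Antigen("domain_A", 18.5, 5.6),   # large enough, acidic
        Antigen("domain_B", 9.0, 6.1),    # small domain
        Antigen("domain_C", 16.0, 9.4),   # basic domain
    ]
    for antigen in candidates:
        label = "favorable" if favorable_for_selection(antigen) else "needs attention"
        print(antigen.name, label)
```

Antigens that fall outside the range are not excluded from selection. They are simply the ones more likely to need alternative constructs or expression formats.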

Analysis of Fab selection success rates. Success rates were analyzed as a function of antigen (A) molecular weight or (B) isoelectric point. Greater Fab selection success was observed for antigens with a MW ≥14 kDa and pI <8 (n = 447 antigens; n = 3343 Fabs). C, Multiple antigen expression pipelines can increase the likelihood of obtaining high quality Fabs. Test antigens (n = 32) were expressed in three different expression formats: E. coli, IVTT, or YAD. Fab-phage selections were conducted for each antigen using the robotic Fab-phage selection pipeline. The number of primary validated Fabs is shown for each antigen.

Overall, 56% of all TF antigens screened generated Fabs that passed primary and secondary validation criteria. We reasoned that some of this failure could be caused by inefficient protein folding in E. coli. Therefore, we sought to develop additional high-throughput antigen expression systems to supplement the E. coli expression system. To systematically compare expression methods and the selections obtained from them, we studied the ability to produce Fabs from 32 different TF antigens expressed in three different formats: E. coli, in vitro transcription translation (IVTT), and yeast antigen display.

We put these 32 antigens through the automated Fab selection pipeline and scored them based on the number of Fabs that passed primary validation criteria. Indeed there was considerable variation in the success of obtaining primary validated Fabs for the 32 TF antigens across the different expression formats (Fig. 5C and supplemental Table S2). Although the three formats in aggregate produced at least one primary validated Fab-phage per antigen, IVTT antigens were somewhat more successful, 66%, with E. coli constructs and yeast antigen display (YAD) averaging 41 and 47% success, respectively. Some antigens produced many passing antibodies (>45 Fabs per 96 tested) and others produced only a few. We conclude that having multiple antigen formats increases the likelihood of successful selection by providing back-up systems.

Immunofluorescence Validation of Fabs and Location of Antigens in the Cell

Transcription factors can reside in almost any compartment in the cell, from the cell membrane to the cytosol to the nucleus (52). Having an unprecedented collection of renewable antibodies to TFs allows us to begin to probe the cellular distribution of these TFs by immunofluorescence (IF). We chose six human cell lines that are known, based on RNA sequence data (not shown), to express detectable levels of mRNA for each of 270 TFs for which we had multiple passing Fabs. We used a high-throughput IF assay to screen 1017 Fabs in permeabilized cells (supplemental Table S3). Columbus software associated with an Opera imager (PerkinElmer) was used to quantify the fluorescence intensities across multiple fields for each antibody and to generate a heat map showing the ratio of nuclear intensity to cytoplasmic intensity (Fig. 6A). Staining patterns were also scored by visual inspection using four standard GO-based descriptors typically applied to categorize cellular localization: cytoplasmic, nuclear, mixed cytoplasmic/nuclear, and miscellaneous. Examples of each of these categories are shown in Fig. 6B. Interestingly, TF proteins appear to have different localization patterns depending on cell type and growth conditions, likely reflecting differences in their activation state. These results suggest that the Fabs produced here were specific for their target TF and could be used to monitor TF trafficking. When analyzing the binding patterns of all Fabs tested by IF, we found that on average ∼70% of the Fabs localized predominantly to the cytosol and ∼20% to the nucleus (Fig. 6C). There was remarkable consistency between the two scoring methods and across all cell lines in the proportion in the nucleus versus the cytosol (20% versus 70%). We also compared our IF data with those for targets in common with the Protein Atlas, which generated affinity purified polyclonal antibodies against segments of TFs. Remarkably, we found that of the 55 antigens for which an overlapping antibody has been developed, 50 (90.9%) show identical localization patterns (supplemental Table S5).
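The heat map quantity, a ratio of nuclear to cytoplasmic intensity pooled over several imaging fields, can be illustrated with a short Python sketch. The exact computation performed by the Columbus software is not described here, so averaging each compartment over fields before taking the ratio, the log2 transform, and all names below are assumptions made for illustration.

```python
import math
from statistics import mean


def nuclear_to_cytoplasmic_ratio(nuclear: list, cytoplasmic: list) -> float:
    """Ratio of mean nuclear to mean cytoplasmic intensity across fields."""
    if not nuclear or not cytoplasmic:
        raise ValueError("at least one field is required per compartment")
    cyto_mean = mean(cytoplasmic)
    if cyto_mean <= 0:
        raise ValueError("mean cytoplasmic intensity must be positive")
    return mean(nuclear) / cyto_mean


def log2_ratio(nuclear: list, cytoplasmic: list) -> float:
    """Signed score: positive is nuclear-enriched, negative is cytoplasmic-enriched."""
    ratio = nuclear_to_cytoplasmic_ratio(nuclear, cytoplasmic)
    if ratio <= 0:
        raise ValueError("ratio must be positive to take a logarithm")
    return math.log2(ratio)


if __name__ == "__main__":
    # Hypothetical per-field intensities for one Fab in one cell line.
    nuc = [200.0, 220.0, 180.0]
    cyto = [100.0, 100.0, 100.0]
    print(nuclear_to_cytoplasmic_ratio(nuc, cyto), log2_ratio(nuc, cyto))
```

A log-scaled score is convenient for a two-color heat map because equal fold enrichment in either compartment maps to equal distance from zero.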

Photomicrographs and heat map showing immunofluorescence staining patterns (cytosol in blue to nuclear in red) for 1017 Fabs directed to 270 TFs. A, Expressed and purified Fabs were screened against six commonly used human cell lines and images were analyzed by the Columbus imaging software to generate signal intensities across multiple fields. Intensities were used to generate a heatmap that shows the ratio of nuclear to cytoplasmic intensity for each Fab. B, Representative photomicrographs for immunofluorescence staining of Fabs (n = 1017) against TFs (n = 270) in six different fixed and permeabilized mammalian cell lines known to express the TF. Scoring was represented as Cyto, Cyto/Nuc, Nuc, or Misc. C, Similar analysis was conducted by visually inspecting images and assessing staining patterns based on 4 GO terms: Nuclear (Nuc), Nuclear/Cytoplasmic (Nuc/Cyto), Cytoplasmic (Cyto), and Miscellaneous staining (Misc). Mean localization results were calculated and showed that ∼67% of all Fabs labeled a protein in the cytoplasm compared with ∼19% in the nucleus.

An example siRNA knockdown (KD) validation was done using the BATF Fab/antigen combination and showed a significant reduction in staining intensity when cells were transfected with antigen specific siRNA. The anti-FLAG secondary antibody only control showed no nonspecific signal attributable to secondary antibody binding (supplemental Fig. S5). Quantitation was done by ImageJ analysis and showed a significant decrease in staining with BATF antibodies (p < 0.0001) in siRNA-transfected cells in contrast to control siRNA-transfected cells. Further studies will be useful to determine what external stimuli alter the trafficking of these TFs. Fabs shown in supplemental Table S4 are currently available to the scientific community for further study.

Fabs to Epigenetic Proteins

In a parallel project, we collaborated with the Structural Genomics Consortium (SGC; http://www.thesgc.org/) to generate Fabs to an extensive group of chromatin remodeling proteins (supplemental Tables S1 and S4). Similar to the TFs, these chromatin-remodeling proteins are large multi-domain molecules that are difficult to express and are prone to aggregation in their full-length forms. Thus, as was the case for the TFs, the design of the antigens was based on identifying domain types that could be expressed as independent and stable entities. In many cases we had direct structural information to guide the domain expression experiments. From the phage display selections against 211 antigens we obtained 334 Fabs that passed primary affinity validation and secondary Spiked-IP or IP-MS validation. Marcon and co-workers (53) developed an IP-MS method and tested 1154 Fabs to 154 of the epigenetic antigens for binding their endogenous target in HEK293 cells. They found that 452 of these antibodies pulled down the endogenous target for 98 antigens, where the antigen (and in some cases known binding partners) was among the top three proteins identified. The other 54 targets were presumed not to be expressed at high enough endogenous levels to detect. This success rate from primary to secondary validation was roughly the same as that of the Spiked-IP method. Although the IP-MS assay is more labor intensive than Spiked-IP, it provides direct mass data for antigen capture and, like the Spiked-IP assay, is run in the presence of complex cellular lysate. The overall success rate for antigens producing primary and secondary passing Fabs was 68% (123 out of 200 tested). The generally higher success rate for the epigenetic targets probably reflects the fact that the design of the expressed domains benefited from knowledge of the actual structures of many of the domains, which had been determined previously at the SGC.

DISCUSSION

There is a recognized need for creating tool antibodies to the entire human proteome (54, 55). The first systematic effort to generate proteome-wide antibody reagents began over a decade ago with the Human Protein Atlas (HPA), which focused on generating polyclonal antibodies in rabbits (3, 56). To increase antigenicity, antigens were produced as 100-residue segments, called protein epitope signature tags (PrESTs). PrESTs were chosen from the human target predominantly to be highly non-homologous to the rabbit homolog and without consideration of domain structure or fold. This large-scale effort generated affinity purified rabbit polyclonal antibodies that reportedly react with PrESTs from over 80% of expressed human genes. Validation data are publicly available through the HPA web site (proteinatlas.org). Many of these antibodies are now available through commercial vendors, but not yet for open source distribution given the expense of production. In comparing our IF data to those in the Protein Atlas we found 55 targets in common for staining the same cells. Remarkably, 50 of the 55 targets had the same staining patterns despite one antibody being a polyclonal and the other a recombinant monoclonal, and despite their having been raised to different regions of the common transcription factor. Polyclonal antibodies can have advantages over monoclonal antibodies because they can bind many epitopes, thus ensuring a greater probability of reacting with proteins in various formats. However, because they are mixtures, their epitopes and binding mechanisms are challenging to assess. Animal derived polyclonal antibodies also suffer from the fact that they are not renewable, and thus expensive validation data and production yields are lot dependent. Moreover, the human antigen can lead to self-selection against conserved proteins or epitopes in an animal host and thus can preclude identification of functionally relevant epitopes (57).

Recombinant antibodies offer a renewable and stable source in the cloned monoclonal gene, and hence there has been a call from the community to generate them systematically (5, 58). The fact that selections can be run in vitro avoids problems with self-selection in the animal and can allow for isolation of very high-resolution antibody reagents ensured to bind the native protein and even specific conformations or post-translationally modified forms (16, 59–61). Moreover, cloned genes can be further manipulated as genetic tools, and plasmids are easily made publicly available to researchers at low cost. Our intent here was to develop an industrialized platform empowered by robotics for the high-throughput production of renewable recombinant antibodies using phage display and high throughput antigen production. We demonstrate this by generating multiple high affinity, well expressed, and cell reactive antibodies to 346 transcription factors and 211 epigenetic proteins.

Production of high quality antigens is a key step in the process because denatured or unstable proteins typically do not reliably yield good monoclonal antibodies to native proteins (7). We focused on expression of what we considered to be stable domains (10–20 kDa) based on bioinformatics analysis. It is well-known in the x-ray crystallography community that one often needs to try different constructs in different hosts to generate high purity and stable protein for crystallography (62). Thus, we investigated three high throughput protein expression platforms to supplement the antigen pipeline. Indeed, there was considerable variability in success in generation of validated antibodies from the same antigen produced in these formats. We suspect the differences in selection success are because of antigen stability and solubility from the different formats as has been noted by McCafferty and coworkers (7). These different expression formats provide alternative pathways for generating antibodies to recalcitrant antigens.

The success rate for generating antibodies varied from 20–100% depending on the fold-type of the TF. These differences may reflect the relative stabilities of these TF domains. Indeed, it was noted that zinc finger domains were particularly problematic to express and purify and they were typically the least successful in generating validated Fabs. There is also a systematic correlation to molecular weight and charge. For example, small domains generated fewer passing Fabs. This may reflect the smaller surface area and hence fewer epitopes, or that they are intrinsically less stable. Also, highly basic proteins with pIs above 8 to 9 are generally less successful than those with pI values between 4 and 7. This is possibly because of nonspecific interactions with the anionic phage coat precluding selection of specific binders. These trends help us predict which antigens might require more attention in order to produce high quality binders. Currently we have expressed 557 TFs, and 435 yielded about 3000 primary validated antibodies (based on unique sequence, affinity, stability, expressibility). Of the Fabs tested by Spiked-IP, 56% have passed. We suspect that the higher success rate for epigenetic antigens (68% of input) is because the antigens were chosen based on more accurate structural information.

Laboratory automation was critical to standardize the many routine and laborious activities, ranging from phage sorting, ELISA, colony picking, and PCR to stability measurements, and so to facilitate large-scale generation of affinity reagents. The automated selection process greatly reduced the time from months in animals to weeks on the robot, reduced the costs associated with antibody generation, and increased the throughput to up to 48 antibody selections per batch. To take maximal advantage of the robotics it was necessary to make numerous adjustments in the bench scale phage and antigen protocols. For example, the automated magnetic bead selections coupled with the catch and release approach permitted clean elution of antigen-bound phage of very high affinity, which can be problematic for low pH glycine elution. Moreover, protocols for preclearing the starting library and enriching selection pools for functional phage using Protein A purification have been added.

We attribute much of the success here to choosing a remarkably stable Fab library scaffold (average Tm 80 °C). The selected Fabs show consistently high expression by simple shake flask culturing in E. coli. They are easily purified in high-throughput using simple Protein A columns. This synthetic library, built on a highly stable and well-expressed scaffold, can provide more consistent stability and expression than natural B-cell derived libraries (7), which have multiple scaffolds that vary widely in stability and recombinant expression.

The selected Fabs had EC50 values that ranged from high pM to 50 nM (average 20 nM) without additional affinity maturation. Moreover, these are readily converted to a mAb format that routinely affords a 10–50-fold boost in affinity because of avidity and can be expressed in HEK293 cells at levels comparable to those in E. coli (Sachdev Sidhu et al., unpublished results).

The Fabs typically showed good evidence for specificity in cell extracts by IP-MS (53) or here by Spiked-IP. Although the latter assay and others like it (49) do not directly identify possible off-targets, they are quite stringent in removing antibodies that bind nonspecifically or poly-specifically. Fabs generated to highly homologous SCAN domains showed high mono-specificity. In cases where cross-reactivity was observed, this was to highly related domains, typically having homology between 80–90%. This is remarkable in that we did not employ counter selections against the closest homologs, nor did we affinity mature the antibodies to increase selectivity. Such measures could be employed with these recombinant antibodies should cross-reactivity be observed.

TFs are extremely important players in biology. We have tested 1017 Fabs for their ability to bind 270 TFs in six different human cell lines. Remarkably, an average of 70% of the Fabs recognized their target antigen predominantly in the cytosol and only about 20% in the nucleus. These data are similar to data presented at the HPA, where a low percentage of TF antibodies bind only in the nucleus. Although we found that the average distribution of TFs in the cytosol and nucleus was similar across the six cell lines, the TFs often were found in different compartments in different cells. The IF staining pattern represents a different bar code for each cell line, and we believe this reflects differences in activation states in different cells. Our studies reinforce many individual studies showing that TFs transit from the cytosol to the nucleus upon activation. We believe these reagents will be very useful for systems-wide studies of how TF activation affects transport, as well as for other important studies.

The recombinant antibodies have characteristics of highly selective binding reagents based on the high throughput validation data presented: stringent selection by phage display, unique sequences for each Fab, high stability and affinity, high and soluble expression in E. coli, competitive binding with antigen, binding in the presence of complex cell lysates, and specific immunofluorescence binding patterns in cells. Although no systematic analysis of the suitability of the antibodies for Western blotting has been done, we do not anticipate high performance because the Fabs were intentionally raised to soluble protein domains. Nonetheless, for about 20 Fabs tested with collaborators, five showed a Western positive band corresponding to the TF suggesting a subset will work in Western blots (Gordon Whiteley, National Cancer Institute, unpublished results).

We have not systematically tested the performance of these antibodies in chromatin immunoprecipitation (ChIP-Seq) experiments because of cost, but selected examples have been tested by Marcon and coworkers (53). Clearly much more validation is warranted for these and many other applications, including their use for structural studies. Nearly 1000 of the genes and expression constructs for these Fabs are currently available to academic researchers along with validation data through the Recombinant Antibody Network (recombinant-antibodies.org) and the plasmid repository at DNASU (http://tinyurl.com/lm69t3s). More will be made available as they transit the pipeline. Although we have only scratched the surface, an industrial-scale platform is in place for the generation of renewable antibodies to the entire proteome.


Author contributions: M.H., M.P., S.M., A.S., T.M., B.L., C.G., E.C., S.A., S.G., S.K., J.M., S.S., A.K., and J.W. designed research; M.H., M.P., S.M., T.M., B.L., K.W., A.D., D.K., S.U., K.P., V.L., W.T., J.G., R.J.H., D.L., C.G., Z.L., F.J.V., D.D., E.C., N.Z., S.G., and J.M. performed research; M.H., M.P., S.M., T.M., B.L., K.W., A.D., D.K., S.U., K.P., V.L., W.T., J.L., J.G., R.J.H., D.L., C.G., Z.L., F.J.V., D.D., E.C., N.Z., S.G., J.M., S.S., A.K., and J.W. contributed new reagents or analytic tools; M.H., M.P., S.M., A.S., T.M., B.L., K.W., A.D., D.K., S.U., K.P., V.L., W.T., J.L., J.G., R.J.H., D.L., C.G., Z.L., F.J.V., D.D., E.C., S.A., N.Z., S.G., S.K., J.M., S.S., A.K., and J.W. analyzed data; M.H., M.P., S.M., A.S., A.K., S.S., and J.W. wrote the paper; M.H., M.P., and S.M. co-first author.

* Funding was generously provided by a U54 grant from the NIH common fund (1U54HG006436). Additional funding was provided to build the UCSF robotics by the NIH Office of the Director (1S10OD012037) and California Institute for Quantitative Biosciences (QB3) at UCSF.

This article contains supplemental Figs. S1 to S5 and Tables S1 to S5.

1 The abbreviations used are: TF, transcription factor; Fab, fragment antigen-binding domain; TEV, Tobacco Etch Virus protease; SEC, size exclusion chromatography; CDR, complementarity determining region; DSF, differential scanning fluorimetry; Spiked-IP, spiked immunoprecipitation; pI, isoelectric point; YAD, yeast antigen display; IF, immunofluorescence; KD, knockdown.

We are grateful for useful advice from the Sidhu lab and TRAC at University of Toronto, the Kossiakoff and Koide labs at University of Chicago, and the James Wells lab at UCSF, to James Marks for providing yeast display vectors and advice, and to Janet Huang and Tom Acton in the Anderson lab at Rutgers University for protein purification.

REFERENCES PasqualiniR., ArapW. (2004) Hybridoma-free generation of monoclonal antibodies. Proc. Natl. Acad. Sci. U.S.A. 101, 25725914688405 HarlowE., LaneD. Using antibodies : a laboratory manual. 1999, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press xiv, 495 p. BerglundL., BjorlingE., OksvoldP., FagerbergL., AsplundA., SzigyartoC. A., PerssonA., OttossonJ., WernerusH., NilssonP., LundbergE., SivertssonA., NavaniS., WesterK., KampfC., HoberS., PontenF., UhlenM. (2008) A genecentric Human Protein Atlas for expression profiles based on antibodies. Mol. Cell. Proteomics 7, 2019202718669619 BordeauxJ., WelshA., AgarwalS., KilliamE., BaqueroM., HannaJ., AnagnostouV., RimmD. (2010) Antibody validation. BioTechniques 48, 19720920359301 BradburyA., PluckthunA. (2015) Reproducibility: standardize antibodies used in research. Nature 518, 272925652980 GloriamD. E., OrchardS., BertinettiD., BjorlingE., Bongcam-RudloffE., BorrebaeckC. A., BourbeillonJ., BradburyA. R., de DaruvarA., DubelS., FrankR., GibsonT. J., GoldL., HaslamN., HerbergF. W., HiltkeT., HoheiselJ. D., KerrienS., KoeglM., KonthurZ., KornB., LandegrenU., Montecchi-PalazziL., PalcyS., RodriguezH., SchweinsbergS., SievertV., StoevesandtO., TaussigM. J., UeffingM., UhlenM., van der MaarelS., WingrenC., WoollardP., ShermanD. J., HermjakobH. (2010) A community standard format for the representation of protein affinity reagents. Mol. Cell. Proteomics 9, 11019674966 SchofieldD. J., PopeA. R., ClementelV., BuckellJ., ChappleS., ClarkeK. F., ConquerJ. S., CroftsA. M., CrowtherS. R., DysonM. R., FlackG., GriffinG. J., HooksY., HowatW. J., Kolb-KokocinskiA., KunzeS., MartinC. D., MaslenG. L., MitchellJ. N., O'SullivanM., PereraR. L., RoakeW., ShadboltS. P., VincentK. J., WarfordA., WilsonW. E., XieJ., YoungJ. L., McCaffertyJ. (2007) Application of phage display to high throughput antibody generation and characterization. Genome Biol. 8, R25418047641 TurunenL., TakkinenK., SoderlundH., PulliT. 
(2009) Automated panning and screening procedure on microplates for antibody generation from phage display libraries. J. Biomol. Screen. 14, 28229319224869 PershadK., PavlovicJ. D., GraslundS., NilssonP., ColwillK., Karatt-VellattA., SchofieldD. J., DysonM. R., PawsonT., KayB. K., McCaffertyJ. (2010) Generating a panel of highly specific antibodies to 20 human SH2 domains by phage display. Protein Eng. Des. Sel. 23, 27928820164216 ColwillK., GraslundS. (2011) A roadmap to generate renewable protein binders to the human proteome. Nat. Methods 8, 55155821572409 VaquerizasJ. M., KummerfeldS. K., TeichmannS. A., LuscombeN. M. (2009) A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 10, 25226319274049 SpolarR. S., RecordM. T.Jr. (1994) Coupling of local folding to site-specific binding of proteins to DNA. Science 263, 7777848303294 FrankelA. D., KimP. S. (1991) Modular structure of transcription factors: implications for gene regulation. Cell 65, 7177192040012 BradburyA. R., SidhuS., DubelS., McCaffertyJ. (2011) Beyond natural antibodies: the power of in vitro display technologies. Nat. Biotechnol. 29, 24525421390033 PerssonH., YeW., WernimontA., AdamsJ. J., KoideA., KoideS., LamR., SidhuS. S. (2013) CDR-H3 diversity is not required for antigen recognition by synthetic antibodies. J. Mol. Biol. 425, 80381123219464 KoerberJ. T., ThomsenN. D., HanniganB. T., DegradoW. F., WellsJ. A. (2013) Nature-inspired design of motif-specific antibody scaffolds. Nat. Biotechnol. 31, 91692123955275 XiaoR., AndersonS., AraminiJ., BeloteR., BuchwaldW. A., CiccosantiC., ConoverK., EverettJ. K., HamiltonK., HuangY. J., JanjuaH., JiangM., KornhaberG. J., LeeD. Y., LockeJ. Y., MaL. C., MaglaquiM., MaoL., MitraS., PatelD., RossiP., SahdevS., SharmaS., ShastryR., SwapnaG. V., TongS. N., WangD., WangH., ZhaoL., MontelioneG. T., ActonT. B. (2010) The high-throughput protein sample production platform of the Northeast Structural Genomics Consortium. J. 
Struct. Biol. 172, 213320688167 ActonT. B., XiaoR., AndersonS., AraminiJ., BuchwaldW. A., CiccosantiC., ConoverK., EverettJ., HamiltonK., HuangY. J., JanjuaH., KornhaberG., LauJ., LeeD. Y., LiuG., MaglaquiM., MaL., MaoL., PatelD., RossiP., SahdevS., ShastryR., SwapnaG. V., TangY., TongS., WangD., WangH., ZhaoL., MontelioneG. T. (2011) Preparation of protein samples for NMR structure, function, and small-molecule screening studies. Methods Enzymol. 493, 216021371586 HarrisonS. C. (1991) A structural taxonomy of DNA-binding domains. Nature 353, 7157191944532 ZhangY., WerlingU., EdelmannW. (2012) SLiCE: a novel bacterial cell extract-based DNA cloning method. Nucleic Acids Res. 40, e5522241772 AslanidisC., Pde JongP. J. (1990) Ligation-independent cloning of PCR products (LIC-PCR). Nucleic Acids Res. 18, 606960742235490 KeatesT., CooperC. D., SavitskyP., AllerstonC. K., PhillipsC., HammarstromM., DagaN., BerridgeG., MahajanP., Burgess-BrownN. A., MullerS., GraslundS., GileadiO. (2012) Expressing the human proteome for affinity proteomics: optimizing expression of soluble protein domains and in vivo biotinylation. N. Biotechnol. 29, 51552522027370 Structural Genomics, C., China Structural Genomics, C., Northeast Structural Genomics C., GraslundS., NordlundP., WeigeltJ., HallbergB. M., BrayJ., GileadiO., KnappS., OppermannU., ArrowsmithC., HuiR., MingJ., dhe-PaganonS., ParkH. W., SavchenkoA., YeeA., EdwardsA., VincentelliR., CambillauC., KimR., KimS. H., RaoZ., ShiY., TerwilligerT. C., KimC. Y., HungL. W., WaldoG. S., PelegY., AlbeckS., UngerT., DymO., PriluskyJ., SussmanJ. L., StevensR. C., LesleyS. A., WilsonI. A., JoachimiakA., CollartF., DementievaI., DonnellyM. I., EschenfeldtW. H., KimY., StolsL., WuR., ZhouM., BurleyS. K., EmtageJ. S., SauderJ. M., ThompsonD., BainK., LuzJ., GheyiT., ZhangF., AtwellS., AlmoS. C., BonannoJ. B., FiserA., SwaminathanS., StudierF. W., ChanceM. R., SaliA., ActonT. B., XiaoR., ZhaoL., MaL. C., HuntJ. 
F., TongL., CunninghamK., InouyeM., AndersonS., JanjuaH., ShastryR., HoC. K., WangD., WangH., JiangM., MontelioneG. T., StuartD. I., OwensR. J., DaenkeS., SchutzA., HeinemannU., YokoyamaS., BussowK., GunsalusK. C. (2008) Protein production and purification. Nat. Methods 5, 13514618235434 MierschS., LiZ., HannaR., McLaughlinM. E., HornsbyM., MatsuguchiT., PaduchM., SaafA., WellsJ., KoideS., KossiakoffA., SidhuS. S. (2015) Scalable high throughput selection from phage-display synthetic antibody libraries. J. Vis. Exp. 95, 5149225651360 YinG., GarcesE. D., YangJ., ZhangJ., TranC., SteinerA. R., RoosC., BajadS., HudakS., PentaK., ZawadaJ., PollittS., MurrayC. J. (2012) Aglycosylated antibodies and antibody fragments produced in a scalable in vitro transcription-translation system. MAbs 4, 21722522377750 PaduchM., KoideA., UysalS., RizkS. S., KoideS., KossiakoffA. A. (2013) Generating conformation-specific synthetic antibodies to trap proteins in selected functional states. Methods 60, 31423280336 BoderE. T., WittrupK. D. (1997) Yeast surface display for screening combinatorial polypeptide libraries. Nat. Biotechnol. 15, 5535579181578 Weaver-FeldhausJ. M., LouJ., ColemanJ. R., SiegelR. W., MarksJ. D., FeldhausM. J. (2004) Yeast mating for combinatorial Fab library generation and surface display. FEBS Lett. 564, 243415094038 KayB. K., ThaiS., VolginaV. V. (2009) High-throughput biotinylation of proteins. Methods Mol. Biol. 498, 18519618988027 AckermanM., LevaryD., TobonG., HackelB., OrcuttK. D., WittrupK. D. (2009) Highly avid magnetic bead capture: an efficient selection method for de novo protein engineering utilizing yeast surface display. Biotechnol. Prog. 25, 77478319363813 RakestrawJ. A., AirdD., AhaP. M., BaynesB. M., LipovsekD. (2011) Secretion-and-capture cell-surface display for selection of target-binding proteins. Protein Eng. Des. Sel. 24, 52553021402751 KoerberJ. T., HornsbyM. J., WellsJ. A. 
(2015) An improved single-chain fab platform for efficient display and recombinant expression. J. Mol. Biol. 427, 57658625481745 GarrardL. J., YangM., O'ConnellM. P., KelleyR. F., HennerD. J. (1991) Fab assembly and enrichment in a monovalent phage display system. Biotechnology 9, 137313771369462 BassS., GreeneR., WellsJ. A.(1990) Hormone phage: an enrichment method for variant proteins with altered binding properties. Proteins 8, 3093141708882 LeeC. V., SidhuS. S., FuhG. (2004) Bivalent antibody phage display mimics natural immunoglobulin. J. Immunol. Methods 284, 11913214736422 LosM., RoodhartJ. M., VoestE. E. (2007) Target practice: lessons from phase III trials with bevacizumab and vatalanib in the treatment of advanced colorectal cancer. Oncologist 12, 44345017470687 SchulmanE. S. (2001) Development of a monoclonal anti-immunoglobulin E antibody (omalizumab) for the treatment of allergic respiratory disorders. Am. J. Respir. Crit. Care Med. 164, S61111704611 HudisC. A. (2007) Trastuzumab–mechanism of action and use in clinical practice. N. Engl. J. Med. 357, 395117611206 FellouseF. A., EsakiK., BirtalanS., RaptisD., CancasciV. J., KoideA., JhuraniP., VasserM., WiesmannC., KossiakoffA. A., KoideS., SidhuS. S. (2007) High-throughput generation of synthetic antibodies from highly functional minimalist phage-displayed libraries. J. Mol. Biol. 373, 92494017825836 MillerK. R., KoideA., LeungB., FitzsimmonsJ., YoderB., YuanH., JayM., SidhuS. S., KoideS., CollinsE. J. (2012) T cell receptor-like recognition of tumor in vivo by synthetic antibody fragment. PLoS One 7, e4374622916301 KanekoT., HuangH., CaoX., LiX., LiC., VossC., SidhuS. S., LiS. S. (2012) Superbinder SH2 domains act as antagonists of cell signaling. Sci. Signal. 5, ra6823012655 GaoJ., SidhuS. S., WellsJ. A. (2009) Two-state selection ofconformation-specific antibodies. Proc. Natl. Acad. Sci. U.S.A. 106, 3071307619208804 ThomsenN. D., KoerberJ. T., WellsJ. A. 
(2013) Structural snapshots reveal distinct mechanisms of procaspase-3 and -7 activation. Proc. Natl. Acad. Sci. U.S.A. 110, 8477848223650375 YeJ. D., TereshkoV., FrederiksenJ. K., KoideA., FellouseF. A., SidhuS. S., KoideS., KossiakoffA. A., PiccirilliJ. A. (2008) Synthetic antibodies for specific recognition and crystallization of structured RNA. Proc. Natl. Acad. Sci. U.S.A. 105, 828718162543 SchwabC., BosshardH. R. (1992) Caveats for the use of surface-adsorbed protein antigen to test the specificity of antibodies. J. Immunol. Methods 147, 1251341311735 SmithG. P. (1985) Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 228, 131513174001944 StarovasnikM. A., O'ConnellM. P., FairbrotherW. J., KelleyR. F. (1999) Antibody variable region binding by Staphylococcal protein A: thermodynamic analysis and location of the Fv binding site on E-domain. Protein Sci. 8, 1423143110422830 NiesenF. H., BerglundH., VedadiM. (2007) The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat. Protoc. 2, 2212222117853878 HotzelI., TheilF. P., BernsteinL. J., PrabhuS., DengR., QuintanaL., LutmanJ., SibiaR., ChanP., BumbacaD., FielderP., CarterP. J., KelleyR. F. (2012) A strategy for risk mitigation of antibodies with fast clearance. MAbs 4, 75376023778268 NishikoriS., HattoriT., FuchsS. M., YasuiN., WojcikJ., KoideA., StrahlB. D., KoideS. (2012) Broad ranges of affinity and specificity of antihistone antibodies revealed by a quantitative peptide immunoprecipitation assay. J. Mol. Biol. 424, 39139923041298 LamboyJ. A., ArterJ. A., KnoppK. A., DerD., OverstreetC. M., PalermoE. F., UrakamiH., YuT. B., TezgelO., TewG. N., GuanZ., KurodaK., WeissG. A. (2009) Phage wrapping with cationic polymers eliminates nonspecific binding between M13 phage and high pI target proteins. J. Am. Chem. Soc. 131, 164541646019856910 ReichN. C. (2013) STATs get their move on. 
JAKSTAT 2, e2708024470978 MarconE., JainH., BhattacharyaA., GuoH., PhanseS., PuS., ByramG., CollinsB. C., DowdellE., FennerM., GuoX., HutchinsonA., KennedyJ. J., KrastinsB., LarsenB., LinZ. Y., LopezM. F., LoppnauP., MierschS., NguyenT., OlsenJ. B., PaduchM., RavichandranM., SeitovaA., VadaliG., VogelsangM. S., WhiteakerJ. R., ZhongG., ZhongN., ZhaoL., AebersoldR., ArrowsmithC. H., EmiliA., FrappierL., GingrasA. C., GstaigerM., PaulovichA. G., KoideS., KossiakoffA. A., SidhuS. S., WodakS. J., GraslundS., GreenblattJ. F., EdwardsA. M. (2015) Assessment of a method to characterize antibody selectivity and specificity for use in immunoprecipitation. Nat. Methods 12, 72573126121405 UhlenM., OksvoldP., FagerbergL., LundbergE., JonassonK., ForsbergM., ZwahlenM., KampfC., WesterK., HoberS., WernerusH., BjorlingL., PontenF. (2010) Towards a knowledge-based Human Protein Atlas. Nat. Biotechnol. 28, 1248125021139605 HoberS., UhlenM. (2008) Human protein atlas and the use of microarray technologies. Curr. Opin. Biotechnol. 19, 303518187316 MersmannM., MeierD., MersmannJ., HelmsingS., NilssonP., GraslundS., ColwillK., HustM., DubelS. (2010) Towards proteome scale antibody selections using phage display. N. Biotechnol. 27, 11812819883803 LipmanN. S., JacksonL. R., TrudelL. J., Weis-GarciaF. (2005) Monoclonal versus polyclonal antibodies: distinguishing characteristics, applications, and information resources. ILAR J. 46, 25826815953833 KonthurZ., HustM., DubelS. (2005) Perspectives for systematic in vitro antibody generation. Gene, 364, 192916126351 KehoeJ. W., VelappanN., WalboltM., RasmussenJ., KingD., LouJ., KnoppK., PavlikP., MarksJ. D., BertozziC. R., BradburyA. R. (2006) Using phage display to select antibodies recognizing post-translational modifications independently of sequence context. Mol. Cell. Proteomics 5, 2350236316971384 KiernyM. R., CunninghamT. D., KayB. K. (2012) Detection of biomarkers using recombinant antibodies coupled to nanostructured platforms. 
Nano Rev. 3, online only SchillingJ., SchoppeJ., PluckthunA. (2014) From DARPins to LoopDARPins: novel LoopDARPin design allows the selection of low picomolar binders in a single round of ribosome display. J. Mol. Biol. 426, 69172124513107 ElsligerM. A., DeaconA. M., GodzikA., LesleyS. A., WooleyJ., WuthrichK., WilsonI. A. (2010) The JCSG high-throughput structural biology pipeline. Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 66, 11371142 LetunicI., DoerksT., BorkP. (2015) SMART: recent updates, new developments, and status in 2015. Nucleic Acids Res. 43, D257D26025300481 KatohK., StandleyD. M. (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 77278023329690 WaterhouseA. M., ProcterJ. B., MartinD. M., ClampM., BartonG. J. (2009) Jalview Version 2–a multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189119119151095 ZhangH., GaoS., LercherM. J., HuS., ChenW. H. (2012) EvolView, an online tool for visualizing, annotating, and managing phylogenetic trees. Nucleic Acids Res. 40, W569W57222695796