When this study began in 2002, a large amount of protein-protein interaction data was becoming available in databases, notably from large-scale experiments (see the introduction), but few tools allowed biologists to manipulate and exploit these data. In addition, our team had started a new project on the protein-protein interactions of poly(ADP-ribose) polymerases (PARPs) during apoptosis, the mechanism of cell elimination. It was therefore important and necessary to develop a system capable of organizing all these data.
The discovery of poly(ADP-ribosyl)ation was reported by Chambon more than 40 years ago, in 1963, as the formation of a homopolymer resulting from the hydrolysis of NAD+ and the release of nicotinamide (Chambon et al., 1963). The authors were intrigued by the discovery of a mechanism that synthesizes a high-molecular-weight biopolymer from small molecules. The new enzyme, PARP, and its associated mechanism were identified in 1966.
This important scientific contribution brought to light the remarkable biological potential of poly(ADP-ribosyl)ation, which is, more precisely, a post-translational modification of proteins. The modified targets play a role in a wide variety of biological processes, for example the structural organization of chromatin (Poirier et al., 1982, de Murcia et al., 1986, Caria et al., 1997, Wang et al., 1997), DNA repair (D'Amours et al., 1999), replication (Yoshida and Simbulan, 1994) and transcription (D'Amours et al., 1999), as well as cell death (Yu et al., 2002, Cregan et al., 2002). PARP-1 is involved in important biological processes, and its role in so many mechanisms remains a mystery to many researchers.
Synthesis of the polymer is regulated by the PARPs and by poly(ADP-ribose) glycohydrolase (PARG). PARG is responsible for degrading poly(ADP-ribose) (figure 23).
The strategy adopted by the laboratory (figure 24) was to determine the proteins that interact with PARP-1 by immunoprecipitation experiments followed by mass spectrometry. Given the diversity of the biological mechanisms involving PARPs, the amount of data available in the scientific literature and the amount that the laboratory's work will generate, a classical approach to studying protein interactions is not sufficient. New methods must be developed to explore the available data rapidly and to propose new PARP-1 interactors.
We developed a tool, the PARPs system, to organize the laboratory's data in the form of an electronic notebook and to infer interactions by homology searches between a set of candidate proteins (PARPs) and the partners of interactions described in public databases. The PARPs system can be divided into an administration mode, which sets up the database and performs the interaction inference, and a consultation mode, which consists of the various web interfaces. The server relies on a relational database managed by an Oracle database management system, which holds the most exhaustive possible list of known interactions and the corresponding sequences. To this end, numerous public protein interaction databases (see the chapter on protein-protein interactions, section “Storage of protein-protein interactions”) were installed on the PARPs system. A Java application was developed to visualize protein-protein interactions. It connects directly to the PARPs system so that interactions from the laboratory and those from public databases can both be displayed. The BLAST software is used to identify homology relationships between a set of sequences submitted to the server and the sequences involved in the interactions, so as to reveal interolog-type situations (the interaction is known in another organism). The tool was made available to the laboratory through a web server, in the form of interfaces developed in HTML.
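The homology-based inference described above can be sketched as follows. This is a minimal illustration in Python, not the system's actual implementation (which relies on Oracle and BLAST): the function name, the tuple layout of the BLAST hits, the E-value cut-off of 1e-10 and all protein identifiers are assumptions made for the example.

```python
from collections import defaultdict


def infer_by_homology(blast_hits, known_interactions, max_evalue=1e-10):
    """Propose interactions for candidate proteins from known interactions.

    blast_hits: iterable of (candidate_id, database_protein_id, evalue),
        for example parsed from tabular BLAST output (layout assumed here).
    known_interactions: iterable of (protein_a, protein_b) pairs taken
        from public interaction databases.
    Returns {(candidate, predicted_partner): {supporting known pairs}}.
    """
    # Database protein -> candidates that are homologous to it.
    homologs = defaultdict(set)
    for candidate, subject, evalue in blast_hits:
        if evalue <= max_evalue:  # hypothetical significance cut-off
            homologs[subject].add(candidate)

    predictions = defaultdict(set)
    for a, b in known_interactions:
        # A known pair supports a prediction when either partner has a
        # homolog among the candidates.
        for matched, partner in ((a, b), (b, a)):
            for candidate in homologs.get(matched, ()):
                predictions[(candidate, partner)].add((a, b))
    return dict(predictions)


# Hypothetical identifiers, for illustration only.
hits = [("candidate_1", "yeast_A", 1e-50), ("candidate_1", "yeast_C", 0.5)]
known = [("yeast_A", "yeast_B"), ("yeast_C", "yeast_D")]
predicted = infer_by_homology(hits, known)
```

Each prediction keeps the known interaction that supports it, so that the evidence can be reviewed before the prediction is accepted.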
One of the difficulties encountered in setting up this tool lies in producing a non-redundant set of protein-protein interactions from the public databases, because these do not always use the same accession numbers to describe a given protein. This led us to develop a tool for handling the different aliases used to reference a protein. The service provided by this server meets a need shared by a very large number of laboratories faced with the multiplicity of protein identifiers. It is accessible as a web interface from the analysis server.
Figure 25: Architecture of the proteomics platform for identifying PARP interactors.

The PARPs system is shown in figure 25. Each mass spectrometer is integrated into a system called “BIO LIMS: Biology Laboratory Information Management System”. The LIMS provides the connection between the instruments, the researchers' computers and the database. After each acquisition, the data are taken over by this system, in which the user can launch a protein identification or archive the files. Several steps are required before the researcher who started a new experiment can view the results. Each acquisition is submitted to one or more peptide identification algorithms, and the proteins are validated manually or automatically. Finally, a final identification report is available from the PARPs database, with comments on the identification.
One capability of the application is to detect interactions by mass spectrometry or by homology, but this carries a high risk of over-predicting interactions. With mass spectrometry, one must make sure that the protein is really present and is not an artefact. With homology, detection may rest solely on domains that are not necessarily involved in the interactions. For this reason, our server was designed from the outset as a tool that lets the biologist validate and/or annotate new experimental results on protein-protein interactions.
Users of the system connect to the database through a web browser. Figure 26 shows the various interfaces developed in the PARPs system. The researcher who started a new immunoprecipitation experiment can view the experimental protocol of a sample in PDF format (figure 26 A). Researchers from the proteomics platform (the people operating the mass spectrometers) connect to the system and enter the experimental protocols specific to the MS/MS analyses (figure 26 C). The protein identification parameters for the bioinformatics analysis are recorded in the system (figure 26 B). Finally, the researcher can connect to the system again and consult the result of a protein identification (figure 26 D). More details on the development of the system are available in the publication that follows this chapter.
Interconnection of the databases available from the PARPs system
The number of databases available in the public domain keeps growing, as we will see in the chapter devoted to the comparison of protein databases for identification. While the choice of database can influence protein searches, the quality of the annotation and the cross-links between databases are important criteria for data exploration, or “data mining”. Once a protein is identified, the user is free to browse the several data sources available from our PARPs system. Figure 27 shows the relationships between the databases available in our infrastructure. We selected the databases for their added value (quality of annotation, update frequency, type of information).
Figure 26: Screenshots of the interfaces that the user can consult from the PARPs system. A) Electronic laboratory notebook, B) experimental parameters of a search, C) selection of samples and of the MS/MS analysis method, D) result of an identification by a search algorithm.

From the HTML pages of the identification report accessible on the PARPs server, the user can access protein resources such as IPI, UniProt, RefSeq and Entrez Gene from our servers. The Pfam database, which specializes in protein domains, is also available. In case our local content is not sufficient, links are also provided to the public resources of the EBI, through the SRS search engine (“Sequence Retrieval System”: http://srs.ebi.ac.uk), and of the NCBI, through Entrez (http://www.ncbi.nlm.nih.gov/ /gquery/gquery.fcgi).
In figure 27, red corresponds to data maintained by the EBI, green to data from the NCBI, and orange to the protein interaction databases. Dashed lines correspond to cross-references between the databases.
The difficulty with using such varied databases is that they do not all use the same identifiers. This led us to develop a tool for handling the different aliases used to reference a protein. We used cross-reference files available from public bioinformatics centres and created our own mapping rules (figure 28).
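One way to picture such an alias tool is a union-find structure in which every identifier known to refer to the same protein ends up in the same group. The sketch below is an illustration under stated assumptions, not the implementation used in the PARPs system: the class and function names, the rule of keeping the lexicographically smallest identifier as the canonical one, and the identifiers themselves are all hypothetical.

```python
class AliasIndex:
    """Union-find over protein identifiers: aliases share one canonical ID."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited identifier at the root.
        while self._parent[identifier] != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, id_a, id_b):
        """Record that two identifiers reference the same protein."""
        root_a, root_b = self._find(id_a), self._find(id_b)
        if root_a != root_b:
            keep, drop = sorted((root_a, root_b))  # arbitrary but stable rule
            self._parent[drop] = keep

    def canonical(self, identifier):
        return self._find(identifier)


def non_redundant(interactions, aliases):
    """Collapse interaction pairs that differ only in the aliases used."""
    return {
        frozenset((aliases.canonical(a), aliases.canonical(b)))
        for a, b in interactions
    }


# Hypothetical cross-references and interactions, for illustration only.
aliases = AliasIndex()
aliases.link("uniprot:A", "ipi:A")
aliases.link("ipi:A", "refseq:A")
aliases.link("uniprot:B", "refseq:B")
pairs = [("uniprot:A", "uniprot:B"), ("refseq:B", "ipi:A"), ("refseq:A", "refseq:B")]
unique_pairs = non_redundant(pairs, aliases)
```

In this example the three reported pairs describe the same interaction under different accession numbers, so `unique_pairs` contains a single entry.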
Adding public protein interaction databases to our infrastructure facilitates data exploration and gives a quick view of the interactors referenced in the literature. For example, figure 29 shows the PARP-1 interactions accessible from the PARPs system. The bibliographic references are also accessible. The starting point for data exploration is to cross-check the information from our own experiments (internal data) against that from the literature (external data), in order to compare and predict new interactions and to identify or characterize a role for the PARP proteins. For instance, an interaction between PARP-1 and the replication factor C (RFC) protein complex is presented in the publication that follows this chapter.
Perl scripts connect remotely to the servers of public databases and automatically update the information available in the PARPs system.
During the development of the system for studying protein-protein interactions, many questions had to be resolved concerning the choice of databases, the identification algorithms and, finally, the use of statistical tools. These points are discussed in the third part, in the chapters on comparison.
Foreword
The following article, entitled “PARPS Database: A web based tool for protein-protein interaction data analyses”, will be submitted in the coming days to the Journal of Proteome Research.
Summary (translated from the French)
One of the challenges of the post-genomic era is to determine the function of proteins and, more precisely, to establish a proteomic map of the cell. The challenge for functional genomics, and for proteomics in particular, is thus to understand the events that take place during protein maturation.
Several approaches have been described for determining protein function; one of them relies on characterizing the interactors of the protein of interest. Traditionally, such studies were based on targeted approaches. Recently, the development of high-throughput analyses has generated an impressive amount of information. Faced with this accumulation of data, a purely experimental strategy no longer appears sufficient. Consequently, bioinformatics methods that combine data-mining procedures with experimental approaches will make it possible to predict targets in silico. The main objective of our study is to develop a bioinformatics tool in the context of protein-protein interaction experiments in order to better characterize the dynamic role of poly(ADP-ribosyl)ation. PARP interactors are identified by mass spectrometry. This technique generates large quantities of data and requires an analysis platform with substantial computing capacity. With the help of this system, we identified new target proteins that play an important role in the DNA replication complex (RFC1-5).
PARPs Database: A web-based tool for protein-protein interaction data analyses
Arnaud Droit¹, Joanna M. Hunter², Michele Rouleau¹, Chantal Ethier¹, Aude Picard-Cloutier¹, David Bourgais¹ and Guy G. Poirier¹*
¹Health and Environment Unit, Laval University Medical Research Center, CHUQ, Faculty of Medicine, 2705 Boulevard Laurier, Québec, Québec, G1V 4G2, Canada
²Current address: Caprion Pharmaceuticals, 7150 Alexander-Fleming, Montreal, Québec, H4S 2C8, Canada
Keywords: Database, protein-protein interactions, PARP-1, proteomics, bioinformatics
*Correspondence to: Dr. Guy G. Poirier,
Health and Environment Unit /Eastern Quebec Proteomic Center,
Laval University Medical Research Center (CHUL),
2705 Boul. Laurier, Ste-Foy,
Quebec, Canada, G1V 4G2.
Abstract:
In the “post-genome” era, mass spectrometry (MS) has become an important method for the analysis of proteome data. The rapid advancement of this technique in combination with other methods used in proteomics results in an increasing number of high-throughput projects. This leads to an increasing amount of data that needs to be archived and analyzed.
Here, we describe the “PARPs database”, a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, and peptide and protein identification data mining. Using the described pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5. The RFC complex plays a key role in DNA replication together with its partner, Proliferating Cell Nuclear Antigen (PCNA). Together, RFC and PCNA are essential for loading DNA polymerase delta onto the DNA and for its displacement along the fragment. Alone, RFC has little or no effect on DNA synthesis, meaning that protein-protein interaction is crucial for its function in the replication process.
Introduction:
Proteomics aims to identify, characterize and quantify all of the proteins expressed by a given living organism, tissue or cell line (Hunt, 2002). Typically, this approach subjects protein mixtures to proteolytic digestion prior to liquid chromatographic separation and MS/MS analysis of the resulting peptides (Link et al., 1999). Several database search engines, notably Mascot (Perkins et al., 1999), Sequest (Yates et al., 1995b) and X!Tandem (Craig and Beavis, 2004b), assign probable peptide sequences to MS/MS spectra and infer the identity of the proteins present in the sample. High-throughput proteomics generates large data sets of protein identifications, which can only be properly validated and reported through adequate data processing (Hunt, 2002, Fenyo, 2000, Gomez et al., 2003). Subsequent integration, sorting and comparison of these datasets pose significant challenges, especially when multiple experiments are analyzed simultaneously.
One of the most effective approaches to elucidating the biological function of proteins is the analysis of protein-protein interactions. We are only now beginning to appreciate the nature and complexity of networks of interacting proteins, and constructing or unravelling any such network using traditional biochemical approaches remains a significant challenge. Recently, however, the application of high-throughput technologies, such as large-scale yeast two-hybrid analysis and mass spectrometry coupled to immunoaffinity-based capture, has generated an enormous amount of protein interaction data (Ito et al., 2001, Newman et al., 2000, Uetz et al., 2000). Researchers often face the dilemma of how to use all the available data effectively. Investigators relying solely on a traditional wet-lab approach to draw conclusions or set research priorities are likely to find themselves outpaced by peers who combine in silico biology and empirical methods. Thus, for protein interaction studies, there is clearly a need to develop a systematic and stepwise in silico approach that can predict the potential interactors most likely to improve our understanding of how complex biological systems work.
The focus of our laboratory is the study of the action of poly(ADP-ribose) polymerases (PARPs) and their role in the cell. Poly(ADP-ribosyl)ation is a post-synthetic protein modification consisting of long chains of poly(ADP-ribose) (pADPr) synthesized by PARPs at the expense of NAD+. Poly(ADP-ribose) chains are short-lived owing to the activity of the poly(ADP-ribose) glycohydrolase enzyme, which catabolizes pADPr within minutes of its synthesis (D'Amours et al., 1999). The PARP family may comprise as many as 17 members, which share a common catalytic domain responsible for the synthesis of poly(ADP-ribose) (Ame et al., 2004, Otto et al., 2005). The best characterized and most abundant member is PARP-1, a 113-kDa nuclear protein comprising a DNA-binding domain made of two zinc fingers that allow PARP-1 to be rapidly activated in response to DNA damage. Poly(ADP-ribose) contributes crucially to chromatin remodelling, DNA damage repair, regulation of transcription and cell division (Tulin et al., 2003, Dynek and Smith, 2004, Rouleau et al., 2004), and PARP-1 is an important actor in many key cellular processes, including base excision repair (BER), transcription and apoptosis (D'Amours et al., 1999, Gagne et al., 2006).
We describe herein the architecture and major features of a web-based utility called “PARPs database”, which is designed to rationally organize the protein and peptide data generated by tandem mass spectrometric analysis of tryptic digests of co-immunoprecipitated proteins into reports that are meaningful to biological researchers. For this, we have developed a Laboratory Information Management System (LIMS) work environment to annotate and study protein-protein interactions. PARPs database was designed to be an easy-to-use relational data management system that can rapidly supply information on the biological characteristics of the majority of proteins in a proteomic dataset.
Also presented here is a list of new PARP-1 interactors found by co-immunoprecipitation coupled to tandem mass spectrometry, together with the processing and analysis of the data generated by these techniques with the PARPs database software.
Given the advantages of an in silico approach that can predict or prioritize potential interactors, it seems reasonable to propose that PARPs database will become an essential tool for the initial evaluation of novel hypotheses and will offer an improved rationale for target prioritization, so that, in theory, only the most promising targets need to be subjected to empirical testing. Finally, PARPs-DB reduces the complexity and the time needed to construct predicted human protein-protein interaction maps. The system can automatically perform the BLAST and ClustalW analyses and then provide the existing yeast and predicted human protein-protein interaction maps.
Experimental Section
Design of the PARPs Database software
The PARPs Database consists of a core system of services that provide underlying system functionality. Modules, which provide most data handling and analytical support (such as LC-MS/MS data mining), plug into the core. This design makes the platform easily extensible: new analytical modules can be added and integrated without modifying the core system. PARPs database was designed to be cross-platform, easily scalable and maintainable. It is implemented on the Solaris 10 operating system (Sun Microsystems, Santa Clara, CA, USA). It requires access to an Oracle 10g relational database (Oracle, Redwood Shores, CA, USA), with which it communicates through an abstraction layer that isolates the core system from subtle differences between Oracle database builds and thereby avoids repetition. The user interface is served by the Apache server (http://apache.org) for external access via the Hypertext Transfer Protocol (HTTP). It consists of a set of programs, written in the Practical Extraction and Report Language (Perl) and PL/SQL, that generate the user interface in HyperText Markup Language (HTML), using Cascading Style Sheets (CSS), eXtensible Markup Language (XML), Extensible Stylesheet Language Transformations (XSLT) and Scalable Vector Graphics (SVG). Dialects for PostgreSQL and MySQL were implemented, and support for Microsoft SQL Server is under development.
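The core-plus-modules design can be illustrated with a small registry. The sketch below is in Python for brevity, whereas the system itself is written in Perl and PL/SQL; the class name, the method names and the example module are hypothetical and only show how a module can be added without touching the core.

```python
class Core:
    """Core services: keeps a registry of analytic modules."""

    def __init__(self):
        self._modules = {}

    def register(self, name, handler):
        """Plug a module into the core without changing the core itself."""
        if name in self._modules:
            raise ValueError(f"module {name!r} is already registered")
        self._modules[name] = handler

    def available(self):
        """List the names of the registered modules."""
        return sorted(self._modules)

    def run(self, name, payload):
        """Dispatch a request to the named module."""
        if name not in self._modules:
            raise KeyError(f"no module named {name!r}")
        return self._modules[name](payload)


def lcmsms_summary(payload):
    """Hypothetical LC-MS/MS module: counts the spectra it is given."""
    return {"spectra": len(payload["spectra"])}


core = Core()
core.register("lcmsms", lcmsms_summary)
result = core.run("lcmsms", {"spectra": [[100.1, 200.2], [300.3]]})
```

A new analysis is added by writing another handler and registering it under a new name; the `Core` class stays unchanged.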
Database Design
Figure 1 outlines the database schema for the data pertaining to experimental protocols, data analysis and results (the full-scale schema is available online as Supplemental Figure 1). The database is defined in the Unified Modeling Language (UML; http://www.uml.org), a standard notation designed to improve the process of developing large software systems (Rumbaugh et al., 1999). The “Protocol” set of tables (shown in red in figure 1) stores descriptions of the experimental methods used for PARP-1. The “experiment run” set of tables (shown in purple in figure 2) stores a record of the execution of a protocol, acting on specific material and/or data inputs and producing specific material and/or data outputs. The “mass spectrometer”, “Sequest and X!Tandem parameters” (shown in green) and “Mascot parameters” (shown in grey) tables store mass spectrometry and bioinformatics parameters and results, including the type of mass spectrometer, liquid chromatography column information and search engine parameters. The protein tables (shown in orange) store identifiers (accession numbers) that point to external web-based information sources. Short text annotations such as Gene Ontology (Ashburner et al., 2000) descriptions, descriptions of functional or structural regions within the protein sequence, and information about associated diseases and biological pathways are also stored when available. While the identifiers serve as links to external databases and web pages, as well as pointers to local databases, the annotations stored within PARPs-DB are human readable and easily searchable. PARPs-DB also supports input of local protein sequences and annotations. A sequence or annotation marked as “defunct” is not automatically deleted from the database, which means that old FASTA files can be reanalyzed with new annotations even if their records have been deleted or replaced by subsequent information in the primary source.
The database was designed to contain a minimal amount of information but still sufficient data to allow meaningful Structured Query Language (SQL) queries. These queries enable ready access to any information stored in the database as well as in the XML files generated by the data analysis server. The tables and XML files serve as the primary data storage objects, allowing the design of a relational dataset that is relatively easy to build, maintain and query.
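To make the idea of a minimal, queryable schema concrete, here is a small sketch. It uses SQLite through Python's standard `sqlite3` module as a stand-in for Oracle; the table names, column names and sample rows are assumptions chosen for illustration and do not reproduce the schema of figure 1. The `defunct` column mirrors the “defunct” marker described above: a retired record is flagged, not deleted.

```python
import sqlite3

SCHEMA = """
CREATE TABLE protein (
    protein_id  INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL UNIQUE,
    annotation  TEXT,
    defunct     INTEGER NOT NULL DEFAULT 0  -- 1 = retired in the primary source
);
CREATE TABLE experiment_run (
    run_id    INTEGER PRIMARY KEY,
    protocol  TEXT NOT NULL
);
CREATE TABLE identification (
    run_id      INTEGER NOT NULL REFERENCES experiment_run(run_id),
    protein_id  INTEGER NOT NULL REFERENCES protein(protein_id),
    probability REAL NOT NULL
);
"""


def proteins_for_run(conn, run_id, include_defunct=True):
    """Return (accession, annotation, probability) rows for one run."""
    query = """
        SELECT p.accession, p.annotation, i.probability
        FROM identification AS i
        JOIN protein AS p ON p.protein_id = i.protein_id
        WHERE i.run_id = ? AND (? OR p.defunct = 0)
        ORDER BY i.probability DESC
    """
    return conn.execute(query, (run_id, int(include_defunct))).fetchall()


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO experiment_run VALUES (1, 'PARP-1 immunoprecipitation')")
conn.executemany(
    "INSERT INTO protein (protein_id, accession, annotation) VALUES (?, ?, ?)",
    [(1, "ACC_0001", "example annotation A"), (2, "ACC_0002", "example annotation B")],
)
conn.executemany(
    "INSERT INTO identification VALUES (?, ?, ?)", [(1, 1, 0.99), (1, 2, 0.95)]
)
# Retire a record without deleting it, so older analyses stay reproducible.
conn.execute("UPDATE protein SET defunct = 1 WHERE accession = 'ACC_0002'")
```

Calling `proteins_for_run(conn, 1)` still returns both identifications, while `include_defunct=False` restricts the report to current records.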
LC-MS/MS Data Analytic Module
A key design element of PARPs database is the ability to generate analytic modules that plug into and use the core of the PARPs system. The LC-MS/MS search tools included in the PARPs system are Sequest, X!Tandem and Mascot. We have also included a set of open source tools that perform the subsequent validation steps: PeptideProphet validates peptides assigned to MS/MS spectra (Keller et al., 2002a) and ProteinProphet infers sample proteins (Nesvizhskii et al., 2003a). These tools are components of the Trans-Proteomic Pipeline (TPP) from Sashimi (http://proteomecenter.og).
Links to Public Databases
The underlying protein knowledgebase used by PARPs database was extracted from multiple online resources, based on cross-references. Five gene and protein data sources were integrated within PARPs database: the protein databases maintained by IPI (Kersey et al., 2004) and UniProt (Bairoch et al., 2005), and the gene and protein resources at NCBI, namely Entrez Gene (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene), RefSeq (Pruitt et al., 2005) and GenPept. Three protein-protein interaction databases were also included in the PARPs-DB knowledge base: the Biomolecular Interaction Network Database (BIND) (Bader et al., 2003), the Database of Interacting Proteins (DIP) (Xenarios et al., 2002) and the Human Protein Reference Database (HPRD) (Peri et al., 2004).
For each identified protein stored in PARPs-DB, the data analysis server gathers the protein’s function, sequence and post-translational modifications from the above underlying sources and shows the extracted data along with the identified protein.
Protein-Protein interaction viewer
In order to analyze protein-protein interactions, we have developed a protein-protein interaction viewer in Java (Java JDK 1.4.2_05 and NetBeans 3.6). This viewer uses three libraries: Xerces Java Parser 2.6.2 (http://xerces.apache.org/), Piccolo Java 1.1 (http://www.cs.umd.edu/hcil/piccolo/), and JDOM 1.0 (http://www.jdom.org/), which is used to manipulate and parse the XML files.
By combining JDOM and Xerces, XML files are validated against the HUPO-PSI schema file (http://psidev.sourceforge.net/mi/rel25/src/MIF25.xsd) or, optionally, against a local schema file included in the viewer package.
PARP-1 Co-immunoprecipitation
Cell culture
Human cervical carcinoma HeLa cells obtained from ATCC (Manassas, VA) were cultured in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100 U/ml penicillin and 100 µg/ml streptomycin in a humidified atmosphere of 5% CO2 at 37°C. All the above-mentioned reagents were purchased from Invitrogen (Burlington, ON).
Immunoprecipitation of endogenous PARP-1
Cells grown in 150 mm culture dishes were washed with ice-cold phosphate-buffered saline (PBS). Ice-cold lysis buffer (400 µl/dish; 175 mM KPO4, pH 8.0, 150 mM NaCl, 1% NP-40, 1 mM DTT, 0.5 mM PMSF and Complete™ protease-inhibitor cocktail, used according to the instructions of Roche Diagnostics) was added to the cells. Cells were harvested using a cell scraper. Lysed cells from three dishes were pooled, gently mixed by inversion for 1 hour at 4°C and centrifuged for 10 minutes at 6000 g at 4°C to remove insoluble cellular debris. The cellular extract was mixed with 180 µl of magnetic beads coupled to protein G (Fisher, Nepean, ON) and 8 µl of monoclonal antibody F1-23 (Lamarre et al.) or 8 µl of normal mouse IgG as control, and incubated for 2 hours at 4°C with rotation. The beads had previously been blocked for 1 hour with 1% BSA and washed with lysis buffer. At the end of the incubation period, the beads were washed 3 times with lysis buffer; 180 µl of 2X Laemmli SDS sample buffer containing 5% (v/v) β-mercaptoethanol was then added to the beads, which were placed in a boiling bath for 5 minutes to elute the immunoprecipitated proteins.
Protein separation / digestion
The proteins eluted from the immunoprecipitation (75 μl) were separated by 8% SDS-PAGE. The gel was fixed for 30 min with a solution of 10% (v/v) methanol and 7% (v/v) acetic acid. The gel was then stained with SYPRO™ Ruby fluorescent protein stain (Bio-Rad) according to the manufacturer’s instructions. The entire profile of the immunoprecipitated proteins was sliced from the gel into 50 bands using a Lanepicker™ gel excision tool (The Gel Company) and placed into a 96-well plate. In-gel protein digests were performed on a MassPrep™ liquid handling station (Waters) using sequencing-grade modified trypsin (Promega). Peptide extracts were dried using a SpeedVac™ and resuspended in 10 μl of 0.1% formic acid in water.
LC/MS/MS
Final extracts were analysed by LC-MS/MS using an LCQ-DECA XP mass spectrometer equipped with a nanospray ESI (electrospray ionization) source and a Surveyor autosampler and HPLC system (Thermo Electron). A 5 μl volume of extract was first focused on a Peptide CapTrap™ (Michrom Bioresources) and then loaded on a Biobasic C18 PicoFRIT™ capillary column (PFC7515-BI-10; New Objective). Elution of peptides was performed using a linear acetonitrile gradient (0-60%) over 20 min at a flow rate of approximately 200 nl/min (buffer A: 0.1% formic acid in water; buffer B: 0.1% formic acid in acetonitrile). MS, including collision-induced dissociation, was performed in an automated fashion using the dynamic exclusion option.
Protein Identification
MS/MS spectra were searched using both the Sequest and X!Tandem search tools against the IPI (http://www.ebi.ac.uk/IPI/) human protein database (version 3.01), to which the sequences of protein constructs, proteins of interest and common contaminants were added. Searches were performed specifying complete (fixed) carbamidomethylation modification of cysteine (+57 Da) and oxidation of methionine (+16 Da) residues. The digestion enzyme parameter was set to trypsin. Following Sequest analysis, the algorithms PeptideProphet (Keller et al., 2002a) and ProteinProphet (Nesvizhskii et al., 2003a) were used to determine the probability that peptide and protein assignments, respectively, were correct. The proteins identified in this paper were obtained with a ProteinProphet probability cut-off of 0.9 for Sequest identifications and a log(e) cut-off of -3 for X!Tandem identifications.
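The two acceptance criteria can be written as a simple filter. This is an illustrative sketch only: the record layout, the field names and the sample hits are hypothetical, and treating the thresholds as inclusive (probability of at least 0.9, log(e) of at most -3) is an assumption, since the text does not say whether the boundary values are accepted.

```python
PROTEINPROPHET_MIN_PROBABILITY = 0.9  # cut-off stated for Sequest results
XTANDEM_MAX_LOG_E = -3.0              # cut-off stated for X!Tandem results


def passes_cutoff(hit):
    """Apply the engine-specific acceptance threshold to one identification."""
    engine = hit["engine"]
    if engine == "sequest":
        return hit["protein_probability"] >= PROTEINPROPHET_MIN_PROBABILITY
    if engine == "xtandem":
        # A more negative log(e) means a smaller expectation value.
        return hit["log_e"] <= XTANDEM_MAX_LOG_E
    raise ValueError(f"unknown search engine: {engine!r}")


# Hypothetical identifications, for illustration only.
hits = [
    {"protein": "protein_1", "engine": "sequest", "protein_probability": 0.97},
    {"protein": "protein_2", "engine": "sequest", "protein_probability": 0.42},
    {"protein": "protein_3", "engine": "xtandem", "log_e": -5.2},
    {"protein": "protein_4", "engine": "xtandem", "log_e": -1.1},
]
accepted = [hit["protein"] for hit in hits if passes_cutoff(hit)]
```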
Western blots
Total protein extracts and proteins eluted from the immunoprecipitations were separated by 8% SDS-PAGE and then transferred onto a 0.45 µm pore-size PVDF membrane (Millipore, Bedford, MA). After incubation for 1 hour with blocking solution (PBS with 0.1% (v/v) Tween-20 (PBS-T) containing 5% non-fat milk), the membrane was probed overnight at room temperature, with shaking, with primary antibodies to PARP-1 (C2-10, mouse monoclonal, 1:1000) (Poirier et al.) or RFC1 (replication factor C, 140 kDa subunit; rabbit polyclonal antibody, 1:2500) (Bethyl Laboratories, Montgomery, TX). After washing with PBS-T, species-specific horseradish peroxidase-conjugated secondary antibody was added for 1 hour at room temperature. The signals were finally detected with the Western Lightning™ Chemiluminescence Reagent Plus kit (Perkin Elmer, Boston, MA).
Results and Discussion
Protein Interaction Workflow
The protein interaction workflow is illustrated in figure 2. In our LIMS, data processing is divided into sections corresponding to the four main steps of the workflow: sample preparation, MS data acquisition, protein identification, and PARPs-DB.
The sample preparation section (figure 2 A) allows laboratories to track and organize biological experiments, and stores information about any type of biological sample. This includes terms describing the sample or the individual that supplied it.
For MS data acquisition (figure 2 B), different types of mass spectrometers exist, which use different methods for ionization and mass determination. As the machines from different suppliers use different internal data formats, we have implemented parsers that convert the data from the different mass spectrometers into mzXML/mzData. This representation is designed to encompass all the information required by the currently available search engines. Developed by ISB/EBI, it also provides an OS- and architecture-independent file format for standardized representation, and relieves the developers of analytical applications of the burden of supporting multiple native formats (Pedrioli et al., 2004). By converting all native binary files to mzXML/mzData and using these standards as the start of our analysis pipeline, the same downstream software tools, specifically the database search and the raw spectral viewer, can be used in a uniform manner regardless of the machine used to generate the data.
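As an illustration of why a common format simplifies downstream tools, the sketch below decodes a peak list of the kind stored in mzXML. It assumes the common mzXML layout in which the peaks are a base64 string of big-endian (network order) floating-point values alternating m/z and intensity; compression and the other attributes of the format are ignored, and the function names are hypothetical.

```python
import base64
import struct

_FORMAT_CHARS = {32: "f", 64: "d"}  # precision in bits -> struct format character


def decode_peaks(encoded, precision=32):
    """Decode a base64 peak list into a list of (m/z, intensity) pairs."""
    raw = base64.b64decode(encoded)
    fmt_char = _FORMAT_CHARS[precision]
    count = len(raw) // struct.calcsize(">" + fmt_char)
    values = struct.unpack(f">{count}{fmt_char}", raw)
    # Values alternate m/z, intensity, m/z, intensity, ...
    return list(zip(values[0::2], values[1::2]))


def encode_peaks(peaks, precision=32):
    """Inverse of decode_peaks, used here to build a test input."""
    fmt_char = _FORMAT_CHARS[precision]
    flat = [value for pair in peaks for value in pair]
    raw = struct.pack(f">{len(flat)}{fmt_char}", *flat)
    return base64.b64encode(raw).decode("ascii")


# Hypothetical spectrum; the values are exactly representable in 32 bits.
example_peaks = [(100.5, 2000.0), (250.25, 15.0)]
round_trip = decode_peaks(encode_peaks(example_peaks))
```

Once every instrument's output has been converted to such a representation, a single decoder serves the database search and the spectral viewer alike.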
To analyze mass spectra, the protein identification section (figure 2 C) allows the user to submit the MS data to three search engines: Sequest, Mascot and X!Tandem. The MS/MS analytic module stores, shares, analyzes, mines and publishes tandem MS data. Our protein interaction workflow supports pepXML, a second format that stores the results of assigning peptides to MS/MS spectra and of subsequent peptide-level analyses. Once search results are written or converted to pepXML, they can be uniformly subjected to peptide-level applications and viewed without regard to the method used to assign peptides. Users can examine individual LC-MS/MS runs and groups of runs using complex, customizable analytic filters for peptides and proteins. These filters can be named and saved for later use. Finally, protein identifications are stored in protXML. This data format, developed by ISB, stores protein identifications inferred from input lists of peptides and their subsequent protein-level analysis. Once protein identifications are converted to protXML, protein-level analyses such as protein quantification can proceed, and data can be viewed, without regard to the method used to infer protein identifications. Building on these standards, we use two open source analysis tools, PeptideProphet and ProteinProphet, which provide a standardized way of interpreting MS/MS data. For example, the accurate probabilities computed by PeptideProphet and ProteinProphet serve as guides for the interpretation of peptide and protein identifications, respectively, and enable the prediction of false positive error rates that can be used as objective criteria for comparing data sets generated by different researchers. The module interacts with the protein annotation module (described in the next section) to display rich annotations for putative protein identifications.
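To make the link between identification probabilities and false positive error rates concrete, the following minimal sketch assumes the usual interpretation: if each accepted identification has a probability p of being correct, the expected number of false positives among the accepted identifications is the sum of (1 - p). The function name and the example probabilities are hypothetical; they are not taken from PARPs-DB or from the PeptideProphet software.

```python
def expected_error_rate(probabilities, min_probability):
    """Return (number accepted, expected false-positive rate) for a cutoff.

    `probabilities` are per-identification probabilities of being correct,
    as reported by a tool such as PeptideProphet (assumed input format).
    """
    accepted = [p for p in probabilities if p >= min_probability]
    if not accepted:
        return 0, 0.0
    # Each accepted identification is wrong with probability (1 - p).
    expected_false = sum(1.0 - p for p in accepted)
    return len(accepted), expected_false / len(accepted)


# Hypothetical probabilities, for illustration only.
example_probabilities = [0.99, 0.95, 0.90, 0.60, 0.20]
n_accepted, error_rate = expected_error_rate(example_probabilities, 0.90)
print(n_accepted, round(error_rate, 3))
```

Raising the cutoff lowers the expected error rate at the cost of accepting fewer identifications, which is what makes the rate usable as an objective criterion when comparing data sets.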
The last section is the PARPs database itself (figure 2 D). After processing, the results are loaded automatically into PARPs-DB for viewing. The database system is interconnected with the protein annotation module (figure 2 D). This module manages protein sequence annotations to help investigators cope with the ever-accelerating growth of new information about proteins and their properties. Sequence annotations are updated automatically; however, updates are stored incrementally so that any previous version of a database annotation can be retrieved at any time. The protein annotation module interacts closely with the protein identification section (see below) to allow users to view up-to-date descriptions of the protein sequences they have identified. Specific databases such as UniProt, IPI, RefSeq, BIND, HPRD and Gene Ontology are downloaded into PARPs-DB. Furthermore, dynamically created HTML hyperlinks are used to integrate proteome data with genomic and metabolic data.
Accessing and Navigating Experiments in PARPs Database
To facilitate data analysis, a graphical user interface (GUI) was developed. The GUI guides the user through all steps of the experiment to enter the necessary data (immunoprecipitation methods, gel images, mass spectra, search engine results, etc.; figure 3), which ensures complete documentation of the information. Once all necessary data have been stored in the system, the user can select data sets for visualization.
PARPs-DB users are authenticated against an LDAP provider, such as an institution’s name server. Experimental data and other materials are stored in projects and their sub-folders, much like a file system. Each project has one or more groups of users associated with it, and each group can have a distinct set of permissions (e.g., read only, read and write) for each of the project’s folders. When users log in, the authorization system determines what data they have permission to view, edit, and/or delete and provides access accordingly.
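The group-based authorization described above can be sketched as a lookup from folder to group to permitted actions. This is only an illustration under stated assumptions: the folder names, group names and function names are hypothetical, and the sketch does not reproduce the actual PARPs-DB or LDAP implementation.

```python
# Hypothetical access-control table: folder -> group -> permitted actions.
FOLDER_ACL = {
    "project_a/sample_origin": {"biologists": {"read", "write"}},
    "project_a/mass_spectrometry": {"ms_operators": {"read", "write"}},
    "project_a/sample_results": {
        "biologists": {"read"},
        "ms_operators": {"read"},
    },
}


def effective_permissions(user_groups, folder, acl=FOLDER_ACL):
    """Union of the permissions granted to any of the user's groups."""
    granted = set()
    for group in user_groups:
        granted |= acl.get(folder, {}).get(group, set())
    return granted


def is_allowed(user_groups, folder, action, acl=FOLDER_ACL):
    """True if at least one of the user's groups may perform `action`."""
    return action in effective_permissions(user_groups, folder, acl)


print(is_allowed(["biologists"], "project_a/sample_origin", "write"))
print(is_allowed(["biologists"], "project_a/mass_spectrometry", "read"))
```

A user who belongs to several groups receives the union of their permissions, and a folder with no entry for a group grants nothing by default.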
PARPs-DB has three main sections, “sample origin”, “mass spectrometry” and “sample results”, corresponding to the different steps of a proteomic experiment. These sections allow users to store experimental parameters, results and annotations. Each section has a distinct set of permissions. For example, molecular biologists cannot access the mass spectrometry section and, conversely, mass spectrometry users cannot access the molecular biology section. In each section, we have added or developed tools to help biologists reduce the time needed to analyze data.
First, the sample origin section allows the user to enter experimental parameters by selecting from a number of options. Experimental information includes cell type and cellular conditions, method of gene transfer (when applicable) and gene sequence, and details of the immunoprecipitation method such as lysis buffer composition, antibodies and cell lysis. The user can print the experimental details entered in the database (figure 3A). Supporting material can also be attached; for example, an image of a stained gel showing the proteins immunoprecipitated in the described experiment may be loaded into the database.
The second section of PARPs-DB is the mass spectrometry section. It allows the user to define the parameters of a mass spectrometry experiment, including the plate number, the spot position, method files (figure 3B) and bioinformatics parameters. A tabular file is generated for upload into the mass spectrometer software. At the end of the MS/MS analysis, the raw data are converted to mzXML and mzData in the background. If the user requests a database search, another section gives access to search engine parameters such as the database and the modifications to consider. The data pipeline submits the mzXML or mzData files to the search algorithms and manages the specification of search parameters and FASTA files. Once the analysis is complete, the system offers graphical and tabular views of the experimental steps and their inputs and outputs. Users can monitor the progress of their searches via the web interface.
Finally, the last section of PARPs-DB is the sample results section. LC-MS/MS results are available in this section of a project, which shows protein and peptide identifications in the list view of the PARPs database, sorted, for example, by sample treated according to a given experimental protocol (e.g. digestion) or by probability (figure 3C). Display columns include the UniProt (Bairoch et al., 2005), IPI (Kersey et al., 2004) or RefSeq (Pruitt et al., 2005) annotation, the number of uniquely identified peptides per protein, and the total number of identified peptides per protein. MS/MS search results can be evaluated using this module, which allows proteins and peptides to be sorted and filtered by various criteria. Each identified protein is linked to the protein annotation module (see below); these links are created automatically after the FASTA file is parsed and give access to a variety of up-to-date external sources (figure 3D). The accession numbers of proteins identified by the Sequest, X!Tandem or Mascot searches are matched with those from IPI, and specific information about the protein of interest is automatically retrieved and displayed within the database window. Additional information from the software-assisted identification of the protein, including the identified peptide(s), is displayed in a portal view. The purpose of this feature is to connect protein identifications automatically with their function and other relevant biological information extracted from external databases. A statistics module within the PARP LIMS provides basic information about each experiment in the form of charts (e.g. GO annotations, peptides per protein identification, proteins identified with a certain ProteinProphet probability, etc.). In addition, the database allows data from different experiments to be compared at the protein and peptide levels.
Users are able to query the database, add notes to specific identifications, and select and export lists of interesting proteins including annotations.
Different tools are accessible through the PARPs database navigation, such as BLAST, CLUSTALW and our protein-protein viewer, a graphical tool linked to PARPs-DB that displays the protein-protein interactions it contains (figure 3E). The user can scan all the internal (our own protein-protein interaction assays) and external (from publicly available data sources: INTACT, BIND, HPRD, STRING) protein-protein interactions deposited in the database. Interactions beyond the immediate partners of the target protein are also shown, so that the interaction network can be characterized visually. Proteins of interest can be searched by either accession or keyword. When users enter the accession of a target protein, the protein interaction network is shown as nodes (proteins) and edges (interactions). The network can also be displayed with the annotations of the proteins in the nodes. Each node is linked to the protein annotation module, which gives information about the protein. Data and results can be exported to other formats, including PSI-MI (Stromback and Lambrix, 2005), Excel and DTA (Sequest files), for additional analysis using other tools. This export function was created to make it easy to exchange data between laboratories.
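The node-and-edge representation used by the viewer can be sketched as an undirected adjacency map, with a breadth-first search collecting the proteins within a given number of interaction steps of the target. The code below is an illustration only: the function names are hypothetical, the edge list is a toy example rather than data exported from PARPs-DB, and the real viewer may work differently.

```python
from collections import defaultdict, deque


def build_network(interactions):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    network = defaultdict(set)
    for a, b in interactions:
        network[a].add(b)
        network[b].add(a)
    return network


def neighbourhood(network, target, max_depth=1):
    """Breadth-first search: {protein: distance from target} up to max_depth."""
    distances = {target: 0}
    queue = deque([target])
    while queue:
        protein = queue.popleft()
        if distances[protein] == max_depth:
            continue  # do not expand beyond the requested depth
        for partner in network.get(protein, ()):
            if partner not in distances:
                distances[partner] = distances[protein] + 1
                queue.append(partner)
    return distances


# Toy edge list for illustration only (not a PARPs-DB export).
toy_edges = [("PARP1", "PCNA"), ("PCNA", "RFC1"), ("RFC1", "RFC2")]
toy_network = build_network(toy_edges)
print(neighbourhood(toy_network, "PARP1", max_depth=2))
```

With `max_depth=1` only direct partners are returned; larger values expose the interactions beyond the target protein that the viewer uses to characterize the surrounding network.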
Using Proteomics Standards in PARPs database
A major obstacle to uniform proteomic analysis has been the great heterogeneity of data formats at three distinct levels: different mass spectrometers output their raw spectral data in different proprietary formats, alternative methods that assign peptides to MS/MS spectra output their results in a variety of formats, and different methods that infer protein identifications from lists of peptides output their results in different formats. The proteomics community has recognized this problem and is tackling it through the formation of groups (Proteomics Standards Initiative; Institute for Systems Biology) concerned with the development of standards for capturing and sharing proteomics data. The database was developed in agreement with the HUPO-PSI (Human Proteome Organization-Proteomics Standards Initiative) work, which includes PSI-MI (Molecular Interactions), MS (Mass Spectrometry) and GPS (General Proteomics Standards). Within GPS, standard ways to represent proteomics data and an agreed minimum required level of detail are both urgently required to facilitate the analysis, dissemination and exchange of proteomics data. Minimum Information About a Proteomics Experiment (MIAPE) (Orchard et al., 2004) is a proposed standard for proteomics covering 2-DE and MS. The PARPs database contains classes derived from PEDRo (Taylor et al., 2003). As mentioned earlier, we have added the latest proteomic standards to our pipeline, such as mzXML, mzData and PSI-MI. We expect these formats to greatly facilitate the exchange and publication of MS-based proteomics data, for example through PRIDE (http://www.ebi.ac.uk/pride/) for MS data or INTACT (http://www.ebi.ac.uk/intact/index.jsp) for protein-protein interactions, and to provide a consistent platform for the development of new analytical tools.
Constructing a Protein-Protein Interaction Network for PARP-1
To illustrate the use of PARPs-DB for the discovery of protein interactions, we describe here a PARP-1 co-immunoprecipitation experiment. This co-immunoprecipitation was part of a set of experiments aimed at identifying PARP-1 interactors, which will be published elsewhere (Ethier et al., manuscript in preparation). Construction of the interaction network involved three bioinformatics steps, and the predicted interactions were then verified empirically.
Step 1. Identification of PARP-1 interacting proteins from published small-scale experimental studies
The first step in generating the PARP-1 interaction model is an extensive search of the literature to collect published experimental data on PARP-1 interactions. The keyword used in the PARPs-DB literature search was PARP-1. However, a functional network is not limited to physical protein-protein interactions but also includes genetic and biochemical interactions. We therefore combined all available PARP-1 interactions with DNA, RNA and other biochemical species (figure 4).
Step 2. Establishment of the PARP-1 interaction network by analysis of public databases
The interacting molecules are summarized in figure 4. Four comprehensive large-scale protein interaction databases were included in PARPs-DB: BIND, HPRD, INTACT and STRING. Searching these databases yielded 52 PARP-1 interactors. BIND, INTACT, HPRD and STRING all have extensive collections of human protein-protein interactions, although the first three are primarily used to extract, not predict, protein-protein interaction data from the literature. The STRING database can predict interactions between proteins but focuses on identifying neighbouring genes in the genomic context. It does not include any experimental protein-protein interactions.
Step 3. Selection of Protein-Protein Interactions by Gene Ontology Accession
The next step was to group protein-protein interactions by molecular function using the Gene Ontology (Ashburner et al., 2000) controlled vocabulary included in PARPs-DB. Classifying protein-protein interactions according to molecular function provides additional specificity for target selection. We therefore prioritized our targets further by using two keywords (DNA and replication). After filtering by function, six of the 52 initial interactors remained in the network: PCNA, topoisomerase I and II, DNA ligase I, and DNA Pol α and β. An extensive literature search on these proteins in the context of replication, combined with data mining of the human protein-protein interactions in PARPs-DB, identified 13 proteins and one complex that could interact with PARP-1 within the replication machinery. Of these 13 proteins, PCNA (Frouin et al., 2003), topoisomerase I, DNA ligase I, and DNA Pol α and β have previously been reported to interact with PARP-1. Topo2, MSH2, RFC1, RFC2, RFC3, RFC4 and RFC5 have been reported to be related to DNA replication. This analysis raises the possibility that PARP-1 may regulate these complexes as a whole rather than one or more of their individual components. Of all the potential candidates, we propose that PARP-1 could interact with the proteins of the RFC complex (RFC1 to RFC5) in the context of the replication machinery (figure 5).
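The keyword-based prioritization can be illustrated with a small filter over Gene Ontology term names. This sketch assumes that a protein is retained when at least one of its GO terms contains every keyword; the text does not state exactly how the two keywords were combined, so that rule, the function name and the toy annotations below are all assumptions for illustration.

```python
def filter_by_go_keywords(annotations, keywords):
    """Keep proteins having at least one GO term that contains every keyword.

    `annotations` maps a protein name to a list of GO term names.
    Matching is case-insensitive substring matching (an assumed rule).
    """
    wanted = [keyword.lower() for keyword in keywords]
    selected = []
    for protein, terms in annotations.items():
        if any(all(k in term.lower() for k in wanted) for term in terms):
            selected.append(protein)
    return sorted(selected)


# Toy annotations for illustration only; not an extract of PARPs-DB.
toy_annotations = {
    "PCNA": ["DNA replication", "DNA repair"],
    "LIG1": ["DNA replication", "DNA ligation"],
    "PROTEIN_X": ["protein folding"],
    "PROTEIN_Y": ["DNA binding"],
}
print(filter_by_go_keywords(toy_annotations, ["DNA", "replication"]))
```

Requiring both keywords in the same term is the stricter of the possible readings; relaxing it to "any term matches any keyword" would retain more candidates.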
Demonstration of Biochemical Interactions between PARP-1 and RFC1
Verifying the interactions identified in silico is vital to obtain a reliable interaction network that is useful for further study. With the exception of PCNA, whose interaction has already been characterized, the prioritized candidates were next tested empirically to confirm the predicted protein-protein interactions. This was carried out using co-immunoprecipitation followed by mass spectrometry.
MS/MS spectra were searched using Mascot and X!Tandem, and the search results were then validated with Scaffold (Proteome Software Inc.; version Scaffold-01_03_02). Scaffold was used to group and validate MS/MS-based peptide and protein identifications from the different search engines. This software is based on the PeptideProphet algorithm, an empirical statistical model that estimates the accuracy of peptide identifications made by database search engines such as Sequest. For each tandem mass spectrum, PeptideProphet determines the probability that the spectrum is correctly assigned to a peptide. A second program, ProteinProphet, was subsequently used to group the assigned peptides by corresponding protein and to compute the probability of a correct assignment for each protein. Peptide identifications were accepted if they could be established at greater than 80.0% probability as specified by the PeptideProphet algorithm. For the co-immunoprecipitation eluate, protein identifications were accepted if they could be established at greater than 95% probability and contained at least 2 identified peptides. Table 1 lists all the proteins with a minimum probability of 95%.
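The acceptance criteria stated above (peptide probability greater than 80%, protein probability greater than 95%, at least 2 identified peptides) can be expressed as a simple filter. The sketch below is illustrative: the record layout, the function name and the example hits are hypothetical, and it does not reproduce how Scaffold applies these thresholds internally.

```python
def accept_proteins(protein_hits, min_peptide_prob=0.80,
                    min_protein_prob=0.95, min_peptides=2):
    """Return the names of protein hits that pass all three criteria."""
    accepted = []
    for hit in protein_hits:
        # Only peptides above the peptide-level threshold are counted.
        good_peptides = [p for p in hit["peptide_probs"] if p > min_peptide_prob]
        if (hit["protein_prob"] > min_protein_prob
                and len(good_peptides) >= min_peptides):
            accepted.append(hit["name"])
    return accepted


# Hypothetical hits, for illustration only.
toy_hits = [
    {"name": "protein_A", "protein_prob": 0.99, "peptide_probs": [0.97, 0.85, 0.50]},
    {"name": "protein_B", "protein_prob": 0.99, "peptide_probs": [0.90, 0.70]},
    {"name": "protein_C", "protein_prob": 0.90, "peptide_probs": [0.95, 0.95]},
]
print(accept_proteins(toy_hits))
```

In this toy input, protein_B fails because only one peptide exceeds the peptide threshold, and protein_C fails on the protein-level probability.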
Finally, the co-immunoprecipitation assay performed with the RFC1 antibody in HeLa cells showed that the replication factor C subunits (RFC1, 2, 3, 4 and 5) (table 1 and table 2) formed a complex with endogenous PARP-1. Because many peptides were identified for PARP-1, we chose to present them by highlighting them in red in the PARP-1 sequence (figure 6).
RFC-2, 3, 4 and 5 were each identified with a minimum probability of 98% and by more than 4 peptides with a minimum probability of 95%. The confidence in the identification of RFC-2, 3, 4 and 5 is thus very high; moreover, we identified this RFC complex in several co-immunoprecipitates. The case of RFC-1 is different, as it was identified by only two peptides with probabilities of 95% and in only one fraction. For this reason, we decided to confirm this potential interaction. We confirmed the mass spectrometric identification of RFC-1 by western blot analysis of complexes immunopurified with the mouse monoclonal F1-23 antibody (figure 7). RFC-1 was detected by western blot with the rabbit polyclonal RFC1 antibody. As expected, RFC1 was pulled down with PARP-1. We concluded that this combination of bioinformatics and proteomics approaches allowed the identification of a partner of PARP-1. Although substantially more work is required to determine whether PARP-1 interacts with RFC1 directly or via other proteins such as PCNA or DNA polymerase, or via the polymers on automodified PARP-1, the findings reported here suggest that PARPs-DB may be useful for finding interacting proteins.
Conclusion
The work presented here demonstrates how bioinformatics can supplement conventional biological investigation. PARPs-DB enables the storage, annotation and representation of data generated by molecular biology experiments. Moreover, this system has identified a previously unknown protein interaction of PARP-1. The PARPs database allows the effective description of proteomics experiments and the analysis of protein-protein interactions.
The PARPs database was developed to facilitate data sharing and exchange. It therefore includes the latest standard formats to allow experimental designs and results to be shared with the scientific community. We have incorporated tools that extract protein-protein interactions from the HPRD, DIP and BIND public databases, the literature and other sources of information. The system also generates HTML output reports for peptide and protein analyses and provides comparison reports from multiple or concatenated experiments, which significantly increases the confidence in peptide and protein identifications.
The biochemical data on PARP-1 and the RFC complex confirmed the interaction reported earlier, although substantially more work is required to delineate the specificity and the structural basis of the interaction between PARP-1 and the RFC complex with respect to the regulation of their cellular functions. It is anticipated that building such an integrated platform, which can be constantly upgraded, could provide a predictive understanding of a novel gene’s function in its biological context.
A key design element of the PARPs database is the ability to add tools or modules that plug into and use the core system. PARPs-DB will be expanded as needed to make the analysis tools more efficient.
Acknowledgements
This study was supported by the Canadian Institutes of Health Research and the CIHR program in Functional Genomics. We thank Mr. Pierre Gagné and Eric Winstall for critical revision of the manuscript and Ken Sin Low for data crunching.
References
1. Hunt, D. F. Personal commentary on proteomics. J Proteome Res 1, 15-9 (2002).
2. Link, A. J. et al. Direct analysis of protein complexes using mass spectrometry. Nat Biotechnol 17, 676-82 (1999).
3. Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551-67 (1999).
4. Yates, J. R., 3rd, Eng, J. K., McCormack, A. L. & Schieltz, D. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal Chem 67, 1426-36 (1995).
5. Craig, R. & Beavis, R. C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics (2004).
6. Fenyo, D. Identifying the proteome: software tools. Curr Opin Biotechnol 11, 391-5 (2000).
7. Gomez, S. M., Noble, W. S. & Rzhetsky, A. Learning to predict protein-protein interactions from protein sequences. Bioinformatics 19, 1875-81 (2003).
8. Ito, T. et al. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci U S A 98, 4569-74 (2001).
9. Newman, J. R., Wolf, E. & Kim, P. S. A computationally directed screen identifying interacting coiled coils from Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 97, 13203-8 (2000).
10. Uetz, P. et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623-7 (2000).
11. D'Amours, D., Desnoyers, S., D'Silva, I. & Poirier, G. G. Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions. Biochem J 342 (Pt 2), 249-68 (1999).
12. Ame, J. C., Spenlehauer, C. & de Murcia, G. The PARP superfamily. Bioessays 26, 882-93 (2004).
13. Otto, H. et al. In silico characterization of the family of PARP-like poly(ADP-ribosyl)transferases (pARTs). BMC Genomics 6, 139 (2005).
14. Tulin, A., Chinenov, Y. & Spradling, A. Regulation of chromatin structure and gene activity by poly(ADP-ribose) polymerases. Curr Top Dev Biol 56, 55-83 (2003).
15. Dynek, J. N. & Smith, S. Resolution of sister telomere association is required for progression through mitosis. Science 304, 97-100 (2004).
16. Rouleau, M., Aubin, R. A. & Poirier, G. G. Poly(ADP-ribosyl)ated chromatin domains: access granted. J Cell Sci 117, 815-25 (2004).
17. Gagne, J. P., Hendzel, M. J., Droit, A. & Poirier, G. G. The expanding role of poly(ADP-ribose) metabolism: current challenges and new perspectives. Curr Opin Cell Biol 18, 145-51 (2006).
18. Rumbaugh, J., Jacobson, I. & Booch, G. The Unified Modeling Language Reference Manual (Addison-Wesley, 1999).
19. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25, 25-9 (2000).
20. Keller, A., Nesvizhskii, A. I., Kolker, E. & Aebersold, R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 74, 5383-92 (2002).
21. Nesvizhskii, A., Keller, A., Kolker, E. & Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal Chem 75, 4646-58 (2003).
22. Kersey, P. J. et al. The International Protein Index: an integrated database for proteomics experiments. Proteomics 4, 1985-8 (2004).
23. Bairoch, A. et al. The Universal Protein Resource (UniProt). Nucleic Acids Res 33, D154-9 (2005).
24. Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 33, D501-4 (2005).
25. Bader, G. D., Betel, D. & Hogue, C. W. BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res 31, 248-50 (2003).
26. Xenarios, I. et al. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res 30, 303-5 (2002).
27. Peri, S. et al. Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res 32 Database issue, D497-501 (2004).
28. Pedrioli, P. et al. A common open representation of mass spectrometry data and its application in a proteomics research environment. Nat Biotechnol 22, 1459-66 (2004).
29. Stromback, L. & Lambrix, P. Representations of molecular pathways: an evaluation of SBML, PSI MI and BioPAX. Bioinformatics 21, 4401-7 (2005).
30. Orchard, S. et al. Common interchange standards for proteomics data: Public availability of tools and schema. Proteomics 4, 490-1 (2004).
31. Taylor, C. F. et al. A systematic approach to modeling, capturing, and disseminating proteomics experimental data. Nat Biotechnol 21, 247-54 (2003).
32. Frouin, I. et al. Human proliferating cell nuclear antigen, poly(ADP-ribose) polymerase-1, and p21waf1/cip1. A dynamic exchange of partners. J Biol Chem 278, 39265-8 (2003).
Different classes (rectangles) are shown with their associations (lines). A class is described by its attributes; e.g. a sample can be specified by its name and date. The protocol tables (red) define protocols as an ordered sequence of steps. The experiment tables (purple) record series of protocols acting on material or data input. The protein tables (orange) define protein identifications and ontologies. The blue and green tables store additional information about search engine parameters.
(1) Immunoprecipitation of PARPs; (2) raw spectral data generated by different mass spectrometers; (3) peptide assignments using different search engines and protein identifications using different methods of inference; (4) the annotations and results are loaded automatically into the PARPs database for viewing. The interactions retrieved from the DIP, BIND and HPRD public databases are updated regularly in the PARPs database (protein annotations).
(A) Sample origin section: allows the user to enter experimental parameters and to visualize the protocol; (B) mass spectrometry section: allows the user to define the parameters of the mass spectrometry instrument; (C) analysis section: shows protein identifications, including the protein accession number, the Entrez Gene accession, the number of peptides identified and a summary of the protein's function; (D) protein card layout: gives access to a variety of up-to-date, publicly available external sources; (E) viewer: allows the user to display protein-protein interactions from internal experiments and from external experiments (publicly available data sources).
Figure 4 : The biochemical and physical interaction networks of PARP-1. This figure summarizes the results of the protein-protein interaction database and literature searches for PARP-1 substrates and cooperators, which form the basis of the protein-protein interaction networks.

In silico prediction from the protein-protein interaction databases for a specific function: the replication machinery. To provide maximal coverage of the potential interactome, we searched four databases. Comprehensive database analyses revealed 52 proteins as potential interaction candidates for PARP-1. These candidate proteins were further prioritized using searches based on two Gene Ontology keywords (DNA and replication), and 13 proteins were selected on the basis of their unique molecular functions. Further literature review identified the proteins in the DNA replication complex, the RFC complex and the BRC complex. Proliferating Cell Nuclear Antigen (PCNA) has been shown to interact biochemically and/or genetically with PARP-1. We found a potential interaction between PARP-1 and RFC1, 2, 3, 4 and 5.
Solid lines represent direct protein interactions and dashed lines represent proteins found in a complex.
Table 1 : List of ID separated proteins identified by LC-MS/MS from the co-immunoprecipitation assay with RFC-1 antibody in HeLa cells
Figure 7 : PARP-1 and the interactor RFC1 were immunoprecipitated with mouse monoclonal F1-23 antibody. RFC1 was detected by western blot with rabbit polyclonal RFC1 antibody. PARP1 was detected by western blot with mouse monoclonal C2-10 antibody.

Ctr: lysate with normal mouse IgG instead of F1-23 antibody.
© Arnaud Droit, 2007