PPDB

PPDB: The Plant Proteome Database


About PPDB

PPDB, initiated in 2004, is a joint project between the Klaas J. van Wijk Lab at Cornell University and the Computational Biology Service Unit of the Cornell Life Sciences Core Laboratories Center. PPDB is a Plant Proteome DataBase for Arabidopsis thaliana and maize (Zea mays). PPDB was initially dedicated to plant plastids but has since expanded to cover the whole plant proteome.

What is the purpose of PPDB?

The main objective is to provide a centralized, curated data repository for predicted and experimentally determined proteins in Arabidopsis thaliana and maize (Zea mays), their annotated functions, and their experimental and predicted molecular and biophysical properties. Importantly, information from mass spectrometry-based identifications is available for each identified protein accession; this allows the database user to determine the significance of the experimental identification and to evaluate information on post-translational modifications. The content of PPDB can be accessed directly through its web interface (http://ppdb.tc.cornell.edu/). Multiple search methods are provided so that the user can retrieve information based on gene identification number, functional annotation or various protein properties. Active links to other databases (e.g. TAIR and TIGR) are provided.

The objectives of PPDB are to:

1. Collect carefully curated experimental (plastid) proteome/mass spectrometry data from maize and Arabidopsis thaliana and make them publicly accessible. Curation focuses on the experimentally determined protein localization within the plastid, the physical-chemical properties of the processed proteins, and other relevant protein features. Detailed information on the mass spectrometry-based identifications is provided.

2. Annotate protein name, location and function based on primary literature. The functional classification system developed by Thimm et al. 2004 (Plant Journal 37, 914-939) is used as a basis and is updated where possible from primary literature, public information and in-house experimental data.

3. Provide the predicted chloroplast localization, the predicted chloroplast transit peptide (cTP) and lumenal transit peptide (lTP), and the physical-chemical properties (e.g. pI, mass, hydrophobicity, predicted number of transmembrane domains, cysteine content) of precursor and processed proteins.

4. Collect information about protein interactions for each protein and ultimately provide a network of plastid protein interactions.

5. Predict closest orthologues between maize (Zea mays) and Arabidopsis thaliana protein models and couple experimental and predicted data for these predicted orthologues.

6. Collect high-quality information on post-translational modifications using high-resolution (100,000) and high-mass-accuracy measurements and make it available.
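
To make objective 3 more concrete, the sketch below shows how two of the listed properties, cysteine content and hydrophobicity, might be computed for a precursor protein and for its processed form after removal of a predicted cTP. This is only an illustration: the function names, the toy sequence and the cTP length are hypothetical, and hydrophobicity is approximated here by the mean Kyte-Doolittle hydropathy (GRAVY) score, which is an assumption and not necessarily the measure PPDB reports.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Using this scale is an assumption; PPDB's own hydrophobicity measure
# is not specified in the text above.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cysteine_count(seq: str) -> int:
    """Number of cysteine residues in the sequence."""
    return seq.upper().count("C")


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (GRAVY score)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KD[aa] for aa in seq) / len(seq)


def summarize(precursor: str, ctp_length: int) -> dict:
    """Properties of the precursor and of the processed protein.

    ctp_length is the (hypothetical) predicted transit peptide length;
    the processed protein is the precursor with that prefix removed.
    """
    if not 0 <= ctp_length < len(precursor):
        raise ValueError("ctp_length must be shorter than the precursor")
    forms = {"precursor": precursor, "processed": precursor[ctp_length:]}
    return {
        name: {
            "length": len(seq),
            "cysteines": cysteine_count(seq),
            "gravy": round(gravy(seq), 3),
        }
        for name, seq in forms.items()
    }


if __name__ == "__main__":
    toy_precursor = "MASSLLCTSRAVKKCGDEILV"  # made-up sequence, not a real protein
    print(summarize(toy_precursor, ctp_length=8))
```

Reporting both forms side by side mirrors the distinction the objective draws between precursor and processed proteins: removing the transit peptide changes the length and can change the residue composition of the mature protein.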

What are plastids and why study their proteome?

Plastids are essential organelles of prokaryotic origin that are present in every plant cell. During plant development, proplastids in meristematic cells differentiate into non-photosynthetic plastids in roots and petals and into photosynthetic plastids (chloroplasts) in leaves and stems. Plastids are responsible for the synthesis of key molecules required for the architecture and functions of plant cells. Ten to twelve percent of the ~29,500 Arabidopsis thaliana genes are predicted to encode plastid proteins, underlining the importance of this organelle for the plant cell. The proteomes of the different plastid types in different organs or cell types (e.g. bundle sheath and mesophyll cells in maize) are not well characterized and may be very different. Characterization of these proteomes will provide insight into the essential roles of the different plastid types.

Experimental data sets in PPDB (internal and external)

In-house data. All experimental data are from whole leaves or different plastid preparations. These (will) include non-photosynthetic plastids from different members of the Brassicaceae family, including Arabidopsis, Brassica oleracea and Brassica rapa, as well as chloroplasts from Arabidopsis thaliana and maize (Zea mays). In the case of maize, a C4 plant, plastids purified from either bundle sheath (BS) cells or mesophyll (M) cells are analyzed to address their specialization. Basic information about the fractionation method can be extracted for each experimental identification via a user-determined output format.

External (published) datasets. Medium- to large-scale Arabidopsis thaliana plastid proteome datasets and proteome datasets from other subcellular compartments (e.g. mitochondria, plasma membranes) are stored in PPDB and linked to each locus. This information can be obtained by selecting ‘Proteomics Publication’ as the output parameter. It helps to identify proteins that partition to different locations and abundant proteins that are often found as contaminants.

Data deposition. Anyone interested in having their plastid proteome datasets included in PPDB should contact kv35@cornell.edu or qisun@cornell.edu.