PPDB, initiated in 2004, is a joint project
between Klaas J. van Wijk
University and the Computational Biology Service Unit
Life Sciences Core Laboratories
Center. PPDB is a Plant Proteome DataBase for
Arabidopsis thaliana and maize (Zea mays). Initially PPDB was dedicated
to plant plastids, but has now expanded to the whole plant proteome.
What is the purpose of PPDB?
The main objective is to provide a centralized,
curated data repository for predicted and experimentally determined proteins
in Arabidopsis thaliana and
maize (Zea mays), their
annotated functions, as well as their experimental and predicted molecular
and biophysical properties. Importantly, information from mass
spectrometry-based identifications is available for each identified
protein accession; this allows the database user to determine the
significance of the experimental identification and also to evaluate information
on post-translational modifications. The content of PPDB can be directly
accessed through its web interface (http://ppdb.tc.cornell.edu/).
Multiple search methods are provided so that the user can retrieve
information based on gene identification number, functional annotation or
various protein properties. Active links to other databases (e.g. TAIR and
TIGR) are present.
The objectives of PPDB are to:
1. Collect carefully curated experimental
(plastid) proteome/mass spectrometry data from maize and Arabidopsis thaliana and make them
publicly accessible. Curation is focused on the experimentally determined
protein localization within the plastid, physical-chemical properties of
the processed proteins, and other relevant protein features. Detailed
information regarding mass spectrometry-based identifications is available.
2. Annotate protein name, location and function
based on primary literature. The functional classification system
developed by Thimm et al. 2004 (Plant Journal 37, 914-939) is used as a
basis, updated where possible from primary literature, public information
and in-house experimental data.
3. Provide predicted chloroplast localization,
predicted chloroplast transit peptides (cTP) and lumenal transit peptides
(lTP), and physical-chemical properties (e.g. pI, mass, hydrophobicity,
predicted number of transmembrane domains, cysteine content, etc.) of
precursor and processed proteins.
4. Collect information about protein interactions
for each protein and ultimately provide a network of plastid protein interactions.
5. Predict the closest orthologues between maize (Zea mays) and Arabidopsis thaliana protein
models and couple experimental and predicted data for these predicted orthologues.
6. Collect high quality information on
post-translational modifications using high resolution (100,000) and high
mass accuracy measurements and make this available.
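
To make the physical-chemical properties named in objective 3 concrete, here is a minimal Python sketch that computes average mass, hydrophobicity (as a GRAVY score on the Kyte-Doolittle scale) and cysteine content for a precursor and for its processed form. It is an illustration only, not PPDB's own pipeline: the function names, the choice of the Kyte-Doolittle scale, and the idea of passing a known transit peptide length are all assumptions made for this example.

```python
# Illustrative sketch only; not PPDB code. All names here are hypothetical.

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free termini


def protein_properties(sequence):
    """Return length, average mass, GRAVY score and cysteine count."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return {
        "length": len(seq),
        "mass_da": sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS,
        "gravy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq),
        "cysteines": seq.count("C"),
    }


def precursor_and_processed(precursor, transit_peptide_length):
    """Compare a precursor with the protein left after removing an
    N-terminal transit peptide of the given (assumed known) length."""
    processed = precursor[transit_peptide_length:]
    return protein_properties(precursor), protein_properties(processed)
```

Calling `precursor_and_processed(seq, n)` with a precursor sequence and a predicted cTP length returns two dictionaries, so the effect of transit peptide removal on mass and hydrophobicity can be read side by side. pI and transmembrane domain prediction are left out because they need titration models or dedicated predictors.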
What are plastids and why study their proteomes?
Plastids are essential
organelles of prokaryotic origin that are present in every plant cell.
During plant development, proplastids in meristematic cells differentiate
into non-photosynthetic plastids in roots and petals and into
photosynthetic plastids (chloroplasts) in leaves and stems. Plastids are
responsible for synthesis of key molecules required for the architecture
and functions of plant cells. Ten to twelve percent of the ~29,500 Arabidopsis thaliana genes (roughly 2,950 to 3,540 genes) are predicted to
encode plastid proteins, underlining the importance of this organelle
for the plant cell. The proteomes of the different plastid types in
different organs or in different cell types (e.g. bundle sheath and mesophyll
cells in maize) are not well characterized and may be very different.
Characterization of these proteomes will provide insight into the essential
role of the different plastid types.
Experimental data sets in PPDB
(internal and external)
In-house data. All experimental data are from whole leaves or
different plastid preparations. These (will) include non-photosynthetic
plastids from different members of the Brassicaceae family,
including Arabidopsis, Brassica oleracea and Brassica rapa, as well
as chloroplasts from Arabidopsis thaliana and maize (Zea mays). In the case of maize, a C4
plant, plastids purified from either bundle sheath (BS) cells or mesophyll
(M) cells are analyzed to address their specialization. Basic information
about the fractionation method can be extracted for each experimental
identification, via a user-determined output format.
External datasets. Medium to large-scale Arabidopsis thaliana plastid
proteome datasets and proteome datasets from other subcellular
compartments (e.g. mitochondria, plasma membranes) are stored in PPDB and
linked to each locus. This information can be obtained by selecting
‘Proteomics Publication’ as the output parameter. This will help to identify
proteins that partition to different locations and identify abundant
proteins often found as contaminants.
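
The idea of spotting proteins that partition to several locations can be sketched in a few lines of Python. This is a hedged illustration, not a description of how PPDB stores or queries its data: the record layout (locus, compartment), the function names and the placeholder locus identifiers are all hypothetical.

```python
# Illustrative sketch only; the record layout and names are hypothetical.
from collections import defaultdict


def compartments_by_locus(records):
    """Group (locus, compartment) pairs into locus -> set of compartments."""
    found = defaultdict(set)
    for locus, compartment in records:
        found[locus].add(compartment)
    return dict(found)


def multi_compartment_loci(records, min_compartments=2):
    """Return loci reported in at least `min_compartments` compartments.

    Such loci may be genuinely dual-targeted, or may be abundant
    proteins that show up as contaminants in many preparations.
    """
    grouped = compartments_by_locus(records)
    return {
        locus: sorted(compartments)
        for locus, compartments in grouped.items()
        if len(compartments) >= min_compartments
    }


# Placeholder records standing in for entries from published datasets.
example_records = [
    ("LOCUS_A", "plastid"),
    ("LOCUS_A", "mitochondria"),
    ("LOCUS_B", "plastid"),
    ("LOCUS_A", "plastid"),
]
flagged = multi_compartment_loci(example_records)
```

Repeated reports of the same locus in the same compartment collapse into one entry, so only the number of distinct compartments decides whether a locus is flagged.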
Deposition data. Anybody interested in having their plastid proteome
datasets included in PPDB should contact firstname.lastname@example.org or