Molecular Evolution, Structure, and Function of Peroxidasins

Peroxidasins represent subfamily 2 of the peroxidase-cyclooxygenase superfamily and are closely related to chordata peroxidases (subfamily 1) and peroxinectins (subfamily 3). They are multidomain proteins containing a heme peroxidase domain with high homology to human lactoperoxidase that mediates one- and two-electron oxidation reactions. Additional domains of the secreted and glycosylated metalloproteins are type C-like immunoglobulin domains, typical leucine-rich repeats, as well as a von Willebrand factor C module. These are typical motifs of extracellular proteins that mediate protein–protein interactions. We have reconstructed the phylogeny of this new family of oxidoreductases and show the presence of four invertebrate clades as well as one vertebrate clade that also includes two different human representatives. The variability of domain assembly in the various clades was analyzed, as was the occurrence of relevant catalytic residues in the peroxidase domain based on the knowledge of catalysis of the mammalian homologues. Finally, the few reports on expression, localization, enzymatic activity, and physiological roles in the model organisms Drosophila melanogaster, Caenorhabditis elegans, and Homo sapiens are critically reviewed. Roles attributed to peroxidasins include antimicrobial defense, extracellular matrix formation, and consolidation at various developmental stages. Many research questions remain to be answered in the future, including detailed biochemical/physical studies and elucidation of the three-dimensional structure of a model peroxidasin as well as the relation and interplay of the domains and the in vivo functions in various organisms including man.

Introduction. – Recently, we have reconstructed the phylogenetic relationships of the main evolutionary lines of the mammalian heme-containing peroxidases myeloperoxidase (MPO), eosinophil peroxidase (EPO), lactoperoxidase (LPO), and thyroid peroxidase (TPO) [1]. Based on their occurrence and the fact that two main enzymatic activities are related to these metalloproteins, the peroxidase-cyclooxygenase superfamily was defined and shown to occur in all kingdoms of life [1]. Seven clearly separated subfamilies were found with chordata heme peroxidases comprising subfamily 1. Its representatives are involved in the innate immune system (MPO, EPO, and LPO) as well as hormone biosynthesis (TPO). Mature vertebrate peroxidases consist of only one (monomeric EPO and LPO) or two (homodimeric MPO and TPO) glycosylated, mainly α-helical domains with one autocatalytically modified heme per domain [2]. Well characterized chordata peroxidases evolved from multidomain enzymes called peroxidasins (subfamily 2) [1]. These peculiar proteins are found in vertebrates and invertebrates and have, in addition to a heme-containing domain of high homology with chordata peroxidases, flanking domains which are known to be important for protein-protein interaction or cell adhesion. These include leucine-rich regions, immunoglobulin-like domains, as well as a von Willebrand factor C (VWC) module.
In 1994, the first peroxidasin was described and found to occur in hemocytes of Drosophila [3]. Hemocytes are migratory cells present in the hemolymph of insects and are involved in (targeted) consolidation of extracellular matrix as well as in intracellular phagocytosis of apoptotic cells or foreign material (e.g., pathogens). In the following years, homologous genes were detected among numerous ecdysozoan [4] and almost all deuterostomian genomes [5] [6], implying their importance in both invertebrate and vertebrate physiology. The human homologue of this subfamily, first cloned as a shorter mRNA fragment, was originally identified as a p53-responsive gene [7]. Finally, on a protein basis, the first human peroxidasin was detected in human colon cancer cells, but later also in squamous lung carcinoma cells, where it was initially named melanoma gene 50 (MG50) [8]. In the human genome, two different peroxidasins are encoded [1] and were originally designated as vascular peroxidase (VPO) 1 and 2 (VPO 2 is sometimes referred to as cardiac peroxidase), in relation to their high expression in the vascular system or heart [9]. However, both enzymes are multidomain proteins of subfamily 2 of the peroxidase-cyclooxygenase superfamily and thus should be named systematically human peroxidasin 1 (hsPxd01) and human peroxidasin 2 (hsPxd02). In recent years, several papers were published that report expression patterns [9–13], localization [9] [14], enzymatic activities [9], and physiological relevance of peroxidasins [15–18], but still many open questions remain regarding the relation between structure and function and the putative role(s) in the extracellular matrix formation and the innate immune system of invertebrates and vertebrates.
Here, we have reconstructed the molecular evolution of peroxidasins. We have analyzed the multidomain structure of the two main groups, i.e., invertebrate and vertebrate peroxidasins, and discuss the function of the enzymatic domain based on the knowledge about catalysis of human peroxidases [2]. Finally, we critically relate these facts to biochemical and biological features of these proteins.
Results and Discussion. – Phylogenetic Reconstruction. Genes encoding peroxidasins are found in vertebrates as well as in invertebrates. For the phylogenetic reconstruction, all available peroxidasin sequences (i.e., 106 in December 2011) were aligned and analyzed by three different methods. Fig. 1 depicts the reconstructed unrooted tree obtained by the neighbor-joining (NJ) method; however, very similar trees were achieved by the minimum-evolution (ME) and maximum-likelihood (ML) methods. A clear separation into five peroxidasin clades is evident, with Clades 1–4 representing invertebrate and Clade 5 vertebrate oxidoreductases (Fig. 1). For the phylogenetic analysis, the full gene length including all domains was used; however, analyses based on only the peroxidase domain gave a very similar evolutionary tree (results not shown).
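To illustrate what the neighbor-joining step does, the following minimal sketch implements the textbook NJ recursion on a small distance matrix. It is not the MEGA5 implementation used for Fig. 1; the function name, the node labels, and the five-taxon toy distances are assumptions chosen purely for illustration and are unrelated to the peroxidasin data.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Textbook neighbor-joining on a symmetric distance table.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns a list of (node, neighbor, branch_length) edges of the unrooted tree.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of every node from all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Pick the pair minimizing Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node.
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges

# Toy additive distances for five hypothetical taxa (not peroxidasin data).
toy_names = ["a", "b", "c", "d", "e"]
toy_rows = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
toy_dist = {frozenset(pair): float(v) for pair, v in toy_rows.items()}
toy_edges = neighbor_joining(toy_names, toy_dist)
for edge in toy_edges:
    print(edge)
```

In practice the distances would come from the multiple sequence alignment under a substitution model, and ties in the Q criterion may be broken differently by different programs.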
Closely related to subfamily 3 (i.e., peroxinectins) of the peroxidase-cyclooxygenase superfamily [1], and thus somehow linking subfamilies 2 and 3, are short peroxidasins that lack all characteristic peroxidasin domains except the peroxidase domain. So far (December 2011), only eight genes are known and all occur in nematodes. Since these putative proteins do not show the typical multidomain structure of either peroxinectins or peroxidasins, they are not designated as a distinct clade. Echinozoan and ecdysozoan peroxinectins have integrin-binding motifs in addition to the peroxidase domain [1] [18] and, so far, no vertebrate representative was found.
Vertebrate peroxidasins (Clade 5) can be divided into two branches, namely peroxidasins 1 (Pxd1) and peroxidasins 2 (Pxd2), sometimes referred to as peroxidasin-like proteins. In each branch, a clear segregation in fish, amphibian, bird, and mammalian proteins is obvious (Fig. 3). In the human genome, two peroxidasins are encoded that share a sequence identity of 63%. Peroxidasin 1 from Homo sapiens (hsPxd01) is located on chromosome 2p25, whereas human peroxidasin 2 (hsPxd02) is located on chromosome 8q11 [9]. A closer look at Fig. 3 reveals the presence of a third human peroxidasin, which is almost identical to hsPxd02, except for five amino acids, and might be a splicing variant.
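A sequence identity such as the 63% quoted above is computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the columns in which neither sequence has a gap. The source does not state which denominator was used, so this convention, the function name, and the toy sequences are assumptions.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither row is a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Toy aligned fragments (hypothetical, not hsPxd01/hsPxd02).
toy_identity = percent_identity("MKTLLAGW-", "MRTLIAGWQ")
print(toy_identity)
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives slightly different percentages for the same alignment.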
Domain Assembly and Architecture. So far, no three-dimensional structure of a peroxidasin is known, but comparative sequence analysis and secondary-structure prediction clearly indicate the presence of four different domains of distinct assembly and architecture. Fig. 4 shows the domain assembly in invertebrate and vertebrate peroxidasins. In the latter, and thus also in both human peroxidasins, the primary transcript has a signal peptide for extracellular secretion followed by five leucine-rich repeats with N-terminal and C-terminal capping motifs, four successive immunoglobulin domains, one heme-binding peroxidase domain, and a C-terminal von Willebrand factor C module (Fig. 4).
Clade 1 and nematode peroxidasins (Clade 2) vary in the number of both the leucine-rich repeats and the immunoglobulin motifs. Additionally, they do not contain a C-terminal von Willebrand factor C, suggesting that this feature is phylogenetically the newest addition to these multidomain proteins. Arthropod peroxidasins (Clade 3) also show variable numbers of leucine-rich repeats and immunoglobulin domains, but the majority of the putative proteins seem to already contain a C-terminal von Willebrand factor C, whereas in the second group of hemichordate and urochordate peroxidases (Clade 4) the von Willebrand factor C is absent.
Up to now, we have no detailed knowledge about the relation between domain assembly and structure and their physiological relevance in peroxidasins. The only enzymatic domain is the heme-containing peroxidase domain and its sequence is critically analyzed below. All the other domains are found in thousands of proteins in various contexts.
Leucine-Rich Repeats. Leucine-rich repeats (LRR) occur in numerous proteins and all of them appear to be involved in protein-protein interactions including cell adhesion and signal transduction, extracellular matrix assembly, platelet aggregation, neuronal development, RNA processing, and immune response [19]. A striking example for the functional variety of LRRs is the adaptive immune system of jawless fishes. They possess variable lymphocyte receptors (VLR) built from proteins that are completely unrelated to immunoglobulins. Interestingly, they are reported to be generated from one or two germ lines of VLR genes by genomic rearrangement of flanking LRR cassettes [20]. Seven subclasses of LRR motifs differing in length and consensus sequence of the variable segments (typical, RI-like, CC, PS, SDS22-like, bacterial, and TpLRR) have been proposed [21]. In peroxidasins, only typical LRR cassettes are found. This motif exhibits a solenoid structure where each repeat represents a turn that is typically 20–30 amino acids long. It contains a conserved eleven-residue hallmark sequence (LxxLxLxxNxL) corresponding to a β-sheet that forms the concave side of the solenoid [19]. By contrast, the convex side shows a greater variability and may consist of α-helical structures including 3₁₀ helices, polyproline II conformation, and β-turns depending on the amino acid composition and length of the variable segment [22]. The conserved leucines as well as other mainly hydrophobic amino acids are arranged in a characteristic tightly packed structure forming the inner core of the solenoid and contribute to the stability of the LRR domain. Many ligand-LRR protein complexes depict the ligand more or less surrounded by the concave surface of the parallel β-sheets of the LRR domain; however, ligand interactions with the convex side have also been observed. Generally, ligand binding does not induce major rearrangements in the LRR structure, since the free and bound forms superimpose [19].
The hydrophobic core of the LRR domains may be protected from opening by N-terminal and C-terminal cysteine-rich flanking regions, which are often found in extracellular LRR proteins including peroxidasins [21]. The N-terminal capping motif (LRRNT) is described as a single β-strand, antiparallel to the main β-sheet, followed by a short LRR of 20 or 21 residues, whereas the C-terminal capping motif (LRRCT) possesses an α-helix covering the hydrophobic core of the last LRR [21]. Fig. 5, a, shows a model of the LRR domain of hsPxd01 that is built from five tandem typical LRRs framed by LRRNT and LRRCT.
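The eleven-residue hallmark sequence LxxLxLxxNxL can be searched for directly in a protein sequence. The following minimal sketch does this with a regular expression, under the simplifying assumption that the consensus positions are strictly Leu and Asn. Real LRR detection usually relies on profile-based methods that tolerate other hydrophobic residues at the Leu positions, and the function name and test sequence here are hypothetical.

```python
import re

# LxxLxLxxNxL with 'x' as any residue; the lookahead allows overlapping hits.
LRR_HALLMARK = re.compile(r"(?=(L..L.L..N.L))")

def find_lrr_hallmarks(sequence: str):
    """Return (0-based start, matched 11-mer) for every strict consensus hit."""
    return [(m.start(), m.group(1)) for m in LRR_HALLMARK.finditer(sequence.upper())]

# Hypothetical fragment containing one consensus stretch (not from hsPxd01).
toy_hits = find_lrr_hallmarks("AAALSGLDLSNNKLTTT")
print(toy_hits)
```

Counting such hits gives only a rough estimate of the number of repeats, since diverged cassettes that deviate from the strict consensus are missed.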
Immunoglobulin Domains. Since peroxidasins in addition contain immunoglobulin (Ig) domains, they are also assigned to the heterogeneous immunoglobulin superfamily (IgSF). This major group of proteins differs in tissue distribution, amino acid composition, and biological role, but all representatives possess at least one structurally discrete domain of ca. 100 amino acids exhibiting the Ig-fold [23]. The differentiation of immunoglobulin domains is based on the composition of the two β-sheets that form the sandwich-like fold. The constant domains (C-domains) are comprised of seven strands and the variable domains (V-domains) of eight, nine, or ten strands [24]. The standardized IMGT (International ImMunoGeneTics Information System) discriminates immunoglobulin domains according to the following categories: C-domains (including immunoglobulins and T-cell receptors of all jawed vertebrates), C-like domains (proteins other than immunoglobulins or T-cell receptors), V-domains (immunoglobulins and T-cell receptors with V-J-region and V-D-J-region) and V-like domains (proteins other than immunoglobulins or T-cell receptor with V-J-region and V-D-J-region) [25]. According to this systematics, peroxidasins have C-like Ig-domains.
As mentioned before, the Ig-fold is a wide-spread protein motif with various biological functions, although the common feature seems to be cell adhesion and pattern recognition. An example for a structural function of immunoglobulin domains are the Ig-tandems of titin, a myofilament protein and the largest mammalian protein.
Von Castelmur et al. [26] determined the crystal structure of a fragment from the skeletal I-band of soleus titin. This structure was also used as a template for tertiary-structure modelling of the Ig-domains of hsPxd01 (Fig. 5, b).
Von Willebrand Factor Type C. The C-terminal von Willebrand factor type C (VWC) module is also referred to as chordin-like, cysteine-rich (CR) repeat. It contains ca. 60–80 amino acids and is defined by a consensus sequence of ten cysteines [27]. The name originates from the five types of structural domains comprising the von Willebrand factor (VWF), a multimeric blood glycoprotein that binds and stabilizes clotting factor VIII and mediates platelet adhesion [28]. This motif has been identified in more than 500 extracellular matrix proteins including CCN (cysteine-rich protein 61, connective tissue growth factor proteins, nephroblastoma overexpressed gene), procollagen, thrombospondin, glycosylated mucins, and neuralins with varying copy numbers [29]. Peroxidasins contain only one C-terminal copy, similar to the CCN proteins.
For most VWC modules, the cellular role has still to be investigated [29]; however, binding and regulating bone morphogenetic proteins (BMPs) and transforming growth factor beta (TGF-β) are the two most common functions attributed to the VWC domain [30]. Moreover, the VWC domain might be involved in oligomerization of proteins [28].
The three-dimensional structure of a prototypical chordin-like, cysteine-rich repeat from collagen IIA was determined by O'Leary et al. [27]. They found that the VWC domain exhibits a two-subdomain architecture connected via a short linker region. A model of the VWC domain of hsPxd01 is shown in Fig. 5, d. The C-terminal subdomain adopts a rather irregular and flexible structure, whereas the N-terminal subdomain of peroxidasin consists of two double-stranded antiparallel β-sheets, in contrast to the N-terminal subdomain of collagen IIA that contains one double-stranded and one triple-stranded antiparallel β-sheet. Interestingly, there is a structural similarity between the N-terminal subdomain of the VWC module and the fibronectin type 1 (FN1) domain. Since FN1 domains are only found in vertebrates, in contrast to VWCs, which have been identified in all eukaryotes, an evolutionary relationship between these two domains might be possible [27].
The Enzymatic Heme Domain. Subfamilies 1–3 of the peroxidase-cyclooxygenase superfamily contain one enzymatic peroxidase domain [1]. Multiple sequence alignment reveals highly conserved regions (Figs. 6 and 7). They correspond to functionally and structurally essential motifs, as known from mammalian peroxidases. The secondary structure of mammalian peroxidases as well as of the peroxidase domain of peroxidasins is predominantly α-helical with a central heme-containing core composed of five helices (Fig. 5, c).
Both the distal and proximal histidines in the heme cavity as well as the corresponding H-bonding partners are located within α-helices. Essential distal residues in mammalian peroxidases are Gln91, His95, and Arg239 (Fig. 5, c; numbering corresponds to mature lactoperoxidase and numbering in parentheses corresponds to hsPxd01). The peroxidase-typical distal pair His-Arg is found in all heme peroxidases from both superfamilies (peroxidase-cyclooxygenase and peroxidase-catalase superfamily) [1] and is also fully conserved in all peroxidasins (Fig. 6). This pair is important in the heterolytic cleavage of H₂O₂ [2]. In mammalian peroxidases, Gln91 (Fig. 5, c) is involved in the maintenance of the distal H-bond network as well as halide binding [2]. This essential glutamine, which is part of a conserved motif W/F-G-Q-F, is present in almost all analyzed peroxidasin sequences, except short peroxidasins and group Pxd2 of Clade 5 enzymes including hsPxd02 (Fig. 6).
The second adjacent conserved motif in mammalian peroxidases, which is also fully conserved in peroxidasins and includes the distal His95, is Asp-His-Asp. Asp94 is known to be involved in ester-bond formation with the OHCH₂ group at C(5) of pyrrole ring C of the heme (Fig. 5, c), and it is found in all peroxidasins, except group Pxd2 of Clade 5 and short peroxidasins. Regarding Asp96, its role as a ligand of Ca²⁺ is well established. The role of the distal Ca²⁺-binding site could be both to stabilize the distal heme cavity architecture and to mediate the assembly of the mature peroxidases [2]. In both lactoperoxidase and myeloperoxidase, the Ca²⁺-binding site has a typical pentagonal bipyramidal coordination geometry [2] and the cation is liganded by several highly conserved residues. One is the already mentioned Asp96 and the other ligands are part of a loop that consists of eight residues, i.e., Leu-Thr-Ser-Phe-Val-Asp-Ala-Ser (amino acid number 333–340 of the mature LPO), and is found in all chordata peroxidases. This motif is also present in all peroxidasins, although some mutations, especially in short peroxidasins and group Pxd2 of Clade 5, are observable.
All mammalian peroxidases have their prosthetic group covalently bound to the protein and, besides Asp94 mentioned above, a conserved Glu242 forms the second ester bond. Except short peroxidasins, all other members of this protein family have this glutamate residue fully conserved (Fig. 6).
Structurally and functionally important motifs on the proximal side of mammalian peroxidases include the proximal histidine (His336) and its H-bonding partner asparagine (Asn421; Fig. 5, c). Both residues govern the heme-iron reactivity by controlling the electron density at the metal [2]. Inspection of Fig. 7 again demonstrates that Clades 1–4 and group Pxd1 of Clade 5 are chordata peroxidase-like, whereas in group Pxd2 of vertebrate peroxidasins variable amino acids (Ile, Asp, Asn) are found at the corresponding position of Asn421. Summing up, sequence analysis of the enzymatic domain of peroxidasins suggests a similar function as in vertebrate peroxidases. All catalytically relevant residues (with exception of the MPO-specific methionine that is responsible for the third covalent linkage between the autocatalytically modified heme and the protein) are also found in the peroxidase domain of most of the peroxidasins. This suggests similar enzymatic features, including catalysis of one- and two-electron oxidation reactions, and some preliminary biochemical data support this observation [9] [14], although the reported activities (e.g., tyrosine and halide oxidation) seem to be significantly lower compared to mammalian peroxidases. Only short peroxidasins and group Pxd2 of vertebrate peroxidasins (including hsPxd02) show alternative amino acids at structurally and functionally important positions, and it is not clear at the moment whether these enzymes are functionally comparable with peroxidases.
Fig. 6. Selected parts of the multiple sequence alignment of 32 sequences from members of the peroxidasin subfamily as well as representatives of chordata peroxidases, peroxinectins, and short peroxidasins. Catalytically important residues of the distal heme cavity are highlighted by arrows. a) Area around the essential distal Asp-His-Asp as well as Gln (compare with Fig. 5, c). b) Sequence area around the known Ca²⁺-binding site in mammalian peroxidases. c) Area around the catalytic distal Arg and Glu, the latter being involved in heme to protein ester linkage in mammalian peroxidases.
Physiological Roles. So far, only a handful of studies have been published that might give an indication of the in vivo function of peroxidasins. Focusing on the enzymatic domain only, one could speculate about similar biological roles as reported for the closely related and homologous chordata peroxidases including MPO, EPO, and LPO. The latter are involved in innate immune-defense reactions against invading pathogens and are stored in high concentration in granules of neutrophilic (MPO) or eosinophilic leukocytes (EPO) that are recruited to sites of pathogen invasion and inflammation [31] [32], whereas secreted lactoperoxidase acts as extracellular bactericidal agent in exocrine secretions [33]. These peroxidases catalyze the production of hypohalous acids or hypothiocyanate from H₂O₂ and (pseudo-)halide anions, but also act as efficient one-electron oxidants [2]. Another physiological role of chordata peroxidases is cell adhesion and formation of extracellular matrix (ECM). It has been demonstrated that the extracellular oxidants generated by cationic mammalian peroxidases (e.g., MPO) can have both damaging and protective effects on the ECM, including inactivation of metalloproteinases [34] [35] and formation of dityrosine crosslinks [36].
Among peroxidase-domain containing proteins of the same superfamily, also peroxinectins (subfamily 3 of the peroxidase-cyclooxygenase superfamily) might give some indications about the functionality of peroxidasins, although from this subfamily (that has no vertebrate representative) only very few reports can be found in the literature. These metalloenzymes seem to be involved in the promotion of cell-ECM adhesion via their integrin-binding motifs [37]. Interestingly, also MPO has been shown to promote integrin-mediated adhesion of neutrophils [38].
Regarding the overall structure and size as well as domain assembly, peroxidasins are the most complex representatives of peroxidase-domain-containing enzymes of the peroxidase-cyclooxygenase superfamily. Sequence analysis clearly suggests an enzymatic functionality similar to LPO and EPO (except in short peroxidasins and group Pxd2 of Clade 5). The motifs typical of extracellular proteins (Igs, LRRs, and VWC) might contribute to association with ECM. Both the roles in innate immune defense and extracellular matrix consolidation are often addressed in various contexts in the literature.
As already mentioned, the first peroxidasin was identified in 1994 in Drosophila melanogaster (Clade 3 peroxidasins; Fig. 2) and has been shown to be associated with the function of insect hemocytes and plasmatocytes [3] [7]. Similar to neutrophils, hemocytes are migratory cells that phagocytose foreign and dead cells and deposit ECM. Peroxidasin expression in Drosophila is widely used as a molecular marker for the early hemocyte lineage [39], and hemocytes are crucial for a morphogenetic event known as condensation of the ventral nerve cord (VNC). If hemocyte migration is blocked, deposition of ECM components, including peroxidasin and type IV collagen, does not occur, leading to failure of ventral nerve cord condensation [40]. Drosophila peroxidasin was reported to be a homotrimer that catalyzes H₂O₂-mediated tyrosine and iodide oxidation [3].
The second invertebrate model organism in which the in vivo role of peroxidasin was studied is Caenorhabditis elegans [15]. Two orthologs are found in C. elegans (CelPxd01 and CelPxd02; Fig. 2) with apparently antagonistic functions. CelPxd02 was identified upon screening for mutants defective in embryonic worm development. It was found to be essential for specific stages of morphogenesis and epidermal muscle attachment as well as postembryonically for basement membrane integrity [15]. The peroxidase activity (which, based on sequence analysis, should be LPO-like) was shown to be responsible for these developmental roles. In adult worms, loss of CelPxd02 promotes regrowth of axons after injury, providing evidence that C. elegans extracellular matrix can play an inhibitory role in axon regeneration. By contrast, loss of CelPxd01 did not cause developmental effects. Moreover, CelPxd02 mutant phenotypes were suppressed by loss of function of CelPxd01 and exacerbated by its overexpression, suggesting antagonistic roles.
The first study on a (non-mammalian) vertebrate peroxidasin was performed on group 1 peroxidasin (Fig. 3) from Xenopus tropicalis in 2005 [10]. This representative is expressed in several distinct tissues during early development, including the neural tube and the tail-forming region, and has also a role in modifying extracellular matrix components necessary in morphogenesis, again suggesting a role of the peroxidase domain in catalyzing one-electron oxidation reactions.
Investigations on human peroxidasins started in 1999 [7]. It was demonstrated that besides the full-length mRNA of hsPxd01, also a shorter 4.5 kb mRNA version occurs in human colon EB1 cancer cells undergoing p53-dependent apoptosis. Characterization of the smaller mRNA showed that the N-terminal part of hsPxd01 including the signal peptide was not present, whereas the C-terminal portion including the peroxidase domain was present. The authors hypothesized that the protein missing a signal sequence would accumulate in the cytoplasm, and since the peroxidase domain is intact, it could potentially increase the cellular production of reactive oxygen species (ROS), which are powerful inducers of apoptosis by inactivation of the p53 tumor suppressor protein [7]. Based on the knowledge of catalysis of homologous mammalian peroxidases [2], the effect on p53 could be mediated by hypohalous acids or hypothiocyanate released from peroxidasin rather than by activated oxygen species, since peroxidases consume H₂O₂ and do not release O₂•⁻ at reasonable rates.
Finally, the cloning of human and mouse peroxidasins and initial characterization of hsPxd01 were reported [9]. Heme-containing hsPxd01 was recombinantly expressed in HEK cells, and the prosthetic group of hsPxd01 was shown to be covalently attached to the protein via two ester bonds very similar to LPO or EPO [9]. The protein was shown to exhibit both peroxidase activity (tetramethylbenzidine oxidation) as well as halogenation activity (two-electron oxidation of Cl⁻). However, the reported activities were very low compared to mammalian peroxidases. The finding of chlorination activity at neutral pH, which so far was only attributed to MPO, is peculiar, since hsPxd01 cannot form the MPO-typical (electron-withdrawing) sulfonium-ion linkage and thus this reactivity must be addressed in future biochemical studies.
Based on its halogenation activity and localization, the role of hsPxd01 in low-density lipoprotein oxidation and endothelial cell apoptosis was investigated [17]. It was demonstrated that expression of hsPxd01 in endothelial cells correlates with LDL oxidation in a time- and concentration-dependent manner. Activity of hsPxd01 was shown to be dependent on NADPH-oxidase activity, which initially forms O₂•⁻ that dismutates to H₂O₂ necessary to initiate both peroxidase and halogenation activity.
Peterfi et al. [14] demonstrated the secretion of human hsPxd01 as well as the formation of peroxidasin-containing, fibril-like structures by differentiated myofibroblasts. Since myofibroblasts appear during wound healing, it is tempting to speculate that the peroxidase domain functions in an antimicrobial manner, whereas the other domains could participate in the ECM formation. Stabilization of the ECM could also be achieved by the enzymatic formation of dityrosine crosslinks.
In another study [18], homozygous mutations in the gene encoding hsPxd01 were associated with a spectrum of ocular anterior segment dysgenesis phenotypes. Although the precise effect of these mutations is unknown, the pathogenic impact is obvious. Finally, it was demonstrated that hsPxd01 is also expressed in various cancer cells, including breast cancer, melanoma, colon cancer, and metastatic gliomas [8] [11] [12]. Hence, hsPxd01 seems to be a signature gene of heme oxygenase-1 (HO-1) [12], an enzyme associated with tumor angiogenesis. A loss of the adhesion promoting effect of HO-1 in hsPxd01-silenced cells was observed. This suggests also a role of hsPxd01 in cell adhesion and invasion as well as in the transition of benign tumors to invasive and malignant cancers [12].
Summing up, peroxidasins are multidomain heme peroxidases with a substrate spectrum similar to mammalian peroxidases. Based on sequence analysis and reported biochemical data, they perform one-electron oxidation reactions including tyrosine oxidation as well as two-electron oxidation reactions of (pseudo-)halides. Hypochlorous acid formation is questionable and must be addressed in future investigations. In any case, physiological studies have shown that the enzymatic domain is essential for the in vivo activity and might contribute to extracellular matrix consolidation as well as antimicrobial defense. The additional domains (Igs, LRRs, and VWC) are typical for extracellular proteins and may help in targeting the peroxidase to its site of action. However, a distinct function of the individual domains and their relation to the reported in vivo roles has not been demonstrated so far. Clearly, more detailed biochemical and biophysical studies as well as the elucidation of the three-dimensional structure of peroxidasins are necessary. These studies must include invertebrate and both groups of vertebrate peroxidasins, since, as sequence analysis suggests, they most probably differ in catalysis and, in consequence, in function.
This project was supported by the Austrian Science Fund and the doctoral program BioToP – Biomolecular Technology of Proteins (FWF W1224).

Experimental Part
Data Mining. All currently (December 2011) available 106 peroxidasin sequences were collected from public databases (Uniprot, NCBI, PeroxiBase, Ensembl). Selected representatives of nonmammalian vertebrate peroxidase (BbePOX01 Branchiostoma floridae, BbePOX01 Branchiostoma belcheri, HrPOX Halocynthia roretzi), HsTPO Human thyroid peroxidase, and Aedes aegypti peroxinectin (AaePxt01) sequences were retrieved from the PeroxiBase database [41] and used as an outgroup. Sequence data are given in the Table.
Multiple Sequence Alignment. A multiple sequence alignment of the 112 enzymes was constructed with ClustalW [42] with the following parameters: for pairwise alignment, gap opening penalty 9 and gap extension penalty 0.1; for multiple alignment, gap opening penalty 8 and gap extension penalty 0.2. The Gonnet protein weight matrix was used and the gap separation distance was set to 4. The delay-divergent cutoff was set to 25%. The output was displayed with Genedoc [43].
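The alignment parameters listed above can be written down as a ClustalW command line. The sketch below only assembles the argument list and does not run the program. The executable name and both file names are hypothetical, the use of the Gonnet matrix for the pairwise stage is an assumption (the text names only one matrix), and the option names follow the ClustalW 2.x command-line help and should be checked against the installed version with its help output.

```python
# Hypothetical file names; the source does not specify any.
INPUT_FASTA = "peroxidasins_112.fasta"
OUTPUT_ALN = "peroxidasins_112.aln"

def clustalw_args(infile: str, outfile: str, exe: str = "clustalw2"):
    """Build a ClustalW argument list mirroring the reported parameters."""
    return [
        exe,                    # assumed executable name
        f"-INFILE={infile}",
        "-ALIGN",               # perform a full multiple alignment
        "-TYPE=PROTEIN",
        "-PWMATRIX=GONNET",     # pairwise matrix (assumed to be Gonnet as well)
        "-PWGAPOPEN=9",         # pairwise gap opening penalty
        "-PWGAPEXT=0.1",        # pairwise gap extension penalty
        "-MATRIX=GONNET",       # multiple-alignment weight matrix
        "-GAPOPEN=8",           # multiple-alignment gap opening penalty
        "-GAPEXT=0.2",          # multiple-alignment gap extension penalty
        "-GAPDIST=4",           # gap separation distance
        "-MAXDIV=25",           # delay-divergent cutoff in percent
        f"-OUTFILE={outfile}",
    ]

args = clustalw_args(INPUT_FASTA, OUTPUT_ALN)
print(" ".join(args))
```

The list can be passed to a process launcher such as Python's subprocess.run once the option names have been confirmed for the local ClustalW build.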
Phylogeny Reconstruction. Distance Method. The multiple sequence alignment of the 112 selected enzymes was subjected to the neighbor-joining (NJ) method of the MEGA5 package [44] with the Jones-Taylor-Thornton (JTT) model of amino acid substitution, pairwise deletion of gaps, and