Bovine collagen type II, a fibrillar collagen mainly found in cartilage, comprises 24 potentially glycosylated lysine residues within its collagen domain. Detailed analysis revealed that 22 out of 23 of these lysine residues are hydroxylated and carry variable numbers of Gal(β1-O) and Glc(α1-2)Gal(β1-O) glycans. The extent of glycosylation of individual lysine residues was found to be highly variable, with the position of the lysine along the polypeptide being a key determinant of glycosylation efficiency. For example, more than 95% of K884 was glycosylated, whereas less than 5% of K956 was. The mapping of glycosylation sites in other collagen types, such as bovine collagen type V, also revealed a majority of glycosylated lysine residues, with 34 glycosylated Hyl sites and only 3 unmodified Hyl sites in bovine placenta COL5A1. Another study of embryonic calf skin collagen type V identified 39 Hyl on COL5A1 and 22 Hyl on COL5A2. All of the Hyl sites on COL5A1 were glycosylated, with 85% carrying Glc(α1-2)Gal(β1-O) and 15% Gal(β1-O), whereas the 22 sites on COL5A2 were less extensively glycosylated, with only 55% carrying the disaccharide Glc(α1-2)Gal(β1-O) and 45% Gal(β1-O).
The methods of choice to assess collagen glycosylation range from the analysis of single amino acids following alkaline hydrolysis to liquid chromatography–tandem mass spectrometry (LC–MS/MS) [12,14,15]. The low complexity of the polypeptide sequence and the intermolecular and intramolecular covalent cross-links hamper the structural analysis of collagens. Trypsin, which is commonly applied in proteomic studies, often yields peptide fragments unsuited to LC–MS/MS methods, given that glycosylation of Hyl prevents trypsin-mediated cleavage. Accordingly, several proteolytic techniques must be combined to achieve extensive coverage of the collagen polypeptides investigated. Several fragmentation approaches, such as collision-induced dissociation (CID), electron transfer dissociation (ETD) and higher energy C-trap dissociation (HCD), are routinely applied to detect glycosylated Hyl in collagen. In a comparative study, HCD was found to be the most efficient technique to unravel glycosylated sites on collagen type IV. Finally, collagen O-glycans can be cleaved by trifluoromethanesulfonic acid (TMSF) treatment, which leaves the polypeptide intact for further analysis or for use as an acceptor substrate in glycosyltransferase activity assays.
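The two analytical points made above, that each modification state of lysine adds a defined mass and that glycosylated Hyl resists trypsin, can be illustrated with a minimal Python sketch. The mass shifts are derived from standard monoisotopic element masses; the helper `tryptic_peptides`, the `blocked` argument and the toy sequence are hypothetical and serve only as an illustration, not as a method taken from the studies cited here.

```python
# Monoisotopic element masses in Da.
MASS_C = 12.0
MASS_H = 1.00782503
MASS_O = 15.99491462

# Mass added to a lysine residue by each modification state.
HYDROXYLATION = MASS_O                          # Lys -> Hyl (+O)
HEXOSE = 6 * MASS_C + 10 * MASS_H + 5 * MASS_O  # one Gal or Glc residue (C6H10O5)

LYS_SHIFTS = {
    "Lys": 0.0,
    "Hyl": HYDROXYLATION,
    "Gal-Hyl": HYDROXYLATION + HEXOSE,
    "GlcGal-Hyl": HYDROXYLATION + 2 * HEXOSE,
}


def tryptic_peptides(sequence, blocked=frozenset()):
    """Split a sequence after K/R, except before P or at blocked positions.

    `blocked` holds 0-based indices of lysines that are assumed to be
    glycosylated Hyl and therefore not cleaved by trypsin (an assumption
    of this sketch, following the statement in the text).
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        cleavable = residue in "KR" and i not in blocked
        before_proline = i + 1 < len(sequence) and sequence[i + 1] == "P"
        if cleavable and not before_proline:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Hypothetical Gly-X-Y toy sequence, not a real collagen fragment.
TOY_SEQUENCE = "GLKGEAGRDGAK"
unmodified = tryptic_peptides(TOY_SEQUENCE)
glycosylated_k3 = tryptic_peptides(TOY_SEQUENCE, blocked={2})
```

With these assumptions, the unmodified toy sequence yields `GLK`, `GEAGR` and `DGAK`, whereas blocking the first lysine merges the first two fragments into `GLKGEAGR`. The resulting longer peptide additionally carries a shift of about 178.05 Da (Gal-Hyl) or 340.10 Da (GlcGal-Hyl) relative to the unmodified lysine.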
Biosynthesis and glycosyltransferase enzymes
In spite of the early characterization of the disaccharide structure, the first genes encoding collagen glycosyltransferases were only described in 2009. The core β1-O galactosyltransferase activity is carried out by the COLGALT1 and COLGALT2 enzymes, which were first annotated as GLT25D1 and GLT25D2. A third structurally similar isoform, first described as CerCAM, could not be assigned any glycosyltransferase activity. The COLGALT1 gene is widely expressed in human tissues, whereas COLGALT2 is expressed only at lower levels in the nervous system. Both enzymes transfer Gal to various types of collagen and to the mannose-binding lectin with the same efficiency when assayed in vitro. In accordance with the observation that collagen polypeptides become glycosylated before formation of a triple helix, COLGALT1 is localized in the endoplasmic reticulum. The gene(s) encoding the α1-2 glucosyltransferase enzyme(s) remain(s) unknown at this stage. A glucosyltransferase activity has been assigned to the lysyl hydroxylase 3 (LH3) enzyme, although another genuine collagen glucosyltransferase is likely responsible for the addition of Glc to galactosylated Hyl. In fact, the LH3 enzyme and its corresponding gene PLOD3 are missing from the genomes of invertebrates, which still produce the disaccharide Glc(α1-2)Gal(β1-O) on their collagens. Considering the conservation of Glc(α1-2)Gal(β1-O) across animal collagens from sponges to humans, it is tempting to search for the unknown α1-2 glucosyltransferase gene among putative glycosyltransferase enzymes that are strictly conserved across animal genomes. Few contenders emerge from such a survey, including paralogs of the endoplasmic reticulum UDP-Glc glucosyltransferase UGGT1 protein, the putative fucosyltransferases FUT10 and FUT11, and the glycogenin-like proteins of the GLT8 family (Table 1).
In addition to glycogenin-1 and glycogenin-2, which are involved in glycogen biosynthesis, the GLT8 family comprises several α1-3 xylosyltransferases and the two uncharacterized GLT8D1 and GLT8D2 proteins. Compatible with a possible role in collagen modification, the putative GLT8D2 glycosyltransferase has been localized to the endoplasmic reticulum. The GLT8D1 gene has itself been described as a susceptibility locus for hip osteoarthritis, suggesting a role of the GLT8D1 protein in bone collagen integrity. Despite these circumstantial clues, it remains to be confirmed whether GLT8D1 and GLT8D2 are collagen glucosyltransferases.