Aggrecan is a modular proteoglycan with multiple functional domains. Its core protein contains three globular regions (Doege et al., 1991), termed G1, G2 and G3 (Fig. 1), each containing cysteine residues that participate in disulphide bond formation (Sandy et al., 1990). The G1 and G2 regions are separated by a short interglobular domain (IGD), and the G2 and G3 regions are separated by a long glycosaminoglycan (GAG)-attachment region, which consists of adjacent domains rich in keratan sulphate (KS) and chondroitin sulphate (CS).
The G1 region is at the amino terminus of the core protein, and can be further sub-divided into three functional domains, termed A, B1 and B2, with the B-type domains being responsible for the interaction with HA (Watanabe et al., 1997). The G2 region also possesses two B-type domains, but does not appear to interact with HA (Fosang and Hardingham, 1989), and at present its function is unknown. The G3 region resides at the carboxy terminus of the core protein and contains a variety of distinct structural domains. It is essential for normal posttranslational processing of the aggrecan core protein and subsequent aggrecan secretion (Zheng et al., 1998).
The human aggrecan gene, which resides at chromosome 15q26 (Korenberg et al., 1993), consists of 19 exons (Valhmu et al., 1995), with each exon encoding a distinct structural domain of the core protein. Exons 3-6 encode the G1 region and exons 8-10 encode the G2 region.
The CS-rich domain and much of the KS-rich domain are encoded by the large exon 12. The G3 region is encoded by exons 13-18, and alternative splicing of these exons can give rise to different aggrecan transcripts (Fülöp et al., 1993), though it is not clear whether this is of any functional consequence.
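The exon assignments stated above for the human gene can be collected into a small lookup. The following is only an illustrative sketch: the names `EXON_REGIONS`, `TOTAL_EXONS` and `region_for_exon` are hypothetical, and exons that the text does not assign to a region are deliberately left unassigned rather than guessed.

```python
from typing import Optional

# Inclusive exon ranges and the core protein region each encodes,
# taken only from the exon assignments stated in the text.
# Exons not mentioned there are intentionally omitted.
EXON_REGIONS = [
    (3, 6, "G1"),
    (8, 10, "G2"),
    (12, 12, "GAG-attachment (CS-rich domain and much of the KS-rich domain)"),
    (13, 18, "G3"),
]

TOTAL_EXONS = 19  # human aggrecan gene (Valhmu et al., 1995)


def region_for_exon(exon: int) -> Optional[str]:
    """Return the region encoded by an exon, or None if the text gives none."""
    if not 1 <= exon <= TOTAL_EXONS:
        raise ValueError(f"exon must be between 1 and {TOTAL_EXONS}")
    for first, last, region in EXON_REGIONS:
        if first <= exon <= last:
            return region
    return None
```

For example, `region_for_exon(5)` returns `"G1"`, whereas `region_for_exon(7)` returns `None` because no assignment for exon 7 is given here.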
The GAG-attachment region is composed of three domains responsible for the attachment of KS and CS. The KS-attachment domain resides adjacent to the G2 region and is composed largely of repeat motifs whose number varies between species (Barry et al., 1994). The neighbouring CS-attachment domain is divided into two subdomains – the CS1 and CS2 domains. The CS1 domain lies adjacent to the KS-rich domain and is also composed largely of repeat motifs whose number varies between species. In addition, the human CS1 domain exhibits size polymorphism between individuals due to a variable number of 19-amino-acid repeats (Fig. 1) (Doege et al., 1997). As a result, the aggrecan molecules of different individuals can bear different numbers of CS chains. Irrespective of the number of CS chains present, their structure varies throughout life due to changes in length and sulphation pattern (Roughley and White, 1980), though the functional consequence of this change is not clear. The GAG-attachment region also possesses sites for the attachment of O-linked oligosaccharides (Nilsson et al., 1982), which with age may become substituted with KS (Santer et al., 1982). KS may also be present within the G1 region, the IGD and the G2 region, attached to either O-linked or N-linked oligosaccharides (Barry et al., 1995). The KS chains also show age-related changes in structure (Brown et al., 1998).
Aggrecan molecules do not exist in isolation within the extracellular matrix, but as proteoglycan aggregates (Hascall, 1988). Each aggregate is composed of a central filament of HA with up to 100 aggrecan molecules radiating from it, and each aggrecan-HA interaction can be stabilized by a link protein (Mörgelin et al., 1988).
Proteoglycan aggregate structure is influenced by three parameters – the length of the HA, the proportion of link protein, and the degree of aggrecan processing. Two molecular forms have been described for proteoglycan aggregates extracted directly from cartilage without the use of dissociative agents (Manicourt et al., 1991). These forms appear to differ in their link protein content, and may have different functional characteristics (Buckwalter et al., 1994). It is the large size of the proteoglycan aggregates and their entrapment by the collagen framework of the tissue that results in aggrecan retention in the extracellular matrix.
Aggrecan molecules rarely exist in an intact form in the proteoglycan aggregates of the cartilage matrix, but instead are subject to extracellular proteolytic processing of their core proteins (Fig. 2). Fragments that bear the G1 region accumulate, whereas those that do not are lost from the tissue by diffusion, ultimately yielding proteoglycan aggregates that are enriched in aggrecan G1 regions rather than more intact molecules. The G1 regions may accumulate in the cartilage matrix for many years (Maroudas et al., 1998).
Aggrecanases and matrix metalloproteinases (MMPs) are associated with aggrecan proteolysis (Sztrolovics et al., 1997). The aggrecanases (Abbaszade et al., 1999; Tortorella et al., 1999) are of particular interest because of their selectivity for aggrecan. Five aggrecanase cleavage sites have been described in aggrecan (Tortorella et al., 2000), with one residing in the IGD and four in the CS2 domain.
The GAG-attachment region provides the high anionic charge density needed for the unique osmotic properties of aggrecan. Normal cartilage function depends on a high aggrecan content, high GAG substitution and large aggregate size. Loss of cartilage integrity in arthritis is associated with impaired aggrecan function due either to proteolytic cleavage of the aggrecan core protein, which decreases aggrecan charge, or to cleavage of the HA, which decreases aggregate size. It has also been suggested that aggrecan charge and hence function could be affected by size polymorphism within the CS1 domain, as those individuals with the shortest core protein length would possess aggrecan with the lowest CS substitution. Such individuals might be at risk for premature cartilage degeneration. CS2 domain processing by aggrecanases would result in aggrecan fragments enriched in the CS1 domain and therefore enhance any influence that CS1 domain polymorphism may have on aggrecan function. While size polymorphism in the aggrecan CS1 domain has been associated with both articular cartilage and intervertebral disc degeneration (Horton et al., 1998; Kawaguchi et al., 1999), the reason for the linkage is not clear as it is not always the shorter CS1 domains that have been associated with disease. It is possible that the presence of one short aggrecan allele may be of little functional consequence, and only individuals with two short alleles would be at risk. Such individuals represent less than 1% of the population and have not been the focus of any study reported to date.
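The statement that individuals with two short alleles make up less than 1% of the population can be turned into a rough bound on allele frequency. The sketch below assumes Hardy-Weinberg proportions and pools all "short" CS1 alleles into a single class with frequency q; both are simplifying assumptions made here for illustration and are not taken from the studies cited, and the function names are hypothetical.

```python
import math


def max_short_allele_frequency(homozygote_fraction: float) -> float:
    """Allele frequency q for which q**2 equals the homozygote fraction.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    if not 0.0 <= homozygote_fraction <= 1.0:
        raise ValueError("homozygote_fraction must lie between 0 and 1")
    return math.sqrt(homozygote_fraction)


def heterozygote_fraction(q: float) -> float:
    """Fraction carrying exactly one short allele, 2q(1 - q), under the same assumption."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    return 2.0 * q * (1.0 - q)


# Upper bounds implied by a homozygote fraction of 1%.
q_bound = max_short_allele_frequency(0.01)
carrier_bound = heterozygote_fraction(q_bound)
```

Under these assumptions, a homozygote fraction below 1% implies a pooled short-allele frequency below about 0.1, and hence fewer than about 18% of individuals carrying exactly one short allele. This is consistent with the point that single-allele carriers would be far more common in study populations than individuals with two short alleles.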
Mutations in the aggrecan gene leading to chondrodysplasias have been described in the human, mouse and chicken. In the human, a single base pair insertion in exon 12 causes a frameshift and results in a form of spondyloepiphyseal dysplasia (Gleghorn et al., 2005). In the mouse, a 7 bp deletion in exon 5 causes a frameshift and results in a premature termination codon arising in exon 6 (Watanabe et al., 1994). In the chicken, a premature stop codon is present within exon 10 encoding the CS-attachment region (Li et al., 1993), resulting in decreased message accumulation and underproduction of a truncated aggrecan. It is likely that the absence of a G3 region impairs secretion of the mutant aggrecan molecules. In the human, chondrodystrophic phenotypes have also been associated with the undersulphation of aggrecan due to gene defects in a sulphate transporter (Superti-Furga et al., 1996). These disorders illustrate the importance of aggrecan content and charge in embryonic cartilage development and growth.