Nucleic Acids Research
LDLR Database (second edition): new additions to the database and the software, and results of the first molecular analysis
ABSTRACT
THE LDL RECEPTOR AND HYPERCHOLESTEROLEMIA
The LDL receptor is a ubiquitously distributed 160 kDa transmembrane glycoprotein that plays a major role in cholesterol homeostasis (1). Impairment of LDL receptor activity results in the accumulation of LDL cholesterol in the circulation, leading to familial hypercholesterolemia (FH). Affected individuals display arcus corneae, tendon xanthomas and premature symptomatic coronary heart disease (2). FH is an autosomal dominant disease, homozygotes being more severely affected than heterozygotes. FH is also one of the most common inherited disorders, with frequencies of heterozygotes and homozygotes estimated to be 1/500 and 1/10^6, respectively. In certain communities FH frequency is higher due to founder effects (3). The LDL receptor gene (LDLR) lies on the short arm of chromosome 19 (19p13.1-13.3) (4,5). It contains 18 exons encoding the six functional domains of the mature protein: signal peptide, ligand-binding domain, epidermal growth factor (EGF) precursor-like domain, O-linked sugar domain, transmembrane domain and cytoplasmic domain (6). To date, 444 mutations in the LDLR gene have been identified, distributed as follows: 350 point mutations (77%), 68 major rearrangements (15%), 20 splice mutations (4%), 6 mutations in the promoter sequence (1%) (3,7).
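The two frequencies quoted above can be related by a short calculation. The sketch below assumes Hardy-Weinberg proportions (an assumption made here for illustration; the text does not state how the estimates were obtained) and derives the homozygote frequency implied by a heterozygote frequency of 1/500. The function name is hypothetical.

```python
import math


def allele_frequency_from_heterozygotes(het_freq: float) -> float:
    """Solve 2*q*(1-q) = het_freq for the rarer allele frequency q.

    Assumes Hardy-Weinberg proportions (illustrative assumption).
    """
    return (1.0 - math.sqrt(1.0 - 2.0 * het_freq)) / 2.0


het_freq = 1 / 500                                  # heterozygote frequency quoted in the text
q = allele_frequency_from_heterozygotes(het_freq)   # mutant allele frequency, close to 1/1000
hom_freq = q * q                                    # expected homozygote frequency

print(f"allele frequency q is about 1 in {1 / q:.0f}")
print(f"homozygote frequency is about 1 in {1 / hom_freq:.0f}")
```

Under this assumption the homozygote frequency comes out at roughly one in a million, the same order of magnitude as the published estimate.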
THE LDLR DATABASE
This second version of the LDLR database contains 350 entries. Table 1 shows the 140 new entries of the database, corresponding to mutations either recently published or contributed by the co-authors of this paper (8-31). The database is not intended to replace primary publications, although it does contain unpublished data. As in the previous edition, mutation names are given according to Beaudet et al. (32) and are often followed by the name of the city or country from which the proband's family originated. For each mutation, information is provided at several levels: gene (exon and codon number, wild type and mutant codon, mutational event, mutation name), protein (wild type and mutant amino acid, affected domain, activity, mutation class), personal (ethnic background, age, sex, body mass index, familial history of coronary heart disease), clinical (values of plasma total cholesterol, LDL-cholesterol, HDL-cholesterol and triglycerides, presence or absence of xanthomas, arcus corneae and symptomatic coronary heart disease) and impact (private, recurrent, founder). We have included possible recurrent mutations (when no comparable haplotypes of the LDLR gene were available) in two instances: (i) when carriers of the same mutation were from distant ethnic or geographic backgrounds, and if not (ii) when clinical data were provided for the mutations to allow analysis of phenotypic variability. This last point concerns mutations W23X identified in probands of German-Canadian and German origin, 533ins8 and R395Q identified in probands from Germany, D200G identified in probands of Afrikaner and British origin, S285L identified in probands of Afrikaner and Dutch origin and P664L identified in probands of Belgian, Flemish-Walloon and Dutch origin. The ambiguity between recurrent and founder mutations will only be resolved when a consensus is reached on the polymorphic sites of the LDLR gene that should be systematically typed.
Finally, since many teams now systematically screen the whole gene, alleles carrying two mutations are now being reported. Eleven of these appear in Table 2 (18,33-37). They are not included in the mutations file of the database since it cannot, at present, accommodate two mutations on a single allele.
A: Report number.
B: Exon number in which the mutation occurred. Exons are numbered according to Südhof et al. (6) with respect to the translational initiation site given by Yamamoto et al. (5).
C: Nucleotide position at which the mutation occurred.
D: Codon number in which the mutation occurred. Codons are numbered according to Yamamoto et al. (5). Therefore, the 21 amino acids of the signal peptide (exon 1) are given negative numbers (from -21 to -1). Codon number 1 is the last codon of exon 1 and encodes the first amino acid (Ala) of the mature LDL receptor. If the mutation spans more than one codon, e.g., there is a deletion of several bases, only the first (5′) deleted codon is entered.
E: Normal base sequence of the codon in which the mutation occurred.
F: Mutated base sequence of the codon in which the mutation occurred. If the mutation is a base pair deletion or insertion, this is indicated by 'del' or 'ins' followed by the number of bases deleted or inserted and the position of this deletion or insertion in the codon (a, b or c). The nucleotide position is the first that is deleted or the one preceding the insertion. For example, 'del19c' is a deletion of 19 bases including the third base of the codon, and 'ins8b' is an insertion of 8 bases occurring between the second and the third base of the codon.
G: Concerns base substitutions. It gives the base change, by convention, read from the coding strand. If the mutation predicts premature termination of the protein, the novel stop codon position is given, e.g., 'stop at 204'.
H: Concerns events occurring at a CpG dinucleotide (only C->T or G->A).
I: Concerns the restriction site that is lost, e.g., 'Msp I -', or created, e.g., 'Taq I +', by the mutation.
J: Mutation name according to Beaudet et al. (32). Missense mutations are designated by the codon number preceded by the single-letter code of the normal amino acid and followed by that of the mutant amino acid (e.g., Val to Met at codon 408 is designated 'V408M'). Nonsense mutations are designated similarly except that X is used to indicate any termination codon (e.g., Cys to stop at codon 134 is designated 'C134X'). Frameshift, insertion and deletion mutations are designated by the nucleotide number followed by 'ins' for insertion or 'del' for deletion. The nucleotide position is the first that is deleted or, in the case of insertions, the one preceding the insertion. Exact nucleotides are indicated for two or fewer bases (e.g., 617delG). For three or more bases, the insertion or deletion is specified by the size of the change (e.g., 681ins8 indicates an 8 bp insertion starting after nucleotide 681). This nomenclature has not been used for many of the mutations that have been reported. Therefore, the original name also appears in this column. These names were given according to the population or the city in which the mutation was first reported (e.g., TOKYO).
K: Wild type amino acid.
L: Mutant amino acid. Deletion and insertion mutations which result in a frameshift are designated by 'Fr.'; nonsense mutations are designated by 'Stop'.
M: Protein domain in which the mutation occurs: 'SP' for the signal peptide, 'LB' for the ligand-binding domain, 'EGF' for the epidermal growth factor precursor-like domain, 'OLS' for the O-linked sugar chains domain, 'TM' for the transmembrane domain, and 'CP' for the cytoplasmic domain. In the ligand-binding domain (LB), each of the seven repeats is numbered separately, according to its position with respect to the N-terminal end of the protein.
N: Functional class as defined by Hobbs et al. (40).
O: Clinical status according to Goldstein et al. (2): 'Hmz' indicates homozygotes and 'Htz' indicates heterozygotes.
P: Genotype: 'aa' indicates homozygotes, 'ab' indicates compound heterozygotes, and 'Wa' indicates heterozygotes. Empty cells appear when no information is available.
Q: Number of the report in which the second mutation identified in a compound heterozygote is described. When the second mutation is one of those omitted from the database, this mutation is briefly described with respect to the coding sequence. Finally, '?' indicates that the second mutation has not been identified.
R: Recurrence of the mutation. 'F' indicates a founder effect, 'F 2/140' indicates that the mutation was found in two unrelated probands in a sample of 140 FH patients, 'R' indicates recurrent mutations, '?' indicates mutations that have been identified in at least two unrelated probands of different ethnic backgrounds but for which LDLR gene haplotypes are not described, '?-F' indicates mutations for which LDLR gene haplotypes are not described (or are incomplete) and that either are associated with a founder effect in the proband's ethnic or geographic origin, or have been identified in at least two unrelated probands of the same ethnic or geographic background, and 'P' indicates mutations identified, to date, in a single proband.
S: Ethnic or geographic background of the proband.
T: Reference number indicating the publication in which the mutation is described. Full citations (authors, year, title, journal, volume, pages) are provided with the database. If the same mutation has been reported for the same patient in different papers, only one entry is made. * indicates the co-authors who provided the information: *1 (Rochelle Thiart and Maritha J. Kotze), *2 (Helena Schmidt and Gert M. Kostner), *3 (Yasuko Miyake and Taku Yamamura), *4 (Heike Baron and Herbert Schuster), *5 (Margit Ebhardt and Manfred Stuhrmann) and *6 (Hartmut Schmidt). ** indicates submitted papers: **1 (O.Loubser et al.), **2 (A.Peeters et al.) and **3 (M.Callis et al.).
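The naming rules of column J and the codon numbering of column D lend themselves to a small parser. The following is a minimal sketch and is not part of the LDLR database software: the names `ParsedName`, `parse_mutation_name` and `precursor_position` are hypothetical, and the regular expressions cover only the name patterns illustrated in the legend (accepting a negative codon number for signal-peptide residues is an additional assumption).

```python
import re
from dataclasses import dataclass
from typing import Optional

SIGNAL_PEPTIDE_LENGTH = 21  # codons -21 to -1, per the column D legend
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


@dataclass
class ParsedName:
    kind: str                      # "missense", "nonsense", "insertion" or "deletion"
    position: int                  # codon number (substitutions) or nucleotide number (ins/del)
    normal: Optional[str] = None   # normal amino acid (substitutions only)
    mutant: Optional[str] = None   # mutant amino acid, or "X" for a stop codon
    change: Optional[str] = None   # exact bases (one or two) or size of the change as digits


# Substitutions: normal residue, codon number, mutant residue or X (e.g. V408M, C134X).
_SUBSTITUTION = re.compile(rf"^([{AMINO_ACIDS}])(-?\d+)([{AMINO_ACIDS}X])$")
# Insertions/deletions: nucleotide number, 'ins' or 'del', bases or size (e.g. 617delG, 681ins8).
_INDEL = re.compile(r"^(\d+)(ins|del)([ACGT]{1,2}|\d+)$")


def parse_mutation_name(name: str) -> ParsedName:
    """Parse a name written in the column J style; raise ValueError otherwise."""
    m = _SUBSTITUTION.match(name)
    if m:
        kind = "nonsense" if m.group(3) == "X" else "missense"
        return ParsedName(kind, int(m.group(2)), normal=m.group(1), mutant=m.group(3))
    m = _INDEL.match(name)
    if m:
        kind = "insertion" if m.group(2) == "ins" else "deletion"
        return ParsedName(kind, int(m.group(1)), change=m.group(3))
    raise ValueError(f"unrecognised mutation name: {name!r}")


def precursor_position(codon: int) -> int:
    """Convert a column D codon number to a 1-based position counted from the
    initiator codon (-21 maps to 1, -1 to 21, 1 to 22). There is no codon 0."""
    if codon == 0 or codon < -SIGNAL_PEPTIDE_LENGTH:
        raise ValueError(f"invalid codon number: {codon}")
    return codon + SIGNAL_PEPTIDE_LENGTH + (1 if codon < 0 else 0)


for example in ("V408M", "C134X", "617delG", "681ins8"):
    print(example, parse_mutation_name(example))
```

Original names such as TOKYO do not follow the pattern and raise `ValueError` in this sketch, so a caller would need to treat them separately.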
NEWLY DEVELOPED SOFTWARE ROUTINES
The software package contains routines for the analysis of the LDLR database that were developed with the 4th Dimension® (4D) package from ABI. The purpose of the software is to facilitate the mutational analysis of the LDLR gene at the molecular level and to provide tools that promote the analysis of relationships between phenotype and genotype. Initially, six specific routines were developed (3). Four new routines have been added to the software: (i) 'Restriction enzyme' appears on the first page of the mutation record. If the mutation modifies a restriction site, the program shows a restriction map displaying the new or abolished site and the enzymes of interest (Table 1, Column I). (ii) 'Amino acid type search' studies the mutations with respect to phylogenetic conservation. Indeed, the LDLR gene has been identified and sequenced, and its protein sequence deduced, in four mammalian species [complete coding sequence of the Chinese hamster (SWISS-PROT accession number: p35950), the rabbit (p20063), the rat (p35952) and the mouse (p35951) LDL receptor] and in Xenopus (38). The amino acid identities between the human sequence and the Chinese hamster, rabbit, rat, mouse and Xenopus sequences are 81%, 79%, 77%, 76% and 70%, respectively. The routine therefore lists the mutations affecting conserved or non-conserved amino acids in the four mammals, in Xenopus, or in all these sequences. (iii) 'Phylogeny' studies the distribution of mutations (missense, stop and frameshift) in amino acids conserved between humans and mammals or vertebrates, and in amino acids specifically found in the human protein. (iv) 'Binary comparison' compares two mutation groups, each group being defined by distinct search criteria chosen from the database records (molecular, clinical, personal, etc.). The result can be displayed as any of several graphic representations (by amino acid, by exon, or by protein domain) of the distribution of the sorted mutations.
Furthermore, the sorted mutations can also appear in a cumulated or detailed format (insertion, deletion, missense, nonsense).
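The idea behind the conservation-based routines can be sketched in a few lines. This is an illustration only, not the 4D implementation: the function names are hypothetical, the sequences are assumed to be already aligned, and the five-residue strings are toy data, not real LDL receptor sequences.

```python
from typing import Dict, List

MAMMALS = ("hamster", "rabbit", "rat", "mouse")


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns, ignoring gaps ('-'), with identical residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def conservation_class(human: str, others: Dict[str, str]) -> str:
    """Classify one aligned position by where the human residue is conserved."""
    in_mammals = all(others[species] == human for species in MAMMALS)
    in_xenopus = others["xenopus"] == human
    if in_mammals and in_xenopus:
        return "conserved in all sequences"
    if in_mammals:
        return "conserved in mammals only"
    if in_xenopus:
        return "conserved in Xenopus only"
    return "not conserved"


def classify_positions(alignment: Dict[str, str]) -> List[str]:
    """Apply conservation_class to every column of an alignment keyed by species."""
    human = alignment["human"]
    return [
        conservation_class(
            residue, {sp: seq[i] for sp, seq in alignment.items() if sp != "human"}
        )
        for i, residue in enumerate(human)
    ]


# Toy alignment (HYPOTHETICAL residues, for illustration only).
toy = {
    "human":   "CDEWA",
    "hamster": "CDEWA",
    "rabbit":  "CDEWS",
    "rat":     "CDEWA",
    "mouse":   "CDQWA",
    "xenopus": "CNEWA",
}
classes = classify_positions(toy)
for position, label in enumerate(classes, start=1):
    print(position, toy["human"][position - 1], label)
```

A mutation could then be listed under the class of the position it affects, which is the kind of grouping the routines described above report.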
RESULTS OF THE FIRST MOLECULAR ANALYSIS
The first molecular analysis of the 350 point mutations of the database shows that 63% of the mutations are missense, and that only 20% occur in CpG dinucleotides, in contrast to the 32% observed in other human disease genes (39). The origin of this deficit is unknown. Although the mutations are widely distributed throughout the gene, there is an excess of mutations in exon 4 (P = 0.001), coding for the three central repeats of the ligand-binding domain, and in exon 9 (P = 0.01), coding for the NH2 end of the central region of the EGF precursor-like domain, between repeats B and C. Conversely, there is a deficit of mutations in exon 13 (P = 0.001), coding for the COOH end of the central region of the EGF precursor-like domain, between repeats B and C, and in exon 15 (P = 0.001), coding for the O-linked sugar domain. These mutation hot spots and cold spots cannot be attributed to a technological bias, since most teams screened all 18 exons of the LDLR gene. The analysis of the distribution of mutations in the ligand-binding domain, after alignment of the seven repeats, shows that 74% of the mutations in this domain affect a conserved amino acid, and that they are mostly located in the C-terminal region of the repeats. Conversely, the same analysis in the EGF-like domain, after alignment of the three repeats, shows that 64% of the mutations in this domain affect a non-conserved amino acid, and that they are mostly clustered in the N-terminal half of the repeats. Finally, the investigation of genotype/phenotype correlations remains difficult since clinical data are incomplete in many published mutation reports. Furthermore, many mutations were identified in compound heterozygotes, so that the clinical data provided result from the combined effect of the two mutations. To overcome this shortcoming, we are currently developing an entry in the Web site that will facilitate the input of high quality clinical information for each mutation.
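The text does not say which statistical test produced the P values for the exon excesses and deficits. One plausible approach, assumed here purely for illustration, is an exact binomial test in which each mutation falls in a given exon with a probability equal to that exon's share of the coding sequence. In the sketch below the exon share and the observed count are HYPOTHETICAL placeholders, not values from the database; only the total of 350 point mutations comes from the text.

```python
from math import comb


def binomial_pmf(i: int, n: int, p: float) -> float:
    """P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k): probability of an excess at least as large as the one observed."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k): probability of a deficit at least as large as the one observed."""
    return sum(binomial_pmf(i, n, p) for i in range(0, k + 1))


TOTAL_POINT_MUTATIONS = 350   # from the text
exon_share = 0.15             # HYPOTHETICAL fraction of the coding sequence in one exon
observed_in_exon = 80         # HYPOTHETICAL number of mutations observed in that exon

expected = TOTAL_POINT_MUTATIONS * exon_share
p_excess = upper_tail(observed_in_exon, TOTAL_POINT_MUTATIONS, exon_share)
print(f"expected {expected:.1f}, observed {observed_in_exon}, one-sided P = {p_excess:.2g}")
```

For a suspected cold spot the same reasoning applies with `lower_tail`. Other choices, such as a chi-square test across all 18 exons, would be equally defensible.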
DATABASE ON THE WEB
The LDLR database is now accessible through the World Wide Web at http://www.umd.necker.fr. Users of the database must cite this article. Finally, notification of omissions and errors in the current version, as well as specific phenotypic data, would be gratefully received by the corresponding authors.
ACKNOWLEDGEMENTS
This work was supported by grants from GREG (Groupe de Recherche et d'Etude du Génome), Fondation de France, Université René Descartes Paris V, Ministère de l'Education Nationale, de l'Enseignement Supérieur, de la Recherche et de l'Insertion Professionnelle (ACC-SV2), and Faculté de Médecine Necker, France; and by the South African Medical Research Council and the Universities of Stellenbosch and Free State, South Africa. M.V. is supported by a grant from Ministère de l'Education Nationale, de l'Enseignement Supérieur, de la Recherche et de l'Insertion Professionnelle. Finally, we gratefully acknowledge the help of the many clinicians who collaborated with the co-authors.
REFERENCES
Last modification: 17 Dec 1997
Copyright© Oxford University Press, 1998.