Molecular cell biology and human genetics

Published on 03/03/2015 by admin

Filed under Internal Medicine

Last modified 22/04/2025


Chapter 2 Molecular cell biology and human genetics

Cell biology

Cells consist of cytoplasm enclosed within a lipid sheath (the plasma membrane). The cytoplasm contains a variety of organelles (sub-cellular compartments enclosed within their own membranes) in a mixture of salts and organic compounds (the cytosol). These are held within an adaptive internal scaffold (the cytoskeleton) that radiates from the nucleus outwards to the cell surface (Fig. 2.1). Many cells have special functions and their size, shape and behaviour adapt to meet their physiological roles. Cells can be organized into tissues and organs in which the individual component cells are in contact and able to send and receive messages, both directly and indirectly. Coordinated cellular responses can be achieved through systemic signalling, e.g. via hormones.

Cell structure

Cellular membranes

Lipid bilayers separate the cell contents from the external environment and compartmentalize distinct cellular activities into organelles. The bilayers consist of a large variety of glycerophospholipids and sphingolipids. Membrane lipids usually have two hydrophobic acyl chains linked, via glycerol or serine, to polar hydrophilic head groups (Fig. 2.2). This amphiphilic nature, with a ‘water-loving’ head and a ‘water-hating’ tail, means that in aqueous solution membrane lipids self-associate into a tail-to-tail bilayer with their hydrophobic chains separated from the aqueous phase by their polar head groups.

Liposomes are spheres enclosed within a lipid bilayer. This is the most energetically favourable form for membrane lipids in solution. Liposomes have been used clinically to deliver more hydrophilic cargo, such as drugs or DNA, to cells.

Plasma membranes are more complicated than liposomes. Their lipids are organized asymmetrically in the bilayer. For example, the outer leaflet of the plasma membrane is enriched in phosphatidyl-choline (PC) and the sphingolipids, whereas the inner leaflet is enriched in phosphatidyl-serine (PS) and phosphatidyl-ethanolamine (PE). This asymmetry matters in both normal physiology and disease, not just for barrier function. For example, PC is extracted from the outer leaflet of the canalicular membrane of hepatocytes to form the lipid/bile-salt micelles of bile. One of the sphingolipids, GM1-ganglioside, is the receptor for cholera toxin. The appearance of PS in the outer leaflet of the membrane is an early step in the apoptotic pathway and signals to macrophages to clear the dying cell, while PE, once cleaved by phospholipase, produces two signalling molecules as second messengers (see p. 25). Cholesterol is also an essential component of the plasma membrane and cannot be substituted by plant sterols, which have a subtly different shape. For this reason, the liver secretes plant sterols back into the gut.

Membrane proteins

Cells can absorb gases or small hydrophobic compounds directly across the plasma membrane by passive diffusion, but membrane proteins are required to take up hydrophilic nutrients or secrete hydrophilic products, to mediate cell–cell communication and to respond to endocrine signals. Membrane proteins can be integral to the membrane (i.e. their protein chain traverses the membrane one or multiple times) or they can be anchored to the membrane by an acyl chain (Fig. 2.2).

The major classes are:

Membrane channel proteins (Fig. 2.3): membrane proteins that form solute channels through the membrane can only work downhill and only to equilibrium. Solute moves down its electrochemical gradient, which is the combined force of the electric potential and the solute concentration gradient across the membrane. The bulk flow can be very high, the opening and closing of the channel can be regulated, and channels can be selective for specific solutes. For example, the cystic fibrosis transmembrane conductance regulator (CFTR; Fig. 2.22), the protein whose malfunction causes cystic fibrosis, is a chloride channel found on the apical surface of epithelial cells. CFTR functions to regulate the fluidity of the extra-epithelial mucous layer. When the channel opens, millions of negatively charged chloride ions flow out of the cell down their electrochemical gradient. This induces positively charged sodium ions to flow between the cells of the epithelium (via a paracellular pathway) to balance the electrical charge. Water follows the efflux of sodium chloride by osmosis, thus maintaining the fluidity of the mucus.
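The idea of an electrochemical gradient can be made concrete with the Nernst equation, which this chapter does not state but which is the standard way to express the membrane potential at which the electrical and concentration forces on an ion cancel. The following is a minimal sketch only: the function names are hypothetical, and the chloride concentrations and membrane potential are assumed round numbers chosen for illustration, not measurements from CFTR-expressing cells.

```python
import math

R = 8.314      # gas constant, J/(mol*K)
F = 96485.0    # Faraday constant, C/mol

def nernst_potential_mv(z, conc_out_mm, conc_in_mm, temp_k=310.0):
    """Equilibrium (Nernst) potential in mV for an ion of valence z."""
    return 1000.0 * (R * temp_k) / (z * F) * math.log(conc_out_mm / conc_in_mm)

def driving_force_mv(membrane_mv, equilibrium_mv):
    """Net electrochemical driving force; zero means the ion is at equilibrium."""
    return membrane_mv - equilibrium_mv

# Assumed, illustrative values for chloride (z = -1) in an epithelial cell.
e_cl = nernst_potential_mv(z=-1, conc_out_mm=110.0, conc_in_mm=40.0)
force = driving_force_mv(membrane_mv=-50.0, equilibrium_mv=e_cl)
print(f"E_Cl = {e_cl:.1f} mV, driving force = {force:.1f} mV")
```

With these assumed values the membrane potential is more negative than the chloride equilibrium potential, so the net force on the anion is outward. Flow through an open channel would continue only until the driving force reaches zero, which is the sense in which a channel works "only to equilibrium".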

Transporters (Fig. 2.3): in contrast to channels, transporters have a low capacity and work by binding solute on one side of the membrane, which induces a conformational change that exposes the solute binding site on the other side of the membrane for release.

Receptors: there are three major receptor categories: receptors that mediate endocytosis, anchorage receptors (e.g. integrins, see p. 23) and signalling receptors (see cell signalling p. 24). There are two forms of receptor-mediated endocytosis:

Pinocytosis is a small-scale model of phagocytosis and occurs continually in all cells. Smaller molecular complexes, such as low-density lipoprotein (LDL) (Fig. 2.4a), are internalized during pinocytosis via clathrin-coated pits. The LDL receptor has a large extracellular domain which binds LDL to induce a conformational change in an intracellular domain of the receptor which allows it to bind clathrin from the cytoplasm. Clathrin bends the membrane to form a pit that pinches inwards to become an intracellular clathrin-coated vesicle. Loss of the clathrin coat can allow fusion with other intracellular organelles or vesicles (e.g. with lysosomes for degradation of the cargo), or the coat can be maintained for transcellular transport. Defects in each step of pinocytosis can lead to disease. For example, hypercholesterolaemia (p. 1035) can result from mutation to the LDL receptor’s extracellular domain that prevents LDL binding, but the most common LDL receptor mutation results in loss of the intracellular domain and prevents recruitment of clathrin.

Organelles

Cytoplasmic organelles

Endoplasmic reticulum (ER) is an array of interconnecting tubules or flattened sacs (cisternae) that is contiguous with the outer nuclear membrane (Fig. 2.1). There are three types of ER:

Golgi apparatus has flattened cisternae similar to those of the ER but arranged in a stack (Fig. 2.1). Vesicles that bud from the ER with cargo destined for secretion, for the plasma membrane or for other organelles, fuse with the Golgi stack. The proteins, lipids and sterols synthesized in the ER are exported to the Golgi apparatus to complete maturation (e.g. the final stages of membrane protein glycosylation occur here). The mature products are then sorted into vesicles that bud from the Golgi for transport to their final destination (Fig. 2.4b,c). Mutation in the Golgin protein GMAP-210, with a probable role in tethering of the Golgi cisternae, causes achondrogenesis type 1A, where Golgi architecture is disrupted, particularly in bone cells.

image

Golgi apparatus.

(Courtesy of Louisa Howard, Dartmouth EM Facility.)

Lysosomes mature from vesicles (endosomes) that bud from the Golgi. They contain digestive enzymes such as lipases, proteases, nucleases and amylases that work in an acidic environment. The membrane of the lysosome therefore includes a proton ATPase pump to acidify the lumen of the organelle. Lysosomes fuse with phagocytotic vesicles to digest their contents. This is crucial to the function of macrophages and polymorphs (neutrophils and eosinophils) in killing and digesting infective agents, in tissue remodelling during development, and in osteoclast remodelling of bone. Not surprisingly, many metabolic disorders result from impaired lysosomal function (p. 1040).

Peroxisomes contain enzymes for the catabolism of long-chain fatty acids and other organic substrates like bile acids and D-amino acids. Hydrogen peroxide (H2O2), a by-product of these reactions, is a highly reactive oxidizing agent, so peroxisomes also contain catalase to detoxify the peroxide. Catalase can reduce H2O2 to water while oxidizing harmful phenols and alcohols, thus beginning their detoxification. Peroxisome dysfunction can lead to rare metabolic disorders such as leukodystrophies and rhizomelic dwarfism.

Mitochondria are the engines of the cell, providing energy in the form of ATP. Mitochondria can be small, discrete and few in number in cells with low energy demand, or large and abundant in cells with a high energy demand like hepatocytes or muscle cells. The mitochondrion has its own genome encoding 13 proteins. The other proteins (~1000) required for mitochondrial function are encoded by the nuclear genome and imported into the mitochondrion. The mitochondrion has a double membrane surrounding a central matrix. The central matrix contains the enzymes for the Krebs cycle, which accepts the products of sugar and fatty acid catabolism and uses them to produce cofactors that donate their electrons into the electron transport chain of the inner membrane (see pp. 20, 31). The inner membrane is highly folded into cristae to increase its effective surface area. The protein complexes of the electron transport chain accept and donate electrons in redox reactions, releasing energy to efflux protons (H+) into the inter-membrane space. ATP synthase, another integral membrane protein, uses this H+ electrochemical gradient to drive formation of ATP. Mitochondria have many additional functions, including roles in apoptosis (see p. 32) and supply of substrates for biosynthesis. Mitochondria are also necessary for the synthesis of porphyrin, deficiency of which causes a range of diseases collectively called porphyrias (p. 1043).

image

Mitochondria.

(Courtesy of Louisa Howard, Dartmouth EM Facility.)

The cytoskeleton

A complex network of structural proteins regulates the shape, strength and movement of the cell, and the traffic of internal organelles and vesicles. The major components are microtubules, intermediate filaments and microfilaments.

Microtubules (20–25 nm diameter) are polymers of α and β tubulin. These tubular structures resist bending and stretching, and are polar with plus and minus ends. They emanate from the microtubule organizing centre (MTOC), a complex of centrioles, γ-tubulin and other proteins, with their plus ends extending into the cell. At their plus ends repeated cycles of assembly and disassembly permit rapid changes in length. Microtubules form a ‘highway’, transporting organelles and vesicles through the cytoplasm. The two major microtubule-associated motor proteins (kinesin and dynein) allow movement of cargo to the plus and minus ends, respectively. During cell division the MTOC forms the mitotic spindle (see p. 28). Drugs that disrupt microtubule assembly (e.g. colchicine and vinca alkaloids) or stabilize microtubules (taxanes) preferentially kill dividing cells by preventing mitosis.

Intermediate filaments (~10 nm) form a network around the nucleus extending to the periphery of the cell. They make cell-to-cell contacts with adjacent cells via desmosomes, and with basement matrix via hemidesmosomes (Fig. 2.5; see also Fig. 24.27). Their function appears to be structural integrity; they are prominent in cellular tissues under stress and their disruption in genetic disease can cause structural defects or cell collapse. More than 40 different types of proteins polymerize to form intermediate filaments specific to particular cell types. For example, keratin intermediate fibres are only found in epithelial cells whilst vimentin is in mesothelial (fibroblastic) cells. However, lamin intermediate filaments form the nuclear membrane skeleton in most cells.

Microfilaments (3–6 nm) are polymers of actin, one of the most abundant proteins in all cells. The actin microfilament network controls cell shape, prevents cellular deformation, is involved in cell–cell and cell–matrix adhesion, in cell movements such as crawling and cytokinesis (cell division), and in intracellular vesicle transport. Bundles of actin filaments form the structural core of cellular protrusions such as microvilli, lamellipodia and filopodia (see below). Actin microfilament bundles within the cell can associate with myosin II to form contractile stress fibres, similar to muscle sarcomeres. Stress fibres are often found as circumferential belts around the apical surfaces of epithelial cells where cells associate with adjacent cells via adherens junctions, permitting reaction to external stresses as a cellular sheet. Stress fibres also form where actin interacts via accessory proteins with the extracellular matrix at sites of focal adhesion (see Fig. 2.8c). This occurs during cell movements during inflammation, wound healing and metastasis. During cytokinesis actin-myosin II bundles form the contractile ring separating dividing cells. Like microtubules, microfilaments are polar, so can be used to transport secretory vesicles, endosomes and mitochondria, powered by motor proteins, including myosin I and V.

image

Figure 2.5 Cytoskeleton of epithelial cells. (a) Keratin red, nuclei stained in blue. (b) Keratin filaments (in red) and a desmosomal plaque component desmoplakin in green.

(Reproduced with permission from Moll R, Divo M, Langbein L. The human keratins: biology and pathology. Histochemistry and Cell Biology 2008; 129:705–733.)

Cell shape and motility

The cytoskeleton determines cell shape and surface structures.

Microvilli. The apical surface of some epithelial cells is covered in tiny microvilli (~1 µm long) forming a brush border of thousands of small finger-like projections of the plasma membrane that increase the surface area for uptake or efflux (Fig. 2.6). At their core are 20–30 cross-linked actin microfilaments.

Motile cilia are also fine, finger-like protrusions but these are longer (~10–20 µm long) (Fig. 2.6). At their core is an axoneme, a bundle of nine cross-linked tubulin microtubule doublets surrounding a central pair. The action of the motor protein dynein serves to bend the cilium. Neighbouring cilia tend to beat in unison, generating waves of motion that move fluid over the cell surface in the gut and airways (see Fig. 15.9), and also in the fallopian tubes.

Non-motile or primary cilia. Most cells also have a single primary cilium. These cilia have a variant axoneme with no central pair of microtubules and while they have dynein they are non-motile (the dynein is used to traffic cargo along the axoneme). Primary cilia are used for signalling during development and in the adult. Other related non-motile cilia are found in specialized cells, e.g. in the photoreceptors of the retina, the sensory neurones of the olfactory system, and in the sensory hair cells of the cochlea. A range of human ciliopathies (Fig. 2.7) have been described with pleiotropic symptoms depending on which cilia are affected. These include polycystic kidney disease, Bardet–Biedl syndrome (p. 1007), Joubert’s syndrome and Ellis–van Creveld syndrome.

Flagella. The single flagellum found on sperm is structurally related to cilia but is longer (~40 µm) and has a whip-like motion.

Cell motility is essential during development and in the adult when macrophages migrate to sites of infection, keratinocytes migrate to close wounds, osteoclasts and osteoblasts tunnel into and remodel bone, and fibroblasts migrate to sites of injury to repair the extracellular matrix. Most cell motility in the adult human takes the form of cell crawling which is dependent on remodelling of the actin cytoskeleton. How the actin cytoskeleton is remodelled determines the mode of migration:

Movement. A similar mechanism involving the coordinated remodelling of the cytoskeleton and the formation and release of cell adhesions underlies all three modes of migration. Essentially, actin is polymerized at the leading edge, extending the plasma membrane forward. New adhesions are formed with the substratum (cells and/or extracellular matrix) at the leading edge to provide purchase. Release of attachments and depolymerization of the actin filaments at the trailing edge then allows the cell to move forward. Myosin motor proteins may also be involved at the trailing edge, providing the tractive force to pull the cell body forward. The complex coordination of these processes is controlled via signalling pathways involving members of the Rho protein family of GTPases (see p. 21). Key signalling targets are the WASp family of proteins, which stimulate actin polymerization. The significance of cell motility in humans is illustrated by mutation of the WASp expressed in blood cell lineages, which causes Wiskott–Aldrich syndrome (p. 66), characterized by severe immunodeficiency and thrombocytopenia (platelet deficiency).

The cell and its environment

Most cells differentiate or specialize to perform particular functions within tissues where they interact with the extracellular matrix (ECM) or other cells. The major tissue types are epithelia and connective tissues as well as muscle and neural tissue:

Extracellular matrix

The ECM is the gel matrix outside the cell, usually secreted by fibroblasts. ECM determines tissue properties, e.g. in bone it is calcified; in tendons it is tough and rope-like; and in neural tissue it is almost absent. However, ECM is more than just a support matrix. It affects cell shape, migration, cell-cell communication and signalling, proliferation and survival.

The gel or ground substance of the ECM is made from polysaccharides (glycosaminoglycans or GAGs), usually bound to proteins to form proteoglycans (p. 494). These are a diverse group of molecules conferring different matrix properties in different tissues. They form hydrated gels which can resist compression yet permit diffusion of metabolites and signalling molecules.

Fibrous proteins of ECM (p. 495) include collagens and tropoelastin, which polymerize into collagen and elastin fibres, and fibronectin which is insoluble in many tissues but soluble in plasma. Collagen provides tensile strength, elastin confers elasticity, while the widely distributed fibronectin adheres to both cells and ECM, and thus positions cells within the ECM. Collagens, the most abundant proteins in the body, are widely distributed and play a structural role in skin and bone, where collagen defects and disorders often manifest. Elastin fibres are abundant in arteries, lung and skin. Elastic fibres have a fibrillin sheath and fibrillin mutations underlie Marfan’s syndrome (p. 760). The ECM can be degraded and remodelled by proteins of the matrix metalloproteinase (MMP) family. These are needed for angiogenesis and morphogenesis and are also involved in the pathophysiology of cancer, cirrhosis and arthritis.

Basal lamina or basement membrane (lamina propria) is a specialized form of ECM, which separates cells from underlying tissue and provides a supportive, anchoring and protective role. Basal laminae can also act as molecular filters (e.g. glomerular filtration barrier, p. 636) and mediate signalling between adjacent tissues (e.g. epidermal-dermal signalling in skin). Type IV collagen, heparan sulphate proteoglycan, laminin and nidogen are key basal lamina proteins. Inherited abnormalities in these proteins cause skin blistering diseases (see Fig. 24.27). Breach of the basal lamina by invading cancer cells is a key stage in progression of epithelial carcinoma in situ to a malignant carcinoma.

Cell–cell adhesion

Cells need to interact directly for barrier function, tissue strength and to communicate. This is mediated by several types of proteins that form junctions between cells.

Cell–cell adhesion proteins (Fig. 2.8a)

As well as adhesion via multiprotein junctions, intercellular adhesion is achieved by individual transmembrane proteins.

Tight junctions (zonula occludens)

These are mediated by the integral membrane proteins claudin and occludin, which hold cells together. They form at the top (apical) side of epithelial cells including intestinal, skin and kidney cells, and endothelial cells of blood vessels (Fig. 2.8), to provide a regulated barrier to the movement of ions and solutes across epithelia or endothelia by the route between cells (paracellular transport). Tight junctions also confer polarity to cells by acting as a gate between the apical and the baso-lateral membranes, preventing diffusion of membrane lipids and proteins. Twenty-four claudins (the protein in the junction) are differentially expressed in different cell types to regulate paracellular transport. For example, changes in claudin expression in the kidney nephron correlate with permeability changes. Mutations in claudins 16 (previously named paracellin-1) and 19, expressed in the thick ascending limb of the loop of Henle in the kidney, cause an inherited renal disorder, familial hypomagnesaemia with hypercalciuria and nephrocalcinosis (FHHNC; p. 657).

Gap junctions

Gap junctions (Fig. 2.8) allow low molecular weight substances to pass directly between cells, permitting metabolic and electric coupling (e.g. in cardiomyocytes). Protein channels made of six connexin proteins (as well as claudins and occludin) are aligned between adjacent cells and allow the passage of solutes up to 1000 Da (e.g. amino acids, sugars, ions, chemical messengers). The channels are regulated by many factors such as intracellular Ca2+, pH and voltage. Gap junctions form in almost all interacting cells, but connexin family members are differentially expressed. Mutant connexins cause many inherited disorders, such as the X-linked form of Charcot–Marie–Tooth disease (GJB1; p. 1147), and are also a major cause of genetic hearing loss (GJB2).

Adherens junctions

Adherens junctions are multiprotein intercellular adhesive structures, prominent in epithelial tissues (Fig. 2.8b). They attach principally to actin microfilaments inside the cell with the aid of multiple additional proteins, and also attach and stabilize microtubules. At the apical sides of epithelial cells a prominent type of adherens junction, the zonula adherens, attaches to the circumferential actin stress fibres. The fascia adherens in cardiac muscle is also an adherens junction. Transmembrane proteins of the cadherin family provide the adhesion through interaction of their extracellular domains. Downregulation of cadherins is a feature of cancer progression in many cells.

Desmosomes (macula adherens)

Desmosomes provide strong attachment between cells and are prominent in tissues subject to stress such as skin and cardiac muscle (see Fig. 2.5, Fig. 2.8b and Fig. 24.1). Like adherens junctions, they are multiprotein complexes, where adhesion is provided by transmembrane cadherin proteins, desmogleins and desmocollins. However, within the cell desmosomes interact principally with intermediate filaments rather than microfilaments and microtubules. Germline mutations in genes encoding desmosomal proteins are a cause of cardiomyopathy with or without cutaneous features, and desmogleins are the targets of autoantibodies in pemphigus vulgaris and pemphigus foliaceus (p. 1222).

Basement membrane adhesion

Cells adhere (Fig. 2.8c) to non-basal lamina ECM via secreted proteins such as fibronectin and collagen, and to basal lamina proteins via focal adhesion and hemidesmosome multiprotein complexes (e.g. keratin or vimentin). Here, integrins replace cadherins as the key surface adhesion molecules. Integrins are transmembrane sensors or receptors, which change shape upon binding to ECM, a process called ‘outside-in’ signalling. Inside the cell, integrins interact with the cytoskeleton and a complex array of over 150 proteins that influence intracellular signalling pathways affecting proliferation, survival, shape, mobility and gene expression.

Defective integrins are associated with many immunological and clotting disorders such as Bernard–Soulier syndrome and Glanzmann’s thrombasthenia (p. 420).

Cellular mechanisms

Cell signalling

Signalling or communication between cells is often via extracellular molecules or ligands which can be proteins (e.g. hormones, growth factors), small molecules (e.g. lipid-soluble steroid hormones such as oestrogen and testosterone) or dissolved gases such as nitric oxide. The signal is usually received by membrane protein receptors, although some signals such as steroid hormones, enter the target cell where they interact with intracellular receptors (Fig. 2.9). Some signalling, especially in the immune system, relies on cell–cell contact, where the signalling molecule (ligand) and receptor are on adjacent cells.

Receptors transduce signals across the membrane to an intracellular pathway or second messengers to change cell behaviour, often ultimately affecting gene expression (Figs 2.9, 2.10). The membrane-bound receptors fall into three main groups based on downstream signalling pathways:

Ion channel linked receptors (voltage or ligand activated ion channels; see Fig. 2.3). At synaptic junctions between neurones (Fig. 22.1), these receptors open in response to neurotransmitters such as glutamate, epinephrine (adrenaline) or acetylcholine to cause a rapid depolarization of the membrane.

G-protein-linked receptors such as the odorant and light (opsin) family of receptors belong to a large family of seven-pass transmembrane proteins (see Figs 2.2 and 2.9). On activation by ligand, G-protein-linked receptors bind a GTP-binding protein (G-protein), which activates adjacent enzyme complexes or ion channels (Figs 2.9 and 22.1). The adjacent enzyme can be, for example, adenylyl cyclase (see below).

Enzyme-linked receptors (Figs 2.2 and 2.9) typically have an extracellular ligand-binding domain, a single transmembrane-spanning region, and a cytoplasmic domain that has intrinsic enzyme activity or which will bind and activate other membrane-bound or cytoplasmic enzyme complexes. This group of receptors is highly variable but many have kinase activity or associate with kinases, which act by phosphorylating substrate proteins usually on a tyrosine (e.g. the platelet-derived growth factor (PDGF) receptor) or a serine/threonine (e.g. the transforming growth factor-beta (TGF-β) receptor).

Signal transduction

Signal transduction from the receptor to the site of action in the cell is mediated by small signalling molecules called second messengers, or by signalling proteins (Fig. 2.9). Changes to activity of signalling proteins by acquired mutation occur in cancer, and many anti-cancer drugs target signalling pathways. For example, the Hedgehog pathway is involved in human development, tissue repair and cancer (Fig. 2.10). Inhibitors of this pathway are being developed for therapeutic interventions. The Wnt pathway is also involved in bone formation (p. 550).

Second messengers include cAMP and lipid-derived inositol trisphosphate (IP3) and diacylglycerol (Fig. 2.9). These molecules diffuse from the receptor to bind and change the activity of downstream proteins, propagating the signal. cAMP triggers a protein signalling cascade by activating a cAMP-dependent protein kinase. Diacylglycerol activates protein kinase C while IP3 mobilizes calcium from intracellular stores (e.g. from the ER; Fig. 14.9).

G-proteins or GTP-binding proteins are signalling proteins which switch between an active state when GTP is bound and an inactive state when bound to GDP. The best-known members are the Ras superfamily, comprising Ras, Rho, Rab, Arf and Ran families. Activation of Ras members by somatic mutation is found in ~33% of human cancers. Ras members are often bound downstream of tyrosine kinase receptors, where they transmit signals by activating a cascade of downstream protein kinase activity (Fig. 2.9). Ras signalling molecules have roles in many cellular activities, including regulation of cell cycle, intracellular transport, and apoptosis.

Kinase and phosphatase signalling proteins are enzymes that phosphorylate or dephosphorylate residues on downstream proteins to alter their activity. Chains of kinase activity (phosphorylation cascades) consisting of sequential phosphorylation of proteins can transduce signals from the membrane receptor to the site of action in the cell. The tyrosine kinase receptors phosphorylate each other when ligand binding brings the intracellular receptor components into close proximity (see Fig. 2.9). The inner membrane and cytoplasmic targets of these activated receptor complexes are Ras, protein kinase C and ultimately the MAP (mitogen-activated protein) kinase, Janus-Stat pathways or phosphorylation of IκB causing it to release its DNA-binding protein, nuclear factor kappa B (NFκB). For example, activated Ras binds and activates the kinase Raf, the first of a set of three mitogen-activated protein (MAP) kinases, which transmit signals by successive phosphorylation of target proteins which can ultimately affect transcription (Fig. 2.9). Kinases and phosphatases are frequently mutated in cancers. Somatic mutations in one Raf member, B-Raf, occur in ~60% of malignant melanomas (usually the mutation V600E) and are common in other cancers (p. 1225).

Nuclear control

DNA and RNA structure

Hereditary information is contained in the sequence of the building blocks of double-stranded deoxyribonucleic acid (DNA) (Fig. 2.11). Each strand of DNA is made up of a deoxyribose-phosphate backbone and a series of purine (adenine (A) and guanine (G)) and pyrimidine (thymine (T) and cytosine (C)) bases, and because of the way the sugar phosphate backbone is chemically coupled, each strand has a polarity with a phosphate at one end (the 5′ end) and a hydroxyl at the other (the 3′ end). The two strands of DNA are held together by hydrogen bonds between the bases. A can only pair with T, and G can only pair with C, therefore each strand is the antiparallel complement of the other (Fig. 2.11b). This is key to DNA replication because each strand can be used as a template to synthesize the other.
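The pairing rules above are enough to derive one strand from the other. As a minimal sketch (the function name and the example sequence are hypothetical, chosen only for illustration), the partner of a strand written 5′ to 3′ is obtained by pairing each base and reversing the result, because the two strands run antiparallel:

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'."""
    # Read the input backwards (antiparallel) and pair each base.
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3))

template = "ATGGCATTC"  # hypothetical sequence, written 5' to 3'
partner = complement_strand(template)
print(partner)                     # GAATGCCAT
print(complement_strand(partner))  # ATGGCATTC, the original strand
```

Applying the function twice returns the original sequence, which mirrors the point made in the text: either strand carries enough information to act as a template for the other.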

image

Figure 2.11 DNA and its structural relationship to human chromosomes. (a) A polynucleotide strand with the position of the nucleic bases indicated. Individual nucleotides form a polymer linked via the deoxyribose sugars. The 5′ carbon of the heterocyclic sugar structure is coupled to a phosphate molecule. The 3′ carbon couples to the phosphate on the 5′ carbon of ribose of the next nucleotide forming the sugar-phosphate backbone of the nucleic acid. The 5′ to 3′ linkage gives orientation to a sequence of DNA. (b) Double-stranded DNA. The two strands of DNA are held together by hydrogen bonds between the bases. T always pairs with A, and G with C. The orientation of the complementary single strands of DNA (ssDNA) is thus complementary and anti-parallel, i.e. one will be 5′ to 3′ while the partner will be 3′ to 5′. The helical 3D structure has major and minor grooves and a complete turn of the helix contains about 10 base-pairs. These grooves are structurally important, as DNA-binding proteins predominantly interact with the major grooves. (c) Supercoiling of DNA. The large stretches of helical DNA are coiled to form nucleosomes and further condensed into the chromosomes that can be seen at metaphase. DNA is first packaged by winding around nuclear proteins – histones – every 180 bp. This can then be coiled and supercoiled to compact nucleosomes and eventually visible chromosomes. (d) At the end of the metaphase DNA replication results in a twin chromosome joined at the centromere. This picture shows the chromosome, its relationship to supercoiling, and the positions of structural regions: centromeres, telomeres and sites where the double chromosome can split. Chromosomes are assigned a number or X or Y, plus short arm (p) or long arm (q). The region or subregion is defined by the transverse light and dark bands observed when staining with Giemsa (hence G-banding) or quinacrine and numbered from the centromere outwards.
Chromosome constitution = chromosome number + sex chromosomes + abnormality; e.g. 46XX = normal female; 47XX+21 = Down’s syndrome (trisomy 21); 46XYt(2;19)(p21;p12) = male with a normal number of chromosomes but a translocation between chromosomes 2 and 19 with breakages at short-arm bands 21 and 12 of the respective chromosomes.

The two strands twist to form a double helix with a major and a minor groove, and the large stretches of helical DNA are coiled around histone proteins to form nucleosomes (Fig. 2.11c). They can be condensed further into the chromosomes that can be visualized by light microscopy at metaphase (see below; Fig. 2.11, Fig. 2.19).

To express the information in the genome, cells first transcribe the code into single-stranded ribonucleic acid (RNA). RNA is similar to DNA in that it comprises four bases: A, G and C as in DNA, but with uracil (U) instead of T, and a sugar phosphate backbone with ribose instead of deoxyribose. Several types of RNA are made by the cell. Messenger RNA (mRNA) codes for proteins that are translated on ribosomes. Ribosomal RNA (rRNA) is a key catalytic component of the ribosome, and amino acids are delivered to the nascent peptide chain on transfer RNA (tRNA) molecules. There are also a variety of RNAs that regulate gene expression or RNA processing. These include microRNA (miRNA) and small interfering RNA (siRNA) (see p. 27) that typically bind to a subset of mRNAs and inhibit their translation, or initiate their degradation, respectively. Other non-coding RNAs are involved in X-inactivation and telomere maintenance or RNA splicing and maturation.

DNA transcription

A gene is a length of DNA (usually 20–40 kb but the muscle protein dystrophin is encoded by 2.4 Mb) that contains the codes for a polypeptide sequence. Three adjacent nucleotides (a codon) specify a particular amino acid, such as AGA for arginine. There are only 20 common amino acids, but 64 possible codon combinations make up the genetic code. This redundancy means that most amino acids are encoded by more than one triplet and other codons are used as signals for initiating or terminating polypeptide-chain synthesis.
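The arithmetic behind the 64 codons, and the redundancy it creates, can be checked with a short Python sketch. The table below is deliberately partial: it lists only the six arginine codons, the usual start codon and the three stop codons of the standard genetic code, written in the RNA alphabet. The variable names are illustrative.

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet; codons are read from the mRNA

# Every ordered triplet of four bases: 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 64

# Partial view of the standard genetic code (not the full table).
PARTIAL_CODE = {
    "AGA": "Arg", "AGG": "Arg",
    "CGU": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "AUG": "Met (start)",
    "UAA": "stop", "UAG": "stop", "UGA": "stop",
}

# Redundancy: arginine alone is specified by six different triplets.
arginine_codons = sorted(c for c, aa in PARTIAL_CODE.items() if aa == "Arg")
print(arginine_codons)
```

With 64 triplets available for 20 amino acids plus the start and stop signals, most amino acids necessarily have more than one codon.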

RNA is transcribed from the DNA template by an enzyme complex of more than one hundred proteins including RNA polymerase, transcription factors and enhancers. Promoter regions upstream of the gene dictate the start point and direction of transcription. The complex binds to the promoter region, the nucleosomes are remodelled to allow access, and a DNA helicase unwinds the double helix. RNA, like DNA, is synthesized in the 5′ to 3′ direction as ribonucleotides are added to the growing 3′ end of a nascent transcript. RNA polymerase does this by base-pairing the ribonucleotides to the DNA template strand running in the 3′ to 5′ direction. Messenger RNA is modified as it is synthesized (Fig. 2.12). It is capped at the 5′ end with a modified guanine that is required for efficient processing of the mRNA and efficient translation, and introns are spliced from the nascent chain. Finally, the 3′ end of the mRNA is modified with up to 200 A nucleotides by the enzyme poly-A polymerase. This 3′ poly-A tail is essential for nuclear export (through the nuclear pores), stability and efficient translation into protein by the ribosome.
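The directionality described above can be illustrated with a minimal Python sketch. It models only base-pairing against the template and the addition of a poly-A tail; capping and splicing are not modelled. The function names and the six-base template are hypothetical.

```python
# Pairing of incoming ribonucleotides with the DNA template: U replaces T in RNA.
RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Read a DNA template written 3'->5' and return the RNA written 5'->3'.

    The polymerase moves 3'->5' along the template, so the first template
    base read pairs with the 5' end of the transcript.
    """
    return "".join(RNA_PAIRS[base] for base in template_3to5.upper())


def add_poly_a(mrna: str, tail_length: int = 200) -> str:
    """Append a 3' poly-A tail (up to about 200 A residues, per the text)."""
    return mrna + "A" * tail_length


template = "TACTCT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)  # AUGAGA (AUG = start codon, AGA = arginine)
print(add_poly_a(mrna, tail_length=5))  # AUGAGAAAAAA
```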

Human protein coding sequences (exons) are interrupted by intervening sequences that are non-coding (introns) at multiple positions (Fig. 2.12). These have to be spliced from the nascent message in the nucleus by an RNA/protein complex called a spliceosome. Differential splicing describes the process by which two or more introns and their intervening exons are spliced from the mRNA. This contributes significantly to the complexity of the human transcriptome as proteins translated from these messages lack particular domains. This exon skipping can produce different protein activities.
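Exon skipping can be pictured as joining a chosen subset of exons in their original order. The sketch below is a toy model under stated assumptions: the exon names and sequences are invented, introns are taken as already removed, and no real splice-site logic is represented.

```python
def splice(exons, skip=()):
    """Join exon sequences in order, omitting any exon whose name is in `skip`."""
    return "".join(seq for name, seq in exons if name not in skip)


# Hypothetical three-exon transcript (sequences are illustrative only).
exons = [
    ("exon1", "AUGGCU"),
    ("exon2", "GAAGAA"),
    ("exon3", "UGGUAA"),
]

full_length = splice(exons)
skipped = splice(exons, skip={"exon2"})
print(full_length)  # AUGGCUGAAGAAUGGUAA
print(skipped)      # AUGGCUUGGUAA
```

The shorter message would encode a protein lacking whatever domain exon 2 contributes, which is how one gene can give rise to proteins with different activities.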

Control of gene expression

The genome of all cells in the body encodes the same genetic information, yet different cell types express a very different subset of proteins and respond to external signals to switch on a new set of genes or to switch off a pathway. Gene expression can be controlled at many steps from transcription to protein degradation. However, for many genes transcription is the key point of regulation. This is controlled primarily by proteins which bind to short sequences within the promoter regions that either repress or activate transcription, or to more distant sequences where proteins bind to enhance expression. These transcription factors and enhancers are often the end points of signalling pathways that transduce extracellular signals to changes in gene expression (Fig. 2.9).

Often this involves the translocation of an activated factor from the cytoplasm to the nucleus. In the nucleus the DNA binding proteins recognize the shape and position of hydrogen bond acceptor and donor groups within the major and minor grooves of the double helix (i.e. the double helix does not need to be unwound). There are several classes of DNA binding protein that differ in the protein structural motif that allows them to interact with the double helix. These primarily include helix-turn-helix, zinc finger and leucine zipper motifs, although protein loops and β-sheets are used by some proteins. More permanent control of gene expression patterns can be achieved epigenetically. These are modifications (typically methylation and/or acetylation) of the DNA, or the histones of the nucleosome, that silence genes. Epigenetic modification is also heritable meaning that a dividing liver cell, for example, can give rise to two daughter cells with the same epigenetic signals such that they express the appropriate transcriptome for a liver cell. Epigenetic change forms the basis of genetic imprinting (see p. 42).

Most of the genome is transcribed but only a minority of transcripts encode proteins (see Human Genetics, p. 34). The non-coding RNAs (ncRNAs) include a group that regulate gene expression (see DNA and RNA structure). miRNAs and siRNAs are short ncRNAs (19–29 bp) that are known to regulate expression of approximately 30% of genes by degradation of transcripts or repression of protein synthesis. With further annotation of the genome a growing range of additional regulatory ncRNA classes are being identified, many of which control gene expression by epigenetic mechanisms.

The cell cycle and mitosis

The cell duplication cycle has four phases, G1, S, G2 and mitosis (Fig. 2.13), and takes about 20–24 hours to complete for a rapidly dividing adult cell. G1, S and G2 are collectively known as interphase during which the cells double in mass (the two gap phases are used for growth) and duplicate their 46 chromosomes (S phase). Mitosis describes, in four sub-phases (prophase, metaphase, anaphase and telophase), the process of chromosome separation and nuclear division before cytokinesis (division of the cytoplasm into two daughter cells).


Phases of mitosis. DNA is in blue and the microtubules of the cytoskeleton and mitotic spindle in green. The red marker CENP-V labels kinetochores in prometaphase and metaphase, the mid-zone in anaphase and the mid-body in cytokinesis.

(Courtesy of Tadeu AM, Ribeiro S, Johnston J et al. CENP-V is required for centromere organization, chromosome alignment and cytokinesis. The EMBO Journal 2008; 27:2510–2522.)

Synthesis phase; DNA replication

DNA synthesis is initiated simultaneously at multiple replication forks in the genome and is catalysed by a multienzyme complex. The key components of the replication machinery are:

DNA helicase hydrolyses ATP to unwind the double helix and expose each strand as a template for replication. The two strands are antiparallel, and because DNA can only be extended by addition of nucleoside triphosphates to the 3′-hydroxyl end of the growing chain, replication of each strand must be treated differently. For one strand, called the leading template strand, the replication fork is moving in a 3′ to 5′ direction along the template, meaning that the new strand is synthesized in a 5′ to 3′ direction.

DNA primase synthesizes a short (~10 nucleotide) RNA molecule annealed to the DNA template, which acts as a primer for DNA polymerase.

DNA polymerase extends the primer by adding nucleotides to the 3′-end. For the leading template strand, the RNA primer is only required to initiate synthesis once and polymerization continues just behind the replication fork. For the antiparallel strand, the template is being exposed in a 5′ to 3′ direction and DNA primase is required to synthesize RNA primers every ~200 nucleotides to prime DNA synthesis in the opposite direction to the replication fork. To allow for this, the synthesis against this template is delayed, so it is called the lagging strand, and it requires more of the strand to be exposed for DNA primase and DNA polymerase to engage.

Single-strand DNA binding proteins bind to the exposed single-strand DNA and stabilize it in single-strand form. On the lagging strand, DNA polymerase extends the new strand to cover the ~200 nucleotides between each RNA primer; each resulting single-strand RNA/DNA hybrid is called an Okazaki fragment.
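The difference in priming between the two template strands can be made concrete with a rough estimate. The sketch below assumes a fixed spacing of 200 nucleotides between lagging-strand primers, as quoted above, and a single primer on the leading strand; the function name and the 10 000-nucleotide stretch are hypothetical.

```python
import math

PRIMER_SPACING_NT = 200  # approximate spacing between lagging-strand primers


def primers_needed(template_length_nt: int, lagging: bool) -> int:
    """Estimate how many RNA primers are laid down on one template strand."""
    if template_length_nt <= 0:
        return 0
    if not lagging:
        # Leading strand: primed once, then extended continuously.
        return 1
    # Lagging strand: one primer (one Okazaki fragment) per ~200 nucleotides.
    return math.ceil(template_length_nt / PRIMER_SPACING_NT)


stretch = 10_000  # hypothetical length of exposed template, in nucleotides
print(primers_needed(stretch, lagging=False))  # 1
print(primers_needed(stretch, lagging=True))   # 50
```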

Control of the cell cycle and checkpoints

Cells can exit the cell cycle and become quiescent. Indeed most terminally-differentiated adult cells are in a phase termed G0 in which the cycling machinery is switched off. In some cell types the switch is irreversible (e.g. in neurones), but others, like hepatocytes, retain the ability to re-enter the cell cycle and proliferate. This gives the liver a significant ability to regenerate following damage.

Cyclin-dependent kinases (Cdks), Retinoblastoma (Rb) and p53

Progression through the cell cycle is tightly controlled and punctuated by three key checkpoints when the cell interprets environmental and cellular signals to determine whether it is appropriate or safe to proceed (Fig. 2.13). The switches that allow progression beyond these checkpoints are a family of small protein complexes called cyclin-dependent kinases (Cdks) that phosphorylate serines or threonines in key target proteins at each stage. It is the regulatory cyclin subunit of the Cdks that oscillates during the cell cycle (the actual kinase domain may be present throughout but only activated by the transient expression of its cognate cyclin).

Checkpoints

The restriction point (G1 checkpoint)

The restriction point works to ensure the cell cycle does not progress into S-phase unless growth conditions are favourable and the genomic DNA is undamaged. The cyclin-Cdk complexes active early in S-phase are denoted S-Cdk (cyclin A with Cdk1 or Cdk2).

S-Cdks have two roles:

S-Cdks are themselves subject to regulation by G1-Cdk (cyclin D1–3 with Cdk4 or Cdk6) and G1/S-Cdk (cyclin E with Cdk2), both of which can stimulate cyclin A synthesis. Two major cancer pathways converge on this checkpoint via the cyclin-Cdks:

Synthesis and secretion

Protein translation

The mature mRNA is transported through the nuclear pore into the cytoplasm for translation into protein by ribosomes (Fig. 2.12).

Translation of secreted or integral membrane proteins is different. Typically, the first few amino acids of the amino terminus of the nascent polypeptide exit the ribosome and are recognized by a signal recognition particle (SRP) that stops translation until the complex is docked onto the ER via the SRP receptor. Translation then continues and the protein is translocated into or through the ER membrane via the Sec61 translocation complex as it is being synthesized (co-translational transport).

Lipid synthesis

Fatty acids, molecules with a hydrocarbon chain of 4–28 carbons, are central to cellular life and human metabolism. They form the hydrophobic moiety of membrane lipids (see p. 17), they are precursors for short-lived, locally acting lipid paracrines such as leukotrienes and prostaglandins, and they are energy stores, particularly in the form of triglycerides.

Intracellular trafficking, exocytosis (secretion) and endocytosis

The molecular composition (the lipids, proteins and cargo) of each type of organelle is different and distinct from that of the plasma membrane, yet there is a continuous flux of material between many of the different compartments. Much of this flow is via vesicles that bud from one compartment to fuse with another. It is regulated by an array of lipids and membrane proteins (coat proteins, adaptors, signalling molecules and fusion proteins).

Vesicles that fuse with the plasma membrane replenish membrane lipids and proteins and also release cargo extracellularly (exocytosis; Fig. 2.4). In the reverse process, endocytosis, clathrin-coated vesicles recycle protein from the plasma membrane and import extracellular cargo to internal compartments called endosomes. From endosomes, cargo such as receptors is recycled back to the membrane or sent for degradation in the lysosome.

Pinocytosis and phagocytosis (see p. 19) are forms of endocytosis. Endocytosis can also occur via plasma membrane microdomains or lipid rafts called caveolae which pinch in to form uncoated vesicles that fuse with endosomes. Endocytosed vesicles can also be transported across the cell in a process called transcytosis. For example, cargo can be endocytosed at the apical surface of an epithelial cell and exocytosed across the basolateral membrane.

Energy production

As food is catabolized, cells temporarily store the energy released in carrier molecules. These include reduced nicotinamide adenine dinucleotide and reduced nicotinamide adenine dinucleotide phosphate (NADH and NADPH, respectively) that release energy as they are oxidized to NAD+ and NADP+. The molar ratio of NAD+ to NADH is typically high in a cell because NAD+ is used as an oxidizing agent in catabolic pathways. In contrast, the molar ratio of NADP+ to NADPH is typically low because NADPH is used as a reducing agent in anabolic reactions. The most versatile carrier is adenosine triphosphate (ATP). ATP can be hydrolysed to ADP and phosphate (Pi) and the energy released used to power less favourable reactions.

The lipids and polysaccharides provide the most energy in a human diet, although protein can also be used. Enzymes secreted into the gut break down these polymers to their respective building blocks of fatty acids and sugars that are absorbed by the apical membrane of the gut epithelium (the transporters involved in the transcellular transport of glucose across the enterocyte are described in Figure 6.24). Fatty acids and sugars are further catabolized by enzyme pathways inside the cell to produce an array of activated carrier molecules.

Glycolysis

The six-carbon glucose is primarily catabolized in 10 steps by enzymes of the glycolytic pathway (see Fig. 8.25) to produce two three-carbon molecules of the carboxylic acid pyruvate. Glycolysis occurs in the cytosol and the first three steps actually consume energy (2×ATP), but the remaining seven steps generate 4×ATP and 2×NADH, giving a net return of 2×ATP and 2×NADH.
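The net return quoted above is simple bookkeeping, shown here as a small Python ledger. The constants come from the paragraph; the function name and the output layout are illustrative.

```python
# Per molecule of glucose, as stated in the text.
ATP_INVESTED = 2     # consumed in the early steps of the pathway
ATP_GENERATED = 4    # produced in the later steps
NADH_GENERATED = 2
PYRUVATE_PER_GLUCOSE = 2


def glycolysis_net(glucose_molecules: int = 1) -> dict:
    """Return the net products of glycolysis for a number of glucose molecules."""
    return {
        "ATP": (ATP_GENERATED - ATP_INVESTED) * glucose_molecules,
        "NADH": NADH_GENERATED * glucose_molecules,
        "pyruvate": PYRUVATE_PER_GLUCOSE * glucose_molecules,
    }


print(glycolysis_net())   # {'ATP': 2, 'NADH': 2, 'pyruvate': 2}
print(glycolysis_net(5))  # {'ATP': 10, 'NADH': 10, 'pyruvate': 10}
```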

Pyruvate is central to metabolism. It can be catabolized as fuel for the Krebs cycle and oxidative phosphorylation. It can regulate the cellular redox state by reduction to lactate and regeneration of NAD+. It can be a precursor for anabolism of fuels (glucose, glycogen and fatty acids) or amino acids, via conversion to alanine. The fate of pyruvate depends on the environmental conditions and needs of the cell.

Under anaerobic conditions, e.g. in skeletal muscle following prolonged exercise where NAD+ must be regenerated (because it is needed as an oxidizing reagent in the catabolism of glucose), pyruvate is reduced to lactate as NADH is oxidized to NAD+ in a ‘redox’ reaction catalysed by lactate dehydrogenase. This allows the muscle to continue to catabolize glucose to generate ATP under conditions in which metabolic oxygen is limiting. The lactate is secreted into the bloodstream and is ultimately metabolized by the liver back into glucose by gluconeogenesis consuming 6×ATP in the process. This cycle of anaerobic respiration that produces lactate in muscle, which is released into the bloodstream to be taken up by the liver for reconversion to glucose is known as the Cori cycle.
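The energetic cost of the Cori cycle follows from the two figures given in the text: muscle gains a net 2×ATP per glucose from anaerobic glycolysis, and the liver spends 6×ATP to rebuild that glucose from lactate. A minimal sketch of the balance, with hypothetical names:

```python
ATP_GAINED_IN_MUSCLE = 2  # net ATP from anaerobic glycolysis of one glucose
ATP_SPENT_IN_LIVER = 6    # ATP consumed by gluconeogenesis per glucose remade


def cori_cycle_balance(turns: int = 1) -> dict:
    """Whole-body ATP balance for a number of complete turns of the cycle."""
    muscle = ATP_GAINED_IN_MUSCLE * turns
    liver = -ATP_SPENT_IN_LIVER * turns
    return {"muscle": muscle, "liver": liver, "net": muscle + liver}


print(cori_cycle_balance())  # {'muscle': 2, 'liver': -6, 'net': -4}
```

The cycle therefore costs the body 4×ATP per turn overall; its value is that the cost is shifted from oxygen-limited muscle to the liver.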

Oxidative phosphorylation

The activated carriers NADH and FADH2 carry high energy electrons as hydride (a proton H+ and two electrons), which are donated to complexes of the electron transport chain, in the process regenerating NAD+ and FAD as oxidizing agents for continued oxidative metabolism. The electrons are passed down the series of inner membrane proteins of the mitochondrion, moving to a lower energy state at each step until they are finally transferred to oxygen to produce water (hence the requirement for molecular oxygen). The energy released by the electrons is used to efflux protons (H+) into the inter-membrane space, setting up an H+ electrochemical gradient, which the ATP synthase (or F0F1 ATPase), another integral membrane protein, uses to drive the formation of ATP from ADP and Pi. Oxidative phosphorylation produces the bulk of the cellular ATP. A single molecule of glucose is able to produce a net yield of approximately 30×ATP. Only two of these come from glycolysis directly.

Cellular degradation and death

Cell dynamics

Cell components are continually being formed and degraded, and most of the degradation steps involve ATP-dependent multienzyme complexes. Old cellular proteins are mopped up by a small cofactor molecule called ‘ubiquitin’, which interacts with these worn proteins via their exposed hydrophobic residues. Ubiquitin is a small 8.5 kDa regulatory protein found in all eukaryotic cells. Cells mark a protein for destruction by attaching ubiquitin molecules to it. This ‘ubiquitination’ signals the protein to move to lysosomes or proteasomes for destruction. A complex containing more than five ubiquitin molecules is rapidly degraded by a large proteolytic multienzyme array termed the ‘26S proteasome’. Ubiquitin also plays a role in regulation of the receptor tyrosine kinase in the cell cycle and in repair of DNA damage. The failure to remove worn proteins can result in the development of chronic debilitating disorders. For example, Alzheimer’s and frontotemporal dementias are associated with the accumulation of ubiquitinated proteins (prion-like proteins), which are resistant to ubiquitin-mediated proteolysis. Similar proteolytic-resistant ubiquitinated proteins give rise to inclusion bodies found in myositis and myopathies. This resistance can be due to point mutation in the target protein itself (e.g. mutant p53 in cancer; see p. 46) or as a result of an external factor altering the conformation of the normal protein to create a proteolytic-resistant shape, as in the prion protein of variant Creutzfeldt–Jakob disease (vCJD). Other conditions include von Hippel–Lindau syndrome (p. 634) and Liddle’s syndrome (p. 653).

Free radicals

A free radical is any atom or molecule which contains one or more unpaired electrons, making it more reactive than the native species. The major free radical species produced in the human body are the hydroxyl radical (OH•), the superoxide radical (O2•−) and nitric oxide (NO•).

Free radicals have been implicated in a large number of human diseases. The hydroxyl radical is by far the most reactive species but the others can generate more reactive species as breakdown products. When a free radical reacts with a non-radical, a chain reaction ensues which results in direct tissue damage by membrane lipid peroxidation. Furthermore, hydroxyl radicals can cause genetic mutations by attacking purines and pyrimidines. Superoxide dismutases (SOD) convert superoxide to hydrogen peroxide and are thus part of an inherent protective antioxidant mechanism. Patients with dominant familial forms of amyotrophic lateral sclerosis (motor neurone disease) have mutations in the gene for Cu-Zn SOD-1. Catalases and glutathione peroxidases are enzymes that remove hydrogen peroxide generated by SOD in the cell cytosol and mitochondria.

Free radical scavengers bind reactive oxygen species. Alpha-tocopherol, urate, ascorbate and glutathione remove free radicals by reacting directly and non-catalytically. Severe deficiency of α-tocopherol (vitamin E deficiency) causes neurodegeneration. There is evidence that cardiovascular disease and cancer can be prevented by a diet rich in substances that diminish oxidative damage (p. 211). The principal dietary antioxidants are vitamin E, vitamin C, β-carotene and flavonoids.

Apoptotic cell death

Most terminally-differentiated cells can no longer replicate and eventually die by apoptosis, a type of programmed cell death. Apoptosis occurs through the deliberate activation of cellular pathways, which function to cause cell suicide. In contrast to necrosis, apoptosis is orderly. Cells are destroyed and their remains phagocytosed by adjacent cells and macrophages without inducing inflammation. Apoptosis is essential for many life processes, including tissue maintenance in the adult, tissue formation in embryogenesis, and normal metabolic processes such as autodestruction of the thickened endometrium to cause menstruation in a non-conception cycle. Cells which have accumulated irreparable DNA damage from toxins or ultraviolet radiation also trigger apoptosis via p53 protein to prevent replication of mutations or progression to cancer. Many chemotherapy and radiotherapy regimens work by triggering apoptotic pathways in the tumour cell.

Apoptosis has characteristic features:

Apoptosis requires proteases called caspases whose action is very tightly regulated. Caspases not only destroy cell organelles, they cleave nuclear lamin causing collapse of the nuclear envelope and activate, through cleavage, nucleases that degrade DNA. Caspase activation can be achieved by:


Figure 2.14 Extrinsic and intrinsic apoptotic signalling network. The Fas protein and Fas ligand (FasL) are two proteins that interact to activate an apoptotic pathway. Fas and FasL are both members of the TNF (tumour necrosis factor) family – Fas is part of the transmembrane receptor family and FasL is part of the membrane-associated cytokine family. When the homotrimer of FasL binds to Fas, it causes Fas to trimerize and brings together the death domains (DD) on the cytoplasmic tails of the protein. The adaptor protein, FADD (Fas-associating protein with death domain), binds to these activated death domains and they bind to pro-caspase 8 through a set of death effector domains (DED). When pro-caspase 8s are brought together, they transactivate and cleave themselves to release caspase 8, a protease that cleaves protein chains after aspartic acid residues. Caspase 8 then cleaves and activates other caspases, which eventually leads to activation of caspase 3. Caspase 3 cleaves ICAD, the inhibitor of CAD (caspase activated DNase), which frees CAD to enter the nucleus and cleave DNA. Although caspase 3 is the pivotal execution caspase for apoptosis, the processes can be initiated by intrinsic signalling, which always involves mitochondrial release of cytochrome C and activation of caspase 9. The release of cytochrome C and mitochondrial inhibitor of IAPs is mediated via Bcl-2 family proteins (including Bax, Bak) forming pores in the mitochondrial membrane. Interestingly, the extrinsic apoptotic signal is aided and amplified by activation of tBid, which recruits Bcl-2 family members and hence also activates the intrinsic pathway. Apaf1, apoptotic protease activating factor 1; Bid, family member of Bcl-2 protein; IAPs, inhibitor of apoptosis proteins.

The extrinsic pathway is required for tissue remodelling and induction of immune self-tolerance. Cells marked for apoptosis express a member of the tumour necrosis factor (TNF) death receptor family, such as Fas, on their surfaces. Ligand binding (e.g. by Fas ligand expressed on lymphocytes) causes activation of adaptor proteins which produce a cascade of caspase activation. The extrinsic pathway can be amplified by induction of the intrinsic pathway (see below).

The intrinsic pathway centres on increased mitochondrial permeability and release of pro-apoptotic proteins like cytochrome C. Cellular stresses such as growth factor withdrawal, p53-dependent cell cycle arrest, DNA damage, and intracellular reactive oxygen species induce expression of pro-apoptotic Bcl-2 proteins, Bax and Bak. These enter the outer mitochondrial membrane forming pores that release cytochrome C which forms a complex (the apoptosome) with other proteins. The apoptosome activates a caspase cascade.

Stem cells

Following fertilization, the newly formed fertilized cell (the zygote) and those following the first few divisions are totipotent, meaning that they can differentiate into any cell type in the adult body. At the blastula stage of embryonic development, these cells undergo a primary differentiation event to become either the trophectoderm or the inner cell mass (ICM). The trophectoderm gives rise to the fetal cells of the placenta, while the ICM are pluripotent and give rise to all other cell types of the body (except those of the placenta), and are more commonly called embryonic stem (ES) cells. Stem cells have two properties:

As they begin to differentiate, their ability to self renew and their potency is reduced but there remain adult progenitor cells (sometimes erroneously referred to as stem cells), that have a limited ability to self renew and can differentiate into multiple related lineages (multipotent, like haematopoietic ‘stem cells’) or single lineages (unipotent, like muscle satellite cells). The body uses these partially differentiated progenitor cells to continually replace or repair damaged cells and tissues.

Stem cells have great therapeutic potential and can be obtained from blood from the umbilical cord, which contains embryonic-like stem cells (not as primitive as ES cells but able to differentiate into many more cell types than adult progenitor cells), or by reprogramming adult cells to regain stem-like properties (induced pluripotent stem cells, iPSCs).

Human genetics

In 2003, the Human Genome Project was completed, with all 3.2 × 10⁹ base-pairs of DNA sequenced. Over 99% of the DNA sequence is identical between individuals, but still millions of different base-pair variations occur (variants that occur at a frequency >1% are called polymorphisms; pathological polymorphisms are called mutations; single nucleotide polymorphisms are called SNPs, pronounced ‘snips’). In addition, the genome contains segmental, duplication-rich regions, where the number of duplications varies between people. These are called copy number variations or CNVs. These variations underlie most human differences, and confer genetic disease and susceptibility to many common diseases. To understand this variation, the 1000 Genomes Project was undertaken (completion date 2012), involving sequencing of many genomes from people of Asian, west African and European ancestry. This project will document most of the variation between humans.

Genomic DNA encodes approximately 21 000 genes. However, these protein-encoding genes comprise only about 1.5% of the human genome. About 90% of the remaining genome is transcribed to form RNA molecules, which are not translated into protein (non-coding RNA, ncRNA). Some of these RNAs have known regulatory roles, although the role of many is still unknown. The remaining DNA also contains evolutionarily conserved non-coding regions, some with known enhancer functions, moderately repeated elements (transposons) with probable viral origin, and microsatellites consisting of short simple sequence (1–6 nucleotide) repeats. About 10% of the genome is highly repetitive or ‘satellite’ DNA, consisting of long arrays of tandem repeats. Satellite DNA largely locates to centromeres and telomeres of chromosomes and regions of inert DNA. It forms a major part of heterochromatin. The function of genomic DNA elements is being investigated under the ENCODE (Encyclopedia of DNA Elements) project.

Tools for human genetic analysis

DNA sequencing

A chemical process known as dideoxy-sequencing or Sanger sequencing (after its inventor) allows the identification of the exact nucleotide sequence of a piece of DNA. As in PCR, an oligonucleotide primer is annealed adjacent to the region of interest. This primer acts as the starting point for a DNA polymerase to build a new DNA chain that is complementary to the sequence under investigation. Chain extension can be prematurely interrupted when a dideoxynucleotide becomes incorporated (because they lack the necessary 3′-hydroxyl group). As the dideoxynucleotides are present at a low concentration, not all the chains in a reaction tube will incorporate a dideoxynucleotide in the same place; so the tubes contain sequences of different lengths but which all terminate with a particular dideoxynucleotide. Each base dideoxynucleotide (G, C, T, A) has a different fluorochrome attached, and thus each termination base can be identified by its fluorescent colour. As each strand can be separated efficiently by capillary electrophoresis according to its size/length, simply monitoring the fluorescence as the reaction products elute from the capillary will give the gene sequence (Fig. 2.17).
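The read-out logic of dideoxy sequencing can be imitated with a toy simulation. This is a sketch under simplifying assumptions, not a model of a real instrument: every possible terminated product is assumed to be present exactly once, the primer is ignored, and the function names and template are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5: str):
    """Return (length, labelled terminal base) for each chain-terminated product."""
    # The polymerase builds the strand complementary to the template.
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    # A dideoxynucleotide can terminate the chain at any position, so there
    # is one product per position, labelled by the base it ends with.
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]


def read_sequence(fragments) -> str:
    """Order the products by length (as electrophoresis does) and read the labels."""
    return "".join(base for _, base in sorted(fragments))


template = "TACGGATC"  # hypothetical template, written 3'->5'
fragments = sanger_fragments(template)
random.seed(0)
random.shuffle(fragments)  # products are mixed together in the reaction tube
print(read_sequence(fragments))  # ATGCCTAG
```

Separation by size is what restores the order: the colour of the shortest product gives the first base, the next shortest gives the second, and so on.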

Sequencing technology has developed dramatically in recent years, to the extent that it is now cost-effective and quick to sequence an individual’s whole genome in one experiment. This has massive implications in disease gene discovery but also can raise serious ethical considerations. There are a number of different platforms to perform this high-throughput sequencing (or ‘Next Generation Sequencing’) and new faster and cheaper ones are being developed. For under £2000 (2011 price), it is possible to sequence all the coding genes in an individual’s human genome to catalogue all possible disease-associated and non-disease associated variants in every human gene. As well as sequencing genomic DNA, the technology can be used to sequence RNA (termed RNAseq) to assess accurately gene expression levels in addition to determining all splice variation and allelic-copy number. This can also be used to assess the effect of methylation on gene expression.

Identification of gene function

Following sequencing of the genome, the challenge is to understand the function of the protein coding genes. Most tools rely on the comparison of a cell or animal’s phenotype in the presence or absence of the gene in question. Each tool has different merits and faults.

RNAi

RNAi takes advantage of the cellular machinery that allows microRNAs encoded by the genome to regulate the expression of many genes at the level of messenger RNA stability and translation (see Control of gene expression, above). This phenomenon has been exploited in the laboratory to study the function of a gene of interest or, on a much larger scale, the function of each gene in the genome. In such an RNAi screen, a small interfering (si) RNA specific for each gene in the genome is introduced into cells grown in vitro, in effect knocking down expression of each gene in ~20 000 separate experiments. The phenotype of the cells in each experiment is then monitored to test the effect of loss of gene expression.

Genetic polymorphisms and linkage studies

Techniques have been developed to identify and quantitate genetic polymorphisms such as single nucleotide polymorphisms (SNPs; p. 34), microsatellites and copy number variants (CNVs). For example, SNPs usually consist of two alternative nucleotides at a particular site and vary between populations and ethnic groups. A variant must occur in at least 1% of the population to be classed as a SNP. SNPs can be in coding or non-coding regions of genes, or between genes, and thus may not change the amino acid sequence of the protein.

The biology of chromosomes

Human chromosomes

The nucleus of each diploid cell contains 6 × 10⁹ bp of DNA in long molecules called chromosomes (Fig. 2.11). Chromosomes are massive structures containing one linear molecule of DNA that is wound around histone proteins into small units called nucleosomes, and these are further wound to make up the structure of the chromosome itself.

Diploid human cells have 46 chromosomes, 23 inherited from each parent; thus there are 23 ‘homologous’ pairs of chromosomes (22 pairs of ‘autosomes’ and two ‘sex chromosomes’). The sex chromosomes, called X and Y, are not homologous but are different in size and shape. Males have an X and a Y chromosome; females have two X chromosomes. (Primary male sexual characteristics are determined by the SRY gene – sex determining region, Y chromosome.)

The chromosomes are classified according to their size and shape, the largest being chromosome 1. The constriction in the chromosome is the centromere, which can be in the middle of the chromosome (metacentric) or at one extreme end (acrocentric). The centromere divides the chromosome into a short arm and a long arm, referred to as the p arm and the q arm, respectively (Fig. 2.11d).

Chromosomes can be stained when they are in the metaphase stage of the cell cycle and are very condensed. The stain gives a pattern of light and dark bands that is diagnostic for each chromosome. Each band is given a number, and gene mapping techniques allow genes to be positioned within a band within an arm of a chromosome. For example, the CFTR gene (in which a defect gives rise to cystic fibrosis) maps to 7q31; that is, on chromosome 7 in the long arm in band 31.
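The band notation described above (chromosome number, then p or q for the arm, then the band number) can be read mechanically. The following is a minimal sketch, assuming labels of the simple form used in this chapter; the function name `parse_band` and the two example labels are chosen here purely for illustration and are not part of any standard library.

```python
import re

# Hypothetical helper: split a simple cytogenetic label into its parts.
# Chromosome: 1-22, X or Y; arm: p (short) or q (long); band: digits,
# optionally followed by a sub-band such as ".2".
BAND_PATTERN = re.compile(
    r"^(?P<chromosome>1[0-9]|2[0-2]|[1-9]|X|Y)(?P<arm>[pq])(?P<band>\d+(?:\.\d+)?)$"
)

def parse_band(label):
    """Return the chromosome, arm and band named by a label such as '17q11'."""
    match = BAND_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a simple band label: {label!r}")
    arm_name = "short arm" if match["arm"] == "p" else "long arm"
    return {
        "chromosome": match["chromosome"],
        "arm": arm_name,
        "band": match["band"],
    }

# Example labels used only to exercise the parser
for label in ("17q11", "Xp21"):
    print(label, "->", parse_band(label))
```

The sketch deliberately rejects anything more complex than a single band (for example a range of bands), since the text introduces only the simple form.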

During cell division (mitosis), each chromosome divides into two so that each daughter nucleus has the same number of chromosomes as its parent cell. During gametogenesis, however, the number of chromosomes is halved by meiosis, so that after conception the number of chromosomes remains the same and is not doubled. In the female, each ovum contains one or other X chromosome but, in the male, the sperm bears either an X or a Y chromosome.

Chromosomes can only be seen easily in actively dividing cells. Typically, lymphocytes from the peripheral blood are stimulated to divide and are processed to allow the chromosomes to be examined. Cells from other tissues can also be used for chromosomal analysis, e.g. amniotic fluid, placental cells from chorionic villus sampling, bone marrow and skin (Box 2.1).

The mitochondrial chromosome

In addition to the 23 pairs of chromosomes in the nucleus of every diploid cell, the mitochondria in the cytoplasm of the cell also have their own genome. The mitochondrial chromosome is a circular DNA (mtDNA) molecule of approximately 16 500 bp, and every base-pair makes up part of the coding sequence. These genes principally encode proteins or RNA molecules involved in mitochondrial function. These proteins are components of the mitochondrial respiratory chain involved in oxidative phosphorylation producing ATP. They also have a critical role in apoptotic cell death. Every cell contains several hundred mitochondria, and therefore several hundred mitochondrial chromosomes. Virtually all mitochondria are inherited from the mother as the sperm head contains no (or very few) mitochondria. Disorders mapped to the mitochondrial chromosome are shown in Figure 2.18 and discussed on page 40.

Genetic disorders

The spectrum of inherited or congenital genetic disorders can be classified as the chromosomal disorders, including mitochondrial chromosome disorders, the Mendelian and sex-linked single-gene disorders, a variety of non-Mendelian disorders, and the multifactorial and polygenic disorders (Table 2.1 and Box 2.2). All are a result of a mutation in the genetic code. This may be a change of a single base-pair of a gene, resulting in functional change in the product protein (e.g. thalassaemia) or gross rearrangement of the gene within a genome (e.g. chronic myeloid leukaemia). These mutations can be congenital (inherited at birth) or somatic (arising during a person’s life).

Chromosomal disorders

Chromosomal abnormalities are much more common than is generally appreciated. Over half of spontaneous abortions have chromosomal abnormalities, compared with only 4–6 abnormalities per 1000 live births. Specific chromosomal abnormalities can lead to well-recognized and severe clinical syndromes, although autosomal aneuploidy (a chromosome number differing from the normal diploid number) is usually more severe than the sex-chromosome aneuploidies. Abnormalities may occur in either the number or the structure of the chromosomes.

Abnormal chromosome numbers

If a chromosome or chromatids fail to separate (‘non-disjunction’) either in meiosis or mitosis, one daughter cell will receive two copies of that chromosome and the other will receive none. If this non-disjunction occurs during meiosis, it can lead to an ovum or sperm having either an extra copy of the chromosome, so that the fertilized egg is trisomic, or no copy of it, so that the fertilized egg is monosomic.

Non-disjunction can occur with autosomes or sex chromosomes. However, only individuals with trisomy 13, 18 and 21 survive to birth, and most children with trisomy 13 and trisomy 18 die in early childhood. Trisomy 21 (Down’s syndrome) is observed with a frequency of 1 in 650 live births, regardless of geography or ethnic background. This should be reduced with widespread screening (p. 43). Full autosomal monosomies are extremely rare and very deleterious. Sex-chromosome trisomies (e.g. Klinefelter’s syndrome, XXY) are relatively common. The sex-chromosome monosomy in which the individual has an X chromosome only and no second X or Y chromosome is known as Turner’s syndrome and is estimated to occur in 1 in 2500 live-born girls.

Occasionally, non-disjunction can occur during mitosis shortly after two gametes have fused. It will then result in the formation of two cell lines, each with a different chromosome complement: termed a ‘mosaic’ individual.

Very rarely, the entire chromosome set will be present in more than two copies, so the individual may be triploid rather than diploid and have a chromosome number of 69. Triploidy and tetraploidy (four sets) result in spontaneous abortion.
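The chromosome numbers implied by the abnormalities described in this section follow from simple arithmetic on the haploid number of 23. The sketch below is only an illustration; the function name `chromosome_count` and its parameters are hypothetical.

```python
HAPLOID_NUMBER = 23  # chromosomes in one complete human set

def chromosome_count(sets=2, gained=0, lost=0):
    """Total chromosomes for a given number of sets, plus or minus single chromosomes."""
    return sets * HAPLOID_NUMBER + gained - lost

counts = {
    "normal diploid": chromosome_count(),
    "trisomy (one extra chromosome)": chromosome_count(gained=1),
    "monosomy (one chromosome missing)": chromosome_count(lost=1),
    "triploid (three sets)": chromosome_count(sets=3),
    "tetraploid (four sets)": chromosome_count(sets=4),
}

for label, total in counts.items():
    print(f"{label}: {total}")
```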

Abnormal chromosome structures

As well as abnormal numbers of chromosomes, chromosomes can have abnormal structures, and the disruption to the DNA and gene sequences may give rise to a genetic disease.

Translocations can be very complex, involving more than two chromosomes, but most are simple and fall into one of two categories:

Table 2.2 shows some of the syndromes resulting from chromosomal abnormalities.

Mitochondrial chromosome disorders

The mitochondrial chromosome (see Fig. 2.18, p. 37) carries its genetic information in a very compact form; e.g. there are no introns in the genes. Therefore, any mutation has a high chance of having an effect. However, as every cell contains hundreds of mitochondria, a single altered mitochondrial genome will not be noticed. As mitochondria divide, there is a statistical likelihood that there will be more mutated mitochondria, and at some point, this will give rise to a mitochondrial disease.
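The chance-driven rise in the proportion of mutated mitochondrial genomes can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions that are not taken from the text: each cell holds a fixed number of mtDNA copies, every copy is duplicated before division, and the daughter cell followed here keeps a random half. The function name `simulate_drift` and all parameter values are hypothetical.

```python
import random

def simulate_drift(copies=200, mutant=1, divisions=50, seed=0):
    """Follow the mutant fraction of mtDNA copies along one cell lineage.

    Toy model (an assumption): all genomes are doubled, then the daughter
    cell inherits a random half of the doubled pool.
    """
    rng = random.Random(seed)
    fractions = [mutant / copies]
    for _ in range(divisions):
        # True marks a mutated genome, False a normal one
        pool = [True] * (2 * mutant) + [False] * (2 * (copies - mutant))
        mutant = sum(rng.sample(pool, copies))
        fractions.append(mutant / copies)
    return fractions

trajectory = simulate_drift()
print(f"mutant fraction at start: {trajectory[0]:.3f}")
print(f"mutant fraction after {len(trajectory) - 1} divisions: {trajectory[-1]:.3f}")
```

In this model the mutant fraction wanders at random: in many lineages the mutation is lost, while in a few it climbs. Running the function with different `seed` values shows the spread of outcomes; the model says nothing about the threshold at which disease appears.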

Most mitochondrial diseases are myopathies and neuropathies with a maternal pattern of inheritance. Other abnormalities include retinal degeneration, diabetes mellitus and hearing loss. Many syndromes have been described.

Myopathies include chronic progressive external ophthalmoplegia (CPEO); encephalomyopathies include myoclonic epilepsy with ragged red fibres (MERRF) and mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes (MELAS) (see p. 1153).

Kearns–Sayre syndrome includes ophthalmoplegia, heart block, cerebellar ataxia, deafness and mental deficiency due to long deletions and rearrangements. Leber’s hereditary optic neuropathy (LHON) is the commonest cause of blindness in young men, with bilateral loss of central vision and cardiac arrhythmias, and is an example of a mitochondrial disease caused by a point mutation in one gene.

Multisystem disorders include Pearson’s syndrome (sideroblastic anaemia, pancytopenia, exocrine pancreatic failure, subtotal villous atrophy, diabetes mellitus and renal tubular dysfunction). In some families, hearing loss is the only symptom, and one of the mitochondrial genes implicated may predispose patients to aminoglycoside ototoxicity.

Analysis of chromosome disorders

The cell cycle can be arrested at mitosis with colchicine and, following staining, the chromosomes with their characteristic banding can be seen and any abnormalities identified (Fig. 2.19). This is an automated process in which computer scanning software searches for metaphase spreads and then automatically bins each chromosome to allow easy scoring of chromosome number and banding patterns. Another approach uses genome-wide array-based platforms (comparative genomic hybridization (CGH) or chromosomal microarray analysis (CMA)) to identify changes in chromosome copy number; it can detect very small interstitial deletions and insertions (<1 Mb in size).

Large region-specific probes are labelled with fluorescently tagged nucleotides and used to allow rapid identification of metaphase chromosomes. This approach allows easy identification of chromosomal translocations (Fig. 2.20). High-throughput sequencing is another method to identify deletions, insertions and translocation breakpoints.

Gene defects

Mendelian and sex-linked single-gene disorders are the result of mutations in coding sequences and their control elements. These mutations can have various effects on the expression of the gene, as explained below, but all cause a dysfunction of the protein product.

Mutations

Although DNA replication is a very accurate process, occasionally mistakes occur to produce changes or mutations. These changes can also occur owing to other factors such as radiation, ultraviolet light or chemicals. Mutations in gene sequences or in the sequences which regulate gene expression (transcription and translation) may alter the amino acid sequence in the protein encoded by that gene. In some cases, protein function will be maintained; in other cases, it will change or cease, perhaps producing a clinical disorder. Many different types of mutation occur.

Autosomal recessive disorders

These disorders (Fig. 2.21b) manifest themselves only when an individual is homozygous or a compound heterozygote for the disease allele, i.e. both chromosomes carry the same gene mutation (homozygous) or different mutations in the same gene (compound heterozygote). The parents are unaffected carriers (heterozygous for the disease allele). If two carriers have children, each child has a one in four chance of carrying both mutant copies of the gene and being affected, a one in two chance of being a carrier, and a one in four chance of being genetically normal. Consanguinity increases the risk.
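The one in four, one in two and one in four figures come from combining one allele from each carrier parent with equal probability. A minimal sketch of that calculation follows; the allele labels ("A" for the normal allele, "a" for the disease allele) and the function name `offspring_genotypes` are chosen here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_genotypes(parent1, parent2):
    """Return the probability of each offspring genotype for one gene.

    Each parent is given as a two-letter string of alleles, e.g. "Aa".
    Every allele is assumed to be passed on with equal probability.
    """
    crosses = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(crosses.values())
    return {genotype: Fraction(count, total) for genotype, count in crosses.items()}

# Two unaffected carriers
risks = offspring_genotypes("Aa", "Aa")
print("affected (aa):", risks["aa"])
print("carrier (Aa):", risks["Aa"])
print("genetically normal (AA):", risks["AA"])
```

The same function can be applied to other matings, for example a carrier and a non-carrier (`offspring_genotypes("Aa", "AA")`), in which no child is affected.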

Sex-linked disorders

Genes carried on the X chromosome are said to be ‘X-linked’, and can be dominant or recessive in the same way as autosomal genes (Fig. 2.21c,d).

Other single-gene disorders

These are disorders which may be due to mutations in single genes but which do not manifest as simple monogenic disorders. They can arise from a variety of mechanisms, including the following.

Triplet repeat mutations

In the gene responsible for myotonic dystrophy (p. 1153), the mutated allele was found to have an expanded 3′UTR region in which three nucleotides, CTG, were repeated up to about 200 times. In families with myotonic dystrophy, people with the late-onset form of the disease had 20–40 copies of the repeat, but their children and grandchildren who presented with the disease from birth had vast increases in the number of repeats, up to 2000 copies. It is thought that some mechanism during meiosis causes this ‘triplet repeat expansion’ so that the offspring inherit an increased number of triplets. The number of triplets affects mRNA and protein function (Table 2.3). See also page 43 for the phenomenon of ‘anticipation’.

Mitochondrial disease

As discussed on pages 37 and 40, various mitochondrial gene mutations can give rise to complex disease syndromes that show maternal inheritance with incomplete penetrance (Fig. 2.18).

Clinical genetics and genetic counselling

Genetic disorders pose considerable health and economic problems because often there is no effective therapy. In any pregnancy, the risk of a serious developmental abnormality is approximately 1 in 50; approximately 15% of paediatric inpatients have a multifactorial disorder with a predominantly genetic element. Half of clinical genetics referrals are adults with late-onset disease, including neurological, endocrine and gastrointestinal disease and cancer.

People with a history of a congenital abnormality in a member of their family often seek advice as to why it happened and about the risks of producing further abnormal offspring. Interviews must be conducted with great sensitivity and psychological insight, as parents may feel a sense of guilt and blame themselves for the abnormality in their child.

Genetic counselling should have the following aims:

Obtaining a full history. The pregnancy history, drug and alcohol ingestion during pregnancy and maternal illnesses (e.g. diabetes) should be detailed.

Establishing an accurate diagnosis. Examination of the child may help to diagnose a genetically abnormal child with characteristic features (e.g. trisomy 21) or to determine whether a genetically normal fetus was damaged in utero.

Drawing a family tree. This is essential. Questions should be asked about abortions, stillbirths, deaths, marriages, consanguinity and medical history of family members. Diagnoses may need verification from other hospital reports.

Estimating the risk of a future pregnancy being affected or carrying a disorder. Estimation of risk should be based on the pattern of inheritance. Mendelian disorders (see earlier) carry a high risk; chromosomal abnormalities other than translocations typically carry a low risk. Empirical risks may be obtained from population or family studies.

Giving information on prognosis and management. Adequate time should be given so that all information is discussed openly and freely, and repeated as necessary.

Continued support and follow-up. This includes explanation of the implications for other siblings and family members.

Genetic screening. This includes prenatal diagnosis or preimplantation genetic diagnosis (IVF followed by testing of embryos before implantation) if requested, carrier detection and data storage in genetic registers. A large number of molecular genetic tests are now available.

The near future? With the development of cheap high-throughput sequencing, couples could be tested for all genes (termed ‘exome’ sequencing) prior to starting a family to assess if they are carriers of recessive mutations in the same disease-associated gene. This information could then be used in prenatal diagnosis.

Genetic counselling should be non-directive, with the couple making their own decisions on the basis of an accurate presentation of the facts and risks in a way they can understand.

Prenatal diagnosis for chromosomal disorders

This should be offered to all pregnant women. Practice and uptake vary between maternity units, with some offering screening only to high-risk mothers. The risk of Down’s syndrome increases disproportionately and rapidly for children born to mothers older than 35 years. Infants born to mothers with a history or family history of other conditions due to chromosomal abnormalities may be at increased risk.

Investigations

The choice of investigation depends on gestational age:

Genomic medicine

Gene therapy

Some genetic disorders, such as phenylketonuria or haemophilia, can be managed by diet or replacement therapy, but most have no effective treatment. One approach to managing inherited genetic disease entails placing a normal copy of a gene into the cells of a patient who has a defective copy of the gene; this is termed gene therapy.

There are many technical problems to overcome in gene therapy, particularly in finding delivery systems to introduce DNA into a mammalian cell. Very careful control and supervision of gene manipulation will be necessary because of its potential hazards and the ethical issues.

Two major factors are involved in gene therapy:

Cystic fibrosis (see also p. 821)

CFTR, the cystic fibrosis transmembrane conductance regulator, is an unusual ABC transporter in that it does not function as a primary active transporter but as a ligand-gated chloride channel (Fig. 2.22). The common CF mutation is a 3 bp deletion in exon 10 resulting in the removal of a codon specifying phenylalanine (F508del). In this mutation the CFTR protein is misfolded, thereby causing ineffective biosynthesis and consequently disrupting the delivery of the protein to the cell surface. In the mutation G551D-CFTR, glycine in position 551 is replaced by aspartate; the CFTR channel reaches the cell surface but fails to open. Understanding of these defects has introduced a new era of treatment. VX-770, a potentiating agent which can be given orally, has been developed. It increases the fraction of time that the phosphorylated G551D-CFTR channel is open, allowing bicarbonate and chloride to flow across the membrane. Early clinical results are encouraging.

There are also over 1000 different mutations of the CFTR gene with many mapping to the ATP-binding domains.

Two other routes of gene therapy have been tried: placing the wild-type CFTR cDNA either into an adenovirus vector (see Fig. 15.28) to allow infection of human cells, or into a plasmid (an engineered circle of DNA) that is then encapsulated in a liposome to allow transfection of human cells. The latter can be conveyed via an aerosol spray to the lung, where the liposome fuses with the cell membrane to deliver the CFTR cDNA into the cell. However, neither is yet a treatment option. An alternative method is to suppress premature termination codons and thus permit translation to continue; topical nasal gentamicin (an aminoglycoside antibiotic) has been shown to result in the expression of functional CFTR channels.

Stem cell therapy

Stem cell therapy has the potential to radically change the treatment of human disease (see p. 33). A number of adult stem cell therapies already exist, particularly bone marrow transplants. It is currently anticipated that technologies derived from stem cell research can be used to treat a wider variety of diseases in which replacement of destroyed specialist tissues is required, such as in Parkinson’s disease, spinal cord injuries and muscle damage.

The genetic basis of cancer

Cancers are genetic diseases and involve changes to the normal function of cellular genes. However, multiple genes interact during oncogenesis and an almost stepwise progression of defects leads from an overproliferation of a particular cell to the breakdown of control mechanisms such as apoptosis (programmed cell death). This would be triggered if a cell were to attempt to survive in an organ other than its tissue of origin. For the vast majority of cancer cases (especially those in older people), the multiple genetic changes which occur are somatic. For some cancers, however (where the cancer normally occurs at an earlier age), a dominant inherited single-gene defect can give rise to an almost Mendelian trend with lifetime risks of nearly 90%.

Autosomal dominant inheritance

The following are examples of cancer syndromes (see Table 9.3, p. 433) that exhibit dominant inheritance:

Oncogenes

The genes coding for growth factors, growth factor receptors, secondary messengers or even DNA-binding proteins would act as promoters of abnormal cell growth if mutated. This concept was verified when viruses were found to carry genes which, when integrated into the host cell, promoted oncogenesis. These were originally termed viral or ‘v-oncogenes’, and later their normal cellular counterparts, c-oncogenes, were found. Thus, oncogenes encode proteins that are known to participate in the regulation of normal cellular proliferation, e.g. erb-A on chromosome 17q11-q12 encodes the thyroid hormone receptor. See Table 2.5.

Table 2.5 Examples of acquired/somatic mutations and proto-oncogenes

Point mutation
 K-RAS          Colorectal and pancreatic cancer
 B-RAF          Melanoma, thyroid
 ALK            Lung cancer

DNA amplification
 MYC            Neuroblastoma
 HER2-neu       Breast cancer

Chromosome translocation
 BCR/ABL        CML, ALL
 PML/RARA       APL
 BCL2/IGH       Follicular lymphoma
 IGH/CCND1      Mantle cell lymphoma
 MYC/IgH        Burkitt’s lymphoma

CML, chronic myeloid leukaemia; ALL, acute lymphoblastic leukaemia; APL, acute promyelocytic leukaemia.

Activation of oncogenes

Non-activated oncogenes, which are functioning normally, have been referred to as ‘proto-oncogenes’. Their transformation to oncogenes can occur by three routes.

Tumour suppressor genes

These genes restrict undue cell proliferation (in contrast to oncogenes) and induce the repair or self-destruction (apoptosis) of cells containing damaged DNA. Therefore, mutations that disable these genes lead to uncontrolled growth in cells with active oncogenes. An example is the germline mutations found in hereditary non-polyposis colorectal cancer, which affect genes responsible for repairing DNA mismatches (p. 288).

The RB gene was the first tumour suppressor gene to be described (p. 433). In the familial variety of retinoblastoma, the first mutation is inherited and, by chance, a second somatic mutation occurs, leading to the formation of a tumour. In the sporadic variety, both RB genes in a single cell are mutated by chance.

Since the finding of RB, other tumour suppressor genes have been described, including the gene p53. Mutations in p53 have been found in almost all types of human tumour, including sporadic colorectal carcinomas, carcinomas of breast and lung, brain tumours, osteosarcomas and leukaemias. The protein encoded by p53 is a 53 kDa nuclear phosphoprotein that plays a role in DNA repair and synthesis, in the control of the cell cycle and cell differentiation, and in programmed cell death (apoptosis). p53 is a DNA-binding protein which activates many gene expression pathways, but it is normally only short-lived. In many tumours, mutations that disable p53 function also prevent its cellular catabolism. Although in some cancers there is a loss of p53 from both chromosomes, in most cancers (particularly colorectal carcinomas; see Fig. 9.1) the long-lived mutant p53 protein can disrupt the protein produced by the normal allele. As a DNA-binding protein, p53 is likely to act as a tetramer.

Thus, a mutation in a single copy of the gene can promote tumour formation because a hetero-tetramer of mutated and normal p53 subunits would still be dysfunctional. p53 and RB are involved in normal regulation of the cell cycle. Other cancer-associated genes are also intimately involved in control of the cell cycle (Fig. 2.13).
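The effect of a single mutant p53 allele on tetramer function can be made concrete with a rough calculation. The assumptions here are illustrative and not taken from the text: a fraction m of the p53 subunits in the cell are mutant, subunits assemble into tetramers at random, and one mutant subunit is enough to inactivate a tetramer. The probability that a tetramer contains only normal subunits is then:

```latex
P(\text{fully normal tetramer}) = (1 - m)^4,
\qquad
m = \tfrac{1}{2} \;\Rightarrow\; P = \left(\tfrac{1}{2}\right)^4 = \tfrac{1}{16} \approx 0.06
```

Under these assumptions, a cell with one mutant allele and equal amounts of the two proteins would retain only about 6% of its normal tetramers. Because the mutant protein is longer-lived, m may well exceed one half, which would lower this figure further.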