Membrane channel proteins (Fig. 2.3): membrane proteins that form solute channels through the membrane can only work downhill and only to equilibrium. Solute moves down its electrochemical gradient, which is the combined force of the electric potential and the solute concentration gradient across the membrane. The bulk flow through a channel can be very high, the opening and closing of the channel can be regulated, and channels can be selective for specific solutes. For example, the cystic fibrosis transmembrane conductance regulator (CFTR; Fig. 2.22), the protein whose malfunction causes cystic fibrosis, is a chloride channel found on the apical surface of epithelial cells. CFTR functions to regulate the fluidity of the extra-epithelial mucous layer. When the channel opens, millions of negatively-charged chloride ions flow out of the cell down their electrochemical gradient. This induces positively-charged sodium ions to flow between the cells of the epithelium (via a paracellular pathway) to balance the electrical charge. Water follows the efflux of sodium chloride by osmosis, thus maintaining the fluidity of the mucus.
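The idea that a channel only carries solute downhill, and only until equilibrium, can be made concrete with the Nernst equation, which gives the membrane potential at which the electrical and concentration forces on an ion cancel. The following is a minimal sketch rather than a physiological model: the function names are hypothetical, and the chloride concentrations (110 mM outside, 40 mM inside) and the membrane potential (-50 mV) are illustrative assumptions, not values taken from the text.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_potential_mv(z, conc_out_mm, conc_in_mm, temp_c=37.0):
    """Equilibrium (Nernst) potential in mV for an ion of valence z.

    The potential is expressed as inside relative to outside.
    """
    t_kelvin = temp_c + 273.15
    return 1000.0 * (R * t_kelvin) / (z * F) * math.log(conc_out_mm / conc_in_mm)


def net_flux_direction(z, membrane_mv, equilibrium_mv, tol_mv=1e-9):
    """Direction of net ion movement through an open channel.

    The driving force is (membrane potential - equilibrium potential).
    Multiplying by the valence converts it into a direction for the ion
    itself: positive means net efflux, negative means net influx.
    """
    drive = z * (membrane_mv - equilibrium_mv)
    if abs(drive) <= tol_mv:
        return "equilibrium"
    return "efflux" if drive > 0 else "influx"


# Illustrative (assumed) values for chloride in an epithelial cell.
e_cl = nernst_potential_mv(z=-1, conc_out_mm=110.0, conc_in_mm=40.0)
print(f"E_Cl = {e_cl:.1f} mV")
print(net_flux_direction(z=-1, membrane_mv=-50.0, equilibrium_mv=e_cl))
```

Under these assumed values the membrane potential is more negative than the chloride equilibrium potential, so the sketch reports net chloride efflux, in line with the CFTR example. Once the membrane potential equals the equilibrium potential, net flux stops even though the channel is open.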

Transporters (Fig. 2.3): in contrast to channels, transporters have a low capacity and work by binding solute on one side of the membrane which induces a conformational change that exposes the solute binding site on the other side of the membrane for release.

– Passive transporters work without an energy source and can only transport downhill to equilibrium.

– Active transporters use energy and can work uphill to concentrate a solute. Primary active transporters use the energy derived from binding and hydrolysis of ATP to drive the translocation cycle. ATP binding cassette (ABC) transporters are a major class of primary active transporters whose malfunction causes dozens of human diseases including a spectrum of liver, eye and skin diseases, bleeding disorders and adrenoleukodystrophy. Secondary active pumps are driven by ion gradients which are themselves made and maintained by primary active pumps. Primary and secondary active pumps therefore often work in concert, as illustrated by the transcellular uptake of glucose across the intestinal epithelia.

Receptors: there are three major receptor categories: receptors that mediate endocytosis, anchorage receptors (e.g. integrins, see p. 23) and signalling receptors (see cell signalling p. 24). There are two forms of receptor-mediated endocytosis:

– Phagocytosis: specialized phagocytic cells such as macrophages and neutrophils can engulf, or phagocytose, ~20% of their surface in pursuit of large particles such as bacteria or apoptotic cells for digestion and recycling. Phagocytosis is only triggered when specific cell surface receptors – such as the macrophage Fc receptor – are occupied by their ligand.

Figure 2.3 The difference between a transporter and a channel. (a) Transporters expose specific solute binding sites alternately on different sides of the membrane. They can function uphill if coupled to an energy source (active transport) or be downhill only (facilitated diffusion). They are low capacity. (b) Channels form a continuous pore through the membrane. They can be regulated and selective and only work downhill; bulk flow is high.

– Pinocytosis is a small-scale model of phagocytosis and occurs continually in all cells. Smaller molecular complexes, such as low-density lipoprotein (LDL) (Fig. 2.4a), are internalized during pinocytosis via clathrin-coated pits. The LDL receptor has a large extracellular domain which binds LDL to induce a conformational change in an intracellular domain of the receptor which allows it to bind clathrin from the cytoplasm. Clathrin bends the membrane to form a pit that pinches inwards to become an intracellular clathrin-coated vesicle. Loss of the clathrin coat can allow fusion with other intracellular organelles or vesicles (e.g. with lysosomes for degradation of the cargo), or the coat can be maintained for transcellular transport. Defects in each step of pinocytosis can lead to disease. For example, hypercholesterolaemia (p. 1035) can result from mutation to the LDL receptor’s extracellular domain that prevents LDL binding, but the most common LDL receptor mutation results in loss of the intracellular domain and prevents recruitment of clathrin.

Figure 2.4 Intracellular transport. (a) Receptor-mediated endocytosis or pinocytosis. (b) Trafficking of vesicles containing synthesized proteins to the cell surface (e.g. hormones). (c) Traffic between organelles is also mediated by v- and t-SNARE-containing organelles. v-SNARE, vesicle-specific SNARE; t-SNARE, target-specific SNARE. COPI, coat protein; LDL, low density lipoprotein.

FURTHER READING

Artenstein AW, Opal SM. Proprotein convertases in health and disease. N Engl J Med 2011; 365:2507–2518.

Organelles

Cytoplasmic organelles

Endoplasmic reticulum (ER) is an array of interconnecting tubules or flattened sacs (cisternae) that is contiguous with the outer nuclear membrane (Fig. 2.1). There are three types of ER:

– Rough ER carries ribosomes on its cytosolic surface which synthesize proteins destined for secretion or membrane insertion.

– Smooth ER is the site of lipid and sterol synthesis, and also of steroid and drug metabolism. With calcium pumps and channels in its membrane it is also a calcium store, from which calcium is released for signalling.

– Sarcoplasmic reticulum is a form of ER found in muscle, where release of calcium on excitation is necessary for muscle contraction.

Golgi apparatus has flattened cisternae similar to those of the ER but arranged in a stack (Fig. 2.1). Vesicles that bud from the ER with cargo destined for secretion, for the plasma membrane or for other organelles, fuse with the Golgi stack. The proteins, lipids and sterols synthesized in the ER are exported to the Golgi apparatus to complete maturation (e.g. the final stages of membrane protein glycosylation occur here). The mature products are then sorted into vesicles that bud from the Golgi for transport to their final destination (Fig. 2.4b,c). Mutation in the Golgin protein GMAP-210, with a probable role in tethering of the Golgi cisternae, causes achondrogenesis type 1A, where Golgi architecture is disrupted, particularly in bone cells.

Golgi apparatus.

(Courtesy of Louisa Howard, Dartmouth EM Facility.)

Lysosomes mature from vesicles (endosomes) that bud from the Golgi. They contain digestive enzymes such as lipases, proteases, nucleases and amylases that work in an acidic environment. The membrane of the lysosome therefore includes a proton ATPase pump to acidify the lumen of the organelle. Lysosomes fuse with phagocytotic vesicles to digest their contents. This is crucial to the function of macrophages and polymorphs (neutrophils and eosinophils) in killing and digesting infective agents, in tissue remodelling during development, and osteoclast remodelling of bone. Not surprisingly, many metabolic disorders result from impaired lysosomal function (p. 1040).

Peroxisomes contain enzymes for the catabolism of long-chain fatty acids and other organic substrates like bile acids and D-amino acids. Hydrogen peroxide (H₂O₂), a by-product of these reactions, is a highly reactive oxidizing agent, so peroxisomes also contain catalase to detoxify the peroxide. Catalase can reduce H₂O₂ to water while oxidizing harmful phenols and alcohols thus beginning their detoxification. Peroxisome dysfunction can lead to rare metabolic disorders such as leukodystrophies and rhizomelic dwarfism.

Mitochondria are the engines of the cell, providing energy in the form of ATP. Mitochondria can be small, discrete and few in number in cells with low energy demand, or large and abundant in cells with a high energy demand like hepatocytes or muscle cells. The mitochondrion has its own genome encoding 13 proteins. The other proteins (~1000) required for mitochondrial function are encoded by the nuclear genome and imported into the mitochondrion. The mitochondrion has a double membrane surrounding a central matrix. The central matrix contains the enzymes for the Krebs cycle, which accepts the products of sugar and fatty acid catabolism and uses them to produce cofactors that donate their electrons into the electron transport chain of the inner membrane (see pp. 20, 31). The inner membrane is highly folded into cristae to increase its effective surface area. The protein complexes of the electron transport chain accept and donate electrons in redox reactions, releasing energy that is used to pump protons (H⁺) out of the matrix into the inter-membrane space. ATP synthase, another integral membrane protein, uses this H⁺ electrochemical gradient to drive formation of ATP. Mitochondria have many additional functions, including roles in apoptosis (see p. 32) and supply of substrates for biosynthesis. Mitochondria are also necessary for the synthesis of porphyrin, deficiency of which causes a range of diseases collectively called porphyrias (p. 1043).
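The H⁺ electrochemical gradient that drives ATP synthase has an electrical part and a concentration (pH) part, in the same way as the electrochemical gradient described for channels. As a hedged illustration, the standard chemiosmotic expression for the proton-motive force can be written as below. The symbols are assumptions introduced here, not notation from the text: Δψ is the electrical potential of the inter-membrane space minus that of the matrix, and ΔpH is the pH of the inter-membrane space minus that of the matrix.

```latex
\Delta p \;=\; \Delta\psi \;-\; \frac{2.303\,RT}{F}\,\Delta\mathrm{pH},
\qquad
\Delta\psi = \psi_{\text{IMS}} - \psi_{\text{matrix}},
\qquad
\Delta\mathrm{pH} = \mathrm{pH}_{\text{IMS}} - \mathrm{pH}_{\text{matrix}}
```

With this sign convention, proton pumping makes Δψ positive and ΔpH negative (the matrix is more alkaline), so both terms add to a positive Δp that favours proton return through ATP synthase.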

Mitochondria.

(Courtesy of Louisa Howard, Dartmouth EM Facility.)

FURTHER READING

Linton KJ, Holland IB. The ABC Transporters of Human Physiology and Disease. New Jersey: World Scientific; 2011.

Lowe M. Structural organization of the Golgi apparatus. Curr Opin Cell Biol 2011; 23:85–93.

SIGNIFICANT WEBSITE

http://www.cytochemistry.net/Cell-biology/

Nucleus

The most prominent cellular organelle, the nucleus, has a double membrane (the outer membrane is continuous with the ER) enclosing the human genome. The double membrane contains nuclear pores through which gene regulatory proteins, transcription factors and RNA that has been transcribed from the DNA, are transported. The nuclear matrix is highly organized. Microscopically dense regions of heterochromatin represent highly compacted chromosomal DNA which tends to be transcriptionally repressed. Lighter regions of euchromatin contain extended chromosomes which tend to be transcriptionally active. The most prominent nuclear compartment, the nucleolus, is where ribosomal RNA (rRNA) is synthesized and ribosomal subunits are assembled.

Nucleus showing dark regions of heterochromatin and lighter euchromatin.

(Courtesy of Louisa Howard, Dartmouth EM Facility.)

The cytoskeleton

A complex network of structural proteins regulates the shape, strength and movement of the cell, and the traffic of internal organelles and vesicles. The major components are microtubules, intermediate filaments and microfilaments.

Microtubules (20–25 nm diameter) are polymers of α and β tubulin. These tubular structures resist bending and stretching, and are polar with plus and minus ends. They emanate from the microtubule organizing centre (MTOC), a complex of centrioles, γ-tubulin and other proteins, with their plus ends extending into the cell. At their plus ends repeated cycles of assembly and disassembly permit rapid changes in length. Microtubules form a ‘highway’, transporting organelles and vesicles through the cytoplasm. The two major microtubule-associated motor proteins (kinesin and dynein) allow movement of cargo to the plus and minus ends, respectively. During cell division the MTOC forms the mitotic spindle (see p. 28). Drugs that disrupt microtubule assembly (e.g. colchicine and vinca alkaloids) or stabilize microtubules (taxanes) preferentially kill dividing cells by preventing mitosis.

Intermediate filaments (~10 nm) form a network around the nucleus extending to the periphery of the cell. They make cell-to-cell contacts with adjacent cells via desmosomes, and with basement matrix via hemidesmosomes (Fig. 2.5; see also Fig. 24.27). Their function appears to be structural integrity; they are prominent in cellular tissues under stress and their disruption in genetic disease can cause structural defects or cell collapse. More than 40 different types of proteins polymerize to form intermediate filaments specific to particular cell types. For example, keratin intermediate filaments are only found in epithelial cells whilst vimentin is found in mesenchymal (fibroblastic) cells. However, lamin intermediate filaments form the nuclear membrane skeleton in most cells.

Microfilaments (3–6 nm) are polymers of actin, one of the most abundant proteins in all cells. The actin microfilament network controls cell shape, prevents cellular deformation, and is involved in cell–cell and cell–matrix adhesion, in cell movements such as crawling and cytokinesis (cell division), and in intracellular vesicle transport. Bundles of actin filaments form the structural core of cellular protrusions such as microvilli, lamellipodia and filopodia (see below). Actin microfilament bundles within the cell can associate with myosin II to form contractile stress fibres, similar to muscle sarcomeres. Stress fibres are often found as circumferential belts around the apical surfaces of epithelial cells where cells associate with adjacent cells via adherens junctions, permitting reaction to external stresses as a cellular sheet. Stress fibres also form where actin interacts via accessory proteins with the extracellular matrix at sites of focal adhesion (see Fig. 2.8c). This occurs when cells move during inflammation, wound healing and metastasis. During cytokinesis actin-myosin II bundles form the contractile ring separating dividing cells. Like microtubules, microfilaments are polar, so can be used to transport secretory vesicles, endosomes and mitochondria, powered by motor proteins, including myosin I and V.

Figure 2.5 Cytoskeleton of epithelial cells. (a) Keratin in red, nuclei stained in blue. (b) Keratin filaments (in red) and a desmosomal plaque component, desmoplakin, in green.

(Reproduced with permission from Moll R, Divo M, Langbein L. The human keratins: biology and pathology. Histochemistry and Cell Biology 2008; 129:705–733.)

Actin microfilament network in an epithelial cell. Actin is orange, nuclei blue, green shows the endoplasmic reticulum.

(Courtesy of Carolyn Byrne, Queen Mary University, London.)

Microfilament (actin) cytoskeleton adhesion belt in polarized epithelial cells.

Cell shape and motility

The cytoskeleton determines cell shape and surface structures.

Microvilli. The apical surface of some epithelial cells is covered in tiny microvilli (~1 µm long) forming a brush border of thousands of small finger-like projections of the plasma membrane that increase the surface area for uptake or efflux (Fig. 2.6). At their core are 20–30 cross-linked actin microfilaments.

Figure 2.6 Cilia and microvilli in trachea. (a) Scanning electron microscopy image of cells bearing longer cilia with adjacent microvilli-bearing cells. (b) Transmission electron microscope image of section.

(Courtesy of Louisa Howard, Dartmouth EM Facility.)

Motile cilia are also fine, finger-like protrusions but these are longer (~10–20 µm long) (Fig. 2.6). At their core is an axoneme, a bundle of nine cross-linked tubulin microtubule doublets surrounding a central pair. The action of the motor protein dynein serves to bend the cilium. Neighbouring cilia tend to beat in unison, generating waves of motion that move fluid over the cell surface in the gut and airways (see Fig. 15.9), and also in the fallopian tubes.

Non-motile or primary cilia. Most cells also have a single primary cilium. These cilia have a variant axoneme with no central pair of microtubules and while they have dynein they are non-motile (the dynein is used to traffic cargo along the axoneme). Primary cilia are used for signalling during development and in the adult. Other related non-motile cilia are found in specialized cells, e.g. in the photoreceptors of the retina, the sensory neurones of the olfactory system, and in the sensory hair cells of the cochlea. A range of human ciliopathies (Fig. 2.7) have been described with pleiotropic symptoms depending on which cilia are affected. These include polycystic kidney disease, Bardet–Biedl syndrome (p. 1007), Joubert’s syndrome and Ellis–van Creveld syndrome.

Figure 2.7 Structure of a cilium showing ciliopathy proteins and intraflagellar transport (IFT). Some single-gene ciliopathies are shown along with their gene products situated in the cilia centrosome complex (CCC). Receptors on cilia receive external cell signals which are processed via sonic hedgehog (SHH) and Wnt pathways. The gene mutation can act during morphogenesis (e.g. Meckel’s syndrome) or during tissue maintenance and repair leading to degenerative disorders. The IFT system transports axoneme and membrane compounds in raft macromolecular particles (IFT cargo and complex). Retrograde transport occurs via cytoplasmic dynein. NPHP1, nephronophthisis type 1; TRPR1 and 2, polycystin 1 and 2.

(Adapted from Hildebrandt F, Benzing T, Katsanis N. Ciliopathies. New England Journal of Medicine 2011; 364:1533–1543.)

Flagella. The single flagellum found on sperm is structurally related to cilia but is longer (~40 µm) and has a whip-like motion.

Cell motility is essential during development and in the adult when macrophages migrate to sites of infection, keratinocytes migrate to close wounds, osteoclasts and osteoblasts tunnel into and remodel bone, and fibroblasts migrate to sites of injury to repair the extracellular matrix. Most cell motility in the adult human takes the form of cell crawling which is dependent on remodelling of the actin cytoskeleton. How the actin cytoskeleton is remodelled determines the mode of migration:

Filopodia: if remodelled essentially in one dimension into a long actin filament, the leading edge of the plasma membrane is pushed forward as spikes, similar to long thin villi.

Lamellipodia: if remodelled in two dimensions to form a network of cross-linked actin microfilaments, a broad flat skirt or lamellipodium is formed.

Pseudopodia: if remodelled into a gel-like lattice, more three-dimensional projections are formed.

Movement. A similar mechanism involving the coordinated remodelling of the cytoskeleton and the formation and release of cell adhesions underlies all three modes of migration. Essentially, actin is polymerized at the leading edge, extending the plasma membrane forward. New adhesions are formed with the substratum (cells and/or extracellular matrix) at the leading edge to provide purchase. Release of attachments and depolymerization of the actin filaments at the trailing edge then allows the cell to move forward. Myosin motor proteins may also be involved at the trailing edge, providing the tractive force to pull the cell body forward. The complex coordination of these processes is controlled via signalling pathways involving members of the Rho protein family of GTPases (see p. 21). Key signalling targets are the WASp family of proteins which stimulate actin polymerization. The significance of cell motility in humans is illustrated by mutation of the WASp expressed in blood cell lineages, which causes Wiskott–Aldrich syndrome (p. 66), characterized by severe immunodeficiency and thrombocytopenia (platelet deficiency).

FURTHER READING

Hildebrandt F, Benzing T, Katsanis N. Ciliopathies. N Engl J Med 2011; 364:1533–1543.

The cell and its environment

Most cells differentiate or specialize to perform particular functions within tissues where they interact with the extracellular matrix (ECM) or other cells. The major tissue types are epithelia and connective tissues as well as muscle and neural tissue:

Epithelial tissues comprise layers of cells held tightly together by intercellular junctions and are usually separated from underlying tissue by specialized ECM called basal lamina. Epithelia cover surfaces (e.g. epidermis, tongue surface) and line passageways (airways, digestive tract, blood vessels), providing protection and regulating absorption and secretion.

Connective tissues provide support to other tissues and give organs shape. They comprise cells (fibroblasts) embedded within ECM such as the matrix of bone, dermis of skin and the fluid matrix of blood.

Extracellular matrix

The ECM is the gel matrix outside the cell, usually secreted by fibroblasts. ECM determines tissue properties, e.g. in bone it is calcified; in tendons it is tough and rope-like; and in neural tissue it is almost absent. However, ECM is more than just a support matrix. It affects cell shape, migration, cell-cell communication and signalling, proliferation and survival.

The gel or ground substance of the ECM is made from polysaccharides (glycosaminoglycans or GAGs), usually bound to proteins to form proteoglycans (p. 494). These are a diverse group of molecules conferring different matrix properties in different tissues. They form hydrated gels which can resist compression yet permit diffusion of metabolites and signalling molecules.

Hyaluronan, a very large hydrated GAG, is secreted into the joint space in synovial joints (p. 493), where it aids lubrication and helps reduce compressive forces.

Aggrecan, a very large proteoglycan, forms part of the articular cartilage of joints (p. 494) also contributing to compression resistance.

Decorin is a much smaller proteoglycan from loose connective tissue of skin with both structural and signalling function (through binding and regulating growth factor activity).

Fibrous proteins of ECM (p. 495) include collagens and tropoelastin, which polymerize into collagen and elastin fibres, and fibronectin which is insoluble in many tissues but soluble in plasma. Collagen provides tensile strength, elastin confers elasticity, while the widely distributed fibronectin adheres to both cells and ECM, and thus positions cells within the ECM. Collagens, the most abundant proteins in the body, are widely distributed and play a structural role in skin and bone, where collagen defects and disorders often manifest. Elastin fibres are abundant in arteries, lung and skin. Elastic fibres have a fibrillin sheath and fibrillin mutations underlie Marfan’s syndrome (p. 760). The ECM can be degraded and remodelled by proteins of the matrix metalloproteinase (MMP) family. These are needed for angiogenesis and morphogenesis and are also involved in the pathophysiology of cancer, cirrhosis and arthritis.

Basal lamina or basement membrane is a specialized form of ECM, which separates cells from underlying tissue and provides a supportive, anchoring and protective role. Basal laminae can also act as molecular filters (e.g. glomerular filtration barrier, p. 636) and mediate signalling between adjacent tissues (e.g. epidermal-dermal signalling in skin). Type IV collagen, heparan sulphate proteoglycan, laminin and nidogen are key basal lamina proteins. Inherited abnormalities in these proteins cause skin blistering diseases (see Fig. 24.27). Breach of the basal lamina by invading cancer cells is a key stage in progression of epithelial carcinoma in situ to a malignant carcinoma.

Cell–cell adhesion

Cells need to interact directly for barrier function, tissue strength and to communicate. This is mediated by several types of proteins that form junctions between cells.

Cell–cell adhesion proteins (Fig. 2.8a)

As well as adhesion via multiprotein junctions, intercellular adhesion is achieved by individual transmembrane proteins.

Immunoglobulin-like cell adhesion molecules (iCAMs or CAMs) (Fig. 2.8a) are structurally related to antibodies. The neural cell adhesion molecule (N-CAM) is found predominantly in the nervous system. It mediates a homophilic (like-like) adhesion. When bound to an identical molecule on another cell, N-CAM can also associate laterally with a fibroblast growth factor receptor and stimulate its tyrosine kinase activity to induce neurite growth, thus triggering cellular responses by indirect activation of the receptor.

Selectins. Unlike most adhesion molecules which bind to other proteins, the selectins interact with carbohydrate ligands or mucin complexes on leucocytes and endothelial cells (vascular and haematological systems). Leucocyte-selectin (CD62L) mediates the homing of lymphocytes to lymph nodes. Endothelial-selectin (CD62E) is expressed after activation by inflammatory cytokines; the small basal amount of E-selectin in many vascular beds appears to be necessary for the migration of leucocytes. Platelet-selectin (CD62P) is stored in the alpha granules of platelets and the Weibel–Palade bodies of endothelial cells, but it moves rapidly to the plasma membrane upon stimulation of these cells. All three selectins play a part in leucocyte rolling (p. 63).

Integrins are membrane glycoproteins with α and β subunits which exist as active and inactive forms. The amino acid sequence arginine–glycine–aspartic acid (RGD) is a potent recognition system for integrin binding.

Figure 2.8 Cell adhesion molecules and cellular junctions. (a) Major groups of adhesion molecules. (b) Adjacent cells form focal adhesion junctions. (c) Basement membrane adhesion.

Focal adhesion junctions between adjacent cells

See Figure 2.8b.

Tight junctions (zonula occludens)

These are mediated by the integral membrane proteins claudins and occludins, which hold cells together. They form at the top (apical) side of epithelial cells including intestinal, skin and kidney cells, and endothelial cells of blood vessels (Fig. 2.8), to provide a regulated barrier to the movement of ions and solutes across the epithelia or endothelia by passage between cells (paracellular transport). Tight junctions also confer polarity to cells by acting as a gate between the apical and the baso-lateral membranes, preventing diffusion of membrane lipids and proteins. Twenty-four claudins (the protein in the junction) are differentially expressed in different cell types to regulate paracellular transport. For example, changes in claudin expression in the kidney nephron correlate with permeability changes. Mutations in claudins 16 (previously named paracellin-1) and 19, expressed in the thick ascending limb of the loop of Henle in the kidney, cause an inherited renal disorder, familial hypomagnesaemia with hypercalciuria and nephrocalcinosis (FHHNC; p. 657).

Gap junctions

Gap junctions (Fig. 2.8) allow low molecular weight substances to pass directly between cells, permitting metabolic and electric coupling (e.g. in cardiomyocytes). Protein channels made of six connexin proteins are aligned between adjacent cells and allow the passage of solutes up to about 1000 Da (e.g. amino acids, sugars, ions, chemical messengers). The channels are regulated by many factors such as intracellular Ca²⁺, pH and voltage. Gap junctions form in almost all interacting cells, but connexin family members are differentially expressed. Mutant connexins cause many inherited disorders, such as the X-linked form of Charcot–Marie–Tooth disease (GJB1; p. 1147) and are also a major cause of genetic hearing loss (GJB2).

Adherens junctions

Adherens junctions are multiprotein intercellular adhesive structures, prominent in epithelial tissues (Fig. 2.8b). They attach principally to actin microfilaments inside the cell with the aid of multiple additional proteins, and also attach and stabilize microtubules. At the apical sides of epithelial cells a prominent type of adherens junction, the zonula adherens, attaches to the circumferential actin stress fibres. The fascia adherens in cardiac muscle is also an adherens junction. Transmembrane proteins of the cadherin family provide the adhesion through interaction of their extracellular domains. Downregulation of cadherins is a feature of cancer progression in many cells.

Desmosomes (macula adherens)

Desmosomes provide strong attachment between cells and are prominent in tissues subject to stress such as skin and cardiac muscle (see Fig. 2.5, Fig. 2.8b and Fig. 24.1). Like adherens junctions, they are multiprotein complexes, where adhesion is provided by transmembrane cadherin proteins, desmogleins and desmocollins. However, within the cell desmosomes interact principally with intermediate filaments rather than microfilaments and microtubules. Germline mutations in genes encoding desmosomal proteins are a cause of cardiomyopathy with or without cutaneous features, and desmosomal proteins (desmogleins) are the targets of autoantibodies in pemphigus vulgaris and pemphigus foliaceus (p. 1222).

Basement membrane adhesion

Cells adhere (Fig. 2.8c) to non-basal lamina ECM via secreted proteins such as fibronectin and collagen, and to basal lamina proteins via focal adhesion and hemidesmosome multiprotein complexes (e.g. keratin or vimentin). Here, integrins replace cadherins as the key surface adhesion molecules. Integrins are transmembrane sensors or receptors, which change shape upon binding to ECM, a process called ‘outside-in’ signalling. Inside the cell, integrins interact with the cytoskeleton and a complex array of over 150 proteins that influence intracellular signalling pathways affecting proliferation, survival, shape, mobility and gene expression.

Outside-in signalling: forms the basis for anoikis or apoptotic death, such as occurs in cancer cells that inappropriately lose cell-substratum adhesion.

Inside-out signalling: intracellular changes can also be communicated extracellularly via integrins, whereby intracellular changes cause integrins to change from an inactive to an actively adhesive conformation. This ‘inside-out’ signalling occurs when the platelet integrin glycoprotein IIb-IIIa (GPIIb-IIIa) is activated to bind fibrinogen at sites of vessel injury, resulting in platelet aggregation (p. 415 and Fig. 8.41).

Defective integrins are associated with many immunological and clotting disorders such as Bernard–Soulier syndrome and Glanzmann’s thrombasthenia (p. 420).

FURTHER READING

De Matteis MA, Luini A. Mendelian disorders of membrane trafficking. N Engl J Med 2011; 365:927–928.

Jean C, Gravelle P, Fournie JJ et al. Influence of stress on extracellular matrix and integrin biology. Oncogene 2011; 30:2697–2706.

Thomason HA, Scothern A, McHarg S et al. Desmosomes: adhesive strength and signalling in health and disease. Biochem J 2010; 429:419–433.

Cellular mechanisms

Cell signalling

Signalling or communication between cells is often via extracellular molecules or ligands which can be proteins (e.g. hormones, growth factors), small molecules (e.g. lipid-soluble steroid hormones such as oestrogen and testosterone) or dissolved gases such as nitric oxide. The signal is usually received by membrane protein receptors, although some signals, such as steroid hormones, enter the target cell where they interact with intracellular receptors (Fig. 2.9). Some signalling, especially in the immune system, relies on cell–cell contact, where the signalling molecule (ligand) and receptor are on adjacent cells.

Figure 2.9 Cell signalling. (i) G-protein receptor binds ligand (e.g. hormone) and activates G-protein complex. The G-protein complex can activate three different second messengers: (a) cAMP generation; (b) inositol 1,4,5-trisphosphate (IP₃) and release of Ca²⁺; (c) diacylglycerol (DAG) activation of C-kinase and subsequent protein phosphorylation. (ii) Enzyme-linked receptors often dimerize upon ligand binding. Intracellular domains cross-phosphorylate and link to phosphorylation cascades such as the MAP kinase cascade, via molecules such as Ras. (iii) Lipid-soluble molecules, e.g. steroids, pass through the cell membrane and bind to cytoplasmic receptors, which enter the nucleus and bind directly to DNA.

Receptors transduce signals across the membrane to an intracellular pathway or second messengers to change cell behaviour, often ultimately affecting gene expression (Figs 2.9, 2.10). The membrane-bound receptors fall into three main groups based on downstream signalling pathways:

Ion channel linked receptors (voltage or ligand activated ion channels; see Fig. 2.3). At synaptic junctions between neurones (Fig. 22.1), these receptors open in response to neurotransmitters such as glutamate, epinephrine (adrenaline) or acetylcholine to cause a rapid depolarization of the membrane.

G-protein-linked receptors, such as the odorant and light (opsin) receptors, belong to a large family of seven-pass transmembrane proteins (see Figs 2.2 and 2.9). On activation by ligand, G-protein-linked receptors bind a GTP-binding protein (G-protein), which activates adjacent enzyme complexes or ion channels (Figs 2.9 and 22.1). One such adjacent enzyme is adenylate cyclase (see below).

Enzyme-linked receptors (Figs 2.2 and 2.9) typically have an extracellular ligand-binding domain, a single transmembrane-spanning region, and a cytoplasmic domain that has intrinsic enzyme activity or which will bind and activate other membrane-bound or cytoplasmic enzyme complexes. This group of receptors is highly variable but many have kinase activity or associate with kinases, which act by phosphorylating substrate proteins usually on a tyrosine (e.g. the platelet-derived growth factor (PDGF) receptor) or a serine/threonine (e.g. the transforming growth factor-beta (TGF-β) receptor).

Signal transduction

Signal transduction from the receptor to the site of action in the cell is mediated by small signalling molecules called second messengers, or by signalling proteins (Fig. 2.9). Changes to activity of signalling proteins by acquired mutation occur in cancer, and many anti-cancer drugs target signalling pathways. For example, the Hedgehog pathway is involved in human development, tissue repair and cancer (Fig. 2.10). Inhibitors of this pathway are being developed for therapeutic interventions. The Wnt pathway is also involved in bone formation (p. 550).

Second messengers include cAMP and the lipid-derived inositol 1,4,5-trisphosphate (IP₃) and diacylglycerol (Fig. 2.9). These molecules diffuse from the receptor to bind and change the activity of downstream proteins, propagating the signal. cAMP triggers a protein signalling cascade by activating a cAMP-dependent protein kinase. Diacylglycerol activates protein kinase C while IP₃ mobilizes calcium from intracellular stores (e.g. from the ER; Fig. 14.9).

G-proteins or GTP-binding proteins are signalling proteins which switch between an active state when GTP is bound and an inactive state when bound to GDP. The most well-known members are the Ras superfamily, comprising Ras, Rho, Rab, Arf and Ran families. Activation of Ras members by somatic mutation is found in ~33% of human cancers. Ras members are often bound downstream of tyrosine kinase receptors, where they transmit signals by activating a cascade of downstream protein kinase activity (Fig. 2.9). Ras signalling molecules have roles in many cellular activities, including regulation of cell cycle, intracellular transport, and apoptosis.

Kinase and phosphatase signalling proteins are enzymes that phosphorylate or dephosphorylate residues on downstream proteins to alter their activity. Chains of kinase activity (phosphorylation cascades) consisting of sequential phosphorylation of proteins can transduce signals from the membrane receptor to the site of action in the cell. The tyrosine kinase receptors phosphorylate each other when ligand binding brings the intracellular receptor components into close proximity (see Fig. 2.9). The inner membrane and cytoplasmic targets of these activated receptor complexes are Ras, protein kinase C and ultimately the MAP (mitogen-activated protein) kinase and Janus-Stat pathways, or phosphorylation of IκB, causing it to release its DNA-binding protein, nuclear factor kappa B (NFκB). For example, activated Ras binds and activates the kinase Raf, the first of a set of three MAP kinases, which transmit signals by successive phosphorylation of target proteins and can ultimately affect transcription (Fig. 2.9). Kinases and phosphatases are frequently mutated in cancers. Somatic mutations in one Raf member, B-Raf, occur in ~60% of malignant melanomas (usually the mutation V600E) and are common in other cancers (p. 1225).

Figure 2.10 Signal transduction showing the Hedgehog and Wnt signalling pathways. (a) Wnt signalling has three pathways: the canonical (β-catenin), Wnt/Ca²⁺ and planar cell polarity pathways. Wnt binds to the Frizzled protein, and Dishevelled activity then inhibits, via other pathways, phosphorylation of β-catenin. This alters gene transcription. TCF, T-cell factor. (b) Hedgehog ligand (Hh) binds to a 12-transmembrane protein receptor Patched (Ptc). This acts as an inhibitor of Smoothened (Smo), another transmembrane protein related to the Frizzled family of Wnt receptors. In the presence of Hh the inhibitory effects of Ptc on Smo are removed and Smo is phosphorylated by protein kinase A and other kinases. This prevents cleavage of Ci, which enters the nucleus, inducing the transcription of Hh target genes. Ci, cubitus interruptus, a zinc finger protein.

FURTHER READING

Alberts B, Johnson A, Lewis J et al. Molecular Biology of the Cell, 5th edn. New York: Garland Science; 2007.

Dlugosz AA, Talpaz M. Following the hedgehog to new cancer therapies. N Engl J Med 2009; 361:1202–1205.

SIGNIFICANT WEBSITE

Cell signalling information

http://www.biochemj.org/csb/

Nuclear control

DNA and RNA structure

Hereditary information is contained in the sequence of the building blocks of double-stranded deoxyribonucleic acid (DNA) (Fig. 2.11). Each strand of DNA is made up of a deoxyribose-phosphate backbone and a series of purine (adenine (A) and guanine (G)) and pyrimidine (thymine (T) and cytosine (C)) bases, and because of the way the sugar phosphate backbone is chemically coupled, each strand has a polarity with a phosphate at one end (the 5′ end) and a hydroxyl at the other (the 3′ end). The two strands of DNA are held together by hydrogen bonds between the bases. A can only pair with T, and G can only pair with C, therefore each strand is the antiparallel complement of the other (Fig. 2.11b). This is key to DNA replication because each strand can be used as a template to synthesize the other.
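The pairing rules above mean that either strand fully determines its partner. A minimal Python sketch of this follows; the function name and the example sequence are illustrative assumptions, not taken from the text.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5to3: str) -> str:
    """Return the partner strand, also written 5' to 3'.

    The partner runs antiparallel, so the bases are complemented and the
    order is reversed to keep the 5' to 3' reading convention.
    """
    return "".join(PAIRS[base] for base in reversed(strand_5to3))


top = "ATGGCA"  # hypothetical sequence, written 5' to 3'
bottom = reverse_complement(top)
print(bottom)  # TGCCAT

# Using the new strand as a template regenerates the original strand,
# which is the property that DNA replication relies on.
print(reverse_complement(bottom) == top)  # True
```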

Figure 2.11 DNA and its structural relationship to human chromosomes. (a) A polynucleotide strand with the position of the nucleic bases indicated. Individual nucleotides form a polymer linked via the deoxyribose sugars. The 5′ carbon of the heterocyclic sugar structure is coupled to a phosphate molecule. The 3′ carbon couples to the phosphate on the 5′ carbon of deoxyribose of the next nucleotide forming the sugar-phosphate backbone of the nucleic acid. The 5′ to 3′ linkage gives orientation to a sequence of DNA. (b) Double-stranded DNA. The two strands of DNA are held together by hydrogen bonds between the bases. T always pairs with A, and G with C. The orientation of the complementary single strands of DNA (ssDNA) is thus complementary and anti-parallel, i.e. one will be 5′ to 3′ while the partner will be 3′ to 5′. The helical 3D structure has major and minor grooves and a complete turn of the helix contains about 10 base-pairs. These grooves are structurally important, as DNA-binding proteins predominantly interact with the major grooves. (c) Supercoiling of DNA. The large stretches of helical DNA are coiled to form nucleosomes and further condensed into the chromosomes that can be seen at metaphase. DNA is first packaged by winding around nuclear proteins – histones – every 180 bp. This can then be coiled and supercoiled to compact nucleosomes and eventually visible chromosomes. (d) At the end of the metaphase DNA replication results in a twin chromosome joined at the centromere. This picture shows the chromosome, its relationship to supercoiling, and the positions of structural regions: centromeres, telomeres and sites where the double chromosome can split. Chromosomes are assigned a number or X or Y, plus short arm (p) or long arm (q). The region or subregion is defined by the transverse light and dark bands observed when staining with Giemsa (hence G-banding) or quinacrine and numbered from the centromere outwards.
Chromosome constitution = chromosome number + sex chromosomes + abnormality; e.g. 46XX = normal female; 47XX+21 = Down’s syndrome (trisomy 21); 46XYt(2;19)(p21;p12) = male with a normal number of chromosomes but a translocation between chromosomes 2 and 19 with breakages at short-arm bands 21 and 12 of the respective chromosomes.

The two strands twist to form a double helix with a major and a minor groove, and the large stretches of helical DNA are coiled around histone proteins to form nucleosomes (Fig. 2.11c). They can be condensed further into the chromosomes that can be visualized by light microscopy at metaphase (see below; Fig. 2.11, Fig. 2.19).

To express the information in the genome, cells first transcribe the code into single-stranded ribonucleic acid (RNA). RNA is similar to DNA in that it comprises four bases: A, G and C as in DNA, but with uracil (U) instead of T. Its sugar phosphate backbone contains ribose instead of deoxyribose. Several types of RNA are made by the cell. Messenger RNA (mRNA) codes for proteins that are translated on ribosomes. Ribosomal RNA (rRNA) is a key catalytic component of the ribosome, and amino acids are delivered to the nascent peptide chain on transfer RNA (tRNA) molecules. There are also a variety of RNAs that regulate gene expression or RNA processing. These include microRNA (miRNA) and small interfering RNA (siRNA) (see p. 27) that typically bind to a subset of mRNAs and inhibit their translation, or initiate their degradation, respectively. Other non-coding RNAs are involved in X-inactivation and telomere maintenance or RNA splicing and maturation.

DNA transcription

A gene is a length of DNA (usually 20–40 kb but the muscle protein dystrophin is encoded by 2.4 Mb) that contains the codes for a polypeptide sequence. Three adjacent nucleotides (a codon) specify a particular amino acid, such as AGA for arginine. There are only 20 common amino acids, but 64 possible codon combinations make up the genetic code. This redundancy means that most amino acids are encoded by more than one triplet and other codons are used as signals for initiating or terminating polypeptide-chain synthesis.
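The arithmetic behind the redundancy of the code can be checked directly: four bases in three positions give 4³ = 64 codons for only 20 amino acids. The sketch below is an illustration that enumerates the codons in RNA letters against the standard genetic code; the one-letter amino acid string and the variable names are conventions introduced here, not part of the text.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
code = dict(zip(codons, AMINO))

n_codons = len(code)                          # 4 ** 3 possible triplets
amino_acids = set(code.values()) - {"*"}      # distinct amino acids encoded
stops = sorted(c for c, aa in code.items() if aa == "*")
arg_codons = sorted(c for c, aa in code.items() if aa == "R")

print(n_codons)          # 64
print(len(amino_acids))  # 20
print(stops)             # ['UAA', 'UAG', 'UGA']
print(code["AGA"])       # R (arginine), the example used in the text
print(len(arg_codons))   # 6 codons encode arginine
```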

RNA is transcribed from the DNA template by an enzyme complex of more than one hundred proteins including RNA polymerase, transcription factors and enhancers. Promoter regions upstream of the gene dictate the start point and direction of transcription. The complex binds to the promoter region, the nucleosomes are remodelled to allow access, and a DNA helicase unwinds the double helix. RNA, like DNA, is synthesized in the 5′ to 3′ direction as ribonucleotides are added to the growing 3′ end of a nascent transcript. RNA polymerase does this by base-pairing the ribonucleotides to the DNA template strand running in the 3′ to 5′ direction. Messenger RNA is modified as it is synthesized (Fig. 2.12). It is capped at the 5′ end with a modified guanine that is required for efficient processing of the mRNA and efficient translation, and introns are spliced from the nascent chain. Finally, the 3′ of the mRNA is modified with up to 200 A nucleotides by the enzyme poly-A polymerase. This 3′ poly-A tail is essential for nuclear export (through the nuclear pores), stability and efficient translation into protein by the ribosome.
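The direction rules in this paragraph can be made concrete with a short sketch. It assumes the template strand is supplied as it is read by the polymerase (3′ to 5′), so the RNA comes out 5′ to 3′. The function names, the example template and the default tail length are illustrative assumptions, and capping and splicing are not modelled.

```python
# Ribonucleotides paired against each DNA template base (U replaces T in RNA).
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA, 5' to 3', for a DNA template written 3' to 5'."""
    return "".join(RNA_PAIR[base] for base in template_3to5)


def add_poly_a(mrna: str, tail_length: int = 200) -> str:
    """Append a 3' poly-A tail; the text gives up to 200 A nucleotides."""
    return mrna + "A" * tail_length


template = "TACGGT"  # hypothetical template strand, written 3' to 5'
mrna = transcribe(template)
print(mrna)                 # AUGCCA
print(add_poly_a(mrna, 5))  # short tail chosen only to keep the output readable
```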

Human protein coding sequences (exons) are interrupted by intervening sequences that are non-coding (introns) at multiple positions (Fig. 2.12). These have to be spliced from the nascent message in the nucleus by an RNA/protein complex called a spliceosome. Differential splicing describes the process by which two or more introns and their intervening exons are spliced from the mRNA. This contributes significantly to the complexity of the human transcriptome as proteins translated from these messages lack particular domains. This exon skipping can produce different protein activities.
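Exon skipping can be pictured as joining a chosen subset of exon segments. The following is a toy sketch only: the coordinates, the sequence and the `splice` helper are hypothetical, and real splice-site recognition by the spliceosome is not modelled.

```python
def splice(pre_mrna, exons, skip=()):
    """Join exon segments given as (start, end) index pairs.

    Exons whose position in the list appears in `skip` are removed together
    with their flanking introns, mimicking exon skipping.
    """
    return "".join(
        pre_mrna[start:end]
        for index, (start, end) in enumerate(exons)
        if index not in skip
    )


# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AAA" + "guuag" + "CCC" + "guuag" + "GGG"
exons = [(0, 3), (8, 11), (16, 19)]

print(splice(pre_mrna, exons))            # AAACCCGGG (all exons retained)
print(splice(pre_mrna, exons, skip={1}))  # AAAGGG (middle exon skipped)
```

The two outputs represent two messages from one gene, the second lacking whatever domain the middle exon encodes.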

Figure 2.12 Transcription and translation (DNA to RNA to protein). RNA polymerase creates an RNA copy of the DNA gene sequence. This primary transcript is processed: capping of the 5′ free end of the mRNA precursor involves the addition of an inverted guanine residue to the 5′ terminal which is subsequently methylated, forming a 7-methylguanosine residue.

The 3′ end of mRNA defined by the sequence AAUAAA acts as a cleavage signal for an endonuclease, which cleaves the growing transcript about 20 bp downstream from the signal. The 3′ end is further processed by a poly-A polymerase which adds adenosine residues to the 3′ end, forming a poly-A tail (polyadenylation).

Splicing out of the introns then produces the mature mRNA, which is trafficked out of the nucleus via nuclear pores. Ribosomal subunits assemble on the mRNA moving along 5′ to 3′.

With the transport of amino acids to their active sites by specific tRNAs, the complex translates the code, producing the peptide sequence.

Control of gene expression

The genome of all cells in the body encodes the same genetic information, yet different cell types express a very different subset of proteins and respond to external signals to switch on a new set of genes or to switch off a pathway. Gene expression can be controlled at many steps from transcription to protein degradation. However, for many genes transcription is the key point of regulation. This is controlled primarily by proteins which bind to short sequences within the promoter regions that either repress or activate transcription, or to more distant sequences where proteins bind to enhance expression. These transcription factors and enhancers are often the end points of signalling pathways that transduce extracellular signals to changes in gene expression (Fig. 2.9).

Often this involves the translocation of an activated factor from the cytoplasm to the nucleus. In the nucleus the DNA binding proteins recognize the shape and position of hydrogen bond acceptor and donor groups within the major and minor grooves of the double helix (i.e. the double helix does not need to be unwound). There are several classes of DNA binding protein that differ in the protein structural motif that allows them to interact with the double helix. These primarily include helix-turn-helix, zinc finger and leucine zipper motifs, although protein loops and β-sheets are used by some proteins. More permanent control of gene expression patterns can be achieved epigenetically. These are modifications (typically methylation and/or acetylation) of the DNA, or the histones of the nucleosome, that silence genes. Epigenetic modification is also heritable meaning that a dividing liver cell, for example, can give rise to two daughter cells with the same epigenetic signals such that they express the appropriate transcriptome for a liver cell. Epigenetic change forms the basis of genetic imprinting (see p. 42).

Most of the genome is transcribed but only a minority of transcripts encode proteins (see Human Genetics, p. 34). The non-coding RNAs (ncRNAs) include a group that regulate gene expression (see DNA and RNA structure). miRNAs and siRNAs are short ncRNAs (19–29 bp) that are known to regulate expression of approximately 30% of genes by degradation of transcripts or repression of protein synthesis. With further annotation of the genome a growing range of additional regulatory ncRNA classes are being identified, many of which control gene expression by epigenetic mechanisms.

FURTHER READING

Bernstein BE, Meissner A, Lander ES. The mammalian epigenome. Cell 2007; 128:669–681.

Zhou H, Hu H, Lai M. Non-coding RNAs and their epigenetic regulatory mechanisms. Biol Cell 2010; 102:645–655.

The cell cycle and mitosis

The cell duplication cycle has four phases, G1, S, G2 and mitosis (Fig. 2.13), and takes about 20–24 hours to complete for a rapidly dividing adult cell. G1, S and G2 are collectively known as interphase during which the cells double in mass (the two gap phases are used for growth) and duplicate their 46 chromosomes (S phase). Mitosis describes, in four sub-phases (prophase, metaphase, anaphase and telophase), the process of chromosome separation and nuclear division before cytokinesis (division of the cytoplasm into two daughter cells).

Figure 2.13 The cell cycle. Cells are stimulated to leave non-cycle G0 to enter G1 phase by growth factors. During G1, transcription of the DNA synthesis molecules occurs. Rb is a ‘checkpoint’ (inhibition molecule) between G1 and S phases and must be removed for the cycle to continue. This is achieved by the action of the cyclin-dependent kinase produced during G1. During the S phase, any DNA defects will be detected and p53 will halt the cycle (see p. 46). Following DNA synthesis (S phase), cells enter G2, a preparation phase for cell division. Mitosis takes place in the M phase. The new daughter cells can now either enter G0 and differentiate into specialized cells, or re-enter the cell cycle.

Phases of mitosis. DNA is in blue and the microtubules of the cytoskeleton and mitotic spindle in green. The red marker CENP-V labels kinetochores in prometaphase and metaphase, the mid-zone in anaphase and the mid-body in cytokinesis.

(Courtesy of Tadeu AM, Ribeiro S, Johnston J et al. CENP-V is required for centromere organization, chromosome alignment and cytokinesis. The EMBO Journal 2008; 27:2510–2522.)

Synthesis phase; DNA replication

DNA synthesis is initiated simultaneously at multiple replication forks in the genome and is catalysed by a multienzyme complex. The key components of the replication machinery are:

DNA helicase which hydrolyses ATP to unwind the double helix and expose each strand as a template for replication. The two strands are antiparallel, and because DNA can only be extended by addition of nucleotide triphosphates to the 3′-hydroxyl end of the growing chain, replication of each strand must be treated differently. For one strand, called the leading template strand, the replication fork is moving in a 3′ to 5′ direction along the template, meaning that the newly synthesized strand is being synthesized in a 5′ to 3′ direction.

DNA primase synthesizes a short (~10 nucleotide) RNA molecule annealed to the DNA template which acts as a primer for DNA polymerase.

DNA polymerase extends the primer by adding nucleotides to the 3′-end. For the leading template strand, the RNA primer is only required to initiate synthesis once and polymerization continues just behind the replication fork. For the antiparallel strand, the template is being exposed in a 5′ to 3′ direction and DNA primase is required to synthesize RNA primers every ~200 nucleotides to prime DNA synthesis in the opposite direction to the replication fork. To allow for this, the synthesis against this template is delayed and so it is called the lagging strand and requires more of the strand to be exposed for DNA primase and DNA polymerase to engage.

Single-strand DNA binding proteins are required to bind to the exposed single-strand DNA and stabilize it in single-strand form. Once DNA polymerase has extended the new strand to cover the 200 nucleotides between each RNA primer (the single-strand RNA/DNA hybrid is called an Okazaki fragment):

– RNase H removes the RNA primer from the preceding Okazaki fragment and DNA polymerase extends the new strand over the gap.

– DNA ligase joins the two DNA fragments together.
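The difference in priming between the two strands can be put into rough numbers. This back-of-envelope sketch uses the figures quoted above (a primer of about 10 nucleotides, laid down about every 200 nucleotides on the lagging strand). The function name and the 10 000-nucleotide stretch are illustrative assumptions.

```python
PRIMER_LENGTH = 10    # approximate RNA primer length from the text (nt)
PRIMER_SPACING = 200  # approximate distance between lagging-strand primers (nt)


def primers_needed(template_length: int, lagging: bool) -> int:
    """Estimate how many RNA primers are laid down on one template strand."""
    if not lagging:
        return 1  # the leading strand is primed once and extended continuously
    # Ceiling division: one primer (one Okazaki fragment) per ~200 nt exposed.
    return -(-template_length // PRIMER_SPACING)


length = 10_000  # hypothetical stretch of template, in nucleotides
leading = primers_needed(length, lagging=False)
lagging = primers_needed(length, lagging=True)

print(leading)                  # 1
print(lagging)                  # 50 Okazaki fragments to be joined by DNA ligase
print(lagging * PRIMER_LENGTH)  # 500 nt of RNA primer for RNase H to remove
```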

The phases of mitosis

Prophase. The two sister chromatids (the replicated chromosomes held together by a protein complex called a kinetochore) condense in the nucleus. The two centrosomes between which the microtubules of the mitotic spindle will form move apart in the cytoplasm. At the end of prophase (sometimes known as prometaphase), the nuclear membrane breaks down and the spindle microtubules attach to the kinetochores.

Metaphase. The chromosomes are aligned on a central plane with the two centrosomes at opposite poles. The sister chromatids are attached to microtubules from different centrosomes via the kinetochore.

Anaphase. The sister chromatids separate and are pulled in opposite directions as the microtubules shorten towards their respective spindle poles.

Telophase. Each set of daughter chromosomes is held at a spindle pole and the nuclear envelope reforms around the genome of each new daughter cell.

Cytokinesis

Binary fission of the cytoplasm begins in telophase before the completion of mitosis with the appearance of a ring of actin and myosin filaments around the equator of the cell. Cytokinesis is completed as the ring contracts to create a cleavage furrow and separate the two daughter cells.

Control of the cell cycle and checkpoints

Cells can exit the cell cycle and become quiescent. Indeed most terminally-differentiated adult cells are in a phase termed G0 in which the cycling machinery is switched off. In some cell types the switch is irreversible (e.g. in neurones), but others, like hepatocytes, retain the ability to re-enter the cell cycle and proliferate. This gives the liver a significant ability to regenerate following damage.

Cyclin-dependent kinases (Cdks), Retinoblastoma (Rb) and p53

Progression through the cell cycle is tightly controlled and punctuated by three key checkpoints when the cell interprets environmental and cellular signals to determine whether it is appropriate or safe to proceed (Fig. 2.13). The switches that allow progression beyond these checkpoints are a family of small protein complexes called cyclin-dependent kinases (Cdks) that phosphorylate serines or threonines in key target proteins at each stage. It is the regulatory cyclin subunit of the Cdks that oscillates during the cell cycle (the actual kinase domain may be present throughout but only activated by the transient expression of its cognate cyclin).

Checkpoints

Synthesis and secretion

Protein translation

The mature mRNA is transported through the nuclear pore into the cytoplasm for translation into protein by ribosomes (Fig. 2.12).

The two subunits of ribosomes (the 40S and 60S) are formed in the nucleolus from multiple proteins and several rRNAs, before transport to the cytoplasm.

In the cytoplasm, the two subunits interact on an mRNA molecule, usually via ribosome binding sites encoded in the untranslated 5′ region of the message. The mRNA is then pulled through the ribosome until a translation initiation codon is encountered (usually an AUG coding for methionine).

The triplets of adjacent bases of the mRNA (codons) are exposed and recognized by complementary sequences, or anti-codons, in tRNA molecules that dock on the ribosome.

Each tRNA molecule carries an amino acid specific to the anti-codon. As the mRNA is pulled through the ribosome in the 5′ to 3′ direction, amino acids are transferred from tRNA molecules and sequentially linked to the carboxy-terminus of the growing polypeptide by the peptidyl transferase activity of the ribosome.

The poly-A tail of the mRNA is not translated (3′ untranslated region) and is preceded by a translational stop codon, UAA, UAG or UGA.
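The steps above can be sketched as a simple scan along the message: find the first AUG, read successive codons 5′ to 3′, and stop at UAA, UAG or UGA. This is a simplified illustration rather than a model of the ribosome. The table is the standard genetic code in one-letter amino acid notation, and the function name and example message are assumptions.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks the three stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip(("".join(t) for t in product(BASES, repeat=3)), AMINO))


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODE[mrna[i:i + 3]]
        if residue == "*":
            break  # stop codon reached; the 3' untranslated region follows
        peptide.append(residue)  # new residue joins the carboxy-terminus
    return "".join(peptide)


# Hypothetical message: 5' leader, AUG AGA CCA, stop codon UAA, then A residues.
message = "GG" + "AUG" + "AGA" + "CCA" + "UAA" + "AAAA"
print(translate(message))  # MRP (methionine, arginine, proline)
```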

Translation of secreted or integral membrane proteins is different. Typically, the first few amino acids of the amino terminus of the nascent polypeptide exit the ribosome and are recognized by a signal recognition particle (SRP) that stops translation until the complex is docked onto the ER via the SRP receptor. Translation then continues and the protein is translocated into or through the ER membrane via the Sec61 translocation complex as it is being synthesized (co-translational transport).

Protein structure

The amino acid sequence of a polypeptide chain (its primary structure) ultimately determines its shape. The weak bonds (hydrogen bonds, electrostatic and van der Waals interactions) formed between the side chains and/or the peptide backbone of the different amino acids provide the secondary structure (α-helices, β-strands, loops). These are in turn folded into a three-dimensional, tertiary structure to provide functional protein domains of 40–350 amino acids. The modular nature of domains allows their functionality to be combined in protein complexes of different proteins or, by gene fusion and/or exon skipping, into new multidomain single polypeptides. This final level of organization is the quaternary structure.

In cells, the folding of polypeptides into fully functional proteins is facilitated by an assortment of molecular chaperones, e.g. heat shock proteins (HSP), which bind to partially folded polypeptides and prevent the formation of inappropriate bonds.

Lipid synthesis

Fatty acids, molecules with a hydrocarbon chain with 4–28 carbons, are central to cellular life and human metabolism. They form the hydrophobic moiety of membrane lipids (see p. 17), they are precursors for short-lived, near acting lipid paracrines such as leukotrienes and prostaglandins, and they are energy stores particularly in the form of triglycerides.

Fatty acids as an energy store

Long chain fatty acids can be incorporated into triglycerides, which are relatively inert and lipophilic compounds that can be stored as fat droplets in cells (particularly adipocytes). When blood glucose is low, these triglycerides are hydrolysed, secreted into the bloodstream as free fatty acids, and distributed as an energy source for the cells of the body. In the recipient cell, fatty acids are metabolized in the mitochondrion to produce acetyl-CoA for the Krebs cycle (see p. 31). This is a particularly efficient storage system as, gram for gram, triglyceride produces six times as much energy as glycogen and occupies less volume in the cell.

Essential fatty acids

Unsaturated fatty acids (UFAs) have carbon–carbon double bonds that are introduced by desaturase enzymes by removal of the hydrogens. The remaining hydrogens on either side of the double bond can be on the same side of the chain (cis) or on opposite sides (trans). The acyl chain of cis UFAs is kinked, which influences the packing of membrane lipids and the function of the membrane barrier. Humans have desaturases that can introduce some double bonds but lack a desaturase required to make linoleic acid or alpha-linolenic acid. These fatty acids have double bonds 6 and 3 carbons from their respective omega ends (the methyl end of the chain). Omega-6 and omega-3 UFAs are essential fatty acids that must be obtained from the diet (see Ch. 5). They are precursors of arachidonic acid and eicosapentaenoic acid, respectively, from which cyclo-oxygenase 1 and 2 (cox-1 and 2) (see p. 826) produce the paracrines that play a role in inflammation, pain, fever, and airway constriction.

Intracellular trafficking, exocytosis (secretion) and endocytosis

The molecular composition, the lipids, proteins and cargo of each type of organelle is different and distinct from the plasma membrane, yet there is a continuous flux of material between many of the different compartments. Much of this flow is via vesicles that bud from one compartment to fuse with another. It is regulated by an array of lipids and membrane proteins (coat proteins, adaptors, signalling molecules and fusion proteins).

Budding of vesicles involves recruitment of coat proteins and adaptors to the membrane. Thus, a receptor on binding to its ligand may stimulate a kinase to phosphorylate neighbouring phosphatidyl-inositol, or activate an associated small GTPase (Arf or Sar1), increasing their affinities for a coat protein or adaptor. The coat protein (clathrin at the plasma membrane, COPI at the Golgi, COPII in the ER) forms a mesh around the developing vesicle (Fig. 2.4). Fully-formed vesicles normally shed their coat (often triggered by GTP hydrolysis by the GTPase), leaving the adaptor/receptor/lipid combination to identify the vesicle.

Targeting and trafficking is mediated by a different family of GTPases (Rab proteins) that recognize the combination of vesicle surface markers and targets them appropriately. Once activated by GTP, the Rab proteins are lipid-anchored to the vesicle where they engage with a diverse pool of Rab effectors. These can be motor proteins that traffic the vesicle along the microfilament and microtubule fibres of the cytoskeleton, or tethering proteins on the target membrane.

Fusion is accomplished by membrane-fusion SNAREs (Fig. 2.4). The v-SNARE protein on the vesicle (often associated with the Rab effector) interacts with the t-SNARE on the target membrane to facilitate fusion of the two compartments (distinct combinations of v-SNARE and t-SNARE specify particular pathways).

Vesicles that fuse with the plasma membrane replenish membrane lipids and proteins and also release cargo extracellularly (exocytosis; Fig. 2.4). Clathrin-coated vesicles are also used to recycle protein from the plasma membrane, and import extracellular cargo to internal compartments called endosomes. From endosomes cargo such as receptors is recycled back to the membrane, or cargo is sent for degradation in the lysosome in the process called endocytosis.

Pinocytosis and phagocytosis (see p. 19) are forms of endocytosis. Endocytosis can also occur via plasma membrane microdomains or lipid rafts called caveolae which pinch in to form uncoated vesicles that fuse with endosomes. Endocytosed vesicles can also be transported across the cell in a process called transcytosis. For example, cargo can be endocytosed at the apical surface of an epithelial cell and exocytosed across the basolateral membrane.

Energy production

As food is catabolized, cells temporarily store the energy released in carrier molecules. These include reduced nicotinamide adenine dinucleotide and reduced nicotinamide adenine dinucleotide phosphate (NADH and NADPH, respectively) that release energy as they are oxidized to NAD⁺ and NADP⁺. The molar ratio of NAD⁺ to NADH is typically high in a cell because NAD⁺ is used as an oxidizing agent in catabolic pathways. In contrast, the molar ratio of NADP⁺ to NADPH is typically low because NADPH is used as a reducing agent in anabolic reactions. The most versatile carrier is adenosine triphosphate (ATP). ATP can be hydrolysed to ADP and phosphate (Pi) and the energy released used to power less favourable reactions.

The lipids and polysaccharides provide the most energy in a human diet, although protein can also be used. Enzymes secreted into the gut break down these polymers to their respective building blocks of fatty acids and sugars that are absorbed by the apical membrane of the gut epithelium (the transporters involved in the transcellular transport of glucose across the enterocyte are described in Figure 6.24). Fatty acids and sugars are further catabolized by enzyme pathways inside the cell to produce an array of activated carrier molecules.

Glycolysis

The six-carbon glucose is primarily catabolized in 10 steps by enzymes of the glycolytic pathway (see Fig. 8.25) to produce two three-carbon molecules of the carboxylic acid pyruvate. Glycolysis occurs in the cytosol and the first three steps actually consume energy (2×ATP), but the remaining seven steps generate 4×ATP and 2×NADH, giving a net return of 2×ATP and 2×NADH.

Pyruvate is central to metabolism. It can be catabolized as fuel for the Krebs cycle and oxidative phosphorylation. It can regulate the cellular redox state by reduction to lactate and regeneration of NAD⁺. It can be a precursor for anabolism of fuels (glucose, glycogen and fatty acids) or amino acids, via conversion to alanine. The fate of pyruvate depends on the environmental conditions and needs of the cell.

Under anaerobic conditions, e.g. in skeletal muscle following prolonged exercise where NAD⁺ must be regenerated (because it is needed as an oxidizing reagent in the catabolism of glucose), pyruvate is reduced to lactate as NADH is oxidized to NAD⁺ in a ‘redox’ reaction catalysed by lactate dehydrogenase. This allows the muscle to continue to catabolize glucose to generate ATP under conditions in which metabolic oxygen is limiting. The lactate is secreted into the bloodstream and is ultimately metabolized by the liver back into glucose by gluconeogenesis consuming 6×ATP in the process. This cycle of anaerobic respiration that produces lactate in muscle, which is released into the bloodstream to be taken up by the liver for reconversion to glucose is known as the Cori cycle.
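The energy bookkeeping of the Cori cycle follows from figures already given in the text: glycolysis invests 2×ATP and generates 4×ATP per glucose in muscle, while gluconeogenesis in the liver consumes 6×ATP per glucose rebuilt. A minimal ledger is sketched below; the constant and function names are illustrative.

```python
# Figures per molecule of glucose, as quoted in the text.
ATP_INVESTED_GLYCOLYSIS = 2
ATP_GENERATED_GLYCOLYSIS = 4
ATP_SPENT_GLUCONEOGENESIS = 6


def glycolysis_net_atp(glucose: int = 1) -> int:
    """Net ATP gained by muscle from anaerobic glycolysis."""
    return glucose * (ATP_GENERATED_GLYCOLYSIS - ATP_INVESTED_GLYCOLYSIS)


def cori_cycle_net_atp(glucose: int = 1) -> int:
    """Whole-body ATP balance for one turn of the cycle per glucose."""
    return glycolysis_net_atp(glucose) - glucose * ATP_SPENT_GLUCONEOGENESIS


print(glycolysis_net_atp())  # 2, available to the working muscle
print(cori_cycle_net_atp())  # -4, the cost borne by the liver
```

The negative balance shows that the cycle moves the metabolic burden from muscle to liver and does not yield energy overall.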

Krebs cycle

Under aerobic conditions, the fate of pyruvate is different. It is transported into the mitochondrion, where it is oxidatively decarboxylated to acetyl-CoA, generating NADH, with CO₂ released as a waste product. The acetyl-CoA formed from pyruvate (or from catabolism of amino acids or β-oxidation of fatty acids) enters the Krebs cycle in the matrix of the mitochondrion, where it is condensed with the 4-carbon oxaloacetate to form the 6-carbon citric acid. Citric acid has three carboxylate groups, providing the alternative names for the Krebs cycle (the citric acid or tricarboxylic acid cycle). In eight reactions, the Krebs cycle oxidizes two of the six carbons of citric acid to 2×CO₂, regenerates oxaloacetate to enter the next cycle, and in the process, provides enough energy to produce 1×GTP, 3×NADH and 1× reduced flavin adenine dinucleotide (FADH₂, a carrier of electrons much like NADH). The latter two products feed their electrons into the electron transport chain where they are used to make ATP from ADP and Pi, a process known as oxidative phosphorylation.

In addition to energy production, glycolysis and the Krebs cycle provide precursors for the anabolism of amino acids, cholesterol, fatty acids, nucleotides, amino sugars and lipids.

Oxidative phosphorylation

The activated carriers NADH and FADH₂ carry high energy electrons as hydride (a proton H⁺ and two electrons), which are donated to complexes of the electron transport chain, in the process regenerating NAD⁺ and FAD as oxidizing agents for continued oxidative metabolism. The electrons are passed down the series of inner membrane proteins of the mitochondrion, moving to a lower energy state at each step until they are finally transferred to oxygen to produce water (hence the requirement for molecular oxygen). The energy released by the electrons is used to efflux protons (H⁺) into the inter-membrane space, setting up an H⁺ electrochemical gradient, which the ATP synthase (or F₀F₁ ATPase), another integral membrane protein, uses to drive the formation of ATP from ADP and Pi. Oxidative phosphorylation produces the bulk of the cellular ATP. A single molecule of glucose is able to produce a net yield of approximately 30×ATP. Only two of these come from glycolysis directly.
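The figure of approximately 30×ATP per glucose can be reproduced as a rough tally of the carriers listed in the preceding sections. The carrier counts come from the text. The ATP-per-carrier ratios (2.5 per mitochondrial NADH, 1.5 per FADH₂, and 1.5 or 2.5 per cytosolic NADH depending on how its electrons reach the mitochondrion) are commonly quoted approximations and are assumptions here, not values given in this chapter.

```python
# Carrier counts follow the text; the ATP-per-carrier ratios are assumed values.
ATP_PER_MITOCHONDRIAL_NADH = 2.5  # assumption
ATP_PER_FADH2 = 1.5               # assumption

def atp_per_glucose(atp_per_cytosolic_nadh=1.5):
    """Estimate net ATP from complete oxidation of one glucose molecule."""
    # Glycolysis (cytosol): net 2 ATP and 2 NADH per glucose.
    substrate_level_atp = 2
    cytosolic_nadh = 2
    # Two pyruvate -> two acetyl-CoA: 1 NADH each.
    mitochondrial_nadh = 2 * 1
    # Two turns of the Krebs cycle: 1 GTP, 3 NADH and 1 FADH2 per turn.
    substrate_level_atp += 2 * 1  # GTP counted as an ATP equivalent
    mitochondrial_nadh += 2 * 3
    fadh2 = 2 * 1
    return (substrate_level_atp
            + cytosolic_nadh * atp_per_cytosolic_nadh
            + mitochondrial_nadh * ATP_PER_MITOCHONDRIAL_NADH
            + fadh2 * ATP_PER_FADH2)

print(atp_per_glucose(1.5))  # lower estimate
print(atp_per_glucose(2.5))  # upper estimate
```

Under these assumptions the tally falls between 30 and 32 ATP, of which only the 2×ATP made directly in glycolysis arise outside the mitochondrion.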

Cellular degradation and death

Cell dynamics

Cell components are continually being formed and degraded, and most of the degradation steps involve ATP-dependent multienzyme complexes. Old cellular proteins are mopped up by a small cofactor molecule called ‘ubiquitin’, which interacts with these worn proteins via their exposed hydrophobic residues. Ubiquitin is a small 8.5 kDa regulating protein present universally in all living cells. Cells mark a protein for destruction by attaching ubiquitin molecules to it. This ‘ubiquitination’ signals the protein to move to lysosomes or proteasomes for destruction. A complex containing more than five ubiquitin molecules is rapidly degraded by a large proteolytic multienzyme array termed the ‘26S proteasome’. Ubiquitin also plays a role in regulation of the receptor tyrosine kinase in the cell cycle and in repair of DNA damage. The failure to remove worn proteins can result in the development of chronic debilitating disorders. For example, Alzheimer’s and frontotemporal dementias are associated with the accumulation of ubiquitinated proteins (prion-like proteins), which are resistant to ubiquitin-mediated proteolysis. Similar proteolytic-resistant ubiquitinated proteins give rise to inclusion bodies found in myositis and myopathies. This resistance can be due to a point mutation in the target protein itself (e.g. mutant p53 in cancer; see p. 46) or the result of an external factor altering the conformation of the normal protein to create a proteolytic-resistant shape, as in the prion protein of variant Creutzfeldt–Jakob disease (vCJD). Other conditions include von Hippel–Lindau syndrome (p. 634) and Liddle’s syndrome (p. 653).

Free radicals

A free radical is any atom or molecule which contains one or more unpaired electrons, making it more reactive than the native species. The major free radical species produced in the human body are the hydroxyl radical (OH•), the superoxide radical (O₂•⁻) and nitric oxide (NO•).

Free radicals have been implicated in a large number of human diseases. The hydroxyl radical is by far the most reactive species but the others can generate more reactive species as breakdown products. When a free radical reacts with a non-radical, a chain reaction ensues which results in direct tissue damage by membrane lipid peroxidation. Furthermore, hydroxyl radicals can cause genetic mutations by attacking purines and pyrimidines. Superoxide dismutases (SOD) convert superoxide to hydrogen peroxide and are thus part of an inherent protective antioxidant mechanism. Patients with dominant familial forms of amyotrophic lateral sclerosis (motor neurone disease) have mutations in the gene for Cu-Zn SOD-1. Glutathione peroxidases are enzymes that remove hydrogen peroxide generated by SOD in the cell cytosol and mitochondria.

Free radical scavengers bind reactive oxygen species. Alpha-tocopherol, urate, ascorbate and glutathione remove free radicals by reacting directly and non-catalytically. Severe deficiency of α-tocopherol (vitamin E deficiency) causes neurodegeneration. There is evidence that cardiovascular disease and cancer can be prevented by a diet rich in substances that diminish oxidative damage (p. 211). The principal dietary antioxidants are vitamin E, vitamin C, β-carotene and flavonoids.

Heat shock proteins

The heat shock response is a highly conserved and ancient response to tissue stress (chemical and physical) that is mediated by activation of specific genes leading to the production of specific heat shock proteins (HSPs). The diverse functions of HSPs include the transport of proteins in and out of specific cell organelles, acting as molecular chaperones (the catalysis of protein folding and unfolding) and the degradation of proteins (often by ubiquitination pathways). As well as heat, cytotoxic chemicals and free radicals can trigger HSP expression. The unifying feature, which leads to the activation of HSPs, is the accumulation of damaged intracellular protein. Tumours have an abnormal thermotolerance, which is the basis for the observation of the enhanced cytotoxic effect of chemotherapeutic agents in hyperthermic subjects. The HSPs are expressed in a wide range of human cancers and have been implicated in tumour cell proliferation, differentiation, invasion, metastasis, cell death and immune response.

Autophagy

Cells continually recycle material. For example, cellular proteins targeted for degradation can be ubiquitinated and degraded by the proteasome (p. 31), and mRNA can be de-tailed and degraded by the exosome or decapping complex. Cells respond to stresses like starvation by degrading much of their cytoplasmic contents in order to recycle components and survive.

Cells achieve this by autophagy, during which everything from sugars and lipids to protein aggregates, ribosomal particles and organelles is enclosed in a double membrane (a vesicle forms a cup shape to extend around the material). The new autophagosome then fuses with a lysosome, leading to degradation of the contents by acid hydrolases. Autophagic induction is complex and still not completely understood, but it has roles in tumour growth, elimination of intracellular microorganisms, and elimination of toxic misfolded proteins such as those that give rise to neurodegenerative disorders. Autophagy can suppress apoptotic cell death induced by chemotherapy, while excessive autophagy in response to starvation can lead to autophagic cell death.

Necrotic cell death

In necrotic cell death, external factors (e.g. hypoxia, chemical toxins, injury) damage the cell irreversibly. Necrotic cell death is associated with ischaemia and stroke, cardiac failure, neurodegeneration, pathogen infection and occurs in the centre of tumours deprived of a blood supply. Characteristically, there is an influx of water and ions, after which the cell and its organelles swell and rupture. Lysosomal proteases released into the cytosol cause widespread degradation. There is a rise in cytosolic calcium, increased reactive oxygen species (ROS), intracellular acidification and ATP depletion. Necrosis is regulated and the cellular processes and activated pathways are still being investigated. Necrotic cell lysis induces acute inflammatory responses owing to the release of lysosomal enzymes into the extracellular environment.

Apoptotic cell death

Most terminally-differentiated cells can no longer replicate and eventually die by apoptosis, a type of programmed cell death. Apoptosis occurs through the deliberate activation of cellular pathways, which function to cause cell suicide. In contrast to necrosis, apoptosis is orderly. Cells are destroyed and their remains phagocytosed by adjacent cells and macrophages without inducing inflammation. Apoptosis is essential for many life processes, including tissue maintenance in the adult, tissue formation in embryogenesis, and normal metabolic processes such as autodestruction of the thickened endometrium to cause menstruation in a non-conception cycle. Cells which have accumulated irreparable DNA damage from toxins or ultraviolet radiation also trigger apoptosis via p53 protein to prevent replication of mutations or progression to cancer. Many chemotherapy and radiotherapy regimens work by triggering apoptotic pathways in the tumour cell.

Apoptosis has characteristic features:

– Shrinkage of the cell and its nucleus

– Chromatin aggregation into membrane-bound vesicles called apoptotic bodies

– Cell ‘blebs’ (which are intact membrane vesicles)

– Absence of inflammatory response.

Apoptosis requires proteases called caspases, whose action is very tightly regulated. Caspases not only destroy cell organelles; they also cleave nuclear lamins, causing collapse of the nuclear envelope, and activate, through cleavage, nucleases that degrade DNA. Caspase activation can be achieved by:

– signals from outside the cell (the extrinsic apoptotic pathway or the death receptor pathway) and

– internal signals, such as DNA damage (the intrinsic apoptotic pathway or the mitochondrial pathway) (Fig. 2.14).

Figure 2.14 Extrinsic and intrinsic apoptotic signalling network. The Fas protein and Fas ligand (FasL) are two proteins that interact to activate an apoptotic pathway. Fas and FasL are both members of the TNF (tumour necrosis factor) family – Fas is part of the transmembrane receptor family and FasL is part of the membrane-associated cytokine family. When the homotrimer of FasL binds to Fas, it causes Fas to trimerize and brings together the death domains (DD) on the cytoplasmic tails of the protein. The adaptor protein, FADD (Fas-associating protein with death domain), binds to these activated death domains and, in turn, binds pro-caspase 8 through a set of death effector domains (DED). When pro-caspase 8s are brought together, they transactivate and cleave themselves to release caspase 8, a protease that cleaves protein chains after aspartic acid residues. Caspase 8 then cleaves and activates other caspases, which eventually leads to activation of caspase 3. Caspase 3 cleaves ICAD, the inhibitor of CAD (caspase activated DNase), which frees CAD to enter the nucleus and cleave DNA. Although caspase 3 is the pivotal execution caspase for apoptosis, the processes can be initiated by intrinsic signalling, which always involves mitochondrial release of cytochrome C and activation of caspase 9. The release of cytochrome C and mitochondrial inhibitor of IAPs is mediated via Bcl-2 family proteins (including Bax, Bak) forming pores in the mitochondrial membrane. Interestingly, the extrinsic apoptotic signal is aided and amplified by activation of tBid, which recruits Bcl-2 family members and hence also activates the intrinsic pathway. Apaf1, apoptotic protease activating factor 1; Bid, member of the Bcl-2 protein family; IAPs, inhibitor of apoptosis proteins.

The extrinsic pathway is required for tissue remodelling and induction of immune self-tolerance. Cells marked for apoptosis express a member of the tumour necrosis factor (TNF) death receptor family, such as Fas, on their surfaces. Ligand binding (e.g. by Fas ligand expressed on lymphocytes) causes activation of adaptor proteins which produce a cascade of caspase activation. The extrinsic pathway can be amplified by induction of the intrinsic pathway (see below).

The intrinsic pathway centres on increased mitochondrial permeability and release of pro-apoptotic proteins like cytochrome C. Cellular stresses such as growth factor withdrawal, p53-dependent cell cycle arrest, DNA damage, and intracellular reactive oxygen species induce expression of pro-apoptotic Bcl-2 proteins, Bax and Bak. These enter the outer mitochondrial membrane forming pores that release cytochrome C which forms a complex (the apoptosome) with other proteins. The apoptosome activates a caspase cascade.

Stem cells

Following fertilization, the newly formed fertilized cell (the zygote) and those following the first few divisions are totipotent, meaning that they can differentiate into any cell type of the embryo and adult body, including the fetal cells of the placenta. At the blastocyst stage of embryonic development, these cells undergo a primary differentiation event to become either the trophectoderm or the inner cell mass (ICM). The trophectoderm gives rise to the fetal cells of the placenta, while the cells of the ICM are pluripotent and give rise to all other cell types of the body (except those of the placenta), and are more commonly called embryonic stem (ES) cells. Stem cells have two properties:

– Self renewal: the ability to divide indefinitely without differentiating.

– Pluri- or toti-potency: the capability to differentiate, given the appropriate signals, into any cell type (except fetal placental cells).

As they begin to differentiate, their ability to self renew and their potency are reduced, but there remain adult progenitor cells (sometimes erroneously referred to as stem cells) that have a limited ability to self renew and can differentiate into multiple related lineages (multipotent, like haematopoietic ‘stem cells’) or single lineages (unipotent, like muscle satellite cells). The body uses these partially differentiated progenitor cells to continually replace or repair damaged cells and tissues.

Stem cells have great therapeutic potential and can be obtained from blood from the umbilical cord, which contains embryonic-like stem cells (not as primitive as ES cells, but able to differentiate into many more cell types than adult progenitor cells), or by reprogramming adult cells to regain stem-like properties (induced pluripotent stem cells, iPSCs).

Cancer ‘stem cells’

Only a very small proportion (<1%) of the individual cells from a cancer can form a cancer in a recipient immunodeficient mouse. These cells equate with the population that excludes the fluorescent dye Hoechst 33342 owing to the presence of a primary active (ABC) drug transporter on their cell surface. They have the characteristics of adult progenitor cells. The high relapse rate of many cancers may well be due to the persistence of these cancer ‘stem cells’ and new therapies are required to target these in the initial treatment regimen.

Human genetics

In 2003, the Human Genome Project was completed, with all 3.2 × 10⁹ base-pairs of DNA sequenced. Over 99% of the DNA sequence is identical between individuals, but still millions of different base-pair variations occur (variants that occur at a frequency >1% are called polymorphisms; pathological polymorphisms are called mutations; single nucleotide polymorphisms are called SNPs, pronounced ‘snips’). In addition, the genome contains segmental, duplication-rich regions, where the number of duplications varies between people. These are called copy number variations or CNVs. These variations underlie most human differences, and confer genetic disease and susceptibility to many common diseases. To understand this variation, the 1000 Genomes Project was undertaken (completion date 2012), involving sequencing of many genomes from people of Asian, west African and European ancestry. This project will document most of the variation between humans.

Genomic DNA encodes approximately 21 000 genes. However, these protein-encoding genes comprise only about 1.5% of the human genome. About 90% of the remaining genome is transcribed to form RNA molecules, which are not translated into protein (non-coding RNA, ncRNA). Some of these RNAs have known regulatory roles, although the role of many is still unknown. The remaining DNA also contains evolutionarily conserved non-coding regions, some with known enhancer functions, moderately repeated elements (transposons) with probable viral origin, and microsatellites consisting of short simple sequence (1–6 nucleotide) repeats. About 10% of the genome is highly repetitive or ‘satellite’ DNA, consisting of long arrays of tandem repeats. Satellite DNA largely locates to centromeres and telomeres of chromosomes and regions of inert DNA. It forms a major part of heterochromatin. The function of genomic DNA elements is being investigated under the ENCODE (Encyclopedia of DNA Elements) project.

SIGNIFICANT WEBSITES

http://genome.wellcome.ac.uk/

http://genome.ucsc.edu/ENCODE/

Tools for human genetic analysis

The polymerase chain reaction (PCR)

This technique revolutionized genetic research because minute amounts of DNA, e.g. from buccal cell scrapings, blood spots or single embryonic cells, can be amplified over a million times within a few hours. The exact DNA sequence to be amplified needs to be known because the DNA is amplified between two short (generally 17–25 bases) single-stranded DNA fragments (‘oligonucleotide primers’) that are complementary to the sequences on different strands at each end of the DNA of interest (Fig. 2.15).

Figure 2.15 Polymerase chain reaction. The technique is based on thermal cycling and has three basic steps. (1) The double-stranded genomic DNA is heat-denatured into single-stranded DNA. (2) The sample is cooled to favour annealing of the primers to their target DNA. (3) A thermostable DNA polymerase extends the primers over the target DNA. After one cycle there are two copies of double-stranded DNA, after two cycles there are four copies, and so on.
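The doubling described in the caption can be written as a short calculation. This is a minimal sketch assuming ideal amplification, in which every template is copied in every cycle; the function names and the `efficiency` parameter are hypothetical, and real reactions fall short of perfect doubling.

```python
def pcr_copies(initial_templates, cycles, efficiency=1.0):
    """Copies after `cycles` rounds; efficiency=1.0 means perfect doubling."""
    return initial_templates * (1 + efficiency) ** cycles

def cycles_to_reach(fold, efficiency=1.0):
    """Smallest number of cycles giving at least `fold` amplification."""
    cycles = 0
    copies = 1.0
    while copies < fold:
        copies *= 1 + efficiency
        cycles += 1
    return cycles

print(pcr_copies(1, 1))            # one cycle: two copies
print(pcr_copies(1, 2))            # two cycles: four copies
print(cycles_to_reach(1_000_000))  # cycles needed for million-fold amplification
```

With perfect doubling, 20 cycles are enough to exceed a million-fold amplification, which is why a few hours of thermal cycling suffice.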

Hybridization arrays

A fundamental property of DNA is that when two strands are separated, e.g. by heating, they will re-associate and stick together again under suitable conditions (e.g. on cooling) because of their complementary base sequences. Therefore, the presence or position of a particular gene can be identified using a gene ‘probe’ consisting of DNA or RNA, with a base sequence that is complementary to that of the sequence of interest. A DNA probe is thus a piece of single-stranded DNA that can locate and bind to its complementary sequence. Hybridization is utilized in array-based platforms, where many thousands of probes can be analysed in one experiment to investigate global gene expression, large-scale genotyping, gene methylation status and/or chromosomal aberrations, including small chromosomal deletion/insertion events or copy number changes (Fig. 2.16).

Figure 2.16 Expression microarray. Example of part of a gene chip with hundreds of DNA sequences in microspots. The enlarged area shows the colour coding of no expression (black), overexpression (red), underexpression (green) and equal expression (yellow) of detected mRNA in two related samples.
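The colour coding in the caption can be illustrated with a small classifier for one spot measured in two samples. This is only a sketch: the function name, the background cut-off and the two-fold threshold are hypothetical choices made for illustration, not values from the figure or from any particular array platform.

```python
import math

def spot_colour(sample, reference, background=50.0, fold=2.0):
    """Classify one spot from its signal intensity in two related samples.

    `background` and `fold` are illustrative thresholds (assumptions).
    """
    if sample < background and reference < background:
        return "black"   # no expression detected in either sample
    log_ratio = math.log2(max(sample, 1.0) / max(reference, 1.0))
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red"     # overexpression relative to the reference
    if log_ratio <= -threshold:
        return "green"   # underexpression relative to the reference
    return "yellow"      # roughly equal expression

for pair in [(10, 20), (4000, 500), (300, 2400), (1000, 1100)]:
    print(pair, spot_colour(*pair))
```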

DNA sequencing

A chemical process known as dideoxy-sequencing or Sanger sequencing (after its inventor) allows the identification of the exact nucleotide sequence of a piece of DNA. As in PCR, an oligonucleotide primer is annealed adjacent to the region of interest. This primer acts as the starting point for a DNA polymerase to build a new DNA chain that is complementary to the sequence under investigation. Chain extension is prematurely interrupted when a dideoxynucleotide becomes incorporated (because it lacks the necessary 3′-hydroxyl group). As the dideoxynucleotides are present at a low concentration, not all the chains in a reaction tube will incorporate a dideoxynucleotide in the same place, so the tubes contain sequences of different lengths, each of which terminates with a particular dideoxynucleotide. Each base dideoxynucleotide (G, C, T, A) has a different fluorochrome attached, and thus each termination base can be identified by its fluorescent colour. As each strand can be separated efficiently by capillary electrophoresis according to its size/length, simply monitoring the fluorescence as the reaction products elute from the capillary will give the gene sequence (Fig. 2.17).
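The logic of reading a sequence from chain-terminated fragments can be sketched as follows. The example is deliberately idealized: it assumes that a terminated product exists for every position, ignores the primer, and uses a made-up strand. The function names are hypothetical.

```python
import random

def terminated_fragments(new_strand):
    """All chain-terminated products: every prefix of the newly built strand.

    The last base of each fragment stands for the labelled dideoxynucleotide.
    """
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]

def read_sequence(fragments):
    """Order fragments by length (as electrophoresis does) and read end bases."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))

strand = "GATTACA"         # hypothetical newly synthesized strand
fragments = terminated_fragments(strand)
random.shuffle(fragments)  # fragments are mixed together in the tube
print(read_sequence(fragments))
```

Sorting by length and reading only the terminal (labelled) base of each fragment recovers the original order of bases, which is what monitoring the fluorescence at the end of the capillary achieves.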

Figure 2.17 Gene sequencing. Sequence profile of: (a) part of the normal luteinizing hormone beta chain gene; (b) that from a polymorphic variant. The base changes and consequential alterations in amino acid residues are indicated.

Sequencing technology has developed dramatically in recent years, to the extent that it is now cost-effective and quick to sequence an individual’s whole genome in one experiment. This has massive implications in disease gene discovery but also can raise serious ethical considerations. There are a number of different platforms to perform this high-throughput sequencing (or ‘Next Generation Sequencing’) and new faster and cheaper ones are being developed. For under £2000 (2011 price), it is possible to sequence all the coding genes in an individual’s human genome to catalogue all possible disease-associated and non-disease associated variants in every human gene. As well as sequencing genomic DNA, the technology can be used to sequence RNA (termed RNAseq) to assess accurately gene expression levels in addition to determining all splice variation and allelic-copy number. This can also be used to assess the effect of methylation on gene expression.

Identification of gene function

Following sequencing of the genome, the challenge is to understand the function of the protein coding genes. Most tools rely on the comparison of a cell or animal’s phenotype in the presence or absence of the gene in question. Each tool has different merits and faults.

Cell culture

Human cells can be grown in culture flasks in the laboratory and their behaviour (growth rate, morphology, motility, gene expression profile and biochemistry) characterized. A specific gene can then be introduced in a small plasmid (a circle of DNA from which the gene of interest can be expressed) or incorporated into a virus, and the change in cell behaviour assessed to provide an indication of gene function. Alternatively, if the cell line in question already expresses the gene of interest, its expression can be knocked down by RNA interference (RNAi).

RNAi

RNAi takes advantage of the cellular machinery that allows microRNAs encoded by the genome to regulate the expression of many genes at the level of messenger RNA stability and translation (see Control of gene expression, above). This phenomenon has been exploited in the laboratory to study the function of a gene of interest or, on a much larger scale, the function of each gene in the genome. In such an RNAi screen, a small interfering (si) RNA specific for each gene in the genome is introduced into cells grown in vitro, in effect knocking down expression of each gene in ~20 000 separate experiments. The phenotype of the cells in each experiment is then monitored to test the effect of loss of gene expression.

Animal models

The effect of a gene at the organismal level can also be tested by mis-expression/over-expression or knock-out of a particular gene in a model animal. Nematode worms (Caenorhabditis), fruit flies (Drosophila), zebra fish and rodents have all been genetically engineered to identify the function of a gene of interest. Knock-out models of the higher organisms can be particularly helpful for medical research to provide a model of disease for exploration of therapeutic intervention. A current goal is to mutate or knockout every protein-coding gene in mouse through large-scale mutagenesis programmes. However, it should be noted that the physiology of rodents and humans can differ.

Genetic polymorphisms and linkage studies

Techniques have been developed to identify and quantitate genetic polymorphisms such as single nucleotide polymorphisms (SNPs; p. 34), microsatellites and copy number variants (CNVs). For example, a SNP usually has two alternative nucleotides (alleles) at a particular site, and the allele frequencies vary between populations and ethnic groups. By definition, the rarer allele must occur in at least 1% of the population for the variant to be classed as a SNP. SNPs can be in coding or non-coding regions of genes or lie between genes, and thus may not change the amino acid sequence of the protein.
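The 1% rule can be expressed as a check on allele counts. This is a minimal sketch with invented counts; the function names are hypothetical, and the minor allele is taken to be the second most common allele at the site.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele (0.0 if only one allele)."""
    total = sum(allele_counts.values())
    ranked = sorted(allele_counts.values(), reverse=True)
    if total == 0 or len(ranked) < 2:
        return 0.0
    return ranked[1] / total

def is_polymorphism(allele_counts, threshold=0.01):
    """True if the variant meets the 1% frequency criterion from the text."""
    return minor_allele_frequency(allele_counts) >= threshold

# Invented counts from 1000 people (2000 chromosomes) at two sites.
print(is_polymorphism({"A": 1960, "G": 40}))  # minor allele at 2%
print(is_polymorphism({"C": 1995, "T": 5}))   # minor allele at 0.25%
```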

Linkage disequilibrium

Polymorphisms that are closer together are more likely to have alleles that move together in a block than those further apart. This phenomenon is called ‘linkage disequilibrium’ and enables, e.g. one SNP variant (tag SNP) in this block to act as a marker for the presence of other SNP variants. Linkage analysis has provided many breakthroughs in mapping the positions of genes that cause genetic diseases, such as the gene for cystic fibrosis, which was found to be tightly linked to a marker on chromosome 7.
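One common way to put a number on linkage disequilibrium between two biallelic sites is the coefficient D and the derived statistic r². These measures are standard in population genetics but are not defined in this chapter, so the sketch below is offered as an illustration; the function name and the haplotype counts are invented.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) from counts of the four two-site haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first site
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second site
    p_AB = n_AB / total           # frequency of the A-B haplotype
    d = p_AB - p_A * p_B          # zero when the alleles associate at random
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denominator

print(ld_stats(60, 0, 0, 40))    # alleles always travel together
print(ld_stats(25, 25, 25, 25))  # alleles assort independently
```

When r² is close to 1, typing one SNP predicts the allele at the other, which is the property a tag SNP relies on.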

The International Hapmap Project

As SNPs close together are inherited in blocks (haplotypes), tag SNPs for each haplotype block are typed and can then be correlated with a specific phenotype. An International Hapmap Project has been developed. The Wellcome Trust Case Control Consortium was set up to analyse thousands of DNA samples from patients with different diseases in which there is thought to be a genetic component. Utilizing the Hapmap data and the use of high density SNP hybridization arrays, genetic risk associated sequence variants have been found in many diseases and traits including diabetes, cancer, hypertension, Crohn’s disease, height and metabolism.

The ‘lod score’

The likelihood of recombination between the marker under study and the disease allele must be taken into account. This degree of likelihood is known as the ‘lod score’ (the logarithm of the odds) and is a measure of the statistical significance of the observed co-segregation of the marker and the disease gene, compared with what would be expected by chance alone.

– Positive lod scores make linkage more likely.

– Negative lod scores make linkage less likely.

By convention, a lod score of +3 is taken to be definite evidence of linkage because this indicates 1000:1 odds that the co-segregation of the DNA marker and the disease did not occur by chance alone.
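A lod score can be computed for the simplest case as a sketch. It assumes fully informative, phase-known meioses, in which each meiosis can be scored as recombinant or non-recombinant, and compares the likelihood at a chosen recombination fraction θ with the likelihood under no linkage (θ = 0.5). The function name and the family data are hypothetical.

```python
import math

def lod_score(recombinants, meioses, theta):
    """log10 of the odds for linkage at recombination fraction `theta`
    against free recombination (theta = 0.5)."""
    non_recombinants = meioses - recombinants
    linked = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    unlinked = 0.5 ** meioses
    if linked == 0:
        return float("-inf")  # the data are impossible at this theta
    return math.log10(linked / unlinked)

# Ten informative meioses with no recombinants, tested at theta = 0.
print(round(lod_score(0, 10, 0.0), 2))
# Five recombinants in ten meioses: no evidence either way at theta = 0.5.
print(lod_score(5, 10, 0.5))
```

Ten meioses without a single recombinant give odds of 2¹⁰ = 1024:1, i.e. a lod score just over 3, which shows how the +3 convention corresponds to odds of about 1000:1.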

SIGNIFICANT WEBSITES

National Center for Biotechnology Information: http://www.ncbi.nlm.nih.gov

UCSC Genome Bioinformatics: http://genome.ucsc.edu

Ensembl: http://www.ensembl.org/index.html

The Online Mendelian Inheritance in Man website, for information on gene products and their disease association: http://www.ncbi.nlm.nih.gov/omim

Genome databases

Information arising from human genome sequencing is publicly available, providing biological information on every gene in the human genome. Information on any gene describing its protein product, function, tissue specific expression, disease association and sequence variation/mutation can all be easily obtained by searching and manipulating computer-based databases.

The biology of chromosomes

Human chromosomes

The nucleus of each diploid cell contains 6 × 10⁹ bp of DNA in long molecules called chromosomes (Fig. 2.11). Chromosomes are massive structures containing one linear molecule of DNA that is wound around histone proteins into small units called nucleosomes, and these are further wound to make up the structure of the chromosome itself.

Diploid human cells have 46 chromosomes, 23 inherited from each parent; thus there are 23 pairs of chromosomes (22 ‘homologous’ pairs of ‘autosomes’ and one pair of ‘sex chromosomes’). The sex chromosomes, called X and Y, are not homologous but are different in size and shape. Males have an X and a Y chromosome; females have two X chromosomes. (Primary male sexual characteristics are determined by the SRY gene – sex determining region, Y chromosome.)

The chromosomes are classified according to their size and shape, the largest being chromosome 1. The constriction in the chromosome is the centromere, which can be in the middle of the chromosome (metacentric) or at one extreme end (acrocentric). The centromere divides the chromosome into a short arm and a long arm, referred to as the p arm and the q arm, respectively (Fig. 2.11d).

Chromosomes can be stained when they are in the metaphase stage of the cell cycle and are very condensed. The stain gives a different pattern of light and dark bands that is diagnostic for each chromosome. Each band is given a number, and gene mapping techniques allow genes to be positioned within a band within an arm of a chromosome. For example, the CFTR gene (in which a defect gives rise to cystic fibrosis) maps to 7q31; that is, on chromosome 7 in the long arm in band 31.
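The chromosome, arm and band notation can be unpacked mechanically. This is a minimal sketch: the function name is hypothetical, the pattern handles only simple locations of the form chromosome, arm (p or q), band, and the two example loci are arbitrary strings chosen for illustration.

```python
import re

LOCUS_PATTERN = re.compile(r"(\d{1,2}|X|Y)([pq])(\d+(?:\.\d+)?)")

def parse_locus(locus):
    """Split a cytogenetic location such as '1p36' into its three parts."""
    match = LOCUS_PATTERN.fullmatch(locus)
    if match is None:
        raise ValueError(f"not a simple cytogenetic location: {locus!r}")
    chromosome, arm, band = match.groups()
    if chromosome.isdigit() and not 1 <= int(chromosome) <= 22:
        raise ValueError(f"no such human autosome: {chromosome}")
    return {
        "chromosome": chromosome,
        "arm": "short (p)" if arm == "p" else "long (q)",
        "band": band,
    }

print(parse_locus("1p36"))
print(parse_locus("Xq28"))
```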

During cell division (mitosis), each chromosome divides into two so that each daughter nucleus has the same number of chromosomes as its parent cell. During gametogenesis, however, the number of chromosomes is halved by meiosis, so that after conception the number of chromosomes remains the same and is not doubled. In the female, each ovum contains one or other X chromosome but, in the male, the sperm bears either an X or a Y chromosome.

Chromosomes can only be seen easily in actively dividing cells. Typically, lymphocytes from the peripheral blood are stimulated to divide and are processed to allow the chromosomes to be examined. Cells from other tissues can also be used for chromosomal analysis, e.g. amniotic fluid, placental cells from chorionic villus sampling, bone marrow and skin (Box 2.1).

The X chromosome and inactivation

Although females have two X chromosomes (XX), they do not have two doses of X-linked genes (compared with just one dose for a male XY) because of the phenomenon of X inactivation or Lyonization (after its discoverer, Dr Mary Lyon). In this process, one of the two X chromosomes in the cells of females is epigenetically silenced through the action of a regulatory ncRNA, so the cell has only one dose of the X-linked genes. Inactivation is random and can affect either X chromosome.

Telomeres and immortality

The ends of chromosomes, telomeres (Fig. 2.11d), do not contain genes but many repeats of a hexameric sequence TTAGGG. Replication of linear chromosomes starts at coding sites (origins of replication) within the main body of chromosomes and not at the two extreme ends. The extreme ends are therefore susceptible to single-stranded DNA degradation back to double-stranded DNA. Thus, cellular ageing can be measured as a genetic consequence of multiple rounds of replication, with consequential telomere shortening. This leads to chromosome instability and cell death.

Stem cells have longer telomeres than their terminally differentiated daughters. However, germ cells replicate without shortening of their telomeres. This is because they express an enzyme called telomerase, which protects against telomere shortening by acting as a template primer at the extreme ends of the chromosomes. Most somatic cells (unlike germ and embryonic cells) switch off the activity of telomerase after birth and die as a result of apoptosis. Many cancer cells, however, reactivate telomerase, contributing to their immortality. Conversely, cells from patients with progeria (premature ageing syndrome) have extremely short telomeres. Transient expression of telomerase in various stem and daughter cells is part of their normal biology.

The mitochondrial chromosome

In addition to the 23 pairs of chromosomes in the nucleus of every diploid cell, the mitochondria in the cytoplasm of the cell also have their own genome. The mitochondrial chromosome is a circular DNA (mtDNA) molecule of approximately 16 500 bp, and every base-pair makes up part of the coding sequence. These genes principally encode proteins or RNA molecules involved in mitochondrial function. These proteins are components of the mitochondrial respiratory chain involved in oxidative phosphorylation producing ATP. They also have a critical role in apoptotic cell death. Every cell contains several hundred mitochondria, and therefore several hundred mitochondrial chromosomes. Virtually all mitochondria are inherited from the mother as the sperm head contains no (or very few) mitochondria. Disorders mapped to the mitochondrial chromosome are shown in Figure 2.18 and discussed on page 40.

Figure 2.18 Mitochondrial chromosome abnormalities. Disorders that are frequently or prominently associated with mutations in a particular gene are shown in bold. Diseases due to mutations that impair mitochondrial protein synthesis are shown in blue. Diseases due to mutations in protein-coding genes are shown in red.

ECM, encephalomyopathy; FBSN, familial bilateral striatal necrosis; LHON, Leber’s hereditary optic neuropathy; LS, Leigh’s syndrome; MELAS, mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes; MERRF, myoclonic epilepsy with ragged-red fibres; MILS, maternally inherited Leigh’s syndrome; NARP, neuropathy, ataxia and retinitis pigmentosa; PEO, progressive external ophthalmoplegia; PPK, palmoplantar keratoderma; SIDS, sudden infant death syndrome.

Genetic disorders

The spectrum of inherited or congenital genetic disorders can be classified as the chromosomal disorders, including mitochondrial chromosome disorders, the Mendelian and sex-linked single-gene disorders, a variety of non-Mendelian disorders, and the multifactorial and polygenic disorders (Table 2.1 and Box 2.2). All are a result of a mutation in the genetic code. This may be a change of a single base-pair of a gene, resulting in functional change in the product protein (e.g. thalassaemia) or gross rearrangement of the gene within a genome (e.g. chronic myeloid leukaemia). These mutations can be congenital (inherited at birth) or somatic (arising during a person’s life).

Table 2.1 Prevalence of genetic disease

Chromosomal disorders

Chromosomal abnormalities are much more common than is generally appreciated. Over half of spontaneous abortions have chromosomal abnormalities, compared with only 4–6 abnormalities per 1000 live births. Specific chromosomal abnormalities can lead to well-recognized and severe clinical syndromes, although autosomal aneuploidy (a chromosome number differing from the normal diploid number) is usually more severe than the sex-chromosome aneuploidies. Abnormalities may occur in either the number or the structure of the chromosomes.

Abnormal chromosome numbers

If a chromosome or chromatids fail to separate (‘non-disjunction’) either in meiosis or mitosis, one daughter cell will receive two copies of that chromosome and one daughter cell will receive no copies of the chromosome. If this non-disjunction occurs during meiosis, it can lead to an ovum or sperm having:

either an extra chromosome, so resulting in a fetus that is ‘trisomic’ and has three instead of two copies of the chromosome;

or no chromosome, so the fetus is ‘monosomic’ and has one instead of two copies of the chromosome.

Non-disjunction can occur with autosomes or sex chromosomes. However, only individuals with trisomy 13, 18 and 21 survive to birth, and most children with trisomy 13 and trisomy 18 die in early childhood. Trisomy 21 (Down’s syndrome) is observed with a frequency of 1 in 650 live births, regardless of geography or ethnic background. This should be reduced with widespread screening (p. 43). Full autosomal monosomies are extremely rare and very deleterious. Sex-chromosome trisomies (e.g. Klinefelter’s syndrome, XXY) are relatively common. The sex-chromosome monosomy in which the individual has an X chromosome only and no second X or Y chromosome is known as Turner’s syndrome and is estimated to occur in 1 in 2500 live-born girls.

Occasionally, non-disjunction can occur during mitosis shortly after two gametes have fused. It will then result in the formation of two cell lines, each with a different chromosome complement: termed a ‘mosaic’ individual.

Very rarely, the entire chromosome set will be present in more than two copies, so the individual may be triploid rather than diploid and have a chromosome number of 69. Triploidy and tetraploidy (four sets) result in spontaneous abortion.

Abnormal chromosome structures

As well as abnormal numbers of chromosomes, chromosomes can have abnormal structures, and the disruption to the DNA and gene sequences may give rise to a genetic disease.

Deletions of a portion of a chromosome may give rise to a disease syndrome if two copies of the genes in the deleted region are necessary, so that the individual will not be normal with just the one normal copy remaining on the non-deleted homologous chromosome. Many deletion syndromes have been well described. For example, Prader–Willi syndrome (p. 198) is the result of cytogenetic events resulting in deletion of part of the long arm of chromosome 15; Wilms’ tumour is characterized by deletion of part of the short arm of chromosome 11; and microdeletions in the long arm of chromosome 22 give rise to DiGeorge’s syndrome.

Duplications occur when a portion of the chromosome is present on the chromosome in two copies, so the genes in that chromosome portion are present in an extra dose. A form of neuropathy, Charcot–Marie–Tooth disease (p. 1105), is due to a small duplication of a region of chromosome 17.

Inversions involve an end-to-end reversal of a segment within a chromosome, so that ‘abcdefgh’ becomes ‘abcfedgh’; an inversion of this kind is seen in haemophilia (p. 421).
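The lettered example can be reproduced with a simple string operation. This is an illustrative sketch only: the function name `invert_segment` and the use of zero-based slice positions are assumptions made for the example, and a string of letters is of course a toy stand-in for a chromosome.

```python
def invert_segment(chromosome, start, end):
    """Reverse the segment chromosome[start:end], leaving both flanks intact."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]

# Reversing the 'def' segment of the toy chromosome
print(invert_segment("abcdefgh", 3, 6))  # abcfedgh
```

Note that the segments on either side of the inverted region are unchanged; only the order within the inverted segment, and the two breakpoints, differ from the original.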

Translocations occur when two chromosome regions join together, when they would not normally. Chromosome translocations in somatic cells may be associated with tumorigenesis (see p. 451 and Fig. 9.16).

Translocations can be very complex, involving more than two chromosomes, but most are simple and fall into one of two categories:

Reciprocal translocations occur when any two non-homologous chromosomes break simultaneously and rejoin, swapping ends. In this case, the cell still has 46 chromosomes but two of them are rearranged. Someone with a balanced translocation is likely to be normal (unless a translocation breakpoint interrupts a gene); but at meiosis, when the chromosomes separate into different daughter cells, the translocated chromosomes will enter the gametes and any resulting fetus may inherit one abnormal chromosome and have an unbalanced translocation, with physical manifestations.

Robertsonian translocations occur when two acrocentric chromosomes join and the short arms are lost, leaving only 45 chromosomes. This translocation is balanced as no essential genetic material is lost and the individual is healthy. However, any offspring have a risk of inheriting an unbalanced arrangement. This risk depends on which acrocentric chromosome is involved. Clinically relevant is the 14/21 Robertsonian translocation. A woman with this karyotype has a 1 in 8 risk of having a baby with Down’s syndrome (a male carrier has a 1 in 50 risk). However, carriers have a 50% risk of producing a carrier like themselves, hence the necessity for genetic family studies. Relatives should be alerted to the increased risk of Down’s syndrome in their offspring, and should have their chromosomes checked.

Table 2.2 shows some of the syndromes resulting from chromosomal abnormalities.

Table 2.2 Chromosomal abnormalities: examples of a few syndromes

Mitochondrial chromosome disorders

The mitochondrial chromosome (see Fig. 2.18, p. 37) carries its genetic information in a very compact form; e.g. there are no introns in the genes. Therefore, any mutation has a high chance of having an effect. However, as every cell contains hundreds of mitochondria, a single altered mitochondrial genome will not be noticed. As mitochondria divide, there is a statistical likelihood that there will be more mutated mitochondria, and at some point, this will give rise to a mitochondrial disease.

Most mitochondrial diseases are myopathies and neuropathies with a maternal pattern of inheritance. Other abnormalities include retinal degeneration, diabetes mellitus and hearing loss. Many syndromes have been described.

Myopathies include chronic progressive external ophthalmoplegia (CPEO); encephalomyopathies include myoclonic epilepsy with ragged red fibres (MERRF) and mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes (MELAS) (see p. 1153).

Kearns–Sayre syndrome includes ophthalmoplegia, heart block, cerebellar ataxia, deafness and mental deficiency due to long deletions and rearrangements. Leber’s hereditary optic neuropathy (LHON) is the commonest cause of blindness in young men, with bilateral loss of central vision and cardiac arrhythmias, and is an example of a mitochondrial disease caused by a point mutation in one gene.

Multisystem disorders include Pearson’s syndrome (sideroblastic anaemia, pancytopenia, exocrine pancreatic failure, subtotal villous atrophy, diabetes mellitus and renal tubular dysfunction). In some families, hearing loss is the only symptom, and one of the mitochondrial genes implicated may predispose patients to aminoglycoside cytotoxicity.

FURTHER READING

Park CB, Larsson N-G. Mitochondrial DNA mutations in disease and aging. J Cell Biol 2011; 193:809–818

Analysis of chromosome disorders

The cell cycle can be arrested at mitosis with colchicine and, following staining, the chromosomes with their characteristic banding can be seen and any abnormalities identified (Fig. 2.19). This is an automated process with computer scanning software searching for metaphase spreads and then automatic binning of each chromosome to allow easy scoring of chromosome number and banding patterns. Another approach utilizes genome-wide array based platforms (comparative genomic hybridization (CGH) or chromosomal microarray analysis (CMA)) to identify changes in chromosome copy number and can identify very small interstitial deletions and insertions (<1 Mb in size).

Figure 2.19 Karyotyping. (a) G-banded spread of metaphase chromosomes, showing trisomy 21 (arrowed), Down’s syndrome. (b) 24-colour or multiplex FISH (mFISH) analysis is based on a DNA probe kit of whole chromosome painting probes labelled with different fluorochromes or fluorochrome combinations. Digital image analysis software analyses the colour information and identifies the chromosomal origin of each individual pixel within the image. Inter-chromosomal rearrangements will show up as colour changes within the aberrant chromosome.

(Courtesy of D. Lillington, Medical Oncology Unit, St Bartholomew’s Hospital.)

Large region specific probes are labelled with fluorescently tagged nucleotides and used to allow rapid identification of metaphase chromosomes. This approach allows easy identification of chromosomal translocations (Fig. 2.20). High-throughput sequencing is another method to identify deletions, insertions and translocation breakpoints.

Figure 2.20 Fluorescence in situ hybridization (FISH). Two-coloured FISH using a red paint for the ABL gene on chromosome 9 and a green paint for the BCR gene on chromosome 22 in the interphase cell nucleus. Where a translocation has occurred, the two genes become juxtaposed on the Philadelphia chromosome and a hybrid yellow fluorescence can be seen only in the affected cell nucleus on the right.

(Courtesy of D. Lillington, Medical Oncology Unit, St Bartholomew’s Hospital.)

Gene defects

Mendelian and sex-linked single-gene disorders are the result of mutations in coding sequences and their control elements. These mutations can have various effects on the expression of the gene, as explained below, but all cause a dysfunction of the protein product.

Mutations

Although DNA replication is a very accurate process, occasionally mistakes occur to produce changes or mutations. These changes can also occur owing to other factors such as radiation, ultraviolet light or chemicals. Mutations in gene sequences or in the sequences which regulate gene expression (transcription and translation) may alter the amino acid sequence in the protein encoded by that gene. In some cases, protein function will be maintained; in other cases, it will change or cease, perhaps producing a clinical disorder. Many different types of mutation occur.

Point mutation

This is the simplest type of change and involves the substitution of one nucleotide for another, so changing the codon in a coding sequence, leading to an amino acid substitution (non-synonymous). For example, in sickle cell disease a mutation within the globin gene changes one codon from GAG to GTG, so that instead of glutamic acid, valine is incorporated into the polypeptide chain, which radically alters its properties. However, substitutions may have no effect on the function or stability of the proteins produced as several codons code for the same amino acid (synonymous).
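The difference between a synonymous and a non-synonymous substitution can be shown by looking codons up in the genetic code. The sketch below is illustrative only: it carries a deliberately partial codon table (just the three codons needed), and the name `classify_substitution` is an assumption for this example.

```python
# Partial codon table (DNA coding strand); only the codons used below.
CODON_TO_AA = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val"}

def classify_substitution(ref_codon, alt_codon):
    """Return (reference amino acid, altered amino acid, kind of substitution)."""
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind

# The sickle cell change described above: glutamic acid becomes valine
print(classify_substitution("GAG", "GTG"))
# A third-base change that still encodes glutamic acid
print(classify_substitution("GAG", "GAA"))
```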

Insertion or deletion

Insertion or deletion of one or more bases is a more serious change, particularly if the inserted or deleted DNA is not a multiple of three bases, as this will cause the following sequence to be out of frame.
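The effect on the reading frame can be seen by splitting a sequence into codons before and after a deletion. This is a sketch under stated assumptions: the 15-base sequence is invented purely for illustration, and the helper names `codons` and `delete` are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete triplets, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def delete(seq, position, length):
    """Remove `length` bases starting at zero-based `position`."""
    return seq[:position] + seq[position + length:]

reference = "ATGGAGGTGCTGAAA"  # invented sequence, not a real gene

print(codons(reference))                 # original reading frame
print(codons(delete(reference, 4, 1)))   # one base lost: every later codon changes
print(codons(delete(reference, 3, 3)))   # three bases lost: one codon removed, frame kept
```

With this toy sequence, the single-base deletion alters every downstream codon, whereas the three-base deletion removes one codon and leaves the rest of the frame intact.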

Splicing mutations

If the DNA sequences which direct the splicing of introns from mRNA are mutated, then abnormal splicing may occur. In this case, the processed mRNA which is translated into protein by the ribosomes may carry intron sequences or miss exons so altering amino acid composition.

Nonsense mutations

A nonsense mutation is a point mutation in a sequence of DNA that results in a premature stop codon.

Single-gene disease

Monogenetic disorders involving single genes can be inherited as dominant, recessive or sex-linked characteristics. Although classically divided into autosomal dominant, recessive or X-linked disorders, many syndromes show multiple forms of inheritance pattern. For example, in Ehlers–Danlos syndrome, we find autosomal dominant, recessive and X-linked inheritance. In addition, there is a spectrum between autosomal recessive and autosomal dominant inheritance in that having just one defective allele gives a mild form of the disease (semi-dominant), while having both alleles with the mutation results in a more severe form of the syndrome. In some cases, such as factor V Leiden disease, the boundary between dominant and recessive forms is very blurred.

Some monogenetic disorders show a racial or geographical prevalence, e.g. thalassaemia (see p. 390) is seen mainly in Greeks, South-east Asians and Italians; porphyria variegata in the South African white population; and Tay–Sachs disease (p. 1042) in Ashkenazi Jewish people. Thus, although the prevalence of some single-gene diseases is very low worldwide, it is much higher in specific populations.

Autosomal dominant disorders

Each diploid cell contains two copies of all the autosomes. An autosomal dominant disorder (Fig. 2.21a) occurs when one of the two copies has a mutation and the protein produced by the normal form of the gene cannot compensate. In this case, a heterozygous individual who has two different forms (or alleles) of the same gene will manifest the disease. The offspring of heterozygotes have a 50% chance of inheriting the chromosome carrying the disease allele, and therefore also of having the disease. However, estimation of risk to offspring for counselling families can be difficult because of three factors:

Those disorders which have a great variability in their manifestation. ‘Incomplete penetrance’ may occur if patients have a dominant disorder but it does not manifest itself clinically in them. This gives the appearance of the gene having ‘skipped’ a generation.

Dominant traits are extremely variable in severity (variable expression) and a mildly affected parent may have a severely affected child.

New cases in a previously unaffected family may be the result of a new mutation. In this case, the risk of a further affected child is negligible. Most cases of achondroplasia, for example, are due to new mutations.

Figure 2.21 Modes of inheritance of simple gene disorders, with a key to the standard pedigree symbols.

SIGNIFICANT WEBSITES

Information on all autosomal single-gene disorders

http://www.genetests.org

http://www.ncbi.nlm.nih.gov/omim

Autosomal recessive disorders

These disorders (Fig. 2.21b) manifest themselves only when an individual is homozygous or a compound heterozygote for the disease allele, i.e. both chromosomes carry the same gene mutation (homozygous) or different mutations in the same gene (compound heterozygote). The parents are unaffected carriers (heterozygous for the disease allele). If carriers marry, the offspring have a one in four chance of carrying both mutant copies of the gene and being affected, a one in two chance of being a carrier, and a one in four chance of being genetically normal. Consanguinity increases the risk.
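The one in four, one in two and one in four figures follow from enumerating the four equally likely allele combinations from two carrier parents. A minimal sketch, assuming the conventional notation of `A` for the normal allele and `a` for the disease allele; the function name `cross` is an illustrative assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Offspring genotype probabilities for one gene; each parent is a two-allele string."""
    # Each parent passes on one of its two alleles with equal probability.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Two unaffected carriers: AA is genetically normal, Aa a carrier, aa affected
for genotype, probability in cross("Aa", "Aa").items():
    print(genotype, probability)
```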

Sex-linked disorders

Genes carried on the X chromosome are said to be ‘X-linked’, and can be dominant or recessive in the same way as autosomal genes (Fig. 2.21c,d).

X-linked dominant disorders

These are rare. Females who are heterozygous for the mutant gene and males who have one copy of the mutant gene on their single X chromosome will manifest the disease. Half the male or female offspring of an affected mother and all the female offspring of an affected man will have the disease. Affected males tend to have the disease more severely than the heterozygous female.

X-linked recessive disorders

These disorders present in males and present only in homozygous females (usually rare). X-linked recessive diseases are transmitted by healthy female carriers or affected males if they survive to reproduce. An example of an X-linked recessive disorder is haemophilia A (see p. 421), which is caused by a mutation in the X-linked gene for factor VIII. It has been shown that in 50% of cases there is an intrachromosomal rearrangement (inversion) of the tip of the long arm of the X chromosome (one break point being within intron 22 of the factor VIII gene).

Of the offspring from a carrier female and a normal male:

50% of the girls will be carriers as they inherit a mutant allele from their mother and the normal allele from their father; the other 50% of the girls inherit two normal alleles and are themselves normal

50% of the boys will have haemophilia as they inherit the mutant allele from their mother (and the Y chromosome from their father); the other 50% of the boys will be normal as they inherit the normal allele from their mother (and the Y chromosome from their father).

The male offspring of a male with haemophilia and a normal female will not have the disease as they do not inherit his X chromosome. However, all the female offspring will be carriers as they all inherit his X chromosome.
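The proportions given above for X-linked recessive inheritance can be checked by enumerating which sex chromosome each parent passes on. This is a sketch only: `X` denotes a normal allele and `x` the mutant allele, and the function name `x_linked_cross` is an assumption for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def x_linked_cross(mother, father):
    """mother: her two X alleles, e.g. ("X", "x"); father: (his X allele, "Y")."""
    outcomes = Counter()
    for maternal, paternal in product(mother, father):
        if paternal == "Y":
            # Sons carry only the maternal X chromosome
            status = "affected" if maternal == "x" else "unaffected"
            outcomes[("son", status)] += 1
        else:
            # Daughters carry one X chromosome from each parent
            mutant_copies = [maternal, paternal].count("x")
            status = ("unaffected", "carrier", "affected")[mutant_copies]
            outcomes[("daughter", status)] += 1
    total = sum(outcomes.values())
    return {key: Fraction(n, total) for key, n in outcomes.items()}

print(x_linked_cross(("X", "x"), ("X", "Y")))  # carrier mother, normal father
print(x_linked_cross(("X", "X"), ("x", "Y")))  # normal mother, affected father
```

The probabilities are expressed over all offspring, so a quarter of all children being affected sons corresponds to half of the boys.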

FURTHER READING

De Matteis MA, Luine A. Mendelian disorders of membrane trafficking. N Engl J Med 2011; 365:927–938.

Manolio TA. Genomewide association studies and assessment of the risk of disease. N Engl J Med 2010; 363(2):166–176.

SIGNIFICANT WEBSITE

A Catalogue of published Genome-wide Association studies

http://www.genome.gov/gwastudies/

Other single-gene disorders

These are disorders which may be due to mutations in single genes but which do not manifest as simple monogenic disorders. They can arise from a variety of mechanisms, including the following.

Triplet repeat mutations

In the gene responsible for myotonic dystrophy (p. 1153), the mutated allele was found to have an expanded 3′UTR region in which three nucleotides, CTG, were repeated up to about 200 times. In families with myotonic dystrophy, people with the late-onset form of the disease had 20–40 copies of the repeat, but their children and grandchildren who presented with the disease from birth had vast increases in the number of repeats, up to 2000 copies. It is thought that some mechanism during meiosis causes this ‘triplet repeat expansion’ so that the offspring inherit an increased number of triplets. The number of triplets affects mRNA and protein function (Table 2.3). See also page 43 for the phenomenon of ‘anticipation’.
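Counting the number of consecutive copies of a triplet in a sequence is a simple pattern-matching task. The following is a toy sketch, not a description of how repeat length is measured diagnostically: the sequence is invented, and the function name `longest_repeat_run` is hypothetical.

```python
import re

def longest_repeat_run(sequence, unit="CTG"):
    """Number of copies in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Invented flanking bases around five CTG copies
toy_sequence = "AAT" + "CTG" * 5 + "GGA"
print(longest_repeat_run(toy_sequence))  # 5
```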

Table 2.3 Examples of trinucleotide repeat genetic disorders

Mitochondrial disease

As discussed on pages 37 and 40, various mitochondrial gene mutations can give rise to complex disease syndromes with incomplete penetrance and maternal inheritance (Fig. 2.18).

Imprinting

It is known that normal humans need a diploid number of chromosomes of 46. However, the maternal and paternal contributions can be different. Imprinting is relevant to human genetic disease because different phenotypes may result depending on whether the mutant chromosome is maternally or paternally inherited. A deletion of part of the long arm of chromosome 15 (15q11-q13) will give rise to the Prader–Willi syndrome (PWS), if it is paternally inherited. A deletion of a similar region of the chromosome gives rise to Angelman’s syndrome (AS) if it is maternally inherited. The affected gene in AS has been identified as that encoding ubiquitin-protein ligase E3A (UBE3A).

Complex traits: multifactorial and polygenic inheritance

Characteristics resulting from a combination of genetic and environmental factors are said to be multifactorial; those involving multiple genes are said to be polygenic. There has been an explosion of genetic discovery about these complex traits with the development of high-throughput genome-wide SNP arrays, which has allowed cost-effective and unbiased screening of large case–control cohorts (in excess of 1000 cases). This has allowed unequivocal identification of SNPs associated with a variety of traits and diseases. For example, over 30 SNPs in immune-related gene loci have been associated with coeliac disease, with over half also associated with other immune-mediated or inflammatory diseases. This indicates that there are many low-risk genetic risk factors associated with complex traits, and common pathways are implicated in different diseases.

Most human diseases, such as heart disease, diabetes and common mental disorders, are multifactorial traits (Table 2.4).

Table 2.4 Examples of disorders that may have a polygenic inheritance

Disorder	Frequency (%)	Heritability (%)^a
Hypertension	5	62
Asthma	4	80
Schizophrenia	1	85
Congenital heart disease	0.5	35
Neural tube defects	0.5	60
Congenital pyloric stenosis	0.3	75
Ankylosing spondylitis	0.2	70
Cleft palate	0.1	76

a Percentage of the total variation of a trait which can be attributed to genetic factors.

Population genetics

The genetic constitution of a population depends on many factors. The Hardy–Weinberg equilibrium is a concept, based on a mathematical equation, that describes the outcome of random mating within populations. It states that ‘in the absence of mutation, non-random mating, selection and genetic drift, the genetic constitution of the population remains the same from one generation to the next’.

This genetic principle has clinical significance in terms of the number of abnormal genes in the total gene pool of a population. The Hardy–Weinberg equation states that:

p² + 2pq + q² = 1

where p is the frequency of the normal gene in the population, q is the frequency of the abnormal gene, p² is the frequency of the normal homozygote, q² is the frequency of the affected abnormal homozygote, 2pq is the carrier frequency, and p+q = 1.

Example: The equation can be used to find the frequency of heterozygous carriers in cystic fibrosis. The incidence of cystic fibrosis is 1 in 2000 live births. Thus q² = 1/2000, and therefore q ≈ 1/44. Since p = 1 – q, then p ≈ 43/44. The carrier frequency is represented by 2pq, which in this case is approximately 1/22. Thus about 1 in 22 individuals in the whole population is a heterozygous carrier for cystic fibrosis.
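The same calculation can be carried out numerically from the incidence of the homozygous condition. This is a minimal sketch assuming Hardy–Weinberg equilibrium; the function name `carrier_frequency` is illustrative.

```python
from math import sqrt

def carrier_frequency(incidence):
    """Return (p, q, 2pq) given q², the incidence of the affected homozygote."""
    q = sqrt(incidence)   # frequency of the abnormal gene
    p = 1 - q             # frequency of the normal gene
    return p, q, 2 * p * q

p, q, carriers = carrier_frequency(1 / 2000)
print(f"q is about 1 in {1 / q:.1f}")
print(f"carrier frequency is about 1 in {1 / carriers:.1f}")
```

Without intermediate rounding, q comes out slightly smaller than 1/44 and the carrier frequency slightly lower than 1 in 22 (a little under 1 in 23), which is consistent with the rounded figures in the worked example.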

Clinical genetics and genetic counselling

Genetic disorders pose considerable health and economic problems because often there is no effective therapy. In any pregnancy, the risk of a serious developmental abnormality is approximately 1 in 50; approximately 15% of paediatric inpatients have a multifactorial disorder with a predominantly genetic element. Half of clinical genetics referrals are adults with late-onset disease, including neurological, endocrine and gastrointestinal disorders and cancer.

People with a history of a congenital abnormality in a member of their family often seek advice as to why it happened and about the risks of producing further abnormal offspring. Interviews must be conducted with great sensitivity and psychological insight, as parents may feel a sense of guilt and blame themselves for the abnormality in their child.

Genetic counselling should have the following aims:

Obtaining a full history. The pregnancy history, drug and alcohol ingestion during pregnancy and maternal illnesses (e.g. diabetes) should be detailed.

Establishing an accurate diagnosis. Examination of the child may help in diagnosing a genetically abnormal child with characteristic features (e.g. trisomy 21) or in determining whether a genetically normal fetus was damaged in utero.

Drawing a family tree is essential. Questions should be asked about abortions, stillbirths, deaths, marriages, consanguinity and medical history of family members. Diagnoses may need verification from other hospital reports.

Estimating the risk of a future pregnancy being affected or carrying a disorder. Estimation of risk should be based on the pattern of inheritance. Mendelian disorders (see earlier) carry a high risk; chromosomal abnormalities other than translocations typically carry a low risk. Empirical risks may be obtained from population or family studies.

Information giving on prognosis and management with adequate time given so that all information is discussed openly, freely and repeated as necessary.

Continued support and follow-up. Explanation of the implications for other siblings and family members.

Genetic screening. This includes prenatal diagnosis or preimplantation genetic diagnosis (IVF followed by testing of embryos before implantation) if requested, carrier detection and data storage in genetic registers. A large number of molecular genetic tests are now available.

The near future? With the development of cheap high-throughput sequencing, couples could be tested for all genes (termed ‘exome’ sequencing) prior to starting a family to assess if they are carriers of recessive mutations in the same disease-associated gene. This information could then be used in prenatal diagnosis.

Genetic counselling should be non-directive, with the couple making their own decisions on the basis of an accurate presentation of the facts and risks in a way they can understand.

FURTHER READING

Rotimi CN, Jorde LB. Ancestry and disease in the age of genomic medicine. N Engl J Med 2010; 363:1551–1558

SIGNIFICANT WEBSITE

Information on molecular genetic tests

http://www.geneclinics.org

Genetic anticipation

It has been noted that successive generations of people with, e.g. dystrophia myotonica and Huntington’s chorea, present earlier and with progressively worse symptoms. This ‘anticipation’ is due to unstable mutations occurring within the disease gene. Trinucleotide repeats such as CTG (dystrophia myotonica) and CAG (Huntington’s chorea) expand within the disease gene with each generation, and somatic expansion with cellular replication is also observed. This type of genetic mutation can occur within the translated region or untranslated (and presumably regulatory) regions of the target genes. This genetic distinction has been used to subclassify a number of genetic diseases which have now been shown to be caused by trinucleotide repeat expansion and display phenotypic ‘anticipation’ (Table 2.3).

Prenatal diagnosis for chromosomal disorders

This should be offered to all pregnant women. Practice and uptake vary in different maternity units, with some offering screening only to high-risk mothers. The risks of Down’s syndrome increase disproportionately and rapidly for children born to mothers older than 35 years. Infants born to mothers with a history or family history of other conditions due to chromosomal abnormalities may be at increased risk.

Personal choice

There should be a detailed discussion with all mothers as to the possible consequences of each screening test before they are offered it. In particular, they should have an understanding of the failure rates, the detection rates, the false positive and the false negative rates of each test so that they can properly exercise choice.

Investigations

The choice of investigation depends on gestational age:

7–11 Weeks (vaginal ultrasound)

Ultrasound is used to confirm viability, fetal number and gestation by crown-rump measurement.

11–13 Weeks and 6 days (combined test)

The combined test comprises:

Ultrasound for nuchal translucency measurement (normal fold <6 mm) to attempt to detect major chromosomal abnormalities (e.g. trisomies and Turner’s syndrome)

Testing of maternal serum for pregnancy-associated plasma protein-A (PAPP-A from the syncytial trophoblast) and β-human chorionic gonadotrophin for trisomy 21.

All serum marker measurements are corrected for gestational age and expressed as a multiple of the median (MoM) value for the appropriate week of gestation. If abnormalities are detected, it is necessary to discuss whether further investigation is desired or not. Chorionic villus sampling (CVS) at 11–13 weeks under ultrasound control to sample the placental site, or amniocentesis at 15 weeks to sample amniotic fluid and obtain the fetal cells necessary for cytogenetic testing, are the next options.

The combined test is more accurate than the triple test alone at 16 weeks (see below).

14–22 Weeks

Ultrasound detects structural abnormalities (e.g. neural tube defects; the gestation period for detection depends on severity). The best time to detect congenital heart defects is 18–22 weeks.

Reported detection rates for all congenital defects vary, e.g. from 14–61% for hypoplastic ventricle to 97–100% for anencephaly.

In time, some of these tests are likely to be superseded by the salvage of fetal cells from the maternal blood or from cervical secretions, or by retrieving cell-free fetal DNA from maternal plasma. Other conditions such as myotonic dystrophy and Huntington’s chorea may be detected from fetal circulating nucleic acids.

Genomic medicine

Gene therapy

Some genetic disorders, such as phenylketonuria or haemophilia, can be managed by diet or replacement therapy, but most have no effective treatment. One approach to managing inherited genetic disease is to place a normal copy of a gene into the cells of a patient who has a defective copy of that gene; this is termed gene therapy.

There are many technical problems to overcome in gene therapy, particularly in finding delivery systems to introduce DNA into a mammalian cell. Very careful control and supervision of gene manipulation will be necessary because of its potential hazards and the ethical issues.

Two major factors are involved in gene therapy:

The introduction of the functional gene sequence into target cells

The expression and permanent integration of the transfected gene into the host cell genome.

Cystic fibrosis (see also p. 821)

CFTR, the cystic fibrosis transmembrane regulator, is an unusual ABC transporter in that it functions not as a primary active transporter but as a ligand-gated chloride channel (Fig. 2.22). The common CF mutation is a 3 bp deletion in exon 10 that removes a codon specifying phenylalanine (F508del). In this mutation the CFTR protein is misfolded, which impairs its biosynthesis and consequently disrupts delivery of the protein to the cell surface. In the G551D-CFTR mutation, glycine at position 551 is replaced by aspartate; the CFTR channel reaches the cell surface but fails to open. Understanding of this defect has introduced a new era of treatment: VX-770, a potentiating agent that can be given orally, has been developed. It increases the fraction of time that the phosphorylated G551D-CFTR channel is open, allowing bicarbonate and chloride to flow across the membrane. Early clinical results are encouraging.
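The effect of a 3 bp deletion on the reading frame can be illustrated with a short sketch. The 15-base stretch used here is the sequence commonly quoted around codon 508 and should be treated as illustrative rather than as a reference sequence; the codon table is deliberately limited to the codons that appear, and the helper name is an assumption made for this example.

```python
# Minimal codon table covering only the codons used in this example
CODONS = {"ATC": "Ile", "ATT": "Ile", "TTT": "Phe", "GGT": "Gly", "GTT": "Val"}


def translate(dna):
    """Translate a coding-strand DNA string, three bases at a time."""
    usable = len(dna) - len(dna) % 3          # ignore any trailing partial codon
    return [CODONS[dna[i:i + 3]] for i in range(0, usable, 3)]


# Illustrative stretch: Ile-Ile-Phe-Gly-Val, with Phe at position 508
wild_type = "ATCATCTTTGGTGTT"

# Remove three consecutive bases (CTT) spanning the Ile/Phe codon boundary
deleted = wild_type[:5] + wild_type[8:]

print("Wild type:", "-".join(translate(wild_type)))
print("Deleted:  ", "-".join(translate(deleted)))
```

Because exactly three bases are lost, the reading frame downstream is preserved: one phenylalanine disappears and the neighbouring residues are unchanged. The result is a full-length but misfolded protein rather than a truncated one.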

Figure 2.22 Model of cystic fibrosis transmembrane regulator (CFTR). This is an integral membrane glycoprotein, consisting of two repeated elements. The cylindrical structures represent six membrane-spanning helices in each half of the molecule. The nucleotide-binding folds (NBFs) are in the cytoplasm. The regulatory (R) domain links the two halves and contains charged individual amino acids and protein kinase phosphorylation sites (black triangles). N and C are the amino and carboxy termini of the protein, respectively. The branched structure on the right half represents potential glycosylation sites.

There are also over 1000 different mutations of the CFTR gene with many mapping to the ATP-binding domains.

Two other routes of gene therapy have been tried: placing the wild-type CFTR cDNA either into an adenovirus vector (see Fig. 15.28) to allow infection of human cells, or into a plasmid (an engineered circle of DNA) that is then encapsulated in a liposome to allow transfection of human cells. The latter can be delivered to the lung by aerosol spray, where the liposome fuses with the cell membrane to deliver the CFTR cDNA into the cell. However, neither is yet a treatment option. An alternative method is to suppress premature termination codons and thus permit translation to continue; topical nasal gentamicin (an aminoglycoside antibiotic) has been shown to result in the expression of functional CFTR channels.

Adenosine deaminase (ADA) deficiency

Successful gene therapy for this rare immunodeficiency disease has entailed introducing a normal human ADA gene into the patient’s lymphocytes to reconstitute the function of the cellular and humoral immune system in severe combined immunodeficiency (SCID).

Pharmacogenomics

This is the study of how individual SNPs determine drug behaviour, which helps to explain why patients respond variably to a particular drug. The potential of pharmacogenomic approaches is usually related to single-gene traits that affect drug metabolism, e.g. SNPs in the gene encoding thiopurine S-methyltransferase (TPMT), which metabolizes immunosuppressant drugs (e.g. azathioprine). Therapies based on a patient’s genetic profile will lead to the development of patient-specific drugs (or drug combinations).

Stem cell therapy

Stem cell therapy has the potential to radically change the treatment of human disease (see p. 33). A number of adult stem cell therapies already exist, particularly bone marrow transplants. It is currently anticipated that technologies derived from stem cell research can be used to treat a wider variety of diseases in which replacement of destroyed specialist tissues is required, such as in Parkinson’s disease, spinal cord injuries and muscle damage.

Ethical considerations

Ethical considerations must be taken into account in any discussion of clinical genetics. For example, prenatal diagnosis with the option of termination may be unacceptable on moral or religious grounds. With diseases for which there is no cure and currently no treatment (e.g. Huntington’s chorea), genetic tests can predict accurately which family members will be affected; however, many people would rather not know this information. One very serious outcome of the new genetic information is that disease susceptibility may be predictable, for example in Alzheimer’s disease, so the medical insurance companies can decline to give policies for individuals at high risk. Society has not yet decided who should have access to an individual’s genetic information and to what extent privacy should be preserved.

The genetic basis of cancer

Cancers are genetic diseases and involve changes to the normal function of cellular genes. However, multiple genes interact during oncogenesis and an almost stepwise progression of defects leads from an overproliferation of a particular cell to the breakdown of control mechanisms such as apoptosis (programmed cell death). Apoptosis would normally be triggered if a cell were to attempt to survive in an organ other than its tissue of origin. For the vast majority of cancer cases (especially those in older people), the multiple genetic changes which occur are somatic. For some cancers, however (where the cancer normally occurs at an earlier age), a dominant inherited single-gene defect can give rise to an almost Mendelian trend with lifetime risks of nearly 90%.

Autosomal dominant inheritance

The following are examples of cancer syndromes (see Table 9.3, p. 433) that exhibit dominant inheritance:

Retinoblastoma is an eye tumour found in young children. It occurs in both hereditary (40%) and non-hereditary (60%) forms. The 40% of people with the hereditary form have a germline mutation in the Retinoblastoma gene (RB1) and are also at risk for developing other tumours, particularly osteosarcoma.

Breast and ovarian cancer. Two major genes have been identified – BRCA1 and BRCA2. A strong family history along with germline mutation of these genes accounts for most cases of familial breast cancer and over half of familial ovarian cancers. BRCA1 and 2 proteins bind to the DNA repair enzyme Rad51 to make it functional in repairing DNA breaks. Mutations in the BRCA genes will lead to accumulation of unrepaired mutations in tumour-suppressor genes and crucial oncogenes.

Neurofibromatosis. Inactivation of the NF1 gene will lead to constitutive activation of ras proteins.

Multiple-endocrine-adenomatosis syndromes (see p. 997). Multiple endocrine neoplasia type 1 (MEN1) is associated with the MEN1 gene, and type 2 (MEN2) is associated with mutations in the RET proto-oncogene on chromosome 10. MEN2 is therefore the exception among these syndromes, all the others of which involve tumour suppressor genes.

Autosomal recessive inheritance

Some relatively rare autosomal recessive diseases associated with abnormalities of DNA repair predispose to the development of cancer:

Xeroderma pigmentosum. There is an inability to repair DNA damage caused by ultraviolet light and by some chemicals, leading to a high incidence of skin cancer.

Ataxia telangiectasia. Mutation results in an increased sensitivity to ionizing radiation and an increased incidence of lymphoid tumours.

Bloom’s syndrome and Fanconi’s anaemia. An increased susceptibility to lymphoid malignancy is seen.

Oncogenes

The genes coding for growth factors, growth factor receptors, secondary messengers or even DNA-binding proteins can act as promoters of abnormal cell growth if mutated. This concept was verified when viruses were found to carry genes which, when integrated into the host cell, promoted oncogenesis. These were originally termed viral or ‘v-oncogenes’, and later their normal cellular counterparts, c-oncogenes, were found. Thus, oncogenes encode proteins that are known to participate in the regulation of normal cellular proliferation, e.g. erb-A on chromosome 17q11-q12 encodes the thyroid hormone receptor. See Table 2.5.

Table 2.5 Examples of acquired/somatic mutations and proto-oncogenes

Point mutation
K-RAS	Colorectal and pancreatic cancer
B-RAF	Melanoma, thyroid
ALK	Lung cancer
DNA amplification
MYC	Neuroblastoma
HER2-neu	Breast cancer
Chromosome translocation
BCR/ABL	CML, ALL
PML/RARA	APL
BCL2/IGH	Follicular lymphoma
IGH/CCND1	Mantle cell lymphoma
MYC/IGH	Burkitt’s lymphoma

CML, chronic myeloid leukaemia; ALL, acute lymphoblastic leukaemia; APL, acute promyelocytic leukaemia.

If during cell division an error occurs and two chromosomes translocate, so that a portion swaps over, the translocation breakpoint may occur in the middle of two genes. If this happens then the end of one gene is translocated on to the beginning of another gene, giving rise to a ‘fusion gene’. Therefore sequences of one part of the fusion gene are inappropriately expressed because they are under the control of the other part of the gene.

An example of such a fusion gene (the Philadelphia chromosome) occurs in chronic myeloid leukaemia (CML, see p. 451). Similarly in Burkitt’s lymphoma, a translocation causes the regulatory segment of the myc oncogene to be replaced by a regulatory segment of an unrelated immunoglobulin gene.

Viral stimulation

When viral RNA is transcribed by reverse transcriptase into viral cDNA, which is in turn spliced into the cellular DNA, the viral DNA may integrate within an oncogene and activate it. Alternatively, the virus may pick up cellular oncogene DNA and incorporate it into its own viral genome. Subsequent infection of another host cell might result in expression of this viral oncogene. For example, the Rous sarcoma virus of chickens was found to induce cancer because it carried the src oncogene.

After the initial activation event, other changes occur within the DNA. A striking example of this is amplification of gene sequences, which can affect the myc gene, for example. Instead of the normal two copies of a gene, multiple copies of the gene appear either within the chromosomes (these can be seen on stained chromosomes as homogeneously staining regions) or as extrachromosomal particles (double minutes). N-myc sequences are amplified in neuroblastomas, as are N-myc or L-myc in some small-cell lung carcinomas.

Tumour suppressor genes

These genes restrict undue cell proliferation (in contrast to oncogenes), and induce the repair or self-destruction (apoptosis) of cells containing damaged DNA. Therefore, mutations that disable the function of these genes lead to uncontrolled cell growth in cells with active oncogenes. An example is the germline mutations found in hereditary non-polyposis colorectal cancer, which affect genes responsible for repairing DNA mismatches (p. 288).

The RB gene was the first tumour suppressor gene to be described (p. 433). In the familial variety, the first mutation is inherited and a second, somatic mutation then occurs by chance, leading to the formation of a tumour. In the sporadic variety, mutations occur by chance in both RB genes in a single cell.

Since the finding of RB, other tumour suppressor genes have been described, including the gene p53. Mutations in p53 have been found in almost all types of human tumour, including sporadic colorectal carcinomas, carcinomas of breast and lung, brain tumours, osteosarcomas and leukaemias. The protein encoded by p53 is a cellular 53 kDa nuclear phosphoprotein that plays a role in DNA repair and synthesis, in the control of the cell cycle and cell differentiation, and in programmed cell death (apoptosis). p53 is a DNA-binding protein which activates many gene expression pathways but it is normally only short-lived. In many tumours, mutations that disable p53 function also prevent its cellular catabolism. Although in some cancers there is a loss of p53 from both chromosomes, in most cancers (particularly colorectal carcinomas; see Fig. 9.1) the long-lived protein produced by a mutant p53 allele can disrupt the protein product of the normal allele. As a DNA-binding protein, p53 is likely to act as a tetramer.

Thus, a mutation in a single copy of the gene can promote tumour formation because a hetero-tetramer of mutated and normal p53 subunits would still be dysfunctional. p53 and RB are involved in normal regulation of the cell cycle. Other cancer–associated genes are also intimately involved in control of the cell cycle (Fig. 2.13).
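The dominant-negative argument above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that mutant and normal subunits assemble into tetramers at random and that a tetramer works only if all four subunits are normal; neither assumption is stated in the text, and the function name is hypothetical.

```python
def functional_tetramer_fraction(mutant_fraction, subunits=4):
    """Fraction of tetramers made up entirely of normal subunits.

    Assumes random assembly and that a single mutant subunit
    inactivates the whole tetramer (illustrative assumptions only).
    """
    if not 0 <= mutant_fraction <= 1:
        raise ValueError("mutant_fraction must be between 0 and 1")
    return (1 - mutant_fraction) ** subunits


# Equal amounts of mutant and normal protein, then an excess of mutant
# protein as might arise if the mutant form is degraded more slowly
for mutant_fraction in (0.0, 0.5, 0.75):
    working = functional_tetramer_fraction(mutant_fraction)
    print(f"mutant subunits {mutant_fraction:.0%} -> functional tetramers {working:.1%}")
```

Under these assumptions a heterozygous cell with equal amounts of each subunit would retain only about one in sixteen functional tetramers, and fewer still if the mutant protein accumulates. This is consistent with the statement that a mutation in a single copy of the gene can promote tumour formation.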