Chapter 6 Natural product chemistry
Natural products in drug discovery
A survey of any pharmacopoeia will show that natural products have a key role as biologically active agents; in fact, it has been estimated that 20–25% of all medicines are derived from such sources. In this definition, the medicinal agent may be a natural product isolated straight from the producing organism (e.g. the β-lactamase inhibitor clavulanic acid isolated from the bacterium Streptomyces clavuligerus), a natural product that has undergone a minor chemical modification (semisynthetic) (e.g. aspirin, derived from salicylic acid, which occurs as esters and glycosides in Salix spp.), or a compound that was totally synthesized based on a particular natural product possessing biological activity (e.g. pethidine, which was based on morphine from the opium poppy, Papaver somniferum). It is sometimes difficult to see how the fully synthetic compound was modelled on the natural product (Fig. 6.1).
Natural products are historically the core of medicines and they are still a major source of drug leads, a term used to describe compounds that may be developed into medicines. A particular example of a natural product that is currently one of the best-selling drugs is paclitaxel, marketed as Taxol (Fig. 6.2). This drug was developed by Bristol-Myers Squibb for the treatment of ovarian and mammary cancers, and became available for use in the USA in 1993. The compound was initially isolated from the bark of the Pacific yew tree, Taxus brevifolia, and demonstrates the best possible qualities of a natural product, being highly functionalized and chiral. Additionally, paclitaxel occurs in the bark with a wide range of structurally related compounds (taxane diterpenes); this is a further important and valuable quality of natural products when they are considered as a source in the search for biologically active drug leads. Paclitaxel has many functional groups and 11 chiral centres, and these qualities give rise to its distinct shape and fascinating biological activity. It is important not to be overwhelmed by such a complex molecule, but to look at the functional groups that make up the total structure of the compound. Even a natural product that is as structurally complex as paclitaxel can be broken down into the simple chemical features of functional groups and chiral centres.
In the chemotaxonomic approach, knowledge that a particular group of plants contains a certain class of natural product may be used to predict that taxonomically related plants may contain structurally similar compounds. This approach is highly useful when the chemistry and biological activity of a compound are well described and compounds with similar chemical structures are needed for further biological testing. A good example of this is the plant family Solanaceae, which is a rich source of alkaloids of the tropane type. The knowledge that deadly nightshade (Atropa belladonna) produces hyoscyamine (a smooth muscle relaxant) would enable one to predict that the thorn apple (Datura stramonium) would contain structurally related compounds, and this is certainly the case, with hyoscine being the major constituent of this solanaceous plant (Fig. 6.3).
The discovery of drugs from nature is complex and is depicted schematically in Fig. 6.4. The biomass (plant, microbe, marine organism) is collected, dried and extracted into a suitable organic solvent to give an extract, which is then screened in a bioassay to assess its biological activity (bioactivity). Screening or assessment of biological activity is generally divided into two formats depending on the number of extracts to be assessed. In low-throughput screening (LTS), small numbers of extracts (a single extract up to hundreds of extracts) are dispensed into a format that is compatible with the bioassay (e.g. a microtitre plate, sample tubes). This approach is used widely in academic laboratories where only a relatively low number of extracts are assessed. In high-throughput screening (HTS), thousands of extracts are dispensed into a format (usually microtitre plates with many wells, e.g. 384 wells per plate) and screened in the bioassay. This approach is favoured by the pharmaceutical industry, which may have hundreds of thousands of samples (both natural and synthetic) for biological evaluation. This large-scale approach means that decisions can be made rapidly about the status of an extract, which has an impact on the cost of the discovery process.
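The scale difference between the two screening formats can be made concrete with a little arithmetic. The following is a minimal sketch, not a laboratory protocol: the function name plates_needed and the control_wells parameter are hypothetical, introduced only for illustration, and it assumes one extract per well.

```python
import math


def plates_needed(n_extracts: int, wells_per_plate: int = 384,
                  control_wells: int = 0) -> int:
    """Return how many microtitre plates are needed for n_extracts.

    Assumes one extract per well. control_wells is a hypothetical
    number of wells per plate reserved for controls (an assumption,
    not a figure taken from the text).
    """
    usable = wells_per_plate - control_wells
    if usable <= 0:
        raise ValueError("no wells left for samples")
    return math.ceil(n_extracts / usable)


# A small academic screen fits on a single plate ...
small_screen = plates_needed(200)
# ... whereas an industrial collection needs hundreds of plates.
large_screen = plates_needed(100_000)
```

With these assumptions, 200 extracts occupy one 384-well plate, whereas 100 000 samples require 261 plates, which is why the large-scale format is normally dispensed and read automatically.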
The polyketides
Polyketides are mainly acetate (C2) derived metabolites and occur throughout all organisms (as fatty acids and glycerides), but it is the microbes, predominantly the filamentous bacteria of the genus Streptomyces, that produce structurally diverse types of polyketides, especially as antibiotic substances. The biosynthesis of these compounds begins (Fig. 6.5) with the condensation of one molecule of malonyl-CoA (CoA is short for coenzyme A) with one molecule of acetyl-CoA to form the simple polyketide acetoacetyl-CoA. In this reaction (Claisen reaction), one molecule of CO2 and one molecule of HSCoA are generated. The reaction occurs because the carbon between both carbonyl groups of malonyl-CoA (the acidic carbon) is nucleophilic and can attack an electropositive (electron-deficient) centre (e.g. the carbon of a carbonyl group).
The curved arrows in Fig. 6.5 indicate the movement of a pair of electrons to form a bond. Further condensation reactions between another molecule of malonyl-CoA and the growing polyketide lead to chain elongation, in which every other carbon in the chain is a carbonyl group. These chains are known as poly-β-keto esters and are the reactive intermediates that form the polyketides. Using these esters, large chains such as fatty acids can be constructed and, in fact, reduction of the carbonyl groups and hydrolysis of the -SCoA thioester leads to the fatty acid class of compounds. The expanding polyketide chain may be attached as a thioester either to CoA or to a protein called an acyl-carrier protein. Multiple Claisen reactions with additional molecules of malonyl-CoA can generate long-chain fatty acids such as stearic acid and myristic acid.
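Because each Claisen condensation adds two carbons from malonyl-CoA and releases one molecule of CO2, the number of condensation steps follows directly from the chain length. The sketch below assumes a saturated, even-numbered, straight chain built from one acetyl-CoA starter unit; the function name elongation_summary is hypothetical, and the carbon numbers used for stearic acid (C18) and myristic acid (C14) are standard values rather than figures quoted in the text above.

```python
def elongation_summary(n_carbons: int) -> dict:
    """Count the building blocks for an even-numbered acetate-derived chain.

    Simplified bookkeeping: one acetyl-CoA starter (C2) plus one
    malonyl-CoA per Claisen condensation, each condensation adding
    two carbons to the chain and releasing one CO2.
    """
    if n_carbons < 4 or n_carbons % 2 != 0:
        raise ValueError("expected an even carbon number of at least 4")
    condensations = (n_carbons - 2) // 2
    return {
        "acetyl_CoA": 1,
        "malonyl_CoA": condensations,
        "CO2_released": condensations,
    }


acetoacetyl = elongation_summary(4)    # the first condensation product
myristic = elongation_summary(14)      # C14 fatty acid
stearic = elongation_summary(18)       # C18 fatty acid
```

On this count, acetoacetyl-CoA needs a single condensation, myristic acid six and stearic acid eight. The reduction steps that remove the carbonyl oxygens are deliberately left out of the sketch.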
The poly-β-keto ester can also cyclize to give aromatic natural products, and the way in which the poly-β-keto ester folds determines the type of natural product generated (Fig. 6.6). If the poly-β-keto ester folds as A1, then loss of a proton, followed by an intramolecular Claisen reaction of intermediate A2 (by attack of the acidic carbon on the carbonyl), would result in the formation of a cyclic polyketide enolate A3 which will rearrange to the keto compound with expulsion of the SCoA anion, resulting in the ketone A4. This ketone would readily undergo keto-enol tautomerism to the more favoured aromatic triphenol A5 (phloroacetophenone).
Fatty acids and glycerides
This group of polyketides is widely distributed and present as part of the general biochemistry of all organisms, particularly as components of cell membranes. They are usually insoluble in water and soluble in organic solvents such as hexane, diethyl ether and chloroform. These natural products are sometimes referred to as fixed oils (liquid) or fats (solid), although these terms are imprecise as both fixed oils and fats contain mixtures of glycerides and free fatty acids and the state of the compound (i.e. liquid or solid) will depend on the temperature as well as the composition. Glycerides are fatty acid esters of propane-1,2,3-triol (glycerol). They are sometimes referred to as saponifiable natural products, meaning that they can be converted into soaps by a strong base (NaOH). The term saponifiable comes from the Latin word sapo meaning ‘soap’. Saponification of fatty acids and glycerides with sodium hydroxide results in the formation of the sodium salts of the fatty acids (Fig. 6.7).
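The stoichiometry of saponification can be checked numerically: one triglyceride consumes three equivalents of NaOH and gives one glycerol and three fatty acid sodium salts. The following sketch uses tristearin (the glyceride carrying three stearic acid residues) purely as an illustrative substrate; it is an assumption and not necessarily the example drawn in Fig. 6.7. The atomic masses are rounded standard values and the function names are hypothetical.

```python
# Rounded standard atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "Na": 22.990}

# Elemental compositions; tristearin is an illustrative choice only.
TRISTEARIN = {"C": 57, "H": 110, "O": 6}
NAOH = {"Na": 1, "O": 1, "H": 1}
GLYCEROL = {"C": 3, "H": 8, "O": 3}
SODIUM_STEARATE = {"C": 18, "H": 35, "O": 2, "Na": 1}


def molar_mass(formula: dict) -> float:
    """Sum atomic masses over an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def naoh_required(mass_triglyceride_g: float, triglyceride: dict = TRISTEARIN) -> float:
    """Mass of NaOH (g) needed to saponify a triglyceride completely.

    Three ester groups per molecule, hence three moles of NaOH per mole.
    """
    moles_triglyceride = mass_triglyceride_g / molar_mass(triglyceride)
    return 3 * moles_triglyceride * molar_mass(NAOH)


# Mass balance: triglyceride + 3 NaOH -> glycerol + 3 sodium salts.
reactants = molar_mass(TRISTEARIN) + 3 * molar_mass(NAOH)
products = molar_mass(GLYCEROL) + 3 * molar_mass(SODIUM_STEARATE)

naoh_for_100_g = naoh_required(100.0)
```

Under these assumptions, about 13.5 g of NaOH is needed to saponify 100 g of tristearin, and the masses of reactants and products balance. A real fixed oil or fat is a mixture of glycerides, so the figure for a natural sample would be an average over its components.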
Glycerides can be very complicated mixtures as, unlike the example given in Fig. 6.7, the substituents on the glycerol alcohol may be different from each other, and it is not uncommon for lipophilic plant extracts to contain many types of glycerides.
Fatty acids are very important as formulation agents and vehicles in pharmacy and as components of cosmetics and soaps. Table 6.1 lists the common names, chemical formulae, sources and uses of the more common fatty acids.
The tetracyclines
These polyketide-derived natural products are tetracyclic (i.e. have four linearly fused six-membered rings, from which the group was named) and were discovered as part of a screening programme of extracts produced by filamentous bacteria (Actinomycetes), which are common components of soil. The most widely studied actinomycetes are species of the genus Streptomyces, which are very adept at producing many types of polyketide natural products, of which the antibiotic tetracycline (Fig. 6.8) and the anthracycline antitumour agents (see Chapter 8) are excellent examples.
The key features of this class of compound are shown in Fig. 6.8. Although tetracycline has numerous functional groups, including a tertiary amine, hydroxyls, an amide, a phenolic hydroxyl and keto groups, it is still possible to see that tetracycline is a member of the polyketide class of natural products by looking at the lower portion of the molecule. C10, C11, C12 and C1 are oxygenated, indicating that the precursor of this compound was a poly-β-keto ester. These four oxygenated carbons form part of a chelating system that is essential for antibiotic activity; it may readily chelate metal ions such as calcium, magnesium, iron or aluminium, and the compound then becomes inactive. This is one of the reasons why oral formulations of the tetracycline antibiotics are never given with foodstuffs that are high in these ions (e.g. Ca2+ in milk) or with antacids, which are high in cations such as Mg2+. This group of antibiotics has long been known and has a very broad spectrum of activity against Gram-positive and Gram-negative bacteria, spirochetes, mycoplasmae, rickettsiae and chlamydiae. Tetracycline comes from mutants of Streptomyces aureofaciens, and the related analogue oxytetracycline from S. rimosus (Fig. 6.9). These antibiotics are widely used as topical formulations for the treatment of acne, and as oral/injection preparations.
Griseofulvin
Another polyketide antibiotic is griseofulvin (Grisovin) from the fungus (mould) Penicillium griseofulvum (Fig. 6.10). This compound was originally isolated by researchers at the London School of Hygiene and Tropical Medicine.
Erythromycin A
Erythromycin A is a complex polyketide from Saccharopolyspora erythraea (Actinomycetes), a filamentous bacterium that was originally classified in the genus Streptomyces. This compound is a member of the natural product class of macrolide antibiotics; these can contain 12 or more carbons in the main ring system. The term macrolide is derived from the fact that erythromycin is a large ring structure (macro) and is also a cyclic ester referred to as an olide (a lactone). As can be seen from Fig. 6.11, erythromycin A has the best features of natural products, being highly chiral (possessing many stereochemical centres) and having many different functional groups, including a sugar, an amino sugar, lactone, ketone and hydroxyl groups.
The statins
A further group of polyketide-derived natural products is the statins, so named for their ability to lower (bring into stasis) the production of cholesterol, high levels of which are a major contributing factor to the development of heart disease. The rationale behind their use is that they inhibit the enzyme hydroxymethylglutaryl-CoA (HMG-CoA) reductase, which catalyses the conversion of HMG-CoA (Fig. 6.12) to mevalonic acid, one of the key intermediates in the biosynthesis of cholesterol. HMG-CoA reductase became the target that led to the discovery of the natural product inhibitor mevastatin, which was initially isolated from cultures of the fungi Penicillium citrinum and Penicillium brevicompactum (Fig. 6.12).
Shikimic-acid-derived natural products
Shikimic acid, sometimes referred to as shikimate, is a simple acid precursor for many natural products. These include the aromatic amino acids phenylalanine, tyrosine and tryptophan, the simple aromatic acids that are common in nature (e.g. benzoic and gallic acids) and aromatic aldehydes such as vanillin and benzaldehyde, which contribute to the pungent smell of many plants (Fig. 6.13).
A number of natural product groups can be constructed from the amino acid phenylalanine, in particular the phenylpropenes, lignans, coumarins and flavonoids, all of which possess a common substructure based on an aromatic 6-carbon ring (C6 unit) with a 3-carbon chain (C3 unit) attached to the aromatic ring (Fig. 6.14). Many reactions can occur to this 9-carbon unit, including oxidation, reduction, methylation, cyclization, glycosylation (addition of a sugar) and dimerization, all of which contribute to the value of natural products as a resource of biologically active compounds and enhance the qualities of structural complexity with the presence of chirality and functionality.
Phenylpropenes
The phenylpropenes are the simplest of the shikimic-acid-derived natural products and consist purely of an aromatic ring with an unsaturated 3-carbon chain attached to the ring. They are biosynthesized from phenylalanine by the enzyme phenylalanine ammonia lyase, which, through the loss of ammonia, results in the formation of cinnamic acid. Cinnamic acid may then undergo a number of elaboration reactions to generate many of the phenylpropenes. For example, in Fig. 6.15, cinnamic acid is reduced to the corresponding aldehyde, cinnamaldehyde, which is the major component of cinnamon oil derived from the bark of Cinnamomum zeylanicum (Lauraceae) and used as a spice and flavouring. Cinnamon has a rich history, being used by the ancient Chinese as a treatment for fever and diarrhoea and by the Egyptians as a fragrant ingredient in embalming mixtures.
Cinnamon leaf also contains eugenol, the major constituent of oil of cloves derived from Syzygium aromaticum (Myrtaceae). Clove oil was used as a dental anaesthetic and antiseptic, both of which properties are due to eugenol, and the oil is still widely used for the short-term relief of dental pain. These phenylpropenes may have many different functional groups (e.g. OCH3, O-CH2-O, OH) and the double bond may be in a different position in the C3 chain (e.g. eugenol versus anethole) (Fig. 6.16). They are common components of spices and have pungent aromas, and many are broadly antimicrobial, with activities against yeasts and bacteria. Some members of this class can also cause inflammation.
Lignans
Lignans are low molecular weight dimers formed by the coupling of two phenylpropene units through their C3 side-chains and between the aromatic ring and the C3 chain (Fig. 6.17). A common precursor of lignans is cinnamyl alcohol, which can readily form free radicals and dimerize enzymatically to form aryltetralin-type lignans, of which the compounds podophyllotoxin, 4′-demethylpodophyllotoxin and α- and β-peltatin (from Podophyllum peltatum and Podophyllum hexandrum, Berberidaceae) are examples (Fig. 6.17).
Coumarins
The coumarins are shikimate-derived metabolites formed when phenylalanine is deaminated and hydroxylated to trans-hydroxycinnamic acid (Fig. 6.18). The double bond of this acid is readily converted to the cis form by light-catalysed isomerization, resulting in the formation of a compound that has phenol and acidic groups in close proximity. These may then react intramolecularly to form a lactone and the basic coumarin nucleus, typified by the compound coumarin itself, which contributes to the smell of newly mown hay.
Coumarins have a limited distribution in the plant kingdom and have been used to classify plants according to their presence (chemotaxonomy). They are commonly found in the plant families Apiaceae, Rutaceae, Asteraceae and Fabaceae and, as with all of the natural products mentioned so far, undergo many elaboration reactions, including hydroxylation and methylation and, particularly, the addition of terpenoid-derived groups (C2, C5 and C10 units) (Fig. 6.19).
Some coumarins are phytoalexins and are synthesized de novo by the plant following infection by a bacterium or fungus. These phytoalexins are broadly antimicrobial; for example, scopoletin is synthesized by the potato (Solanum tuberosum) following fungal infection. Aesculetin occurs in the horse chestnut (Aesculus hippocastanum), and phytotherapeutic preparations of the bark of this species are used to treat capillary fragility. Hieracium pilosella (Asteraceae), also known as mouse ear, contains umbelliferone and was used to treat brucellosis in veterinary medicine; the antibacterial activity of this plant drug may in part be due to the presence of this simple phenol (Fig. 6.20). Khellin is an isocoumarin (chromone) natural product from Ammi visnaga (Apiaceae) and has activity as a spasmolytic and vasodilator.
It has long been known that animals fed sweet clover (Melilotus officinalis, Fabaceae) die from haemorrhaging. The poisonous compound responsible for this adverse effect was identified as the bishydroxycoumarin (hydroxylated coumarin dimer) dicoumarol (Fig. 6.21).
The psoralens are coumarins that possess a furan ring and are sometimes known as furocoumarins or furanocoumarins because of this ring. Examples are psoralen, bergapten, xanthotoxin and isopimpinellin (Fig. 6.22).