Safety paradigm: genetic evaluation of therapeutic grade human embryonic stem cells

Emma Stephenson, Caroline Mackie Ogilvie, Heema Patel, Glenda Cornwell, Laureen Jacquet, Neli Kadeva, Peter Braude, Dusko Ilic

Abstract

The use of stem cells for regenerative medicine has captured the imagination of the public, with media attention contributing to rising expectations of clinical benefits. Human embryonic stem cells (hESCs) are the best model for capital investment in stem cell therapy and there is a clear need for their robust genetic characterization before scaling-up cell expansion for that purpose. We have to be certain that the genome of the starting material is stable and normal, but the limited resolution of conventional karyotyping is unable to give us such assurance. Advanced molecular cytogenetic technologies such as array comparative genomic hybridization for identifying chromosomal imbalances, and single nucleotide polymorphism analysis for identifying ethnic background and loss of heterozygosity should be introduced as obligatory diagnostic tests for each newly derived hESC line before it is deposited in national stem cell banks. If this new quality standard becomes a requirement, as we are proposing here, it would facilitate and accelerate the banking process, since end-users would be able to select the most appropriate line for their particular application, thus improving efficiency and streamlining the route to manufacturing therapeutics. The pharmaceutical industry, which may use hESC-derived cells for drug screening, should not ignore their genomic profile as this may risk misinterpretation of results and significant waste of resources.

1. Human embryonic stem cells: the best stem cell model for capital investment in stem cell therapy

Media-fuelled belief that stem cells possess a virtually unlimited restorative power and represent a universal remedy for all diseases has resulted in considerable funding support and scientific progress in stem cell biology over the last decade. However, despite exciting prospects, the potential of stem cells has not yet materialized. Patients with serious degenerative disorders such as Parkinson's disease, diabetes or cardiac disease have been waiting for the fulfilment of these media-hyped promises, but as yet there are a relatively small number of clinical trials despite years of research in this field.

Most commonly, embryonic stem cells (ESCs) are generated from the inner cell mass of the blastocyst (figure 1). In vitro these appear to be immortal, proliferating indefinitely in an undifferentiated, pluripotent state (figure 2). However, ESCs in vivo lose this property as differentiation proceeds and as development and growth-promoting signals change. By adulthood, the few remaining stem cells are dispersed throughout the body and are difficult to locate. However, they seem to be able to continue to generate identical daughter cells and/or tissue cells at each division. These residual pools of stem cells are suggested to be the source of the tissue regeneration and repair that occurs in adults. Tissue-specific stem cells are present in many organs and systems in adult animals although they differ greatly in their ability to self-renew and differentiate. For example, spermatogonial stem cells in the testis are unipotent and produce only one type of differentiated cell—the spermatozoon. By contrast, mesenchymal stem cells are multipotent and can produce adipocytes, osteoblasts, chondrocytes and myocytes in appropriate culture conditions. Unlike their embryonic counterparts, tissue-specific stem cells are not immortal, and show a decreasing capacity to self-renew with increasing age. This limitation has been associated with the reduced ability to repair the damage that accumulates with ageing, possibly owing to exhaustion of the stem cell pool, or as a consequence of inherited or acquired mutations throughout life that impede normal stem cell function.

Figure 1.

Most commonly, human embryonic stem cells are derived from inner cell mass (ICM) of the blastocyst plated on mitotically inactivated fibroblasts.

Figure 2.

Human embryonic stem cell (hESC) colony. hESCs have high alkaline phosphatase activity (green). Actin filaments (red) are visualized with rhodamine X-conjugated phalloidin in both hESCs and surrounding human foreskin fibroblast feeders.

The difficulty of locating these scarce stem cells in a variety of sources, and of expanding their number sufficiently for therapeutic use, is proving a major hindrance for industry in translating the potential of stem cells into regenerative medicine. Pluripotent stem cells, both human ESCs (hESCs) and induced pluripotent stem cells (iPSCs), are exceptions since they can be produced in theoretically limitless quantities, and are therefore capable of providing more cells than any other source, regardless of differentiation efficacy and stabilization. Thus, they are the cell type likely to yield the most from invested capital. Although both types of stem cells are promising, there are still a number of unresolved technical and biological issues that make iPSCs less likely to be the immediate choice for cell-based therapy. This leaves the pluripotent hESCs as the best stem cell model for capital investment for stem cell therapy.

The initial difficulties with legislation to allow use of human embryos for stem cell research, the lack of consensus for reporting the quality and type of embryos suitable for stem cell derivation, and the route to translation have largely been overcome, especially in the UK, where a regulatory route map to facilitate clinical application has recently been produced as a collaborative project between the regulatory bodies involved: Human Fertilisation and Embryology Authority (HFEA), Human Tissue Authority (HTA) and Medicines and Healthcare Products Regulatory Agency (MHRA).

2. Risk assessment of genetic abnormalities in hESC-based cell therapy

Although reliable and validated methods for deriving, growing and preserving hESCs are within reach, the greatest impediment to the provision of hESCs for therapeutic purposes is that full genotypic and phenotypic characterization criteria have not yet been established, on a scale that can be routinely and reliably applied. Despite wide interest in defining the properties of hESCs, comprehensive characterization has been done with only a subset of lines. The popularity of some of the earliest lines derived, such as H1 and H9, both for research and for the generation of clinical grade banks, arises not necessarily because they are superior lines, as these cells were derived on mouse feeders in the presence of serum, but because by default they are the best characterized. Although variations in derivation and culture conditions can account for differences among lines, even those derived under identical conditions do differ. This genotypic instability represents one of the most critical hurdles in the development of hESC lines as therapeutic products, in terms of cell-based therapy as well as high-throughput drug screening, and a risk for investors considering commercialization opportunities.

The gene-expression profile of hESCs explored by sophisticated techniques, such as serial analysis of gene expression, expressed sequence tag enumeration, microarray analysis and massively parallel signature sequencing, has revealed surprisingly large variations among lines (Adewumi et al. 2007; Allegrucci & Young 2007; Lefort et al. 2008; Närvä et al. 2010). Although many such studies have been useful in identifying ‘master’ genes of pluripotency, no consistent clustering could be ascribed to variations in chromosomal stability or differentiation propensity of the hESC lines. The number of differences detected is likely to reflect random fluctuations, and no approach has emerged to obviate the establishment of a cell line signature. Both inherent instability within the embryos used to derive the lines and selection pressures owing to culture conditions could contribute to this variation. To define what is an acceptable cell signature, extensive high-resolution genomic analyses should be done on a large number of hESC lines at early and late passages, cultured under various conditions. Only meta-analysis of such studies might lead to a conclusion.

3. Embryos used for hESC derivation

From 2004, the Medical Research Council awarded a number of strategic grants to UK centres in an effort to facilitate stem cell research. These grants not only provided funds to set up stem cell laboratories in proximity to in vitro fertilization (IVF) units, but also facilitated collaboration between IVF units that had access to human embryos surplus to therapeutic need and scientists trying to derive stem cells. Since it was a requirement of HFEA licences for stem cell research that a sample of any line derived should be deposited in the UK Stem Cell Bank (UKSCB) for international use in research, it also complied with the spirit of the HFEA Act, whereby the number of embryos used for research should be limited by necessity. A uniform patient information and consent process was introduced with the formation of the hESC coordinators network (hESCCO; Franklin et al. 2008). The requirement to have an arms-length trained person to talk to interested patients about the research not only ensured rigour of informed consent, but also streamlined the process and facilitated national collaboration between centres to increase the number of embryos available for stem cell research.

These goals have largely been realized with five UK centres currently attempting to derive hESC lines within a GMP (good manufacturing practice) environment. If cells are intended for human use, the premises and processes are required to conform to the Human Tissue Act (2004) and thus be licensed by the HTA, and also to conform to practices approved by the MHRA. These require the basic information about the donors of the sperm and eggs, in order to minimize risks of transmission of adventitious agents. The European Directive on Cells and Tissues also requires that there should be basic relevant information from the patients' histories, and as a minimum testing for human immunodeficiency virus (HIV) 1 and 2, hepatitis B and C, and syphilis. It is argued that requiring more detailed personal information as is required for blood donors would not only be intrusive but also likely to be unreliable as the ‘donors’ are seen together and may be unlikely to reveal truthful information about the previous or present sexual environment. Similarly, it is neither easy nor necessarily practical to return to the donor couple if further information or tests related to stem cells are needed—these couples are ‘incidental’ donors, as their primary reason for attending IVF is to have a baby. Additionally, as only a minute proportion of embryos generated for IVF will ever be used for stem cell derivation, it is argued that retaining a sample of serum from each patient for later testing would be impractical. Stem cell lines are unique among tissues for donation and therapy in that they can be amplified and stored before release, thus allowing additional post hoc testing for safety. Although one can never ‘test safety into a product’, the availability of large amounts of material and development of new in vitro tests should substantially reduce risk, and be better than inference from unreliable medical histories, or limited testing on pre-derivation serum samples.

4. Inherited and acquired structural chromosomal abnormalities

Chromosomal instability has been reported for several widely used hESC lines (H1, H7, H14, HS181, HS237, SA002.5, hESC5 and BG01—e.g. Hanson & Caisander 2005; Allegrucci & Young 2007; Baker et al. 2007; Catalina et al. 2008). These karyotypic changes emerged beyond passage 13 and were in general losses or gains of whole chromosomes (aneuploidy), rather than structural rearrangements within a diploid karyotype. However, other studies reported a lack of karyotypic changes in a variety of hESC lines (SA001, hES1-6, BG02, BG03, SA003, SA121, SA461, HS235) even when grown for between 34 and 140 passages. Although this cytogenetic resilience of some hESC lines might arise from particular aspects of cell culture (passage methods, presence versus absence of feeders, and so on), it might also reflect an inherent genetic predisposition of some hESC lines to chromosomal instability. In either event, in order to use them as models, there is a legitimate need to identify patient/lineage-specific properties in the starting material and any pathological phenotypes in disease-specific pluripotent cell lines.

4.1. Intrinsic factors—inherent instability

hESC lines are derived from embryos made available following IVF for the treatment of female and male infertility, or from preimplantation genetic diagnosis (PGD) treatment cycles undertaken to avoid transmitting a known genetic disease. Physical or hormonal problems may explain some causes of infertility, but in many cases the reason why a couple is unable to conceive remains unknown; one explanation may be the genetic status of the embryos. Chromosomal abnormality is common in human embryos. Over 50 per cent of preimplantation IVF embryos have chromosomal abnormalities such as aneuploidy, polyploidy and haploidy (Munné et al. 1994; Harper et al. 1995; Marquez et al. 2000; Ilic et al. 2010). Most of these abnormalities are not compatible with embryo development. This is one explanation for why under half of all human conceptions result in live births (Edmonds et al. 1982) and for the relatively low implantation rate after IVF, even when morphologically high-quality embryos are transferred (Gianaroli et al. 1999; Kahraman et al. 2000; Voullaire et al. 2007). It might also be reasonable to speculate that some infertility might be caused by undiagnosed genetic mutations, which affect either embryo quality or the implantation process. Thus, an embryo created by IVF, and consequently the hESC lines derived from it, might also be affected by such abnormalities or mutations. Although genetic links between impaired fertility and susceptibility to various degenerative diseases are yet to be established, the prevalence of balanced structural chromosomal abnormalities in patients seeking IVF or PGD treatment is higher than in the general population. This further highlights the need for comprehensive genetic analyses to become a routine part of hESC line characterization. Alterations at the chromosome level might result in undesirable consequences for the recipients of stem cell therapies (Amariglio et al. 2009).

4.2. Extrinsic factors

Normal human cells in vivo have a rate of spontaneous mutation of 10⁻⁷ to 10⁻⁸ per nucleotide per cell division. Since there are approximately 3 × 10⁹ nucleotides per haploid human genome, between 30 and 3000 mutations could occur per cell at each cell cycle (Lefort et al. 2009). Fidelity of DNA replication in vitro is likely to be even lower.
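The arithmetic behind this estimate is simply the per-nucleotide mutation rate multiplied by the genome size. The short sketch below works it through; the function name is ours, and the rate of 10⁻⁶ is not quoted in the text but is included as an assumption to show which rate would correspond to the upper figure of 3000. With the two rates as quoted, the product is 30 to 300 per haploid genome, so the bounds are best read as order-of-magnitude estimates.

```python
# Approximate number of nucleotides per haploid human genome (figure from the text).
HAPLOID_GENOME_NT = 3e9


def expected_mutations(rate_per_nt: float, genome_nt: float = HAPLOID_GENOME_NT) -> float:
    """Expected new mutations per cell division = per-nucleotide rate x genome size."""
    return rate_per_nt * genome_nt


if __name__ == "__main__":
    # 1e-8 and 1e-7 are the rates quoted in the text; 1e-6 is an added
    # assumption, shown only for comparison with the upper bound of 3000.
    for rate in (1e-8, 1e-7, 1e-6):
        print(f"rate {rate:.0e} per nucleotide: about "
              f"{expected_mutations(rate):.0f} mutations per division")
```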

The predominant spontaneous mutation in ESCs is the loss of heterozygosity (LOH; Cervantes et al. 2002). In somatic cells, LOH is mediated mostly by mitotic recombination, which is suppressed in ESCs. Chromosome loss/reduplication leading to uniparental disomy (UPD) represents more than half of the LOH events in ESCs, a pathway that is not commonly observed in other somatic cells. Culture parameters and manipulation techniques have been shown to have an effect on gene-expression patterns and genomic integrity in both embryos and stem cells. For example, the hESC lines H1 and H14 grown in a laboratory at the University of Wisconsin predominantly gained an extra chromosome 12, whereas at the University of Sheffield the same lines gained an extra chromosome 17 (Werbowetski-Ogilvie et al. 2009).
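One way such LOH events could be screened for with single nucleotide polymorphism data is to compare genotype calls for the same line at an early and a late passage. The following is a minimal sketch under stated assumptions, not a validated pipeline: the two-letter genotype encoding, the SNP identifiers and the function names are hypothetical. Genotype calls alone cannot distinguish copy-neutral LOH (such as UPD) from a deletion; that would additionally require copy number data.

```python
from typing import Dict, List

BASES = set("ACGT")


def is_heterozygous(genotype: str) -> bool:
    """True for a two-allele call with different bases, e.g. 'AG'."""
    return (len(genotype) == 2 and set(genotype) <= BASES
            and genotype[0] != genotype[1])


def is_homozygous(genotype: str) -> bool:
    """True for a two-allele call with identical bases, e.g. 'AA'.
    No-calls such as 'NN' are neither heterozygous nor homozygous."""
    return (len(genotype) == 2 and set(genotype) <= BASES
            and genotype[0] == genotype[1])


def loh_sites(early: Dict[str, str], late: Dict[str, str]) -> List[str]:
    """SNP ids that are heterozygous at the early passage but homozygous later."""
    sites = []
    for snp_id, early_call in early.items():
        late_call = late.get(snp_id)
        if late_call is None:
            continue  # SNP not typed at the late passage
        if is_heterozygous(early_call) and is_homozygous(late_call):
            sites.append(snp_id)
    return sites


if __name__ == "__main__":
    # Hypothetical genotype calls, for illustration only.
    early_passage = {"snp1": "AG", "snp2": "CC", "snp3": "CT", "snp4": "AG"}
    late_passage = {"snp1": "AA", "snp2": "CC", "snp3": "CT", "snp4": "NN"}
    print(loh_sites(early_passage, late_passage))
```

In practice, a long run of adjacent LOH sites on one chromosome would be more informative than isolated sites, which may simply be genotyping errors.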

4.2.1. Oxygen tension

Both preimplantation embryos from which hESC lines are derived and hESC lines themselves are particularly sensitive to the oxygen tension to which they are exposed in vitro. It is well known that an increase in free oxygen radicals has an adverse effect on fertilization rates, embryo quality and growth in vitro. However, even though hESCs have been shown to grow as well under 3 or 5 per cent oxygen as at 21 per cent, corroborating results have shown reduced spontaneous differentiation, enhanced cell proliferation, and increased plating, freezing and thawing efficiency after maintenance in low-oxygen tension for more than 14 passages (Ezashi et al. 2005). Furthermore, oxygen tension of 2 per cent has been shown to enhance hESC clonal recovery and significantly reduce the acquisition of spontaneous chromosomal abnormalities (Forsyth et al. 2006). Since hESCs have metabolic characteristics of preimplantation embryos, which must rely on anaerobic metabolism instead of oxidative phosphorylation to produce adenosine-5′-triphosphate at 1–5% oxygen tension in the uterus, they have fewer and smaller mitochondria with poorer cristae (Oh et al. 2005; St John et al. 2005). This indicates that the hESCs may not be as well protected against DNA damage caused by reactive oxygen species and free radicals, which are generated during normal oxidative cell metabolism. Indeed, undifferentiated hESCs have a much lower expression of Cu/Zn-superoxide dismutase, glutathione peroxidase 1, and peroxiredoxin 1 and 2, which are enzymes that normally protect cells against such damage (Cho et al. 2006).

4.2.2. Use of enzymes for disaggregating colonies

Several groups have reported that the technique used for passaging cells can induce alterations in karyotype (Draper et al. 2004; Hoffman & Carpenter 2005; Mitalipova et al. 2005). Comparison of mechanical cutting of colonies and the use of enzymes for passage in the same hESC line suggested that mechanical cutting better supports maintenance of a normal karyotype in long-term culture; the enzyme-treated cell line was more likely to develop karyotype abnormalities (most often trisomy of chromosomes 12 and 17). The type of enzyme used does not seem to play a role; the phenomenon is observed with all three enzymes commonly used for hESC passage: collagenase, trypsin and accutase. Although it is unclear why the enzyme-based methods favour gross chromosomal rearrangements when this is not seen with other cell types such as fibroblasts, which are passaged in a similar way in vitro, the most likely explanation is that it is linked to the well-established dynamic reciprocity between architectural integrity through cell adhesion and karyotype stability (Tlsty 1998). hESCs are polarized and express an epithelial plasma membrane protein profile (Krtolica et al. 2007; Van Hoof et al. 2008). Frequent disruption of cell–cell contacts and polarity may induce karyotype rearrangements and DNA damage that would, in differentiated cells, lead to cell death. However, in order to favour cell proliferation as in early embryos in vivo, DNA-repair mechanisms that protect the genome from endogenous and exogenous factors that induce DNA damage and other genotoxic insults are not fully developed in ESCs. The activity of genes regulating chromosome segregation, the cell cycle and apoptosis fluctuates during human preimplantation development (Wells et al. 2005). The guardian of the genome, the tumour suppressor gene p53, which prevents the accumulation of genetic mutations in somatic cells by inducing cell cycle arrest, apoptosis or senescence, does not function in ESCs in response to DNA damage in the same way as in somatic cells. Instead of p53-dependent apoptosis or cell cycle arrest at the G1/S checkpoint following DNA damage, in ESCs p53 suppresses nanog expression and pushes the cells towards differentiation (Aladjem et al. 1998; Song et al. 2010; Zhao & Xu 2010).

Therefore, in order to avoid gross abnormalities, it seems that hESC lines are best passaged manually using mechanical splitting until karyotypically normal frozen stocks are established. However, even manual passage does not protect against structural rearrangements such as microdeletions or duplications. For example, according to fluorescence in situ hybridization analysis, an amplification of 2.5–4.6 Mb at 20q11.21, encompassing 23 genes, has been reported in the oldest hESC line H1 as early as passage 24 (Lefort et al. 2008). The number of rearrangements was relatively high at passage 56, even though the passaging had been done manually (Hovatta et al. 2010). However, it is possible that these imbalances had been present from the beginning, since no analyses with such a resolution had been done on earlier passages. Sub-populations of hESCs that accumulate karyotype alterations usually have a growth advantage. Occasionally, though unfortunately not always, they can be selected against when the cells are passaged mechanically and the operator is able to make the decision about splitting based on morphological criteria. Before scaling-up cell expansion for the purpose of cell therapy, we have to be certain that the genome of our starting material is stable and normal. The resolution of G-banded karyotype is unable to give such assurance.

4.2.3. Feeder-free culture

In one study, which looked at parameters of functional adaptation to growth directly on a plastic surface during long-term culture of hESCs, successful adaptation was achieved but was paralleled with a karyotype change in 100 per cent of the cells (Imreh et al. 2006). Similarly, initial attempts at derivation in feeder-free-defined conditions resulted in hESC lines with unstable karyotypes (Ludwig et al. 2006). A microenvironment, which retained a three-dimensional extracellular matrix, fared much better in the maintenance of normal karyotype (Ilic 2006). Matrigel, a solubilized basement membrane preparation extracted from the Engelbreth-Holm-Swarm mouse sarcoma, a tumour rich in extracellular matrix proteins, supports large-scale propagation of hESCs without leading to chromosomal imbalances, especially if the cells are passaged mechanically (Xu et al. 2001). This reinforces the importance of the dynamic reciprocity between cell adhesion and karyotype stability as discussed above.

4.2.4. Cryopreservation

Repeated freeze–thaw cycles, whether by conventional slow-cooling methods or vitrification, are unavoidable when banking and expanding cells. Cryo-damage occurs owing to the formation of orderly ice-crystalline lattice structures. Although hESCs are infamous for relatively low recovery upon thawing, the extent to which cryopreservation affects hESC genomic integrity is unknown. The most widely used cryoprotectant, dimethylsulphoxide, modifies the epigenetic profile of ESCs and increases the production of free radicals that can affect DNA replication and chromatin structure (Iwatani et al. 2006; Diaferia et al. 2008). Ice crystals and gas bubbles formed during thawing can disrupt spindles and induce abnormal segregation of chromosomes (Diaferia et al. 2008). Evidence from zebrafish studies suggests that cryopreservation increases the frequency of mutations in the mitochondrial genome by nearly fivefold (Kopeika et al. 2005). If the normal rate of spontaneous mutations is between 30 and 3000 per cell during each cell cycle, and a comparable fivefold increase applied across the whole genome, cryopreservation may result in 150–15 000 mutations in each surviving cell; a sufficiently sensitive system is needed to detect them.

5. Current criteria for hESC line acceptance into stem cell banks

Lack of applied technology has resulted in the deposition of most hESC lines in both UK and USA stem cell banks with no more than basic information such as G-banded karyotype and marker expression. This is a cause for the anxiety of manufacturers and end-users and is one factor that may have prevented hESC utilization at the rapid pace that befits their enormous potential. Since such an enormous investment is needed to take a cell line through to a medicinal product, manufacturers need to know as much as possible about the available cell lines in order to select the one most likely to be suitable for developmental needs.

For centres deriving hESC lines in the UK, it is a condition of the HFEA licence that a sample must be deposited in the UKSCB. Therefore, the UKSCB cannot enforce a minimum requirement for characterization of research grade lines, but can make a decision as to which to process further or which to simply archive. However, to deposit a line to be declared of clinical grade, a minimum characterization requirement needs to be established and is being considered to include passage number before submission, DNA fingerprinting, karyotype, viral and sterility testing, viability assays and expression of pluripotent markers.

Advantages cited for the inclusion of cell lines in an international bank, such as the UKSCB, include promotion of the wider use of the cell line by providing stringently tested, well-characterized cells within a quality framework, detailed assessment using standardized methodologies and optimized culture, preservation and characterization through a targeted research programme (www.ukstemcellbank.org.uk). In order for this mission to be achieved, cells will need to be supplied to the UKSCB with as much information as possible, which should include a detailed genotype. In the report describing their derivation of some of the few clinical grade stem cell lines that exist, the A*Star Singapore team provide no more than routine characterization (G-banded karyotype and pluripotent/differentiation marker expression by immunocytochemistry, polymerase chain reaction and fluorescence activated cell sorting), which seems insufficient for confident selection (Crook et al. 2007). There is now a legitimate need to identify lineage-specific properties in the starting material using modern molecular methods for every newly derived hESC line before the line is deposited in national stem cell banks in order to use hESCs as a source in cell-based therapy.

6. Molecular karyotyping

6.1. Copy number variations: molecular diagnostics

For the last 50 years, karyotype analysis has been the ‘gold standard’ for detection of chromosome anomalies. From plain staining and counting, to extended chromosomes, enzymatically treated and stained to give detailed band-by-band comparison along homologous pairs, the methodology for producing chromosome preparations has been improved over the years (Yunis 1976). However, even with optimal conditions, the resolution of this technique is considered to be approximately 3 Mb, while in certain regions of the genome, imbalances of 10 Mb can be hard to detect. G-banded karyotype analysis requires lengthy training for the operatives, and is very subjective, depending on highly developed pattern recognition skills in the analyst. In addition, in shorter chromosome preparations, such as those typically obtained from cell lines, only abnormalities of whole chromosome copy number and very large imbalances can confidently be assigned (figure 3).

Advances in molecular cytogenetic technologies have recently led to the introduction of array comparative genomic hybridization (CGH) as either an add-on test following karyotype analysis for the detection of chromosome imbalance (Edelmann & Hirschhorn 2009) or, in an increasing number of centres, as a first-line test (Ahn et al. 2010). This rapid replacement of karyotype analysis by array CGH testing is fuelled by the greatly increased resolution of array CGH, and the objectivity and ease of the analytical process. Various array CGH platforms are commercially available; most diagnostic laboratories are in agreement that platforms comprising oligonucleotide probes have better resolution and provide more accurate and reproducible results than those comprising bacterial artificial chromosome clones (Ylstra et al. 2006; Ahn et al. 2010). Oligonucleotide array platforms are available in a number of formats, with different probe densities across the genome. The resolution of the test will therefore depend on the platform of choice. In our experience, an oligonucleotide platform with 44 000 probes can detect regions of imbalance down to approximately 25 kb, a very considerable improvement over the resolution of karyotype analysis (figure 3).
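To illustrate why resolution depends on probe density, the sketch below shows one simple way in which imbalances could be called from array CGH data: each probe yields a log2 ratio of test to control signal, and a region is called only when several consecutive probes deviate in the same direction. This is an illustrative simplification, not the algorithm of any particular platform; the threshold of 0.3, the minimum of three probes and the function name are assumptions. The only fixed values are the theoretical log2 ratios for a single-copy gain or loss in a diploid, non-mosaic sample.

```python
import math
from typing import List, Sequence, Tuple

GAIN_LOG2 = math.log2(3 / 2)  # about 0.58: three copies against two in the control
LOSS_LOG2 = math.log2(1 / 2)  # -1.0: one copy against two in the control


def call_imbalances(log2_ratios: Sequence[float],
                    threshold: float = 0.3,
                    min_probes: int = 3) -> List[Tuple[int, int, str]]:
    """Return (first_probe, last_probe, 'gain' or 'loss') for every run of at
    least min_probes consecutive probes beyond +threshold or -threshold.
    Probes are assumed to be ordered along one chromosome; threshold must be > 0."""
    calls = []
    run_start, run_state = None, None
    # A trailing 0.0 acts as a sentinel so that a run reaching the end is closed.
    for i, value in enumerate(list(log2_ratios) + [0.0]):
        if value > threshold:
            state = "gain"
        elif value < -threshold:
            state = "loss"
        else:
            state = None
        if state != run_state:
            if run_state is not None and i - run_start >= min_probes:
                calls.append((run_start, i - 1, run_state))
            run_start, run_state = i, state
    return calls


if __name__ == "__main__":
    # Hypothetical log2 ratios for eleven consecutive probes.
    ratios = [0.02, -0.05, 0.61, 0.55, 0.60, 0.01, -0.9, 0.03, -1.1, -0.95, -1.02]
    print(call_imbalances(ratios))  # [(2, 4, 'gain'), (8, 10, 'loss')]
```

Under this scheme the smallest detectable region is roughly the minimum number of probes multiplied by the local probe spacing, which is why denser platforms resolve smaller imbalances. The single deviating probe at index 6 is ignored as probable noise.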

Research studies over the last few years on populations of individuals with no clinical abnormalities have revealed that copy number variation (CNV) in the genome is extremely widespread, with many ‘normal’ people carrying at least one CNV (Iafrate et al. 2004; Feuk et al. 2006). Data on CNVs present in individuals with clinically abnormal phenotypes are collected by a number of central resources such as the DatabasE of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources (DECIPHER; https://decipher.sanger.ac.uk/application/) and the International Standard Cytogenomic Array (ISCA) Consortium based at Emory University in Atlanta, Georgia (https://isca.genetics.emory.edu/iscaBrowser/). The ISCA database complements the collection of benign CNVs in the population held in the Database of Genomic Variants, hosted by the Hospital for Sick Children in Toronto. Recently, the Center for Applied Genomics at the Children's Hospital of Philadelphia has introduced a database of CNVs found in healthy populations with normal phenotypes. Information on the clinical significance of CNVs present in individuals with clinically abnormal phenotypes, and their potential relevance to, for instance, complex disease susceptibility, is therefore still emerging.

Online resources have been developed to aid in the interpretation of CNVs; likely pathogenicity can be assessed by such factors as size and gene content. However, small regions containing no known genes cannot necessarily be dismissed as benign, owing to the possibility of position effects on nearby genes, or the presence of controlling factors. Pasting the basepair coordinates of any CNV into a resource such as the Database of Genomic Variants (Iafrate et al. 2004) allows visualization of the region, including all the known genes therein; tracks showing features of genomic architecture can be activated if required. Any published studies showing imbalance for the region in control populations can also be visualized. In addition, click-through links for information on each gene, and links providing information on syndromes and important disease genes can be used for additional interpretation of the likely effect of the imbalance.
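The lookup that such resources perform on a pasted set of basepair coordinates is, at its core, an interval overlap between the CNV and annotated genes. A minimal sketch follows, assuming 1-based inclusive coordinates on a single genome build; the gene names and positions are placeholders and not real annotations, and the function names are ours.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str = ""


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_in_cnv(cnv: Region, genes: List[Region]) -> List[str]:
    """Names of all annotated genes wholly or partly inside the CNV."""
    return [gene.name for gene in genes if overlaps(cnv, gene)]


if __name__ == "__main__":
    # Placeholder annotations, for illustration only.
    annotations = [
        Region("chr20", 29_500_000, 29_600_000, "GENE_A"),  # wholly inside
        Region("chr20", 30_900_000, 31_200_000, "GENE_B"),  # partly inside
        Region("chr20", 32_000_000, 32_100_000, "GENE_C"),  # outside
        Region("chr12", 29_500_000, 29_600_000, "GENE_D"),  # other chromosome
    ]
    cnv = Region("chr20", 29_000_000, 31_000_000, "example_gain")
    print(genes_in_cnv(cnv, annotations))  # ['GENE_A', 'GENE_B']
```

An empty result from such a lookup does not establish that a CNV is benign, for the reasons given above: position effects and regulatory elements are not captured by gene content alone.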

Many CNVs are inherited from disease-free parents, but the possibility of effects of genetic background and environmental factors on the expression of phenotype means that these inherited CNVs cannot necessarily be dismissed as benign. De novo benign and pathological CNVs are likely to arise for the most part at meiosis or in very early development, as mosaicism for these CNVs is quite rare. However, such mosaicism has been found (Ballif et al. 2006), indicating that post-zygotic events can give rise to de novo CNVs; nevertheless, CNVs seem to be generally stable in mature lineages.

Characterization of hESC lines prior to deposition in central cell banks is essential. Thus far, G-banded karyotype analysis has been the method of choice for establishing the presence or absence of chromosome anomalies in these cell lines. However, chromosome preparations from these cell lines are notoriously hard to generate, resulting in much time-consuming and in many cases ultimately fruitless work. When successful, the quality of the preparations is almost invariably poor, with short, highly condensed chromosomes, allowing only very low resolution analysis. It seems clear that the improved technology now in place in cytogenetic laboratories should be used to improve the detail and accuracy of the characterization of stem cells. Array CGH investigation of human stem cell lines has already been described by some groups (Xu et al. 2001; Wu et al. 2008; Elliott et al. in press). Array CGH testing of primary cultures, with follow-up testing if required of different passages, would give an overall picture of the CNVs present, and track any changes in the CNV profile engendered by the culture conditions, and/or any differentiation protocols used. Obviously, this testing would not only detect submicroscopic CNVs, but would also detect large chromosome imbalances sometimes found in long-term culture, such as whole chromosome trisomies.

With the introduction of array CGH testing into routine diagnostic cytogenetics laboratories, the expertise and technology for efficient and accurate analysis is widely available. Platforms in use in different laboratories will vary, but providing that the platforms, control DNA source, analysis algorithms and genome build used for any test are fully documented with the CNV profile for each cell line, then cross-platform comparisons will be valid, and the gene content of any CNV can be accessed from online resources (see above). Medium-throughput batch testing of patient diagnostic samples means that stem cell analysis could generally be incorporated with minimum effect on work-flow. Using the resources described above, documentation of gene content and interpretation of the CNVs found could be carried out by the array CGH laboratory, the stem cell laboratory and/or any investigators wishing to use the banked stem cells.
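The documentation requirement described above could be captured in a small record stored with each CNV profile. This is a sketch only: the field names, the example values and the helper functions are assumptions made for illustration, not an existing banking standard. The second helper reflects the fact that basepair coordinates reported against different genome builds need conversion before they can be compared directly.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ArrayRunRecord:
    """Metadata to store alongside the CNV profile of one cell line."""
    cell_line: str
    platform: str
    control_dna: str
    analysis_algorithm: str
    genome_build: str


def missing_fields(record: ArrayRunRecord) -> List[str]:
    """List the names of any metadata fields that were left blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


def same_coordinate_system(a: ArrayRunRecord, b: ArrayRunRecord) -> bool:
    """True when both records are fully documented and share a genome build,
    so CNV coordinates can be compared without prior conversion."""
    return (
        not missing_fields(a)
        and not missing_fields(b)
        and a.genome_build == b.genome_build
    )


# Hypothetical example records; all values are placeholders.
run_1 = ArrayRunRecord("LINE_1", "platform_X", "pooled_reference", "algorithm_Y", "build_36")
run_2 = ArrayRunRecord("LINE_1", "platform_Z", "pooled_reference", "algorithm_W", "build_36")
run_3 = ArrayRunRecord("LINE_2", "platform_X", "", "algorithm_Y", "build_36")

print(same_coordinate_system(run_1, run_2))  # True
print(missing_fields(run_3))                 # ['control_dna']
```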

The consumable and hardware costs associated with array CGH testing are considerably higher than for karyotype analysis. However, these costs are falling, driven by commercial competition, advances in the technology of array manufacture and roll-out of testing, leading to larger orders. Conversely, array CGH labour costs are minimal when compared with culture and chromosome preparation and analysis from stem cell lines; labour costs for culture and karyotype analysis are unlikely to fall. There are currently no standardized overall costs for either karyotype analysis or array CGH testing; these costs are highly dependent on local conditions at testing laboratories. However, by using parsimonious hybridization strategies, robotics and batch testing, some laboratories are already able to offer array CGH testing at the same price as karyotype analysis (Ahn et al. 2010). Sending stem cell samples to an existing centre that carries out routine array CGH analysis is therefore a cost-effective and efficient way of collecting the information necessary for proper characterization of genetic imbalance in these cell lines.

6.2. Single nucleotide polymorphism

Single nucleotide polymorphisms (SNPs) are stable substitutions of a single base with a frequency of more than 1 per cent in at least one population. They are located throughout the genome and occur approximately once every 1200 bases. SNP identification has benefited from the availability of sequence data from the human genome project as well as the development of high-throughput SNP genotyping platforms, which allow low-cost whole-genome analysis. SNP genotyping arrays are frequently used in association with linkage studies to locate genes involved in disease. Information on SNP variants is freely available at dbSNP (http://www.ncbi.nlm.nih.gov/snp/; Sherry et al. 2001) with the latest build containing over 23 million refSNPs, of which over 14 million have been validated (Database of Single Nucleotide Polymorphisms (dbSNP), Bethesda (MD): National Center for Biotechnology Information, National Library of Medicine (dbSNP Build ID: 131)).

6.2.1. Ethnic diversity

The International HapMap Project (http://hapmap.ncbi.nlm.nih.gov/index.html.en) was set up to catalogue genetic similarity and variation in humans by comparing genomic sequence in individuals from four populations of African, Asian and European ancestry (International HapMap Consortium 2003). The data show that there is ample evidence of SNP allele frequency differences between distinct ethnicities (International HapMap Consortium 2005). The publicly available data have the potential to be used to identify genes affecting health and disease, as well as those that influence response to drug therapies and vaccinations based on an assumption that haplotype frequencies vary between groups with differing vaccine or drug responses. The hope is that this will lead to improved treatments based on likely response to intervention, ultimately leading to a personalized medicine approach. The latest build of dbSNP contains over 14 million validated SNPs in the human genome, but interrogating each of these would be prohibitively expensive for most projects.

Figure 3.

Scale illustration of genomic analyses: a one-penny sterling coin (diameter 20.32 mm) lying on the road from London (A) to Oxford (B), which is about 100 km long, corresponds to 20 bp on human chromosome 13 (100 million bp). The smallest human chromosome (Y) spans about 51 million bp (51 Mb), whereas the largest (chromosome 1) has about 279 million bp (279 Mb) (International Human Genome Sequencing Consortium 2001; Venter et al. 2001). Human chromosome 13 spans about 100 million bp (100 Mb). The average resolution of a conventional Giemsa-stained karyotype (G-banding) is 5–10 Mb, whereas molecular karyotyping has a much higher resolution, averaging around 100 kb, depending on the number of probes spotted on the arrays. (a) 100 Mb on the chromosome corresponds to 100 km on the map. (b) 10 km on the map corresponds to 10 Mb (G-banding resolution). (c) 100 m on the map corresponds to 100 kb (molecular karyotyping resolution). (d) One-penny image: deep-sequencing resolution can distinguish a single basepair mutation (1 mm corresponds to 1 bp).
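The proportions in Figure 3 follow from a single conversion factor, as the short worked example below shows. The constant and function names are hypothetical and chosen only for this illustration; the numbers are those given in the caption.

```python
# Figure 3 maps chromosome 13 (about 100 million bp) onto a 100 km road.
ROAD_LENGTH_MM = 100 * 1_000_000   # 100 km expressed in millimetres
CHR13_LENGTH_BP = 100_000_000      # about 100 Mb

MM_PER_BP = ROAD_LENGTH_MM / CHR13_LENGTH_BP  # 1.0 mm of road per basepair


def bp_to_mm(bp: float) -> float:
    """Convert a genomic length in basepairs to road distance in millimetres."""
    return bp * MM_PER_BP


def mm_to_bp(mm: float) -> float:
    """Convert a road distance in millimetres to a genomic length in basepairs."""
    return mm / MM_PER_BP


print(bp_to_mm(10_000_000) / 1_000_000)  # G-banding, 10 Mb, in km: 10.0
print(bp_to_mm(100_000) / 1_000)         # molecular karyotyping, 100 kb, in m: 100.0
print(round(mm_to_bp(20.32)))            # one-penny coin, in bp: 20
```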

The HapMap initiative has enabled the development of a haplotype (i.e. SNPs that tend to be inherited together as a block) map of the human genome with over 3.1 million SNPs (International HapMap Consortium 2007), and importantly it has identified a subset of informative SNPs, termed tag SNPs, which can uniquely identify the haplotypes. By only genotyping these tag SNPs (estimated to number approximately 500 000), information regarding the surrounding SNPs (or those in linkage disequilibrium) is captured, without the need to genotype every known SNP, but with whole genome coverage. Park et al. (2007) developed an approach to identify ethnically variant SNPs (ESNPs) using data from the HapMap project. The ESNPs are available in the SNP@Ethnos database (http://bioportal.kobic.re.kr/SNPatETHNIC/), which contains over 100 000 variant SNPs that appear uniquely in each ethnic group.

In an era of investment in pharmacogenomics/pharmacogenetics studies, it is essential to have well-characterized hESC lines, which have many possible applications in toxicology and pharmacology, including screening of compounds during drug development prior to clinical trials. A greater understanding of pharmacogenomics will have a major impact on success and offers the possibility of better patient outcomes with the ability to predict which drugs/vaccines are likely to be effective in individuals of particular genotypes for genes affecting drug metabolism. However, potential benefits of hESC lines in research are currently hampered by the lack of ethnic diversity of available lines (Laurent et al. 2010). With accumulating population-specific SNP data available, it is now straightforward to identify the ethnic origin of a given cell line, as recently demonstrated by Laurent et al. (2010) and Mosher et al. (2010). This enables a more targeted approach through the inclusion of hESC lines derived from populations relevant to the study being conducted.

6.2.2. UPD—a form of LOH

With the development of algorithms to interrogate data generated from SNP arrays (Zhao et al. 2004; Nannya et al. 2005), it is possible to determine chromosome copy number, enabling the detection of aneuploidy and other chromosomal anomalies, such as UPD, with a resolution that is intermediate between cytogenetic techniques and DNA sequencing. This might be particularly useful since hESCs in culture have been shown to acquire small chromosomal amplifications or deletions (Maitra et al. 2005; Lefort et al. 2008; Spits et al. 2008; Hovatta et al. 2010; Närvä et al. 2010).

UPD, a condition in which both alleles have originated from a single parent, occurs as heterodisomy and isodisomy (Engel 1980; Robinson 2000). Heterodisomy, when sequences from both homologues from the transmitting parent are present, might cause abnormalities only if the genes within the involved region are subject to genomic imprinting. Although rare, disruptions of normal gene expression owing to UPD are known; for instance, maternal UPD for chromosome 15 gives rise to Prader-Willi syndrome (OMIM 176270), while paternal UPD for the same chromosome leads to Angelman syndrome (OMIM 105830). Isodisomy, when two identical sequences from one parental homologue are present, could allow transmission of recessive mutations from a heterozygous parent and a number of such debilitating genetic conditions have been reported, including cystic fibrosis, haemophilia A, spinal muscular atrophy and various endocrine neoplasias (Robinson 2000). Isodisomy can arise either following correction of meiotic monosomy (‘monosomy rescue’), leading to whole chromosome isodisomy, or as a consequence of somatic recombination at mitosis, leading to segmental isodisomy.

Very few hESC lines have been analysed for the presence of UPD. However, among the 17 lines analysed by Närvä et al. (2010), one, FES 21, had identical q arms on chromosome 16, whereas all the others had a heterozygous set of chromosomes.
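The idea behind screening SNP array data for LOH can be illustrated by searching for long uninterrupted runs of homozygous genotype calls along a chromosome. The sketch below is a deliberately simplified illustration and is not the method of Närvä et al. (2010), Zhao et al. (2004) or Nannya et al. (2005); the function name, the two-letter genotype encoding and the run-length threshold are assumptions. Practical algorithms must also account for genotyping error, marker spacing and population allele frequencies, and need copy number information to distinguish copy-neutral LOH such as isodisomy from a deletion.

```python
from typing import List, Tuple


def homozygous_runs(genotypes: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, inclusive, for every run of at least
    `min_len` consecutive homozygous calls (e.g. 'AA' or 'BB')."""
    runs = []
    start = None  # index where the current homozygous run began, if any
    for i, call in enumerate(genotypes):
        is_homozygous = len(call) == 2 and call[0] == call[1]
        if is_homozygous:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_len:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical genotype calls for consecutive SNPs along one chromosome arm.
calls = ["AB", "AA", "BB", "AA", "AA", "AB", "BB", "AB"]
print(homozygous_runs(calls, min_len=3))  # [(1, 4)]
```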

7. Conclusions

It is well known that some hESC lines are easier to maintain in culture, whereas others readily undergo spontaneous differentiation. These characteristics govern investigators' choice as to which hESC line they will use in their study. From the first derivation of hESCs (Thomson et al. 1998), the Thomson lines H1, H7 and H9 became the most widely used, simply because they are easy to maintain in an undifferentiated state. However, ease of propagation is not necessarily a desirable feature as it may be associated with changes in genomic integrity and therefore loss of therapeutic value. Selective growth advantages provided by genetic abnormalities in the cells may favour such easy maintenance. Indeed, using high-resolution CGH analysis, Hovatta et al. (2010) found 71 genes deleted and 1471 duplicated in ‘normal’ hESC line H1 after less than 60 passages. It has also been reported that H9 has chromosomal abnormalities (Lefort et al. 2008; Werbowetski-Ogilvie et al. 2009).

A study comparing early and late passages of 17 hESC lines (Närvä et al. 2010), using high-resolution genetic techniques, found that in extended cultures, on average, 24 per cent of LOH sites and 66 per cent of CNVs had undergone changes (the calculated false-positive estimate for CNVs was 12.5%). New LOH sites, with an average size of 1000 kb, were created at a rate of about 1.3 per passage. LOH appeared to show no preference for a particular chromosome; so far, it has been detected in all chromosomes except chromosomes 21 and Y.

The physiological process of regeneration, where remaining tissues organize themselves to replace a lost body part, has long been recognized in lower vertebrates, for example in limb regeneration in the newt (Brockes 1997). The regenerative medicine field aims to translate the tremendous potential of stem cell biology to achieve tissue or organ regeneration in humans. However, if tissue-specific stem cells or hESCs are to be used to treat a wide variety of human diseases, then several formidable challenges are still to be overcome, which include thorough genotyping and phenotyping, well-controlled derivation and differentiation, cell efficacy, immunogenicity, tumourigenicity, appropriate cell-delivery systems, and short- and long-term safety (Civin & Rao 2006; De Sousa et al. 2006; Gruen & Grabel 2006; Skottman & Hovatta 2006; Bongso et al. 2008; Unger et al. 2008).

The importance of complete validation of any stem cells destined for therapy cannot be overstated. History demonstrates that infection from donated tissue is possible, often unexpected and devastating. The first case report documenting the transmission of HIV during blood transfusion appeared in 1983 (Ammann et al. 1983). Prion diseases can also be transmitted via tissue and blood donation and may be more ubiquitous than we suspect (Llewelyn et al. 2004; Miller 2009). The substantial risks associated with unregulated therapies have been highlighted by the recent report describing a donor-derived brain tumour following transplantation of uncharacterized foetal neural stem cells (Amariglio et al. 2009). Unlike conventional donation where blood or tissue from one donor may be transplanted to two or three patients, the expansion of stem cell cultures could allow a single hESC line to be used for hundreds, if not thousands of patients. The risk of disease transmission from a single donor thus increases exponentially (Braude et al. 2005).

The route to preventing such events is to perform full characterization and screening of cell lines. Besides prudent post hoc testing for adventitious infective agents, genotype is one of the most important characterization parameters. Submicroscopic changes could render a cell line almost useless with respect to manufacturing a medicinal product, as differentiation and efficacy results may become invalid. This would void the commercialization potential and waste any investment used in development of that line. Array CGH for identifying chromosomal imbalance and submicroscopic CNVs should be introduced as a new quality standard, an obligatory diagnostic test for every newly derived hESC line before the line is deposited in national stem cell banks. SNP testing for the identification of ethnicity and genome-wide LOH would provide valuable additional information.

Footnotes

  • One contribution to a Theme Supplement ‘Translation and commercialization of regenerative medicines’.

  • Received June 30, 2010.
  • Accepted August 16, 2010.
