Quality of Biotechnology Products:
Analysis of the Expression Construct in Cells used for Production of r-DNA Derived Protein Products

Q5B
Current Step 4 version
dated 30 November 1995

This Guideline has been developed by the appropriate ICH Expert Working Group and has been subject to consultation by the regulatory parties, in accordance with the ICH Process. At Step 4 of the Process the final draft is recommended for adoption to the regulatory bodies of the European Union, Japan and the USA.

I. Introduction

This document presents guidance regarding the characterisation of the expression construct for the production of recombinant DNA protein products in eukaryotic and prokaryotic cells. This document is intended to describe the types of information that are considered valuable in assessing the structure of the expression construct used to produce recombinant DNA derived proteins. This document is not intended to cover the whole quality aspect of rDNA derived medicinal products.

The expression construct is defined as the expression vector containing the coding sequence of the recombinant protein. Segments of the expression construct should be analysed using nucleic acid techniques in conjunction with other tests performed on the purified recombinant protein for assuring the quality and consistency of the final product. Analysis of the expression construct at the nucleic acid level should be considered as part of the overall evaluation of quality, taking into account that this testing only evaluates the coding sequence of a recombinant gene and not the translational fidelity nor other characteristics of the recombinant protein, such as secondary structure, tertiary structure, and post-translational modifications.

II. Rationale for Analysis of the Expression Construct

The purpose of analysing the expression construct is to establish that the correct coding sequence of the product has been incorporated into the host cell and is maintained during culture to the end of production. The genetic sequence of recombinant proteins produced in living cells can undergo mutations that could alter the properties of the protein with potential adverse consequences to patients. No single experimental approach can be expected to detect all possible modifications to a protein. Protein analytical techniques can be used to assess the amino acid sequence of the protein and structural features of the expressed protein due to post-translational modifications such as proteolytic processing, glycosylation, phosphorylation, and acetylation. Data from nucleic acid analysis may be useful since protein analytical methods may not detect all changes in protein structure resulting from mutations in the sequence coding for the recombinant protein. The relative importance of nucleic acid analysis and protein analysis will vary from product to product.

Nucleic acid analysis can be used to verify the coding sequence and the physical state of the expression construct. The nucleic acid analysis is performed to ensure that the expressed protein will have the correct amino acid sequence but is not intended to detect low levels of variant sequences. Where the production cells have multiple integrated copies of the expression construct, not all of which may be transcriptionally active, examination of the transcription product itself by analysis of mRNA or cDNA may be more appropriate than analysis of genomic DNA. Analytical approaches that examine a bulk population of nucleic acids, such as those performed on pooled clones or material amplified by the polymerase chain reaction, may be considered as an alternative to approaches that depend on selection of individual DNA clones. Other techniques could be considered that allow for rapid and sensitive confirmation of the sequence coding for the recombinant protein in the expression construct.

The following sections describe information that should be supplied regarding the characterisation of the expression construct during the development and validation of the production system. Analytical methodologies should be validated for the intended purpose of confirmation of sequence. The validation documentation should at a minimum include estimates of the limits of detection for variant sequences. This should be performed for either nucleic acid or protein sequencing methods. The philosophy and recommendations for analysis expressed in this document should be periodically reviewed to take advantage of new advances in technology and scientific information.
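
As an illustration of what an estimate of a limit of detection can look like, the sketch below uses a simple sampling model for approaches that sequence individually selected clones. It is not a method prescribed by this document. It assumes that clones are sampled independently and that sequencing itself is error-free, and the function names `detection_probability` and `clones_needed` are hypothetical.

```python
import math


def detection_probability(variant_fraction: float, clones_sequenced: int) -> float:
    """Probability that at least one sequenced clone carries the variant.

    Assumes independent sampling of clones and error-free sequencing.
    """
    if not 0.0 <= variant_fraction <= 1.0:
        raise ValueError("variant_fraction must be between 0 and 1")
    if clones_sequenced < 0:
        raise ValueError("clones_sequenced must not be negative")
    return 1.0 - (1.0 - variant_fraction) ** clones_sequenced


def clones_needed(variant_fraction: float, confidence: float = 0.95) -> int:
    """Smallest number of clones giving the requested detection probability."""
    if not 0.0 < variant_fraction < 1.0:
        raise ValueError("variant_fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - variant_fraction))


if __name__ == "__main__":
    # A variant present in 10% of copies, sampled by sequencing 10 clones.
    print(round(detection_probability(0.10, 10), 3))
    # Clones required to see that variant at least once with 95% probability.
    print(clones_needed(0.10, 0.95))
```

Under this model, the number of clones required grows quickly as the variant fraction falls, which is consistent with the statement in Section II that nucleic acid analysis is not intended to detect low levels of variant sequences.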

III. Characterisation of the Expression System

A. Expression Construct and Cell Clone Used to Develop the Master Cell Bank (MCB)

The manufacturer should describe the origin of the nucleotide sequence coding for the protein. This should include identification and source of the cell from which the nucleotide sequence was originally obtained. Methods used to prepare the DNA coding for the protein should be described.

The steps in the assembly of the expression construct should be described in detail. This description should include the source and function of the component parts of the expression construct (e.g., origins of replication, antibiotic resistance genes, promoters, and enhancers) and should state whether or not the protein is being synthesised as a fusion protein. A detailed component map and a complete annotated sequence of the plasmid should be given, indicating those regions that have been sequenced during the construction and those taken from the literature. Other expressed proteins encoded by the plasmid should be indicated. The nucleotide sequence of the coding region of the gene of interest and associated flanking regions that are inserted into the vector, up to and including the junctions of insertion, should be determined by DNA sequencing of the construct.

A description of the method of transfer of the expression construct into the host cell should be provided. In addition, methods used to amplify the expression construct and criteria used to select the cell clone for production should be described in detail.

B. Cell Bank System

Production of the recombinant protein should be based on well-defined Master and Working Cell Banks. A cell bank is a collection of ampoules of uniform composition stored under defined conditions each containing an aliquot of a single pool of cells. The Master Cell Bank (MCB) is generally derived from the selected cell clone containing the expression construct. The Working Cell Bank (WCB) is derived by expansion of one or more ampoules of the MCB. The cell line history and production of the cell banks should be described in detail, including methods and reagents used during culture, in vitro cell age, and storage conditions. All cell banks should be characterised for relevant phenotypic and genotypic markers which could include the expression of the recombinant protein or presence of the expression construct.

The expression construct in the MCB should be analysed as described below. If the testing cannot be carried out on the MCB, it should be carried out on each WCB.

Restriction endonuclease mapping or other suitable techniques should be used to analyse the expression construct for copy number, for insertions or deletions, and for the number of integration sites. For extrachromosomal expression systems, the percent of host cells retaining the expression construct should be determined.
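
For extrachromosomal systems, the percent of host cells retaining the expression construct is often estimated by comparing colony counts with and without selection. The sketch below assumes a replica-plating design, in which colonies grown without selection are tested for growth on selective medium. This design and the function name `percent_retention` are assumptions made for illustration and are not specified in this document.

```python
def percent_retention(colonies_growing_on_selection: int, colonies_tested: int) -> float:
    """Percent of tested colonies that still carry the expression construct.

    colonies_tested: colonies picked from non-selective medium.
    colonies_growing_on_selection: the subset that also grows on selective
    medium, taken here as evidence that the construct has been retained.
    """
    if colonies_tested <= 0:
        raise ValueError("colonies_tested must be positive")
    if not 0 <= colonies_growing_on_selection <= colonies_tested:
        raise ValueError("selective count must be between 0 and colonies_tested")
    return 100.0 * colonies_growing_on_selection / colonies_tested


if __name__ == "__main__":
    # Hypothetical counts: 188 of 200 replica-plated colonies grow on selection.
    print(percent_retention(188, 200))
```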

The protein coding sequence for the recombinant protein product of the expression construct should be verified. For extrachromosomal expression systems, the expression construct should be isolated and the nucleotide sequence encoding the product should be verified without further cloning. For cells with chromosomal copies of the expression construct, the nucleotide sequence encoding the product could be verified by recloning and sequencing of chromosomal copies. Alternatively, the nucleic acid sequence encoding the product could be verified by techniques such as sequencing of pooled cDNA clones or material amplified by the polymerase chain reaction. The nucleic acid sequence should be identical, within the limits of detection of the methodology, to that determined for the expression construct as described in Section III.A. and should correspond to that expected for the protein sequence.
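
The two checks in the last sentence above (identity to the sequence determined for the expression construct, and correspondence to the expected protein sequence) can be sketched as follows. This is a minimal illustration and not a validated procedure. It assumes that the observed and reference coding sequences are already aligned and of equal length, that the standard genetic code applies, and that the coding sequence starts at the first codon. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]  # KeyError on non-ACGT symbols
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def nucleotide_differences(reference: str, observed: str):
    """List (1-based position, reference base, observed base) mismatches."""
    reference, observed = reference.upper(), observed.upper()
    if len(reference) != len(observed):
        raise ValueError("sequences differ in length; align them first")
    return [
        (i + 1, r, o)
        for i, (r, o) in enumerate(zip(reference, observed))
        if r != o
    ]


def verify_coding_sequence(reference_cds: str, observed_cds: str, expected_protein: str):
    """Report identity to the reference and agreement with the expected protein."""
    differences = nucleotide_differences(reference_cds, observed_cds)
    return {
        "nucleotide_differences": differences,
        "identical_to_reference": not differences,
        "matches_expected_protein": translate(observed_cds) == expected_protein.upper(),
    }


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from any real product.
    print(verify_coding_sequence("ATGGCCAAATAA", "ATGGCCAAGTAA", "MAK"))
```

The toy example shows why both checks are reported separately: a silent nucleotide change leaves the translated protein unchanged but still means the sequence is not identical to that determined for the expression construct.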

C. Limit for In Vitro Cell Age for Production

The limit for in vitro cell age for production should be based on data derived from production cells expanded under pilot plant scale or full scale conditions to the proposed in vitro cell age or beyond. Generally, the production cells are obtained by expansion of the Working Cell Bank; the Master Cell Bank could be used to prepare the production cells with appropriate justification.

The expression construct of the production cells should be analysed once for the MCB as described in Section III.B., except that the protein coding sequence of the expression construct in the production cells could be verified by either nucleic acid testing or analysis of the final protein product. Increases in the defined limit for in vitro cell age for production should be supported by data from cells which have been expanded to an in vitro cell age which is equal to or greater than the new limit for in vitro cell age.

IV. Conclusion

The characterisation of the expression construct and the final purified protein are both important to ensure the consistent production of a recombinant DNA derived product. As described above, it is considered that analytical data derived from both nucleic acid analysis and evaluation of the final purified protein should be evaluated to ensure the quality of a recombinant protein product.

Glossary of Terms

Expression Construct

The expression vector which contains the coding sequence of the recombinant protein and the elements necessary for its expression.

Flanking Control Regions

Non-coding nucleotide sequences that are adjacent to the 5' and 3' end of the coding sequence of the product which contain important elements that affect the transcription, translation, or stability of the coding sequence. These regions include, e.g., promoter, enhancer, and splicing sequences and do not include origins of replication and antibiotic resistance genes.

Integration Site

The site where one or more copies of the expression construct are integrated into the host cell genome.

In vitro Cell Age

Measure of the time from thaw of the MCB vial(s) to harvest of the production vessel, expressed as elapsed chronological time in culture, as population doubling level of the cells, or as passage level of the cells when subcultivated by a defined procedure for dilution of the culture.
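
Where in vitro cell age is expressed as population doubling level, the commonly used calculation is the base-2 logarithm of the ratio of cells harvested to cells seeded, summed over passages. The sketch below illustrates that arithmetic. It assumes that viable cell counts are available at seeding and harvest of each passage, and the function names are hypothetical.

```python
import math


def population_doublings(cells_seeded: float, cells_harvested: float) -> float:
    """Population doublings in one passage: log2(harvested / seeded)."""
    if cells_seeded <= 0 or cells_harvested <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(cells_harvested / cells_seeded)


def cumulative_doubling_level(passages, starting_level: float = 0.0) -> float:
    """Sum doublings over passages given (cells_seeded, cells_harvested) pairs."""
    return starting_level + sum(
        population_doublings(seeded, harvested) for seeded, harvested in passages
    )


if __name__ == "__main__":
    # Hypothetical counts for two passages after thaw.
    passages = [(1e6, 8e6), (2e6, 8e6)]
    print(cumulative_doubling_level(passages))
```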

Master Cell Bank (MCB)

An aliquot of a single pool of cells which generally has been prepared from the selected cell clone under defined conditions, dispensed into multiple containers and stored under defined conditions. The MCB is used to derive all working cell banks. The testing performed on a new MCB (from a previous initial cell clone, MCB or WCB) should be the same as for the MCB unless justified.

Pilot Plant Scale

The production of a recombinant protein by a procedure fully representative of and simulating that to be applied on a full commercial manufacturing scale. The methods of cell expansion, harvest, and product purification should be identical except for the scale of production.

Relevant Genotypic and Phenotypic Markers

Those markers permitting the identification of the strain of the cell line, which should include the expression of the recombinant protein or presence of the expression construct.

Working Cell Bank (WCB)

The Working Cell Bank is prepared from aliquots of a homogeneous suspension of cells obtained from culturing the MCB under defined culture conditions.