Computational Design for the Production of Secretory Recombinant Codon-Optimized Human Stem Cell Factor (SCF) in Chinese Hamster Ovary (CHO) Cells With an Appropriate Signal Peptide: An Intensive In Silico Study

Background: Production of recombinant proteins in mammalian cells is important for biotherapeutic purposes. Numerous efforts have focused on improving the yields of recombinant proteins, including optimization of conventional biological processes, selection of appropriate signal peptides, codon optimization, and re-engineering of cells to produce more protein. Stem cell factor (SCF) is a blood cytokine that activates the c-Kit receptor. This factor is crucial not only for the differentiation of hematopoietic progenitor cells but also for the survival, proliferation, and differentiation of mast cells. Recently, its therapeutic role in several diseases such as Alzheimer’s disease and myocardial infarction has been investigated. Therefore, the aim of this study was to use computational studies to design a secretory recombinant human SCF with the maximal yield in an appropriate mammalian host, namely Chinese hamster ovary (CHO) cells.
Methods: As the first step, computational simulation studies were carried out to design the appropriate signal peptide for the human SCF protein. The codon-optimized coding sequence of hSCF was transferred into a eukaryotic expression vector (pBudCE4.1). The recombinant vector (pBudCE4.1/SCF) was transfected into CHO cells, and the stably transformed cells were screened and isolated. Subsequently, the expression of SCF in the stably transformed cells was determined by reverse transcription quantitative polymerase chain reaction (RT-qPCR).
Results: Our bioinformatics studies indicated that the azurocidin signal peptide could be a suitable signal peptide for the production of SCF protein in CHO cells. In addition, computational studies revealed that the presence of the 6×His-tag did not have a significant impact on the three-dimensional structure of the protein. Furthermore, the expression of hSCF was significant in the stable CHO cells.
Conclusion: The use of this approach may, therefore, lead to the production of highly efficient recombinant hSCF, which would be feasible for the mass production of this factor for therapeutic purposes.

leads to improved protein secretion (10). However, signal peptides with more negative charges could be transferred efficiently. The H-region is particularly important because it is recognized by the signal recognition particle (SRP); SRP binding pauses translation, which helps prevent release of the nascent protein into the cytosol (11). The H-region can also facilitate movement of the protein through the translocon (12). Therefore, any modification in the length and hydrophobicity of a signal peptide can change the efficiency of protein movement through the translocon (13)(14)(15)(16). Interestingly, mutations in the regions downstream of the signal peptide can also affect protein translocation (17).
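As a rough illustration of the two properties discussed above (hydrophobicity of the H-region and net charge), the following Python sketch scores a peptide with the Kyte-Doolittle hydropathy scale. The peptide string and the region boundaries are hypothetical placeholders, not sequences from this study, and the charge rule (K/R = +1, D/E = -1) is a simplification that ignores histidine and the termini.

```python
# Kyte-Doolittle hydropathy index per residue (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(seq):
    """Average Kyte-Doolittle score over a peptide segment."""
    return sum(KD[aa] for aa in seq) / len(seq)

def net_charge(seq):
    """Simplified net charge: +1 per K/R, -1 per D/E."""
    return sum((aa in "KR") - (aa in "DE") for aa in seq)

# Hypothetical signal peptide and hypothetical region boundaries.
peptide = "MKKTLLLALLLAVAGASA"
n_region, h_region, c_region = peptide[:4], peptide[4:13], peptide[13:]

for name, region in (("N", n_region), ("H", h_region), ("C", c_region)):
    print(f"{name}-region {region}: hydropathy {mean_hydropathy(region):.2f}, "
          f"charge {net_charge(region):+d}")
```

Comparing these two scores across candidate signal peptides gives only a coarse first impression; it does not replace the structure-based analysis described in the Methods.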
Stem cell factor (also known as SCF, KIT-ligand, KL, or steel factor) is a cytokine that binds to the c-KIT receptor (CD117) and triggers a series of biological effects. SCF can exist both as a transmembrane protein and as a soluble protein. This cytokine plays an important role in the proliferation, migration, survival, and differentiation of hematopoietic stem cells (HSCs), melanocytes, and germ cells. Recombinant SCF has been produced and used for clinical and research purposes. SCF may be used along with other cytokines to culture HSCs and hematopoietic progenitors. The expansion of these cells ex vivo (outside the body) would allow advances in bone marrow transplantation, in which HSCs are transferred to a patient to re-establish blood formation. One of the problems of injecting SCF for therapeutic purposes is that SCF activates mast cells. The injection of SCF has been shown to cause allergic-like symptoms and the proliferation of mast cells and melanocytes. Cardiomyocyte-specific overexpression of transmembrane SCF promotes stem cell migration and improves cardiac function and animal survival after myocardial infarction (18)(19)(20). To ensure the efficient secretion of recombinant SCF in mammalian cells, selection of an appropriate signal peptide appears to be necessary. Therefore, the purpose of this study was to examine four different signal peptides already introduced by Kober et al (21).

Selection of Appropriate Signal Peptides
As the first step, the most appropriate signal peptide sequences were selected according to the criteria introduced by Kober et al (21) for optimizing the secretion of proteins in CHO cells. Four distinct signal peptides were therefore examined in this study (Table 1).

Structural Modeling and Molecular Dynamics Simulation
Structural modeling studies were carried out using the I-TASSER server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/). As the first step, the SCF protein sequences carrying either the original signal peptide or one of the four aforementioned signal peptides were submitted to the I-TASSER server. In the next step, to facilitate the purification of the secretory SCF, we decided to add both 6×His-tag and c-Myc epitopes at the C-terminus of SCF. It should be mentioned that both epitopes were present in the final expression vector. Therefore, the SCF-(His)6-Myc fusion was also submitted to this server to assess whether the addition of the His-tag could modulate the structure and function of SCF. The accuracy of the data derived from I-TASSER was also checked by the Procheck_NT (http://services.mbi.ucla.edu/PROCHECK) and VADAR (Nihserver.mbi.ucla.edu/ERRATV2) programs. Additionally, the ERRAT program (http://services.mbi.ucla.edu/ERRAT) was used to check the quality of the models of all structures and compare them with the original SCF. Meanwhile, the signal peptide cleavage site for each construct was predicted with the SignalP 4.1 server (http://www.cbs.dtu.dk/services/SignalP) to ensure proper removal of the signal peptides.
Finally, the molecular dynamics (MD) simulation was performed using GROMACS software, version 4.5.4 (Department of Biophysical Chemistry at Groningen University, Holland) (22,23). Briefly, based on the n_aut.gro file of each input derived from the I-TASSER server, the size of the simulation box was set to 0.8 Å for the original SCF and 1.6 Å for the other constructs. Based on the size of the simulation box for each construct, the numbers of water molecules, Na+ ions, and Cl− ions were calculated accordingly for the 10-nanosecond (ns) simulation (Table 2). Modeling and MD were used to examine the effect of the poly-histidine tag and the Myc epitope on the structure and function of SCF. Therefore, for the first SCF construct, the sequence with the original signal peptide was submitted to the I-TASSER server. MD simulation was performed in the isothermal-isobaric ensemble under periodic boundary conditions. The LINCS algorithm was also used to constrain all bond lengths during the equilibration step and the 5-ns free MD run (24). To reduce artifacts in the calculation of stacking interactions, the Amber force field was used with a simple water model (25,26). The energy minimization step was performed using the steepest descent and conjugate gradient methods. It was followed by a 100-picosecond (ps) equilibration step, imposing positional restraints on the non-H atoms. The simulation was conducted at a temperature of 300 K. To reach this temperature, each construct was separately coupled to a temperature bath using the Berendsen coupling method (27). A cutoff of 1.0 nm was selected for the Coulomb interaction and 0.9 nm for the Lennard-Jones interaction. The time step was 2 femtoseconds (fs), with coordinates stored every 2 ps. The MD simulation was performed for 10 ns.
Table 1 note: The sequences of the selected signal peptides, their accession numbers, the original proteins, and the hosts are indicated.
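The step counts implied by the stated run settings (2 fs time step, 100 ps equilibration, 10 ns production, coordinates every 2 ps) can be worked out as in the Python sketch below. The dictionary uses option names from the GROMACS .mdp format, but it is an incomplete illustration assembled from the values quoted in this section; the authors' actual parameter files are not given, so anything beyond those values is an assumption.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000

def n_steps(duration_fs, dt_fs):
    """Number of integration steps needed to cover a duration (integer fs)."""
    if duration_fs % dt_fs:
        raise ValueError("duration is not a whole number of time steps")
    return duration_fs // dt_fs

dt_fs = 2                                          # time step stated in the text
nsteps_prod = n_steps(10 * FS_PER_NS, dt_fs)       # 10 ns production run
nsteps_equil = n_steps(100 * FS_PER_PS, dt_fs)     # 100 ps equilibration
nstxout = n_steps(2 * FS_PER_PS, dt_fs)            # coordinates every 2 ps
n_frames = nsteps_prod // nstxout                  # stored production frames

# Partial, illustrative .mdp-style settings (not the authors' file).
mdp = {
    "integrator": "md",
    "dt": dt_fs / FS_PER_PS,           # ps
    "nsteps": nsteps_prod,
    "nstxout": nstxout,
    "tcoupl": "berendsen",             # Berendsen temperature coupling
    "ref_t": 300,                      # K
    "rcoulomb": 1.0,                   # nm
    "rvdw": 0.9,                       # nm
    "constraints": "all-bonds",
    "constraint_algorithm": "lincs",
}

for key, value in mdp.items():
    print(f"{key:22s} = {value}")
print(f"equilibration steps: {nsteps_equil}, stored frames: {n_frames}")
```

Working in integer femtoseconds avoids floating-point rounding when dividing a run length by the time step.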

SCF Coding Sequence
According to the data obtained from the simulation studies, the E construct was predicted to carry the best signal peptide for the secretion of SCF. Hence, a chimeric version of E fused with the human SCF coding sequence (CDS) (GenBank No: M59964.1), termed E-SCF, was chosen, and its nucleotide sequence was codon-optimized for production in CHO cells using the Genscript (http://www.genscript.com/cgi-bin/tools/rare_codon_analysis) and IDT (https://eu.idtdna.com/CodonOpt) servers. The codon-optimized E-SCF, in the form BamHI-E-SCF-SalI, was ordered from Generay Co. (China) for further subcloning of E-SCF. Finally, we used the plasmid encoding BamHI-E-SCF-SalI, named pGHn/SCF, for further experiments.

Construction of the Recombinant pBudCE4.1 (+) Encoding E-SCF
The recombinant pGHn/SCF vector was transformed into E. coli Top10 competent cells. Plasmid extraction was carried out on bacterial colonies using the Plasmid Mini Extraction Kit (BIONEER, Korea). The BamHI-SalI DNA fragment containing E-SCF was excised from pGHn/SCF using the respective endonucleases (Fermentas, Lithuania) and inserted into the corresponding sites of the pBudCE4.1 vector, resulting in pBudCE4.1/E-SCF.
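A BamHI/SalI subcloning strategy of this kind only works cleanly if neither recognition site occurs inside the insert. The Python sketch below shows one way such a check could be done before ordering a synthetic fragment. The recognition sequences (GGATCC for BamHI, GTCGAC for SalI) are standard; the toy coding sequence and the function names are hypothetical, and this is not presented as the procedure the authors followed.

```python
# Recognition sequences; both are palindromic, so scanning one strand suffices.
SITES = {"BamHI": "GGATCC", "SalI": "GTCGAC"}

def internal_sites(cds, sites=SITES):
    """Return the 0-based positions of each recognition site inside cds."""
    cds = cds.upper()
    found = {}
    for name, site in sites.items():
        found[name] = [i for i in range(len(cds) - len(site) + 1)
                       if cds[i:i + len(site)] == site]
    return found

def add_flanks(cds):
    """Flank a site-free CDS as BamHI-CDS-SalI for directional cloning."""
    hits = internal_sites(cds)
    if any(hits.values()):
        raise ValueError(f"internal restriction sites found: {hits}")
    return SITES["BamHI"] + cds.upper() + SITES["SalI"]

# Hypothetical toy sequence, not the E-SCF CDS.
toy_cds = "ATGGCCAAGGTGCTGTAA"
print(internal_sites(toy_cds))
print(add_flanks(toy_cds))
```

If an internal site were found, a synonymous codon change at that position would normally remove it without altering the protein.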

CHO Cell Culture and Transfection
The CHO-K1 cell line was obtained from Royan Institute for Stem Cell Biology and Technology (Tehran, Iran) and cultured in Ham's F12 (Sigma, USA) containing 10% fetal calf serum and 1% penicillin-streptomycin. Cells were incubated in a humidified atmosphere of 5% CO2 at 37°C. Approximately 6.25 × 10^5 cells were seeded in each well of a six-well plate one day prior to transfection. Transfection of pBudCE4.1/E-SCF was carried out using Lipofectamine LTX reagent (Invitrogen, Carlsbad, CA, USA), and the data were compared with those obtained by transfection of the empty pBudCE4.1 vector.

Isolation of a Stable CHO Cell Line Expressing SCF
Two days post-transfection, the cells were passaged and treated with 60 mg/mL Zeocin (Invitrogen, USA) for 4 weeks. The colonies that emerged were isolated for further screening by RT-PCR using specific primers (Table 3) to confirm SCF expression. CHO-K1 cells stably transformed with pBudCE4.1/E-SCF were designated CHO/SCF cells and compared with the Mock cells, that is, CHO cells stably transformed with pBudCE4.1.

RT-PCR and Quantitative Real Time PCR for Quantification of Gene Expression
Total RNA was extracted from transfected CHO-K1 cells using TRIzol reagent (Invitrogen) according to the manufacturer's instructions. Approximately 1 µg of total RNA was used to synthesize cDNA with the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, USA) and random hexamer primers. The final step of reverse transcription polymerase chain reaction (RT-PCR) was completed using the primers specifically designed for SCF (Table 3).
Table 2 note: The size of the simulation box for each construct was derived from n_aut.gro. All of the items were calculated with the GROMACS software. The number of ions was selected according to their cellular concentrations (140 mM).

Western Blot Analysis
To confirm the results of the real-time PCR, the expression of SCF at the protein level was also examined by indirect western blot analysis. For this experiment, the supernatant was concentrated with acetone/DTT and alcohol. An equivalent amount of concentrated supernatant was separated by SDS-PAGE and transferred onto a PVDF membrane. The membrane was incubated in Tris-buffered saline containing 3% bovine serum albumin, and the strips were then reacted with anti-His-tag monoclonal antibody (ab18184) for 1 hour at 37°C. Then, the strips were incubated with horseradish peroxidase-conjugated goat anti-mouse IgG (Dako, P0447) for 1 hour at 37°C. The strips were visualized after color development in diaminobenzidine/H2O2 substrate solution for 15 minutes at room temperature.

Statistical Analysis
Microsoft Excel (2007) and SPSS software (version 17.0) were used to analyze the data which were expressed as mean ± standard error of mean (SEM) and obtained from three independent treatments of replicated observations. One-way analysis of variance (ANOVA) was performed to identify statistical differences between treatments, which were considered to be significant at P<0.05.
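For readers who want to see what these summary statistics involve, the Python sketch below computes a mean ± SEM and the F statistic of a one-way ANOVA using only the standard library. The numbers are made-up placeholders, not data from this study, and the analysis reported here was done in Excel and SPSS; obtaining the P value from F additionally requires the F distribution, which this sketch does not implement.

```python
from math import sqrt
from statistics import mean, stdev

def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))

def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical triplicate measurements for three treatments.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
for g in groups:
    m, sem = mean_sem(g)
    print(f"{m:.2f} ± {sem:.2f}")
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```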

Results
Signal peptides can have a major effect on the secretion of recombinant proteins expressed in mammalian cells, and their performance is interchangeable among different species.
The purpose of this research was to select one of the four signal peptides introduced by Kober et al (21) to efficiently increase the secretion of SCF in CHO cells. To this end, we modeled six structures and studied their solubility, polarity, and other kinetic characteristics (Figures 1 and 2). Moreover, the effect of the 6×His-tag and c-Myc epitopes on the protein structure of SCF was investigated. We carried out MD simulation, and the best signal peptide was then chosen for the optimal expression of the recombinant SCF. To synthesize the chimeric structure of the SCF CDS, we performed codon optimization so that it could be inserted into a suitable expression vector (pBudCE4.1) (Figure 3). Figure S1 (see Supplementary file 1) shows the potential energy and the kinetic energy of the system for the six structures in the presence of water and 140 mM NaCl during 10 ns of MD simulation (Table 2). The potential energy of each system remained balanced during the simulation. As indicated, the energy levels of SCF-(His)6-Myc and E-SCF-(His)6-Myc were very similar. The average potential and kinetic energy values over the last 6 ns of the 10-ns simulation are shown in Table 4.
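Averaging an energy term over the final part of a trajectory, as done here for the last 6 ns, amounts to selecting a time window and summarizing it. A minimal Python sketch follows; the time series is synthetic and purely illustrative, and the function name is hypothetical.

```python
from statistics import mean, stdev

def window_average(times_ns, values, start_ns, end_ns):
    """Mean and sample SD of values whose time lies in [start_ns, end_ns]."""
    selected = [v for t, v in zip(times_ns, values) if start_ns <= t <= end_ns]
    if len(selected) < 2:
        raise ValueError("need at least two samples in the window")
    return mean(selected), stdev(selected)

# Synthetic series: energy relaxes during the first 4 ns, then stays flat.
times = [i * 0.5 for i in range(21)]                    # 0.0 to 10.0 ns
energy = [-1000.0 - 5.0 * min(t, 4.0) for t in times]   # made-up kJ/mol values

avg, sd = window_average(times, energy, 4.0, 10.0)      # last 6 ns of 10 ns
print(f"average over 4-10 ns: {avg:.1f} ± {sd:.1f} kJ/mol")
```

Discarding the early part of the run keeps the initial relaxation from biasing the reported averages.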

MD Simulation
The variations in the system temperature (Kelvin) and the root-mean-square deviation (RMSD) of the SCF atoms are shown in Figures S2 and 4. The changes in the average temperature (Figure S2) showed that the temperature of the system remained balanced during the simulation.
[Figure caption fragment] (Table 1) with the original signal peptides of SCF in the three-dimensional structure.

Design for Production of Human SCF in CHO Cells
Moreover, the RMSD results suggested that during 10 ns of simulation, the molecular structures of the studied proteins were at maximum stability (Table 4). The average RMSD values (Table 4) were 0.40 ± 0.015 and 0.46 ± 0.016 for SCF and SCF-(His)6-Myc, respectively. Therefore, we could conclude that adding the signal peptide E to the SCF protein structure did not change the RMSD significantly. As can be seen in Figure 4, Panel g, the structure of M-SCF-(His)6-Myc was less stable during the simulation; hence, it could be inferred that this signal peptide caused instability. Figure 5 shows the radius of gyration (Rg) of the structures in the presence of water and 140 mM NaCl within 10 ns of MD simulation, under the conditions described above (Table 4), including the Rg value of M-SCF-(His)6-Myc. As indicated, this structure was less stable than SCF. Therefore, this signal peptide reduced the access level of the protein structure. Among the signal peptides, signal peptides C and E indicated the highest change (1.77) in the Rg, while the structure of E-SCF-(His)6-Myc indicated a minimal change (1.77 ± 0.008).
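The two quantities reported above have simple definitions: RMSD is the root of the mean squared displacement of the atoms from a reference structure, and Rg is the mass-weighted root-mean-square distance of the atoms from the center of mass. The Python sketch below implements both for small coordinate lists. It assumes the structures are already superimposed (analysis tools normally perform a least-squares fit first), and the coordinates are toy values, not data from the simulations.

```python
from math import sqrt

def rmsd(coords, reference):
    """RMSD between two pre-aligned structures given as lists of (x, y, z)."""
    if len(coords) != len(reference):
        raise ValueError("structures must have the same number of atoms")
    total = sum((a - b) ** 2
                for p, q in zip(coords, reference)
                for a, b in zip(p, q))
    return sqrt(total / len(coords))

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the center of mass."""
    m_tot = sum(masses)
    com = [sum(m * p[i] for m, p in zip(masses, coords)) / m_tot
           for i in range(3)]
    weighted = sum(m * sum((p[i] - com[i]) ** 2 for i in range(3))
                   for m, p in zip(masses, coords))
    return sqrt(weighted / m_tot)

# Toy example: two atoms shifted by one unit along z.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
moved = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(moved, ref))

pair = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(radius_of_gyration(pair, [1.0, 1.0]))
```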
Since RMSD is not an appropriate parameter to reflect the mobility of the structural elements, we analyzed the root mean square fluctuation (RMSF) (Figure 6 and Table 4) to assess the flexibility of the structures. The SCF structure showed little flexibility (Figure 6, Panel a). In addition, RMSF analyses (Figure 6, Panel f) indicated that the signal peptide E caused the biggest change in the flexibility of SCF (0.23 ± 0.15), which was in good agreement with the Rg data (Figure 6, Panel e). The variations in the RMSF of M-SCF-(His)6-Myc (Figure 6, Panel g) showed that the signal peptide M could reduce the value of RMSF and cause some flexibility in the structure of SCF.
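Whereas RMSD gives one value per frame, RMSF gives one value per atom: the root-mean-square deviation of that atom from its own time-averaged position. The Python sketch below shows the calculation on a toy trajectory. The frames are assumed to be pre-aligned, and the data and function name are illustrative only.

```python
from math import sqrt

def rmsf(trajectory):
    """Per-atom RMSF for a list of pre-aligned frames.

    Each frame is a list of (x, y, z) tuples, one per atom.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        avg = [sum(frame[atom][i] for frame in trajectory) / n_frames
               for i in range(3)]
        # Mean squared fluctuation around that average.
        msf = sum(sum((frame[atom][i] - avg[i]) ** 2 for i in range(3))
                  for frame in trajectory) / n_frames
        result.append(sqrt(msf))
    return result

# Toy trajectory: atom 0 is fixed, atom 1 oscillates along x.
frames = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
print(rmsf(frames))
```

High RMSF values therefore flag flexible residues even when the overall RMSD of the structure stays low.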
The average number of hydrogen bonds between the structures and water molecules during the simulation was measured to be 435.8±11.9 and 529.8±12.3 for SCF and SCF-(His)6-Myc, respectively. On the other hand, the implementation of signal peptides E or M (Table 1) led to the formation of an equal number of hydrogen bonds between proteins and water (509.43±12.34) (Table 4). The assessment of the changes in the number of hydrogen bonds (Figure 7) between SCF and water showed that SCF dissolved in water. Interestingly, His-SCF had more solubility (Figure 7, Panel b). Table 5 shows the kinetic characteristics of the six studied structures in the presence of water molecules during the 10-ns MD simulations. The contribution of electrostatic energy to binding was greater than that of the van der Waals forces. Moreover, the non-polar solvation energy contributed largely to the binding of the SCF molecules to water molecules. The average changes in the electrostatic binding energy and the binding energy between the atoms of SCF-(His)6-Myc and the molecules of water indicated that there were strong electrostatic forces between these molecules. Interestingly, the levels of variation in the polar solvation energy (-27186.778 kJ/mol), non-polar solvation energy (-461.708 kJ/mol), and van der Waals energy (-2327.880 kJ/mol) between water and SCF-(His)6-Myc, and in the electrostatic interactions (-27186.346 kJ/mol) between SCF-(His)6-Myc and water, were greater than those of SCF. The mean changes in the binding energy (Table 5) showed that the electrostatic energy and the binding energy between water molecules and the atoms of A2-SCF-(His)6-Myc were significantly reduced compared with the original signal peptide of SCF.
The van der Waals energy, polar solvation energy, non-polar solvation energy, and the binding energy between water and A2-SCF-(His)6-Myc were equal to -2145.485, 23264.78, -636.629, and -5577.743 kJ/mol, respectively, thereby suggesting that the signal peptide A2 was more effective than the original signal peptide of SCF in reducing the solubility and polarity of SCF. Meanwhile, the polar and non-polar solvation energy in the presence of the signal peptide B changed considerably compared with SCF-(His)6-Myc, showing that this replacement reduced the solubility and polarity of SCF. The assessment of the electrostatic binding energy and the binding energy between the atoms of E-SCF-(His)6-Myc and the molecules of water indicated that there were strong electrostatic forces between the former and the latter.
In the presence of the signal peptide E, the non-polar solvation energy reached -639.834 kJ/mol, suggesting that the signal peptide E reduced the non-polar solvation energy. Additionally, the change in the electrostatic binding energy and the binding energy between the atoms of M-SCF-(His)6-Myc and the molecules of water indicated that the signal peptide M reduced the electrostatic force between water and M-SCF-(His)6-Myc, as compared to the original signal peptide of SCF. The van der Waals energy, polar solvation energy, and the binding energy between water and M-SCF-(His)6-Myc were equal to -2065.76, 2046.5, and -6529.493 kJ/mol, respectively, suggesting that replacing the original signal peptide of SCF with the signal peptide M reduced the polar solvation energy.

Ectopic Expression of SCF in CHO Cells
As depicted in Figure 3 and described in the "Materials and Methods" section, after selecting the signal peptide E based on the simulation results and the artificial synthesis of the chimeric gene, the replicating recombinant vector pGHn/SCF was used for further experiments. The recombinant vector was transformed into E. coli and plasmid extraction was performed (Figure S3). To ensure that the recombinant plasmid contained the bona fide SCF CDS, digestion was carried out with BamHI and SalI.
As the next stage, SCF CDS was subcloned into pBudCE4. To determine whether SCF protein was successfully produced in the CHO cells, the secretion of the SCF protein was detected by western blot in the culture medium of the CHO cells after transfection (Figure 8, Panel c). However, SCF was not detected in the medium of cells transfected with the empty vector.

Discussion
To obtain appropriate bioreactor cell lines with a high expression level of recombinant proteins, an efficient expression system is needed. To our knowledge, the secretion of a protein is mediated by transporting it into the endoplasmic reticulum (28). Accordingly, Zhang et al and Dalton and Barton showed that protein secretion could be improved by selecting an appropriate signal peptide (7,14). Kober et al also performed a comprehensive study using 16 signal peptides from different species, finding that among the selected signal peptides, the albumin and human azurocidin signal peptides were the most appropriate ones for commercial purposes (21).
SCF acts as a survival and growth factor required for the expansion of HSCs, helping to maintain and renew HSCs. Moreover, SCF plays an important role in the proliferation and differentiation of mast cells, melanocytes, and germ cells (29)(30)(31). Overexpression of SCF induces the migration of neural stem/progenitor cells by activating c-Kit (32). Additionally, Asghari et al showed the expression of SCF using the vector PET-266 (+) in E. coli, although the resulting protein was not functional (20).
Therefore, the purpose of this research was to design the most suitable signal peptide for the appropriate secretion of the SCF protein into the CHO cell medium. We also tested the effects of the His-tag on the protein structure through MD simulation. Based on the simulation results, we could conclude that switching the signal peptide did not influence these parameters, so only minimal changes in the protein structure occurred; however, it could alter the van der Waals energy levels, electrostatic binding energy, and the polar and non-polar solvation energy between water and SCF, ultimately affecting the solubility and polarity of SCF. On the other hand, by comparing the MD simulations of SCF and SCF-(His)6-Myc, we could conclude that adding the 6×His-tag to SCF resulted in the greatest changes in the values of RMSF, RMSD, and Rg. In addition, the simulation results suggested that the signal peptide E (azurocidin signal peptide) could suitably be used instead of the original signal peptide of SCF, as the binding energy of the signal peptide improved. Our finding was in good agreement with the data obtained by Kober et al (21). We assumed that this signal peptide could be more accessible to the SRP complex. The experimental results also confirmed our in silico analysis.

Conclusion
MD simulation could serve as a useful tool for the selection of signal peptides and the assessment of the function of a protein. In the present study, we tested the human azurocidin signal peptide for the secretion of SCF into the CHO cell medium, observing that this signal peptide could act more efficiently than the original signal peptide of SCF.

Conflict of Interests
The authors declare no conflict of interests.