Abstract: | SARS-CoV-2 is a betacoronavirus discovered at the end of 2019 that is highly pathogenic and poses a serious threat to human health. In this paper, 1875 SARS-CoV-2 whole-genome sequences sampled in the United States, together with the sequences encoding the spike protein (S gene), were analyzed with bioinformatics methods to study the molecular evolutionary characteristics of the genome and the spike protein. A Markov chain Monte Carlo (MCMC) method was used to estimate the evolutionary rate of the whole genome and the nucleotide mutation rate of the S gene. The nucleotide mutation rate of the whole genome was 6.677 × 10⁻⁴ substitutions per site per year, and that of the S gene was 8.066 × 10⁻⁴ substitutions per site per year; both are intermediate compared with other RNA viruses. Our findings confirm the hypothesis that the evolutionary rate of the virus gradually decreases over time. We also found 13 sites under statistically significant positive selection in the SARS-CoV-2 genome. In addition, there were 101 nonsynonymous mutation sites in the amino acid sequence of the S protein, seven of which are putatively harmful. This paper gives a preliminary account of the evolutionary characteristics of SARS-CoV-2 in the United States and provides a scientific basis for future surveillance and prevention of virus variants. |