June 29, 2020.

We are still in the middle of the COVID-19 pandemic, caused by the SARS-CoV-2 coronavirus. As an exercise in “bioinformatics”, the following tutorial explores the sequences of the “spike glycoprotein S”, which allows the virus to attach to and enter cells.

The tutorial also explores how much analysis one can accomplish with small Unix tools, without the need for programming.

I show a lot of “how I work”, which, by the way, may not be the fastest way from a computer science perspective. But the point is to “get things done” without complicated software programming.

This tutorial was created in two formats concurrently, HTML and PDF, with R and RStudio using bookdown and a modified rstudio4edu-book template. While HTML was the primary output focus, care was taken to ensure the readability of the PDF version. Links to web sites are “live” (clickable) in both versions.



TUTORIAL “A sequence alignment and analysis of SARS-CoV-2 spike glycoprotein”

(last updated: June 29, 2020)

  • HTML (opens in new Tab)
  • PDF (opens in new Tab)

Exercise files

  • All files:
  • Useful “starter” files
    • spike_raw_85.fa – original file with only 85 sequences.
    • spike_32.fa – original filtered sequence file of 32 sequences. (Should be renamed spike_filtered.fa)
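
As a small taste of the “small Unix tools” approach, here is a minimal sketch of how one might check the number of sequences in a FASTA file. It relies only on the fact that every FASTA record begins with a header line starting with “>”. The function name count_fasta and the file demo.fa are hypothetical, made up for illustration; the tutorial itself may use different commands.

```shell
# Count sequences in a FASTA file: each record starts with a ">" header line,
# so counting those lines gives the number of sequences.
count_fasta() {
  grep -c '^>' "$1"
}

# Tiny made-up demo file (hypothetical records, not real spike sequences).
printf '>seq1 demo\nMFVFLVLLPL\n>seq2 demo\nMFVFLVLLPLVSSQ\n' > demo.fa

# Prints the number of header lines found in demo.fa.
count_fasta demo.fa
```

Run against the starter files, for example with count_fasta spike_raw_85.fa, the result would be expected to match the sequence counts given in the file descriptions above.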