- Write a script that will:
  - run BWA on one of the samples from the Gierlinski dataset
  - run STAR on the same sample
- Subset the aligned reads to select only those that map to chromosome I.
- Compare the output from BWA and STAR, and summarize any results or differences.
  - Which optional SAM fields does STAR add, and what do they represent?
  - Which optional SAM fields does BWA add, and what do they represent?
  - How does the interpretation of the mapping quality field differ between the two aligners?
  - Find a read that has been split by STAR. How did BWA handle the mapping of that read?
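
One possible way to approach the comparison is to tabulate, per aligner, which optional tags appear, how the mapping quality values are distributed, and which reads carry an `N` (skipped region) operation in their CIGAR string. The sketch below is only an illustration under stated assumptions: it expects uncompressed SAM text, the function names `parse_sam_line` and `summarize` are made up for this example, and the chromosome name (`"I"` here) depends on the reference you aligned against (it may be `chrI` instead).

```python
from collections import Counter


def parse_sam_line(line):
    """Split one SAM alignment line into the fields used below."""
    cols = line.rstrip("\n").split("\t")
    return {
        "qname": cols[0],
        "flag": int(cols[1]),
        "rname": cols[2],
        "mapq": int(cols[4]),
        "cigar": cols[5],
        # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
        "tags": [field.split(":", 2)[0] for field in cols[11:]],
    }


def summarize(sam_lines, chrom="I"):
    """Summarize alignments on one chromosome.

    Returns (tag_counts, mapq_counts, spliced_read_names).
    The default chromosome name "I" is an assumption; check your reference.
    """
    tag_counts = Counter()
    mapq_counts = Counter()
    spliced = []
    for line in sam_lines:
        # Skip header lines and empty lines.
        if line.startswith("@") or not line.strip():
            continue
        rec = parse_sam_line(line)
        if rec["rname"] != chrom:
            continue
        tag_counts.update(rec["tags"])
        mapq_counts[rec["mapq"]] += 1
        # An N operation in the CIGAR string marks a skipped reference region,
        # which is how a spliced (split) alignment is typically reported.
        if "N" in rec["cigar"]:
            spliced.append(rec["qname"])
    return tag_counts, mapq_counts, spliced
```

Running `summarize` once on the STAR output and once on the BWA output (for example via `summarize(open(path), chrom="I")`, with `path` pointing at your own SAM file) gives two sets of counters that can be compared side by side. The read names returned in the third element are candidates to look up in the other aligner's output.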
Project work: (due on Feb 18!)
- Download at least one FASTQ file that you will be working with for your project. Document the following details:
  - where did you get it from?
  - what publication is it linked to?
  - who generated the data?
  - how was the nucleic acid (DNA/RNA) extracted?
  - what library prep was used?
  - what cell type was used?
  - what was the treatment/experimental condition?
  - what sequencing platform was used?
- Align the FASTQ file with an appropriate aligner (you may have to build a new index). Document:
  - parameters (and why you chose them)
  - summary of outcome and basic QC
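
For the "basic QC" part, one simple starting point is to count alignment records by category using the SAM FLAG field. The following is a minimal sketch, not a prescribed method: the function name `alignment_qc` is hypothetical, it assumes uncompressed SAM text as input, and it counts records rather than fragments (for paired-end data each mate is its own record). Established tools such as `samtools flagstat` report similar numbers and are likely the better choice for the actual write-up.

```python
def alignment_qc(sam_lines):
    """Count SAM records by category using the FLAG bits.

    0x4   = read unmapped
    0x100 = secondary alignment
    0x800 = supplementary alignment
    """
    counts = {
        "primary_mapped": 0,
        "unmapped": 0,
        "secondary": 0,
        "supplementary": 0,
    }
    for line in sam_lines:
        # Skip header lines and empty lines.
        if line.startswith("@") or not line.strip():
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            counts["unmapped"] += 1
        elif flag & 0x100:
            counts["secondary"] += 1
        elif flag & 0x800:
            counts["supplementary"] += 1
        else:
            counts["primary_mapped"] += 1
    # Mapping rate over records that represent reads (primary or unmapped).
    reads = counts["primary_mapped"] + counts["unmapped"]
    counts["mapping_rate"] = counts["primary_mapped"] / reads if reads else 0.0
    return counts
```

The resulting dictionary can be printed directly in an R Markdown report (for example from a Python chunk) alongside the aligner's own log output.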
Compile the .Rmd file and send both the .Rmd and the HTML files to angsd_2019@zoho.com by Monday night.