With the sequencing method we used (short paired-end Illumina sequencing), the output from the sequencing machine for each sample is a long list of short DNA sequences, called reads, from that sample. Many steps are needed to turn those pieces of DNA into meaningful conclusions about which parts of the genome determine a trait. In this project, the goal is to identify the genetic loci that determine host preference in Aedes aegypti mosquitoes by analyzing DNA from samples with known host preferences.
Many tools are available for performing all sorts of operations on genetic data. When building a bioinformatics pipeline, the challenge mostly comes down to two things: identifying which steps to take, based on the type of data you have and the type of analysis you are doing, and selecting the best tools to accomplish those steps. When I was creating this pipeline, I read a large number of academic publications to learn which tools would best fit our project, and I collaborated closely with a PhD student in the lab to learn which biological concepts are important to consider when building the pipeline.