Changes between two versions
Showing changes to "Remote Blast" between 2015-11-02 11:18 by Lars Arvestad and 2015-11-30 19:02 by Lars Arvestad.
Remote Blast
In this assignment, you will use BioPython to run a Blast job at NCBI.
 Data 
Download the human CST3 protein sequence.
 Preliminaries 
 * Figure out what Blast does and how it works.
 * Go to the NCBI Blast web page (find it yourself...) and start a Blast comparison with CST3 against the so-called non-redundant protein database: nr. This means that your comparison should use the "blastp" subprogram (protein query against protein DB).
 * Find the highest-scoring hit in mouse.
 * What is the E-value, and what does that mean?
 * Look at the actual alignment. How alike are the sequences?
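As a rough aid for the E-value question, here is a minimal sketch assuming the standard bit-score relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the total database length. The function name and the lengths used below are made up for illustration, and the sketch ignores the effective-length corrections that Blast itself applies, so the numbers will not match a real report exactly.

```python
def expected_hits(bit_score, query_length, database_length):
    """Approximate E-value: chance hits expected with at least this bit score.

    Uses E = m * n * 2**(-S'); real Blast reports use corrected
    (effective) lengths, so treat the result as an approximation.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical query of 150 residues against a database of 1e9 residues.
for score in (20, 40, 60):
    print(score, expected_hits(score, 150, 1e9))
```

Each additional bit of score halves the expected number of chance hits, which is why high-scoring alignments get very small E-values.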
  
 Programming 
Write a Python program that conducts a Blast search of a given protein sequence against the nr database at NCBI. There is good support for this in BioPython.
 Requirements 
 * Your program is an executable Python file taking one input: a file containing a protein sequence.
 * Output is the blast report given in XML format, presented to stdout.
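One possible shape for such a program is sketched below. It is not a reference solution. It assumes that your installed BioPython provides `Bio.Blast.NCBIWWW.qblast` and that the input file holds a single sequence that can be passed on as text. The helper names `read_query` and `run_remote_blast`, and the `qblast` parameter (a hook for swapping out the network call), are invented for this sketch.

```python
#!/usr/bin/env python3
"""remoteblast sketch: send a protein sequence to NCBI Blast, print the XML."""
import sys


def read_query(path):
    """Return the contents of the sequence file, passed on unchanged."""
    with open(path) as handle:
        return handle.read()


def run_remote_blast(query, qblast=None):
    """Run blastp against nr and return the XML report as a string.

    `qblast` is a hypothetical hook for replacing the network call;
    by default BioPython's NCBIWWW.qblast is used (assumed available).
    """
    if qblast is None:
        # Imported lazily so the helpers can be used without BioPython.
        from Bio.Blast import NCBIWWW
        qblast = NCBIWWW.qblast
    result = qblast("blastp", "nr", query, format_type="XML")
    report = result.read()
    # Some BioPython versions may return bytes rather than text.
    if isinstance(report, bytes):
        report = report.decode("utf-8")
    return report


if __name__ == "__main__":
    if len(sys.argv) == 2:
        sys.stdout.write(run_remote_blast(read_query(sys.argv[1])))
    else:
        sys.stderr.write("usage: remoteblast <protein-sequence-file>\n")
```

A remote search can take a while, since the job is queued at NCBI before the report comes back.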
  To present:
 
 * You should be able to give a brief explanation of Blast.
 * What is the E-value, and what does that mean?
 * You should understand the online output from a Blast run.
 * Your code for the remoteblast program.
 * Demonstrate a successful run of remoteblast.
 Example session 
Usage of your program should be something like this:
orange-01> ./remoteblast cst3.fa
<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">
<BlastOutput>
  <BlastOutput_program>blastx</BlastOutput_program>
  <BlastOutput_version>blastx 2.2.6 [Apr-09-2003]</BlastOutput_version>
  <BlastOutput_reference>~Reference: Altschul, Stephen F., et al (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search~programs", Nucleic Acids Res. 25:3389-3402.</BlastOutput_reference>
  <BlastOutput_db>sprot</BlastOutput_db>
  <BlastOutput_query-ID>lcl|QUERY</BlastOutput_query-ID>
  <BlastOutput_query-def>CST3</BlastOutput_query-def>
...
Output trimmed for brevity! Note that "orange-01>" is the command-line prompt.
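To get a feel for the report structure, the header fields shown in the session above can be read with Python's standard `xml.etree.ElementTree`. This is only a sketch: the function name `summarize_report` is invented here, and the sample document contains just the three fields it reads, copied from the example output.

```python
import xml.etree.ElementTree as ET


def summarize_report(xml_text):
    """Pick out program, database and query name from a Blast XML report."""
    root = ET.fromstring(xml_text)
    return {
        "program": root.findtext("BlastOutput_program"),
        "database": root.findtext("BlastOutput_db"),
        "query": root.findtext("BlastOutput_query-def"),
    }


# Cut-down sample built from the fields in the example session.
sample = """<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_program>blastx</BlastOutput_program>
  <BlastOutput_db>sprot</BlastOutput_db>
  <BlastOutput_query-def>CST3</BlastOutput_query-def>
</BlastOutput>"""

print(summarize_report(sample))
```

A field that is missing from the report comes back as `None`, so the function can also be tried on trimmed output.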