Stev 2055 PC lab

Introduction: The best way to learn how to do something is by doing it.
You can't really appreciate how things work, or even what is
available to use, until you actually find a need to use
something. This is true of bioinformatics. So far, you have
had a taste of bioinformatics by using the tutorials and
completing the first two guided exercises. Next, you need a
reason to go beyond the introduction.

Now is the time to start asking some relevant questions
and to generate an independent project relating bioinformatics
to virology. By doing this, you can learn more about
molecular biology, genetics, and evolution while at the same
time learning something specific about a virus or group of
viruses. You may wish to focus on structural characteristics
of a protein and use modeling tools; or to focus on sequence
conservation of a domain or motif and use alignment and/or
phylogenetic tools; or you may wish to focus on variable
sequences of receptor proteins. Before deciding, you may
want to explore your texts and browse on-line, such as at
the Taxonomy site or through All the Virology on
the Web. You might want to check out some current topics
at the CDC. You can also go to PubMed at NCBI, then
search the journal literature for ideas and background. For
more ideas, go to the Links
page and do some serious browsing.

You may choose to do this project alone, or with one or
two partners. If working in a group, clearly state in your
proposal what each person's responsibilities are to the
project. A productive alternative to a true group project is
for each member of a partnership to define their focus
within a general idea framework. For example, each of three
members selects a different virus group, yet they all do the
same type of applications and analysis in order to test a
central hypothesis. [This way you can collaborate on the
learning, but you each have the satisfaction of doing your
own thing.]

Draft proposal due 3/12. Returned 3/14.

The normal process of coming up with an idea and
executing a project goes something like the following: "I
have no idea at all. I don't know enough about any of this
stuff yet." ... "There is so much to sort through- how am I
going to choose?" ... "Well, I had the perfect idea,
but it would take months to do right." ... "I just spent
hours downloading & learning how to use the software I
found I needed. Then it crashed." ... "I'm dead. My
hypothesis is bogus or maybe there just aren't enough
sequences available yet to prove my point." ... "I'm saved!
This new site is perfect- I just wish I knew about it at the
beginning. Then I could have done this assignment in just a
couple of hours."

Then there are the procrastinators: "It's due when?
Tomorrow? Oh, gee... Hey, do you want to go out and help me
brainstorm a good excuse?"

1. First, write down a few key ideas during a bit
of browsing and brainstorming.

2. Next, pre-search these ideas to see if there is
some data available which would be useful. The search
strategies introduced in Exercise 2 can serve as a basis for
getting started. For a given target, you may find more
nucleotide sequences than protein sequences, so you may want
to work with the nucleotides rather than the proteins. On
the other hand, you may be more interested in comparing
structures, in which case you'll want to restrict your search to
proteins for which there is structural data. If you are
stuck, talk to me or to another student.

3. You may very well need to locate and use some
applications and analysis tools with which
you are unfamiliar. In fact, you may not even know what to
call them in order to search for them. Browse through some
of the key sites listed below to learn
about the variety of applications and tools available. These
may help you sharpen your focus on what you want to do and
how to do it.

4. Decide which of your ideas shows the most
promise and write a brief draft proposal. Read about
the goal below to help scale the size of the project you are
proposing. If working with a partner or two, clearly state
your management plan. Due in one week [3/12].
Hardcopies will be available for pickup with feedback within
48 hours; e-mail submitted copies will be answered within 48
hours. [Late submissions may not be returned so quickly,
so be prompt.]

You have a choice of one of the following methods of
reporting your results.

1. For an individual project, a finished hardcopy
report on the search is to be limited to three pages of text
and four pages of appended graphs, tables, images, etc. For
a group report on a more comprehensive topic, there is a
limit of six pages of text and eight pages of appended
material. Although somewhat flexible, the basic format
should include:

2. Post a web page report, which can then be
linked to the class web site to share with the rest of the
class. You should include clear explanations and any
necessary images, tables, etc. You may include links to
other pages or sites as appropriate. Logistics will be
discussed in section meetings.

Points = 10. Due 4/18. Grading will be on
content, organization, spelling, & grammar.

Note: For those eager-beavers who want to combine
this project report with their personal Paper Chase question
[worth 30 points], please see me first. A combined
report/Paper Chase question would be worth 40 points, due
4/18, and would follow the above format, with a limit of six
pages of text and eight pages of appended material.

1. Once you receive feedback on your proposal,
begin mapping and identifying the components of the
project. Tag possible applications to components and/or
questions in your map. Identify unknowns, difficulties, or
sticking points.

2. Prepare a problems & questions list before the
3/19 or 3/21 lab sessions. Bring your map and
problems/questions list to lab. During the lab sessions,
we will try to answer & discuss as many of these as
possible. You may find others wanting to learn about some of
the same applications or having similar difficulties. Feel
free to form collaborative groups.

3. Now that you are organized, you have about 4
weeks to complete your project. [For most, getting to
this point is the hardest part. The rest should come
together nicely.]

Key sites you may want to use:

1. NCBI: Nice to start someplace familiar. Sub-sites of interest: [there are others, as
well]

2. Performing multiple sequence alignments
[MSA] at EBI: http://www.ebi.ac.uk/clustalw/
: Alternative site if the above site can't be
accessed.

3. Biology Workbench:
Biology Workbench is a powerful integrated tool, allowing
you to search multiple databases simultaneously and to use a
very wide variety of tools to examine proteins, nucleotide
sequences, alignments, and structures. All you need to do
the first time is to choose a user name & password. The
cool thing is that it can save your work sessions, so you
can come back to them, even months later. You can upload and
download from it as well, so you can easily transfer
material to a log, and to a report. The downside to using
Biology Workbench is that it takes a little practice to
navigate while following one simple rule: do not use the "back"
button, because it can cause problems.

If it's so cool, why didn't I introduce you to it in the
first place? Several reasons. I was introduced to it for the
first time at the end of July, and I'm still learning about
its foibles. In the meantime, it has been undergoing some
revisions. Since there is so much to it, it would be nice to
introduce it by having some sequence groups available as
"lab sets" and have exercises set up to work with them.
Since I haven't been able to do that yet, I decided to ease
up to it this time around.

I will give a basic introduction in lab. Why should
you be interested? Well, if you are interested in
alignments and/or phylogeny, this one-stop site for
applications is a real blessing and well worth the time
spent getting up to speed. If you have had problems running
MSAs, you will find this site a real time-saver.

4. Additional structure modeling and analysis tools:
[If not working in Biology Workbench.]

5. Other major sites: Take the time to see what
each has to offer.

National Biotechnology Information Facility-
1000's of link resources:

Resource list at NOAA is nicely organized:
http://www.nwfsc.noaa.gov/bioinformatics.html

Access to good documentation on many applications at
Oxford: [left-hand frame- index]

Additional access to lots of cool applications via
Pasteur Institute: [Boxshade, PHYLIP, and
treealign to name but a few]
http://bioweb.pasteur.fr/intro-uk.html

San Diego Supercomputing Center has lots of
useful links and resources:

6. On-line instruction and support:
http://biotech.icmb.utexas.edu/pages/bioinfo.html

A detailed, highly linked site: [Note: Some
applications have restricted access, but there is still a
significant amount of useful information.]
http://www-biology.ucsd.edu/others/dsmith/workshop.html

Make use of the forum "Project Strategies", to ask
questions, beg for input, or offer suggestions or link
sites. Check back periodically and read new postings. They
may contain information or answers which you'll find
useful.
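
For the pre-search step (2. Next, pre-search these ideas...), it can help to compare how many nucleotide and protein records exist for a target before committing to one or the other. The following is a minimal Python sketch, assuming NCBI's public E-utilities "esearch" service, which this handout does not itself mention. The function names are hypothetical and chosen only for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: NCBI's E-utilities esearch endpoint (not named in the handout).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(db, term):
    """Build an esearch URL that asks only for the number of matching records."""
    query = urllib.parse.urlencode({"db": db, "term": term, "rettype": "count"})
    return f"{ESEARCH}?{query}"


def parse_count(xml_text):
    """Read the integer inside the <Count> element of an esearch XML reply."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("Count"))


def record_count(db, term):
    """Query NCBI (needs network access) and return the record count."""
    with urllib.request.urlopen(build_count_url(db, term), timeout=30) as reply:
        return parse_count(reply.read().decode("utf-8"))
```

Calling record_count("nucleotide", term) and record_count("protein", term) with the same search term (for example, a virus name plus a gene name of your choosing) gives a rough sense of which database holds more material for your idea. Treat the counts as a starting point only, since they say nothing about sequence quality or redundancy.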
Computer Exercise 3: Bioinformatics Projects
Project report due 4/18. For details on the report,
see goal below.
Part 1: Generating a project idea
Part 2: Defining your goal
Part 3: Reaching the goal
http://www.ncbi.nlm.nih.gov
http://www2.ebi.ac.uk/clustalw/
http://workbench.sdsc.edu
or try European Bioinformatics Institute: Access
to much more than ClustalW.
A basic general introduction:
Updated 1/5/02 by thatcher@sonoma.edu