Virology

Home | Index | Syllabus | Schedule | Study aids | Computing | Links | Interactive

 


Computer Exercise 3: Bioinformatics Projects

Stev 2055 PC lab

Introduction:

The best way to learn how to do something is by doing it. You can't really appreciate how things work, or even what is available to use, until you actually need to use something. This is true of bioinformatics. So far, you have had a taste of bioinformatics by using the tutorials and completing the first two guided exercises. Next, you need to give yourself a reason to go beyond the introduction.

Now is the time to start asking some relevant questions and generate an independent project relating bioinformatics to virology. By doing this, you can learn more about molecular biology, genetics, and evolution while at the same time learning something specific about a virus or group of viruses. You may wish to focus on structural characteristics of a protein and use modeling tools; or to focus on sequence conservation of a domain or motif and use alignment and/or phylogenetic tools; or you may wish to focus on variable sequences of receptor proteins. Before deciding, you may want to explore your texts and browse on-line, such as at the Taxonomy site or through All the Virology on the Web. You might want to check out some current topics at the CDC. You can also go to PubMed at NCBI, then search the journal literature for ideas and background. For more ideas, go to the Links page and do some serious browsing.

You may choose to do this project alone, or with one or two partners. If working in a group, clearly state in your proposal what each person's responsibilities are. A productive alternative to a true group project is for each member of a partnership to define their own focus within a shared general framework. For example, each of three members selects a different virus group, yet all of them use the same types of applications and analyses in order to test a central hypothesis. [This way you can collaborate on the learning, but you each have the satisfaction of doing your own thing.]

Draft proposal due 3/12. Returned 3/14.
Project report due 4/18. For details on the report, see Part 2: Defining your goal, below.


Part 1: Generating a project idea

The normal process of coming up with an idea and executing a project goes something like the following: "I have no idea at all. I don't know enough about any of this stuff yet." ... "There is so much to sort through. How am I going to choose?" ... "Well, I had the perfect idea, but it would take months to do right." ... "I just spent hours downloading & learning how to use the software I found I needed. Then it crashed." ... "I'm dead. My hypothesis is bogus or maybe there just aren't enough sequences available yet to prove my point." ... "I'm saved! This new site is perfect. I just wish I had known about it at the beginning. Then I could have done this assignment in just a couple of hours."

Then there are the procrastinators: "It's due when? Tomorrow? Oh, gee... Hey, do you want to go out and help me brainstorm a good excuse?"

1. First, write down a few key ideas during a bit of browsing and brainstorming.

2. Next, pre-search these ideas to see whether useful data are available. The search strategies introduced in Exercise 2 can serve as a basis for getting started. For a given target, you may find more nucleotide sequences than protein sequences, so you may want to work with the nucleotides rather than the proteins. On the other hand, you may be more interested in comparing structures, in which case you'll want to restrict your search to proteins for which there is structural data. If you are stuck, talk to me or to another student.

3. You may very well need to locate and use some applications and analysis tools with which you are unfamiliar. In fact, you may not even know what to call them in order to search for them. Browse through some of the key sites listed below to learn about the variety of applications and tools available. These may help you sharpen your focus on what you want to do and how to do it.

4. Decide which of your ideas shows the most promise and write a brief draft proposal. Read about the goal in Part 2 below to help you scale the size of the project you are proposing. If working with a partner or two, clearly state your management plan. Due in one week [3/12]. Hardcopies will be available for pickup with feedback within 48 hours; copies submitted by e-mail will be answered within 48 hours. [Late submissions may not be returned so quickly, so be prompt.]
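If you would rather script the pre-search in step 2 than click through Entrez, the following is a minimal sketch, not part of the assigned exercises. It assumes NCBI's E-utilities ESearch service and its JSON reply layout (an "esearchresult" object with a "count" field); the function names and the example search term are hypothetical, chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term):
    """Build an ESearch URL; retmax=0 asks for the hit count without any IDs."""
    params = {"db": db, "term": term, "retmax": 0, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


def parse_count(response_text):
    """Pull the hit count out of an ESearch JSON reply (assumed layout)."""
    return int(json.loads(response_text)["esearchresult"]["count"])


def record_count(db, term):
    """Query NCBI over the network and return how many records match the term."""
    with urlopen(esearch_url(db, term), timeout=30) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling record_count("nucleotide", term) and record_count("protein", term) with the same term gives a quick feel for whether you will have more nucleotide or protein sequences to work with. The counts change as the databases grow, so record the date of each search in your Methods.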


Part 2: Defining your goal

You have a choice of one of the following methods of reporting your results.

1. For an individual project, a finished hardcopy report is limited to three pages of text and four pages of appended graphs, tables, images, etc. For a group report on a more comprehensive topic, the limit is six pages of text and eight pages of appended material. Although somewhat flexible, the basic format should include:

  • Title
  • Author(s)
  • Abstract- brief
  • Hypothesis or purpose of project [besides earning points]
  • Introduction/Background- brief
  • Methods- include databases; websites & URLs; search and analysis tools, along with settings
  • Results- include reference to images & tables
  • Discussion/Conclusions- make clear the significance of results
  • References- literature, web sites- including support sites you used

2. Post a web page report, which can then be linked to the class web site to share with the rest of the class. You should include clear explanations and any necessary images, tables, etc. You may include links to other pages or sites as appropriate. Logistics will be discussed in section meetings.

Points = 10. Due 4/18.  Grading will be on content, organization, spelling, & grammar.

Note: For those eager beavers who want to combine this project report with their personal Paper Chase question [worth 30 points], please see me first. A combined report/Paper Chase question would be worth 40 points, due 4/18, and would follow the above format, with a limit of six pages of text and eight pages of appended material.


Part 3: Reaching the goal

1. Once you receive feedback on your proposal, begin mapping and identifying the components of the project. Tag possible applications to components and/or questions in your map. Identify unknowns, difficulties, or sticking points.

2. Prepare a problems & questions list before the 3/19 or 3/21 lab sessions. Bring your map and problems/questions list to lab. During the lab sessions, we will try to answer & discuss as many of these as possible. You may find others wanting to learn about some of the same applications or having similar difficulties. Feel free to form collaborative groups.

3. Now that you are organized, you have about 4 weeks to complete your project. [For most, getting to this point is the hardest part. The rest should come together nicely.]

 

Key sites you may want to use:

1. NCBI: Nice to start someplace familiar.

http://www.ncbi.nlm.nih.gov 

Sub-sites of interest: [there are others, as well]

  • BLAST
  • Entrez- nucleotides, proteins, genome, etc.
  • Structure

2. Performing multiple sequence alignments [MSA] at EBI:

http://www2.ebi.ac.uk/clustalw/

http://www.ebi.ac.uk/clustalw/ : Alternative site if the above site can't be accessed.
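Once you have an alignment back from ClustalW, one simple way to look at sequence conservation is to score each column. The following is a rough sketch under stated assumptions, not a feature of ClustalW itself: it assumes you have already copied the aligned sequences out as equal-length strings with "-" for gaps, and the function name and toy sequences are made up for illustration.

```python
from collections import Counter


def column_conservation(aligned):
    """For each alignment column, return the fraction of sequences that share
    the most common residue. Gaps ("-") never count as a match."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned):
        residues = Counter(c for c in column if c != "-")
        top = residues.most_common(1)[0][1] if residues else 0
        scores.append(top / len(aligned))
    return scores


# Toy aligned sequences, invented for illustration only.
toy = ["MKT-AG", "MKTLAG", "MRTLSG"]
scores = column_conservation(toy)
# Positions (counting from 0) where every sequence has the same residue.
conserved = [i for i, s in enumerate(scores) if s == 1.0]
```

For the toy set, columns 0, 2, and 5 are fully conserved. This is a crude identity score; it ignores similar amino acids, which the alignment and analysis tools at the sites listed here handle more carefully.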


3. Biology Workbench:

http://workbench.sdsc.edu

Biology Workbench is a powerful integrated tool, allowing you to search multiple databases simultaneously and to use a very wide variety of tools to examine proteins, nucleotide sequences, alignments, and structures. All you need to do the first time is to choose a user name & password. The cool thing is that it can save your work sessions, so you can come back to them, even months later. You can upload and download from it as well, so you can easily transfer material to a log, and to a report. The downside to using Biology Workbench is that navigating it takes a little practice, and you must follow one simple rule: do not use your browser's "back" button, because it can cause problems.

If it's so cool, why didn't I introduce you to it in the first place? Several reasons. I was introduced to it for the first time at the end of July, and I'm still learning about its foibles. In the meantime, it has been undergoing some revisions. Since there is so much to it, it would be nice to introduce it by having some sequence groups available as "lab sets" and have exercises set up to work with them. Since I haven't been able to do that yet, I decided to ease up to it this time around.

I will give a basic introduction in lab. Why should you be interested? Well, if you are interested in alignments and/or phylogeny, this one-stop site for applications is a real blessing and well worth the time spent getting up to speed. If you have had problems running MSAs, you will find this site a real time-saver.

4. Additional structure modeling and analysis tools: [If not working in Biology Workbench.]


5. Other major sites: Take the time to see what each has to offer.

European Bioinformatics Institute: Access to much more than ClustalW.

http://www.ebi.ac.uk

National Biotechnology Information Facility: thousands of linked resources:

http://www.nbif.org/

Resource list at NOAA is nicely organized:

http://www.nwfsc.noaa.gov/bioinformatics.html

Access to good documentation on many applications at Oxford: [left-hand frame- index]

http://www.molbiol.ox.ac.uk/

Additional access to lots of cool applications via Pasteur Institute: [Boxshade, PHYLIP, and treealign to name but a few]

http://bioweb.pasteur.fr/intro-uk.html

San Diego Supercomputing Center has lots of useful links and resources:

http://www.sdsc.edu/ResTools/biotools/biotools20.html

6. On-line instruction and support:

A basic general introduction:

http://biotech.icmb.utexas.edu/pages/bioinfo.html

A detailed, highly linked site: [Note: Some applications have restricted access, but there is still a significant amount of useful information.]

http://www-biology.ucsd.edu/others/dsmith/workshop.html

 

Interactive discussion forum:

Make use of the forum "Project Strategies" to ask questions, beg for input, offer suggestions, or post links to useful sites. Check back periodically and read new postings. They may contain information or answers which you'll find useful.

 


 Updated 1/5/02 by thatcher@sonoma.edu