Wednesday, 10 September 2014

MULTIPLE SEQUENCE ALIGNMENT

Manually align these sequences to search for degrees of relatedness
Set your own scores for gap penalties and matches
MAKGOBA
ACTCCGTCTCAATATGTCTCA
CALCIUM
CCGACTCCGTCTCAATATGTC
SARAH
ACTCCGTTATCTCAATATGTC
SAMMY
GATGTGATGATACGTAGATTT
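Once you have lined two sequences up by hand, the score of that alignment under your own scheme can be checked with a few lines of Python. This is only a sketch: the function name and the default scores (match +1, mismatch -1, gap -2 per gap column) are assumptions chosen for illustration, not values prescribed for the exercise.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Sum the column scores of two equal-length aligned rows ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue              # a column holding two gaps is ignored
        if x == "-" or y == "-":
            total += gap          # one residue against a gap
        elif x == y:
            total += match        # identical residues
        else:
            total += mismatch     # different residues
    return total


# MAKGOBA and CALCIUM, shifted by hand so that the shared stretch lines up.
makgoba = "---ACTCCGTCTCAATATGTCTCA"
calcium = "CCGACTCCGTCTCAATATGTC---"
print(score_alignment(makgoba, calcium))
```

Changing the three score arguments shows how the choice of gap penalty affects which hand alignment looks best.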
Now use ClustalW to align the following proteins, given their protein IDs from UniProt.
P12882
Q9UKX2
P02565 
P11055 
P13541 
P12847 
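ClustalW needs the sequences themselves, usually as one FASTA file. The sketch below collects them by accession. It assumes the UniProt REST address pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta`; the function names and the output file name are hypothetical. Nothing is downloaded until `fetch_fasta` is called.

```python
from urllib.request import urlopen

# Assumed UniProt REST pattern for one entry in FASTA format.
UNIPROT_FASTA = "https://rest.uniprot.org/uniprotkb/{acc}.fasta"

ACCESSIONS = ["P12882", "Q9UKX2", "P02565", "P11055", "P13541", "P12847"]


def fasta_url(accession):
    """Build the download address for one accession (stray spaces removed)."""
    return UNIPROT_FASTA.format(acc=accession.strip())


def fetch_fasta(accessions):
    """Download every entry and join them into one multi-FASTA string."""
    records = []
    for acc in accessions:
        with urlopen(fasta_url(acc)) as response:
            records.append(response.read().decode("utf-8").strip())
    return "\n".join(records) + "\n"


# List the addresses without touching the network.
for acc in ACCESSIONS:
    print(fasta_url(acc))

# To save the sequences for ClustalW (requires internet access):
# with open("proteins.fasta", "w") as handle:
#     handle.write(fetch_fasta(ACCESSIONS))
```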

Monday, 8 September 2014

SEQUENCE ALIGNMENT

Pairwise Sequence Alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).
By contrast, Multiple Sequence Alignment (MSA) is the alignment of three or more biological sequences of similar length. From the output of MSA applications, homology can be inferred and the evolutionary relationship between the sequences studied.

Global Alignment

Global alignment tools create an end-to-end alignment of the sequences to be aligned. There are separate forms for protein or nucleotide sequences.
Needle (EMBOSS)
EMBOSS Needle creates an optimal global alignment of two sequences using the Needleman-Wunsch algorithm.
Stretcher (EMBOSS)
EMBOSS Stretcher uses a modification of the Needleman-Wunsch algorithm that allows larger sequences to be globally aligned.
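To see what a global alignment tool computes, here is a minimal Needleman-Wunsch sketch. It is not the EMBOSS implementation: it uses a simple linear gap penalty and made-up default scores (match +1, mismatch -1, gap -2), whereas Needle uses a substitution matrix with separate gap open and gap extend penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and first row: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: best of diagonal (pair), up (gap in b), left (gap in a).
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to the top-left corner.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("ACGT", "AGT")
print(total)
print(top)
print(bottom)
```

Because the alignment is end-to-end, every residue of both sequences appears in the output, paired either with a residue or with a gap.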

Local Alignment

Local alignment tools find one, or more, alignments describing the most similar region(s) within the sequences to be aligned. There are separate forms for protein or nucleotide sequences.
Water (EMBOSS)
EMBOSS Water uses the Smith-Waterman algorithm (modified for speed enhancements) to calculate the local alignment of two sequences.
Matcher (EMBOSS)
EMBOSS Matcher identifies local similarities between two sequences using a rigorous algorithm based on the LALIGN application.
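The difference from global alignment is easiest to see in code. The sketch below is a basic Smith-Waterman with a linear gap penalty and illustrative default scores (match +2, mismatch -1, gap -2). It is an assumption-laden teaching version, not the speed-optimised variant that EMBOSS Water uses.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, segment_a, segment_b) for one best local alignment."""

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            # A cell never drops below zero: an alignment may start anywhere.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

Only the best-matching region is reported; the flanking residues that do not align well are simply left out.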

Genomic Alignment
Genomic alignment tools concentrate on alignments of DNA to DNA (or of other sequences, such as proteins, to DNA) while accounting for characteristics present in genomic data.
Wise2DBA 
Wise2DBA (DNA Block Aligner) aligns two sequences under the assumption that the sequences share a number of colinear blocks of conservation separated by potentially large and varied lengths of DNA in the two sequences.
GeneWise 
GeneWise compares a protein sequence to a genomic DNA sequence, allowing for introns and frameshifting errors.
PromoterWise 
PromoterWise compares two DNA sequences allowing for inversions and translocations, ideal for promoters.

CLASSWORK NO. 2
GIVEN THE 2 SEQUENCES BELOW, USE PAIRWISE ALIGNMENT TOOLS TO DETERMINE THE EXTENT OF IDENTITY BETWEEN THEM
ACTCCGTCTCAATATGTCTCAAGATGGCGGCCAATGTGGGATCGATGTTTCAATATTGGAAGCGCTTTGATTTACAGCAGCTGCAGGATTTGCGCAAGCAGGTAGCGCCGCTGCTGAAGAGTTTCCAAGGAGAGATTGATGCACTGAGTAAAAGAAGCAAGGAAGCTGAAGCAGCTTTCTTGAATGTCTACAAAAGATTGATTGACGTCCCAGATCCCGTACCAGCTTTGGATCTCGGACAGCAACTCCAGCTCAAAGTGCAGCGCCTGCACGATATTGAA
AND 
ACTCCGTCTCAATATGTCTCAAGACAATGTGGGATCGATGTTTCAATATTGGAAGCGCTTTGATTTACAGCAGCTGCAGGATTTGCGCAAGCAGGTAGCGCCGCTGCTGAAGAGTTTCCAAGGAGAGATTGATGCACTGAGTAAAAGAAGCAAGGAAGCTGAAGCATTGAATGTCTACAAAAGATTGAAAGACGTCCCAGATCCCGTACCAGCTTTGGATCTCGGACAGCAACTCCAGCTCAAAGTGCAGCGCCTGCACGATATTGAATGGCGGCGCTTTC
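Once a pairwise tool has returned the two aligned rows, the extent of identity can be recomputed by hand or with a short function. The sketch below uses one common convention: identical columns divided by the full alignment length, gap columns included. Other tools may use a different denominator, so treat the convention and the function name as assumptions. The short rows are toy data, not the classwork sequences.

```python
def percent_identity(row_a, row_b):
    """Identical columns as a percentage of the alignment length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    if not row_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both rows hold the same residue.
    identical = sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")
    return 100.0 * identical / len(row_a)


# Toy aligned rows; paste the rows from the tool output here instead.
row_a = "ACGT-ACGT"
row_b = "ACGTTACCT"
print(round(percent_identity(row_a, row_b), 1))
```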

Bioinformatics (SCIN072) MODULE OUTLINE




Module Title
BIOINFORMATICS AND SPECIAL TOPICS

Module Code
SBIN704
SCIN072

No. of Credits
20

Department
BMBT

School
SMLS
Pre-requisites Module Code
NONE
Co-requisites Module Code
NONE

Module Lecturers
Dr KLM Moganedi
Dr TM Matsebatlela
Dr VP Bagla

Office Address
3026 Life Science building (Moganedi)
2nd level Life Science building (Matsebatlela)

Email address
Kgabo.Moganedi@ul.ac.za
Thabe.Matsebatlela@ul.ac.za
victor.bagla@ul.ac.za

Telephone No.
3630 (Moganedi)
2337 (Matsebatlela)
3855 (Bagla)

Consultation Time
10h00 – 14h00 Monday to Friday

Lecture Periods
Mondays – Thursdays
08h00 – 10h00
Practical facilitator
Dr KLM Moganedi
Office address
3026 Life Science building
E-mail address
Kgabo.Moganedi@ul.ac.za
Telephone no.
3630

Important Dates
19/09 Test 1-Bioinformatics
18/09 Submission Assignment 1

Learning Hours
>3 hours per lecture
>Computer exercises during own time
Quarter/Semester
4th Quarter/ 2nd Semester
Module outline evaluated by
Prof I Ncube

Module Structure
09/09/2013 – 18/10/2013
4 x 180 minutes per week, presented between 7h30 and 12h00    
Assessment Method

Weighting

Description
Exercises
Assignment
Theory test
Summative Exam

Formative assessment: 60%
    Exercises: 25%
    Test: 35%
Summative assessment: 40%
    Theory: 70% (Bioinformatics 30%, Phytochemistry 30%, Phylogeny 10%, Comprehension 30%)
    Comprehension: 30%
Minimum Formative Assessment mark for exam admission = 40%
Final mark = 60% Formative Assessment mark + 40% Summative Assessment mark

Minimum Final Assessment mark to Pass = 50%
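As a worked example of the weighting above, the sketch below combines a formative and a summative mark (both in percent) using the 60/40 split, the 40% exam admission threshold and the 50% pass mark. The function names are hypothetical, and the handling of a student who is not admitted to the exam is an assumption made for illustration.

```python
def final_mark(formative, summative):
    """Return the weighted final mark, or None if exam admission is not met."""
    if formative < 40:
        return None  # below the minimum formative mark for exam admission
    # 60% formative + 40% summative, computed on whole-number weights.
    return (60 * formative + 40 * summative) / 100


def has_passed(formative, summative):
    """True if the student is admitted and the final mark is at least 50%."""
    mark = final_mark(formative, summative)
    return mark is not None and mark >= 50


print(final_mark(55, 50))
print(has_passed(55, 50))
```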
MODULE DESCRIPTION
The theory component of this module is presented in contact sessions by the lecturers. The exercises that complement theory will be performed by students on their own but guided by written instructions and descriptions from the lecturer. These will be discussed in class. The emphasis is on collection, storage and retrieval of biological data and how this information can be used to discern relatedness between organisms and to determine functions and structures of genes and proteins. The component of phytochemistry will place emphasis on phytochemical contents and medicinal value of plants in drug discovery and quality control of herbal products.
MODULE CONTENT
Collection and storage of sequences; Biological data sources; Biological Databases; Searching and retrieval of biological data from databases; Sequence Alignment; Prediction of protein-coding genes; cDNA based approach; Ab initio gene discovery; Gene identification through comparative analysis; Functional genomics: From gene sequence to function, mutagenesis as a tool for studying gene function; Transposon mutagenesis, gene knockout, knock-ins, gene silencing by antisense RNA and RNAi; Phylogenomics. Types of phytochemicals that are relevant in ethnopharmacology, the value of predicting them in extracts, approaches to predicting actives, and their extraction, fractionation and isolation. The module also deals with problems associated with in vitro and in vivo assays in ethnopharmacology.
MODULE OBJECTIVES
·      To introduce the importance of bioinformatics in biological research; source and storage of biological data.
·      To acquaint students with how to search and retrieve biological data from biological databases.
·      To present the importance of performing sequence alignments and how they are carried out.
·      To describe how protein-coding genes are predicted from nucleotide sequences and their functions are determined.
·       To learn how protein structures are predicted.
·      To present how sequences from different organisms can be compared in order to learn about their evolution in phylogenomics.
·      To provide insight into the medicinal value of plants and how best their medicinally active constituents can be harnessed, either as extracts, fractions or single entities, for drug discovery.
LEARNING OUTCOMES

After successfully completing the module, the student should show the ability to:
·     Explain how protein and nucleotide sequences are generated, stored and retrieved from databases
·     Describe the importance of protein and nucleic acid sequences in homology searches, inference of the function of the biomolecules and evolutionary relationships between organisms
·     Demonstrate the utilization of protein and nucleic acid sequences in homology searches and inference of evolutionary relationship between organisms 
·     Discuss different approaches of determining protein-coding genes from nucleotide sequences and their levels of reliability
·     Deduce gene function from mutagenesis data
·     Analyze and argue research information from a research paper
·     Describe the types of phytochemicals contained in plants and their biological relevance
·     Explain the approaches used to identify the presence of actives in a plant
·     Select the type of solvent to use for extraction of a desired phytochemical
·     Detect the presence of phytochemicals in a plant using the detection methods and reagents used for TLC
·     Identify the types of compounds associated with a particular biological activity
·     Extract, fractionate and isolate plant constituents
·     Explain why in vivo assays should complement in vitro assays
ASSESSMENT CRITERIA

The students must be able to:
·     Describe protein and DNA sequencing techniques
·     Differentiate types of databases according to biological data that is stored
·     Demonstrate searching and retrieval of biological data from databases
·     Explain the biological significance of performing sequence alignments
·     Describe different approaches of determining protein-coding genes from nucleotide sequences and their levels of reliability
·     Compare and contrast different approaches of determining the function of a gene
·     Correlate sequence information and degree of relatedness between organisms
·     Explain the relevance and describe the applicability of a special topic to the research niche area in the department
·     Explain the scientific merit of a selected research paper
·     Indicate how the aim and methodology of a research paper correlate with its findings
·     For phytochemistry, the criteria are as stated in the learning outcomes
REFERENCE MATERIALS FOR THE MODULE

These books are available in the department (BMBT) for student loan:
1.    Lesk AM. 2002. Introduction to Bioinformatics. Oxford University Press.
2.    Gibson G and Muse SV. 2004. A Primer of Genome Science. 2nd ed. Sinauer Associates, Inc.
3.    Xiong J. 2006. Essential Bioinformatics. Cambridge University Press.
4.    Felsenstein J. 2004. Inferring Phylogenies. Sinauer Associates, Inc.
5.    Page RDM and Holmes EC. 1998. Molecular Evolution. Blackwell Publishing Company.
TENTATIVE LIST OF LECTURE TOPICS

1.    Generation of biological data
2.    Collection and storage of biological data
o    Biological Databases
3.    Searching and retrieval of biological data from databases
4.    Sequence Alignment
5.    Prediction of protein-coding genes
o    Ab initio gene discovery
o    Gene identification through comparative analysis
6.    Functional genomics
7.    Phylogenomics
8.    Phytochemistry and its application in drug discovery and quality control of herbal preparation
9.    Types of phytochemicals
10.  How to predict the chemical types of actives present in a plant
11.  Approaches to prediction of chemical types of actives present in a plant
12.  Extraction, fractionation and isolation of plant constituents
13.  Problems associated with in vitro and in vivo assays in ethnopharmacology
14.  Value of predicting the chemical types of actives present in a plant
15.  Comprehension: Research paper