CSE – 535
Mid Term Exam Content
Bioinformatics: Bioinformatics is the science of storing, retrieving, and analyzing biological information.
- Bioinformatics is a highly interdisciplinary field involving many different types of specialists including biologists, molecular life scientists, computer scientists and mathematicians.
- The term ‘Bioinformatics’ was coined by Paulien Hogeweg and Ben Hesper to describe the ‘study of informatic processes in biotic systems’. Margaret O. Dayhoff, however, was a pioneer in the field of Bioinformatics.
- Bioinformatics includes biological studies that use computer programming as part of their methodology, as well as specific analysis “pipelines” that are used repeatedly, particularly in the field of genomics.
- Bioinformatics also tries to understand the organizational principles within nucleic acid and protein sequences; the large-scale study of proteins in particular is called proteomics.
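As a small illustration of using computer programming to analyze sequence information, the sketch below computes two basic properties of a DNA sequence: its GC content and its reverse complement. The function names and the example sequence are assumptions made up for this illustration, and the code only handles the four unambiguous bases A, C, G and T.

```python
# Minimal sketch of a tiny sequence-analysis step (illustrative only).
# Function names and the example sequence are hypothetical.

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGC"  # made-up example sequence
    print(gc_content(example))          # 0.5
    print(reverse_complement(example))  # GCAT
```

Real pipelines chain many such steps (reading sequence files, filtering, aligning, annotating) and usually rely on established libraries rather than hand-written helpers.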
Major research areas of Bioinformatics:
- Sequence Analysis.
- Genome Analysis.
- Computational evolutionary biology.
- Literature Analysis.
- Analysis of Gene expression.
- Analysis of Regulation.
- Analysis of protein expression.
- Analysis of mutation in cancer.
- Comparative Genomics.
- High-throughput Image Analysis.
Applications of Bioinformatics:
- Molecular Medicine
- Personalized Medicine
- Preventive Medicine
- Gene therapy
- Drug development
- Microbial genome application
- Waste clean up
- Climate change studies
- Alternative energy sources
- Antibiotic resistance
- Forensic analysis of microbes
- Bio-weapon creation
- Evolutionary studies
- Crop improvement
- Insect resistance
- Improve nutritional quality
- Development of drought-resistant varieties
- Veterinary science
List of Biological Databases:
# Protein Databases
- PDB (www.rcsb.org/pdb): A database for solved protein structure.
- Uniprot (https://www.uniprot.org/): A protein information database.
- CATH (www.cathdb.info/): It is a protein structure classification database.
# Disease Database
- OMIM (www.ncbi.nlm.nih.gov/omim): A database for genetic diseases.
- IEDB (www.iedb.org): An epitope database and prediction source.
# Metabolic Database
- HMDB (http://www.hmdb.ca/): A database for small-molecule metabolites found in the human body.
- ECMDB (www.ecmdb.ca/): A database for metabolites found in E. coli.
# Literature Database
- PubMed (https://www.ncbi.nlm.nih.gov/pubmed/): A database of biomedical literature citations and abstracts.
# Sequence Database
- GenBank (www.ncbi.nlm.nih.gov/genbank): A nucleotide sequence database.
- EMBL (www.embl.org/): A nucleotide sequence database.
# Pathway Database
- KEGG (Kyoto Encyclopedia of Genes and Genomes): A pathway and interaction network database.
- MINT (https://mint.bio.uniroma2.it/): The Molecular INTeraction database.
- BioGRID (www.thebiogrid.org/): A database for protein-protein interaction, genetic interaction, chemical interactions and post-translational modifications.
Drug Discovery Process:
Historical Milestones in the field of Bioinformatics:
1965 – Margaret Dayhoff – Atlas of Protein Sequence and Structure
1970 – Needleman-Wunsch algorithm
1977 – DNA sequencing & software to analyze it.
1981 – The concept of sequence motif
1981 – Smith-Waterman algorithm developed
1982 – GenBank release 3 made public
1982 – Phage Lambda genome sequenced
1983 – Sequence database searching algorithm
1985 – FASTP/FASTN: Fast sequence similarity searching
1988 – National Center for Biotechnology Information (NCBI) established at NIH/NLM
1988 – EMBnet network for database distribution
1990 – BLAST: Fast sequence similarity searching
1991 – EST: expressed sequence tag sequencing
1993 – Sanger Centre, Hinxton, UK.
1994 – EMBL European Bioinformatics Institute (EMBL-EBI), Hinxton, UK.
1995 – First bacterial genomes completely sequenced
1996 – Yeast genome completely sequenced
1997 – PSI-BLAST
1998 – Worm (multicellular) genome completely sequenced
1999 – Fly genome completely sequenced
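The Needleman-Wunsch algorithm listed among the milestones above computes a global alignment of two sequences by dynamic programming. The sketch below is a minimal illustration, not a reference implementation: the scoring values (match +1, mismatch -1, gap -1), the linear gap penalty, and the function name are assumptions chosen for this example.

```python
# Minimal sketch of Needleman-Wunsch global alignment (illustrative only).
# Scoring values and the function name are assumptions for this example.


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    # Fill the table: each cell takes the best of three possible moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```

The Smith-Waterman algorithm (1981) uses the same kind of table but clamps cell scores at zero and starts the traceback from the highest-scoring cell, which yields a local rather than a global alignment.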
Classical Tools in Bioinformatics:
# Database interface
– GenBank/EMBL/DDBJ, Medline, Swiss-Prot, PDB
# Sequence Alignment
– BLAST, FASTA, Clustal, MultAlin, DiAlign
# Structure Prediction
– SWISS-MODEL
# Gene Finding
– Genscan, GenomeScan, GeneMark, GRAIL
# Protein Domain Analysis
– Pfam, BLOCKS, ProDom
# Pattern Identification/Characterization
– Gibbs sampler, AlignACE, MEME
# Protein folding prediction
– PredictProtein, SWISS-MODEL
Final Exam Content
Classification of Databases:
Chemical structure of Nucleic Acid: