Dr. Prativa Deka
Associate Professor
Department of Botany
Mangaldai College, Mangaldai
E-Mail: pdeka.mld@gmail.com
Bioinformatics
Biologists
Collect Molecular Data:
DNA & Protein Sequences,
Gene Expression, etc.
Computer scientists
(+Mathematicians, Statisticians, etc.)
Develop Tools, Software, Algorithms
to Store and Analyze the Data.
Bioinformaticians
Study Biological Questions by
Analyzing Molecular Data
Bioinformatics: The field of science in which biology, computer
science and information technology merge into a single discipline
Paulien Hogeweg
From DNA to Genome
[Timeline figure, 1955 to 1985]
• Watson and Crick DNA model
• Sanger sequences insulin protein
• Sanger dideoxy DNA sequencing
• PCR (Polymerase Chain Reaction)
• ARPANET (early Internet)
• PDB (Protein Data Bank)
• Sequence alignment
• GenBank database
• Dayhoff’s Atlas
[Timeline figure, continued, 1990 to 2000]
• SWISS-PROT database
• NCBI
• World Wide Web
• BLAST
• FASTA
• EBI
• Human Genome Initiative
• First human genome draft
• First bacterial genome
• Yeast genome
05/05/2025 5
Biological Databases
What is a database?
– A collection of related data elements
• tables
• columns (fields)
• rows (records)
– Records retrieved using a query language
– Database technology is well established
• Tables (entities)
• basic elements of information to track, e.g.,
gene, organism, sequence, citation
• Columns (fields)
• attributes of tables, e.g. for citation table, title,
journal, volume, author
• Rows (records)
• actual data
• whereas fields describe what data is stored, the rows
of a table are where the actual data is stored
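The table, column and row vocabulary above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a real biological schema: the citation table follows the slide's example fields, and every value inserted is a placeholder.

```python
import sqlite3

# In-memory database; table and column names follow the slide's citation example.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE citation (
        id      INTEGER PRIMARY KEY,
        title   TEXT,
        journal TEXT,
        volume  INTEGER,
        author  TEXT
    )
    """
)

# Each tuple becomes one row (record). The values are placeholders, not real citations.
rows = [
    ("Example title A", "Example Journal", 1, "Author One"),
    ("Example title B", "Example Journal", 2, "Author Two"),
    ("Example title C", "Another Journal", 1, "Author One"),
]
conn.executemany(
    "INSERT INTO citation (title, journal, volume, author) VALUES (?, ?, ?, ?)",
    rows,
)

# Records are retrieved with a query language (SQL).
titles_by_author_one = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM citation WHERE author = ? ORDER BY title",
        ("Author One",),
    )
]
print(titles_by_author_one)
```

The columns describe what is stored (title, journal, volume, author), while each inserted tuple is one stored record.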
How do online databases work?
When you query an online database, your query is translated into SQL, the database
is interrogated, and the answer is displayed in your web browser.
• Your computer and browser (the “client”)
• Software to receive and translate the instructions you enter into your browser (on the “server”)
• The database itself
Image source: David Lane and Hugh E. Williams. Web Database Applications with PHP & MySQL. O’Reilly (2002).
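The client, server and database roles can be sketched as follows. This is a hedged illustration only: the sequence table, its columns, the accession values and the function names translate_request and handle_request are all hypothetical, and a real server would sit behind a web framework rather than a direct function call.

```python
import sqlite3

# Hypothetical columns a user is allowed to search on.
ALLOWED_FIELDS = ("organism", "gene")


def translate_request(form):
    """Server side: turn browser form fields into a parameterized SQL statement."""
    clauses, params = [], []
    for field in ALLOWED_FIELDS:
        value = form.get(field)
        if value:
            clauses.append(f"{field} = ?")
            params.append(value)
    sql = "SELECT accession FROM sequence"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY accession", params


def handle_request(conn, form):
    """Interrogate the database and return the answer sent back to the browser."""
    sql, params = translate_request(form)
    return [row[0] for row in conn.execute(sql, params)]


# The database itself, filled with placeholder records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequence (accession TEXT, organism TEXT, gene TEXT)")
conn.executemany(
    "INSERT INTO sequence VALUES (?, ?, ?)",
    [
        ("ACC0001", "Oryza sativa", "geneA"),
        ("ACC0002", "Oryza sativa", "geneB"),
        ("ACC0003", "Zea mays", "geneA"),
    ],
)

# What the client would submit through a search form.
form = {"organism": "Oryza sativa"}
sql, params = translate_request(form)
hits = handle_request(conn, form)
print(sql)
print(hits)
```

User input is passed as parameters rather than pasted into the SQL text, which is the usual way to keep a web-facing database safe from malformed queries.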
Why biological databases?
• Make biological data available to scientists
– Consolidation of data (gather data from different sources)
– Provide access to large datasets that cannot be published
explicitly (genome, proteome, …)
• Make biological data available in computer-readable format
– Make data accessible for automated analysis
Bioinformatics: “To extract, store and analyze
biological data”
Biological Databases
• Over 1000 biological databases
• Vary in size, quality, coverage, level of interest
• Many of the major ones covered in the annual
Database Issue of Nucleic Acids Research
• What makes a good database?
• comprehensiveness
• accuracy
• is up-to-date
• good interface
• batch search/download
• API (web services, DAS, etc.)
Types of Biological Databases
Flow of Databases in Bioinformatics
Biological experiments → Biological Databases → Computational Biology
Plant Genome Databases
Ten Important Bioinformatics Databases
• GenBank (www.ncbi.nlm.nih.gov): nucleotide sequences
• Ensembl (www.ensembl.org): human, mouse and plant genomes
• PubMed (www.ncbi.nlm.nih.gov): literature references
• NR (www.ncbi.nlm.nih.gov): protein sequences
• SWISS-PROT (www.expasy.ch): protein sequences
• InterPro (www.ebi.ac.uk): protein domains
• OMIM (www.ncbi.nlm.nih.gov): genetic diseases
• Enzymes (www.chem.qmul.ac.uk): enzymes
• PDB (www.rcsb.org/pdb/): protein structures
• KEGG (www.genome.ad.jp): metabolic pathways
• In 1965, Dayhoff gathered all the available sequence data to create the first
bioinformatics database (Atlas of Protein Sequence and Structure).
NCBI (National Center for Biotechnology
Information)
• Over 30 databases, including GenBank, PubMed, OMIM, and GEO
• Access all NCBI resources via Entrez (www.ncbi.nlm.nih.gov/Entrez/)
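Entrez can also be reached programmatically. The sketch below only builds request URLs and makes no network call. It assumes NCBI's E-utilities endpoints (esearch.fcgi and efetch.fcgi), which are not named on the slide. The helper names esearch_url and efetch_url are hypothetical, and the record identifiers are placeholders.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build a URL that searches one Entrez database for a text term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db, ids, rettype="fasta", retmode="text"):
    """Build a URL that downloads the records with the given identifiers."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


search = esearch_url("pubmed", "bioinformatics databases", retmax=5)
# "ID1" and "ID2" are placeholders, not real accession numbers.
fetch = efetch_url("nuccore", ["ID1", "ID2"])
print(search)
print(fetch)
```

The resulting URLs could be opened with any HTTP client. NCBI publishes usage guidelines for automated access, which should be checked before sending requests in bulk.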
Protein Data Bank (PDB)
BLAST For Sequence Alignment
• Basic Local Alignment Search Tool
– Altschul et al. 1990, 1994, 1997
• A widely used heuristic method for local alignment
• Designed specifically for database searches
• Benefits: speed, user-friendliness, statistical rigor, sensitivity
• Types of BLAST: BLASTN, BLASTP, BLASTX, TBLASTN, TBLASTX
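To make "local alignment" concrete, here is a minimal sketch of the Smith-Waterman dynamic programming score. This is not BLAST itself: BLAST is a faster heuristic that approximates this kind of exact search. The scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b (Smith-Waterman)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region "ACGT" is found even though the flanking bases differ.
print(local_alignment_score("TTACGTTT", "GGACGTGG"))
```

Because the score resets to zero instead of going negative, the method finds the best-matching region inside two sequences rather than forcing an end-to-end alignment.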
Luscombe, Greenbaum, Gerstein (2001)
THANK YOU



Editor's Notes

  • #11  The purpose of databases is to curate the increasing amount of experimental data from various disciplines of biology. Bioinformatics can be viewed as a cycle of science in which high-throughput experimental procedures produce large quantities of data. This data is stored and processed in databases, from which it can be extracted for analysis with computational methods. The results from the computational methods in turn inspire additional experiments and hypotheses concerning biological phenomena.