Preface | p. xi |
Acknowledgments | p. xiii |
Biography | p. xv |
Introduction | p. 1 |
Relational Database Fundamentals | p. 5 |
Introduction | p. 5 |
Tables, Rows, and Columns | p. 5 |
External and Internal Representations of Data | p. 7 |
Advantages over Spreadsheets | p. 8 |
Size and Speed | p. 8 |
Multiple Users | p. 8 |
Relationships among Tables | p. 9 |
One-to-Many Relationships | p. 9 |
One-to-One Relationships | p. 11 |
Many-to-Many Relationships | p. 12 |
Entity Relationship Diagrams | p. 12 |
Uniqueness | p. 14 |
Sequences | p. 14 |
Keys | p. 15 |
Primary Keys | p. 15 |
Foreign Keys | p. 15 |
Constraints | p. 16 |
Indexes | p. 16 |
Joining Tables | p. 16 |
Normal Forms | p. 17 |
First Normal Form | p. 17 |
Second Normal Form | p. 18 |
Third Normal Form | p. 19 |
Summary of Normal Forms | p. 20 |
References | p. 20 |
Structured Query Language (SQL) | p. 21 |
Introduction | p. 21 |
Databases, Schemas, Tables, Rows, and Columns | p. 21 |
Create | p. 22 |
Insert | p. 23 |
Select | p. 24 |
Update and Delete | p. 25 |
SQL Functions | p. 26 |
Regular Functions | p. 26 |
Aggregate Functions | p. 27 |
Domains, Triggers, and Views | p. 28 |
Unions, Intersections, and Differences | p. 29 |
References | p. 30 |
Relational Database Management Systems | p. 31 |
Introduction | p. 31 |
Standard SQL | p. 32 |
A Sampling of Differences | p. 32 |
Server and Client | p. 33 |
Compatibility | p. 35 |
References | p. 35 |
Client and Web Applications | p. 37 |
Introduction | p. 37 |
Command Line Programs | p. 37 |
Web-Based Applications | p. 38 |
Client Applications | p. 39 |
SQL Interfaces in Various Languages | p. 41 |
Perl | p. 43 |
Python | p. 44 |
PHP | p. 44 |
Java | p. 45 |
References | p. 46 |
Data Storage, Searching, and Manipulation | p. 47 |
Introduction | p. 47 |
General Schema Design Decisions | p. 47 |
Sample Schema for Tracking Chemical Samples | p. 49 |
Schemas for PubChem Data | p. 53 |
BioAssay Data | p. 54 |
Substances | p. 56 |
Compounds | p. 58 |
Data Constraints and Data Integrity | p. 60 |
Developing Complex SQL | p. 63 |
Subselect Statements | p. 66 |
Views | p. 67 |
References | p. 70 |
Computer Representations of Molecular Structures | p. 71 |
Introduction | p. 71 |
SMILES Representation of Molecular Structure | p. 72 |
Extensions to SQL for Chemical Structures | p. 72 |
SMARTS Representation of Molecular Searches | p. 74 |
SMILES and SMARTS Quirks | p. 76 |
Hydrogen Atoms | p. 76 |
Aromaticity | p. 77 |
Tautomers | p. 77 |
Valence | p. 80 |
Chirality | p. 80 |
Isotopes | p. 81 |
Salts and Mixtures | p. 81 |
InChI and Canonical SMILES | p. 82 |
SMILES and Inorganic Structures | p. 82 |
Other SMILES Extensions | p. 82 |
Input and Output of Molecular Structures | p. 83 |
Useful SQL Extensions | p. 85 |
SMILES as an SQL Data Type | p. 86 |
Domains | p. 86 |
Triggers | p. 87 |
Summary | p. 88 |
References | p. 88 |
Molecular Fragments and Fingerprints | p. 91 |
Introduction | p. 91 |
Fragments | p. 91 |
Fragment Keys | p. 92 |
MACCS Keys and Other Fragment Keys | p. 95 |
Fingerprints | p. 95 |
Similarity Measures | p. 96 |
Computing Fragment-Based Properties | p. 96 |
References | p. 98 |
Reactions and Transformations | p. 99 |
Introduction | p. 99 |
Reaction SMILES | p. 99 |
Transformations | p. 100 |
Unimolecular Transformations | p. 101 |
Multi-Component Transformations | p. 104 |
Canonical Reaction SMILES | p. 106 |
References | p. 107 |
PostgreSQL Extensions | p. 109 |
Introduction | p. 109 |
Composite Data Types | p. 109 |
Composite Data Type for Experimental Values | p. 111 |
Array Data Types for Two- and Three-Dimensional Coordinates | p. 115 |
Functions in Other Languages | p. 117 |
Plpgsql | p. 117 |
Plperl, Plpython, Pltcl | p. 118 |
Core Chemical Functions | p. 119 |
C Language Functions | p. 120 |
Object RDBMS | p. 121 |
References | p. 121 |
Three-Dimensional Molecular Structure Tables | p. 123 |
Introduction | p. 123 |
Using Tables Instead of Files | p. 123 |
Molfile and Other Common File Formats | p. 124 |
Processing SDF Files | p. 125 |
Using Tables Instead of Files in Client Programs | p. 131 |
File Import, Export, and Conversions | p. 132 |
Functions Using Three-Dimensional Atomic Coordinates | p. 133 |
Conformations | p. 135 |
Other Representations of Three-Dimensional Molecular Structure | p. 136 |
References | p. 136 |
More on Client and Web Interfaces to RDBMS | p. 137 |
Introduction | p. 137 |
Store All Possible Data in the RDBMS | p. 139 |
Advanced SQL Techniques | p. 140 |
Placeholders in SQL Statements | p. 141 |
Bind Values in SQL Statements | p. 142 |
Web Applications | p. 143 |
R Programs | p. 147 |
Hierarchical Clustering | p. 147 |
Linear Models | p. 148 |
References | p. 153 |
Applications | p. 155 |
Introduction | p. 155 |
Compound Registration | p. 155 |
Experimental Chemical and Biological Data Integration | p. 162 |
Data from External Sources | p. 164 |
Utilities | p. 167 |
molgrep | p. 168 |
molcat | p. 168 |
molview | p. 169 |
molarb | p. 170 |
molrandom | p. 170 |
molnear | p. 171 |
molsame | p. 171 |
References | p. 172 |
Appendix | p. 173 |
Introduction | p. 173 |
Symbols and Bonds from Simplified Molecular Input Line Entry System (SMILES) | p. 173 |
Normalizing Data | p. 175 |
SQL Functions | p. 176 |
Public166keys | p. 176 |
Orsum | p. 176 |
Tanimoto | p. 176 |
Euclid | p. 177 |
Hamming | p. 177 |
Nbits_set | p. 177 |
Amw | p. 177 |
Tpsa | p. 181 |
Tables Used in Functions | p. 182 |
Amw | p. 183 |
Tpsa | p. 183 |
Public166keys | p. 183 |
Core Function Implementation for PostgreSQL | p. 188 |
PerlMol/plperlu | p. 188 |
FROWNS/plpythonu | p. 191 |
OpenBabel/python | p. 197 |
C Language PostgreSQL Functions | p. 203 |
Database Utilities Dbutils | p. 205 |
Loading Files into Simple Tables | p. 206 |
Smiloader | p. 207 |
Sdfloader | p. 208 |
References | p. 210 |
Index | p. 211 |