Sybase Database Administration: Part 3
 
William Genosa 
In the Nov./Dec. issue, part two of this series discussed
database 
tables and supplied two programs that generate the sql
required to 
re-create all the user tables for a given Sybase database.
In this 
issue I focus on indexes and provide two more tools
which together 
generate the sql required to re-create all the indexes
that exist 
within a user database. 
Database Indexing 
Relational databases store data within objects known
as tables. Indexes 
are also database objects and all database objects are
placed on segments. 
Segments are logical partitions defined on physical
disk devices (raw 
partitions). 
A book contains an index to speed location of information
within the 
book. A database administrator may build an index on
a table to improve 
the performance of queries by decreasing the amount
of disk I/O required 
to locate rows within a table. 
There are various ways of building an index. Sybase
uses the B-tree 
method, which consists of multiple index levels forming
a tree-shaped 
structure, with the root being the beginning of the
index. Index and 
data information are stored in pages of 2,048 bytes.
When an index 
page becomes full, the next entry will split that index
page into 
two pages, each half full. Page splitting can slow performance
significantly 
since many other index pages may require changes as
a result of the 
split page. To reduce page splitting, Sybase allows
the database administrator 
to specify a fillfactor when an index is created; the
fillfactor determines 
how much free space to leave in each page for growth. 
Clustered and Non-Clustered Indexes 
There are two types of Sybase B-tree indexes, the clustered
index 
and the non-clustered index. Figure 1 illustrates a
table with no 
index: data has been added to the table with no specific
order. Figure 2 
illustrates the same table after the creation of a
clustered index. 
Figure 3 illustrates the table when only a non-clustered
index were 
created on it. 
A clustered index physically alters the table it is
created on, so 
there can be only one clustered index per table. A non-clustered
index 
does not alter the table it is created on, and so there
can be more 
than one non-clustered index per table. A clustered
index has pointers 
to data pages, where the data is stored in the physical
order of the 
indexed attributes. Once the data page is located, it
must be scanned 
to find a particular row. A non-clustered index contains
pointers 
to each individual row of a table. A clustered index
must reside on 
the same segment as the table it is created on. A non-clustered
index 
can reside on a different segment than the table it
was created on. 
Planning an Index 
Consider the matter of your application before creating
indexes. If 
you will be performing queries on a table that will
retrieve a range 
of data, you may want to create a clustered index either
on the attributes 
you will be retrieving, or on the search augments of
the where 
clause of your query. This improves performance, because
after the 
first row is found, remaining rows are sequentially
stored on the 
data page. Similarly, if your application does a lot
of inserts and 
deletes, a clustered index can improve performance because
the inserts 
will be distributed across many data pages. Without
a clustered index, 
all inserts would be placed on the last data page (at
the end of the 
heap). Because Sybase uses page-level locking, the last
page would 
become a "hot spot" and cause performance
degradation. 
Because a clustered index must be located on the same
segment as the 
table it is created on, you can move a table to a specific
segment 
by specifying the segment in the index creation statement.
If the 
index is not needed, it can be dropped after the table
has been moved 
to the new segment. 
Index creation also helps maintain uniqueness. Relational
theory does 
not allow duplicate rows (tuples) within a table (relation).
Each 
table should have an attribute or collection of attributes
(composite 
key) which uniquely identifies every row. That unique
attribute is 
known as the primary key. To ensure that the primary
key is unique, 
a unique index can be created on the field or collection
of fields 
which comprise the primary key. Even where a primary
key may require 
a unique index, the decision as to whether that index
should be clustered 
or non-clustered should be based on the nature of the
transactions 
to be made against the table. 
Normalization and Logical Design 
Primary keys are cornerstones of normalization. Normalization
is the 
process of logical database design which eliminates
redundant data 
and ensures the data accurately represents its real-world
entities. 
There are five rules of normalization, but most database
designers 
only follow the first three. I present those three here
along with 
two other laws required for a sound logical design. 
First Normal Form 
First Normal Form (1NF) requires that attributes be
atomic. Since 
an attribute is a subset of a domain, and since Sybase
domains are 
atomic (cannot be de-composed), all Sybase tables are
in first normal 
form by definition. 
Second Normal Form 
Second Normal Form (2NF) requires that all attributes
for a given 
relation (table) contain only information specifically
about the primary 
key of that table (functional dependence). 
Third Normal Form 
Third Normal Form (3NF) requires that all attributes
which are not 
part of the primary key be mutually independent. This
means that non-key 
attributes cannot have any functional dependence on
each other. 
Entity Integrity 
Entity Integrity states that a primary key cannot contain
NULL values. 
A Null represents the absence of a value. 
Referential Integrity 
Referential Integrity states that a foreign key must
be an existing 
value of a primary key of another relation. Any attribute
of a relation 
which is a primary key in some other relation would
be considered 
a foreign key. 
The Index Creation Tools 
The create.index program (Listing 1) will generate the
sql 
required to re-create all the indexes for a table named
in the first 
augment. The program is broken into five sections. 
The first section of create.index sets up environment
variables. 
Because this program is part of a suite of tools, it
allows variables 
to be assigned via a configuration file executed with
the dot 
command. If several programs in the suite are to be
executed, only 
one configuration file would require editing. (Other
tools from this 
suite include create.diskinit, in Sys Admin Sept./Oct.
1994, and create.table, in Sys Admin Nov./Dec. 1994.) 
Section two simply checks for the existence of an augment.
If no table 
name is specified, a message is displayed on the terminal
describing 
the usage and the program terminates. 
Section three retrieves each index name and type for
the table named 
in augment one. Table names are stored in the system
table, called 
SYSOBJECTS, of every database. The index name and type
would be stored 
in SYSINDEXES. The index type is determined by the attribute
indid 
of SYSINDEXES. The query in section three joins the
two tables SYSOBJECTS 
and SYSINDEXES on the common attribute id, which is
the primary 
key of SYSOBJECTS and a foreign key of SYSINDEXES. Output
of the query 
is piped into a while loop which will remove any rows
returned 
with an indid of zero. This is because a table with
no clustered 
index will have an indid of zero. A table with a clustered
index will have an indid of one. A non-clustered index
will 
have an indid of two or greater. Because this process
is only 
concerned with indexes, tables (indid of 0) are removed.
Note 
that SYSINDEXES also has an attribute called segment.
Any 
table with only a clustered index will have only one
row in SYSINDEXES. 
The index and the table both occupy the same data pages
and must therefore 
reside on the same segment. A non-clustered index would
have its own 
entry in SYSINDEXES because it does not share data pages
with the 
table. Non-clustered indexes can reside on a segment
other than the 
segment where the table resides. 
Section four of create.index is where most of the work
is 
accomplished. For each index named in section three,
a while 
loop queries the database for information on various
index creation 
options, as well as the segment the index is located
on. Unions provide 
the output of each option in a suitable format for processing
by awk. 
Awk generates the sql to re-create each index with all
the proper 
options and segment placement. SCCS key-word identifiers
are also 
inserted in every header for software and version control
purposes. 
In Sybase an index can contain up to sixteen attributes.
The sql query 
in section four also has a while loop which executes
the index_col 
function. This function returns each attribute of the
indexes queried. 
Section five of create.index removes the temporary work
file 
created in section three. 
create.index creates the indexes only for a single table.
A second program, create.index.all, calls create.index
for every user table in a given database. This allows
the database 
administrator to generate the indexes for all tables
in a user database 
without losing the ability to generate sql for index
creation on a 
single table. create.index.all simply executes a query
to 
retrieve all user table names, then runs create.index
on each 
table name returned. These tools, along with those presented
in the 
earlier articles in this series, give Sybase database
administrators 
a means of automating some major management tasks.
 
 
 About the Author
 
Bill Genosa is a UNIX Systems/Database Administrator
with Harman 
Consumer Group, located in Woodbury, New York. He works
with RS6000 
computers on systems that serve some 150 users across
the US and in 
Denmark and Singapore. You can reach Bill at 186 Bryant
Avenue, Floral 
Park, NY 11001, or via email as wgenosa@attmail.com. 
 
   
  |