CHAPTER-IV DATABASE MANAGEMENT AND ITS ENVIRONMENT
September 7, 2017 | Author: Alisha Cross | Category: N/A
4.1 EVOLUTION OF DATABASE MANAGEMENT SYSTEMS

The past two decades have witnessed enormous growth in the number and importance of database applications. A database engine, the software system that manages how data are stored and retrieved, provides the basic functions of a database. One such database engine is Microsoft Jet (McManush, 1999), which is frequently used with Microsoft Visual Basic, a design and development tool, and with Microsoft Access, a database management tool. Databases are used to store, manipulate and retrieve data in nearly every type of organization, including business, health care, education, tours and travels, government, resource management, libraries and archives. The database approach emphasizes the integration and sharing of data throughout an organization that has various departments or units (Fig. 4-1). It offers a number of potential advantages over the older file processing systems. It allows a single user to implement a series of stand-alone databases on a personal computer, or the data may be shared among a number of users on a local area network of computers.

In the beginning of computer-based data processing there were no databases. Computers in those days were considerably less powerful than today’s personal computers and were employed only for scientific and engineering calculations. Gradually computers came into use in the business world, and computer file processing systems were developed for this purpose. The traditional file processing systems had a number of shortcomings and limitations, among them programme-data dependence, duplication of data, limited data sharing, lengthy development time and excessive programme maintenance. As a result, these systems have been replaced by database processing systems in most critical business applications today (McFadden et al 2000).
Though file processing systems were still dominant during the 1960s, the first database management systems (DBMS) were introduced in that decade, primarily for large and complex ventures such as the Apollo moon-landing project. During the 1970s hierarchical and network DBMS were developed to handle complex data structures. These are regarded as the first-generation DBMS. To overcome some of their limitations, such as difficult access and limited data independence, E. F. Codd and others developed the relational data model during the 1970s. With the relational model, all data are represented in the form of tables, which gives non-programmers easy access to the database. This model, considered the second generation of DBMS, received widespread acceptance and diffusion in the 1980s.

To cope with increasingly complex data, object-oriented databases were introduced during the late 1980s. The 1990s ushered in a new era of computing, with Internet applications becoming increasingly important and multimedia data (including graphics, sound, images and video) becoming increasingly common. The object-oriented databases are considered to be the third generation of DBMS.

Nowadays a relatively simple fourth-generation language called SQL (Structured Query Language) is used for data retrieval, and increasingly complex data types can be managed. The continued development of ‘universal servers’ is based on object-relational DBMS, in which a wide range of data types can be managed. This decade has witnessed fully distributed databases that can be accessed from multiple locations.
Fig. 4-1: Database Applications in Computer Network of an Organization
4.2 DATABASE APPLICATIONS

A database application is a computer-based application programme, or a set of related programmes, that is used to perform a series of activities on behalf of database users. Each database application performs some combination of the following basic operations:
1. Create - to add new data to the existing database.
2. Read - to acquire current information or to prepare a report.
3. Update - to modify current data in the database.
4. Delete - to remove current data from the existing database.

Database applications range from personal computer databases, through workgroup databases with fewer than 25 users, to department databases, where the 25 to 100 users are responsible for a more diverse range of functions. Database applications may be extended to an enterprise database for the whole organization, where users may be from many departments. Such databases are intended to support organization-wide operations and decision-making. Further, a data warehouse may be developed: an integrated decision support database for the organization, connecting PCs, workgroups and various departments as in Fig. 4-1. A summary of the types of database applications described in this section is shown in Table 4-1.

Table 4-1: Summary of Database Applications

Type of Database          Typical Number of Users   Typical Architecture                             Typical Size of Database
Personal Computer (PC)    1                         Desktop PC                                       Megabytes
Work Group                5 - 25                    Client/Server (two-tier)                         Megabytes - Gigabytes
Department                25 - 100                  Client/Server (three-tier)                       Gigabytes
Enterprise                >100                      Client/Server (distributed or parallel server)   Gigabytes - Terabytes

Source: McFadden et al 2000.
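The four basic operations listed in this section (create, read, update, delete) can be sketched in a few lines. The example below is only an illustration: it assumes Python's built-in sqlite3 module as the database engine, and the table name, column names and values are invented for the purpose.

```python
import sqlite3

# In-memory database; table, columns and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE district (distcode INTEGER PRIMARY KEY, dist_name TEXT, totalpop INTEGER)"
)

# 1. Create: add new data to the database.
conn.execute("INSERT INTO district VALUES (?, ?, ?)", (1, "District A", 1000))
conn.execute("INSERT INTO district VALUES (?, ?, ?)", (2, "District B", 2000))

# 2. Read: acquire current information.
query = "SELECT distcode, dist_name, totalpop FROM district ORDER BY distcode"
rows_after_create = conn.execute(query).fetchall()

# 3. Update: modify current data.
conn.execute("UPDATE district SET totalpop = ? WHERE distcode = ?", (1500, 1))

# 4. Delete: remove current data.
conn.execute("DELETE FROM district WHERE distcode = ?", (2,))

rows_final = conn.execute(query).fetchall()
print(rows_after_create)
print(rows_final)
```

Whatever the scale of the application, from a personal database to an enterprise one, its work reduces to combinations of these four statements.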
4.3 DATA AND METADATA IN A DATABASE

A database is an organized collection of information that is logically related to a particular subject or purpose. It may be of any size and any complexity. Data consist of facts, text, graphics, images, sound and video segments that have meaning to the users. Data become useful only when they are placed in some context, and the primary mechanism for providing that context is metadata. Metadata are data that describe the properties or characteristics of other data. These properties include data definitions, data structures, and rules or constraints. Metadata allow database designers and users to understand what data exist, what the data mean and what the fine distinctions are between seemingly similar data items. An example of the metadata of a small district-level population database is provided in Table 4-2. The management of metadata is at least as crucial as the management of the associated data, since data without clear meaning can be confusing, misinterpreted or erroneous.

Table 4-2: Metadata of district level population database

Field Name    Data Type          Length   Description
DISTCODE      Integer            10       District code
DIST_NAME     String or Text     30       Name of the district
TOTALPOP      Integer            7        Total population
MALEPOP       Integer            5        Male population
FEMALE_POP    Integer            5        Female population
SCPOPLN       Integer            6        Scheduled Caste population
STPOPLN       Integer            6        Scheduled Tribe population
LITERACY      Float / Decimal    6        Overall literacy rate in percent
REMARKS       Text               250      Important remarks about the district
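To make the role of metadata concrete, the sketch below stores a few rows of Table 4-2 as a data structure and uses them to check incoming records. It is a minimal illustration only: the FieldMeta class, the validate function and the sample records are hypothetical, only four of the nine fields are included, and "Length" is assumed to mean the maximum number of characters when the value is written out as text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMeta:
    """One row of the metadata table: data that describe other data."""
    name: str
    data_type: type
    length: int          # assumed: maximum characters when written as text
    description: str


# A subset of Table 4-2, expressed as metadata.
METADATA = [
    FieldMeta("DISTCODE", int, 10, "District code"),
    FieldMeta("DIST_NAME", str, 30, "Name of the district"),
    FieldMeta("TOTALPOP", int, 7, "Total population"),
    FieldMeta("LITERACY", float, 6, "Overall literacy rate in percent"),
]


def validate(record: dict) -> list:
    """Return the problems found when checking a record against METADATA."""
    problems = []
    for field in METADATA:
        if field.name not in record:
            problems.append(f"{field.name}: missing")
            continue
        value = record[field.name]
        if not isinstance(value, field.data_type):
            problems.append(f"{field.name}: expected {field.data_type.__name__}")
        elif len(str(value)) > field.length:
            problems.append(f"{field.name}: longer than {field.length} characters")
    return problems


# Invented sample records.
good = {"DISTCODE": 101, "DIST_NAME": "Example District",
        "TOTALPOP": 1234567, "LITERACY": 63.25}
bad = {"DISTCODE": 101, "DIST_NAME": "Example District",
       "TOTALPOP": 12345678, "LITERACY": "high"}

print(validate(good))
print(validate(bad))
```

The point of the design is that the rules live in the metadata, not in the checking code: adding a field or changing a length alters METADATA only.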
Similarly, the metadata of a GIS database give information on the type of features, earth model, datum, projection system and units of measurement. In the case of satellite data, the metadata indicate the size of the data, the number of rows and columns, the date and time of the satellite pass, path-row information, the longitudes and latitudes of the corners and the centre of the image, the projection system (in the case of a geocoded image), the pixel resolution, the resampling techniques employed, and so on.

The form in which data are stored does not necessarily resemble the form in which they are presented to application programmes (software). The data structure that the application programme employs is referred to as the logical structure, and it is defined by the data types in use. The data structure that is actually stored on tape, disk, or other magnetic or non-magnetic media is called the physical structure. The words logical and physical will be used to describe various aspects of data, logical always referring to the way the application developer sees the data and physical always referring to the way the data are recorded on the storage medium.
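The distinction between logical and physical structure can be illustrated with a short sketch. It assumes Python's standard struct module and an invented fixed-width record layout; real DBMS storage formats are considerably more elaborate.

```python
import struct

# Logical structure: the record as the application programme sees it.
logical_record = {"distcode": 7, "dist_name": "Example", "totalpop": 123456}

# Physical structure: an assumed fixed-width, little-endian layout of
# a 4-byte integer, a 30-byte text field and another 4-byte integer.
LAYOUT = struct.Struct("<i30si")


def to_physical(rec: dict) -> bytes:
    """Encode a logical record into the bytes written to the storage medium."""
    return LAYOUT.pack(rec["distcode"], rec["dist_name"].encode("ascii"), rec["totalpop"])


def to_logical(raw: bytes) -> dict:
    """Decode stored bytes back into the form the application works with."""
    code, name, pop = LAYOUT.unpack(raw)
    return {
        "distcode": code,
        "dist_name": name.rstrip(b"\x00").decode("ascii"),  # drop padding bytes
        "totalpop": pop,
    }


stored = to_physical(logical_record)
print(len(stored))          # size of one physical record in bytes
print(to_logical(stored))   # the same record, seen logically
```

The application only ever handles the dictionary; the byte layout could change (for instance a wider name field) without the logical view changing.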
4.4 THE RELATIONAL DATABASE MODEL

Database modelling is particularly important in relational database design because the process of capturing data as elemental units, defining the relationships among the captured data, and normalizing the data set (reducing it to the smallest possible number of redundantly stored pieces of information) is not easily visualized.

In the models dating from the 1960s and 1970s, data are stored in file systems, with each file containing sets of related data called records, all of a single type. The basic item of data in a record is called an attribute, and the record is also referred to as a row or tuple. The database file was called a table. When information is needed, a tree of files in a hierarchy or a network of databases is searched. The hierarchical data model is attributed to the release of IBM’s Information Management System (IMS) in 1968. In this model, different kinds of records relate to one another in a hierarchy. (IBM’s later DB2 product, by contrast, is a relational DBMS.) The network database model generalizes the hierarchical model by allowing a set of records to be part of two or more hierarchies. In network or hierarchical systems related attributes are often stored in repeating fields, and in many cases these models store redundant data in order to support data access and reporting needs. Access in these systems is often fast but relies on indexes that are difficult to maintain.

In the 1970s Edgar F. Codd, a mathematician at IBM San Jose, proposed the relational model, founded on relational algebra, which gives a set of rules and behaviours for creating related tables and joining the data into virtual tables (Vtab). A relational database collects related facts into tables containing records. In a relational database a record is referred to as a row or tuple, and each data item of a record is an attribute (a column or field). The purpose of relational database theory is to create the least number of tables that can store the data without redundancy.

A Database Management System (DBMS) is a group of programmes that handles various kinds of databases. It is a software application that is used to create, maintain, and provide controlled access to user databases. A DBMS that relies on the relational model is called a Relational Database Management System (RDBMS). Many software packages, such as FoxPro, dBase, Access, Paradox, SQL Server, Ingres, Oracle and Sybase, to name a few, are based on RDBMS principles (Martin, 1992).
Fig. 4-2: Independent Tables of Spatial Database and Attribute Database

A standard GIS package normally provides full-featured DBMS and RDBMS functionality. For example, the spatial data created for the polygon features of the Gram Panchayats of the study area and the attribute database created from the Census of India, 2001 are shown in Fig. 4-2. The data table of the map features (Gp.shp) appears as “Attributes of Gp.shp”, and the table holding the other attributes related to this map data, named “gp2kl.dbf”, is seen in the background. A common field, “Gpcode”, is created in both sets of data. This field establishes the relation between the spatial data (Gp.shp) and the data table “gp2kl.dbf”. The entity in both data sets is the Gram Panchayat, and the column “Gpcode” in each table acts as the entity identifier. Therefore, a relational link can be established between the two data tables. This has been implemented in Fig. 4-3 using the RDBMS functionality of the GIS package.

It is important to note that the spatial data set was created using PC Arc Info’s Data Automation Kit and the digitizing system. After the necessary data modelling, the data set was converted to an Arc View shape file (Gp.shp). The geometric attributes of area and perimeter were then calculated based on the WGS84 ellipsoid and the UTM projection system. The data table gp2kl.dbf was designed in dBase 5.0 for Windows, with each attribute properly defined.
Fig. 4-3: Joined Spatial Database and Attribute Database
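The join of the two tables through the common field Gpcode can be sketched outside a GIS package as well. The example below is an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of the GIS package's RDBMS functions, the table names gp_shape and gp_census stand in for the attribute table of Gp.shp and for gp2kl.dbf, and all names and values in the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two independent tables that share the identifier column gpcode.
conn.executescript("""
CREATE TABLE gp_shape  (gpcode INTEGER PRIMARY KEY, area REAL, perimeter REAL);
CREATE TABLE gp_census (gpcode INTEGER PRIMARY KEY, gp_name TEXT, tot_p INTEGER);
""")
conn.executemany("INSERT INTO gp_shape VALUES (?, ?, ?)",
                 [(1, 10.5, 14.0), (2, 8.25, 12.5)])
conn.executemany("INSERT INTO gp_census VALUES (?, ?, ?)",
                 [(1, "GP A", 5000), (2, "GP B", 3000)])

# The join is defined once as a view, i.e. a virtual table: nothing is copied,
# and the rows are matched through gpcode whenever the view is queried.
conn.execute("""
CREATE VIEW gp_joined AS
SELECT s.gpcode AS gpcode, c.gp_name AS gp_name, s.area AS area, c.tot_p AS tot_p
FROM gp_shape AS s
JOIN gp_census AS c ON c.gpcode = s.gpcode
""")

joined = conn.execute("SELECT * FROM gp_joined ORDER BY gpcode").fetchall()
print(joined)
```

Because the link rests only on the shared identifier, either table can be edited or replaced independently as long as gpcode remains consistent.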
Hence, the metadata for Gp.shp will be: Feature class - polygon, Datum - WGS84, Ellipsoid - WGS84, Projection - UTM, Units - metres, Data type - digital vector, and so on. There may be further metadata elements for this feature. Similarly, the metadata for each of the attributes are specified in the Table Designer programme of dBase 5.0 for Windows, as in the example of a district-level population database given in Table 4-2.
4.5 DATABASE ENVIRONMENT

A typical database environment consists of nine major components. These components and their relationships are shown in Fig. 4-4, and a very brief description of each component is given below:
1. Computer-Aided Software Engineering (CASE) Tools: These are automated tools used to design databases and application programmes.
2. Repository: It is a storehouse for all data definitions, data relationships, screen and report formats and other system components. A repository contains an extended set of metadata important for managing databases and other components of an information system.
3. Database Management System (DBMS): It consists of commercial software, and occasionally hardware, used to define, create, maintain, and provide controlled access to the database and also to the repository.
4. Database: An organized collection of logically related data, usually designed to meet the information needs of multiple users in an organization. It is important to distinguish between the database and the repository: the repository contains definitions of data, whereas the database contains occurrences of data.
5. Application Programmes: These are computer programmes that are used to create and maintain the database and to provide information to the users.
6. User Interface: It is a set of tools comprising languages, menus and other facilities by which users interact with the various system components, such as CASE tools, application programmes, the DBMS and the repository.
7. Data Administrators: These are the persons who are responsible for the overall information resources of an organization. Data administrators use CASE tools to improve the productivity of database planning and design.
8. System Developers: These are the persons who are responsible for system analysis and for designing new application programmes. System developers often use CASE tools for system requirements analysis and programme design.
9. End Users: These are the persons throughout the organization who add, delete and modify data in the database and who request or receive information from it. All user interactions with the database must be routed through the DBMS.

Fig. 4-4: Components of the Database Environment

The DBMS operational environment shown in Fig. 4-4 is an integrated system of hardware, software, data and people that is designed to facilitate the storage, retrieval and control of the information resource and to improve the productivity of the organization. With advances in software, the user interface is becoming very user-friendly. These systems promote end-user computing, which means that users who are not computer experts can define their own reports, displays and simple applications.
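The distinction drawn in item 4 between definitions of data (repository) and occurrences of data (database) can be seen on a small scale in almost any DBMS. The sketch below is only an analogy: it assumes SQLite, accessed through Python's sqlite3 module, whose built-in catalogue table sqlite_master is far simpler than the repository described above; the table and its single row are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE district (distcode INTEGER PRIMARY KEY, dist_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO district VALUES (1, 'Example District')")

# Definitions of data: SQLite records every table definition in sqlite_master.
definitions = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# Column-level definitions: PRAGMA table_info yields one row per column,
# laid out as (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(district)")]

# Occurrences of data: the rows actually stored in the table.
occurrences = conn.execute("SELECT * FROM district").fetchall()

print(definitions)
print(columns)
print(occurrences)
```

Both kinds of information are reached through the DBMS, which matches the statement that it provides controlled access to the database and to the repository alike.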
4.6 ADVANTAGES OF DATABASE APPROACH

The database approach offers a number of potential advantages compared to traditional file processing systems. The primary advantages are summarized below:
1. Programme-Data Independence: The separation of data descriptions (metadata) from the application programmes that use the data is called data independence. With the database approach, the data descriptions are stored in a central location called the repository. This property of database systems allows the user’s data to change and evolve without changing the application programmes.
2. Minimal Data Redundancy: The design goal of the database approach is that previously separate and redundant data files are integrated into a single logical structure. Each primary fact is ideally recorded in only one place in the database (Fig. 4-2). For example, the fact that the total population (field ‘Tot_p’) for Gpcode = 1 is 5893 is recorded in only one place, in the table gp2kl.dbf. The database approach does not eliminate redundancy entirely, but it allows the designer to control carefully the type and amount of redundancy.
3. Improved Data Consistency: By eliminating (or controlling) redundancy, we greatly reduce the opportunities for inconsistency. For example, if an item of data is stored at only one location, every later update is made at that same location, so no outdated copy can survive elsewhere. We also avoid the wasted storage space that results from redundant data storage.
4. Improved Data Sharing: A database is designed as a shared resource. Authorized users are allowed to use the database, and each user or group of users is provided with one or more user views to facilitate this use. A user view is a logical description of some portion of the database that is required by the user to perform some task. For example, the user can prepare a report from the table by using only the names of the Gram Panchayats and the population attributes, or the user may update information by using a form. The report and the entry form may therefore be regarded as user views. As a result, the data sharing capability increases in a network environment.
5. Increased Productivity of Application Development: A major advantage of the database approach is that it greatly reduces the cost and time of developing new business applications.
6. Enforcement of Standards: When the database approach is implemented with full management support, the database administration function should be granted single-point authority and responsibility for establishing and enforcing data standards, such as naming conventions, data quality standards and uniform procedures for accessing, updating and protecting data. The data repository provides database administrators with a powerful set of tools for developing and enforcing these standards.
7. Improved Data Quality: The database approach provides a number of tools and processes to improve data quality. Two of the more important are: (a) Database designers can specify integrity constraints that are enforced by the DBMS. A constraint is a rule that cannot be violated by the users. (b) One of the objectives of a data warehouse environment is to clean up operational data before they are placed in the data warehouse.
8. Improved Data Accessibility and Responsiveness: With a relational database, end users without any programming experience can retrieve and display data using Structured Query Language (SQL).
9. Reduced Programme Maintenance: In a traditional file processing system the stored data are highly programme-dependent, so they cannot easily be altered or extended. In a database environment, data are more independent of the application programmes that use them. As a result, programme maintenance can be greatly reduced in modern database management.

However, many organizations attempting to realize these benefits are frustrated by the old data models and database management software in use in their organization. Fortunately, the relational model and the newer object-oriented model provide a significantly better environment for achieving these benefits.
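Two of the advantages listed above, integrity constraints enforced by the DBMS (item 7) and user views (item 4), can be sketched briefly. The example assumes Python's built-in sqlite3 module; the table gp, the view gp_report and the Gram Panchayat names are invented, and only the value 5893 for Gpcode 1 is taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The designer states the rule once; the DBMS enforces it for every user.
conn.execute("""
CREATE TABLE gp (
    gpcode  INTEGER PRIMARY KEY,
    gp_name TEXT    NOT NULL,
    tot_p   INTEGER NOT NULL CHECK (tot_p >= 0)
)
""")
conn.execute("INSERT INTO gp VALUES (1, 'GP A', 5893)")

# Integrity constraint: a negative population violates the CHECK rule,
# so the DBMS refuses the row and raises an error.
try:
    conn.execute("INSERT INTO gp VALUES (2, 'GP B', -10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# User view: a logical description of only the portion of the database
# that a report needs (names and population, without the identifier).
conn.execute("CREATE VIEW gp_report AS SELECT gp_name, tot_p FROM gp")
report = conn.execute("SELECT * FROM gp_report").fetchall()

print(rejected)
print(report)
```

The rejected row never reaches the table, so every view built on it, including the report, sees only data that satisfy the rule.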