CHAPTER-IV DATABASE MANAGEMENT AND ITS ENVIRONMENT

September 7, 2017 | Author: Alisha Cross | Category: N/A


4.1 EVOLUTION OF DATABASE MANAGEMENT SYSTEMS

The past two decades have witnessed enormous growth in the number and importance of database applications. A database engine, a software system that manages how data are stored and retrieved, provides the basic functions of a database. One such database engine is Microsoft Jet (McManush, 1999). This engine is frequently used with Microsoft Visual Basic, a design and development tool, and with Microsoft Access, a database management tool. Databases are used to store, manipulate and retrieve data in nearly every type of organization, including business, health care, education, tours and travels, government, resource management, libraries and archives.

The database approach emphasizes the integration and sharing of data throughout an organization that has various departments or units (Fig. 4-1). It offers a number of potential advantages over the older file processing systems. It allows a single user to implement a series of stand-alone databases on a personal computer, or the data may be shared among a number of users in a local area network of computers.

In the beginning of computer-based data processing there were no databases. Computers of that period were considerably less powerful than today’s personal computers and were employed only for scientific and engineering calculations. Gradually computers came into use in the business world, and computer file processing systems were developed for this purpose. The traditional file processing systems had a number of shortcomings and limitations, among them programme-data dependence, duplication of data, limited data sharing, lengthy development time and excessive programme maintenance. As a result, these systems have been replaced by database processing systems in most critical business applications today (McFadden et al 2000).
Though file processing systems were still dominant during the 1960s, the first database management systems (DBMS) were introduced in this decade, primarily for large and complex ventures such as the Apollo moon-landing project. During the 1970s the hierarchical and network DBMS were developed to handle complex data structures. These are regarded as the first generation of DBMS. To overcome some of their limitations, such as difficult access and limited data independence, E. F. Codd and others developed the relational data model during the 1970s. With the relational model, all data are represented in the form of tables, which gives non-programmers ease of access in database management. This model, considered the second generation of DBMS, received widespread acceptance and diffusion in the 1980s.

The decade of the 1990s ushered in a new era of computing, with Internet applications becoming increasingly important and multimedia data (including graphics, sound, images and video) becoming increasingly common. To cope with these increasingly complex data, object-oriented databases had been introduced during the late 1980s. These are considered to be the third generation of DBMS. Nowadays a relatively simple fourth-generation language called SQL (Structured Query Language) is used for data retrieval, and increasingly complex data types can be managed. The continued development of ‘universal servers’ is based on object-relational DBMS, in which a wide range of data types can be managed. This decade has witnessed fully distributed databases that can be accessed from multiple locations.

Fig. 4-1: Database Applications in Computer Network of an Organization


4.2 DATABASE APPLICATIONS

A database application is a computer-based application programme, or a set of related programmes, that is used to perform a series of activities on behalf of database users. Each database application performs some combination of the following basic operations:

1. Create - to add new data to the existing database.
2. Read - to acquire current information or to prepare a report.
3. Update - to modify data currently in the database.
4. Delete - to delete current data from the existing database.

Database applications range from personal computer databases, through workgroup databases with fewer than 25 users, to department databases, where 25 to 100 users are responsible for a more diverse range of functions. The applications may be extended to an enterprise database for the whole organization, where the users may come from many departments. Such databases are intended to support organization-wide operations and decision-making. Further, there is the possibility of developing a data warehouse, an integrated decision support database for the organization that connects PCs, workgroups and various departments as in Fig. 4-1. A summary of the types of database applications described in this section is shown in Table 4-1.

Table 4-1: Summary of Database Applications

Type of Database         Typical Number of Users   Typical Architecture                              Typical Size of Database
Personal Computer (PC)   1                         Desktop PC                                        Megabytes
Work Group               5 - 25                    Client/Server (two-tier)                          Megabytes - Gigabytes
Department               25 - 100                  Client/Server (three-tier)                        Gigabytes
Enterprise               >100                      Client/Server (distributed or parallel server)    Gigabytes - Terabytes

Source: McFadden et al 2000.
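The four basic operations listed above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a convenient stand-in for a DBMS; the table name `district`, its columns and all values are hypothetical and do not come from the study.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE district ("
    "distcode INTEGER PRIMARY KEY, dist_name TEXT, totalpop INTEGER)"
)

# 1. Create: add a new record (hypothetical values).
conn.execute("INSERT INTO district VALUES (?, ?, ?)", (1, "District A", 120000))

# 2. Read: acquire current information.
row = conn.execute(
    "SELECT dist_name, totalpop FROM district WHERE distcode = ?", (1,)
).fetchone()

# 3. Update: modify data currently in the database.
conn.execute("UPDATE district SET totalpop = ? WHERE distcode = ?", (125000, 1))
updated = conn.execute(
    "SELECT totalpop FROM district WHERE distcode = ?", (1,)
).fetchone()[0]

# 4. Delete: remove the record again.
conn.execute("DELETE FROM district WHERE distcode = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM district").fetchone()[0]
```

Whatever the scale of the application, from a personal database to an enterprise database, the same four operations apply; what changes is the architecture that serves them.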


4.3 DATA AND METADATA IN A DATABASE

A database is an organized collection of information that is logically related to a particular subject or purpose. It may be of any size and any complexity. Data consist of facts, text, graphics, images, sound and video segments that have meaning to the users. Data become useful only when they are placed in some context. The primary mechanism for providing context for data is metadata. Metadata are data that describe the properties or characteristics of other data. Some of these properties include data definitions, data structures, and rules or constraints. Metadata allow database designers and users to understand what data exist, what the data mean and how to distinguish between seemingly similar data items. An example of the metadata of a small district-level population database is provided in Table 4-2. The management of metadata is at least as crucial as managing the associated data, since data without clear meaning can be confusing, misinterpreted or erroneous.

Table 4-2: Metadata of district level population database

Field Name    Data Type          Length   Description
DISTCODE      Integer            10       District Code
DIST_NAME     String or Texts    30       Name of the district
TOTALPOP      Integer            7        Total Population
MALEPOP       Integer            5        Male Population
FEMALE_POP    Integer            5        Female Population
SCPOPLN       Integer            6        Scheduled Caste Population
STPOPLN       Integer            6        Scheduled Tribe Population
LITERACY      Float / Decimal    6        Overall literacy rate in percent
REMARKS       Texts              250      Important remarks about the district

Similarly, the metadata of a GIS database give information on the type of features, earth model, datum, projection system and units of measurement. In the case of satellite data, the metadata indicate the size of the data, the number of rows and columns, the date and time of the satellite pass, the path-row information, the longitudes and latitudes of the corners and the centre of the image, the projection system (in the case of a geocoded image), the pixel resolution, the resampling technique employed and so on.

The form in which data are stored does not necessarily resemble the form in which they are presented to application programmes (software). The data structure that the application programme employs is referred to as the logical structure. This structure is defined by the various data types in use. The data structure that is actually stored on tape, disk or other storage media is called the physical structure. The words logical and physical will be used to describe various aspects of data, logical always referring to the way the application developer sees the data and physical always referring to the way the data are recorded on the storage medium.
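As an illustration of how metadata such as Table 4-2 become part of a database, the sketch below declares the same fields as a table definition and then reads the definition back from the DBMS. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than dBase, the table name `district_population` is hypothetical, and SQLite treats the declared lengths as documentation rather than enforcing them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical structure: field names, types and lengths follow Table 4-2.
# The table name is a hypothetical choice for this sketch.
conn.execute("""
    CREATE TABLE district_population (
        DISTCODE   INTEGER,
        DIST_NAME  VARCHAR(30),
        TOTALPOP   INTEGER,
        MALEPOP    INTEGER,
        FEMALE_POP INTEGER,
        SCPOPLN    INTEGER,
        STPOPLN    INTEGER,
        LITERACY   DECIMAL(6),
        REMARKS    VARCHAR(250)
    )
""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
metadata = [
    (name, col_type)
    for _cid, name, col_type, *_rest in conn.execute(
        "PRAGMA table_info(district_population)"
    )
]

for name, col_type in metadata:
    print(f"{name:<12}{col_type}")
```

The application sees only this logical description; how SQLite lays the records out on disk (the physical structure) stays hidden behind it.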

4.4 THE RELATIONAL DATABASE MODEL

Database modelling is particularly important in relational database design because the process of capturing data as elemental units, defining the relationships among the captured data and normalizing the data set, that is, reducing it to the smallest number of redundantly stored pieces of information, is not easily visualized. In the models dating from the 1960s and 1970s, data are stored in file systems, with each file containing sets of related data called records, all of a single type. The basic item of data in a record is called an attribute, and the record is also referred to as a row or tuple. The database file was called a table. When information is needed, a tree of files in a hierarchy or a network of databases is searched.

The hierarchical data model is attributed to the release of IBM’s Information Management System (IMS) in 1968. In this model, different kinds of records relate to one another in a hierarchy. IBM’s later DB2 product, by contrast, is a relational DBMS rather than a descendant of this technology. The networked database model generalizes the hierarchical model by allowing a set of records to be part of two or more hierarchies. In networked or hierarchical systems, related attributes are often stored in repeating fields. In many cases these models store redundant data in order to support data access and reporting needs. Access in these systems is often fast but relies on indexes that are difficult to maintain.

In the 1970s Edgar F. Codd, a mathematician at IBM San Jose, proposed the relational model, whose relational algebra gives a set of rules and behaviours for creating related tables and joining the data into virtual tables (Vtab). A relational database collects related facts into tables containing records. In a relational database a record is referred to as a row or tuple, and each data item in it as an attribute. The purpose of relational database theory is to create the least number of tables that can store the data without redundancy.

A Database Management System (DBMS) is a group of programmes that handles various kinds of databases. It is a software application that is used to create, maintain, and provide controlled access to user databases. A DBMS that relies on the relational model is called a Relational Database Management System (RDBMS). Many software packages, such as FoxPro, dBase, Access, Paradox, SQL Server, Ingres, Oracle and Sybase, to name a few, are based on RDBMS principles (Martin, 1992).


Fig. 4-2: Independent Tables of Spatial Database and Attribute Database

A standard GIS package will contain full-featured DBMS and RDBMS functionality. For example, the spatial data created for the polygon features of the Gram Panchayats and the attribute database created from the Census of India, 2001 for the study area are shown in Fig. 4-2. The data table of the map features (Gp.shp) appears as “Attributes of Gp.shp”, and the other attributes related to this map data are held in “gp2kl.dbf”, which is seen in the background. A common field, “Gpcode”, is created in both sets of data. This field establishes the relation between the spatial data (Gp.shp) and the data table “gp2kl.dbf”. The entity in both data sets is the Gram Panchayat, and the column “Gpcode” in each table acts as the entity identifier. Therefore, a relational link can be established between these two data tables. This has been implemented in Fig. 4-3 using the RDBMS functionality of the GIS package.

It is important to note that the spatial data set was created using PC Arc Info’s Data Automation Kit and the digitizing system. After the necessary data modelling, the data set was converted to an Arc View shape file (Gp.shp). Accordingly, the geometric attributes of area and perimeter were calculated based on the WGS84 ellipsoid and the UTM projection system. The data table gp2kl.dbf was designed in dBase 5.0 for Windows, defining each attribute properly.


Fig. 4-3: Joined Spatial Database and Attribute Database

Hence, the metadata for Gp.shp will be: Feature class - polygon, Datum - WGS84, Ellipsoid - WGS84, Projection - UTM, Units - metres, Data type - digital vector, and so on. There may be more elements in the metadata for this feature. Similarly, the metadata for each of the attributes are specified in the Table Designer programme of dBase 5.0 for Windows, as in the example cited earlier in Table 4-2 for a district level population database.
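The relational link through the common field “Gpcode” can be sketched as a join between two tables. This is not the GIS package's own mechanism, only an equivalent expressed with Python's built-in sqlite3 module. The table names `gp_shape` and `gp_census`, the columns `Area` and `Gp_name`, the Gram Panchayat names and every value except the total population of 5893 for Gpcode 1 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the feature table "Attributes of Gp.shp" (hypothetical columns).
conn.execute("CREATE TABLE gp_shape (Gpcode INTEGER PRIMARY KEY, Area REAL)")
# Stand-in for the census attribute table gp2kl.dbf (hypothetical columns,
# apart from Gpcode and Tot_p, which the text names).
conn.execute(
    "CREATE TABLE gp_census (Gpcode INTEGER PRIMARY KEY, Gp_name TEXT, Tot_p INTEGER)"
)

conn.executemany(
    "INSERT INTO gp_shape VALUES (?, ?)",
    [(1, 1500.0), (2, 2100.5)],  # hypothetical areas
)
conn.executemany(
    "INSERT INTO gp_census VALUES (?, ?, ?)",
    [(1, "GP one", 5893), (2, "GP two", 4120)],  # names and second value hypothetical
)

# Gpcode acts as the entity identifier in both tables, so the two tables
# can be joined into one virtual table.
joined = conn.execute("""
    SELECT s.Gpcode, c.Gp_name, s.Area, c.Tot_p
    FROM gp_shape AS s
    JOIN gp_census AS c ON c.Gpcode = s.Gpcode
    ORDER BY s.Gpcode
""").fetchall()
```

Each row of `joined` carries the geometry-derived attribute and the census attribute of the same Gram Panchayat, which is the situation Fig. 4-3 depicts.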

4.5 DATABASE ENVIRONMENT

A typical database environment consists of nine major components. These components and their relationships are shown in Fig. 4-4, and a very brief description of each component is given below:

1. Computer-Aided Software Engineering (CASE) Tools: These are automated tools used to design databases and application programmes.
2. Repository: It is a storehouse for all data definitions, data relationships, screen and report formats and other system components. A repository contains an extended set of metadata important for managing databases and other components of an information system.
3. Database Management System (DBMS): It consists of commercial software, and occasionally hardware, used to define, create, maintain, and provide controlled access to the database and also to the repository.
4. Database: An organized collection of logically related data, usually designed to meet the information needs of multiple users in an organization. It is important to distinguish between the database and the repository. The repository contains definitions of data, whereas the database contains occurrences of data.
5. Application Programmes: These are computer programmes that are used to create and maintain the database and provide information to the users.
6. User Interface: It is a set of tools comprising languages, menus and other facilities by which users interact with the various system components, such as CASE tools, application programmes, the DBMS and the repository.
7. Data Administrators: These are the persons who are responsible for the overall information resources of an organization. Data administrators use CASE tools to improve the productivity of database planning and design.
8. System Developers: These are the persons who are responsible for system analysis and for designing new application programmes. System developers often use CASE tools for system requirements analysis and programme design.
9. End Users: These are the persons throughout the organization who add, delete and modify data in the database and who request or receive information from it. All user interactions with the database must be routed through the DBMS.

Fig. 4-4: Components of the Database Environment

The DBMS operational environment shown in Fig. 4-4 is an integrated system of hardware, software, data and people that is designed to facilitate the storage, retrieval and control of the information resource and to improve the productivity of the organization. With advances in software, the user interface is becoming very user-friendly. These systems promote end-user computing, which means that users who are not computer experts can define their own reports, displays and simple applications.
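The distinction between definitions of data and occurrences of data can be seen even in a very small system. The sketch below is only a loose analogy, assuming SQLite through Python's built-in sqlite3 module: SQLite's built-in catalogue table `sqlite_master` holds table definitions and is far simpler than the repository described above, while the user table holds the data themselves. The table name `gp_census` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gp_census (Gpcode INTEGER PRIMARY KEY, Tot_p INTEGER)")
conn.execute("INSERT INTO gp_census VALUES (1, 5893)")

# Definitions of data: read from SQLite's own catalogue table.
definitions = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Occurrences of data: read from the user table itself.
occurrences = conn.execute("SELECT Gpcode, Tot_p FROM gp_census").fetchall()

for name, sql in definitions:
    print("definition:", name, "->", sql)
for record in occurrences:
    print("occurrence:", record)
```

Both queries go through the same engine, which mirrors the point that the DBMS provides controlled access to the database and to the repository alike.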


4.6 ADVANTAGES OF DATABASE APPROACH

The database approach offers a number of potential advantages compared to traditional file processing systems. The primary advantages are summarized below:

1. Programme-Data Independence: The separation of data descriptions (metadata) from the application programmes that use the data is called data independence. With the database approach, the data descriptions are stored in a central location called the repository. This property of database systems allows the user’s data to change and evolve without changing the application programmes.
2. Minimal Data Redundancy: The design goal of the database approach is that previously separate and redundant data files are integrated into a single logical structure. Each primary fact is ideally recorded in only one place in the database (Fig. 4-2). For example, the fact that the total population (represented by the field ‘Tot_p’) for Gpcode = 1 is 5893 is recorded in only one place, in the table gp2kl.dbf. The database approach does not eliminate redundancy entirely, but it allows the designer to carefully control the type and amount of redundancy.
3. Improved Data Consistency: By eliminating (or controlling) redundancy, we greatly reduce the opportunities for inconsistency. For example, if an item of data is stored in only one location, every later update is made at that same location, so conflicting copies cannot arise. We also avoid the wasted storage space that results from redundant data storage.
4. Improved Data Sharing: A database is designed as a shared resource. Authorized users are allowed to use the database, and each user or group of users is provided with one or more user views to facilitate this use. A user view is a logical description of some portion of the database that is required by the user to perform some task. For example, the user can prepare a report from the table by using only the names of the Gram Panchayats and the population attributes, or the user may update information by using a form. Therefore, the report and the entry form may be regarded as user views. As a result, the data sharing capability increases in a network environment.
5. Increased Productivity of Application Development: A major advantage of the database approach is that it greatly reduces the cost and time of developing new business applications.
6. Enforcement of Standards: When the database approach is implemented with full management support, the database administration function should be granted single-point authority and responsibility for establishing and enforcing data standards such as naming conventions, data quality standards and uniform procedures for accessing, updating and protecting data. The data repository provides database administrators with a powerful set of tools for developing and enforcing these standards.
7. Improved Data Quality: The database approach provides a number of tools and processes to improve data quality. Two of the more important are: (a) Database designers can specify integrity constraints that are enforced by the DBMS. A constraint is a kind of rule that cannot be violated by the users. (b) One of the objectives of a data warehouse environment is to clean up operational data before they are placed in the data warehouse.
8. Improved Data Accessibility and Responsiveness: With a relational database, end users without any programming experience can retrieve and display results using Structured Query Language (SQL).
9. Reduced Programme Maintenance: In a traditional file processing system, stored data cannot easily be altered or extended because the data are highly programme dependent. In a database environment, data are more independent of the application programmes that use them. As a result, programme maintenance can be greatly reduced in modern database management.

However, many organizations attempting to realize these benefits are frustrated by the old data models and database management software in use in their organization. Fortunately, the relational model and the newer object-oriented model provide a significantly better environment for achieving these benefits.
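Two of the advantages above, integrity constraints enforced by the DBMS (item 7) and user views (item 4), can be sketched briefly. The example assumes SQLite through Python's built-in sqlite3 module. The table name `gp_census`, the view name `gp_population_report`, the column `Gp_name`, the Gram Panchayat names and the rule that a population cannot be negative are all hypothetical illustrations, not part of the study's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integrity constraints are declared once, in the table definition,
# and enforced by the DBMS for every user and programme.
conn.execute("""
    CREATE TABLE gp_census (
        Gpcode  INTEGER PRIMARY KEY,
        Gp_name TEXT NOT NULL,
        Tot_p   INTEGER NOT NULL CHECK (Tot_p >= 0)
    )
""")
conn.execute("INSERT INTO gp_census VALUES (1, 'GP one', 5893)")

# An insert that violates the CHECK constraint is rejected by the DBMS.
try:
    conn.execute("INSERT INTO gp_census VALUES (2, 'GP two', -10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A user view exposes only the portion of the database needed for a task,
# here a report of names and total population.
conn.execute(
    "CREATE VIEW gp_population_report AS SELECT Gp_name, Tot_p FROM gp_census"
)
report = conn.execute("SELECT * FROM gp_population_report").fetchall()
```

Because the rule lives in the table definition rather than in each application programme, a change to the rule requires no change to the programmes, which is also the essence of reduced programme maintenance.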
