Database Management System Chapter 1 – An Overview
องค์ประกอบของข้อมูล Bit : 0 or 1 Byte : 1 Byte consists of 8 Bits ( 1 Byte is one character) Ex. 01000001 represents a character “A” Word : A word consists of several bytes. Field : A field or an attribute is a boundary of data. Record : A record consists of a group of field values that represent one person or one thing. File : A file consists of several records that share the same structure. Data Base : A Database consists of several related files.
THE TRADITIONAL FILE ENVIRONMENT A data file is a collection of logically related records. In a file management environment, each application has a specific data file related to it, containing all the data records needed by the application. Registrar’s office Class programs Class file Employee data Student data Employee data contains information of the instructor who is an employee. Student data is repeated. Course data Registration data Accounting dept. Accounts programs Accounts file Employee data Student data Tuition data Financial aid
Program – Data Dependence Problems with the File processing Program – Data Dependence Program – Data Dependence is the tight relationship between data stored in various files and the specific programming language used to access or execute them. Every traditional computer program has to describe the input file format with which it works. Program – Data Dependence leads to the potential for Data redundancy. Data Redundancy is the presence of duplicate data in multiple data files. Data redundancy occurs when different divisions, functional areas, and groups in an organization independently collect the same piece of data and build separately their own systems with different languages, platforms, and data file format . Data Redundancy leads to the potential for data inconsistency.
Data Isolation : File processing can leads to difficulty in accessing data from different applications. (Data can not be shared). With applications uniquely designed and implemented, data are scattered in various files, and files may be in different formats. Integrity problems : The data values stored in the database must satisfy certain types of consistency constraints. Developers enforce these constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. Security problems: Enforcing security constraints in the application is much more difficult than that of in the database environment. Not every user of the database system should be able to access all the data. Applications should not have to be developed with regard to how the data are stored. That is, the applications and data in computer systems should be independent - a characteristic called application/data independence.
Databases : The modern approach Registrar’s office Class programs Academic info Student data Database management system Course data Accounting dept. Accounts programs Registration data Employee data Team data Athletics dept. Sports programs Tuition data Financial aid A database, which is a logical group of related files, can eliminate many of the problems associated with a traditional file environment. With the database approach, all the data are typically contained in the same storage location, rather than residing in many different files across the organization. Unlike the traditional approach, in which different programs access the different data files, the database is arranged so that one set of software programs (DBMS) provides access to all the data.
DBMS: Database Management System DBMS consists of a collection of interrelated data and a set of programs to access those data. The primary goal of DBMS is to provide an environment that is both convenient and efficient to use in retrieving and storing database information. DBMS acts as the interface software between users and data stored in the database by hiding the detail of internal
ฐานข้อมูลกลาง คลังข้อมูล มาตรฐานข้อมูล การให้บริการ The enterprise system collects data from various key business processes and stores the data in a single comprehensive data repository where they can be used by other parts of the business ฐานข้อมูลกลาง การให้บริการ กรรมวิธีนำเข้า มาตรฐานข้อมูล คลังข้อมูล
Locating Data in Databases A centralized database has all the related files in one physical location. Central Location
Locating Data in Databases A distributed database has complete copies of a database, or portions of a database, in more than one location. Location B Location A Central Location Location C Location D
There are two types of distributed databases: replicated partitioned A replicated database has complete copies of the entire database in many locations, primarily to alleviate the single-point-of-failure problems of a centralized database as well as to increase user access responsiveness. A partitioned database is subdivided, so that each location has a portion of the entire database (usually the portion that meets users’ local needs). This type of database provides the response speed of localized files without the need to replicate all changes in multiple locations.
Databases : The modern approach Registrar’s office Class programs Academic info Student data Database management system Course data Accounting dept. Accounts programs Registration data Employee data Team data Athletics dept. Sports programs Tuition data Financial aid A database, which is a logical group of related files, can eliminate many of the problems associated with a traditional file environment. With the database approach, all the data are typically contained in the same storage location, rather than residing in many different files across the organization. Unlike the traditional approach, in which different programs access the different data files, the database is arranged so that one set of software programs (DBMS) provides access to all the data.
Abstract View of Data DBMS provides the ability for many different users to share data and process resources. How can a single, unified database meet the differing requirements of so many users? That is, the system hides certain details of how the data are stored and maintained. A DBMS minimizes these problems by providing 3 views of the database data: (Since many database-systems users are not computer trained, developers hide the complexity from users through several levels of abstraction : A physical view : The lowest level of abstraction describes how the data are actually stored. It deals with the actual, physical arrangement and location of data in the direct access storage devices (DASDs). A logical view The next higher level of abstraction describes what data are stored in the database, and what relationships exist among those data. The logical describes the entire database in terms of a small number of relatively simple structures.(Users may not want to know all the technical details of physical storage. An external view : The higher level of abstraction describes only part of the entire database. Many users of the database do not need all this information; instead, they need to access only a part of the database.
Data Abstraction External View 1 View 2 View 3 External External Conceptual (Logical) Level Physical Level
Information Services Data Model Database Management System Logical User Views Data elements and relationships needed for Billing, Load Research, or Maintenance processing Billing Application Load Research Application Maintenance Application Maintenance Data Model Billing & Research Data Model Data elements and relationships (the schema) needed for the support of all information services. Information Services Data Model Database Management System Software Interface The DBMS provides access to the information services databases Service Databases Physical Data Views Organization and location of data on the storage media.
One strength of a DBMS is that while there is only one physical view of the data, there can be an endless number of different external views - one specifically tailored to each individual user, if necessary. Data Model Data Model defines how data are conceptually structured. A collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. Examples of data model : hierarchical, network, relational, object-oriented, entity-relation model, object-relational, hypermedia, and multidimensional models.
Instance และ Schema Instance ของฐานข้อมูล : เหล่าข้อมูลที่ถูกจัดเก็บในฐานข้อมูลในขณะใดขณะหนึ่ง ดังนั้น Instance จึงเปรียบเสมือนกับค่าของตัวแปรในขณะใดขณะหนึ่งใน Programming Language การเปลี่ยนแปลงข้อมูลในระบบฐานข้อมูล มีสาเหตุมาจากการที่ข้อมูลมีการ Insert, Delete หรือ Update เกิดขึ้นตลอดเวลา Schema : เค้าร่าง (โครงสร้าง) ที่ได้จากการออกแบบฐานข้อมูลโดยรวม เช่นการกำหนดว่า ระบบฐานข้อมูลควรประกอบด้วยไฟล์อะไรบ้าง และไฟล์เหล่านั้นมีความสัมพันธ์กันอย่างไร แต่ละไฟล์ควรประกอบด้วยฟิลด์ (fields) อะไรบ้าง
Instance and Schema Instance of Database : The collection of information stored in the database at a particular moment. Instance of Database change over time as information is inserted and deleted. The concept of instances can be understood by analogy to a program written in a programming language. Each variable has a particular value at any given instant of time. Schema : The overall design of the database which must be specified as following : What files are included in a database? What fields are included in each file? How those files are related? A database schema corresponds to the variable declarations in a program.
Database Language A database system provides a Data Definition Language (DDL) to specify the database schema and a Data Manipulation Language (DML) to express database queries and updates. Data Definition Language (DDL) A database schema can be specified by a set of definitions expressed by a special language called a Data-Definition Language. DDL defines what types of information are in the database and how they will be structured. The DDL defines each data element as it appears in the database before that data element is translated into the forms required by the applications.
Examples of Data Definition Language (DDL) Database Languages Examples of Data Definition Language (DDL) CREATE TABLE CREATE VIEW CREATE INDEX ALTER TABLE DROP VIEW DROP INDEX DROP TABLE CREATE TABLE S ( S# Char(5) Not Null, Sname Char(20) Not Null, Status SmallInt Not Null, City Char(15) Not Null, PRIMARY KEY (S#) );
After DDL statement has been interpreted, DBMS generates a special set of tables called the data dictionary or data directory. A data dictionary contains metadata – that is, data about data. The schema of a table is an example of metadata. A database system consults the data dictionary before reading or modifying actual data. ผลจากการคอมไพล์ประโยคที่เขียนด้วย DDL (DDL Interpreter) จะทำให้เกิดตารางที่จะจัดเก็บในไฟล์ชนิดหนึ่งที่เรียกว่า พจนานุกรมข้อมูล (Data Dictionary) ซึ่งเก็บข้อมูลที่เกี่ยวกับ โครงสร้างที่ได้จากการออกแบบฐานข้อมูลนั้น ๆ และถ้ามีการเปลี่ยนแปลงแก้ไข หรือ เรียกใช้ข้อมูล DBMS ต้องอาศัยข้อมูลของโครงสร้างจากไฟล์นี้เสมอ The data values stored in the database must satisfy certain consistency constraints. The database systems check these constraints every time the database is updated.
Data Manipulation Language (DML) Data manipulation can be classified into - Retrieving data - Inserting new records - Deleting records - Updating current fields DML is a language that enables users to access or manipulate data as organized by the appropriate data model.) There are basically 2 types: Procedural DML requires a user to specify what data are needed and how to get those data. Declarative DML (nonprocedural DML) requires a user to specify what data are needed without specifying how to get those data.
Database Access from Application Programs Application Program are written in a Host Language. To access the database, DML statements need to be executed from the hose language by Providing an application program interface (API) that can be used to send DML and DDL statements to the database, and retrieve the result. Ex. ODBC, JDBC. Extending the host language syntax to embed DML calls within Host Language program.. However, DML syntax is different from syntax of Host Language, Query Processor of DBMS consists of : Embedded DML Precompiler DML Compiler Query Evaluation Engine DDL Interpreter
Embedded DML Precompiler : Embedded DML Precompiler converts DML statements to be a normal procedure call in the application whose source code can be any hose language. By this conversion, the application can be compiled by normal compiler of that hose language. DML Compiler: DML Compiler translate DML statement to low-level instructions which Query Evaluation Engine can understand. Embedded DML Precompiler must coordinate with DML compiler to create an appropriate Object Code.
Data Manipulation Language (DML) การใช้ข้อมูลในระบบฐานข้อมูล อาจจำแนกได้เป็น - การเรียกดูข้อมูล (RetrieveData) - การเพิ่มข้อมูลใหม่ (Insert Records) - การลบข้อมูลเดิม (Delete Records) - การเปลี่ยนแปลงแก้ไขข้อมูลเดิม (Update Fields) การใช้ข้อมูลในระดับของผู้ใช้นั้น สามารถกระทำได้โดยผ่านทาง DBMS ซึ่งได้เตรีมภาษาสำหรับการใช้ข้อมูล (DML is a language that enables users to access or manipulate data as ornagized by the appropriate data model.)
การเขียนแอปพลิเคชั่นด้วย Host Language เพื่อร้องขอ (request) ข้อมูลจากระบบฐานข้อมูล ทำได้โดยการใช้คำสั่ง DML ภายใน Host Language นั้น แต่เนื่องจาก DML Syntax แตกต่างไปจาก Syntax ของ Host Language ทั่วไป ดังนั้นโปรแกรมจัดการระบบฐานข้อมูล (DBMS) จึงต้องมี Query Processor ซึ่งประกอบด้วย Embedded DML Precompiler DML Compiler Query Evaluation Engine DDL Interpreter
Embedded DML Precompiler : แปลง (convert) ประโยคคำสั่ง DML ให้เป็นโมดุลหนึ่ง (procedure call) ภายในแอ็ปพลิเคชั่นที่สร้างด้วย host language ภาษาใด ภาษาหนึ่ง ซึ่งมีผลทำให้ แอ็ปพลิเคชั่น สามารถถูก Compile ได้ ด้วยตัว Compiler ปกติของภาษานั้น และสามารถ ประมวลผล object ของ Database (เช่น object ที่เป็น relation) ได้ DML Compiler : แปล (translate) ประโยคคำสั่ง DML ไปเป็น low-level instructions ที่ Query Evaluation Engine สามารถเข้าใจได้ Embedded DML Precompiler ต้องติดต่อกับ DML Compiler เพื่อ สร้าง Object Code ที่เหมาะสม
Query Evaluation Engine : DDL Interpreter : แปลประโยคคำสั่งที่เขียนด้วย DDL Query Evaluation Engine : เมื่อ DML Compiler สร้าง Object Code ที่เป็น Low-level Instructions ขึ้นมาแล้ว ก็จะส่งต่อให้กับ Query Evaluation Engine ทำการประมวลผลต่อไป
ADO Connection Cursor Engine Query Processor Simple Data Provider ODBC Data Consumers Visual Basic Application Visual C++ Application ASP page Other Applications ADO Connection Service Components DML Compiler DDL Interpreter Embedded DML Precompiler Query Evaluation Engine Cursor Engine Query Processor Data Providers Simple Data Provider ODBC Provider Mainframe Data Non Relation Data ODBC Database Server
How Database Represent Information A database management system (DBMS) is a combination of software and data: The physical database : a collection of files that contain the data content. The schema: a specification of the information content of the physical database The database engine: software that supports access to and modification of the contents of the database The data definition and manipulation language: programming languages that support schema definition and database access.
Overall System Structure Query processor components DML compiler Embedded DML precompiler DDL interpreter Query evaluation engine Storage manager components : provides the interface between the low level data stored in the database and the application programs and queries submitted to the system Authorization and integrity manager : tests for the satisfaction of integrity constraints and checks the authority of users to access data. Transaction manager : ensures that the database remains in a consistent state despite system failures and that concurrent transaction executions proceed without conflicting. File manager : manages the allocation of space on disk storage and the data structures used to represent information stored on disk. Buffer manager : responsible for fetching data from disk storage into main memory.
Data files : store the database itself. Data dictionary : stores metadata about the structure of the database. The data dictionary is used heavily. Therefore, great emphasis should be placed on developing a good design and efficient implementation of the dictionary. Indices : provide fast access to data items that hold particular values.
users query procesor Storage manager Disk storage Data dictionary Application programmers Sophisticated users Database administrator Naïve users users use write use use Application interfaces Application programs database scheme query Embedded DML precompiler DML compiler query procesor Application programs object code DDL interpreter Query evaluation engine Authorization and integrity manager Storage manager Transaction manager File manager Buffer manager Disk storage Data dictionary Indices Data files
Database Processing Advantages Data redundancy (Repetition of data) can be reduced. Data inconsistency can be avoided. Data can be shared. Standard of data can be controlled. Concurrent-access to database can be provided. Data Integrity and data security can be maintained.
Business Functions Information Systems Marketing and Sales Human Resources Manufacturing Accounting Finance Vendors Customer Business processes Business processes Business processes Business processes Business processes Organizational Boundaries Organizational Boundaries Marketing and Sales Systems Human Resources Systems Manufacturing Systems Accounting Systems Finance Systems Information Systems Traditional view of systems. In most organizations, separate systems built over a long period of time support discrete business processes and discrete business functions. The organization’s system rarely included vendors and customers.
PRODUCTION MAINTENANCE FINANCE AND ACCOUNTING SALES AND MARKETING
FINANCE AND ACCOUNTING SALES AND MARKETING How to integrate functions and business processes 1. Implement a separate “middleware” bridge to each of these specialized systems to link them all together PRODUCTION MAINTENANCE FINANCE AND ACCOUNTING SALES AND MARKETING
How to integrate functions and business processes 1. Implement a separate “middleware” bridge to each of these specialized systems to link them all together 2. Implement or buy entirely new enterprise applications that can coordinate activities, decision, and knowledge across many different functions, levels, and business units in a firm. Enterprise systems Supply chain management systems SCM) Customer relationship management systems (CRM) Each of these enterprise applications integrates a related set of functions and business process to enhance the performance of the organization as a whole.
also known as enterprise resource planning (ERP) Enterprise systems also known as enterprise resource planning (ERP) The enterprise systems, designed to support organization- wide, coordinates and integrates a related set of major internal functions and business processes to enhance the performance of the organization as a whole. They solve the problem of organizational inefficiencies created by isolated islands of information, business processes, and technology.
Enterprise-wide business Process Manufacturing Accounting Business process Enterprise-wide business Process Vendors Organizational Boundaries Customers Organizational Boundaries Human Resources Marketing and Sales Finance Enterprise systems. Enterprise systems can integrate the key business processes of an entire firm into a single software system that allows information to flow seamlessly throughout the organization. These systems focus primarily on internal processes but may include transactions with customers and vendors.