Sunday, August 3, 2014

DBMS IMPORTANT QUESTIONS AND ANSWERS

What is a database? Describe the advantages and disadvantages of using a DBMS. (7)
Ans: Database – A database is a collection of related data and/or information stored so that it is available to many users for different purposes.
Advantages Of DBMS
1. Centralized Management and Control - One of the main advantages of using a database system is that the organization can exert, via the DBA, centralized management and control over the data.
2. Reduction of Redundancies and Inconsistencies - Centralized control avoids unnecessary duplication of data and effectively reduces the total amount of data storage required. Removing redundancy also eliminates inconsistencies.
3. Data Sharing - A database allows the sharing of data under its control by any number of application programs or users.
4. Data Integrity - Data integrity means that the data contained in the database is both accurate and consistent. Centralized control can also ensure that adequate checks are incorporated in the DBMS to provide data integrity.
5. Data Security - Data is of vital importance to an organization and may be confidential. Such confidential data must not be accessed by unauthorized persons. The DBA, who has the ultimate responsibility for the data in the DBMS, can ensure that proper access procedures are followed. Different levels of security can be implemented for various types of data and operations.
6. Data Independence - Data independence is the capacity to change the schema at one level of a database system without having to change the schema at the next level. It is usually considered from two points of view: physical data independence and logical data independence. Physical data independence is the capacity to change the internal schema without having to change the conceptual schema. Logical data independence is the capacity to change the conceptual schema without having to change external schemas or application programs. (A small sketch follows this list.)
7. Providing Storage Structures for Efficient Query Processing - Database systems provide capabilities for efficiently executing queries and updates. Auxiliary files called indexes are used for this purpose.
8. Backup and Recovery - These facilities are provided to recover databases from hardware and/or software failures.
Some other advantages are:
• Reduced Application Development Time
• Flexibility
• Availability of up-to-date Information
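As a small illustration of physical data independence, the sqlite3 sketch below adds an index (an internal-schema change) without touching the application's query. The accounts table and the query are invented for the example, not taken from the answer above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (acc_no INTEGER, balance REAL)")
con.execute("INSERT INTO accounts VALUES (1, 500.0)")

query = "SELECT balance FROM accounts WHERE acc_no = ?"
print(con.execute(query, (1,)).fetchone())   # (500.0,)

# Internal-schema change: add an index. The application's query text above
# is untouched -- this is physical data independence.
con.execute("CREATE INDEX idx_acc ON accounts (acc_no)")
print(con.execute(query, (1,)).fetchone())   # same result, same query
```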
Disadvantages Of DBMS
1. Cost of Software/Hardware and Migration - A significant disadvantage of the DBMS approach is cost.
2. Reduced Response and Throughput - The processing overhead introduced by the DBMS to implement security, integrity, and sharing of the data causes a degradation of the response and throughput times.
3. Problem with Centralization - Centralization also means that the data is accessible from a single source, namely the database. This increases the potential for security breaches and for disruption of the operation of the organization because of downtimes and failures.






IMP: ACID PROPERTIES
Desirable properties of a transaction:
To ensure the integrity of data, we require that the database system maintain the following properties of transactions.
1. Atomicity: either all operations of a transaction are reflected in the database or none are.
2. Consistency: execution of a transaction in isolation preserves the consistency of the database.
3. Isolation: even though multiple transactions may execute concurrently, each transaction is unaware of the other transactions.
4. Durability: after a transaction completes successfully, the changes it has made persist.
These properties are also called ACID properties.
Example:
Let Ti: read(A);
        A := A - 50;
        write(A);
        read(B);
        B := B + 50;
        write(B);
Consistency:
The consistency requirement here is that the sum of A and B should be the same before and after the transaction is executed. The consistency property of the transaction thus prevents the unwanted creation or destruction of money.
Ensuring consistency for an individual transaction is the responsibility of the application programmer who codes the transaction.
Atomicity:
Suppose that a failure occurs after the write(A) operation but before the write(B) operation. Then the values reflected in the database become A=950 and B=2000, so Rs 50 has simply vanished. Hence a transaction should be completed in its entirety or not at all.
Durability:
Once the execution of the transaction has completed successfully and the user who initiated the transaction has been notified that the transfer of funds has taken place, no system failure should result in the loss of the data corresponding to this transfer.
Isolation:
Let transaction T1 transfer Rs 50 from account A to account B. Suppose that after T1 reads and subtracts 50 from account A, but before it adds 50 to account B, a new transaction T2 interrupts and reads both accounts. The values read by T2 are A=950 and B=2000 (the initial values being A=1000 and B=2000), so the total appears to be 2950 rather than 3000: T2 observes an inconsistent intermediate state.
Hence, in order to maintain consistency, isolation of transactions is required. Isolation can be enforced by serial execution of transactions or, more practically, by a concurrency control mechanism.
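The following is a minimal sqlite3 sketch of the A-to-B transfer above; the table and column names are illustrative, not from the original answer. The try/except shows atomicity, the commit shows durability, and the final sum check shows consistency.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
con.commit()

try:
    # Atomicity: the whole transfer is one transaction -- both writes or neither.
    con.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    con.execute("UPDATE account SET balance = balance + 50 WHERE name = 'B'")
    con.commit()           # durability: once committed, the change persists
except sqlite3.Error:
    con.rollback()         # a failure mid-way undoes the partial update

# Consistency: the sum of A and B is unchanged.
print(con.execute("SELECT SUM(balance) FROM account").fetchone())  # (3000,)
```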


What are deadlocks? How can they be avoided? (4)
Ans: Deadlock refers to a specific condition when two or more processes are each waiting for another to release a resource, or more than two processes are waiting for resources in a circular chain. Deadlock is a common problem in multiprocessing, where many processes share a specific type of mutually exclusive resource known as a software (or soft) lock. Computers intended for the time-sharing and/or real-time markets are often equipped with a hardware lock (or hard lock) which guarantees exclusive access to processes, forcing serialization. Deadlocks are particularly troubling because there is no general solution for avoiding (soft) deadlocks.
One method of deadlock avoidance is based on timestamps; it has two variations, wait-die and wound-wait, described in the next answer.

Define and differentiate between the following: (6)
(i) Deadlock prevention.
(ii) Deadlock detection.
(iii) Deadlock avoidance.
Ans:
(i) Deadlock prevention: These protocols ensure that the system will never enter a deadlock state. There are two approaches to deadlock prevention. One approach ensures that no cyclic waits can occur by ordering the requests for locks, or by requiring all locks to be acquired together. The other approach is closer to deadlock recovery and performs transaction rollback, instead of waiting for a lock, whenever the wait could potentially result in a deadlock. In this approach there are two timestamp-based techniques: wait-die and wound-wait. In wait-die, an older transaction is allowed to wait if it needs data held by a younger transaction, whereas a younger transaction is rolled back (it "dies") if it needs data held by an older transaction. In wound-wait, an older transaction that needs data held by a younger transaction rolls back (wounds) the younger transaction, whereas a younger transaction is allowed to wait if it needs data held by an older transaction. (A small decision-rule sketch for these two schemes follows this answer.)
(ii) Deadlock detection: If a system does not employ some protocol that ensures deadlock freedom, then a detection and recovery scheme must be used. An algorithm that
examines the state of the system is invoked periodically to determine whether a deadlock has occurred. If one has, then the system must attempt to recover from the deadlock.
(iii) Deadlock avoidance: These protocols also ensure that the system will never enter a deadlock state, but the way this is done differs from deadlock prevention. Here, whenever a transaction requests some data, the system considers the resources currently allocated to each process and the future requests and releases of each process, to decide whether the current request can be satisfied or must wait so as to avoid a possible future deadlock. For this, the transactions have to give additional information about how the data will be requested in the future.
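As a rough illustration of the two timestamp schemes named above, here is a small Python sketch; the function names and the smaller-timestamp-means-older convention are my own, purely for illustration.

```python
# Smaller timestamp = older transaction.
def wait_die(requester_ts: int, holder_ts: int) -> str:
    # Older requester may wait; younger requester "dies" (is rolled back).
    return "wait" if requester_ts < holder_ts else "rollback requester"

def wound_wait(requester_ts: int, holder_ts: int) -> str:
    # Older requester "wounds" (rolls back) the younger holder;
    # younger requester waits for the older holder.
    return "rollback holder" if requester_ts < holder_ts else "wait"

print(wait_die(1, 2))     # older asks younger -> wait
print(wait_die(2, 1))     # younger asks older -> rollback requester (die)
print(wound_wait(1, 2))   # older asks younger -> rollback holder (wound)
print(wound_wait(2, 1))   # younger asks older -> wait
```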


Most asked:
Explain the shadow paging recovery technique. (8)
Ans: Shadow paging is a technique used to achieve atomic and durable transactions, and provides the ability to manipulate pages in a database. During a transaction, the pages
affected by the transaction are copied from the database file into a workspace, such as volatile memory, and modified in that workspace. When a transaction is committed, all of
the pages that were modified by the transaction are written from the workspace to unused pages in the database file. During execution of the transaction, the state of the database exposed to the user is that in which the database existed prior to the transaction, since the
database file still contains the original versions of the modified pages, as they existed before being copied into the workspace. If a user accesses the database before the transaction is complete, or upon recovery from a failure, it will appear as though the transaction had not occurred.
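To make the copy-on-write idea concrete, here is a toy Python sketch. The dictionary-as-disk, the page-table lists, and all names are illustrative simplifications, not the actual on-disk structures.

```python
# "Disk" holds pages keyed by page id; page tables map logical page numbers
# to page ids.
disk = {0: "old page 0", 1: "old page 1"}
shadow_table = [0, 1]                # what readers see during the transaction
current_table = list(shadow_table)   # the transaction's private copy
next_id = 2

def write_page(logical_no, data):
    # Copy-on-write: the update goes to an unused page; the original survives.
    global next_id
    disk[next_id] = data
    current_table[logical_no] = next_id
    next_id += 1

write_page(1, "new page 1")
# Before commit, the exposed state is still the pre-transaction one.
print([disk[pid] for pid in shadow_table])   # ['old page 0', 'old page 1']
shadow_table = current_table                 # commit: one atomic pointer swap
print([disk[pid] for pid in shadow_table])   # ['old page 0', 'new page 1']
```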


post 10) What are deferred modification and immediate modification techniques for recovery? How does recovery take place in case of a failure under these techniques? (7)
Ans:
Deferred Update – The deferred update techniques do not physically update the database on disk until after a transaction reaches its commit point; only then are the updates recorded in the database. The transaction force-writes its log records to disk before recording the updates in the database.
Immediate Update – The immediate update techniques may apply changes to the database on disk before the transaction reaches a successful conclusion. However, these changes are typically recorded in the log on disk by force-writing before they are applied to the database, making recovery still possible.
Recovery – In the deferred update technique, if a transaction fails before reaching its commit point, it will not have changed the database in any way, so UNDO is not needed. Deferred update leads to a recovery algorithm known as NO-UNDO/REDO. This approach, when used with certain concurrency control methods, is designed never to require transaction rollback, and recovery simply consists of redoing, from the log, the operations of transactions committed after the last checkpoint. In the immediate update technique, if a transaction fails after recording some changes in the database but before reaching its commit point, the effect of its operations on the database must be undone; that is, the transaction must be rolled back. This technique requires both UNDO and REDO operations and is used most often in practice. Therefore, it is also known as the UNDO/REDO algorithm.
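As a rough sketch of how the UNDO/REDO decision falls out of the log, consider this toy Python fragment. The log-record format is hypothetical; real systems use checkpoints and more elaborate records.

```python
# Toy log for the immediate update (UNDO/REDO) technique.
# Record formats: (txn, op[, item, old_value, new_value]).
log = [
    ("T1", "start"),
    ("T1", "write", "A", 1000, 950),
    ("T2", "start"),
    ("T2", "write", "B", 2000, 2050),
    ("T1", "commit"),
    # crash here: T1 committed, T2 did not
]

committed = {rec[0] for rec in log if rec[1] == "commit"}
db = {}
for rec in log:                      # REDO committed work, forward pass
    if rec[1] == "write" and rec[0] in committed:
        db[rec[2]] = rec[4]          # reapply the new value
for rec in reversed(log):            # UNDO uncommitted work, backward pass
    if rec[1] == "write" and rec[0] not in committed:
        db[rec[2]] = rec[3]          # restore the old value
print(db)                            # {'A': 950, 'B': 2000}
```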


post 9) Define checkpoint and its impact on database recovery. (3)
Ans:
In a large on-line database system there could be hundreds of transactions handled per minute, so the log for this type of database contains a very large volume of information. A scheme called checkpointing is used to limit the volume of log information that has to be handled and processed in the event of a system failure involving the loss of volatile information. The checkpoint scheme is an additional component of the logging scheme. A checkpoint operation, performed periodically, copies log information onto stable storage. For all transactions active at the checkpoint, their identifiers and their database modification actions, which at that time are reflected only in the database buffers, are propagated to the appropriate storage.
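A minimal sketch of the idea, assuming a toy string-based log; real checkpoint records also list the transactions active at the checkpoint, which is glossed over here.

```python
# The log is a list of strings here purely for illustration.
log = ["T1 start", "T1 write A", "T1 commit", "checkpoint",
       "T2 start", "T2 write B", "T2 commit"]

# Recovery locates the most recent checkpoint and examines mainly the
# records after it, instead of scanning the entire log.
cp = max(i for i, rec in enumerate(log) if rec == "checkpoint")
print(log[cp + 1:])   # ['T2 start', 'T2 write B', 'T2 commit']
```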

Most asked:
How does the two-phase locking protocol ensure serializability in database schedules? (4)
Ans:
A transaction is said to follow the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction. Such a transaction can be divided into two phases: an expanding or growing (first) phase and a shrinking (second) phase. The two-phase locking protocol ensures serializability in database schedules.
Consider any transaction. The point in the schedule where the transaction has obtained its final lock (the end of its growing phase) is called the lock point of the transaction. Now, transactions can be ordered according to their lock points; this ordering is, in fact, a serializability ordering for the transactions.
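Here is a minimal Python sketch of the two-phase discipline for a single transaction; the lock/unlock helpers are hypothetical stand-ins for a real lock manager.

```python
# held: locks currently held; shrinking: True once the first unlock happens.
held = set()
shrinking = False

def lock(item):
    # Growing phase only: a lock after the first unlock would violate 2PL.
    assert not shrinking, "2PL violated: lock requested in shrinking phase"
    held.add(item)

def unlock(item):
    global shrinking
    shrinking = True        # the lock point has passed; only releases now
    held.discard(item)

lock("A"); lock("B")        # growing phase
unlock("A"); unlock("B")    # shrinking phase
# lock("C") here would raise AssertionError -- the protocol forbids it.
```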

Differentiate between
(i) Procedural and non-procedural languages.
(ii) Key and superkey
(iii) Primary and secondary storage
Ans: (i) Procedural and non-procedural languages - A procedural language specifies the operations to be performed on the existing data to derive the results, as well as the sequence in which those operations will be performed. A non-procedural language specifies only the result or information required, not how it is to be obtained. (A short sketch contrasting the two styles follows this answer.)
(ii) Key and superkey - A key is a single attribute or a combination of two or more attributes of an entity set that is used to identify one or more instances (rows) of the set (table). If we add additional attributes to a primary key, the augmented key is called a superkey. The primary key is therefore a minimal superkey.
(iii) Primary and secondary storage – A primary storage device stores data temporarily. Primary storage is generally used by the processing unit to temporarily hold data, intermediate results, and final results before they are stored in secondary storage, because secondary storage devices are not directly accessible by the CPU. If we want to store data permanently, secondary storage devices are required. Secondary storage devices are slower than primary storage devices.
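To illustrate point (i), the fragment below contrasts the two styles using Python with its built-in sqlite3 module; the emp table and its data are made up for the example. The loop spells out how to compute the result, while the SQL WHERE clause states only what is wanted.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
con.executemany("INSERT INTO emp VALUES (?, ?)", [("a", 900), ("b", 1200)])

# Procedural style: spell out HOW -- iterate, test, collect, in order.
high = []
for name, salary in con.execute("SELECT name, salary FROM emp"):
    if salary > 1000:
        high.append(name)

# Non-procedural (declarative) style: state only WHAT is wanted.
also_high = [r[0] for r in con.execute("SELECT name FROM emp WHERE salary > 1000")]
print(high, also_high)   # ['b'] ['b']
```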

post 6) Explain the concept of a data model. What data models are used in database management systems? (8)
Data Model – A model is an abstraction process that hides irrelevant details while highlighting details relevant to the applications at hand. Similarly, a data model is a collection of concepts that can be used to describe the structure of a database and provides the necessary means to achieve this abstraction. The structure of a database means the data types, relationships, and constraints that should hold for the data. In general a data model consists of two elements:
• A mathematical notation for expressing data and relationships.
• Operations on the data that serve to express queries and other manipulations of the data.
Data Models used in DBMSs:
Hierarchical Model - It was developed to model the many types of hierarchical organizations that exist in the real world. It uses tree structures to represent relationships among records. In the hierarchical model, no dependent record can occur without its parent record occurrence, and no dependent record occurrence may be connected to more than one parent record occurrence.
Network Model - It was formalised in the late 1960s by the Database Task Group of the Conference on Data Systems Languages (DBTG/CODASYL). It uses two different data structures to represent the database entities and the relationships between them, namely the record type and the set type. In the network model, the relationships as well as the navigation through the database are predefined at database creation time.
Relational Model - The relational model was first introduced by E.F. Codd of IBM Research in 1970. The model uses the concept of a mathematical relation (like a table of values) as its basic building block, and has its theoretical basis in set theory and first-order predicate logic. The relational model represents the database as a collection of relations.
Object Oriented Model – This model is based on the object-oriented programming paradigm. It includes the features of OOP like inheritance, object identity, encapsulation, etc. It also supports a rich type system, including structured and collection types.
Object Relational Model – This model combines the features of both the relational model and the object-oriented model. It extends the traditional relational model with a variety of features such as structured and collection types.


Post 5) Explain the concepts of the relational data model. Also discuss its advantages and disadvantages.
Relational Data Model – The relational model was first introduced by Prof. E.F. Codd of IBM Research in 1970 and attracted immediate attention due to its simplicity and mathematical foundation. The model uses the concept of a mathematical relation (like a table of values) as its basic building block, and has its theoretical basis in set theory and first-order predicate logic. The relational model represents the database as a collection of relations. The relational model, like all other models, consists of three basic components:
• a set of domains and a set of relations
• operations on relations
• integrity rules
Advantages
• Ease of use – The representation of any information as tables consisting of rows and columns is quite natural, and therefore even first-time users find it attractive.
• Flexibility – Different tables from which information has to be linked and extracted can be easily manipulated by operators such as project and join to give information in the form in which it is desired (a small project/join demo follows this answer).
• Security – Security control and authorization can also be implemented more easily, by moving sensitive attributes in a given table into a separate relation with its own authorization controls. If the authorization requirements permit, a particular attribute can be joined back with others to enable full information retrieval.
• Data Independence – Data independence is achieved more easily with the normalized structure used in a relational database than with the more complicated tree or network structures. It also frees users from the details of the storage structure and access methods.
• Data Manipulation Language – Responding to ad-hoc queries by means of a language based on relational algebra and relational calculus is easy in the relational database approach. The model provides simplicity in data organization and the availability of reasonably simple to very powerful query languages.
Disadvantages
• Performance – If the number of tables between which relationships are to be established is large, and the tables themselves are voluminous, the performance in responding to queries definitely degrades.
• Unsuitable for Hierarchies – While the relational database approach is logically attractive and commercially feasible, if the data is, for example, naturally organized in a hierarchical manner and stored as such, the hierarchical approach may give better results.
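As a small demonstration of the project and join operators mentioned under Flexibility, here is a sqlite3 sketch; the dept/emp schema is invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT)")
con.execute("CREATE TABLE emp (eno INTEGER PRIMARY KEY, ename TEXT, dno INTEGER)")
con.execute("INSERT INTO dept VALUES (1, 'Sales')")
con.execute("INSERT INTO emp VALUES (10, 'Alice', 1)")

# Project: keep only the chosen columns.
print(con.execute("SELECT ename FROM emp").fetchall())   # [('Alice',)]
# Join: combine the two tables on the shared dno attribute.
print(con.execute(
    "SELECT ename, dname FROM emp JOIN dept ON emp.dno = dept.dno"
).fetchall())                                            # [('Alice', 'Sales')]
```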

Explain the three-level architecture of the database system.
A DBMS that provides three levels of data abstraction is said to follow the three-level architecture. The goal of the three-schema architecture is to separate the user applications from the physical database. The view at each of these levels is described by a schema. The processes of transforming requests and results between levels are called mappings. In this architecture, schemas can be defined at the following three levels:
External Level or Subschema –
It is the highest level of database abstraction where only those portions of the database of concern to a user or application program are
included. Any number of user views (some of which may be identical) may exist for a given global or conceptual view. Each external view is described by means of a schema called an external schema or subschema.
Conceptual Level or Conceptual Schema -
At this level of database abstraction all the database entities and the relationships among them are included. One conceptual view
represents the entire database. This conceptual view is defined by the conceptual schema. There is only one conceptual schema per database. The description of data at this level is in a format independent of its physical representation. It also includes features that specify the checks to retain data consistency and integrity.
Internal Level or Physical Schema –
It is closest to the physical storage method used. It indicates how the data will be stored and describes the data structures and access methods to be used by the database. The internal view is expressed by the internal schema.
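The external/conceptual split can be loosely illustrated with a view, as in this sqlite3 sketch; the schema names are invented, and the internal level is whatever SQLite does with files and pages under the hood.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Conceptual level: the full logical schema.
con.execute("CREATE TABLE employee (eno INTEGER PRIMARY KEY, name TEXT, salary REAL)")
con.execute("INSERT INTO employee VALUES (1, 'Alice', 50000.0)")

# External level: a subschema (view) exposing only what one user group sees.
con.execute("CREATE VIEW emp_public AS SELECT eno, name FROM employee")
print(con.execute("SELECT * FROM emp_public").fetchall())   # [(1, 'Alice')]
# The internal level (files, pages, indexes) is handled by SQLite itself and
# is invisible to both queries above.
```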

Explain five duties of a Database Administrator. (5)
Ans:
1. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database.
2. Mappings between the internal and the conceptual levels, as well as between the conceptual and external levels, are also defined by the DBA.
3. The DBA ensures that appropriate measures are in place to maintain the integrity of the database and that the database is not accessible to unauthorized users.
4. The DBA is responsible for granting permissions to the users of the database and stores the profile of each user in the database.
5. The DBA is responsible for defining procedures to recover the database from failures with minimal loss of data.
2) Explain the terms primary key, candidate key and foreign key.
Ans:
Primary Key – The primary key is one of the candidate keys, chosen to uniquely identify each row in the relation.
Candidate Key – A candidate key of an entity set is a minimal superkey that uniquely identifies each row in the relation.
Foreign Key – Let there be two relations (tables) R and S. Any candidate key of relation R that is referred to in relation S is called a foreign key in relation S and the referenced key in relation R. Relation R is also called the parent table, and relation S the child table. (A small schema sketch follows.)
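A small sqlite3 schema sketch of the three kinds of key; the table and column names are invented for the example, and note that sqlite3 enforces foreign keys only after the PRAGMA.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # sqlite enforces FKs only when enabled
con.execute("""CREATE TABLE dept (
    dno INTEGER PRIMARY KEY,   -- primary key: the chosen candidate key
    dname TEXT UNIQUE          -- dname is another candidate key
)""")
con.execute("""CREATE TABLE emp (
    eno INTEGER PRIMARY KEY,
    dno INTEGER REFERENCES dept(dno)  -- foreign key: dept is the parent table
)""")
con.execute("INSERT INTO dept VALUES (1, 'Sales')")
con.execute("INSERT INTO emp VALUES (10, 1)")       # ok: dept 1 exists
try:
    con.execute("INSERT INTO emp VALUES (11, 99)")  # no dept 99
except sqlite3.IntegrityError as e:
    print("rejected:", e)                           # FOREIGN KEY constraint failed
```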

