1.Database Design - Snowflake, Star Schema?
i.Star Schema ( Fact - Dimension Tables) Normalization is not done too much.
a.Too much normalization would need lot join which slows down performance.
b.In Snow Flake the Dimension Tables are further normalized to reduce duplication of data.
That is typical Relational Database model. As its for storing data and reducing as much duplication and
saving physical storage as reports are not generated here so speed is not here the driving force.
Essbase was OLAP model thats the Star Schema design.
(See the pictures below as it looks as Star.)
Snowflake - Star Schema
2. Relational - Dimensional Modelling
In simple words OLTP normalized database are designed with most optimal "transactional" point of view. Databases are normalized to work optimally to a transactional system. When I say optimization of transactional system i mean ..getting to a design state of database structure where all transactional operations like delete,insert,update and select are balanced to give equal or optimum importance to all of them at any point of time...as they are equally valued in a transactional system.
And that what a normalized system offer ..minimal updates possible for a data update,minimal insert possible for new entry,one place delete for category deletion etc (e.g. new product category )...all this is possible a we branch a create master tables .....but this comes at the cost of "select" operation delay ..but as I said its(normalization) not most efficient model for all operations ..its "Optimal"...having said we get other methods to enhance data fetching speed..like indexing etc
On the other hand Dimensional model (mostly used for data-ware house design)..meant for giving importance to only one kind of operations thats Selection of data...as in data-ware houses ..data update/insertion happens periodically ..and its a one time cost.
So if one try to tweak normalized data structure so that only selection is the most important operation at any point in time ...we will end up getting a denormalized (I would say partially denormalized)..dimensional star structure.
- all the foreign keys a one place Fact -no dimension to dimension join (i.e. master to master table join)..snowflake represent same dimension
- ideally designed facts carry only numbers ..measures or foreign keys
- dimension are used to carry description and non aggregatable info
- redundancy of data is ignored ...but in rare cases if Dimensions itself grow too much .snowflake design is seen as option..but that still is avoidable
For details please go through detailed books on this topic.
3.Normalization
Database Normalization Basics
From the book.
Example of an Unnormalized Relation
Order no. Order date Item Lines
Item Code Quantity Price/unit
1000 100112 100 10 10.25
300 15 5.5
2000 101512 200 1 100
3000 102012 100 30 5
4000 110112 400 1 500
300 20 5.5
1)Example : 1NF
Order no. Order date Item Code Quantity Price/unit
1000 100112 100 10 10.25
1000 100112 300 15 5.5
2000 101512 200 1 100
3000 102012 100 30 5
4000 110112 400 1 500
4000 110112 300 20 5.5
Layman point of view can see unnecessary the Unit price is repeating. Its just a flat file(1NF).
There are no composite attributes, and every attribute is single and describes one property.
When we want to move into a computer and store electronically the storage space saving becomes important. Then smart logical database design and physical database design is important.
2)2NF:
A relation is said to be in 2NF , if it is in 1NF and non-key attributes are functionally dependent on the key
attribute(s).
Orders:
Order no. Order date
Order Details:
Order no. Item Code Quantity
Prices:
Item Code Price/unit
3)3NF: If two non-key attributes are functionally dependent, then there will be unnecessary duplication of data.
A 2NF Form Relation
Roll no. Name Department Year Hostel Name
Year, Hostel Name are dependant.
3NF:
Roll no. Name Department Year
Year Hostel Name
4. MOLAP, ROLAP, And HOLAP
MOLAP, ROLAP, And HOLAP
Collected Notes:
Introduction
Data Warehouse - Basic Concepts - SSDI
5.
------------------------------------------------------------------------------------------------------------
Very Basic questions:
1.What is the difference between Data Warehousing, Business Intelligence.?
ans.
In simple term in data warehouse data is stored. From this stored data, management make decision by looking at reports or by a data analysis tool.
In Microsoft the BI tool is Analysis Services , Reporting Services , Integration Services.
In the data warehouse more focus on data modelling, normalization and optimizing data. Deciding on Star schema (OLAP applns. - Fact Dimension Table) or Snowflake Schema ( RDBMs applns.).
------------------------------------------------------------------------------------------------------------
Theoretical Concept Questions.
1.. What is Agile software development ?
Agile_software_development
One of the differences between agile and waterfall, is that testing of the software is conducted at different points during the software development lifecycle. In the waterfall model, there is a separate testing phase after implementation. In Agile XP, testing is done concurrently with implemetation.
------------------------------------------------------------------------------------------------------------
[Just Brief theoretical knowledge]*
2.Software as a service (SaaS) ?
software as a Service (SaaS) is a model of software deployment where an application is hosted as a service outside of the customer’s site and delivered to customers across the Internet.
saas - in nutshell
saas - deployment model
--
Software_as_a_service
sometimes referred to as "on-demand software" , is a software delivery model in which software and associated data are centrally hosted on the cloud. SaaS is typically accessed by users using a thin client via a web browser.
SaaS has become a common delivery model for many business applications, including accounting, collaboration, customer relationship management (CRM), management information systems (MIS), enterprise resource planning (ERP), invoicing, human resource management (HRM), content management (CM) and service desk management.[
The term "software as a service" (SaaS) is considered to be part of the nomenclature of cloud computing, along with infrastructure as a service (IaaS), platform as a service (PaaS),desktop as a service (DaaS), and backend as a service (BaaS).[8]
Architecture
While an exception rather than the norm, some SaaS solutions do not use multi-tenancy, or use other mechanisms—such as virtualization—to cost-effectively manage a large number of customers in place of multi-tenancy.[12] Whether multi-tenancy is a necessary component for software-as-a-service is a topic of controversy.[13]
Open integration protocols
Since SaaS applications cannot access a company's internal systems (databases or internal services), they predominantly offer integration protocols and application programming interfaces (APIs) that operate over a wide area network. Typically, these are protocols based on HTTP, REST, SOAP and JSON.
References:
Developing a SaaS product? Scrum is NOT for you.
Scrum:
More methodical way of developing.
------------------------------------------------------------------------------------------------------------
No comments:
Post a Comment