Tuesday, January 22, 2013

DataWareHousing - ETL Project Life Cycle ( Simple to understand )

-> Datawarehousing projects are categorized into 4 types.
1) Development Projects.
2) Enhancement Projects
3) Migration Projects
4) Production support Projects.

-> The following are the different phases involved in a ETL project development life cycle.
1) Business Requirement Collection ( BRD )
2) System Requirement Collection ( SRD )
3) Design Phase
a) High Level Design Document ( HRD )
b) Low level Design Document ( LLD )
c) Mapping Design
4) Code Review
5) Peer Review
6) Testing
a) Unit Testing
b) System Integration Testing.
c) USer Acceptance Testing ( UAT )
7) Pre - Production
8) Production ( Go-Live )

Business Requirement Collection :-
---------------------------------------------
-> The business requirement gathering start by business Analyst, onsite technical lead and client business users.
-> In this phase,a Business Analyst prepares Business Requirement Document ( BRD ) (or) Business Requirement Specifications ( BRS )
-> BR collection takes place at client location.
-> The o/p from BR Analysis are
-> BRS :- Business Analyst will gather the Business Requirement and document in BRS
-> SRS :- Senior technical people (or) ETL architect will prepare the SRS which contains s/w and h/w requirements.
The SRS will includes
a) O/S to be used ( windows or unix )
b) RDBMS required to build database ( oracle, Teradata etc )
c) ETL tools required ( Informatica,Datastage )
d) OLAP tools required ( Cognos ,BO )
The SRS is also called as Technical Requirement Specifications ( TRS )

Designing and Planning the solutions :-
------------------------------------------------
-> The o/p from design and planning phase is
a) HLD ( High Level Design ) Document
b)LLD ( Low Level Design ) Document
HLD ( High Level Design ) Document : -
An ETL Architect and DWH Architect participate in designing a solution to build a DWH.
An HLD document is prepared based on Business Requirement.
LLD ( Low Level Design ) Document : -

Based on HLD,a senior ETL developer prepare Low Level Design Document
The LLD contains more technical details of an ETL System.
An LLD contains data flow diagram ( DFD ), details of source and targets of each mapping.
An LLD also contains information about full and incremental load.
After LLD then Development Phase will start

Development Phase ( Coding ) :-
--------------------------------------------------
-> Based an LLD, the ETL team will create mapping ( ETL Code )
-> After designing the mappings, the code ( Mappings ) will be reviewed by developers.
Code Review :-
-> Code Review will be done by developer.
-> In code review,the developer will review the code and the logic but not the data.
-> The following activities takes place in code review
-> You have to check the naming standards of transformation,mappings of data etc.
-> Source and target mapping ( Placed the correct logic or not in mapping )
Peer Review :-
-> The code will reviewed by your team member ( third party developer )
Testing:-
--------------------------------

The following various types testing carried out in testing environment.
1) Unit Testing
2) Development Integration Testing
3) System Integration Testing
4) User Acceptance Testing

Unit Testing :-
-> A unit test for the DWH is a white Box testing,It should check the ETL procedure and Mappings.
-> The following are the test cases can be executed by an ETL developer.
1) Verify data loss
2) No.of records in the source and target
3) Dataload/Insert
4) Dataload/Update
5) Incremental load
6) Data accuracy
7) verify Naming standards.
8) Verify column Mapping
-> The Unit Test will be carried by ETL developer in development phase.
-> ETL developer has to do the data validations also in this phase.
Development Integration Testing -
-> Run all the mappings in the sequence order.
-> First Run the source to stage mappings.
-> Then run the mappings related to dimensions and facts.
System Integration Testing :-
-> After development phase,we have to move our code to QA environment.
-> In this environment,we are giving read-only permission to testing people.
-> They will test all the workflows.
-> And they will test our code according to their standards.
User Acceptance Testing ( UAT ) :-
-> This test is carried out in the presence of client side technical users to verify the data migration from source to destination.

Production Environment :-
---------------------------------
-> Migrate the code into the Go-Live environment from test environment ( QA Environment ).

17 comments:

  1. Informatica Online Training, ONLINE TRAINING – IT SUPPORT – CORPORATE TRAINING http://www.21cssindia.com/courses/informatica-online-training-31.html The 21st Century Software Solutions of India offers one of the Largest conglomerations of Software Training, IT Support, Corporate Training institute in India - +919000444287 - +917386622889 - Visakhapatnam,Hyderabad Informatica Online Training, Informatica Training, Informatica, Informatica Online Training| Informatica Training| Informatica| "Courses at 21st Century Software Solutions
    Talend Online Training -Hyperion Online Training - IBM Unica Online Training - Siteminder Online Training - SharePoint Online Training - Informatica Online Training - SalesForce Online Training - Many more… | Call Us +917386622889 - +919000444287 - contact@21cssindia.com
    Visit: http://www.21cssindia.com/courses.html"

    ReplyDelete
    Replies
    1. Thank you for excellent article.You made an article that is interesting.
      Tavera car for rent in chennai|Indica car for rent in chennai|innova car for rent in chennai|mini bus for rent in chennai|tempo traveller for rent in chennai
      Keep on the good work and write more article like this...

      Great work !!!!

      Delete
  2. Hello Ramamohan, First of all i would like thank you so much. You are doing a great job. I observed that your blogs are the best one in web. i also read your blogs and i suggested to all my friends to read your valuable information.

    Thank you so much.

    Please write some followings blogs.


    Please write blogs for dimension to fact table load... with easy and simple steps as given in previous blogs.

    Please write blog for Unix script use in real time.

    ReplyDelete
  3. Nice blog about informatica. If you want to know more about informatica qujestions and answers for experience, please contact us from the links below


    The article is meant to help informatica interview questions and answers for experienced individuals or students preparing on this particular topic. There are so many new important informatica scenario based questions points, question covered and different new points all covered in this piece all at ease. The best informatica interview questions thing about the article is that it makes studying and preparation quite simple for individuals and accordingly they can prepare for the informatica questions interview.

    ReplyDelete
  4. I just see the post i am so happy the post of information's.So I have really enjoyed and reading your blogs for these posts.Any way I’ll be subscribing to your feed and I hope you post again soon.

    Hadoop Training in Chennai

    ReplyDelete
  5. Great articles, first of all Thanks for writing such lovely Post! Earlier I thought that posts are the only most important thing on any blog. But here a Shout me loud found how important other elements are for your blog.Keep update more posts..

    Best Dental Clinic In Vellore

    Best Dental Clinic In OMR

    ReplyDelete
  6. Really Good blog post about data warehouse etl project lifecycle.provided a helpful information.I hope that you will post more updates like this.
    Digital marketing company in Chennai

    ReplyDelete
  7. This comment has been removed by the author.

    ReplyDelete
  8. I Feel extremely cheerful to have seen your website page and anticipate such a significant number of all the more engaging circumstances perusing here. Much appreciated yet again for every one of the points of interest.
    Hadoop Online Training in bangalore

    ReplyDelete

DataWareHousing - ETL Project Life Cycle ( Simple to understand ) Warehousing -> Datawarehousing projects are categoriz...