6.6.11

How to prepare for DataStage Interview - Part 1

Many friends are trying to crack DataStage interviews but, for lack of real-world project exposure, are not able to make it through. The tips below are meant to help them feel confident while pursuing an interview and to prepare so that they can handle any kind of question and answer it tactfully.




Tip number one - Always remember that DataStage is not a standalone product. It is part of Information Server, and one should thoroughly understand its place in Information Server. So what does this actually mean? It means one should understand the architecture of Information Server: where DataStage stands and how it is tied up with the other components of Information Server. Most of the time, when asked, friends are not able to clearly differentiate between DataStage and Information Server. One should be very clear about the differences.

Tip number two - Know your project set-up. This is very important because it shows how much exposure you have had in your project. Most of the time friends answer, "Step one, which is to extract data from the source system, is handled by another team." This is not the correct answer. One should have at least basic knowledge of what the source system is and how data flows from the source to the dumping or staging area. This can be done in multiple ways: a shell script that fetches the source files using PuTTY, a Windows batch file that copies files from another location, a DataStage job that reads directly from a shared location or uses the FTP stage to download a source file, or a source database that DataStage connects to directly to bring data into the staging area. Whatever the case, we should be aware of it. So know the source; unless we know the source, it is like working in the dark.

In a typical project set-up, data flows as follows:

Source System (File/Complex Files/Database) --> Staging Area (File System/Database) --> Common Objects Model (Dimensions) --> Summary Tables (Preparation for Fact Load) --> Fact Load.

Not everyone gets the chance to work on all the sections of the project listed above. In modular development a developer mostly gets stuck with one type of job, usually designing SCDs (slowly changing dimensions) or performing normal transformations. One should ask questions about the next module of the project, such as "What is the frequency of the fact load or dimension update?", "What SCD types are being used other than just type two updates?" or "How many facts and dimensions does this project have altogether?" Also, one should think beyond just customer and product scenarios. DataStage is not used only for dimensionally modelled warehouses. It can also be used for ordinary requirements where transformed information is needed at the destination side.
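Since type two SCD updates come up in almost every interview, it helps to be able to explain the idea without the tool in front of you. The sketch below is plain Python, not DataStage, and only illustrates the general Type 2 technique: when a tracked attribute changes, the current row is expired and a new row is inserted, so history is preserved. The `CustomerRow` class, the `apply_scd2` function and the column names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    customer_id: int          # business (natural) key from the source
    city: str                 # the attribute whose history we track
    valid_from: date
    valid_to: Optional[date]  # None means the row is still current
    is_current: bool


def apply_scd2(dimension: List[CustomerRow], customer_id: int,
               new_city: str, load_date: date) -> List[CustomerRow]:
    """Apply one incoming source record to the dimension, Type 2 style."""
    for row in dimension:
        if row.customer_id == customer_id and row.is_current:
            if row.city == new_city:
                return dimension  # no change, so history stays untouched
            # Expire the current version instead of overwriting it.
            row.valid_to = load_date
            row.is_current = False
            break
    # Insert the new version (this also covers brand-new customers).
    dimension.append(CustomerRow(customer_id, new_city, load_date, None, True))
    return dimension


# Hypothetical sample data: one customer who moves to another city.
dim = [CustomerRow(1, "Pune", date(2010, 1, 1), None, True)]
apply_scd2(dim, 1, "Mumbai", date(2011, 6, 6))
for row in dim:
    print(row.customer_id, row.city, row.valid_from, row.valid_to, row.is_current)
```

After the call the dimension holds two rows for the same customer: the old one closed off with an end date, and the new one flagged as current. A Type 1 update, by contrast, would simply overwrite the city on the existing row and keep no history.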

Tip number three - No jargon at all. Remember that the person interviewing you is from a different environment and does not know what KPGH or COM or XYZ stands for. One should not use short forms while explaining projects. Even while explaining scenarios, one should use tables from a sample star schema, like customer/product, or the classic OLTP example of employee and manager.

Tip number four - Keep it simple. If you know the answer, explain it, and try not to mention a technical term you are not familiar with. If you are confident about the answer, go ahead and say it with full confidence. Once you have said it, stick with it. If you change your answer, it means you are not sure about it.

Tip number five - Don't get confused. The interviewer might ask questions that sound incorrect; just because the interviewer is asking, you should not nod in agreement. Think properly and then answer.

Tip number six - Know the stages. There are multiple ways in which one job can be implemented. One should evaluate the best stage to use in the job from the following points of view:

1. Objective of the job
2. Memory consumption
3. Source Data and its amount
4. Frequency of the job
5. Job criticality

To be continued...

2 comments:

  1. 3 dimension tables and 1 fact table so how many scd stages required for update record ?

    please help me in this
    send me answer madosh786@gmail.com
  2. Thanks for sharing the good updates on DataStage over here. Keep updating more updates on interview questions here.