Tuesday, July 29, 2014

02: Overview of Datastage Designer Client

After creating your project, it is time to get to know the Designer Client so you can start developing ETL jobs.


The Designer screen is divided into several sections. At the top, you see the menu bar (1) and the toolbar (2); I will explain each tool when we go deeper into the development process.

(3) Repository: DS suggests a standard repository structure to organize repository objects. You can add new folders and organize them together with the existing ones.

(4) Palette: The palette holds the building elements (called stages in parallel and server jobs, and activities in sequence jobs) used to develop ETL jobs. Stages are organized into different sections based on their functionality.

(5) Log: To make your life easier during development, you can show the log view in the Designer Client. If it does not appear on your screen, you can activate it from the View menu.

(6) Canvas: The place where we design our jobs. You can drag and drop any stage from the palette onto the canvas and link the stages together to build an end-to-end job.

The first DS Job :)


Basically, an ETL job requires three stages:
1- DB connector/enterprise/file stage to connect to the data source and extract data
2- Transformer stage to make all necessary transformations and map to output columns
3- DB connector/enterprise/file stage to connect to the target and load data

Depending on your needs, you can use further processing stages like lookup, copy, filter, etc.
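
If you think better in code than in diagrams, the same extract, transform, load flow can be sketched in a few lines of Python. This is only an analogy and not how Datastage runs a job: the function names, the sample data and the column names are all made up for illustration, and the in-memory list stands in for a real target table.

```python
import csv
import io

# Hypothetical source data standing in for a file or database table.
SOURCE_CSV = "id,name,amount\n1,alice,10.5\n2,bob,0\n3,carol,7.25\n"


def extract(text):
    """Source stage: read rows from the data source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformer stage: convert types and map to output columns."""
    return [
        {
            "customer_id": int(row["id"]),
            "customer_name": row["name"].upper(),
            "amount": float(row["amount"]),
        }
        for row in rows
    ]


def keep_positive(rows):
    """Optional processing step, similar in spirit to a filter stage."""
    return [row for row in rows if row["amount"] > 0]


def load(rows, target):
    """Target stage: write rows to the target (here, an in-memory list)."""
    target.extend(rows)
    return len(rows)


target_table = []
loaded = load(keep_positive(transform(extract(SOURCE_CSV))), target_table)
print(loaded, "rows loaded")
```

Each function plays the role of one stage on the canvas, and the nested call plays the role of the links between them.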


Friday, July 25, 2014

01: How to start with Datastage?

Let's start with the clients of Datastage and describe the main activities that you will be doing with each.


Administrator Client: 

With the Administrator Client, you mainly add, delete and move Datastage projects, control user permissions and environment variables, and manage some other administrative tasks. You can create a new project from scratch, or copy an existing one and continue with a phased approach.

Designer Client:

The client of development! You will be designing and building parallel, server or sequence jobs to fulfill your data integration initiative within the Designer. The Designer Client is a graphical user interface that includes many building blocks (called "stage" in parallel and server jobs and "activity" in sequence jobs) to help you incorporate functional capabilities into your job design.

Director Client: 

You have developed your jobs and are ready to run them! Within the Director Client you can validate, run, reset, schedule and monitor your jobs.

We will go into detail about each client in the following related posts, but let's start with creating our project.

1-) How to create a new Datastage Project?

Log in to the Administrator Client and go to the "Projects" tab.




You will see the list of projects in the left pane.
With the options on the right, you can add or delete projects and control project-level properties.






When you click the "Add" button, you can give your project a name, specify its path, and optionally copy roles from an existing project. When you check the related box, the drop-down list is activated so you can choose the project from which you want to copy roles.


2-) Do I need to create a new project for every purpose?

Whether to create a new project or build your jobs within an existing one is a design decision. However, to keep boundaries between different environments, control user privileges and ensure easy maintenance, it is better to have different projects for different subjects, especially in prod environments. For example, you can have one DS project for your data warehouse and another one for your accounting system. On the other hand, within the same repository, you can benefit from generic batch jobs that are reused for different purposes by means of parameters. So before deciding whether to create a new project or continue with the existing repository, you need to evaluate the similarities and the required boundaries between projects.
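
To make the idea of a generic, parameter-driven job more concrete, here is a small Python analogy. It is not Datastage code: `run_generic_load` and the parameter names are hypothetical, and the function only builds a description of what it would do. The point is that one job definition serves several purposes because everything specific comes in as parameters.

```python
def run_generic_load(params):
    """One job definition whose behaviour is driven entirely by parameters."""
    return "load {source} into {target} for {run_date}".format(**params)


# Hypothetical parameter sets for two different uses of the same job.
customer_params = {
    "source": "customers.csv",
    "target": "DWH.CUSTOMERS",
    "run_date": "2014-07-25",
}
order_params = {
    "source": "orders.csv",
    "target": "DWH.ORDERS",
    "run_date": "2014-07-25",
}

for params in (customer_params, order_params):
    print(run_generic_load(params))
```

If two subjects can share jobs like this, they are good candidates for living in the same project; if they cannot, that is one more sign they need separate ones.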

3-) How to copy/move a project in Datastage?

You might want to copy/move your projects for many reasons, such as moving your repository to a new path, going from development to prod, or keeping separate incremental repositories for phase-based development projects.

To move/copy a project, you need to create a new project in the desired path by copying the roles from the existing project that you want to move/copy. Then, you need to export the required DS components (jobs, table definitions, parameter sets, routines, etc.) from the existing project and import them all into your new project.

To export DS components from your existing project, open the Designer Client and click Export > DataStage Components in the menu bar. In the screen that opens, specify the path where you want to save the export file. Then click 'Add' to select the components that you want to export from the repository. When you have selected all the components to export, click the 'Export' button.

Then open your new project in the Designer Client to import all components. Select Import > DataStage Components, select the export file path in the screen that opens, and then click 'Import'. Now your new project has all the components (jobs, table definitions, parameters, etc.) that you exported.

Also keep in mind that 'Project Properties' are not copied automatically. So if you want the same Environment Variables in your new project, you need to export them from the Administrator Client and then import them into your new project.

4-) How to delete a Datastage Project?

In the Administrator Client, on the Projects tab, select the project that you want to delete and click the Delete button on the right. You are done :)


