3.22 Optimizing the Source and Other Transformations

Optimizing the Source

  • Optimizing the Query: single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from indexes on the ordering or grouping columns
  • Using Conditional Filters: if multiple sessions read from the same source simultaneously, a PowerCenter conditional filter may improve performance
  • Increasing Database Network Packet Size in Oracle: raise the session data unit (SDU) in listener.ora and tnsnames.ora (see the sketch after this list)
  • Connecting to Oracle Database Sources: use the IPC protocol when the Integration Service runs on the same machine as the Oracle instance
  • Using Teradata FastExport: use the FastExport reader instead of the Relational reader
  • Using tempdb to Join Sybase or Microsoft SQL Server Tables: configure tempdb as an in-memory database and allocate it sufficient memory
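
For the two Oracle items above, the settings live in Oracle Net configuration files. A minimal tnsnames.ora sketch, assuming placeholder alias, host, and service names; the SDU value must also be set on the server side in listener.ora:

    # Larger session data unit (SDU) for bulk reads over TCP/IP; set the
    # same SDU in the matching listener.ora entry on the database server.
    ORCL_BIGSDU =
      (DESCRIPTION =
        (SDU = 32767)
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = orcl))
      )

    # IPC connection for an Integration Service that runs on the same
    # machine as the Oracle instance; bypasses the network stack.
    ORCL_IPC =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = IPC)(KEY = orcl))
        (CONNECT_DATA = (SERVICE_NAME = orcl))
      )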

1.1.1 Optimizing Mappings Overview

  • Reduce the number of transformations in the mapping
  • Delete unnecessary links between transformations
  • Configure the mapping so the fewest transformations and expressions do the most work possible

1.1.2 Optimizing Datatype Conversions

  • Use integer values in place of other datatypes when performing comparisons using Lookup and Filter transformations
  • Convert source dates to strings through port-to-port conversion (connect the date/time port directly to a string port) to increase session performance

1.1.3 Optimizing Expressions

  • When possible, isolate slow expressions and simplify them
  • Prefer numeric fields over strings; the Integration Service processes numeric operations faster than string operations
  • Use DECODE instead of a lookup when searching a small, static set of values
  • Use operators instead of functions where an equivalent operator exists (see the expression sketch after this list)
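
A short sketch of the last two points in the PowerCenter expression language; the port names (FIRST_NAME, LAST_NAME, SALES) are placeholders:

    -- Operator instead of function: || avoids nested CONCAT calls
    -- Slower:
    CONCAT(CONCAT(FIRST_NAME, ' '), LAST_NAME)
    -- Faster:
    FIRST_NAME || ' ' || LAST_NAME

    -- DECODE instead of a lookup (or nested IIFs) for a small, static
    -- set of values: the values live in the expression, so no separate
    -- table lookup is needed
    DECODE(TRUE,
           SALES > 100000, 'HIGH',
           SALES > 50000,  'MEDIUM',
           'LOW')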

1.1.4 Optimizing Aggregator Transformations

  • Group by simple columns
  • Use sorted input
  • Use incremental aggregation
  • Filter data before you aggregate it (see the SQL sketch after this list)
  • Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator stores in its data cache
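
Filtering and sorted input can both come from a Source Qualifier SQL override; a sketch with a hypothetical sales table, where the ORDER BY matches the Aggregator group-by ports so its Sorted Input option can be enabled:

    -- Drop unneeded rows before they reach the Aggregator, and pre-sort
    -- on the group-by ports (store_id, product_id)
    SELECT store_id, product_id, amount
    FROM   sales
    WHERE  amount > 0
    ORDER  BY store_id, product_id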

1.1.5 Optimizing Custom Transformations

  • Decrease the number of function calls the Integration Service and procedure make
  • Increase the locality of memory access for the data
  • Write procedure code to run an algorithm on a block of data instead of one row at a time (see the sketch after this list)
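
The block-based point, sketched in plain C. This shows only the pattern, not the actual Informatica Custom Transformation API; the type and function names are hypothetical:

    #include <stddef.h>

    typedef struct { long key; double amount; } Row;

    /* Row-based: the engine would call this once per row, paying the
       function-call overhead on every single row. */
    void process_row(Row *r) {
        r->amount *= 1.1;               /* example computation */
    }

    /* Block-based: one call handles a whole buffer of rows, amortizing
       the call overhead and keeping memory access sequential (local). */
    void process_block(Row *rows, size_t n) {
        for (size_t i = 0; i < n; i++)
            rows[i].amount *= 1.1;
    }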

1.1.6 Optimizing Joiner Transformations

  • For a sorted Joiner transformation, designate as master the source with fewer duplicate key values
  • For an unsorted Joiner transformation, designate as master the source with fewer rows
  • Perform joins in a database when possible (see the SQL sketch after this list)
  • Join sorted data when possible
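
When both inputs live in the same database, a Source Qualifier override (hypothetical table and column names) can replace the Joiner entirely:

    -- Join in the database rather than in a Joiner transformation; the
    -- database can use its own indexes and statistics
    SELECT o.order_id, o.amount, c.customer_name
    FROM   orders o
    JOIN   customers c ON c.customer_id = o.customer_id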

1.1.7 Optimizing Lookup Transformations

  • Use the optimal database driver
  • Cache lookup tables
  • Optimize the lookup condition
  • Filter lookup rows so the cache holds only the rows the session needs (see the SQL sketch after this list)
  • Index the lookup table on the columns used in the lookup condition
  • Optimize multiple lookups
  • Create a pipeline Lookup transformation and configure partitions in the pipeline that builds the lookup source
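
Two of the items above, sketched in SQL with hypothetical names: a lookup SQL override that filters rows so the cache holds only what the session needs, and an index on the lookup condition column:

    -- Lookup SQL override: cache only the rows the session can match
    SELECT customer_id, customer_name
    FROM   customers
    WHERE  status = 'ACTIVE'

    -- Index the column used in the lookup condition
    CREATE INDEX idx_customers_id ON customers (customer_id);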

1.1.8 Optimizing Normalizer Transformations

  • Place the Normalizer Transformation as close to the target as possible

1.1.9 Optimizing Sequence Generator Transformations

  • Create a reusable Sequence Generator and use it in multiple mappings simultaneously
  • Consider configuring the Number of Cached Values to a value greater than 1,000
  • If you do not have to cache values, set the Number of Cached Values to 0

1.1.10 Optimizing Sorter Transformations

  • Allocate enough memory to sort the data
  • Specify a different work directory for each partition in the Sorter transformation
  • Run the PowerCenter Integration Service in ASCII mode

1.1.11 Optimizing Source Qualifier Transformations

  • Use the Select Distinct option for the Source Qualifier transformation if you want the Integration Service to select unique values from a source

1.1.12 Optimizing SQL Transformations

  • Use SQL transformations in Query mode rather than Script mode where possible

1.1.13 Eliminating Transformation Errors

  • Set the error threshold (Stop On Errors) to an appropriate value
  • If a session generates many errors, lower the tracing level (for example, Terse)
  • Review the session log to locate transformation errors

1.1.14 Optimizing Sessions

  • Use a Grid to improve session / workflow performance
  • Use Pushdown Optimization
  • Run sessions and workflows concurrently (see the pmcmd sketch after this list)
  • Buffer Memory: Adjust DTM Buffer Size & Default Buffer Block Size
  • Caches: Analyze Transformation Caches & configure them accordingly
  • Target-Based Commit: Choose appropriately
  • Real-time Processing: See if Source Based Commit can be used
  • Remove staging areas where possible
  • Reduce the amount of data written to session logs, or keep logs in the default binary format instead of writing text log files
  • Choose an appropriate error tracing level
  • Post-Session Emails: if a post-session email attaches the session log, configure the session to write the log as a flat file
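
A sketch of launching workflows concurrently with pmcmd; the service, domain, user, folder, and workflow names are placeholders:

    # -nowait returns immediately, so the two workflows run in parallel
    pmcmd startworkflow -sv Int_Svc -d Domain_A -u admin -p '***' -f SalesFolder -nowait wf_load_orders
    pmcmd startworkflow -sv Int_Svc -d Domain_A -u admin -p '***' -f SalesFolder -nowait wf_load_customers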

1.1.15 Optimizing Grid Deployment

  • Add nodes to the grid
  • Increase storage capacity and bandwidth
  • Use shared file systems
  • Use a high-throughput network

1.1.16 Optimizing the PowerCenter Components

Optimizing PowerCenter Repository Performance:

  • Ensure the PowerCenter repository is on the same machine as the Repository Service process
  • Order conditions in object queries
  • Use a single-node tablespace for the PowerCenter repository if you install it on an IBM DB2 database (see the sketch after this list)
  • Optimize the database schema for the PowerCenter repository if you install it on a DB2 or Microsoft SQL Server database
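
A sketch of the single-node tablespace idea in DB2 SQL; the names are placeholders and the exact options vary by DB2 version:

    -- Create a single-partition database partition group, then put the
    -- repository tablespace in it
    CREATE DATABASE PARTITION GROUP repo_group ON DBPARTITIONNUM (0);
    CREATE TABLESPACE repo_ts IN DATABASE PARTITION GROUP repo_group;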

Optimizing Integration Service Performance:

  • Use native drivers instead of ODBC drivers for the Integration Service
  • Run the Integration Service in ASCII data movement mode if all character data is 7-bit ASCII or EBCDIC
  • Cache PowerCenter metadata for the Repository Service
  • Run Integration Service with high availability

1.1.17 Optimizing the Systems

  • Improve network speed
  • Use multiple CPUs
  • Reduce paging
  • Use processor binding
  • Use Local Disk if possible: A local disk can move data 5 to 20 times faster than a network
  • Minimize the number of network hops between the source and target databases and the Integration Service
  • Move the target database to a server system if possible
  • Work with the network engineers to ensure sufficient bandwidth

1.1.18 Mapping and Session Review Checklist

  • Do the target tables have constraints?
  • Does a parent table need to be loaded first? Check for PK-FK relationships
  • Follow naming conventions for mappings, workflows, sessions, and transformations
  • Verify that the SCD Type 1 / Type 2 design is followed
  • Check whether the DTM buffer size is set to Auto or set manually
  • Use DECODE instead of nested IIF functions (see the expression sketch in 1.1.3)
  • Use variables or parameters ($Source, $DBConnection_Src) for source and target connections (see the parameter file sketch after this list)
  • Use unconnected Lookups wherever the lookup returns only one value
  • Use dynamic Lookups for SCD implementations
  • Use shortcuts when objects are shared across multiple folders
  • Use reusable transformations
  • Memory utilization: source bottleneck (can be checked only by executing the mapping)
  • Use reusable sessions
  • Memory utilization: target bottleneck (can be checked only by executing the mapping)
  • Memory utilization: transformation bottleneck (can be checked only by executing the mapping)
  • Review the commit type and commit interval (in session properties; the default is target-based commit with an interval of 10,000)
  • Check for column precision set too high and for implicit datatype conversions
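
A sketch of the connection-variable item as it might appear in a session parameter file; the folder, workflow, session, and connection names are placeholders:

    [SalesFolder.WF:wf_load_orders.ST:s_m_load_orders]
    $DBConnection_Src=Ora_Sales_Src
    $DBConnection_Tgt=Ora_DW_Tgt

The built-in $Source and $Target variables can instead be assigned connection values directly in the session properties.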