<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Informatica Performance Tuning</title>
	<atom:link href="http://kirtandesai.com/write/index.php/2006/11/12/informatica-performance-tuning/feed/" rel="self" type="application/rss+xml" />
	<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/</link>
	<description></description>
	<lastBuildDate>Tue, 23 Aug 2011 14:40:20 -0700</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.2</generator>
	<item>
		<title>By: Kirtan Desai</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-14</link>
		<dc:creator>Kirtan Desai</dc:creator>
		<pubDate>Tue, 27 Mar 2007 17:44:41 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-14</guid>
		<description>Dan,
At this point, it&#039;s impossible for me to gather that many people. I may look into travelling down to one of your classes.

Thanks for the information, though.
Kirtan</description>
		<content:encoded><![CDATA[<p>Dan,<br />
At this point, it&#8217;s impossible for me to gather that many people. I may look into travelling down to one of your classes.</p>
<p>Thanks for the information, though.<br />
Kirtan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Linstedt</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-13</link>
		<dc:creator>Dan Linstedt</dc:creator>
		<pubDate>Mon, 26 Mar 2007 01:08:32 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-13</guid>
		<description>I have revamped and reworked my performance and tuning class, and it is now a full 2-day fast track, 100% lecture.  It will consist of tuning your top 3 &quot;problematic&quot; mappings in class, worked into the lecture.  It is a jam-packed class full of all the tips and techniques you need to make Informatica hum.

The techniques I teach can take mappings from 24 hours down to 4 hours, and from 48 hours down to 8 hours, and from 2 hours down to 35 to 45 minutes.

I&#039;ve added v8 content, but all of the content applies back to v6 and v7 as well.

If you can get 10 people into a class, we can bring the class to you (within the USA).  If you can get 20 people into the class, we can bring the class to you outside the USA.

If you are interested in holding such a class, please let me know.  I&#039;ve been teaching performance and tuning (systems-wide, including Informatica) for 15 years.

The class itinerary is available at: http://www.GeneseeAcademy.com

Thanks,
Dan Linstedt
DanL@RapidACE.com
http://www.RapidACE.com</description>
		<content:encoded><![CDATA[<p>I have revamped and reworked my performance and tuning class, and it is now a full 2-day fast track, 100% lecture.  It will consist of tuning your top 3 &#8220;problematic&#8221; mappings in class, worked into the lecture.  It is a jam-packed class full of all the tips and techniques you need to make Informatica hum.</p>
<p>The techniques I teach can take mappings from 24 hours down to 4 hours, and from 48 hours down to 8 hours, and from 2 hours down to 35 to 45 minutes.</p>
<p>I&#8217;ve added v8 content, but all of the content applies back to v6 and v7 as well.</p>
<p>If you can get 10 people into a class, we can bring the class to you (within the USA).  If you can get 20 people into the class, we can bring the class to you outside the USA.</p>
<p>If you are interested in holding such a class, please let me know.  I&#8217;ve been teaching performance and tuning (systems-wide, including Informatica) for 15 years.</p>
<p>The class itinerary is available at: <a href="http://www.GeneseeAcademy.com" rel="nofollow">http://www.GeneseeAcademy.com</a></p>
<p>Thanks,<br />
Dan Linstedt<br />
<a href="mailto:DanL@RapidACE.com">DanL@RapidACE.com</a><br />
<a href="http://www.RapidACE.com" rel="nofollow">http://www.RapidACE.com</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kirtan Desai</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-12</link>
		<dc:creator>Kirtan Desai</dc:creator>
		<pubDate>Tue, 27 Feb 2007 18:45:14 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-12</guid>
		<description>David,
In cases where you have a lot of data in both the source and lookup tables, it is wiser to break the process into pieces and not try to do it all in one process.

Overall, I do agree with your comment, though. If an outer join can be performed while keeping the cost low, there is nothing better than that.

Kirtan</description>
		<content:encoded><![CDATA[<p>David,<br />
In cases where you have a lot of data in both the source and lookup tables, it is wiser to break the process into pieces and not try to do it all in one process.</p>
<p>Overall, I do agree with your comment, though. If an outer join can be performed while keeping the cost low, there is nothing better than that.</p>
<p>Kirtan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: david</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-11</link>
		<dc:creator>david</dc:creator>
		<pubDate>Tue, 27 Feb 2007 17:39:58 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-11</guid>
		<description>Kaushik,

We have a very similar issue to yours; here is our approach:

If you can, remove the lookup from your mapping and include it in your Source Qualifier SQL override instead
(in our case, we outer join the lookup table with the source table). Our mapping run time dropped from 15 mins to 35 secs, provided you have added proper indexes.
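
To illustrate, here is a minimal sketch of the idea using Python's built-in sqlite3 module, not our actual mapping. The table and column names (SRC_DATA, LKP_TABLE, DETAIL_ID, DESCR) and the index name are made up for the example; in Informatica, a SELECT like this would go into the Source Qualifier SQL override.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SRC_DATA (A TEXT, DETAIL_ID INTEGER);
    CREATE TABLE LKP_TABLE (DETAIL_ID INTEGER, DESCR TEXT);
    -- An index on the join key is what keeps the outer join cheap.
    CREATE INDEX IDX_LKP_DETAIL_ID ON LKP_TABLE (DETAIL_ID);
    INSERT INTO SRC_DATA VALUES ('row1', 1), ('row2', 2), ('row3', 99);
    INSERT INTO LKP_TABLE VALUES (1, 'one'), (2, 'two');
""")

# Every source row is kept; a source row with no match in the lookup
# table gets NULL for the looked-up column.
override_sql = """
    SELECT s.A, s.DETAIL_ID, l.DESCR
    FROM SRC_DATA s
    LEFT OUTER JOIN LKP_TABLE l ON l.DETAIL_ID = s.DETAIL_ID
    ORDER BY s.A
"""
rows = conn.execute(override_sql).fetchall()
for row in rows:
    print(row)
```

The third source row has no match, so its DESCR comes back as None (NULL), and the mapping has to handle that case itself.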

Hope this helps you as well.</description>
		<content:encoded><![CDATA[<p>Kaushik,</p>
<p>We have a very similar issue to yours; here is our approach:</p>
<p>If you can, remove the lookup from your mapping and include it in your Source Qualifier SQL override instead<br />
(in our case, we outer join the lookup table with the source table). Our mapping run time dropped from 15 mins to 35 secs, provided you have added proper indexes.</p>
<p>Hope this helps you as well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kirtan Desai</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-10</link>
		<dc:creator>Kirtan Desai</dc:creator>
		<pubDate>Tue, 27 Feb 2007 04:34:22 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-10</guid>
		<description>Kaushik,

What do you expect from this query? Do you have a time limit that you want it to finish by? Or do you just think it&#039;s slow because it takes 35 minutes?

***********Anyways 

Most of the time is taken by the sorting. The problem here is that you are doing a lookup on a big table. Doing a lookup against such a table is a big no-no in my opinion. Is TABLE_DETAIL in a different database than the source tables? I have a suggestion for you. See if it works for you.

Let&#039;s say you have a source table called SRC_DATA with fields like

A
B
C
DETAIL_ID
NUMBER
YEAR

And then there is your lookup table with fields
DETAIL_ID
NUMBER
YEAR

NOW before running the main process,
GET ALL THE DATA from the lookup table where the combination of DETAIL_ID, NUMBER is in
(
get a distinct set of DETAIL_ID, NUMBER
)
And build a view with the results.

And in your main process, do the lookup on the view and not on the big table.
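
Here is a rough sketch of that idea, using Python's built-in sqlite3 module only so that it is runnable. The view name LKP_DETAIL_V and the sample rows are made up, and I use EXISTS in place of the IN (distinct set) wording above, which selects the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SRC_DATA (A TEXT, B TEXT, C TEXT,
                           DETAIL_ID INTEGER, NUMBER INTEGER, YEAR INTEGER);
    CREATE TABLE TABLE_DETAIL (DETAIL_ID INTEGER, NUMBER INTEGER, YEAR INTEGER);
    INSERT INTO TABLE_DETAIL VALUES
        (1, 10, 2005), (2, 20, 2006), (3, 30, 2006), (4, 40, 2007);
    INSERT INTO SRC_DATA VALUES
        ('a1', 'b1', 'c1', 1, 10, 2005),
        ('a2', 'b2', 'c2', 3, 30, 2006),
        ('a3', 'b3', 'c3', 3, 30, 2006);

    -- Keep only lookup rows whose (DETAIL_ID, NUMBER) occurs in the source.
    -- LKP_DETAIL_V is a made-up name for the view.
    CREATE VIEW LKP_DETAIL_V AS
    SELECT d.DETAIL_ID, d.NUMBER, d.YEAR
    FROM TABLE_DETAIL d
    WHERE EXISTS (
        SELECT 1 FROM SRC_DATA s
        WHERE s.DETAIL_ID = d.DETAIL_ID AND s.NUMBER = d.NUMBER
    );
""")

# The main process would point its lookup at the view, not at TABLE_DETAIL.
lookup_rows = conn.execute(
    "SELECT DETAIL_ID, NUMBER, YEAR FROM LKP_DETAIL_V ORDER BY DETAIL_ID"
).fetchall()
print(lookup_rows)
```

Only the lookup rows that the source can actually hit remain. On a real database a plain view is re-evaluated on every read, so a materialized view or a work table may be the better fit; that part is an assumption to test on your side.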

#####
HOWEVER-
If you HAVE TO do the lookup, then I would suggest creating an index on all the columns that are in the ORDER BY list. Since you do not have anything in the WHERE clause, this is straightforward. Your idea of either creating a better index or altering the ORDER BY clause would work just fine.
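
A small sketch of how to check this, again with Python's built-in sqlite3 module as a stand-in for your database. The index name IDX_DETAIL_SORT is made up, and EXPLAIN QUERY PLAN is SQLite syntax; your database has its own way of showing the plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TABLE_DETAIL (DETAIL_ID INTEGER, NUMBER INTEGER, YEAR INTEGER)"
)

query = ("SELECT DETAIL_ID, NUMBER, YEAR FROM TABLE_DETAIL "
         "ORDER BY NUMBER, YEAR, DETAIL_ID")


def plan(sql: str) -> str:
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # position 3 holds the description


plan_before = plan(query)

# Index columns are listed in the same order as the ORDER BY list.
# IDX_DETAIL_SORT is a made-up name.
conn.execute(
    "CREATE INDEX IDX_DETAIL_SORT ON TABLE_DETAIL (NUMBER, YEAR, DETAIL_ID)"
)
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

On SQLite, the plan before the index includes a separate sort step for the ORDER BY, and the plan after reads the rows straight from the index in sorted order. Whether your database makes the same choice on 47 million rows is something to confirm with its own plan output.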

ALSO,
Look into tuning the lookup cache (data and index). That will help you a LOT with your performance issue.
You should be able to find it in the Transformation Guide.

These are a few thoughts that I gathered while reading your email. If you need a detailed explanation and/or suggestion, please give a test case (create table scripts with 100 rows) and a good description of your process (make it technical rather than functional).

Hope this helps.

-Kirtan</description>
		<content:encoded><![CDATA[<p>Kaushik,</p>
<p>What do you expect from this query? Do you have a time limit that you want it to finish by? Or do you just think it&#8217;s slow because it takes 35 minutes?</p>
<p>***********Anyways </p>
<p>Most of the time is taken by the sorting. The problem here is that you are doing a lookup on a big table. Doing a lookup against such a table is a big no-no in my opinion. Is TABLE_DETAIL in a different database than the source tables? I have a suggestion for you. See if it works for you.</p>
<p>Let&#8217;s say you have a source table called SRC_DATA with fields like</p>
<p>A<br />
B<br />
C<br />
DETAIL_ID<br />
NUMBER<br />
YEAR</p>
<p>And then there is your lookup table with fields<br />
DETAIL_ID<br />
NUMBER<br />
YEAR</p>
<p>NOW before running the main process,<br />
GET ALL THE DATA from the lookup table where the combination of DETAIL_ID, NUMBER is in<br />
(<br />
get a distinct set of DETAIL_ID, NUMBER<br />
)<br />
And build a view with the results.</p>
<p>And in your main process, do the lookup on the view and not on the big table.</p>
<p>#####<br />
HOWEVER-<br />
If you HAVE TO do the lookup, then I would suggest creating an index on all the columns that are in the ORDER BY list. Since you do not have anything in the WHERE clause, this is straightforward. Your idea of either creating a better index or altering the ORDER BY clause would work just fine.</p>
<p>ALSO,<br />
Look into tuning the lookup cache (data and index). That will help you a LOT with your performance issue.<br />
You should be able to find it in the Transformation Guide.</p>
<p>These are a few thoughts that I gathered while reading your email. If you need a detailed explanation and/or suggestion, please give a test case (create table scripts with 100 rows) and a good description of your process (make it technical rather than functional).</p>
<p>Hope this helps.</p>
<p>-Kirtan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KOUSHIK</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-9</link>
		<dc:creator>KOUSHIK</dc:creator>
		<pubDate>Mon, 26 Feb 2007 12:38:26 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-9</guid>
		<description>Can you please advise on this?

One of our Informatica batches is taking a long time (35 min) to execute a sorting query during a lookup on a table. The table contains around 47662364 records, and the query below takes 19-20 min to execute, which is why it is causing a performance problem. Can we tune this query or table so that it takes less time? There are other similar cases too where the query takes a long time to execute. The query:
SELECT DETAIL_ID,NUMBER,YEAR FROM TABLE_DETAIL ORDER BY NUMBER,YEAR,DETAIL_ID
Is there any way to tune the table/query? As far as I know, we can tune this in 2 ways:
1. Creating and using a unique index on these 3 fields (DETAIL_ID,NUMBER,YEAR) on the table.
There is already an index on DETAIL_ID,NUMBER.
2. Using a tuning method provided by Informatica:
	a. Change the default query to something like
	SELECT DETAIL_ID,NUMBER,YEAR FROM TABLE_DETAIL ORDER BY NUMBER,YEAR
	Here we have 2 fields to sort on instead of 3 (as in the default query).
Will this improve performance? I am not sure that either of the above can improve performance, because the table is so large and we are sorting all records.

What do you think will be effective? Database tuning or session tuning?</description>
		<content:encoded><![CDATA[<p>Can you please advise on this?</p>
<p>One of our Informatica batches is taking a long time (35 min) to execute a sorting query during a lookup on a table. The table contains around 47662364 records, and the query below takes 19-20 min to execute, which is why it is causing a performance problem. Can we tune this query or table so that it takes less time? There are other similar cases too where the query takes a long time to execute. The query:<br />
SELECT DETAIL_ID,NUMBER,YEAR FROM TABLE_DETAIL ORDER BY NUMBER,YEAR,DETAIL_ID<br />
Is there any way to tune the table/query? As far as I know, we can tune this in 2 ways:<br />
1. Creating and using a unique index on these 3 fields (DETAIL_ID,NUMBER,YEAR) on the table.<br />
There is already an index on DETAIL_ID,NUMBER.<br />
2. Using a tuning method provided by Informatica:<br />
	a. Change the default query to something like<br />
	SELECT DETAIL_ID,NUMBER,YEAR FROM TABLE_DETAIL ORDER BY NUMBER,YEAR<br />
	Here we have 2 fields to sort on instead of 3 (as in the default query).<br />
Will this improve performance? I am not sure that either of the above can improve performance, because the table is so large and we are sorting all records.</p>
<p>What do you think will be effective? Database tuning or session tuning?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Administrator</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-8</link>
		<dc:creator>Administrator</dc:creator>
		<pubDate>Mon, 18 Dec 2006 13:36:31 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-8</guid>
		<description>subbu,

You can have as many sources as you want. If there are different data sources, I would make sure of the following:
[
Before I start with that, let me say this.
Pulling data from multiple sources always rings an alarm bell in my head.
Make sure that you pull tables from multiple data sources only if you want to join them. It&#039;s OK to have multiple pipelines. Make sure you look at all the requirements and performance impacts before going forward with your approach.
]

1. All the data sources that you are pulling data from are related, that is, they have a logical functional (business) relationship.

2. There are keys on tables (which are from different data sources) that can be used to join these tables.

3. To assign connection values in session properties, open the session -&gt; go to Properties -&gt; on the left-hand side, select the source/target you want to define a connection for -&gt; on the right-hand side, assign the connection value.

I am going to post a tutorial on Informatica Basics soon. Keep checking the web site for that.</description>
		<content:encoded><![CDATA[<p>subbu,</p>
<p>You can have as many sources as you want. If there are different data sources, I would make sure of the following:<br />
[<br />
Before I start with that, let me say this.<br />
Pulling data from multiple sources always rings an alarm bell in my head.<br />
Make sure that you pull tables from multiple data sources only if you want to join them. It's OK to have multiple pipelines. Make sure you look at all the requirements and performance impacts before going forward with your approach.<br />
]</p>
<p>1. All the data sources that you are pulling data from are related, that is, they have a logical functional (business) relationship.</p>
<p>2. There are keys on tables (which are from different data sources) that can be used to join these tables.</p>
<p>3. To assign connection values in session properties, open the session -> go to Properties -> on the left-hand side, select the source/target you want to define a connection for -> on the right-hand side, assign the connection value.</p>
<p>I am going to post a tutorial on Informatica Basics soon. Keep checking the web site for that.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: subbu</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-7</link>
		<dc:creator>subbu</dc:creator>
		<pubDate>Mon, 18 Dec 2006 13:01:24 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-7</guid>
		<description>I have a small doubt: when we are working with Oracle sources that come from different databases, how many Source Qualifiers can we use, and where do we give the database connections at the session level? Please reply to my email: subramanyam.ambati@gmail.com</description>
		<content:encoded><![CDATA[<p>I have a small doubt: when we are working with Oracle sources that come from different databases, how many Source Qualifiers can we use, and where do we give the database connections at the session level? Please reply to my email: <a href="mailto:subramanyam.ambati@gmail.com">subramanyam.ambati@gmail.com</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Linstedt</title>
		<link>http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/comment-page-1/#comment-6</link>
		<dc:creator>Dan Linstedt</dc:creator>
		<pubDate>Sun, 10 Dec 2006 12:31:40 +0000</pubDate>
		<guid isPermaLink="false">http://kirtandesai.com/write/2006/11/12/informatica-performance-tuning/#comment-6</guid>
		<description>Hey Kirtan,

This is a good basic start for P&amp;T, with interesting statements; there are a few hints I&#039;ll add for you.  I&#039;m currently writing a new, very extensive P&amp;T document that will be available for sale (hopefully Q2 next year); quite possibly it will turn into a small handbook.

1. If you use a joiner, always use a sorted joiner - the cost of sorting data coming from the RDBMS staging area is frequently less than the cost of building the caches in place.  Furthermore, if you are NOT on 64-bit Informatica, or you don&#039;t have unlimited (or seemingly unlimited) RAM, you have an upper limit to the caching mechanisms in all the cached objects - including Joiner.
2. If the RDBMS is NOT tuned properly, putting more work into the RDBMS will actually slow things down.
3. Too many Instances of an RDBMS on the SAME MACHINE will actually kill performance rather than help it, and it really is not necessary in order to handle fail-over as many set it up to do.
4. Replacing a Lookup with a SORTED JOINER can improve performance dramatically.
5. The manuals have the formula wrong for the Data / Index Cache settings. Even though it&#039;s counter-intuitive, you want 100% of the INDEX cached if you can get it, giving up Data Cache for disk.  Why? Because if you can&#039;t access RAM to check for the existence of data, you are actually increasing I/O - and when you increase I/O, you slow performance dramatically.  Set your Index caches to twice the size of your data caches to be safe.

Hope these tips help.  Watch my site next year for new announcements on performance and tuning.
Dan L</description>
		<content:encoded><![CDATA[<p>Hey Kirtan,</p>
<p>This is a good basic start for P&amp;T, with interesting statements; there are a few hints I&#8217;ll add for you.  I&#8217;m currently writing a new, very extensive P&amp;T document that will be available for sale (hopefully Q2 next year); quite possibly it will turn into a small handbook.</p>
<p>1. If you use a joiner, always use a sorted joiner &#8211; the cost of sorting data coming from the RDBMS staging area is frequently less than the cost of building the caches in place.  Furthermore, if you are NOT on 64-bit Informatica, or you don&#8217;t have unlimited (or seemingly unlimited) RAM, you have an upper limit to the caching mechanisms in all the cached objects &#8211; including Joiner.<br />
2. If the RDBMS is NOT tuned properly, putting more work into the RDBMS will actually slow things down.<br />
3. Too many Instances of an RDBMS on the SAME MACHINE will actually kill performance rather than help it, and it really is not necessary in order to handle fail-over as many set it up to do.<br />
4. Replacing a Lookup with a SORTED JOINER can improve performance dramatically.<br />
5. The manuals have the formula wrong for the Data / Index Cache settings. Even though it&#8217;s counter-intuitive, you want 100% of the INDEX cached if you can get it, giving up Data Cache for disk.  Why? Because if you can&#8217;t access RAM to check for the existence of data, you are actually increasing I/O &#8211; and when you increase I/O, you slow performance dramatically.  Set your Index caches to twice the size of your data caches to be safe.</p>
<p>Hope these tips help.  Watch my site next year for new announcements on performance and tuning.<br />
Dan L</p>
]]></content:encoded>
	</item>
</channel>
</rss>

