ETL´s: Talend Open Studio vs Pentaho Data Integration (Kettle). Comparative.

Posted by Roberto Espinosa on 1 June 2010

(Read in Spanish here)

In this latest entry of the series on ETL processes, let's try to compare as completely as possible the tools Talend Open Studio and Pentaho Data Integration (Kettle), which we have been using in recent months. To make this study as comprehensive and rigorous as possible, we will divide the task into 5 sections:


Property table.
Examples of Use
Strengths / weaknesses table.
Resource links (comparisons and additional information).
Final opinion.

Property table.

Product	TALEND OPEN STUDIO ver. 4.0	PENTAHO DATA INTEGRATION CE (KETTLE) ver. 3.2
Manufacturer	Talend – France	Pentaho – United States
Web	www.talend.com	www.pentaho.com
License	GNU Lesser General Public License Version 2.1 (LGPLv2.1)	GNU Lesser General Public License Version 2.1 (LGPLv2.1)
Development Language	Java	Java
Release Year	2006	2000
GUI	Graphical tool based on Eclipse	Design Tool (Spoon) based on SWT
Runtime Environment	From the design tool, or from the command line using the generated Java or Perl code (independent of the tool).	From the design tool, or from the command line with the Pan and Kitchen utilities.
Features	Jobs are built in the design tool using the set of available components. It works with the concept of a project, which is a container for the different Jobs together with their metadata and contexts. Talend is a code generator, so Jobs are translated into the chosen language (Java or Perl, selected when the project is created), compiled and executed. Components are linked to each other with different types of connections. Some pass data (of type Row or Iterate, depending on how the data is moved). Others are trigger connections (Run If, If Component Ok, If Component Error) that let us define the execution sequence and control when it ends. Jobs are exported at operating system level and can run independently of the design tool on any platform that can execute the selected language. In addition, all generated code is visible and modifiable (although changes to the Jobs should be made in the tool).	With the design tool, Spoon, we build transformations (the lowest design level) using steps. At a higher level we have Jobs, which let you run transformations and other components and orchestrate processes. PDI is not a code generator but a transformation engine, in which the data and its transformations are kept separate. Transformations and Jobs are stored in XML format, which specifies the actions to take when processing the data. Transformations use steps, which are linked to each other by hops that determine the flow of data between the different components. Jobs have another set of steps, which can perform different actions (or run transformations). In this case the hops determine the execution order or conditional execution.
Components	Talend has a large number of components. The approach is to have a separate component for each action, and for access to databases or other systems there are different components depending on the database engine being accessed. For example, there is an input table component for each vendor (Oracle, MySQL, Informix, Ingres), and one SCD management component for each RDBMS. You can see the list of available components here.	A smaller set of components, but strongly oriented towards data integration. For similar actions (e.g. reading database tables) there is a single step (not one per vendor), and its behavior depends on the database defined in the connection. You can see the available elements for transformations here and for jobs here.
Platform	Windows, Unix and Linux.	Windows, Unix and Linux.
Repository	Works with the workspace concept, at file system level. All the components of a project are stored there (all Jobs, metadata definitions, custom code and contexts). The repository propagates the dependencies of changed objects (changes are extended to the whole project). If we change a table definition in the repository, for example, it is updated in all the Jobs where it is used.	Jobs and transformations are stored in XML format. We can choose to store them at file system level or in a database repository (for teamwork). Dependencies are not updated if you change a transformation that is called from another one. They are updated only at the level of components within a single transformation or job.
Metadata	Full metadata that includes database connections and their objects (tables, views, queries). Metadata is stored centrally in the workspace and it is not necessary to read it again from the source or target system, which streamlines the process. In addition, we can define metadata for file structures (delimited, positional, Excel, XML, etc.), which can then be reused in any component.	Metadata is limited to database connections, which can be shared by different transformations and jobs. Database information (catalog of tables / fields) and file specifications (structure) are stored in the steps and cannot be reused. This information is read at design time.
Contexts	Set of variables that are configured in the project and can be used later in the Jobs to control their behavior (for example, to define production and development environments).	Variables defined in the tool's parameters file (kettle.properties). Parameters and arguments can be passed to the process (similar to contexts), in both jobs and transformations.
Versions	Allows complete version management of objects (previous versions can be recovered).	Functionality planned for version 4.0.
Languages for defining custom components (scripting)	Talend lets us add custom code in Java and Groovy. Additionally, custom SQL and shell.	JavaScript is used for calculations and formulas. Additionally, custom SQL, Java, shell and OpenOffice formulas.
Additional tools	Talend offers additional tools for Data Profiling and Master Data Management (MDM). Open Studio also has a simple modeling tool for drawing logical processes and models.	PDI 4.0 offers Agile functionality for dimensional modeling and for publishing models in Pentaho BI.
Plugins	New components can be downloaded through Talend Exchange.	Additional plugins are available on the website.
Support	A complete online community with Talend’s wiki, Talend Forum and a bug tracker for managing incidents and bugs.	Includes the Pentaho forum, Issue Tracking and Pentaho Community.
Documentation	Complete documentation in PDF format that includes: installation guide, user manual and component documentation.	Online documentation on the web. Books: Pentaho 3.2 Data Integration: Beginner’s Guide (M.C. Roldan), Pentaho Kettle Solutions (M. Casters, R. Bouman, J. van Dongen).
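
The Features row draws the main architectural line between the two tools: Talend translates a Job into source code that is then compiled and run, while PDI keeps a stored specification and interprets it with an engine. The toy Python sketch below illustrates only that distinction. It is not how either product is implemented, and the spec format, the step names (`filter`, `upper`) and all function names are hypothetical.

```python
# Toy illustration only: the spec format and step names are hypothetical,
# not the real PDI XML format nor Talend's generated Java/Perl code.
SPEC = [
    {"step": "filter", "field": "amount", "min": 10},  # keep rows with amount > 10
    {"step": "upper", "field": "name"},                # uppercase the name field
]


def run_with_engine(spec, rows):
    """Engine style: interpret the stored specification at run time."""
    for step in spec:
        field = step["field"]
        if step["step"] == "filter":
            rows = [r for r in rows if r[field] > step["min"]]
        elif step["step"] == "upper":
            rows = [dict(r, **{field: r[field].upper()}) for r in rows]
        else:
            raise ValueError(f"unknown step: {step['step']}")
    return rows


def generate_code(spec):
    """Generator style: translate the specification into source code."""
    lines = ["def job(rows):"]
    for step in spec:
        field = step["field"]
        if step["step"] == "filter":
            lines.append(f"    rows = [r for r in rows if r[{field!r}] > {step['min']!r}]")
        elif step["step"] == "upper":
            lines.append(f"    rows = [dict(r, **{{{field!r}: r[{field!r}].upper()}}) for r in rows]")
        else:
            raise ValueError(f"unknown step: {step['step']}")
    lines.append("    return rows")
    return "\n".join(lines)


def run_generated(spec, rows):
    """Compile the generated source, then run it without consulting the spec again."""
    namespace = {}
    exec(generate_code(spec), namespace)
    return namespace["job"](rows)


ROWS = [{"name": "ana", "amount": 5}, {"name": "bob", "amount": 20}]
```

Both `run_with_engine(SPEC, ROWS)` and `run_generated(SPEC, ROWS)` produce the same rows. The difference is where the logic lives: in the engine case the specification is the deliverable, in the generator case the source returned by `generate_code` is, which is why a generated Job can be inspected, modified and run without the design tool.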

Examples of Use (in Spanish)

EXAMPLE	TALEND OPEN STUDIO	PENTAHO DATA INTEGRATION (KETTLE)
Loading the time dimension of a DW	ETL process to load the time dimension. Example of using Talend for ETL.	Time dimension ETL with PDI.
Implementation of dynamic SQL statements	More examples of Talend. Execution of SQL statements built at run time.	Passing parameters and dynamic operations in a PDI transformation.
Loading the product dimension of a DW	Product dimension ETL load. More examples of Talend. Using logs, metrics and statistics.	Product dimension ETL with PDI (I). Extraction to the staging area. Product dimension ETL with PDI (II). Loading to the DW.
Loading the customer dimension of a DW	Talend ETL mapping types. Customer dimension ETL.	Customer dimension ETL with PDI.
Treatment of slowly changing dimensions	Management of SCD (slowly changing dimensions).	Treatment of slowly changing dimensions (SCD) with PDI.
Connecting to SAP ERP	Connecting to SAP with Talend	Connecting to SAP with Kettle (ProERPConn plugin)
Loading sales fact tables in a DW	Sales fact table. Talend contexts.	Sales facts ETL load with PDI. Budget facts ETL load with PDI.
Exporting Jobs and scheduling processes	Exporting Talend Jobs. Scheduling ETL processes.
Treatment of public data	Data model and DW load process for London’s public data.
Understanding the user interface		Building ETL processes using Kettle (Pentaho Data Integration)
Graduation project (Pentaho / JasperETL comparison). Developed by Rodrigo Almeida, Mariano Heredia.	Thesis detailing a Business Intelligence project using the Pentaho and Jasper tools. For ETL, it compares Kettle with JasperETL (based on Talend).	Download the book from the web: http://sites.google.com/site/magm33332/bifloss. It describes the features of each tool in magnificent detail.

As further examples, you can also consult:

Talend Open Studio 4 tutorial by Victor Javier Madrid on adictosaltrabajo.com. It includes another example that processes an EDI file with Talend.
White paper on open source ETL tools, written in French by Atol. It includes a very complete series of practical examples.
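
Several of the examples listed above deal with slowly changing dimensions (SCD). As a rough orientation for readers who have not met the concept, here is a minimal Python sketch of a Type 2 update, where a changed attribute closes the current row and opens a new one. It is a generic illustration, not the implementation of the Talend or PDI SCD components; the column names (`sk`, `natural_key`, `attrs`, `valid_from`, `valid_to`) and the function `apply_scd2` are assumptions made for the example.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply a Type 2 change set to a dimension held as a list of dicts.

    dimension: rows with keys sk, natural_key, attrs, valid_from, valid_to
               (valid_to is None for the current version of a member).
    incoming:  mapping natural_key -> attrs as read from the source system.
    """
    next_sk = max((row["sk"] for row in dimension), default=0) + 1
    current = {row["natural_key"]: row for row in dimension if row["valid_to"] is None}
    for key, attrs in incoming.items():
        row = current.get(key)
        if row is not None and row["attrs"] == attrs:
            continue  # nothing changed, keep the current version
        if row is not None:
            row["valid_to"] = today  # close the old version
        dimension.append({
            "sk": next_sk,           # new surrogate key for the new version
            "natural_key": key,
            "attrs": attrs,
            "valid_from": today,
            "valid_to": None,
        })
        next_sk += 1
    return dimension


# Hypothetical sample data: customer C1 moves city, customer C2 is new.
customers = [{
    "sk": 1, "natural_key": "C1", "attrs": {"city": "Madrid"},
    "valid_from": date(2010, 1, 1), "valid_to": None,
}]
result = apply_scd2(
    customers,
    {"C1": {"city": "Barcelona"}, "C2": {"city": "Paris"}},
    date(2010, 6, 1),
)
```

After the call, the original row for `C1` is kept with its `valid_to` set, and two new current rows exist. This preserves history, which is the point of Type 2 as opposed to simply overwriting the attribute.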


Table of strengths / weaknesses.

From my point of view, based on my experience with the two tools and on the information I have collected about them and about other users' experiences, I can highlight the following aspects:

TALEND OPEN STUDIO	PENTAHO DATA INTEGRATION (KETTLE)
It is a code generator, which implies a heavy dependence on the language chosen for the project (Java in my case). By choosing Java, we get all the advantages and disadvantages of the language. You need a high level of proficiency in it to get the most out of the application.	It is a transformation engine, and it is clear from the outset that it was designed by people with great experience in data integration who needed to meet their own needs in that field. It is also easier to manage datatypes with PDI, since it is not as strict as Java.
An unintuitive tool that is difficult to understand at first, but once you overcome this initial difficulty you see the great potential and power of the application.	A very intuitive tool: with a few basic concepts you can get it working. Conceptually very simple and powerful.
Unified user interface across all components. Because it is based on Eclipse, familiarity with that tool makes the interface easier to use.	The interface design can be a bit poor and there is no unified interface for all components, which is sometimes confusing.
Talend is investing significant resources in its development (through capital injections from various funds), which is producing very rapid development of the tool. The product has great potential and is also being supplemented with other tools for MDM and Data Profiling.	Much slower and more uncertain evolution of the tool, because Pentaho tends to move away from the open source focus.
Greater availability of components to connect to multiple systems and data sources, and constantly evolving.	Limited availability of components, but more than enough for most ETL or data integration processes.
There is no database repository (only in paid versions), but working with the workspace and the project concept gives us many possibilities. The dependency analysis and update when elements are modified is very useful (the change is propagated to all the Jobs of a project).	The database repository gives us many possibilities for teamwork. This repository stores the XML that contains the actions that transformations and Jobs perform on the data.
A separate component for each database vendor.	A single component per type of database action (the characteristics of the connection used determine its behavior).
Help available directly in the application. Comprehensive online help for components. When we write our own Java code, we have the language's context assistance provided by Eclipse.	Poor help, almost nonexistent in the application. The online help on the Pentaho website is not particularly complete, and in some parts is very sparse, so the only way to find out how a component works is to test it.
Logs: they can be configured at project level or in each Job, where we indicate whether we want to override the project configuration. Logs can be sent to a database, the console or a file. The functionality is very well developed, distinguishing statistics logs, metrics logs and process logs (to handle errors).	Logs: different log levels (from the most basic to row-level detail of the data flow). Sufficient to analyze the execution of transformations and jobs. Logs can be recorded in a database, but in a very limited way. Log configuration is set at the level of transformations and Jobs.
Debug: with the Debug perspective in Eclipse we can trace the execution (viewing the source code) as if we were programming in Eclipse. You can also display statistics, data traces or response times during job execution in the graphical tool.	Debug: contains a simple, very basic debug tool.
Versioning of objects: in the workspace we can manage complete versioning of Jobs (with minor and major numbers). It lets us recover earlier versions if problems occur. It has a tool for changing the versions of many objects at once (which can be very useful for versioning distributions).	Object versioning is scheduled to be included in version 4.0.
Parallelism: very limited in the Open version. Advanced functionality in the paid versions (Integration Suite).	Parallelism: very easy to set up, using the Distribute Data option in the configuration of the data exchange between steps, but care must be taken to avoid inconsistencies, depending on the type of process.
Automatic generation of HTML documentation for Jobs. It includes graphical views of the designs, property tables, additional documentation or explanatory texts that we have entered in the components, etc. You can see an example here.	~~Unable to generate documentation of transformations and Jobs.~~ Using the graphical tool, we can add notes with comments to the process diagram. With the kettle-cookbook project you can generate HTML documentation.
Simple diagram modeling tool. With it we can draw our Job designs and processes conceptually.
Continuous release of new versions, incorporating improvements and bug fixes.	New versions are not released very often, and to get an up-to-date version we need to build it ourselves from the latest available sources: see the blog entry by Fabin Schladitz.
Talend Exchange: a place where the community develops its own components and shares them with other users.	Pentaho also has enthusiasts who develop and release plugins on its website, though with less activity than Talend.
As a negative point, it is sometimes excessively slow, which is caused by the use of the Java language.	As a downside, some components do not behave as expected when performing complex transformations or when chaining calls between different transformations in Jobs. The problems could be overcome by changing the design of the transformations.
It is an advantage to have a local repository where we store information about databases, tables, views, structures and files (text, Excel, XML). Being in the repository, it can be reused, and the associated components do not need to re-read the metadata from the data source every time (in the case of databases, for example). It has a query assistant (SQL Builder) with many powerful features.	When working with databases with very large catalogs, it is inconvenient to have to retrieve the entire catalog in order to build, for example, a SQL statement that reads from a table (when we use the option of browsing the catalog).
Code reuse: we can include our own libraries, which are visible in all the Jobs of a project. This gives us a way to design our own components.	JavaScript code written in step components cannot be reused in other components. Fairly limited when it comes to adding new functionality or modifying existing functionality.
Flow control of processes: on the one hand we have the data flows (row, iterate or lookup) and on the other the triggers for controlling execution and orchestrating processes. The combination of row and iterate is useful for orchestrating loops for repetitive processing. There is one major inconvenience that can complicate processes: you cannot merge several data streams that come from the same origin (they must have different starting points).	Flow control of processes: information is passed between components (steps) with hops, in a single manner, and the resulting flow varies with the type of control used. This approach has limitations for controlling iterative processes. An interesting feature is the encapsulation of transformations through mappings, which lets us define transformations for repetitive processes (similar to a function in a programming language).
Error handling: when errors occur we can manage the log, but we lose control. We cannot reprocess rows.	Error handling: error management in the steps lets us deal with errors and fix them without ending the process (not always possible, only in some steps).
Execution: either from the tool (which is sometimes very slow, especially if you include statistics and execution traces) or from the command line. To run from the command line, it is necessary to export the Jobs. The export generates all the objects (jar libraries) needed to run the job, including a .bat or .sh file to execute it. This lets us run the job on any platform where Java or Perl can run, without needing to install Talend.	Execution: either from the tool (with quite good response times) or from the command line with Pan (for transformations) and Kitchen (for jobs). These are two very simple and functional utilities that let us execute the XML specifications of jobs or transformations (either from a file or from the repository). The PDI tool must always be installed to run the process. There is also Carte, a simple web server that lets you execute transformations and jobs remotely.
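
The execution row above mentions that PDI transformations run with Pan and jobs with Kitchen. As a small sketch of how a wrapper script might assemble those command lines, the Python helper below picks the launcher from the file extension (`.ktr` for transformations, `.kjb` for jobs). The `-file`, `-level` and `-param:` options are the ones I understand Pan and Kitchen to accept, but they should be checked against the documentation of the installed version; the installation directory `/opt/pdi`, the file paths and the function names are hypothetical.

```python
import subprocess


def build_pdi_command(path, level="Basic", params=None, install_dir="/opt/pdi"):
    """Return the argument list to run a .ktr with Pan or a .kjb with Kitchen.

    install_dir is a hypothetical location; PDI must be installed there.
    """
    if path.endswith(".ktr"):
        launcher = "pan.sh"        # transformations
    elif path.endswith(".kjb"):
        launcher = "kitchen.sh"    # jobs
    else:
        raise ValueError(f"not a transformation or job file: {path}")
    command = [f"{install_dir}/{launcher}", f"-file={path}", f"-level={level}"]
    # Named parameters, sorted so the command line is reproducible.
    for name, value in sorted((params or {}).items()):
        command.append(f"-param:{name}={value}")
    return command


def run_pdi(path, **kwargs):
    """Run the file and return the process exit code (needs a PDI installation)."""
    return subprocess.run(build_pdi_command(path, **kwargs)).returncode
```

For example, `build_pdi_command("/etl/nightly.kjb", level="Detailed", params={"ENV": "dev"})` selects `kitchen.sh` and appends `-param:ENV=dev`. A scheduler such as cron could then call `run_pdi` and act on the exit code. An exported Talend Job needs no such wrapper, since the export already includes its own .bat or .sh launcher.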

Comparisons and additional information.

Benchmark between Talend and Pentaho Data Integration by Matt Casters.	http://www.ibridge.be/?p=150. Commented on in the Goban Saor and Nicholas Goodman blogs.
Comparison between Kettle and Talend, by Vincent McBurney (2007)	http://it.toolbox.com/blogs/infosphere/wiki-wednesday-comparing-talend-and-pentaho-kettle-open-source-etl-tools-16294
Comparison between Kettle, Talend and CloverETL.	http://www.cloveretl.org/_upload/clover-etl/Comparison% 20CloverETL 20vs%%%% 20and 20Talend 20Pentaho.pdf
Benchmark between PDI, Talend, Datastage and Informatica	http://it.toolbox.com/blogs/infosphere/etl-benchmark-favours-datastage-and-talend-28695
Feature comparison and benchmark between Talend and PDI	http://www.atolcd.com/fileadmin/Publications/Atol_CD_Livre_Blanc_ETL_Open_Source.pdf
Comparison between PDI and Informatica	http://www.jonathanlevin.co.uk/2008/03/pentaho-kettle-vs-informatica.html
Comparative table of features on the web Openmethodology.org	Talend vs PDI (Kettle).
Simple example comparing Kettle and Talend	http://forums.pentaho.org/showthread.php?t=57305
Comparison of ETL tools made by the company Adeptia.	http://www.adeptia.com/products/etl_vendor_comparison.html
Pentaho Kettle Open Source Review	http://www.datawg.com/pentaho-kettle-open-source-etl-tool-review.html
Benchmark between Talend and Pentaho Data Integration by Marc Russel.	http://marcrussel.files.wordpress.com/2007/08/benchmark-tos-vs-kettle.pdf
Comparison between Datastage, Talend, Informatica and PDI (Manapps).	http://marcrussel.files.wordpress.com/2009/02/etlbenchmarks_manapps_090127.pdf
PowerPoint presentation comparing Talend and Kettle	http://svn2.assembla.com/svn/bbdd_dd/Presentaciones/Kettle%
EOS: Open Source Directory. EOS: Open Source Directory.	http://www.eosdirectory.com/project/397/Talend+Open+Studio.html vs http://www.eosdirectory.com/project/202/KETTLE++pentaho+data+integration+.html

Summary (final opinion).

From my point of view, both tools are complementary. Each has its own focus, but both allow the same transformation and data integration tasks. Talend has a better future as a product, since many resources are being put into its development and it is being supplemented with other tools to create a true data integration suite. It is also used in the Jaspersoft project, and the fact that it is more open and can be complemented with Java gives it certain advantages over Pentaho.

On the other hand, Pentaho Data Integration is very intuitive and easy to use. You can see from the moment you start using it that, as I mentioned, it has been developed through the prism of the problems of ETL processes and data transformation. In some respects it is faster and more agile than Talend, since it does not have to generate Java code all the time. What we miss is a truly integrated project repository, like Talend's, and metadata that is independent of the source/target systems.

In terms of performance, having reviewed the various comparisons and benchmarks, I do not see a clear winner. Each tool is faster at some things: Talend when calculating aggregations or lookups, Pentaho, for example, when handling SCD or parallelizing processes. In my ETL processes there have been no large differences in performance, although I found Pentaho slightly more agile when running bulk processes.

Taking into account everything seen (and everything detailed above), I lean slightly towards Talend, but before choosing a tool for a project I would conduct a thorough study of the type of work and the cases we will face in the design of our processes. There may be specific factors that recommend one tool or the other (such as the need to connect to a particular application, or the platform on which the processes will run). What is clear in both cases is that either one could be used to build the processes of a DW in a real environment (as I have shown in this blog with the whole series of published examples).

If you have worked with either of the tools, or both, you may have something to add to this comparison. I look forward to your opinions.

Updated 18/06/10

Here is the link to the latest Gartner study on data integration tools, published in November 2009:

Magic Quadrant for Data Integration Tools.

This latest study included Talend as an emerging provider of data integration tools. If you are considering working with Talend, it says interesting things (both Strengths and Cautions) about the evolution of the product and its future:

Strengths:

Two levels: a free entry-level open source tool (Talend Open Studio) and a higher-level paid tool with more features and support (Talend Integration Suite).
Talend is getting almost unanimously positive results in business. Although the initial attraction may be its price, its features and functionality are the second factor in its success.
Good connectivity in general. Complemented by the Data Profiling and Data Quality tools. Moving from the Open version to the paid versions requires no extra learning curve.

Cautions:

There is a shortage of experts in the tool, although Talend is developing a network of alliances with other companies and building its commercial network (although it is not yet present in all regions).
There are some problems with the central repository (which is not in the Open version) when coordinating development work. It looks like they are trying to solve them in the new versions.
Some customers have reported deficiencies in the documentation and some problems in metadata management. It is also necessary to have an expert in Java or Perl to take full advantage of the tool (as indicated in the comparison with Pentaho).

I recommend reading the report. There is a lot of information about data integration tools, which is particularly useful if you are looking for information on a specific product (such as Talend, the subject of this comparison).

This entry was posted on 1 June 2010 at 0:01 and is filed under ETL, Kettle, Pentaho, Talend.

34 responses to “ETL´s: Talend Open Studio vs Pentaho Data Integration (Kettle). Comparative.”

Miguel Angel Pérez Gómez said

3 June 2010 at 21:37
just terrific…as usual 😉

- Roberto Espinosa said
  
  3 June 2010 at 21:46
  A monster…hehe..
  
  Thank you, Miguel Angel!!!!!
  
Werner said

21 June 2010 at 16:31
Roberto,

this is great – many thanks!

Sylvain said

3 November 2010 at 16:03
Hello

A nice post and a very good job !

Sylvain – http://www.osbi.fr
(french author of «Atol CD ETL White book»)

- Roberto Espinosa said
  
  3 November 2010 at 16:53
  Thank you, Sylvain.
  
  Your blog is very good.
  
  Best regards!!!
  
Roland Bouman said

4 November 2010 at 14:11
Hi!

this seems like a very fair and extensive comparison. Thanks!

I don’t agree with everything that’s here though: for example, the way I perceive it, Pentaho/Kettle is more open than Jaspersoft/Talend; scripting support in Kettle also covers custom SQL, Java, shell and open office formulas; finally kettle supports transparent clustering over multiple nodes. But I understand that you can’t mention every little feature so that’s ok.

Final thing I’d like to note is that Pentaho/Kettle does support auto-documentation via the kettle-cookbook community project: http://code.google.com/p/kettle-cookbook/
There are actually quite a lot of community projects that extend and/or complement Pentaho/Kettle functionality, mostly in the form of transformation and/or job plugins.

Thanks again and kind regards,

Roland Bouman – rpbouman.blogspot.com
Author of «Pentaho Kettle Solutions» (Wiley, ISBN: 978-0-470-63517-9)

- Roberto Espinosa said
  
  4 November 2010 at 14:55
  Hi Roland:
  
  Thank you very much for your comments. I will take them into account and update the comparison with these aspects (I did not know about the kettle-cookbook project). I don't have as deep a knowledge of Kettle as you do.
  
  Congratulations for your book and best regards.
  
Michael Waclawiczek said

15 January 2011 at 14:57
I recommend that you take a look at expressor Studio. It’s a brand new, easy to use data integration application with a simplified data mapping approach. You can download a free copy at http://www.expressorStudio.com.

Father Time said

1 March 2011 at 17:51
I have been using Talend for almost a year on and off, and am looking for something better at this point.
More often than not, I have to recreate the project entirely if i remove a component, or move something around. Mapping and the IN/OUT links get lost in the java created code, errors about the schema being different than what is expected come and go and ‘NullPointerExceptions’ appear out of no where.

Incredibly frustrating and forcing me to look at my other choices.
I have lost days in total when trying to trouble shoot an issue that the only resolution ended up being to totally start from scratch.

- Roberto Espinosa said
  
  6 March 2011 at 20:39
  Hello Father Time:
  
  Thanks for your comments about your experience with Talend!!!…I think Talend is a tool still in evolution, and it has a lot of good things. On the other hand, Kettle (PDI) is growing in the new version and I think you can give it a try!!!…As I explain in my comparison, both are complementary tools
  
- sherkhan said
  
  16 March 2012 at 7:30
  When you get a stack-trace error, you can open the code tab to access the java and jump to the line where the error happened (ctrl-L).
  Other errors can easily be spotted by opening the code tab and looking for the red marks where there are compiler errors.
  
Roland Bouman said

1 March 2011 at 17:55
Father Time, give Pentaho Data Integration (kettle) a go.

If you change a transformation, you can use the «Verify» option which checks the logical validity of the model, and you can use the «Get SQL» option to generate the SQL implied by the transformation (both options are in the «Transformation» menu). While not 100% fool proof, these options actually let you troubleshoot issues fairly quickly.

- Father Time said
  
  1 March 2011 at 17:59
  Roland.
  
  I did look at Kettle a while back, but ended up on talend. Time to look again.
  I will check those options out as well.
  
  Thanks.
  
Father Time said

1 March 2011 at 23:09
So today I update the IP of one of the databases and allow talend to update the schemas of any place that db was used.

That ran as it has before but now when I run the job, I get no connection to the db at all and it times out.

I know this isnt a ‘talend help’ topic, but this house of cards seems to be falling apart.

Where is this ‘Transformation’ menu option in talend open studio?
Thanks.

Roland Bouman said

1 March 2011 at 23:12
Father time, I mean the «Transformation» main menu item in kettle.

Patrick said

29 March 2011 at 16:05
Hello Father Time,

Might be a little late, but the best way to get an answer is to go on the Talendforge forum to ask your question to the community. You will be quickly getting an answer from the members: Talend users, professionals, consultants and experts.
Check the forums out: http://www.talendforge.org/forum/

Thanks,
Patrick – Talend

Father Time said

29 March 2011 at 16:26
Patrick,

I have and either my questions are too cryptic or have no answer, because too many have gone unanswered. Even people who have had the same issues, I try to respond to those posts to bump them to get them noticed, but it doesn’t normally help.
And with the amount of times that ‘start a new job from scratch’ (search those forums, you’ll see what i mean) has been the only way to get something that was working, to work again, it (talend) feels very brittle and unsettling.

I’m not an advanced user of talend and the things I have tried to do or understand how to do all seem pretty standard, so I wouldn’t think that what I am trying to do is beyond its claimed abilities.

Jan Lolling said

28 February 2012 at 14:34
Java is NOT slow! By the way Kettle is a also a Java application and if Java applications are slow, mostly it is caused by poor design or object management!

Chris Developer said

5 March 2012 at 17:24
Hello,

we are using Talend since 1,5 Years to build up an Enterprise Datawarehouse. We have had massive problems in terms of Performance. We are now putting all logic into PL/SQL since Talend has just become to slow to load data on a daily basis. In addition there are major Problems with the metadata Repositories. We always have to migrate the whole repository to migrate between the DEV-Test and Prod environments. Contexts are basically not usable since there are problems with inheriting context-settings to subjobs. We will be switching to one of the Major vendors (Informatica, Data Stage, SSIS) inside of the next 2 Month. Cost of Maintenance by far outnumber the higher lincensing cost for stable Software like Informatica. Believe me, do not use Talend on a major Project. Talend might be the right joice for small Datamigration-Projects and has lots of features to offer but the basic features needed for a Datawarehouse are just working to poor.

Jens Bleuel said

17 April 2012 at 16:27
Another change in the feature list of Pentaho Kettle: Data Profiling and Data Quality (Human Inference) Integration with Kettle
see http://kettle.bleuel.com/2012/04/17/data-profiling-and-data-quality-human-inference-integration-with-kettle/

Aman Kataria said

24 May 2013 at 12:17
awesom stuff man that really would have required some effort

Steinberg Itamar said

2 December 2014 at 16:00
thank you for the comparison , it helps to understand the difference between talend and PDI , I my self prefer PDI because i find it very easy to use, of course there is always a learning curve but once you learn the steps (small building blocks of a transformation) then it become simpler. about the slowness remarks, i agree there are several steps that could work faster , like the json input but they are very few, you need to be familiar with the steps and pick the right one for example – using bulk load instead of table output , also i think its very stable and have a large community . very good job

emie2 said

9 July 2016 at 7:58
Excellent job!
Any significant updates to comparison?
