Tuesday, April 7, 2009

SSIS Vs Talend Open Studio

From an initial evaluation of Talend Open Studio, open source ETL tools and comparison with Microsoft SQL Server Integration Services (SSIS) here are few quick points that were observed.

Talend Open Studio is an Open Source ETL Tool. The Tool is based off of Java and Perl. The tool presents us with wide range of ETL elements like

  1. Data connectivity (supports wide range of data connectivity from mySQL to Teradata supports all the widely used ODBC drivers) while SSIS has a generic ODBC connection manager, using which one can connect to data sources.
  2. Supports Java/Perl scripts while SSIS supports Microsoft VB .NET / C# scripts (with SQL 2008).
  3. Developer support from the Talend Forums and Tool Documentation while SSIS comes with MSDN Books Online, hands on labs, screen casts and webcasts, as well as TechNet forums from Microsoft.

Coming to the various components/elements of the tool here's a very high level overview,

Microsoft SSIS

Elements

Control Flow

Containers:- For Each Loop, For Loop, Sequence Containers, etc.

Tasks:- Execute SQL Task, Execute Package Task, Data Flow Task, Script Task, etc

Data Flow

Source:- ODBC, Excel, Flat Files, XML, etc

Transformations:- Aggregate, Sort, Lookup, Slowly Changing Dimension Fuzzy lookup fuzzy grouping, etc.

Destinations:- ODBC, Excel, Flat Files, XML, etc

Event Handlers

Error Handling and Logging Mechanisms:- Event Logging, Checkpoints, Error Handling

Variables

Global and local scoped variables

Configuration Files


Talend OpenStudio

Tasks/Elements

Business Intelligence components

Supports Slowly Changing Dimensions, Supports MDX queries

Business Components

Supports Microsoft AX

Custom Code

Allows Custom Java code (like Script Task/Script Components in SSIS)

Data Quality

Adding Surrogate Keys, Lookup (Fuzzy, Inteval)

Database Component

Similar to DataFlow Container, Inputs, Outputs

Database Utilities Components

Create Tables, ParseRecordset

ELT Component

Aggregates, Filter rows/columns

File Component

All File Operations

Internet Component

FTP, RSS, Mails, Mom, Web services, Sockets, XML RPC, Files

Logs & Errors Components

Error Handlers and loggers, Job Kill,

Misc Group Components

Miscellaneous Components

Orchestration Components

Job Sequencing

Processing Components

Aggregations, Mapping, Transformations, Filtering, Denormalizations

System Components

Operating System level tasks

XML Components

All XML operations

Variables

Supports creations and usage of variables

Conclusion

In overall perspective, we save on licensing cost associated with Microsoft SQL Server Integration Services over Talend Open Studio, but we lose out on several other aspects like,

  • Unified BI toolset offered from Microsoft in the form of MS BI toolset which includes Integration Services, Reporting Services, and Analysis Services when purchasing a single license of Microsoft SQL Server 2005/2008.
  • Developer support from online communities for MS BI.
  • Online Trainings in the form of Webcasts/ Virtual Labs/ Hands on labs etc.

2 comments:

Patrick_Talend said...

From Talend, I'd like to comment on your blog post.

As you say, there are no license fees when using Talend Open Studio.

If you want a thorough developer support for Talend, you need to go to our forum as you pointed out. The community is large and active, able to answer to your questions. If you find a bug, you can also report it on the bugtracker.

About training with Talend, there is an extensive documentation on the website (http://www.talend.com/resources/documentation.php), tutorials you can watch (http://www.talendforge.org/tutorials/menu.php) and webinars you can follow (http://www.talend.com/webinar/index.php) to learn more about Talend.

Thanks for the review,
Patrick - Talend.

Shashidhara Krishnamurthy said...

Thank you for your comments Patrick. This information would definitely help out people seeking more information on Talend Open Studio.

Thank you,
Shashidhara Krishnamurthy

Post a Comment