Showing newest posts with label Microsoft. Show older posts
Showing newest posts with label Microsoft. Show older posts

Tuesday, April 7, 2009

SSIS Vs Talend Open Studio

From an initial evaluation of Talend Open Studio, open source ETL tools and comparison with Microsoft SQL Server Integration Services (SSIS) here are few quick points that were observed.

Talend Open Studio is an Open Source ETL Tool. The Tool is based off of Java and Perl. The tool presents us with wide range of ETL elements like

  1. Data connectivity (supports wide range of data connectivity from mySQL to Teradata supports all the widely used ODBC drivers) while SSIS has a generic ODBC connection manager, using which one can connect to data sources.
  2. Supports Java/Perl scripts while SSIS supports Microsoft VB .NET / C# scripts (with SQL 2008).
  3. Developer support from the Talend Forums and Tool Documentation while SSIS comes with MSDN Books Online, hands on labs, screen casts and webcasts, as well as TechNet forums from Microsoft.

Coming to the various components/elements of the tool here's a very high level overview,

Microsoft SSIS

Elements

Control Flow

Containers:- For Each Loop, For Loop, Sequence Containers, etc.

Tasks:- Execute SQL Task, Execute Package Task, Data Flow Task, Script Task, etc

Data Flow

Source:- ODBC, Excel, Flat Files, XML, etc

Transformations:- Aggregate, Sort, Lookup, Slowly Changing Dimension Fuzzy lookup fuzzy grouping, etc.

Destinations:- ODBC, Excel, Flat Files, XML, etc

Event Handlers

Error Handling and Logging Mechanisms:- Event Logging, Checkpoints, Error Handling

Variables

Global and local scoped variables

Configuration Files


Talend OpenStudio

Tasks/Elements

Business Intelligence components

Supports Slowly Changing Dimensions, Supports MDX queries

Business Components

Supports Microsoft AX

Custom Code

Allows Custom Java code (like Script Task/Script Components in SSIS)

Data Quality

Adding Surrogate Keys, Lookup (Fuzzy, Inteval)

Database Component

Similar to DataFlow Container, Inputs, Outputs

Database Utilities Components

Create Tables, ParseRecordset

ELT Component

Aggregates, Filter rows/columns

File Component

All File Operations

Internet Component

FTP, RSS, Mails, Mom, Web services, Sockets, XML RPC, Files

Logs & Errors Components

Error Handlers and loggers, Job Kill,

Misc Group Components

Miscellaneous Components

Orchestration Components

Job Sequencing

Processing Components

Aggregations, Mapping, Transformations, Filtering, Denormalizations

System Components

Operating System level tasks

XML Components

All XML operations

Variables

Supports creations and usage of variables

Conclusion

In overall perspective, we save on licensing cost associated with Microsoft SQL Server Integration Services over Talend Open Studio, but we lose out on several other aspects like,

  • Unified BI toolset offered from Microsoft in the form of MS BI toolset which includes Integration Services, Reporting Services, and Analysis Services when purchasing a single license of Microsoft SQL Server 2005/2008.
  • Developer support from online communities for MS BI.
  • Online Trainings in the form of Webcasts/ Virtual Labs/ Hands on labs etc.