Tuesday, April 7, 2009

Preview into Azure Services

Microsoft has stormed into the cloud computing arena with the release of 'Azure Services'. This post provides a preview of Azure Services. The CTP version of Azure was launched at PDC 2008 and has since caught the attention of many organizations. Before we rip open Azure, we need to understand what cloud computing is all about. Why is there so much buzz around it?




Technically, cloud computing encompasses any subscription-based or pay-per-use service that extends IT's existing capabilities in real time over the Internet.

In simple words, it is a service that lets you host your application in the provider's environment, which supplies the required hardware, software and services on a subscription or pay-per-use basis. This reduces the upfront investment in planning and in purchasing hardware and basic software, lowering the initial cost and turning it into a recurring maintenance cost.

Microsoft Azure is a cloud computing solution that encompasses a large number of servers (the count reportedly keeps increasing by 10,000 every month) bound together by the 'Azure Fabric'. Azure takes care of all the updates and maintenance for the servers, data backup, logging and other tasks. This allows users to stay focused on business requirements and leave scalability, availability and server maintenance to Azure.

A number of services are provided on top of this fabric:

  • .NET Services
  • SQL Services
  • Live Services
  • SharePoint Services *
  • Dynamics CRM Services *

* These services are not available in the CTP version.

You can opt for only the services that you require, so you pay only for the services that you use. Some of the other advantages that Azure brings to the table are:

  • Dynamic provisioning
    • Increase or decrease the resources used at run time
  • Lower cost
    • Pay as you grow: you are billed only for the resources used
    • No upfront cost for planning or building infrastructure
  • Reduced administration overhead
    • Easy and quick deployment of Azure-hosted applications
    • Server maintenance is already taken care of
  • Developer benefits
    • Azure supports .NET, IIS and Visual Studio 2008
    • Does not require extensive training to gain expertise
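
To make the "pay as you grow" point concrete, here is a minimal Python sketch comparing capacity that is kept running for the peak load with capacity that is billed only for the hours actually used. The hourly rate and the demand profile are invented for illustration and are not Azure prices; the function names are hypothetical as well.

```python
# Hypothetical figures for illustration only; these are not Azure prices.
HOURLY_RATE_PER_INSTANCE = 0.10  # assumed cost of one instance for one hour
HOURS_PER_DAY = 24


def fixed_capacity_cost(peak_instances, days):
    """Cost when capacity for the peak load is kept running all the time."""
    return peak_instances * HOURS_PER_DAY * days * HOURLY_RATE_PER_INSTANCE


def pay_per_use_cost(instances_per_hour):
    """Cost when only the instances actually running each hour are billed."""
    return sum(instances_per_hour) * HOURLY_RATE_PER_INSTANCE


# One assumed day of demand: 2 instances off-peak, 10 during a 4-hour peak.
demand = [2] * 20 + [10] * 4

fixed = fixed_capacity_cost(peak_instances=max(demand), days=1)
metered = pay_per_use_cost(demand)

print(f"Provisioned for peak: {fixed:.2f}")
print(f"Pay per use:          {metered:.2f}")
```

With this assumed profile the metered bill is lower because the peak capacity sits idle for most of the day. A flatter demand curve would narrow the gap.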

From a developer's perspective, the main advantage of Azure is that it allows the development team to focus on the core business requirements rather than on administration, extensibility or hardware issues. Azure provides a platform for a host of next-generation applications.

But does that mean you need to jump into Azure? The following are some of the key factors that need to be considered before making any decision. If these signatures exist in your application, then Azure definitely stays on the table.

  • Has the potential to grow manyfold in terms of data or usage
  • Is a social networking type of application
  • Uses highly resource-intensive functions from time to time
  • Requires an environment to test the market
  • Requires a safe and secure mechanism (Internet Service Bus) to communicate outside the organization's network
  • Needs to lower the initial cost of infrastructure planning and purchase
  • Needs to lower the administrative cost of maintaining the servers (data backups, audits, connectivity and other server issues)

Migrating an application to Azure does not mean that the investment made in a traditional on-premise application is wasted. In fact, Azure can work hand in hand with traditional applications: a traditional application can be extended by moving its resource-intensive functions to the cloud.

Based on the customer's requirements, a cloud application can be:

  • Hosted completely in the cloud
  • Partly hosted in the cloud, with the data residing on premise
  • Hosted on premise, with the data stored on Azure

Having said all this, we still have to understand that Azure, like the cloud computing environment as a whole, is still at an evolving stage. Along with its advantages, the cloud also brings a dependency on the service provider. Some of the common concerns that have bubbled up relate to SLAs, maintenance charges and similar terms, which can change over time.

Considering the Microsoft branding and the kind of investment that has been put into Azure, one thing is certain: Azure is here to stay. It is time to get the creative juices flowing and make the best use of Azure.

SSIS Vs Talend Open Studio

From an initial evaluation of Talend Open Studio, an open source ETL tool, and a comparison with Microsoft SQL Server Integration Services (SSIS), here are a few quick observations.

Talend Open Studio is an open source ETL tool based on Java and Perl. It offers a wide range of ETL elements and compares with SSIS as follows:

  1. Data connectivity: Talend supports a wide range of data sources, from MySQL to Teradata, and all the widely used ODBC drivers, while SSIS has a generic ODBC connection manager that can be used to connect to data sources.
  2. Scripting: Talend supports Java/Perl scripts, while SSIS supports Microsoft VB.NET and, with SQL Server 2008, C# scripts.
  3. Developer support: Talend offers the Talend forums and tool documentation, while SSIS comes with MSDN Books Online, hands-on labs, screencasts and webcasts, as well as TechNet forums from Microsoft.

Coming to the various components/elements of each tool, here is a very high-level overview:

Microsoft SSIS

Elements

  • Control Flow
    • Containers: For Each Loop, For Loop, Sequence Container, etc.
    • Tasks: Execute SQL Task, Execute Package Task, Data Flow Task, Script Task, etc.
  • Data Flow
    • Sources: ODBC, Excel, Flat Files, XML, etc.
    • Transformations: Aggregate, Sort, Lookup, Slowly Changing Dimension, Fuzzy Lookup, Fuzzy Grouping, etc.
    • Destinations: ODBC, Excel, Flat Files, XML, etc.
  • Event Handlers
    • Error handling and logging mechanisms: Event Logging, Checkpoints, Error Handling
  • Variables
    • Global and local scoped variables
  • Configuration Files
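
The source, transformation and destination roles of a data flow can be sketched outside any particular tool. The following is a minimal Python illustration, not SSIS code: the sample data, the column names and the function names are all assumptions, and in-memory strings stand in for flat files.

```python
import csv
import io
from collections import defaultdict

# Assumed sample input standing in for a flat-file source.
SOURCE_CSV = """region,amount
East,100
West,250
East,50
"""


def extract(text):
    """Source: read rows from a flat file (here, an in-memory string)."""
    return list(csv.DictReader(io.StringIO(text)))


def aggregate(rows):
    """Transformation: total the amount per region, sorted by region."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += int(row["amount"])
    return [{"region": r, "total": t} for r, t in sorted(totals.items())]


def load(rows):
    """Destination: write rows to a flat file (here, returned as a string)."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["region", "total"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


result = load(aggregate(extract(SOURCE_CSV)))
print(result)
```

In an ETL tool each of these three functions would be a component dropped on the design surface and wired to the next, rather than hand-written code.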


Talend Open Studio

Tasks/Elements

  • Business Intelligence components: supports Slowly Changing Dimensions and MDX queries
  • Business components: supports Microsoft AX
  • Custom Code: allows custom Java code (like the Script Task/Script Component in SSIS)
  • Data Quality: adding surrogate keys, lookups (fuzzy, interval)
  • Database components: similar to the Data Flow container, inputs and outputs
  • Database Utilities components: Create Tables, ParseRecordset
  • ELT components: aggregates, filtering rows/columns
  • File components: all file operations
  • Internet components: FTP, RSS, mail, MOM, web services, sockets, XML-RPC, files
  • Logs & Errors components: error handlers and loggers, job kill
  • Misc group components: miscellaneous components
  • Orchestration components: job sequencing
  • Processing components: aggregations, mapping, transformations, filtering, denormalization
  • System components: operating-system-level tasks
  • XML components: all XML operations
  • Variables: supports creation and use of variables
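
Both tools list Slowly Changing Dimension support. As a tool-agnostic illustration of what such a component does, here is a minimal Python sketch of one common variant, often called Type 2, where a changed record is expired and a new version is appended with a fresh surrogate key. The choice of Type 2, the row layout and the function name are assumptions made for this sketch; neither tool's actual implementation is shown.

```python
from datetime import date


def apply_scd_type2(dimension, incoming, today):
    """Expire changed rows and append new versions (Type 2 behaviour).

    dimension: list of dicts with keys surrogate_key, customer_id, city,
               valid_from, valid_to, is_current (hypothetical layout).
    incoming:  list of dicts with keys customer_id, city.
    """
    current = {r["customer_id"]: r for r in dimension if r["is_current"]}
    next_key = max((r["surrogate_key"] for r in dimension), default=0) + 1

    for record in incoming:
        existing = current.get(record["customer_id"])
        if existing is not None and existing["city"] == record["city"]:
            continue  # nothing changed, keep the current version
        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = today
            existing["is_current"] = False
        new_row = {
            "surrogate_key": next_key,
            "customer_id": record["customer_id"],
            "city": record["city"],
            "valid_from": today,
            "valid_to": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[record["customer_id"]] = new_row
        next_key += 1
    return dimension


# Assumed sample data: one existing customer moves, one customer is new.
dim = [{
    "surrogate_key": 1, "customer_id": "C1", "city": "Pune",
    "valid_from": date(2008, 1, 1), "valid_to": None, "is_current": True,
}]
updates = [
    {"customer_id": "C1", "city": "Mumbai"},
    {"customer_id": "C2", "city": "Delhi"},
]
apply_scd_type2(dim, updates, today=date(2009, 4, 7))

for row in dim:
    print(row["surrogate_key"], row["customer_id"], row["city"], row["is_current"])
```

The old row for the moved customer is kept with an end date, so history is preserved. This is also where the surrogate keys mentioned under Data Quality come in.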

Conclusion

Overall, choosing Talend Open Studio saves the licensing cost associated with Microsoft SQL Server Integration Services, but we lose out on several other aspects, such as:

  • The unified BI toolset from Microsoft, which includes Integration Services, Reporting Services and Analysis Services, all covered by a single Microsoft SQL Server 2005/2008 license.
  • Developer support from online communities for MS BI.
  • Online training in the form of webcasts, virtual labs, hands-on labs, etc.