Taking the Pain out of Excel Reporting with a useful Talend Extension

Creating Excel Reports just got a whole lot easier.

Creating Excel Reports is Easy

Creating Excel Reports is Easy

You can read more about the tool in the embedded PDF titled Creating Excel Reports Made Easy or download it from here.

The source code is located on the talendForge site. (http://talendforge.org/exchange/tos/?cid=6). If you search by Author Hugo in the java category you can download the addin.

Love it or loath it the spreadsheet is a ubiquitous business application which empowers the end user. Whilst data can be conveyed in many formats, most business users are happiest with data in Excel.

I think every IT professional and business user alike will agree with the above statement. While spreadsheets can be a complete nightmare and are frequently abused, for many purposes including reporting and analysis the business find them to be the ideal medium.

Having spent the majority of my IT career working in data related fields I have produced more than my fair share of ETL routines and reports. However until recently something was missing; creating Excel reports was a pain.

Whilst you could create a report in a tool such as Business Objects, Crystal Reports, JasperReports, JFreeReports or any other dedicated reporting tool then export the report to Excel, I have always found these tools to be inflexible when exporting to Excel. They lack the ability to have features such as split panes, drop down filters (eg autofilters), multiple worksheets and lack control over the layout of the data.

Therefore I have created an open source extension for Talend in java which I hope can ease this pain. Traditionally you could create Excel reports in java by using a library such as POI or JXL and writing plenty of code. Alternatively you could write the program in VB/A or .NET but all of these approaches require large amounts of code and can be inflexible, with many lines dedicated to trivial issues such as formatting and data placement. If you wanted to change the report layout or format you needed to change the code and then undergo regression testing to ensure that you hadn’t inadvertently broken something else.

By using this extension and a template based approach the whole process is greatly simplified, but certainly doesn’t lack in power or flexibility. Indeed if you discover there are features which you can’t accomplish with the tool and templates you can include a piece of VBA in the template which will run when the workbook is open to complete the tasks!

I believe that the tool is straightforward and it should be possible for a business user or someone without extensive IT training to quickly get up to speed and create great looking reports without the need for a single line of bespoke code. More adventurous users will appreciate the power offered by the Tags and ability to loop on collections and access underlying features such as renaming worksheets based on object field values. You can read more about the tool in the attached PDF titled Creating Excel Reports Made Easy, infact you should use that as your guide to get up and running with the tool, since it contains numerous useful examples.

Whilst I have endeavored to test the extension and make it as flexible as possible I am of course limited to my own imagination and tests. I would therefore be greatful if others would download the tool provide feedback that will help to improve the tool and benefit us all.

In a nutshell the report is creating by using Talend and mapping the incoming data to objects. The Image below shows a template where all data has been mapped to a single employee object – hence the tags such as ${employee.name}

See how easy it can be

See how easy it can be

By using the expressions $[b2 + c2] and $[sum(e2)] in the above report we actually retain one of the many advantages of excel, notably the ability to change fields and see the effects, since at run time these values will become formulas such as =sum(e2:e10). Therefore the business might choose to flex the created spreadsheet to see what the effect would be of changing some variables. In a more traditional report generated by a tool like Crystal these fields would be pre-calculated and we loose one of the advantages which excel offers.

It is possible to nest these objects as well eg you might have an incoming row which has joined two relational tables together eg Emp and Dept. In this situation you would map some fields to the Employee Object and some fields to the Department Object, in addition you would tell the tool on which fields it can join the Employee and Department objects.

By allowing for the nesting of objects it is possible to create nested / banded reports eg see below: – Note if the pictures looks funny please trust me this is Excel, however I use a mac – so it should be noted that these spreadsheets have been produced on a unix machine, although being java based the reports can also be created on windows and linux.

A Report with Grouping And Outlines

A Report with Grouping And Outlines

The actual Excel Template looks like the one below – note this is one of the more complex spreadsheets you could hope to create since it has a variety of features. The Tags use JEXL syntax and should therefore be familiar to anyone with a knowledge of JSP – and are simple to pick up by business users. If the Tags scare you then believe me, you would need to write a lot more code in VBA or a similar language to import the data from the source system and create a report such as this.

The Template use to create the Banded Report

The Template use to create the Banded Report

The main piece of the work is done by the Talend Extension – however fear not this is very simple to use. It comprises of very few fields.

Mapping the Elements

Mapping the Elements

Please don’t let the images showing tags scare you, they really are quite simple when you get the hand of them.

In addition it is possible to create charts, filter the data, add outlines to the data, group the data and use a whole host of formulas.

The source code is located on the talendForge site. (http://talendforge.org/exchange/tos/?cid=6). If you search by Author Hugo in the java category you can download the addin.

Enjoy

Hugo

About these ads

16 Responses to “Taking the Pain out of Excel Reporting with a useful Talend Extension”


  1. 1 marion February 17, 2009 at 2:16 pm

    hello,

    Wich language is used to generate the template file? Where can I find some tutorials?
    is it accessible to final users?

    thank you

    • 2 hugoworld February 17, 2009 at 3:46 pm

      Hi

      You can find tutorials in the PDF link on this page. WordPress is a bit wierd and clicking on the link will take you to a second page, which inludes a file called NewsLetter.pdf… This file is the documentation and tutorials.
      The talend component is java. In terms of the templates they aren’t generated the output file is generated by applying the expressions in the template. The expressions use JEXL Syntax (Java Expression Language). For further details take a look at the PDF which explains how the jxls implementation works.

      However for many cases it’s not necessary to use expressions.

      You can include lots of features in the template, such as freeze panes, autofilters, conditional formatting, excel formulas etc…

      The code is available to everyone the entire project is uploaded in zip format to talendForge.

      Regards

      Hugo

  2. 3 Gábor Bánóczi March 5, 2009 at 7:13 pm

    Dear Hugo,

    I wanted to acknowledge your work in this semi public space. The Excel report generating tool for Talend is awesome.
    Thank you. And thank you.

    For all those who might download the tutorial, please be aware that when you copy code from the PDF into the Excel sheet the quotation marks will be not the standard characters and will cause syntax errors in the execution.

    Thank you,

    Gábor

  3. 4 Gábor Bánóczi March 5, 2009 at 7:34 pm

    What is the significance of the different parenthesis marks i.e. the [], {}, ().

    If you gave us a reference where the syntax is described would be most helpful…

    Thanks.

  4. 5 Gábor Bánóczi March 11, 2009 at 1:17 am

    Dear Hugo,

    to answer my own question, in addition to the very helpful document that you created here is the website with the documentation of the jXLS:

    http://jxls.sourceforge.net/index.html

    HTH,

    Gábor

  5. 6 hugoworld March 11, 2009 at 7:07 am

    Hi Gabor

    Yes the underying component is a library called jxls. This is explained in the PDF. However not all of the examples on the sourceforge site will work since this component is using generated code some of the flexibility is missing. Hence the most common operations are included in the PDF.
    The syntax of the expressions uses JEXL again this is stated in the PDF and for underly object access such as renaming workbooks then the JavaDocs of Apache POI should be studied.

  6. 7 Cindy October 19, 2009 at 11:52 pm

    Hi,

    I have been playing with this today and think its greate! I have 2 questions I’m hoping you can answer. How do I bring a context variable into a header of the excel report?

    Also how can I create a pivot table from the raw data.

    Any info would be greatly appreciated!!!!

    • 8 hugoworld October 20, 2009 at 8:23 pm

      Hi Cindy

      Thanks for the question. In terms of the context variable I would need to check. I don’t think it will be possible in the current version. I will see if I can have a look when I have a little more time.

      The solution to your second question is that you need to use a small piece of VBA. Something like this should work. This assumes you have your source data in a sheet Called “Sheet1″ and you want your pivot table to be placed in “Sheet2″. To access the macro features of Excel press Alt+F11 and you should be able to pretty much paste this code in.
      If you want a different layout (in terms of fields on pages, rows and columns then you might want to record the macro and the adapt the code accordingly. You will probably want to leave in the check for the last row of you source worksheet, to ensure all data is included in the pivot, and also the bit to delete the pivot and recreate it.

      Option Explicit

      Private Sub Workbook_Open()

      Dim WB As Workbook
      Dim WSSource As Worksheet
      Dim WSTarget As Worksheet
      Dim lastRow As Integer
      Dim PT As PivotTable

      On Error GoTo err:

      Set WB = ThisWorkbook
      Set WSSource = WB.Sheets(“Sheet1″)
      Set WSTarget = WB.Sheets(“Sheet2″)

      ‘Delete the existing pivot (if any)

      ‘ Delete any prior pivot tables
      For Each PT In WSTarget.PivotTables
      PT.TableRange2.Clear
      Next PT

      ‘Change this to the source sheet name

      lastRow = WSSource.UsedRange.Rows.Count

      WSSource.Activate
      WSSource.Range(“A1:D” & lastRow).Select

      ‘CHANGE THE R1C1 Reference (Row and Column) And Sheet Name

      WSSource.PivotTableWizard SourceType:=xlDatabase, SourceData:= _
      “Sheet1!R1C1:R” & lastRow & “C4″, TableDestination:=”Sheet2!R1C1″, TableName:=”PivotTable1″
      ActiveSheet.PivotTables(“PivotTable1″).SmallGrid = False
      With ActiveSheet.PivotTables(“PivotTable1″).PivotFields(“Sales”)
      .Orientation = xlDataField
      .Position = 1
      End With
      With ActiveSheet.PivotTables(“PivotTable1″).PivotFields(“Year”)
      .Orientation = xlColumnField
      .Position = 1
      End With
      With ActiveSheet.PivotTables(“PivotTable1″).PivotFields(“Month”)
      .Orientation = xlRowField
      .Position = 1
      End With
      With ActiveSheet.PivotTables(“PivotTable1″).PivotFields(“Person”)
      .Orientation = xlPageField
      .Position = 1
      End With

      ‘Clean up
      Set WB = Nothing
      Set WSSource = Nothing
      Set WSTarget = Nothing

      Exit Sub

      err:
      MsgBox (“Error – ” + err.descrption)

      End Sub

    • 9 hugoworld October 20, 2009 at 8:26 pm

      Oh sorry I forgot to mention that Sheet1 contained a Header Record of Sales, Year, Month and Person in Cells A1:D1 with the actual data being in rows 2 onwards.

      Hope that helps.

      • 10 argo January 31, 2010 at 7:04 pm

        excellent post, hugo. i gave this a shot. it works like a charm.

        i did notice it took 1-1.5s to render a pivot table (with 2 aggregate columns) table from @600 data rows.

        in my usecase, 25k-50k rows on the datasheet would not be unheard of. so a static quick-loading pivot table would serve us better than one that is drawn on-worksheet-open.

        i’m now looking into the possibility of creating the pivot table as part of the jxls processing i.e. after the data table is created, perhaps by firing the same macro.

        no idea if this is possible, but i’m going to give it a shot.

        cheers,
        ~a

      • 11 hugoworld February 1, 2010 at 8:00 am

        I don’t know if this would be possible, particuarly from java and I have never tried this but perhaps you could try to use the application.run method from a worksheet cell.

        E.G. – Try typing something like the formula below in cell A1

        Application.Run(“MyMacro”)

        In this case MyMacro must be declared as being public, you can pass parameters is as arguments using commas.

        If you are running the code on a windows machine which has Excel Installed then you could use Talend to create the spreadsheet and a simple piece of Windows Scripting Host / C# command line code to instantiate excel and run the macro before saving the file.

        I can put up some sample c# to execute the macro if you are interested, you could get talend to call this at the end of the job.
        If you are on a unix platform then things might be a little more tricky.

  7. 12 Roger November 12, 2009 at 11:01 pm

    Hello Hugo,

    I am using the tExcelReport in Talend Open Studio 3.05. The Plugin Works so far with out any problems. Thank you for this perfect Job!

    I have two questions:
    1) Will you make tExcelReport available for Talend Open Studio 3.2.x?
    2) Would it be possible to use multiple output from tMap and send this to different tExcelReport output? With the actual version it seams that parallel Work with the tExcelReport is not working. I have to do for each Excel-output a new Job.

    • 13 hugoworld December 31, 2009 at 5:32 pm

      Hi Roger

      1) I will test with never version of Talend over the next few days and try and make any of the necessary changes
      2) Yes this is a small change.

  8. 14 hugoworld February 1, 2010 at 11:43 am

    As mentioned earlier if you have access to a windows machine to run the macro then you try something like this c# snippet.

    public static void main (String[] args)
    {
    Application oXL = new Application();
    oXL.Visible = false;
    String fileName = Server.MapPath(“~/MyReport.xls”);
    object missing = Type.Missing;
    Workbook wb = oXL.Workbooks._Open(fileName, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
    oXL.Run(srcFile + “!Module4.test”, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing,
    missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
    wb.Save();
    wb.Close(true, missing, missing);
    oXL = null;
    }

    The sample above will run a macro called Test which lives in Module4 of the spreadsheet called MyReport.xml

    If your Talend job runs on windows then you will need to install Excel on the same machine. Make sure you reference Microsoft Office in the Visual Studio Project (You can download a trial version for free from microsoft)

    If your talend job runs on a unix server, then you could install cygwin and the openssh component of cygwin on a windows machine.
    Use the command cygrunsrv with openSSH to install openssh as a service (you might want to google howto do this).
    You can then use the command ssh -c to run the job on the windows machine from your unix box.
    Although I believe Talend has a run ssh command task built in.

    Hope that helps


  1. 1 Talend ETL Excel report generator « Gobán Saor Trackback on February 13, 2009 at 4:37 pm
  2. 2 Taking the Pain out of Excel Reporting with a useful Talend Extension « Britain's Got Talend* Trackback on April 27, 2010 at 12:43 pm

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s





Follow

Get every new post delivered to your Inbox.

%d bloggers like this: