Talend Extensions

On and off, I’ve been writing an extension to Talend for the best part of a few months. I say on and off since so much has got in the way that I haven’t had a great deal of time to dedicate to this task.

Some of you may be wondering why I am writing the extension for Talend and not PDI. Well, I guess it’s because when I first embarked on this project I decided that the program would need to dynamically create Java classes. The program wouldn’t know in advance what the classes were or what they looked like. When I first embarked on this endeavour the only way I could think of to tackle this problem was to write a piece of code which would create a .java file, compile it and add it to a jar before invoking the Java classloader and loading the class using reflection. I decided that since PDI is a metadata-driven tool and I had effectively resorted to generating code, this didn’t fit with the PDI model.
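
A minimal sketch of that generate, compile and load approach, assuming a JDK (not just a JRE) so that the `javax.tools` compiler is available. The class name `GeneratedCustomer` and the helper names are made up for illustration, and the step of adding the class to a jar is left out; the compiled class is loaded straight from a temporary directory instead.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class DynamicClassSketch {

    // Writes the source to a temp directory, compiles it, loads the class
    // and returns a new instance created through its no-arg constructor.
    static Object compileAndInstantiate(String className, String source) {
        try {
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("No compiler available: a JDK is required");
            }
            Path dir = Files.createTempDirectory("generated");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // Compile into the same directory; a non-zero exit code means failure.
            int exitCode = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (exitCode != 0) {
                throw new IllegalStateException("Compilation failed for " + className);
            }

            // Load the freshly compiled class from the output directory.
            URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
            return loader.loadClass(className).getDeclaredConstructor().newInstance();
        } catch (java.io.IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes a public no-arg method by name using reflection.
    static Object call(Object target, String method) {
        try {
            return target.getClass().getMethod(method).invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The class is only known at runtime, as a string of source code.
        String source = "public class GeneratedCustomer {\n"
                + "    public String getName() { return \"Ada\"; }\n"
                + "}\n";
        Object bean = compileAndInstantiate("GeneratedCustomer", source);
        System.out.println(call(bean, "getName"));
    }
}
```

Even in this cut-down form the moving parts (source text, compiler, class loader, reflection) are all visible, which gives a feel for why it sits awkwardly in a metadata-driven tool.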

Ironically, after I had written the code to achieve the above objective I discovered the DynaBeans project within Apache Commons, which achieved everything I had been attempting to do and was also a lot simpler. I wouldn’t have the same objection to using DynaBeans with PDI since it isn’t generating code and compiling it.
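
To show why this avoids code generation, here is a standard-library-only sketch of the underlying idea: a bean whose properties are declared at runtime and held in a map. This is not the Commons BeanUtils API; `RuntimeBeanSketch` and its methods are hypothetical names used purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RuntimeBeanSketch {
    // Property names and their expected types, supplied at runtime.
    private final Map<String, Class<?>> propertyTypes;
    private final Map<String, Object> values = new LinkedHashMap<>();

    RuntimeBeanSketch(Map<String, Class<?>> propertyTypes) {
        this.propertyTypes = new LinkedHashMap<>(propertyTypes);
    }

    // Stores a value after checking the property exists and the type matches.
    void set(String name, Object value) {
        Class<?> type = propertyTypes.get(name);
        if (type == null) {
            throw new IllegalArgumentException("Unknown property: " + name);
        }
        if (value != null && !type.isInstance(value)) {
            throw new IllegalArgumentException(name + " expects " + type.getSimpleName());
        }
        values.put(name, value);
    }

    // Returns the stored value, or null if the property has not been set.
    Object get(String name) {
        if (!propertyTypes.containsKey(name)) {
            throw new IllegalArgumentException("Unknown property: " + name);
        }
        return values.get(name);
    }

    public static void main(String[] args) {
        // The "class" is just data, so nothing has to be generated or compiled.
        Map<String, Class<?>> schema = new LinkedHashMap<>();
        schema.put("name", String.class);
        schema.put("age", Integer.class);

        RuntimeBeanSketch bean = new RuntimeBeanSketch(schema);
        bean.set("name", "Ada");
        bean.set("age", 36);
        System.out.println(bean.get("name") + " " + bean.get("age"));
    }
}
```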

Anyhow, I would like to express a few opinions on Talend Extensions. By logging this information here, if I decide to create any further extensions for Talend I should at least know which files to check to see why the code isn’t compiling, etc.

1) The GUI elements – These are defined in an XML file which offers the ability to quickly add common fields such as text fields, etc. Whilst this is perhaps fine to get you going, I also found it limiting. I can’t see how you can create dynamic GUIs. How can you have fields which are loaded based on the data in a table or some other source? With PDI you are writing SWT, so although it might take more lines of code to create the GUI, anything would be possible. What I would really like to do is some sort of validation on the user input. If anyone can tell me how to do this then I would be grateful.

2) Code Generation – There are effectively three main Java Emitter Templates (JET): a begin file, which is executed once at the start of the pipeline; a main file, which is executed for every row in the pipeline; and an end file, which is executed at the end of the pipeline.
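
The way the three parts fit together at runtime can be pictured with a small plain-Java simulation. This is only a sketch of the lifecycle, not real Talend-generated code; the class name, the row type and the output format are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

class ComponentLifecycleSketch {

    // Simulates one component: begin once, main per row, end once.
    static String run(List<String> rows) {
        // "begin" part: runs once, before the first row (set up state here).
        StringBuilder out = new StringBuilder("[begin]");
        int count = 0;

        // "main" part: sits inside the row loop and runs once per row.
        for (String row : rows) {
            out.append(row.toUpperCase()).append(';');
            count++;
        }

        // "end" part: runs once, after the last row (clean up, report totals).
        out.append("[end rows=").append(count).append(']');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
    }
}
```

Anything declared in the begin part is visible to the main and end parts, since all three end up in the same block of generated code.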

The generation model can be a real pain. For a start you don’t have any of the advanced features of an IDE, such as code completion. However, the most annoying part is when you have an error in the source code. You can essentially have two different kinds of errors.

  • Startup Generation Errors – These are primarily due to syntax errors or similar in your JET scripts. I’ve been writing my code in a text editor, which doesn’t give me syntax highlighting or anything else which would help to spot such errors.
  • Compile-time Errors – These occur when you have a problem in your injected code which prevents the generated code from compiling, or even from being generated.

I have found the best way to address such issues is to open the begin, main and end files located in workspace/.JETEmitters/src/org/talend/designer/codegen/translators/{YOUR FOLDER HERE}

So if your XML file defined that your component would appear in the Misc section, then you would find the files that Talend uses in:

workspace/.JETEmitters/src/org/talend/designer/codegen/translators/misc

You can then look at the Java stack trace, find the lines in the above files and then move back to your original JET templates and rectify the problem.

The templates can be messy. I couldn’t find any way to add methods within the templates to reuse elements of the code. I have ended up duplicating code in both the begin and main templates since I couldn’t find any way to share this code between them.

Because I couldn’t see how to use methods within the templates, some of the code is quite heavily nested and is therefore not so easy to read. In fact, since I was writing the code on and off, I would find it difficult to come back to the code and amend it.

My other concern is testing the code. I can’t see any way to effectively unit test the component. This of course is a concern. Could fixing bugs end up like a game of “whack a mole”? I’m sure you have seen the game where you whack one mole and two more pop up. So how can you ensure that in fixing one bug you don’t create two more, particularly if you can’t have automated test suites which could be used to regression test after any change?

However, on a more positive note, I actually found that the generation model conferred some real advantages. It was possible to move the vast majority of the logic to the template and ensure that the generated code would be simple. Basically, since you are effectively writing two pieces of code, you can ensure that the generated code doesn’t have to do a great deal of work and should therefore be fast and use fewer resources.
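
As a rough illustration of that split, the sketch below uses plain Java string building to stand in for a template. The decisions about which options are switched on are taken at generation time, so the emitted code contains no branches at all. The options `trim` and `upperCase`, and the `row` and `output` objects referenced in the emitted text, are hypothetical and not taken from a real component.

```java
class GenerationTimeLogicSketch {

    // Plays the role of the template: all the "if" logic lives here,
    // and only the statements that are actually needed get emitted.
    static String generateMainPart(boolean trim, boolean upperCase) {
        StringBuilder code = new StringBuilder("String value = row.getName();\n");
        if (trim) {
            code.append("value = value.trim();\n");
        }
        if (upperCase) {
            code.append("value = value.toUpperCase();\n");
        }
        code.append("output.setName(value);\n");
        return code.toString();
    }

    public static void main(String[] args) {
        // With only trimming enabled, the per-row code is three plain statements.
        System.out.print(generateMainPart(true, false));
    }
}
```

The per-row code never has to check the component's settings, because the generator already did that once.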

5 Responses to “Talend Extensions”


  1. 1 Cedric Carbone February 2, 2009 at 7:40 pm

    Dear Hugo,

    Thank you for using Talend software.
    Here are some comments on the negative points you wrote in your post:
    “These are defined in an XML file which offers the ability to quickly add common fields such as text fields etc.. Whilst this is perhaps fine to get you going I also found it limiting. I can’t see how you can create dynamic GUI’s.”
    The XML files allow you to build GUIs quickly. If you want to make your own GUI you can create a plugin and put your SWT code inside (it’s an external component). Talend has 500+ components based on the XML file GUI and fewer than 10 components (tMap, tELTMap, tRowGenerator, tFileAdvancedOutputXML, tSCD…) based on a custom SWT GUI.

    “For a start you don’t have any of the advanced feature of an IDE such as code complete, however then most annoying feature is when you have an error in the source code.”
    Have you tried using the component designer perspective? With this editor, you can generate some code through a wizard, get autocompletion, see errors… However, I agree with you, we need to improve this editor a lot… and it’s on the 2009 roadmap of Talend Open Studio (with a test feature, a new drag’n’drop-based GUI builder…)
    Stay tuned ;)
    -cedric

    • 2 hugoworld February 2, 2009 at 8:01 pm

      Thanks for the advice.
      I wasn’t trying to be negative and some of the issues are perhaps due to my lack of understanding of creating extensions.
      In terms of the GUI elements I think that the XML approach is good, since it allows you to create a GUI quickly. I would like to find out how you can do validation based on the user’s input though. Is it necessary to create an SWT component for this purpose?
      I will see if I can have a look at the component designer perspective, especially if this will prevent me from having to repeatedly restart TOS to see the effects of a change in the code. The main point of frustration so far has been due to having errors in the template which either prevent the component from loading, or generating. As I said in the original post, some of this may be due to my text editor missing things such as syntax highlighting for the .javajet files.

      Regards

      Hugo

  2. 3 ccarbone February 2, 2009 at 8:11 pm

    “Especially if this will prevent me from having to repeated restart TOS to see the effects of a change in the code.” You don’t need to restart your studio, just press Ctrl+Shift+F3 (other keyboard tips at http://www.talendforge.org/wiki/doku.php?id=tips )

    Enjoy Ctrl+Shift+F3 when you develop talend components!
    -cedric

