Lately I’ve been looking into integration between open source BI solution, Pentaho, and Openbravo ERP. After installing and setting up the different components, I decided I would start with a very simple task-a basic ETL process, just to get a feeling for Pentaho.
To make things a bit more interesting, I decided I’d throw SugarCRM into the mix. SugarCRM is the world’s leading open source CRM solution. It is written in php and runs on an Apache web server using Mysql as the database engine. My ETL job will transfer some customer information from Openbravo to SugarCRM – that would be a scenario for anyone who is using Openbravo and would like to start working on Sugar as a CRM solution.
After installing the Pentaho BI suite, I decided to take a look at their ETL offering,Kettel. It’s a java based tool, but since the design-time of ETL is a client-side activity,I decided to run it on off my Windows XP PC. ETL tools are part of every major BI offering because they are used to upload large amount of data from operational systems (e.g. ERP) to data warehouses.
ETL tools are also used in migration processes-if you replace your ERP vendor, upgrade to a new version or transfer data from ERP to CRM, for example,you will need an ETL tool. Although ETL tools allows you to upload as many or as few records as you like, the true strength of this tools lie in their ability to perform well when transferring very large amounts of data,sometimes tens of millions of records. In our simple test we will transfer a small number of customer records.
Openbravo->SugarCRM, Customer->Account ETL process using Pentaho Kettel
Notice in the screen capture the source (Openbravo business partner table), the fields mapping step in the middle and the destination – a CSV file to be imported into SugarCRM.
Once I got Kettel to work, I logged in and was very impressed with the ease of use and broad functionality. ETL processes almost always require different types of transformations and Kettel provides an extensive selection of those.
The one thing I didn’t like was the fact that you could not connect two different process steps with a simple ‘arrow drag’ – you have to go to the left menu, right click on ‘Hops’, select source step and destination step from drop down menus and click OK. That is too much effort for such a basic activity. Also, if you have complex transpositions with many steps, finding the step you need in the drop-down menu can be somewhat annoying.
I was pleasantly surprised to see that Kettel had connectors not only to the leading database engines but also to several commercial applications. The optimal integration between Pentaho and Openbravo should include a dedicated ‘Openbravo’ connector delivered by Pentaho as part of their ETL tools. I doubt it will happen.
I was able to easily connect to the Posstgres database I use for my Openbravo installation. That database will be the source of my customer data. The destination will be a csv file which will later be imported into SugarCRM. Since the database schema of openbravo and Sugar are different, a mapping step is required-we need to map the field names in Openbravo to field names in Sugar. Although during import into Sugar you are allow to do that mapping, it is better to let your ETL tools handle that mapping-it’s more transparent, you can reuse this mapping and performance it better.
The next post will show a recorded demo of the entire process completed. I will then move on to an ETL process from Openbravo to a data warehouse. Then the fun can begin-we can start BIing Openbravo!!