Open Source BI Pentaho Installation Walkthrough
Pentaho is a leading open source BI suite. It contains all the components required of an enterprise-grade BI solution: reporting, dashboards, an analytics engine, ETL and data mining. The reason I am covering Pentaho here is that I believe an open source BI suite can complement the open source ERP solutions I cover here.
The Pentaho offering follows the classic ‘modern’ open source model: you can download, install and use a ‘basic’ edition of the software for free, but Pentaho charges for additional ‘enterprise class’ functionality, support and some advanced administration options.
The installation itself (I am installing Pentaho 1.7 GA) was not as straightforward as I expected and required some manual work. Because of that, I will write a complete installation guide, hoping it will save some time and effort for others who plan to install Pentaho. I will run the Pentaho server on my Ubuntu Linux box.
Pentaho is written in Java and runs on the JBoss application server. That makes integrating it with Openbravo or Compiere, both written in Java as well, somewhat easier. However, BI solutions are platform agnostic: they rely only on the underlying data layer and are independent of the programming language used to write the ERP system.
Pentaho is a BI solution, so it requires a database that contains the data to be analyzed or mined. For that purpose, I will use a MySQL database with some sample data provided by Pentaho.
BTW – Pentaho delivers several out-of-the-box default users and passwords. They can be found under the ‘Valid Users’ drop-down menu in the upper-right corner of your Pentaho installation homepage.
Sample Pentaho Report

Pentaho Installation Steps
Make sure you have an Ubuntu 8.04 Linux installation.
Install Java:
server01$ sudo apt-get install sun-java5-jdk
Install MySQL:
server01$ sudo apt-get install mysql-server mysql-client
Create a directory for your Pentaho installation (you may need sudo to write under /opt):
server01$ mkdir /opt/pentaho
server01$ cd /opt/pentaho
Download the Pentaho BI Suite:
server01$ wget http://dfn.dl.sourceforge.net/sourceforge/pentaho/pentaho_demo_mysql5-1.7.0.GA.tar.gz
Gunzip and untar the Pentaho archive:
server01$ gunzip -c pentaho_demo_mysql5-1.7.0.GA.tar.gz | tar xvf -
Your Pentaho startup script is:
/opt/pentaho/pentaho-demo/start-pentaho.sh
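Before moving on, it can help to confirm that the tools installed above are actually on the PATH and that the archive was extracted where expected. The following is a minimal sketch and not part of the Pentaho distribution: `missing_commands` is a hypothetical helper name, and the command names `java` and `mysql` are my assumptions about what the packages above provide.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on the PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# Assumed command names provided by the packages installed above.
missing=$(missing_commands java mysql)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All required commands found."
fi

# The startup script path is the one given in this walkthrough.
if [ ! -f /opt/pentaho/pentaho-demo/start-pentaho.sh ]; then
    echo "start-pentaho.sh not found; check the extraction step."
fi
```

If anything is reported as missing, go back to the matching install step before continuing.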
Create the Pentaho MySQL Database
Before we can get Pentaho to work, we need to create the MySQL database. This database will contain Pentaho metadata, users and access rights, along with some sample data we will later use to create reports, run analyses, etc. The archive we downloaded contains an SQL script we will execute to create the database, users and sample data.
I was not able to use the supplied script (if you were able to use it with MySQL, please let me know how; it might work on databases other than MySQL).
The Pentaho MySQL database creation script can be found at:
http://source.pentaho.org/svnroot/pentaho-data/trunk/mysql5/SampleDataDump_MySql.sql
This script would not run as-is on my machine (MySQL 5.0.51a-3ubuntu5.1).
First, the script starts by dropping some users; if they do not exist, you get error messages. If this is your first installation of Pentaho on your MySQL database, comment out the first 4 lines, which drop existing users.
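The commenting-out can also be scripted. This is a sketch only: it assumes each of those statements sits on its own line starting with `DROP USER` (I am not reproducing the script here, so check that against your copy), `comment_out_drop_user` is a made-up helper name, and the output file name is hypothetical. It writes to a new file so the original stays untouched.

```shell
#!/bin/sh
# Hypothetical helper: prefix every line that starts with DROP USER
# (any letter case) with the SQL comment marker "-- ".
# Reads the file named in $1 and writes the result to stdout.
comment_out_drop_user() {
    sed -E 's/^([[:space:]]*[Dd][Rr][Oo][Pp][[:space:]]+[Uu][Ss][Ee][Rr].*)$/-- \1/' "$1"
}

# Usage: keep the original and work on an edited copy.
if [ -f SampleDataDump_MySql.sql ]; then
    comment_out_drop_user SampleDataDump_MySql.sql > SampleDataDump_MySql.edited.sql
fi
```

If you go this route, remember to feed the edited copy to mysql instead of the original file.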
Another issue with the script is the MEMORY table it tries to create. The syntax is simply wrong. The script attempts to create a table called DATASOURCE of type MEMORY.
Replace the SQL command that creates the DATASOURCE table with:
CREATE TABLE DATASOURCE(JNDINAME VARCHAR(50) NOT NULL PRIMARY KEY,MAXACTCONN INTEGER NOT NULL,DRIVERCLASS VARCHAR(50) NOT NULL,IDLECONN INTEGER NOT NULL,USERNAME VARCHAR(50) NOT NULL,PASSWORD VARCHAR(50) NOT NULL,URL VARCHAR(100) NOT NULL,QUERY VARCHAR(100) NOT NULL,WAIT INTEGER NOT NULL) ENGINE=MEMORY;
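To find the statement that needs replacing, a search by table name is enough. A small sketch, assuming the statement mentions `TABLE DATASOURCE` on one line in your copy of the script; `find_datasource_stmt` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "line-number:text" for each line in the
# file named in $1 that mentions TABLE DATASOURCE (any letter case).
find_datasource_stmt() {
    grep -n -i 'TABLE[[:space:]]*DATASOURCE' "$1"
}

if [ -f SampleDataDump_MySql.sql ]; then
    find_datasource_stmt SampleDataDump_MySql.sql
fi
```

The line number it prints is where to paste the replacement statement in your editor.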
That should do it. Now just run the script:
server01$ cd data
server01$ mysql -u root -p < SampleDataDump_MySql.sql
Your MySQL database should now be populated with Pentaho metadata and sample data.
Connecting to the Pentaho Server from a Remote Computer
If you plan to use a separate computer as a Pentaho client, there is one more thing you need to take care of: enabling access to the Pentaho server from remote computers (by default, both Pentaho and the JBoss application server limit access to the server's localhost only).
To do that, edit the following two files:
- In the file /opt/pentaho/pentaho-demo/jboss/server/default/deploy/pentaho.war/WEB-INF/web.xml, change the value of the property ‘base-url’ from localhost to your Pentaho server hostname or IP address. You can also change the port Pentaho will listen on from the default 8080 to something else; that is useful if you have other applications already listening on this port.
- Change the Pentaho startup script (start-pentaho.sh) so that the JBoss server binds to your hostname or IP address instead of the default localhost. To do that, pass your Pentaho server hostname or IP address to the -b switch. Your JBoss start command should look like: sh run.sh -b server01
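The ‘base-url’ change can be sketched with sed. This assumes the value appears in web.xml as a `<param-value>` element that starts with `http://localhost` (check your file first, I am not reproducing it here). `set_base_url_host` is a made-up helper name, `/tmp/web.xml.edited` is a hypothetical output location, and `server01` is the example hostname used in this post. The sketch prints the edited file rather than changing it in place.

```shell
#!/bin/sh
# Hypothetical helper: replace "localhost" with the host given in $2
# inside <param-value>http://localhost... elements of the file $1.
# The edited text goes to stdout; the file itself is not modified.
# Note: the host must not contain the characters '#' or '&'.
set_base_url_host() {
    sed "s#\(<param-value>http://\)localhost#\1$2#" "$1"
}

WEB_XML=/opt/pentaho/pentaho-demo/jboss/server/default/deploy/pentaho.war/WEB-INF/web.xml
if [ -f "$WEB_XML" ]; then
    # Review the output, then copy it over web.xml once it looks right.
    set_base_url_host "$WEB_XML" server01 > /tmp/web.xml.edited
fi
```

Comparing the output against the original with diff before copying it back is a cheap way to make sure only the intended line changed.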
Starting Pentaho
You are now ready to start Pentaho, by executing:
server01$ /opt/pentaho/pentaho-demo/start-pentaho.sh
Take a look at the trace on the screen and make sure you do not have any errors (you will get many WARN messages saying default values are used).
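If the trace scrolls by too quickly, one option is to capture it to a file and count the error lines afterwards. A rough sketch, assuming the log marks levels with the standalone words ERROR and WARN, which I have not verified for every message; `count_level` and the file name `pentaho-start.log` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in file $2 that contain the
# log level $1 (e.g. ERROR or WARN) as a separate word.
count_level() {
    grep -c -w "$1" "$2"
}

# Assumed capture file, e.g. created with:
#   /opt/pentaho/pentaho-demo/start-pentaho.sh 2>&1 | tee pentaho-start.log
LOG=pentaho-start.log
if [ -f "$LOG" ]; then
    echo "ERROR lines: $(count_level ERROR "$LOG")"
    echo "WARN lines:  $(count_level WARN "$LOG")"
fi
```

A non-zero ERROR count is the cue to open the log and read those lines; the WARN count is only there for reference.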
Point your browser to the URL:
http://server01:8080/pentaho/

I am new to the Pentaho world, and your Open Source BI Pentaho Installation Walkthrough is amazing and straight to the point.
I have been using the installation documents from the Pentaho site, and they make no sense to me any more.
Is it possible to post an installation walkthrough for Windows?
I was just curious why you chose to deploy Pentaho to MySql when OpenBravo is running in Postgres? Also since Open Bravo is deployed through Tomcat wouldn’t it be simpler to also use Pentaho’s Tomcat build?
@ Damian:
See this blog for Tomcat installation:
http://osbi.nl/2008/11/how-to-manually-install-pentaho-bi-server-linux/
HOW TO INSTALL PENTAHO 1.7 GA ON MYSQL WITH WINDOWS GUIDE?
Hi,
Thanks, it's very useful. But can you tell me how to install it on Windows? I have been trying to install it for the last week but it's not working. Requesting your help, please.
Your link where I can find the Mysql database creation script (http://source.pentaho.org/svnroot/pentaho-data/trunk/mysql5/SampleDataDump_MySql.sql) is not valid. Can you correct it please?
Thanks
I found the correct mysql script by looking around..
http://source.pentaho.org/svnroot/legacy/pentaho-data/trunk/mysql5, then right-click (windows client) and save as
Roy – thanks for the update. How about blogging here? I think you've got the talent!!
How do I do the same for a Windows server?
I would like to connect to the Pentaho server on Windows from a remote system.
Thanks.
Hi,
I am trying to figure out how to install Pentaho. I have the files copied to the remote server, but am unsure on what to do next. I am fairly new at this, so I do not know how to launch spoon.bat on a remote computer, etc.
Any guidance is appreciated!
Began trying this on Ubuntu Karmic. The only problem I ran into in getting the BI server started was that my JAVA_HOME variable was not set. That required editing the file /etc/bash.bashrc to set that variable.
For me (default install of java jdk from the synaptic manager), the location for java was /usr/bin/java. So I edited /etc/bash.bashrc to add the following at the end of the file:
JAVA_HOME=/usr/
export JAVA_HOME
PATH=$PATH:$JAVA_HOME/bin
export PATH
Then I restarted the machine and the BI Suite started right up! You can check the variable setting after the restart by running “echo $JAVA_HOME” in a terminal session.
Used this blog post to figure things out: http://www.zimbio.com/the+ubuntu+guy/articles/82/How+set+JAVA_HOME+environment+variable+Ubuntu
I’m a bit new to linux, so I wrote this up hoping it might help someone else who is as well.
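For anyone scripting this, here is a small sketch of how that JAVA_HOME value relates to the java path: JAVA_HOME is the directory two levels above the java binary, so that $JAVA_HOME/bin/java exists. `java_home_from_bin` is a hypothetical helper name, and the sketch does not resolve symlinks.

```shell
#!/bin/sh
# Hypothetical helper: given the full path of a java binary ($1),
# print the directory two levels up, i.e. the matching JAVA_HOME.
java_home_from_bin() {
    dirname "$(dirname "$1")"
}

# With the path mentioned above this prints /usr
java_home_from_bin /usr/bin/java
```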
An updated tutorial on how to install Pentaho Data Integration (also known as Kettle) on Ubuntu can be found here:
http://blog.foobaria.com/2010/05/install-pentaho-data-integration-aka.html
It can be useful for people looking only to install this great data migration tool, and not the full suite.