If you believe the hype, the business-intelligence tools offered by some of the world’s largest software companies also pack a substantial punch. But these systems are often difficult to install and maintain, not to mention downright expensive. Small and medium-sized businesses typically can’t afford software platforms that cost upwards of several hundred thousand dollars, but that doesn’t mean they’re cut off from BI tools in general. In fact, there are some decent open-source options. In this article, I’m going to examine a couple.
When I set out to write this review, my plan was to compare two of the big players in the open-source BI space, Pentaho and Jaspersoft. But while installing and evaluating the two, I noticed some striking similarities, as well as some common software between them. It turns out that both platforms rely on the same open-source tools under the hood, two in particular: Mondrian and JPivot (both available on SourceForge). JPivot provides the user interface for OLAP (Online Analytical Processing), while Mondrian supplies the actual OLAP analysis.
Note, however, that I’m only talking about the open-source variants of the products offered by these two companies. Both companies provide “community” versions based on open-source tools, as well as “enterprise” versions that cost actual money. Pentaho, which acquired another company that created commercial and closed-source BI tools, includes different engines in its open-source and premium software; if you use the open-source version of Pentaho, you’re denied access to the supposedly fancy BI tools available with the commercial product.
I’m going to take a good look at Jaspersoft, and then a brief look at Pentaho, since several of the features I describe in the former apply to the latter. I’m still not covering all the features in either product, because there just isn’t space. Instead, I want to focus on some factors important to small business owners, with thoughts about the following:
1. Price, price, price. This is why I chose open-source products for this run-through: small businesses simply aren’t in a position to give six figures to Oracle.
2. Ease of installation. If you read this, you’ll know why such things are important. And here’s something else to consider: When you go to the BI product website, do you know exactly what you’re supposed to install, or are you hit with a whole slew of products, with little idea of how they fit together? Is it clear whether you’re actually installing the open-source version as opposed to a trial version?
3. Ease of use, early on. Nobody wants to have to spend six weeks learning how to use a product. Small business entrepreneurs don’t have that kind of time.
4. Good, solid examples. Not just a couple of silly, small examples, but real examples to get you up to speed.
Right off the bat, I had a little trouble navigating the assortment of products offered by Jaspersoft. With a little help from Google and the Jaspersoft community pages, I eventually discovered community.jaspersoft.com, which advertised “Jaspersoft 5.0.” Clicking on that led me to a comparison page for the Community Edition and the Commercial Edition; the list for the Commercial Edition is much longer, and includes more charting, more data sources, and extended XML/A support. By this point, it was clear that the Community Edition is the open-source version.
The main community portal lists several other products: JasperReports Server, JasperReports Library, Jaspersoft ETL, Jaspersoft Studio, and iReport Designer. I clicked around to help me decide what to download and install:
Jaspersoft Studio is an Eclipse-based report designer. Cool. And I like Eclipse. But will the typical business professional be able to use it? I’m guessing probably not. (I don’t have room here to review it, but I’m going to try it out later; if you have experience with it, feel free to discuss it in the comments.)
JasperReports Server. The name suggests it’s the main server for the reporting engine. This is how these usually work: there’s a server running that does the number crunching and report generation, and other apps connect to that server. There’s also a particularly interesting option: If you’re a developer like me, you can embed reports, dashboards and analytics into your own programs.
iReport Designer. This tool apparently lets you design great reports in minutes using a graphical designer. As I found out later, it’s included when you download JasperReports Server—but again, this isn’t immediately clear on the website.
Jaspersoft ETL. ETL stands for Extract, Transform, Load. The website claims it’s easy to use and it can “extract data from your transactional system to create a consolidated data warehouse or data mart for reporting and analysis.”
JasperReports Library is another reporting engine for creating reports from within your Java applications. It’s a code library.
Obviously I needed to download the server first. Although the overall process wasn’t entirely clear up to this point, it still felt much more streamlined than similar install experiences with SAP and Oracle.
Installing Jaspersoft BI Suite
So I downloaded the server (there are downloads for Windows, Linux, and Mac, with 64-bit versions for all three, and 32-bit for Mac and Linux). Because so many small businesses automatically use Windows, that’s the one I’ll use here. After registering with the site, I proceeded to download the 420MB file.
The setup wizard started, and I chose the installation directory. I was asked whether I wanted to use the bundled Tomcat. Although I have a Tomcat somewhere on this server, I opted for the bundled one. Next came the same question for PostgreSQL: bundled or existing? Again, I used the bundled one.
Then it asked a few more questions, such as the Tomcat server port. I accepted the defaults, but my paranoia from experiences with SAP and Oracle told me I needed to make note of every value. Further, my experiences with IT stuff suggested I had better make sure those ports were really available. I checked with TCPView and, bingo, the ports were clear.
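The port check described above can also be scripted. What follows is a minimal sketch, not part of the Jaspersoft installer: the function name `port_is_free` is made up for illustration, and the port numbers 8080 and 5432 are only the usual Tomcat and PostgreSQL defaults, assumed here because the article does not record the values the wizard actually proposed.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts a TCP connection on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 only when a listener accepts the connection;
        # any other value (refused, timed out) means no service answered.
        return probe.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # Assumed defaults; replace with the values shown by the setup wizard.
    for name, port in (("Tomcat", 8080), ("PostgreSQL", 5432)):
        state = "free" if port_is_free(port) else "in use"
        print(f"{name} port {port}: {state}")
```

A probe like this only detects services that are listening at the moment it runs, so it complements a tool such as TCPView rather than replacing it.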
There was also the option for installing sample databases and reports. Definite yes. Next came an interesting option: Did I want to install iReport Designer? That was a separate download in the original download list; nowhere did the page indicate that I didn’t have to download that separately. I clicked Yes, and the setup began. It took a couple minutes, and I was finally presented with a message with the option to start the server and browse to the login page.
At this point I realized I had forgotten to follow the installation manual. Nonetheless, the installation worked, so score one for not needing the manual (but don’t quote me on that). Jaspersoft’s documentation lists the initial usernames and passwords for the superuser, administrator, sample end user, and sample end user with access to the sample databases. I tried the last one (username: “demo,” password: “demo”), and the system said it couldn’t let me log in. So I tried the jasperadmin username.
Just to recap: it took me a relatively short time to completely install and start using Jaspersoft BI. I had much better luck with this system than I did with Oracle and SAP, although the experience (most notably the wrong usernames and passwords) wasn’t without its hiccups.
Exploring the Jaspersoft Tools
As with other modern BI products, you access the main product through a Web browser. There’s a menu across the top, along with a list of folders on the left rail of the homepage for managing analysis components, content files, data sources, images, input data types, reports, system properties, and themes. There’s also a Manage menu for managing users, roles, and server settings. (Using the Manage Users page, I was able to see why my demo login didn’t work; there was no user called demo. Nor was there a user called superuser. The docs were incorrect.)
The user and role tools are easy to use and work just as you would expect them to. The other item in the Manage dropdown menu lets you configure various logs, as well as OLAP settings. These are low-level OLAP settings, such as disable memory caching, generate formatted SQL traces, query limit, result limit, and more.
I logged out and back in as the regular demo user (username: “joeuser,” password: “joeuser”). As it turns out, this login has access to the sample databases: the tree includes an item for the sample reports, and I was able to open them up. They provide limited drill-down features: for example, there’s a sample report listing sales employees, which includes an “Accounts” column. Under that column is a “view” link that takes you to a report listing the accounts under that sales employee. The reports include pagination, and in the upper-left is a navigation bar for the pages. On the right is a “back” button, which performed as hoped: I first looked at the employee list, then clicked on one employee’s accounts and started paging through the lists contained therein.
Next to the back button is a way to export the reports in various formats, including PDF, Excel, Excel with pagination, CSV, DOCX, RTF, Flash, ODT, ODS, Excel 2007, and Excel 2007 with pagination. (It’s nice to see the OpenOffice formats, ODT and ODS.) The various downloaded reports all look pretty much the same, whether spreadsheet or word-processing document. The spreadsheets don’t have any functionality (although I’m not sure it would make sense to include any). Let’s move on to OLAP.
The installation for Pentaho wasn’t quite as clear as Jaspersoft’s, but it was just as easy. I had to download the server as a .zip file, make sure I had Java installed, point my JAVA_HOME environment variable to the correct path, and run a batch file to launch it. Done.
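Before running the launch script, it can help to confirm that the Java Home variable (conventionally named JAVA_HOME) really points at a Java installation. The check below is a rough sketch under that assumption; `find_java` is a hypothetical helper written for this article, not something Pentaho ships, and it only looks for a `java` launcher in the `bin` subdirectory.

```python
import os
from pathlib import Path


def find_java(java_home):
    """Return the path of the java launcher under java_home, or None."""
    if not java_home:
        return None
    bin_dir = Path(java_home) / "bin"
    # Windows installs ship java.exe; Linux and Mac ship a plain "java".
    for name in ("java.exe", "java"):
        candidate = bin_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    launcher = find_java(os.environ.get("JAVA_HOME"))
    if launcher is None:
        print("JAVA_HOME is unset or does not contain bin/java")
    else:
        print(f"Java launcher found at {launcher}")
```

If the check fails, fixing the variable first is usually quicker than deciphering whatever error the batch file produces.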
OLAP in Both Jaspersoft and Pentaho
At the heart of any good BI tool is OLAP. The OLAP tools are an optional component in Jaspersoft BI Suite. This makes sense; even though OLAP is incredibly powerful, some small-business people might want access to basic reports without engaging in sophisticated analysis.
Here’s where the commonality exists between Jaspersoft and the other product, Pentaho: both use the same visual interface, JPivot, and the same engine, Mondrian. So what I’m about to describe actually applies to both Jaspersoft and Pentaho.
The OLAP tools let you create ad-hoc views, reports, and views driven by MDX queries. (MDX stands for Multidimensional Expressions, and refers to a query language originally created by Microsoft, but now considered a standard by many OLAP vendors.) For the ad-hoc views, you can perform your OLAP searches and analysis right inside the web browser. It’s quite easy to use; you just drill down as you go along. The following figure shows me doing just that: first with products, then drinks, then alcoholic beverages (always a favorite subject). Meanwhile, the right-hand side updates accordingly, showing the measures (in this case unit sales, store cost, and store sales):
Drilling down and exploring data is actually very easy, and most business people could probably get the hang of this in a relatively short time. You can click on the totals for a popup window showing detail data points in grid form. For example, clicking on the “Beer and Wine” item displays a grid listing all the measures, along with all the dimensions pointing to each measure. If you wanted to zoom into individual-detail data, you could not only see a particular customer buying 3 units of Y alcohol at X store; you’d also be able to tell, based on his shopper-card data, that he was a single male making between $30k and $50k a year.
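For readers who want a feel for what a drill-down computes, here is a toy sketch in plain Python. It is not Mondrian or JPivot code. The category names echo the walkthrough above, but the hierarchy layout, the `drill_down` helper, and every number in `FACTS` are invented for illustration.

```python
from collections import defaultdict

# Invented fact rows: (product path, unit sales, store cost, store sales).
# The figures are illustrative only, not taken from the sample database.
FACTS = [
    (("Drink", "Alcoholic Beverages", "Beer and Wine"), 3, 4.50, 11.25),
    (("Drink", "Alcoholic Beverages", "Beer and Wine"), 2, 3.00, 7.50),
    (("Drink", "Beverages", "Hot Beverages"), 4, 2.00, 6.00),
    (("Food", "Snacks", "Candy"), 5, 1.25, 4.00),
]


def drill_down(facts, path=()):
    """Sum the measures for each child of `path` in the product hierarchy."""
    path = tuple(path)
    depth = len(path)
    totals = defaultdict(lambda: [0, 0.0, 0.0])
    for product, units, cost, sales in facts:
        # Skip rows outside the selected branch or with no deeper level.
        if product[:depth] != path or len(product) <= depth:
            continue
        child = totals[product[depth]]
        child[0] += units
        child[1] += cost
        child[2] += sales
    return {name: tuple(values) for name, values in totals.items()}


if __name__ == "__main__":
    for path in ((), ("Drink",), ("Drink", "Alcoholic Beverages")):
        print(path or "(all products)", drill_down(FACTS, path))
```

Each click in the browser corresponds roughly to calling `drill_down` with one more level appended to the path; a real OLAP engine does the same kind of aggregation against a database, with caching and many dimensions at once.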
You can also save your queries and open them up later. Advanced users (including techies) can display and edit the MDX query that runs behind the scenes.
Once you drill down to where you need to be, you have the option of displaying the data in chart form. You can customize the charts in various ways; they’re interactive, so you can float your mouse over a bar and see the actual value, and a full description of the dimensions for the value.
In my previous article, I pretty much hit a brick wall trying to get answers to my problems. Here, it was different. I did encounter a problem when I wanted to modify the OLAP cube for a data sample, and saw a massive Java exception output appear in the browser instead. (That’s really not good; we all know non-techie people freak when they see that kind of thing.) In that particular case, I Googled the exception name along with the Jaspersoft site and quickly found somebody with a solution (it involved opening up an XML file and commenting out a line from it).
Another Quick Word About Pentaho
Since Pentaho uses the same JPivot tool, the OLAP drill-down interface is the same; and since it uses the same engine, you get the same values. But there’s one difference: Because Pentaho now offers what they feel is a better interface and engine in their commercial product, they make it very clear that the tools you’re using here are “legacy” tools. In a couple of places, they display helpful messages along the lines of, “Web Ad Hoc Query and Reporting has been replaced by the new Interactive Reporting client. It is provided as a convenience but will no longer be enhanced or officially supported by Pentaho.”
There’s a similar message regarding JPivot, and these messages appear in annoying places. Further, the menus for reporting and OLAP tools have the word “Legacy” stamped after them. In order to access those new and improved tools, you need to upgrade to the commercial version. (I don’t know about you, but I call that sort of thing advertising.) That makes the whole package feel less like an open-source platform, and more like a trial product.
People have argued over whether Pentaho’s offering is truly open-source, but one thing is clear: Those messages are annoying, and you really don’t want your clients seeing the messages plastered all over, do you? It turns out the messages are just embedded in the JSP pages, and it’s easy to remove them—here’s one such helpful page. Aside from that, the rest of Pentaho boasts a nice interface and behaves very similarly to Jaspersoft.
These are good tools for small businesses (annoying ads in Pentaho’s platform aside). I did some really nice drill-downs, and I was able to modify the cubes and add in additional data points. The samples that come with Jaspersoft are serviceable, and while the support documents contain a few errors, they weren’t deal-killers. Personally speaking, I felt Jaspersoft’s interface was a bit easier to use.