My Profile Photo

Cooking Analytics


My name is Miguel Gaspar and the blog purpose is to provide information on the value and how to build IoT and Big Data solutions, providing business, architectural and technical guidance that can be useful to developers, architects and top level business managers.


IoT on Pentaho using MQTT - New MQTT steps for PDI are available in Marketplace

Earlier this year I have made some posts and provided some samples on how we could stream data from the IoT devices and displaying the info on a dashboard, using Pentaho Data Integration and MQTT protocol.

First blog post

Second blog post

Third blog post

I am still missing some blog posts on how to scale a solution on this and how to use D3.js as the chart library, among other updates. The good news is that Pentaho Labs announced the release of MQTT steps, even if not yet supported. I guess it is just a matter of time now.

The steps are ready to be used and available in marketplace.

Here the link where Pentaho labs is announcing it.

MQTT on PDI

You can watch the video with the updates.

Samples are available here

Please keep in mind that the most interesting part is not the live dashboard, it is the data streaming capabilities on PDI. You can nos make use of the new steps to be closer to IoT and possible to use Data Integration for the ingestion of data coming from devices. This also makes clear tht might be possible to use other protocols.

Main and row processing transformations

One clear advantage is that you don’t need to write all your code, you can build a gateway using PDI and without writing one line of code. Also look at the samples, you will notice we can switch the data processing transformation we want to use, and you can do it using a parameters. Also from the image above you can see that you are able to group rows based on number of rows or time.

Also notice that the case, based on the values opf a parameter named “rowsGroupingType” will decide if we need to go and process rows based of the number, based on time, or not group them at all.

Usefull also if you write to a permanent storage, like Cassandra/HBase or other depending on golas and requirments.

Twitter LinkedIn Facebook Google+