A word of caution when it comes to performance with BigInsights

Even though IBM BigInsights is a great platform for handling large volumes of data in different formats, connecting to Hive from IBM Cognos Business Intelligence deserves a word of caution. IBM clearly states the following:

Quote:

The architecture of Hive is better suited for large query processing than handling many light queries because the set up time for a query is significantly higher compared to the most popular RDBMS. Therefore, consider scheduling pre-authored reports when ad-hoc analysis is not absolutely necessary. Scheduling reports in batches will benefit from Cognos Business Intelligence query service's cache management system, minimizing the amount of data that is fetched from BigInsights and significantly reducing report execution times.
If interactivity is required on large volumes of data, consider whether the analysis requirements can be met by scheduling Cognos Active Reports. If ad-hoc analysis is required, educate business users on how to perform multiple operations with a single drag-and-drop gesture. (Modelers should consider adding stand-alone filters and calculations to the package so that multi-click operations to define the filters/calculations can be avoided.)
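
To make the point about set-up time concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from IBM's documentation: the function name and all the timings are made-up, illustrative assumptions. It only shows why a fixed per-query set-up cost hurts many light queries far more than one large batched query.

```python
def total_seconds(num_queries, setup_s, work_s_per_query):
    """Total wall time when every query pays the same fixed set-up cost."""
    return num_queries * (setup_s + work_s_per_query)


# Hypothetical timings, chosen only for illustration (not measured values).
HIVE_SETUP_S = 20.0    # assumed set-up time per Hive query
RDBMS_SETUP_S = 0.25   # assumed set-up time per query on a typical RDBMS
LIGHT_WORK_S = 0.5     # assumed actual work per light query

# Ten light ad-hoc queries, each paying the set-up cost.
hive_light = total_seconds(10, HIVE_SETUP_S, LIGHT_WORK_S)
rdbms_light = total_seconds(10, RDBMS_SETUP_S, LIGHT_WORK_S)

# The same amount of work done as one scheduled, batched query.
hive_batch = total_seconds(1, HIVE_SETUP_S, 10 * LIGHT_WORK_S)

print(hive_light)   # 205.0
print(rdbms_light)  # 7.5
print(hive_batch)   # 25.0
```

With these assumed numbers, the ten light queries spend almost all of their time on set-up, while the single batched query pays that cost only once. Real figures will depend on the cluster and the data.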

What to conclude? Ad-hoc reporting is a big no-no. You shouldn't expect blazing performance when reporting directly on top of Hive, because every query pays a significant set-up cost. That's especially true when only a limited number of nodes are serving the data. Luckily, IBM Cognos BI offers a range of options to work around this, such as scheduled reports, caching and Active Reports, as described in the article at the URL below.