BI systems are undoubtedly complex and costly to set up. As a result, many projects are not completed as planned, and those that are often become irrelevant soon afterwards. Many organizations will even tell you they have no BI, in spite of multi-million dollar investments in BI solutions.
The traditional BI model relies on a dedicated data warehouse (DWH) built on a relational database (usually SQL Server, Oracle or something similar) consisting of many tables and indices. A schema, often 'Star' or 'Snowflake', is used because of technological restrictions on volume and index size. Working around these restrictions in this way improves query performance. On the other hand, it makes the ETL process increasingly slow and complex, since each row and field of incoming data has to be scanned.
Even if this hurdle is overcome, these kinds of databases were not designed to handle volumes much above 1 TB. In today's data-hungry setting, this means setting up hundreds of partitions, aggregation tables, pre-calculated tables, allocations to different databases, OLAP and other workarounds. An already expensive project therefore becomes bloated in size and cost.
But is there an alternative to this unwieldy process? Golan Nachum, CEO of Twingo, writes that there is. Even for businesses with as little as 0.5 TB of data, 'Big Data' can be an appropriate solution. With a column-based data warehouse, tables can be flattened and the split into fact and dimension tables can be skipped. This means there is no need for aggregation, calculated tables, partitions, or any of the workarounds described above. The ETL process becomes as simple as just loading files.
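To make the "just loading files" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the setup the article describes: the table name `sales_flat`, its columns, and the sample rows are all hypothetical, and SQLite (a row-oriented database from the standard library) stands in for a column-based warehouse purely so the example runs anywhere. The point is only the shape of the process: each incoming row already carries its dimension attributes, so loading is a plain append with no key lookups against dimension tables.

```python
import csv
import io
import sqlite3

# Hypothetical incoming file: one wide row per sale, with the dimension
# attributes (region, product) stored inline instead of as foreign keys.
INCOMING_CSV = """sale_id,sale_date,store_region,product_name,amount
1,2024-01-05,North,Widget,20.0
2,2024-01-05,South,Gadget,45.0
3,2024-01-06,North,Gadget,45.0
"""


def load_flat_file(conn, text):
    """Append every row of a CSV document to the flat table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_flat ("
        "sale_id INTEGER, sale_date TEXT, store_region TEXT, "
        "product_name TEXT, amount REAL)"
    )
    rows = [
        (
            int(r["sale_id"]),
            r["sale_date"],
            r["store_region"],
            r["product_name"],
            float(r["amount"]),
        )
        for r in csv.DictReader(io.StringIO(text))
    ]
    # No surrogate-key lookups and no aggregate tables to refresh: just append.
    conn.executemany("INSERT INTO sales_flat VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# SQLite is only a stand-in here; it is NOT a column-based database.
conn = sqlite3.connect(":memory:")
loaded = load_flat_file(conn, INCOMING_CSV)

# Aggregation happens at query time rather than in a pre-calculated table.
total_by_region = dict(
    conn.execute(
        "SELECT store_region, SUM(amount) FROM sales_flat GROUP BY store_region"
    )
)
```

In this sketch the totals per region are computed when the question is asked. Whether that is fast enough at real volumes depends on the database engine, which is where the article's case for column-based storage comes in.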
Modern BI is designed to load massive volumes of data every minute, so you don't need to wait for nightly runs. The system focuses on scalability on low-cost hardware, continuity, data redundancy, and query performance that is up to 1,000 times faster. Because the ETL is rapid, short, simple and efficient, fields and dimensions can be easily modified and added.
The BI tool connects directly to the database and each query is run as it is needed, with very rapid turnaround (typically less than a second). Any good BI tool that can work with a column-based database is well suited to performing simple, efficient queries. For dimensions, simply use a DISTINCT operator on the relevant column of the flattened table, with response times as quick as one-hundredth of a second.
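As a rough illustration of deriving a dimension with a distinct operator, the sketch below runs `SELECT DISTINCT` against a flattened table. Everything named here is an assumption made for the example: the `sales_flat` table, its columns, the sample rows, and the helper `dimension_values`. SQLite is again used only as a runnable stand-in and says nothing about the response times a column-based engine would give.

```python
import sqlite3

# Hypothetical flattened table: dimension attributes live on every row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_flat ("
    "sale_id INTEGER, store_region TEXT, product_name TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_flat VALUES (?, ?, ?, ?)",
    [
        (1, "North", "Widget", 20.0),
        (2, "South", "Gadget", 45.0),
        (3, "North", "Gadget", 45.0),
    ],
)

# Columns that may be used as dimensions in this sketch. Checking against a
# fixed set keeps arbitrary text out of the SQL string built below.
DIMENSION_COLUMNS = {"store_region", "product_name"}


def dimension_values(conn, column):
    """Return the sorted distinct values of one column of the flat table."""
    if column not in DIMENSION_COLUMNS:
        raise ValueError(f"unknown dimension column: {column}")
    query = f"SELECT DISTINCT {column} FROM sales_flat ORDER BY {column}"
    return [row[0] for row in conn.execute(query)]


regions = dimension_values(conn, "store_region")
products = dimension_values(conn, "product_name")
```

A BI tool would typically issue a query of this shape to populate a filter or drop-down list, so no separate dimension table has to be maintained.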
Indeed, any type of query or business enquiry is possible. This includes ad-hoc queries, self-service BI, drill-down analysis and anything else an organization might need. A simple system with no performance issues means there is no longer any need for an army of DBAs and BI staff.