In part I, we argued in favor of using Superset as a free and open source solution. Make sure to check it out beforehand to understand our dedication to and excitement about the project.

We set out to showcase how Superset can consume data from a centralized data store such as Snowflake and how it fits into a stack of promising technologies.

The Promotions data mart followed a star schema approach, where the fact table consisted of approximately 5 million rows and 47 columns. In addition, we set up a wide table in the modern analytical style by flattening the star schema into one joined table, to test the performance of Superset.
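To make the flattening step concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Snowflake. The table names, column names, and rows are hypothetical and only illustrate the idea: every fact row is joined once to its dimensions and written to a single wide table.

```python
import sqlite3

# Hypothetical miniature star schema; all names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, store_city  TEXT);
CREATE TABLE fact_promotion (
    promotion_id INTEGER PRIMARY KEY,
    product_id   INTEGER REFERENCES dim_product(product_id),
    store_id     INTEGER REFERENCES dim_store(store_id),
    discount_pct REAL
);
INSERT INTO dim_product VALUES (1, 'Coffee'), (2, 'Tea');
INSERT INTO dim_store   VALUES (10, 'Budapest'), (20, 'Vienna');
INSERT INTO fact_promotion VALUES
    (100, 1, 10, 15.0),
    (101, 2, 10, 10.0),
    (102, 1, 20, 20.0);

-- Flatten: one row per fact row, with the dimension attributes joined in.
CREATE TABLE promotion_wide AS
SELECT f.promotion_id, f.discount_pct, p.product_name, s.store_city
FROM fact_promotion AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_store   AS s ON s.store_id   = f.store_id;
""")

wide_rows = conn.execute(
    "SELECT promotion_id, product_name, store_city, discount_pct "
    "FROM promotion_wide ORDER BY promotion_id"
).fetchall()
fact_count = conn.execute("SELECT COUNT(*) FROM fact_promotion").fetchone()[0]
```

The wide table keeps the grain of the fact table, so a chart can read from it without any joins at query time.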

First, according to the official documentation, Superset is not officially supported on Windows. Thus, Windows users can only test Superset locally through an Ubuntu Desktop VM or using WSL(2). The first option probably works, but it is not efficient from our standpoint. We tried the second option several times but ran into issues that were either unknown or known but unsolved. We eventually managed to start Superset locally by starting the Docker Hub image step by step (instead of using docker-compose), but we suggest you avoid installing it on Windows. That said, we hope that sooner or later it will be natively supported. It is also important to highlight as a footnote that we self-hosted these

Second, it is advisable to index (or cluster) the source tables (or materialized views) behind the visualizations, scale the underlying Snowflake virtual warehouse both vertically and horizontally, and take advantage of micro-partition pruning (or dimensionality reduction). Otherwise, slice and dashboard queries tend to time out due to concurrency, especially if filters are applied to multiple slices on the dashboard. This can be verified by investigating the execution plan of each chart in Snowflake’s query profile section and checking whether table scanning consumes the lion’s share of the resources. Since queries can be saved and also traced back in Superset, we can always reuse previous queries. Note that benchmarking the query performance of Superset on Snowflake was not our top priority, although we got a general sense of it during our work.
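The check described above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys, the operator names, and all numbers are assumptions made up for the example, not the exact shape of Snowflake's query profile output, so the figures would have to be copied in from the profile by hand or by your own tooling.

```python
def table_scan_share(operators):
    """Return the fraction of total execution time spent in table scans."""
    total = sum(op["time_pct"] for op in operators)
    if total == 0:
        return 0.0
    scan = sum(op["time_pct"] for op in operators if op["operator"] == "TableScan")
    return scan / total


def pruning_ratio(partitions_scanned, partitions_total):
    """Return the fraction of partitions skipped; higher means better pruning."""
    if partitions_total == 0:
        return 0.0
    return 1 - partitions_scanned / partitions_total


# Hypothetical profile of one chart query (made-up numbers).
profile = [
    {"operator": "TableScan", "time_pct": 72.0},
    {"operator": "Join", "time_pct": 18.0},
    {"operator": "Aggregate", "time_pct": 10.0},
]
share = table_scan_share(profile)
skipped = pruning_ratio(partitions_scanned=950, partitions_total=1000)
```

A high scan share combined with a low share of skipped partitions, as in this made-up case, would suggest that clustering the source table on the filtered columns is worth trying before resizing the warehouse.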

Third, Superset stores the dashboard components (metadata and slice configurations) in its dedicated database, so we decided to manually store and transfer dashboards between instances rather than mounting the database on a host file and gluing together every piece of a dashboard. As mentioned in part I, the Superset community is very close to a pull workflow solution where you can work with YAML files from dashboards through the API. According to our understanding, Superset supports exporting individual dashboards with a CLI command, but we now also feel the urge to develop a bulk export option.
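As a starting point for a bulk export, the sketch below only builds a request URL and sends nothing. It assumes that your Superset version exposes the REST endpoint `/api/v1/dashboard/export/` with a Rison-encoded list of dashboard IDs in the `q` parameter, which should be verified against the API documentation of your own instance. The helper name, the host, and the IDs are hypothetical.

```python
def dashboard_export_url(base_url, dashboard_ids):
    """Build an export URL covering several dashboards (hypothetical helper)."""
    if not dashboard_ids:
        raise ValueError("at least one dashboard id is required")
    # Rison encodes a list of integers as !(1,2,3).
    rison_ids = "!(" + ",".join(str(int(i)) for i in dashboard_ids) + ")"
    # Assumed endpoint; check the API docs of your Superset version.
    return f"{base_url.rstrip('/')}/api/v1/dashboard/export/?q={rison_ids}"


# Assumed local instance and made-up dashboard IDs.
url = dashboard_export_url("http://localhost:8088", [1, 2, 3])
```

An actual request would also need an authenticated session or access token, which is left out here on purpose.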

Commercial BI tools had ruled the market for years before cost-friendly open source candidates gradually began to appear. Among these, Superset is considered one of the most exciting projects, and it is certainly worth keeping an eye on.

We exposed Superset to a stack of favorable technologies, notably dbt, which is also an important technology at Hiflylabs. From our experience, Superset is easier to learn than its feature-loaded counterparts, and we must not skip over the beauty and simplicity of its visualizations either. We are committed to both the dbt and Superset open source projects and look forward to expanding our client base by offering services on both in the foreseeable future.

As there have been cases of Superset slowly overtaking expensive BI tools, we also hope to contribute to the initiative to cut costs and to apply our expertise in supercharging Superset. Even though using Superset has downsides owing to its recent graduation from the Apache Incubator, we believe that the strong community and the committers behind the project can bring the product to a quality that matches its potential.

Now that you have seen Apache Superset work together with products such as Snowflake and dbt, what are your impressions of it being the "chosen one" among the free and open source solutions? Do you see any possibility of Superset establishing a noticeable share of the overheated data visualization market? Let us know in the comment section below!

Business intelligence (BI) is defined in many ways. According to the earliest definition (1958), business intelligence is defined as “the ability to grasp the interrelationships of presented facts in such a way as to guide action toward a desired goal.”

A broader and perhaps more current definition of the discipline is this: Business intelligence is the process of collecting business data and turning it into information that is meaningful and actionable toward a strategic goal. Or even more simply, BI is the effective use of data and information to make sound business decisions. Although it may not sound like it, BI is different from analytics.

Reporting and analysis are the central building blocks of business intelligence, and the arena in which most BI vendors compete by adding and refining features to their solutions.

The raw material of business intelligence is the data that records the daily transactions of an organization. Data may come from activities such as interactions with customers, management of employees, day-to-day operations or financial management. According to the traditional model, daily transaction data is recorded in three main transactional databases: CRM (customer relationship management), HRM (human resource management) and ERP (enterprise resource planning). For example, a sales transaction would be recorded and stored as a piece of data in the CRM database.

A piece of data, by itself, is neutral, that is, neither “good” nor “bad.” For example, if you knew that Rep. X received Y dollars’ worth of orders year-to-date, you wouldn’t necessarily know whether this is a cause for panic or celebration.

Just like raw material, data must be processed through analysis to become meaningful. The piece of data in the example above would become meaningful if, for example, it were compared to the year-to-date sales target for Rep. X. Through that comparison, the piece of data becomes part of the process of analysis.

Analyzing data means asking it questions and getting meaningful answers. For example, the simple command “sort in descending order” on a column of data in Excel representing year-to-date orders taken by sales reps would answer the questions “Who takes the most orders? The fewest orders?” The sort command has contextualized the data, making it much more meaningful in terms of the strategic goals of the business.
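The same two questions can be asked outside Excel as well. The short Python sketch below sorts year-to-date orders in descending order and then compares each figure with a target. The rep names, order values, and targets are made up for illustration.

```python
# Hypothetical year-to-date figures; names and numbers are made up.
ytd_orders = {"Rep. A": 120_000, "Rep. B": 95_000, "Rep. C": 143_000}
ytd_targets = {"Rep. A": 110_000, "Rep. B": 100_000, "Rep. C": 140_000}

# "Sort in descending order": who takes the most orders, who the fewest?
ranking = sorted(ytd_orders.items(), key=lambda item: item[1], reverse=True)

# Compare each figure with its target to give the raw number some context.
attainment = {rep: ytd_orders[rep] / ytd_targets[rep] for rep in ytd_orders}

for rep, orders in ranking:
    status = "above" if attainment[rep] >= 1 else "below"
    print(f"{rep}: {orders} ({attainment[rep]:.0%} of target, {status} target)")
```

The sort answers who leads and who trails, while the comparison with the target tells you whether a given number is a cause for panic or celebration.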

Of course, analysis in BI is much more complex and varied than that. The powerful and interactive analytical tools of today’s better business intelligence solutions make it easier to ask data an increasing number of questions and get meaningful answers, including “what-if” scenarios, multidimensional slicing and dicing (OLAP analysis), mashing up of data with geographic mapping and much more.

In any case, the goal of even the most sophisticated analytical features is always the same: to allow decision-makers to understand data, to find patterns among numbers, and to identify trends and the reasons behind them. Put simply, the goal is to contextualize data and answer questions about it.

Interestingly, most BI projects fail not because of faulty technical implementation, but because of a lack of strategic focus. Business intelligence should be a lever that enables a company to “lift” itself

