How to set MicroStrategy to use SQL Server's minimally logged INSERT for the intermediate temporary tables

This is going to be a discussion around some specific VLDB settings which once enabled, can provide flexible control over how the temporary tables are used in fulfilling a report request. As you've probably noticed by now, for some types of data merging (usually when a  report has metrics coming from separate fact tables), MicroStrategy uses multiple passes to fulfil it. Those intermediate passes will then be merged in a last pass (if metric join type is set to outer then the last pass is a full outer join). In the case of SQL Server database instance, the intermediate passes will store the data into true temporary tables, as stated by the "Tables -> Intermediate Table Type" VLDB option.

During the report execution, the temporary tables are created under the tempdb database using SELECT INTO statement. This is the fastest way to create and populate a table in SQL Server, mainly because it's not a logged operation like INSERT and DELETE.

We're constantly looking for ways to improve performance, so we soon discovered the parallel execution of the report passes. This is controlled by the project level VLDB option "Query Optimizations -> Parallel Query Execution". This option helped us with costly reports where multiple fact tables metrics are merged into the same report. The advantage is that MSTR will execute some of the passes in parallel, reducing the overall time necessary to fulfill the report.

But there's a catch. Enabling this, had an impact on how the temporary tables behave and for a good reason. As the temporary tables are generated in parallel, on separate SPIDs, they need to be visible between connections so that they can be merged in the last pass. That's why, after enabling this VLDB option, the temporary tables are materialized on the default database setup in the ODBC DSN. This does come with an additional headache, as "SELECT INTO" puts an exclusive lock on the systems tables, blocking other processes (ETL and other MSTR reports). The lock can stay there for a while, from seconds to minutes, depending on how heavy those report passes are.

In order to take advantage of all the VLDB setup already performed, we had to find a way to prevent MSTR processes from blocking each other or the ETL. The only viable option is to use an operation that would not require system tables locking and that is an INSERT. By default, SQL Server performs a logged insert which is slower than a SELECT INTO, but in certain conditions a minimally-logged INSERT (MLI) operation can be performed which reduces a lot the performance gap. For the requirements of MLI, checkout this link.

This being said, we identified a set of VLDB options to achieve the minimally-logged insert:
  1. First step is to switch from SELECT INTO to INSERT INTO and the VLDB setting for that is "Tables -> Table Creation Type -> Explicit Table"
  2. In the second step we have to force a table hint "WITH (TABLOCK)" at the insert statement level as per MLI requirement using "Tables/Insert -> Insert Table Option"

After the 2 VLDB changes, MSTR will create the temporary tables through a DDL statement (CREATE TABLE) using the definition of the attribute alias and then perform an INSERT INTO operation with the hint we predefined.

It's true that we do not have the same level of performance offered by SELECT INTO, but eliminating the locks using INSERT INTO, balanced the execution of the reports, so the overall user experience has been improved.

Comments

  1. Thank you for sharing such a nice and interesting blog and really very helpful article. Mulesoft is the Most Widely Used Integration Platform. If you want to become Mulesoft Certified Developer, attend this Best Mulesoft Training Course offered by the Unogeeks (Top Mulesoft Training Institute)

    ReplyDelete

Post a Comment

Popular posts from this blog

The count distinct adventure - Part I

The surrogate key strikes back

The count distinct adventure - Part II