On-Premises Data Gateway FAQ

As of Oct 5th 2017 Reference article

General

Question: What is the actual Windows service called?
Answer: In Services, the gateway is listed as On-premises data gateway service.

Question: What are the requirements for the gateway?
Answer: Take a look at the requirements section of the main gateway article.

Question: What data sources are supported with the gateway?
Answer: See the data sources table in the main gateway article.

Question: Do I need a gateway for cloud data sources like Azure SQL Database?
Answer: No! The service will be able to connect to that data source without a gateway.

Question: Are there any inbound connections to the gateway from the cloud?
Answer: No. The gateway uses outbound connections to Azure Service Bus.

Question: What if I block outbound connections? What do I need to open?
Answer: See the list of ports and hosts that the gateway uses.

Question: Does the gateway have to be installed on the same machine as the data source?
Answer: No. The gateway will connect to the data source using the connection information that was provided. Think of the gateway as a client application in this sense. It will just need to be able to connect to the server name that was provided.

Question: What is the latency for running queries to a data source from the gateway? What is the best architecture?
Answer: It is recommended to have the gateway as close to the data source as possible to avoid network latency. If you can install the gateway on the same machine as the data source, it will minimize the latency introduced. Consider the data centers as well. For example, if your service is making use of the West US data center, and you have SQL Server hosted in an Azure VM, you will want to have the Azure VM in West US as well. This will minimize latency and avoid egress charges on the Azure VM.

Question: Are there any requirements for network bandwidth?
Answer: It is recommended to have good throughput for your network connection. Every environment is different and this is also dependent on the amount of data being sent. Using ExpressRoute could help to guarantee a level of throughput between on-premises and the Azure data centers.

You can use the third-party Azure Speed Test app to help gauge what your throughput is.

Question: Can the gateway Windows service run with an Azure Active Directory account?
Answer: No. The Windows service needs to have a valid Windows account. By default it will run with the Service SID, NT SERVICE\PBIEgwService.

Question: How are results sent back to the cloud?
Answer: This is done by way of the Azure Service Bus. For more information, see how it works.

Question: Where are my credentials stored?
Answer: The credentials you enter for a data source are stored encrypted in the gateway cloud service. The credentials are decrypted at the gateway on-premises.

Question: Can I place the gateway in a perimeter network (also known as DMZ, demilitarized zone, and screened subnet)?
Answer: The gateway requires connectivity to the data source. If the data source is not accessible from your perimeter network, the gateway may not be able to connect to it. For example, if your SQL Server is not in the perimeter network and cannot be reached from it, a gateway placed in the perimeter network would not be able to reach the SQL Server.

Question: Is it possible to force the gateway to use HTTPS traffic with Azure Service Bus instead of TCP?
Answer: Yes, although this will greatly reduce performance. Edit the Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config file, which is located by default at C:\Program Files\On-premises data gateway, and change the value of the ServiceBusSystemConnectivityModeString setting from AutoDetect to Https:

<setting name="ServiceBusSystemConnectivityModeString" serializeAs="String">
    <value>Https</value>
</setting>

Question: Do I need to whitelist the Azure Datacenter IP list? Where do I get the list?
Answer: If you are blocking outbound IP traffic, you may need to whitelist the Azure Datacenter IP list. Currently, the gateway communicates with Azure Service Bus using the IP address in addition to the fully qualified domain name. The Azure Datacenter IP list is updated weekly. You can download the Microsoft Azure Datacenter IP list.

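When outbound traffic is filtered by IP, it can help to check whether a given address falls inside the ranges you have allowed. The following is a minimal Python sketch of that check. The function name and the sample ranges are hypothetical (they come from the reserved documentation address space, not from the Azure Datacenter IP list), and the sketch assumes you have already extracted the ranges as CIDR strings.

```python
import ipaddress


def is_allowed(address, allowed_ranges):
    """Return True if the address falls inside any of the given CIDR ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


# Hypothetical sample ranges (documentation address space, not real Azure ranges).
EXAMPLE_RANGES = ["203.0.113.0/24", "198.51.100.0/25"]

if __name__ == "__main__":
    print(is_allowed("203.0.113.17", EXAMPLE_RANGES))    # inside the first range
    print(is_allowed("198.51.100.200", EXAMPLE_RANGES))  # outside both ranges
```

Because the list changes over time, a check like this is only as current as the ranges you feed it.
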
High Availability/Disaster Recovery

Question: Are there any plans for enabling high availability scenarios with the gateway?
Answer: Yes, this is an area of active investment for the Power BI team. Please stay tuned to the Power BI blog for further updates about this feature.

Question: What options are available for disaster recovery?
Answer: You can use the recovery key to restore or move a gateway. When you install the gateway, supply the recovery key.

Question: What is the benefit of the recovery key?
Answer: It provides a way to migrate or recover your gateway settings. This is also used for disaster recovery.

Troubleshooting

Question: Where are the gateway logs located?
Answer: See the tools section of the troubleshooting article.

Question: How can I see what queries are being sent to the on-premises data source?
Answer: You can enable query tracing. This will include the queries being sent. Remember to change it back to the original value when done troubleshooting. Having query tracing enabled will cause the logs to be larger.

You can also look at tools your data source has for tracing queries. For example, for SQL Server and Analysis Services you can use Extended Events or SQL Profiler.

Analysis Services

Question: Can I use msdmpump.dll to create custom effective username mappings for Analysis Services?
Answer: No. This is not supported at this time.

Question: Can I use the gateway to connect to a multidimensional (OLAP) instance?
Answer: Yes! The On-Premises Data Gateway supports live connections to both Analysis Services Tabular and Multidimensional models.

Question: What if I install the gateway on a computer in a different domain from my on-premises server that uses Windows authentication?
Answer: No guarantees here. It all depends on the trust relationship between the two domains. If the two different domains are in a trusted domain model, then the gateway might be able to connect to the Analysis Services server and the effective user name can be resolved. If not, you may encounter a login failure.

Question: How can I find out what effective username is being passed to my on-premises Analysis Services server?
Answer: We answer this in the troubleshooting article.

Question: I have 25 databases in Analysis Services, is there a way to have them all enabled for the gateway at once?
Answer: No. This is on the roadmap, but we don’t have a timeframe.

Administration

Question: Can I have more than one admin for a gateway?
Answer: Yes! When you manage a gateway, you can go to the administrator’s tab to add additional admins.

Question: Does the gateway admin need to be an admin on the machine where the gateway is installed?
Answer: No. The gateway admin is used to manage the gateway from within the service.

Question: Can I prevent users in my organization from creating a gateway?
Answer: No. This is on the roadmap, but we don’t have a timeframe.

Question: Can I get usage and statistics information of the gateways in my organization?
Answer: No. This is on the roadmap, but we don’t have a timeframe.

Power BI

Question: Do I need to upgrade the personal gateway?
Answer: No, you can keep using the personal gateway for Power BI.

Question: How often are tiles in a dashboard, in Power BI, refreshed when connected through the On-Premises Data Gateway?
Answer: About ten minutes. DirectQuery connections are just that. This doesn’t mean that a tile issues a query to your on-premises server, and shows new data, every ten minutes.

Question: Can I upload Excel workbooks with Power Pivot data models that connect to on-premises data sources? Do I need a gateway for this scenario?
Answer: Yes, you can upload the workbook. And, no, you don’t need a gateway. But, because the data will reside in the Excel data model, reports in Power BI based on the Excel workbook will not be live. In order to refresh reports in Power BI, you’d have to re-upload an updated workbook each time. Or, use the gateway with scheduled refresh.

Question: If users share dashboards that have a DirectQuery connection, will those other users be able to see the data even though they might not have the same permissions?
Answer: For a dashboard connected to Analysis Services, users will only see the data they have access to. If the users do not have the same permissions, they will not be able to see any data. For other data sources, all users will share the credentials entered by the admin for that data source.

Question: Why can’t I connect to my Oracle server?
Answer: You may need to install the Oracle client and configure the tnsnames.ora file with the proper server information in order to connect to your Oracle server. This is a separate install outside of the Gateway. For more information, see Installing the Oracle Client.

Question: Will the gateway work with ExpressRoute?
Answer: Yes. For more information about ExpressRoute and Power BI, see Power BI and ExpressRoute.

Microsoft SSAS logistic regression mining model node types

The following is a reference of the node types for a Microsoft SSAS logistic regression mining model, for use in DMX queries:

NODE_TYPE
A logistic regression model outputs the following node types:

Node Type ID   Description
1              Model.
17             Organizer node for the subnetwork.
18             Organizer node for the input layer.
19             Organizer node for the hidden layer. The hidden layer is empty.
20             Organizer node for the output layer.
21             Input attribute node.
23             Output attribute node.
24             Marginal statistics node.

So the following DMX query returns the output attribute nodes:

SELECT * FROM [Model_Name].Content
WHERE NODE_TYPE = 23
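
If you build such content queries from a script, a small helper can keep the node type IDs readable. This is a hedged Python sketch: the dictionary repeats the IDs from the table above, while the function name and the refusal to accept a closing bracket in the model name are illustrative assumptions rather than part of any SSAS API.

```python
# Node type IDs and descriptions, copied from the node type table.
LOGISTIC_REGRESSION_NODE_TYPES = {
    1: "Model",
    17: "Organizer node for the subnetwork",
    18: "Organizer node for the input layer",
    19: "Organizer node for the hidden layer",
    20: "Organizer node for the output layer",
    21: "Input attribute node",
    23: "Output attribute node",
    24: "Marginal statistics node",
}


def content_query(model_name, node_type):
    """Build a DMX content query that filters on one known node type."""
    if node_type not in LOGISTIC_REGRESSION_NODE_TYPES:
        raise ValueError(f"Unknown node type: {node_type}")
    if "]" in model_name:
        # Kept simple on purpose: reject names that would need escaping.
        raise ValueError("Model name must not contain ']'")
    return f"SELECT * FROM [{model_name}].Content WHERE NODE_TYPE = {node_type}"


if __name__ == "__main__":
    print(content_query("Model_Name", 23))
```
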

Reference: Mining Model Content for Logistic Regression Models

Generate T-SQL script to assign role to the newly created user to SQL server database

Open SQL Server Management Studio and create the login, either through the UI or with the following T-SQL script (Windows authentication):

USE [master]
GO
CREATE LOGIN [DomainNameXYZ\UserNameABC] FROM WINDOWS WITH DEFAULT_DATABASE=[master]
GO

Use Case (I) – The user needs the “db_datareader” and “db_datawriter” roles in all databases:

We do this in 2 steps:

  1. Create a user for the login created above in each database (or in each required database).
  2. Assign the role (permission) to the user in that database.

You can use the following script to generate the statements that create the user in each database:

-- Generates one CREATE USER / ALTER USER pair per database for the login below.
DECLARE @DbName AS SYSNAME;
DECLARE @NewHire AS NVARCHAR(MAX);
DECLARE @DbNameCursor AS CURSOR;

SET @NewHire = N'[DomainNameXYZ\UserNameABC]';

-- LineId keeps the generated statements in the order they were inserted.
CREATE TABLE #NewHireAccess
(LineId INT IDENTITY(1,1), SQLForNewHire NVARCHAR(MAX));

-- database_id 1 through 4 are the system databases; this filter also skips 5 and 6.
SET @DbNameCursor = CURSOR FOR
SELECT name
FROM sys.databases
WHERE database_id > 6
ORDER BY database_id;

OPEN @DbNameCursor;
FETCH NEXT FROM @DbNameCursor INTO @DbName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME brackets the database name so names with spaces still work.
    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'USE ' + QUOTENAME(@DbName) + N';');
    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'GO');

    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'CREATE USER ' + @NewHire + N' FOR LOGIN ' + @NewHire + N';');

    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'ALTER USER ' + @NewHire + N' WITH DEFAULT_SCHEMA=[dbo];');

    FETCH NEXT FROM @DbNameCursor INTO @DbName;
END

CLOSE @DbNameCursor;
DEALLOCATE @DbNameCursor;

-- Copy the output rows into a new query window and execute them.
SELECT SQLForNewHire FROM #NewHireAccess ORDER BY LineId;

DROP TABLE #NewHireAccess;

You can use the following script to generate the statements that assign the “db_datareader” and “db_datawriter” roles to the user in each database:

-- Generates the ALTER ROLE statements for the login below, one set per database.
DECLARE @DbName AS SYSNAME;
DECLARE @NewHire AS NVARCHAR(MAX);
DECLARE @DbNameCursor AS CURSOR;

SET @NewHire = N'[DomainNameXYZ\UserNameABC]';

-- LineId keeps the generated statements in the order they were inserted.
CREATE TABLE #NewHireAccess
(LineId INT IDENTITY(1,1), SQLForNewHire NVARCHAR(MAX));

-- database_id 1 through 4 are the system databases; this filter also skips 5 and 6.
SET @DbNameCursor = CURSOR FOR
SELECT name
FROM sys.databases
WHERE database_id > 6
ORDER BY database_id;

OPEN @DbNameCursor;
FETCH NEXT FROM @DbNameCursor INTO @DbName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME brackets the database name so names with spaces still work.
    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'USE ' + QUOTENAME(@DbName) + N';');
    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'GO');

    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'ALTER ROLE [db_datareader] ADD MEMBER ' + @NewHire + N';');

    INSERT INTO #NewHireAccess (SQLForNewHire)
    VALUES (N'ALTER ROLE [db_datawriter] ADD MEMBER ' + @NewHire + N';');

    FETCH NEXT FROM @DbNameCursor INTO @DbName;
END

CLOSE @DbNameCursor;
DEALLOCATE @DbNameCursor;

-- Copy the output rows into a new query window and execute them.
SELECT SQLForNewHire FROM #NewHireAccess ORDER BY LineId;

DROP TABLE #NewHireAccess;
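
If you already have the list of database names, the same two-step script can be generated outside SQL Server. The following is a minimal Python sketch under that assumption. The function names are hypothetical, the sample database name is made up, and quote_name only imitates the bracket quoting that the T-SQL QUOTENAME function performs.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def grant_script(databases, login, roles=("db_datareader", "db_datawriter")):
    """Return T-SQL text that creates the user and assigns roles per database."""
    user = quote_name(login)
    lines = []
    for db in databases:
        lines.append(f"USE {quote_name(db)};")
        lines.append("GO")
        # Step 1: create the database user for the existing login.
        lines.append(f"CREATE USER {user} FOR LOGIN {user};")
        lines.append(f"ALTER USER {user} WITH DEFAULT_SCHEMA=[dbo];")
        # Step 2: add the user to each requested role.
        for role in roles:
            lines.append(f"ALTER ROLE {quote_name(role)} ADD MEMBER {user};")
        lines.append("GO")
    return "\n".join(lines)


if __name__ == "__main__":
    # "Sales DB" is a made-up database name used only for illustration.
    print(grant_script(["Sales DB"], r"DomainNameXYZ\UserNameABC"))
```

Review the generated text before running it, since it is produced by plain string concatenation.
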

 

Microsoft Decision Trees algorithm parameters: COMPLEXITY_PENALTY, SCORE_METHOD, SPLIT_METHOD

The Microsoft Decision Trees algorithm supports parameters that affect the performance and accuracy of the resulting mining model. You can also set modeling flags on the mining model columns or mining structure columns to control the way that data is processed.

Setting Algorithm Parameters

The following table describes the parameters that you can use with the Microsoft Decision Trees algorithm.

COMPLEXITY_PENALTY
Controls the growth of the decision tree. A low value increases the number of splits, and a high value decreases the number of splits. The default value is based on the number of attributes for a particular model, as described in the following list:

  • For 1 through 9 attributes, the default is 0.5.
  • For 10 through 99 attributes, the default is 0.9.
  • For 100 or more attributes, the default is 0.99.
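
The default rule in the list above can be written as a small lookup. This is an illustrative Python sketch only: the function name is hypothetical and it simply restates the three documented thresholds, not how Analysis Services computes the value internally.

```python
def default_complexity_penalty(attribute_count):
    """Return the documented default COMPLEXITY_PENALTY for a model."""
    if attribute_count < 1:
        raise ValueError("attribute_count must be at least 1")
    if attribute_count <= 9:
        return 0.5
    if attribute_count <= 99:
        return 0.9
    return 0.99


if __name__ == "__main__":
    for count in (5, 42, 250):
        print(count, default_complexity_penalty(count))
```
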

FORCE_REGRESSOR
Forces the algorithm to use the specified columns as regressors, regardless of the importance of the columns as calculated by the algorithm. This parameter is only used for decision trees that are predicting a continuous attribute.

SCORE_METHOD

Determines the method that is used to calculate the split score. The following options are available:

ID   Name
1    Entropy
3    Bayesian with K2 Prior
4    Bayesian Dirichlet Equivalent (BDE) with uniform prior (default)

The default is 4, or BDE.

SPLIT_METHOD

Determines the method that is used to split the node. The following options are available:

ID Name
1 Binary: Indicates that regardless of the actual number of values for the attribute, the tree should be split into two branches.
2 Complete: Indicates that the tree can create as many splits as there are attribute values.
3 Both: Specifies that Analysis Services can determine whether a binary or complete split should be used to produce the best results.

The default is 3.

MAXIMUM_INPUT_ATTRIBUTES

Defines the number of input attributes that the algorithm can handle before it invokes feature selection.

The default is 255.

Set this value to 0 to turn off feature selection.

[Available only in some editions of SQL Server]

MAXIMUM_OUTPUT_ATTRIBUTES

Defines the number of output attributes that the algorithm can handle before it invokes feature selection.

The default is 255.

Set this value to 0 to turn off feature selection.

[Available only in some editions of SQL Server]

MINIMUM_SUPPORT

Determines the minimum number of leaf cases that is required to generate a split in the decision tree.

The default is 10.

You may need to increase this value if the dataset is very large, to avoid overtraining.

 

Reference: Microsoft Decision Trees Algorithm Technical Reference

Update Table using ROW_NUMBER()

For various reasons, we didn’t set up an identity on one of the columns, say CountryKey, of the DimCountry table.

We updated the ISO and Name columns of this table but couldn’t update the key.

So we issued the following UPDATE statement, which uses the ROW_NUMBER function:

with SDimCountry
as
(
select *
, row_number() over(order by Field2Name) as CKey
from DimCountry
)
update SDimCountry
set CountryKey = CKey
go
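
To see what values the statement assigns, the numbering can be imitated in a few lines of Python. This is a sketch under stated assumptions: the function name and the sample rows are hypothetical, and ordering by Name stands in for the Field2Name placeholder in the query.

```python
def assign_row_numbers(rows, order_key, target_key):
    """Return copies of rows with target_key set to 1..n in order_key order."""
    ordered = sorted(rows, key=lambda row: row[order_key])
    return [{**row, target_key: n} for n, row in enumerate(ordered, start=1)]


# Hypothetical sample rows standing in for DimCountry.
countries = [
    {"CountryKey": None, "ISO": "US", "Name": "United States"},
    {"CountryKey": None, "ISO": "CA", "Name": "Canada"},
    {"CountryKey": None, "ISO": "MX", "Name": "Mexico"},
]

if __name__ == "__main__":
    for row in assign_row_numbers(countries, "Name", "CountryKey"):
        print(row["CountryKey"], row["ISO"], row["Name"])
```

One difference to keep in mind: Python's sort is stable, whereas ROW_NUMBER does not guarantee an order among rows that tie on the ORDER BY column.
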

Request to run job refused because the job is already running from a request by User

Stop the job using:

EXEC msdb.dbo.sp_stop_job N'ABC_JOB_NAME'

sp_stop_job may report that the job has been stopped, yet when you try to re-run it you still get the error: Request to run job ABC (from User USERXYZ) refused because the job is already running from a request by User USERXYZ. [SQLSTATE 42000] (Error 22022).  The step failed.

In that case, navigate to SQL Server Agent, script the job, and save the file.

Now delete the job and recreate it using the saved script.

Voilà! You are back in action.

SSAS DMX Query for Dependency Network pane of the Microsoft Naive Bayes Viewer

SELECT NODE_CAPTION, MSOLAP_NODE_SCORE
FROM TM_NaiveBayes.CONTENT
WHERE NODE_TYPE = 10
ORDER BY MSOLAP_NODE_SCORE DESC

The node caption is used to identify the node, rather than the node name, because the caption shows both the attribute name and attribute value.

The MSOLAP_NODE_SCORE is a special value provided only for the input attribute nodes, and indicates the relative importance of this attribute in the model. You can see much the same information in the Dependency Network pane of the viewer; however, the viewer does not provide scores.

NODE_TYPE = 10 represents an input attribute node, and NODE_TYPE = 11 represents a node for each value of the attribute.

Reference: Naive Bayes Model Query Examples