title: Troubleshoot Polybase connectivty | Microsoft Docs
2
+
title: Troubleshoot PolyBase Kerberos connectivity | Microsoft Docs
3
3
description:
4
4
services:
5
5
documentationcenter:
@@ -16,27 +16,28 @@ ms.tgt_pltfrm: na
16
16
ms.devlang:
17
17
ms.topic: article
18
18
ms.date: 07/18/2017
19
-
ms.author:
19
+
ms.author: alazad
20
20
---
21
-
# Troubleshoot PolyBase connectivity
22
-
You can use An interactive diagnostics tool, that has been built into PolyBase, to help troubleshoot authentication problems when using PolyBase against a Kerberos-secured Hadoop cluster.
21
+
# Troubleshoot PolyBase Kerberos connectivity
22
+
You can use the interactive diagnostics tool built into PolyBase to help troubleshoot authentication problems when using PolyBase against a Kerberos-secured Hadoop cluster.
23
23
24
-
This article serves as a guide to walk through the debugging process of such issues by leveraging this tool.
24
+
This article walks through the process of debugging such issues with this tool.
25
+
26
+
## Prerequisites
25
27
26
-
**Prerequisites**
27
28
1. SQL Server 2016 RTM CU6 / SQL Server 2016 SP1 CU3 / SQL Server 2017 or higher with PolyBase installed
28
29
1. A Hadoop cluster (Cloudera or Hortonworks) secured with Kerberos (Active Directory or MIT)
29
30
30
-
**Introduction**
31
+
## Introduction
31
32
It helps to first understand the Kerberos protocol at a high level. There are three actors involved:
1. Key distribution center (referred to as a domain controller in Active Directory)
35
36
36
37
Each one of Hadoop's secured resources is registered with the **Key Distribution Center (KDC)** with a unique **Service Principal Name (SPN)** as part of the Kerberization process of the Hadoop cluster. The goal is for the client to obtain a temporary user ticket, called a **Ticket Granting Ticket (TGT)**, in order to request another temporary ticket, called a **Service Ticket (ST)**, from the KDC against the particular SPN that it wants to access.
37
38
In PolyBase, when authentication is requested against any Kerberos-secured resource, the following four-round-trip handshake takes place:
38
39
1. SQL Server connects to the KDC and obtains a TGT for the user. The TGT is encrypted using the KDC’s private key.
39
-
1. SQL Server calls the Hadoop secured resource (e.g. HDFS) and determines what SPN it needs a ST for.
40
+
1. SQL Server calls the Hadoop secured resource (for example, HDFS) and determines which SPN it needs an ST for.
40
41
1. SQL Server goes back to the KDC, passes the TGT back, and requests an ST to access that particular secured resource. The ST is encrypted using the secured service’s private key.
41
42
1. SQL Server forwards the ST to Hadoop and gets authenticated to have a session created against that service.
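
The four steps above can be sketched as a toy model. This is only an illustration of the order of the exchange, not how PolyBase or a real KDC is implemented: the names `Ticket`, `ToyKdc`, and `authenticate` are hypothetical, the sample principal and SPN are made up, and no encryption is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    kind: str       # "TGT" or "ST"
    principal: str  # the user the ticket was issued to
    target: str     # "krbtgt" for a TGT, the service SPN for an ST


class ToyKdc:
    """Hypothetical stand-in for a KDC. Real Kerberos encrypts tickets; this does not."""

    def __init__(self, users, spns):
        self.users = set(users)
        self.spns = set(spns)

    def issue_tgt(self, user):
        # Step 1: the client asks the KDC for a TGT.
        if user not in self.users:
            raise LookupError(f"client not found: {user}")
        return Ticket("TGT", user, "krbtgt")

    def issue_st(self, tgt, spn):
        # Step 3: the client presents the TGT and asks for an ST for one SPN.
        if tgt.kind != "TGT":
            raise ValueError("a TGT is required to request a service ticket")
        if spn not in self.spns:
            raise LookupError(f"unknown SPN: {spn}")
        return Ticket("ST", tgt.principal, spn)


def authenticate(kdc, user, service_spn):
    tgt = kdc.issue_tgt(user)      # step 1: obtain a TGT
    spn = service_spn              # step 2: SPN learned from the secured resource
    st = kdc.issue_st(tgt, spn)    # step 3: exchange the TGT for an ST
    # Step 4: the service accepts an ST that was issued for its own SPN.
    return st.kind == "ST" and st.target == spn
```

A failure in `issue_tgt` corresponds to a problem in step 1, and a failure in `issue_st` to a problem in step 3, which is the kind of localization the diagnostics tool aims to provide.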
42
43
@@ -45,31 +46,33 @@ In PolyBase, when authentication is requested against any Kerberos-secured resou
45
46
Issues with authentication fall into one or more of the above four steps. To help with faster debugging, PolyBase has introduced an integrated diagnostics tool to help identify the point of failure.
46
47
47
48
## Troubleshooting
48
-
PolyBase has multiple configuration XMLs containing properties of the Hadoop cluster. Namely, these are core-site.xml, hdfs-site.xml, hive-site.xml, jaas.conf, mapred-site.xml, and yarn-site.xml. They are located under "\[System Drive\]:*{{INSTALL\_PATH}}\\{{INSTANCE\_NAME}}*\\MSSQL\\Binn\\Polybase\\Hadoop\\conf". The default for SQL Server 2016, for instance, would be "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\MSSQL\\Binn\\Polybase\\Hadoop\\conf".
49
+
PolyBase has multiple configuration XMLs containing properties of the Hadoop cluster. Namely, these are the following files:
For example, the default for SQL Server 2016 would be "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\MSSQL\\Binn\\Polybase\\Hadoop\\conf".
49
62
50
63
Update one of the PolyBase configuration files, **core-site.xml**, with the three properties below with the values set according to the environment:
51
64
```xml
52
65
<property>
53
-
<name>polybase.kerberos.realm</name>
54
-
<value>**CONTOSO.COM**</value>
55
-
</property>
56
-
<property>
57
-
<name>polybase.kerberos.kdchost</name>
58
-
<value>**kerberos.contoso.com**</value>
59
-
</property>
60
-
<property>
61
-
<name>hadoop.security.authentication</name>
62
-
<value>KERBEROS</value>
66
+
<name>polybase.kerberos.realm</name>
67
+
<value>CONTOSO.COM</value>
63
68
</property>
64
69
```
65
70
The other XMLs will need to be updated later if pushdown operations are desired, but with just this file configured, the HDFS file system should at least be accessible.
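
Before running the tool, it can help to confirm that the three properties are actually present in **core-site.xml**. The following is a minimal sketch, assuming the file uses the usual Hadoop layout of `<property>` elements under a `<configuration>` root; the helper names `read_properties` and `missing_properties` and the inline sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three property names this article asks you to set in core-site.xml.
REQUIRED = (
    "polybase.kerberos.realm",
    "polybase.kerberos.kdchost",
    "hadoop.security.authentication",
)


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> element in the XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(xml_text):
    """List the required properties that are absent or have an empty value."""
    props = read_properties(xml_text)
    return [name for name in REQUIRED if not props.get(name)]


# Hypothetical sample that omits polybase.kerberos.kdchost on purpose.
SAMPLE = """<configuration>
  <property>
    <name>polybase.kerberos.realm</name>
    <value>CONTOSO.COM</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>KERBEROS</value>
  </property>
</configuration>"""

print(missing_properties(SAMPLE))  # ['polybase.kerberos.kdchost']
```

To check a real file, read its contents from the conf folder and pass the text to `missing_properties`.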
66
71
67
72
The tool runs independently of SQL Server, so SQL Server does not need to be running, nor does it need to be restarted after updates to the configuration XMLs. To run the tool, execute the following on the host where SQL Server is installed:
68
73
69
-
```dos
70
-
> cd "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\MSSQL\\Binn\\Polybase"
The output is verbose for enhanced debugging, but there are only four main checkpoints to look for regardless of whether you are using MIT or AD. They correspond to the four steps outlined above.
87
92
@@ -156,15 +161,15 @@ Reaching this point confirms that: (i) the three actors are able to communicate
156
161
```
157
162
## Common Errors
158
163
If the tool was run and the file properties of the target path were *not* printed (Checkpoint 4), there should be an exception thrown midway. Review it and consider the context of where in the four-step flow it occurred. Consider the following common issues that may have occurred, in order:
159
-
| Exception API | Message or display| Cause |
160
-
| --- | --- | --- |
161
-
| org.apache.hadoop.security.AccessControlException|SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]| The core-site.xml doesn't have the hadoop.security.authentication property set to "KERBEROS".|
162
-
|javax.security.auth.login.LoginException|Client not found in Kerberos database (6) - CLIENT_NOT_FOUND | The admin Service Principal supplied does not exist in the realm specified in core-site.xml.|
163
-
| javax.security.auth.login.LoginException| Checksum failed | Admin Service Principal exists, but bad password. |
164
-
|N/A |Native config name: C:\Windows\krb5.ini<br>Loaded from native config | This is not an exception, but it indicates that Java's krb5LoginModule detected custom client configurations on your machine. Check your custom client settings as they may be causing the issue. |
165
-
| javax.security.auth.login.LoginException:<br>java.lang.IllegalArgumentException|Illegal principal name admin_user@CONTOSO.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to admin_user@CONTOSO.COM| Add the property “hadoop.security.auth_to_local” to core-site.xml with the appropriate rules per the Hadoop cluster. |
166
-
| java.net.ConnectException|Attempting to access external filesystem at URI: hdfs://10.193.27.230:8020<br>Call From IAAS16981207/10.107.0.245 to 10.193.27.230:8020 failed on connection exception | Authentication against the KDC was successful, but it failed to access the Hadoop name node. Check the name node IP and port. Verify the firewall is disabled on Hadoop. |
167
-
| java.io.FileNotFoundException|File does not exist: /test/data.csv | Authentication was successful, but the location specified does not exist. Check the path or test with root "/" first. |
164
+
| Exception and messages| Cause |
165
+
| --- | --- |
166
+
| org.apache.hadoop.security.AccessControlException<br>SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]| The core-site.xml doesn't have the hadoop.security.authentication property set to "KERBEROS".|
167
+
|javax.security.auth.login.LoginException<br>Client not found in Kerberos database (6) - CLIENT_NOT_FOUND | The admin Service Principal supplied does not exist in the realm specified in core-site.xml.|
168
+
| javax.security.auth.login.LoginException<br>Checksum failed | The admin Service Principal exists, but the password supplied is incorrect. |
169
+
| Native config name: C:\Windows\krb5.ini<br>Loaded from native config | This is not an exception, but it indicates that Java's krb5LoginModule detected custom client configurations on your machine. Check your custom client settings as they may be causing the issue. |
170
+
| javax.security.auth.login.LoginException<br>java.lang.IllegalArgumentException<br>Illegal principal name admin_user@CONTOSO.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to admin_user@CONTOSO.COM| Add the property “hadoop.security.auth_to_local” to core-site.xml with the appropriate rules per the Hadoop cluster. |
171
+
| java.net.ConnectException<br>Attempting to access external filesystem at URI: hdfs://10.193.27.230:8020<br>Call From IAAS16981207/10.107.0.245 to 10.193.27.230:8020 failed on connection exception | Authentication against the KDC was successful, but it failed to access the Hadoop name node. Check the name node IP and port. Verify the firewall is disabled on Hadoop. |
172
+
| java.io.FileNotFoundException<br>File does not exist: /test/data.csv | Authentication was successful, but the location specified does not exist. Check the path or test with root "/" first. |
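
When the tool output is long, the table can be applied mechanically by searching the captured output for the message fragments it lists. The following is a rough sketch, not part of the tool: the `KNOWN_ISSUES` list and the `diagnose` function are hypothetical, and the fragments and causes are paraphrased from the table above.

```python
# (message fragment, likely cause) pairs, in the same order as the table.
KNOWN_ISSUES = [
    ("SIMPLE authentication is not enabled",
     "core-site.xml does not set hadoop.security.authentication to KERBEROS."),
    ("Client not found in Kerberos database",
     "The admin Service Principal does not exist in the realm in core-site.xml."),
    ("Checksum failed",
     "The admin Service Principal exists, but the password is incorrect."),
    ("Loaded from native config",
     "Custom Kerberos client settings were detected; review them."),
    ("No rules applied to",
     "Add hadoop.security.auth_to_local to core-site.xml with the cluster's rules."),
    ("failed on connection exception",
     "KDC authentication succeeded, but the name node could not be reached."),
    ("File does not exist",
     "Authentication succeeded, but the specified location does not exist."),
]


def diagnose(tool_output):
    """Return the likely causes whose message fragment appears in the output."""
    return [cause for fragment, cause in KNOWN_ISSUES if fragment in tool_output]
```

For example, passing output that contains `javax.security.auth.login.LoginException: Checksum failed` returns the single cause about the incorrect password; output with none of the fragments returns an empty list.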
168
173
## Debugging tips
169
174
### MIT KDC
170
175
All the SPNs registered with the KDC, including the admins, can be viewed by running **kadmin.local** > (admin login) > **listprincs** on the KDC host or any configured KDC client. If the Hadoop cluster was properly Kerberized, there should be one SPN for each of the services available in the cluster (for example, nn, dn, rm, yarn, and spnego). Their corresponding keytab files (password substitutes) can be seen under **/etc/security/keytabs**, by default. They are encrypted using the KDC's private key.
@@ -178,135 +183,13 @@ The KDC logs are available in **/var/log/krb5kdc.log**, by default, which inclu
178
183
```
179
184
### Active Directory
180
185
In Active Directory, the SPNs can be viewed by browsing to Control Panel > Active Directory Users and Computers > *MyRealm* > *MyOrganizationalUnit*. If the Hadoop cluster was properly Kerberized, there should be one SPN for each of the services available (for example, nn, dn, rm, yarn, and spnego).
181
-
## References
182
-
1. Sample output from an MIT KDC
183
-
1.[Sample output from an AD KDC](file:///D:\Share\site\Sample_Polybase_AD.txt)
184
-
1. Integrating PolyBase with Cloudera using Active Directory Authentication
185
-
1.[Cloudera’s Guide to setting up Kerberos for CDH](https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_principal_keytab.html)
186
-
1.[Hortonworks’ Guide to Setting up Kerberos for HDP](https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Security_Guide/content/ch_configuring_amb_hdp_for_kerberos.html)
309
186
187
+
## Sample output
188
+
For sample output, see the text file located on your computer, for example: \\{share}\\{site}\\Sample_Polybase_AD.txt
310
189
190
+
## See Also
191
+
1. [Integrating PolyBase with Cloudera using Active Directory Authentication](https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2016/10/17/integrating-polybase-with-cloudera-using-active-directory-authentication)
192
+
1. [Cloudera’s Guide to setting up Kerberos for CDH](https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_principal_keytab.html)
193
+
1. [Hortonworks’ Guide to Setting up Kerberos for HDP](https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Security_Guide/content/ch_configuring_amb_hdp_for_kerberos.html)