Commit 313c78f

author
Bruce Hamilton
committed
Visual Pass and light editing
1 parent 55ac911 commit 313c78f

2 files changed

Lines changed: 48 additions & 165 deletions

docs/relational-databases/polybase/TOC.md

Lines changed: 1 addition & 1 deletion
@@ -7,4 +7,4 @@
77
# [T-SQL objects](polybase-t-sql-objects.md)
88
# [Queries](polybase-queries.md)
99
# [Troubleshooting](polybase-troubleshooting.md)
10-
# [Troubleshoot connectivity](polybase-troubleshoot-connectivity.md)
10+
# [Troubleshoot PolyBase Kerberos connectivity](polybase-troubleshoot-connectivity.md)
Lines changed: 47 additions & 164 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: Troubleshoot Polybase connectivty | Microsoft Docs
2+
title: Troubleshoot PolyBase Kerberos connectivity | Microsoft Docs
33
description:
44
services:
55
documentationcenter:
@@ -16,27 +16,28 @@ ms.tgt_pltfrm: na
1616
ms.devlang:
1717
ms.topic: article
1818
ms.date: 07/18/2017
19-
ms.author:
19+
ms.author: alazad
2020
---
21-
# Troubleshoot PolyBase connectivity
22-
You can use An interactive diagnostics tool, that has been built into PolyBase, to help troubleshoot authentication problems when using PolyBase against a Kerberos-secured Hadoop cluster.
21+
# Troubleshoot PolyBase Kerberos connectivity
22+
PolyBase includes a built-in interactive diagnostics tool to help troubleshoot authentication problems when using PolyBase against a Kerberos-secured Hadoop cluster.
2323

24-
This article serves as a guide to walk through the debugging process of such issues by leveraging this tool.
24+
This article walks through the process of debugging such issues with this tool.
25+
26+
## Prerequisites
2527

26-
**Prerequisites**
2728
1. SQL Server 2016 RTM CU6 / SQL Server 2016 SP1 CU3 / SQL Server 2017 or higher with PolyBase installed
2829
1. A Hadoop cluster (Cloudera or Hortonworks) secured with Kerberos (Active Directory or MIT)
2930

30-
**Introduction**
31+
## Introduction
3132
It helps to first understand the Kerberos protocol at a high level. There are three actors involved:
32-
1. Kerberos Client (SQL Server)
33-
1. Secured Resource (HDFS, MR2, YARN, Job History, etc.)
34-
1. Key Distribution Center (referred to as a Domain Controller in Active Directory)
33+
1. Kerberos client (SQL Server)
34+
1. Secured resource (HDFS, MR2, YARN, Job History, etc.)
35+
1. Key distribution center (referred to as a domain controller in Active Directory)
3536

3637
Each one of Hadoop's secured resources is registered with the **Key Distribution Center (KDC)** with a unique **Service Principal Name (SPN)** as part of the Kerberization process of the Hadoop cluster. The goal is for the client to obtain a temporary user ticket, called a **Ticket Granting Ticket (TGT)**, in order to request another temporary ticket, called a **Service Ticket (ST)**, from the KDC against the particular SPN that it wants to access. 
3738
In PolyBase, when authentication is requested against any Kerberos-secured resource, the following four-round-trip handshake takes place:
3839
1. SQL Server connects to the KDC and obtains a TGT for the user. The TGT is encrypted using the KDC’s private key.
39-
1. SQL Server calls the Hadoop secured resource (e.g. HDFS) and determines what SPN it needs a ST for.
40+
1. SQL Server calls the Hadoop secured resource (e.g. HDFS) and determines which SPN it needs an ST for.
4041
1. SQL Server goes back to the KDC, passes the TGT back, and requests an ST to access that particular secured resource. The ST is encrypted using the secured service’s private key.
4142
1. SQL Server forwards the ST to Hadoop and gets authenticated to have a session created against that service.
4243

@@ -45,31 +46,33 @@ In PolyBase, when authentication is requested against any Kerberos-secured resou
4546
Issues with authentication fall into one or more of the above four steps. To help with faster debugging, PolyBase has introduced an integrated diagnostics tool to help identify the point of failure.
4647

4748
## Troubleshooting
48-
PolyBase has multiple configuration XMLs containing properties of the Hadoop cluster. Namely, these are core-site.xml, hdfs-site.xml, hive-site.xml, jaas.conf, mapred-site.xml, and yarn-site.xml. They are located under "\[System Drive\]:*{{INSTALL\_PATH}}\\{{INSTANCE\_NAME}}*\\MSSQL\\Binn\\Polybase\\Hadoop\\conf". The default for SQL Server 2016, for instance, would be "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\MSSQL\\Binn\\Polybase\\Hadoop\\conf".
49+
PolyBase has multiple configuration XMLs containing properties of the Hadoop cluster. Namely, these are the following files:
50+
- core-site.xml
51+
- hdfs-site.xml
52+
- hive-site.xml
53+
- jaas.conf
54+
- mapred-site.xml
55+
- yarn-site.xml
56+
57+
These files are located under:
58+
59+
\\[System Drive\\]:{install path}\\{instance name}\\MSSQL\\Binn\\Polybase\\Hadoop\\conf
60+
61+
For example, the default for SQL Server 2016 would be "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\MSSQL\\Binn\\Polybase\\Hadoop\\conf".
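
As a quick sanity check before running the diagnostics tool, the presence of these files can be verified with a small script. The following is a minimal sketch in Python, not part of PolyBase; the helper name `missing_conf_files` is hypothetical, and the directory shown is the SQL Server 2016 default quoted above, which may differ on your instance.

```python
from pathlib import Path

# Configuration file names as listed in this article.
EXPECTED_CONF_FILES = [
    "core-site.xml",
    "hdfs-site.xml",
    "hive-site.xml",
    "jaas.conf",
    "mapred-site.xml",
    "yarn-site.xml",
]


def missing_conf_files(conf_dir):
    """Return the expected file names that are not present in conf_dir."""
    conf_path = Path(conf_dir)
    return [name for name in EXPECTED_CONF_FILES
            if not (conf_path / name).is_file()]


if __name__ == "__main__":
    # Assumption: default SQL Server 2016 instance path; adjust as needed.
    default_dir = (r"C:\Program Files\Microsoft SQL Server"
                   r"\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf")
    for name in missing_conf_files(default_dir):
        print("missing:", name)
```

An empty result means all six files were found in the directory that was checked.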
4962

5063
Update one of the PolyBase configuration files, **core-site.xml**, with the three properties below with the values set according to the environment:
5164
```xml
5265
<property>
53-
<name>polybase.kerberos.realm</name>
54-
<value>**CONTOSO.COM**</value>
55-
</property>
56-
<property>
57-
<name>polybase.kerberos.kdchost</name>
58-
<value>**kerberos.contoso.com**</value>
59-
</property>
60-
<property>
61-
<name>hadoop.security.authentication</name>
62-
<value>KERBEROS</value>
66+
<name>polybase.kerberos.realm</name>
67+
<value>CONTOSO.COM</value>
6368
</property>
6469
```
6570
The other XMLs will later need to be updated as well if pushdown operations are desired, but with just this file configured, the HDFS file system should at least be accessible.
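
To confirm that core-site.xml carries the three Kerberos-related properties (`polybase.kerberos.realm`, `polybase.kerberos.kdchost`, and `hadoop.security.authentication`), a short script can parse the file and report what is missing. This is a minimal sketch in Python under the assumption that the file follows the standard Hadoop `<configuration>`/`<property>` layout; the function names and the sample values (taken from the Contoso example in this article) are illustrative only.

```python
import xml.etree.ElementTree as ET

# The three properties this article asks you to set in core-site.xml.
REQUIRED = (
    "polybase.kerberos.realm",
    "polybase.kerberos.kdchost",
    "hadoop.security.authentication",
)

# Illustrative content only; realm and KDC host are placeholder values.
SAMPLE_CORE_SITE = """
<configuration>
  <property>
    <name>polybase.kerberos.realm</name>
    <value>CONTOSO.COM</value>
  </property>
  <property>
    <name>polybase.kerberos.kdchost</name>
    <value>kerberos.contoso.com</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>KERBEROS</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Collect name/value pairs from every <property> element."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_kerberos_properties(xml_text):
    """Return a list of human-readable problems; empty if none found."""
    props = read_properties(xml_text)
    problems = [f"missing or empty: {name}"
                for name in REQUIRED if not props.get(name)]
    auth = props.get("hadoop.security.authentication", "")
    if auth and auth.upper() != "KERBEROS":
        problems.append(
            f"hadoop.security.authentication is '{auth}', expected KERBEROS")
    return problems


if __name__ == "__main__":
    print(check_kerberos_properties(SAMPLE_CORE_SITE))
```

To check a real file, read its text (for example with `Path(...).read_text()`) and pass it to `check_kerberos_properties`.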
6671

6772
The tool runs independently of SQL Server, so SQL Server does not need to be running, nor does it need to be restarted if updates are made to the configuration XMLs. To run the tool, execute the following on the host with SQL Server installed:
6873

69-
```dos
70-
> cd "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\MSSQL\\Binn\\Polybase"
71-
> java -classpath ".\\Hadoop\\conf;.\\Hadoop\\\*;.\\Hadoop\\HDP2\_2\\\*" com.microsoft.polybase.client.HdfsBridge <Name Node Address> <Name Node Port> <Service Principal> <Filepath containing Service Principal's Password> <Remote HDFS file path (optional)>
72-
```
74+
java -classpath ".\\Hadoop\\conf;.\\Hadoop\\\*;.\\Hadoop\\HDP2\_2\\\*" com.microsoft.polybase.client.HdfsBridge <Name Node Address> <Name Node Port> <Service Principal> <Filepath containing Service Principal's Password> <Remote HDFS file path (optional)>
75+
7376
## Arguments
7477
| Argument | Description|
7578
| --- | --- |
@@ -81,7 +84,9 @@ The tool runs independently of SQL Server, so it does not need to be running, no
8184

8285
## Example
8386
```dos
84-
> java -classpath ".\\Hadoop\\conf;.\\Hadoop\\\*;.\\Hadoop\\HDP2\_2\\\*" com.microsoft.polybase.client.HdfsBridge 10.193.27.232 8020 admin\_user "C:\\temp\\kerberos\_pass.txt"
87+
> java -classpath ".\Hadoop\conf;.\Hadoop\*;.\Hadoop\HDP2_2\*" com.microsoft.polybase.client.HdfsBridge 10.193.27.232 8020 admin_user "C:\temp\kerberos_pass.txt"
8590
```
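
If the tool is run repeatedly while settings are being adjusted, it can help to assemble the command line programmatically so that quoting stays consistent. The following Python sketch builds the argument list for the invocation shown above; the function name `build_hdfsbridge_command` is hypothetical, and the classpath and class name are copied from this article rather than verified against a particular PolyBase build.

```python
def build_hdfsbridge_command(name_node, port, principal, password_file,
                             remote_path=None):
    """Build the argument list for the HdfsBridge diagnostics tool.

    The classpath is relative, so the command is expected to be run
    from the PolyBase directory of the SQL Server instance.
    """
    classpath = r".\Hadoop\conf;.\Hadoop\*;.\Hadoop\HDP2_2\*"
    command = [
        "java",
        "-classpath", classpath,
        "com.microsoft.polybase.client.HdfsBridge",
        name_node,
        str(port),
        principal,
        password_file,
    ]
    # The remote HDFS file path is optional.
    if remote_path is not None:
        command.append(remote_path)
    return command


if __name__ == "__main__":
    # Values from the example in this article.
    cmd = build_hdfsbridge_command(
        "10.193.27.232", 8020, "admin_user", r"C:\temp\kerberos_pass.txt")
    print(cmd)
```

The resulting list can be passed to `subprocess.run(cmd, cwd=polybase_dir)`, where `polybase_dir` is the instance's PolyBase directory; passing a list avoids shell quoting problems with the spaces and wildcards in the arguments.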
8691
The output is verbose for enhanced debugging, but there are only four main checkpoints to look for regardless of whether you are using MIT or AD. They correspond to the four steps outlined above.
8792

@@ -156,15 +161,15 @@ Reaching this point confirms that: (i) the three actors are able to communicate
156161
```
157162
## Common Errors
158163
If the tool was run and the file properties of the target path were *not* printed (Checkpoint 4), there should be an exception thrown midway. Review it and consider the context of where in the four-step flow it occurred. Consider the following common issues that may have occurred, in order:
159-
| Exception API | Message or display | Cause |
160-
| --- | --- | --- |
161-
| org.apache.hadoop.security.AccessControlException | SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS] | The core-site.xml doesn't have the hadoop.security.authentication property set to "KERBEROS".|
162-
| javax.security.auth.login.LoginException | Client not found in Kerberos database (6) - CLIENT_NOT_FOUND | The admin Service Principal supplied does not exist in the realm specified in core-site.xml.|
163-
| javax.security.auth.login.LoginException | Checksum failed | Admin Service Principal exists, but bad password. |
164-
| N/A | Native config name: C:\Windows\krb5.ini<br>Loaded from native config | This is not an exception, but it indicates that Java's krb5LoginModule detected custom client configurations on your machine. Check your custom client settings as they may be causing the issue. |
165-
| javax.security.auth.login.LoginException:<br>java.lang.IllegalArgumentException | Illegal principal name admin_user@CONTOSO.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to admin_user@CONTOSO.COM | Add the property “hadoop.security.auth_to_local” to core-site.xml with the appropriate rules per the Hadoop cluster. |
166-
| java.net.ConnectException |Attempting to access external filesystem at URI: hdfs://10.193.27.230:8020<br>Call From IAAS16981207/10.107.0.245 to 10.193.27.230:8020 failed on connection exception | Authentication against the KDC was successful, but it failed to access the Hadoop name node. Check the name node IP and port. Verify the firewall is disabled on Hadoop. |
167-
| java.io.FileNotFoundException |File does not exist: /test/data.csv | Authentication was successful, but the location specified does not exist. Check the path or test with root "/" first. |
164+
| Exception and messages | Cause |
165+
| --- | --- |
166+
| org.apache.hadoop.security.AccessControlException<br>SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS] | The core-site.xml doesn't have the hadoop.security.authentication property set to "KERBEROS".|
167+
| javax.security.auth.login.LoginException<br>Client not found in Kerberos database (6) - CLIENT_NOT_FOUND | The admin Service Principal supplied does not exist in the realm specified in core-site.xml.|
168+
| javax.security.auth.login.LoginException<br>Checksum failed | The admin Service Principal exists, but the password is incorrect. |
169+
| Native config name: C:\Windows\krb5.ini<br>Loaded from native config | This is not an exception, but it indicates that Java's krb5LoginModule detected custom client configurations on your machine. Check your custom client settings as they may be causing the issue. |
170+
| javax.security.auth.login.LoginException<br>java.lang.IllegalArgumentException<br>Illegal principal name admin_user@CONTOSO.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to admin_user@CONTOSO.COM | Add the property “hadoop.security.auth_to_local” to core-site.xml with the appropriate rules per the Hadoop cluster. |
171+
| java.net.ConnectException<br>Attempting to access external filesystem at URI: hdfs://10.193.27.230:8020<br>Call From IAAS16981207/10.107.0.245 to 10.193.27.230:8020 failed on connection exception | Authentication against the KDC was successful, but it failed to access the Hadoop name node. Check the name node IP and port. Verify the firewall is disabled on Hadoop. |
172+
| java.io.FileNotFoundException<br>File does not exist: /test/data.csv | Authentication was successful, but the location specified does not exist. Check the path or test with root "/" first. |
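
Because the tool's output is verbose, it can be convenient to scan a captured log for the messages in the table above. The following Python sketch maps a few distinctive message fragments to the causes listed in the table, checking them in table order; the `diagnose` helper and the choice of fragments are assumptions made for illustration, not behavior of the tool itself.

```python
# (message fragment, likely cause) pairs, in the same order as the table.
KNOWN_ISSUES = [
    ("SIMPLE authentication is not enabled",
     "core-site.xml does not set hadoop.security.authentication to KERBEROS."),
    ("Client not found in Kerberos database",
     "The admin Service Principal does not exist in the realm specified "
     "in core-site.xml."),
    ("Checksum failed",
     "The admin Service Principal exists, but the password is incorrect."),
    ("Loaded from native config",
     "Java's krb5LoginModule detected custom client configurations; "
     "check your custom client settings."),
    ("No rules applied to",
     "Add hadoop.security.auth_to_local to core-site.xml with the rules "
     "appropriate for the Hadoop cluster."),
    ("failed on connection exception",
     "KDC authentication succeeded, but the Hadoop name node could not be "
     "reached; check the name node IP, port, and firewall."),
    ("File does not exist",
     "Authentication succeeded, but the specified location does not exist; "
     "check the path or test with root '/' first."),
]


def diagnose(output):
    """Return the first matching cause for the tool output, or None."""
    for fragment, cause in KNOWN_ISSUES:
        if fragment in output:
            return cause
    return None


if __name__ == "__main__":
    sample = ("javax.security.auth.login.LoginException: "
              "Client not found in Kerberos database (6) - CLIENT_NOT_FOUND")
    print(diagnose(sample))
```

A result of `None` only means that none of the listed fragments appeared; the full exception should still be reviewed in the context of the four-step flow.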
168173
## Debugging tips
169174
### MIT KDC 
170175
All the SPNs registered with the KDC, including the admins, can be viewed by running **kadmin.local** > (admin login) > **listprincs** on the KDC host or any configured KDC client. If the Hadoop cluster was properly Kerberized, there should be one SPN for each of the numerous services available in the cluster (e.g. nn, dn, rm, yarn, spnego). Their corresponding keytab files (password substitutes) can be seen under **/etc/security/keytabs**, by default. They are encrypted using the KDC's private key.
@@ -178,135 +183,13 @@ The KDC logs are available in **/var/log/krb5kdc.log**, by default, which inclu
178183
```
179184
### Active Directory
180185
In Active Directory, the SPNs can be viewed by browsing to Control Panel > Active Directory Users and Computers > *MyRealm* > *MyOrganizationalUnit*. If the Hadoop cluster was properly Kerberized, there should be one SPN for each of the numerous services available (e.g. nn, dn, rm, yarn, spnego).
181-
## References
182-
1. Sample output from an MIT KDC
183-
1. [Sample output from an AD KDC](file:///D:\Share\site\Sample_Polybase_AD.txt)
184-
1. Integrating PolyBase with Cloudera using Active Directory Authentication
185-
1. [Cloudera’s Guide to setting up Kerberos for CDH](https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_principal_keytab.html)
186-
1. [Hortonworks’ Guide to Setting up Kerberos for HDP](https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Security_Guide/content/ch_configuring_amb_hdp_for_kerberos.html)
309186

187+
## Sample output
188+
For sample output, see the text file located on your computer, for example: \\{share}\\{site}\\Sample_Polybase_AD.txt
310189

190+
## See Also
191+
1. [Integrating PolyBase with Cloudera using Active Directory Authentication](https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2016/10/17/integrating-polybase-with-cloudera-using-active-directory-authentication)
192+
1. [Cloudera’s Guide to setting up Kerberos for CDH](https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_principal_keytab.html)
193+
1. [Hortonworks’ Guide to Setting up Kerberos for HDP](https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Security_Guide/content/ch_configuring_amb_hdp_for_kerberos.html)
311194

312195
