1 NetCDF Authorization Support
2 ====================================
3 
4 <!-- double header is needed to workaround doxygen bug -->
5 
6 [TOC]
7 
8 ## Introduction {#auth}
9 
10 netCDF can support user authorization using the facilities provided by the curl
11 library. This includes basic password authentication as well as
12 certificate-based authorization.
13 At the moment, this document only applies to DAP2 and DAP4 access.
14 
With some exceptions (e.g. see the section on <a href="#auth_redir">redirection</a>),
the libcurl authorization mechanisms can be accessed in two ways:
17 
18 1. Inserting the username and password into the url, or
19 2. Accessing information from a so-called _rc_ file named either
20  `.ncrc` or `.dodsrc`. The latter is historical and deprecated, but will be supported indefinitely.
21 
22 ## URL-Based Authentication {#auth_url}
23 
24 For simple password based authentication, it is possible to
25 directly insert the username and the password into a url in this form.
26 
27  http://username:password@host/...
28 
29 This username and password will be used if the server asks for
30 authentication. Note that only simple password authentication
31 is supported in this format.
32 
33 Specifically note that [redirection-based](#auth_redir)
34 authorization may work with this but it is a security risk.
35 This is because the username and password
36 may be sent to each server in the redirection chain.
37 
Note also that the `user:password` form may contain characters that must be
escaped. See the <a href="#auth_userpwdescape">password escaping</a> section
for how to properly escape the user and password.
41 
42 ## RC File Authentication {#auth_dodsrc}
43 The netcdf library supports an _rc_ file mechanism to allow the passing
44 of a number of run-time parameters to libnetcdf and libcurl.
45 This is described in the file "quickstart_env.md".
46 
47 ## Authorization-Related Keys {#auth_keys}
48 
The currently defined set of authorization-related keys recognized in the _rc_ file is as follows.
The second column is the affected curl_easy_setopt option(s), if any
(see reference #1).
<table>
<tr><th>Key</th><th>Affected curl_easy_setopt Options</th><th>Notes</th>
<tr><td>HTTP.COOKIEJAR</td><td>CURLOPT_COOKIEJAR</td>
<tr><td>HTTP.COOKIEFILE</td><td>CURLOPT_COOKIEJAR</td><td>COOKIEJAR and COOKIEFILE are considered aliases, so setting one will set the other as well.</td>
<tr><td>HTTP.PROXY.SERVER</td><td>CURLOPT_PROXY, CURLOPT_PROXYPORT, CURLOPT_PROXYUSERPWD</td>
<tr><td>HTTP.PROXY_SERVER</td><td>CURLOPT_PROXY, CURLOPT_PROXYPORT, CURLOPT_PROXYUSERPWD</td><td>Deprecated: use HTTP.PROXY.SERVER</td>
<tr><td>HTTP.SSL.CERTIFICATE</td><td>CURLOPT_SSLCERT</td>
<tr><td>HTTP.SSL.KEY</td><td>CURLOPT_SSLKEY</td>
<tr><td>HTTP.SSL.KEYPASSWORD</td><td>CURLOPT_KEYPASSWORD</td>
<tr><td>HTTP.SSL.CAINFO</td><td>CURLOPT_CAINFO</td>
<tr><td>HTTP.SSL.CAPATH</td><td>CURLOPT_CAPATH</td>
<tr><td>HTTP.SSL.VERIFYPEER</td><td>CURLOPT_SSL_VERIFYPEER</td>
<tr><td>HTTP.SSL.VALIDATE</td><td>CURLOPT_SSL_VERIFYPEER, CURLOPT_SSL_VERIFYHOST</td>
<tr><td>HTTP.CREDENTIALS.USERPASSWORD</td><td>CURLOPT_USERPWD</td>
<tr><td>HTTP.CREDENTIALS.USERNAME</td><td>CURLOPT_USERNAME</td>
<tr><td>HTTP.CREDENTIALS.PASSWORD</td><td>CURLOPT_PASSWORD</td>
<tr><td>HTTP.NETRC</td><td>CURLOPT_NETRC, CURLOPT_NETRC_FILE</td><td>Specifies the path of the .netrc file to use and enables its use.</td>
<tr><td>AWS.PROFILE</td><td>N.A.</td><td>Specifies the name of a profile from the .aws/credentials file</td>
<tr><td>AWS.REGION</td><td>N.A.</td><td>Specifies the name of a default region</td>
</table>
72 
73 ### Password Authentication
74 
75 The key
76 HTTP.CREDENTIALS.USERPASSWORD
77 can be used to set the simple password authentication.
78 This is an alternative to setting it in the url.
79 The value must be of the form "username:password".
See the <a href="#auth_userpwdescape">password escaping</a> section
to see how this value must escape certain characters.
82 Also see <a href="#auth_redir">redirection authorization</a>
83 for important additional information.
84 
85 The pair of keys
86 HTTP.CREDENTIALS.USERNAME and HTTP.CREDENTIALS.PASSWORD
87 can be used as an alternative to HTTP.CREDENTIALS.USERPASSWORD
88 to set the simple password authentication.
89 If present, they take precedence over HTTP.CREDENTIALS.USERPASSWORD.
90 The values do not need to be escaped.
91 See <a href="#auth_redir">redirection authorization</a>
92 for important additional information.
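
As a minimal sketch of the key pair described above, the following shell
function writes the two keys into an rc file. The function name, the file
path, and the credentials in the usage comment are hypothetical placeholders.
The function overwrites the named file, so it is only suitable for a file
that holds nothing else.

```shell
#!/bin/sh
# Sketch: store simple password credentials in an rc file.
# write_credentials_rc is a hypothetical helper, not part of netCDF.
write_credentials_rc() {
    rcfile=$1
    user=$2
    password=$3
    # Start from an empty file that only the owner can read or write.
    ( umask 077 && : > "$rcfile" ) && chmod 600 "$rcfile" || return 1
    # The values are written verbatim: these two keys are not escaped.
    printf 'HTTP.CREDENTIALS.USERNAME=%s\nHTTP.CREDENTIALS.PASSWORD=%s\n' \
        "$user" "$password" >> "$rcfile"
}

# Example with placeholder values:
# write_credentials_rc "$HOME/.ncrc" "jdoe" "s3cr:t@x"
```

Because the values are stored unescaped, a password containing ':' or '@'
can be written as-is.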
93 
94 ### Cookie Jar
95 
The HTTP.COOKIEJAR key
specifies the name of the file from which
to read cookies (CURLOPT_COOKIEFILE) and also
the file into which to store cookies (CURLOPT_COOKIEJAR).
The same value is used for both CURLOPT values.
It defaults to in-memory storage.
102 See [redirection authorization](#auth_redir)
103 for important additional information.
104 
105 ### Certificate Authentication
106 
107 HTTP.SSL.CERTIFICATE
specifies a file path for a file containing a PEM certificate.
109 This is typically used for client-side authentication.
110 
111 HTTP.SSL.KEY is essentially the same as HTTP.SSL.CERTIFICATE
112 and should always have the same value.
113 
114 HTTP.SSL.KEYPASSWORD
specifies the password for accessing the HTTP.SSL.CERTIFICATE/HTTP.SSL.KEY file.
116 
117 HTTP.SSL.CAPATH
118 specifies the path to a directory containing
119 trusted certificates for validating server certificates.
120 See reference #2 for more info.
121 
122 HTTP.SSL.VALIDATE
123 is a boolean (1/0) value that if true (1)
124 specifies that the client should verify the server's presented certificate.
125 
126 HTTP.PROXY.SERVER
127 specifies the url for accessing the proxy:
128 e.g. *http://[username:password@]host[:port]*
129 
130 HTTP.PROXY_SERVER
131 deprecated; use HTTP.PROXY.SERVER
132 
133 HTTP.NETRC
134 specifies the absolute path of the .netrc file,
135 and causes it to be used instead of username and password.
136 See [redirection authorization](#auth_redir)
137 for information about using *.netrc*.
138 
139 ## Password Escaping {#auth_userpwdescape}
140 
With current password rules, it is quite likely that the password
will contain characters that need to be escaped. Similarly, the user name
may contain characters such as '@' that need to be escaped. To support this,
it is assumed that all occurrences of `user:password` use URL (i.e. %%XX)
escaping for at least the characters in the table below.
146 
147 The minimum set of characters that must be escaped depends on the location.
148 If the user+pwd is embedded in the URL, then '@' and ':' __must__ be escaped.
149 If the user+pwd is the value for
150 the HTTP.CREDENTIALS.USERPASSWORD key in the _rc_ file, then
151 ':' __must__ be escaped.
Escaping should __not__ be used in the `.netrc` file nor in
HTTP.CREDENTIALS.USERNAME or HTTP.CREDENTIALS.PASSWORD.
154 
155 The relevant escape codes are as follows.
156 <table>
157 <tr><th>Character</th><th>Escaped Form</th>
158 <tr><td>'@'</td><td>%40</td>
159 <tr><td>':'</td><td>%3a</td>
160 </table>
161 Additional characters can be escaped if desired.
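
The escaping rules above can be sketched as a pair of small shell functions.
The function names are hypothetical. Escaping '%' itself is an extra
precaution beyond the table, on the assumption that a literal '%' in the
input would otherwise be read as the start of an escape code.

```shell
#!/bin/sh
# Sketch: percent-escape one component (user name or password).
# '%' is handled first so the escapes produced below are not escaped again.
escape_cred() {
    printf '%s\n' "$1" | sed -e 's/%/%25/g' -e 's/@/%40/g' -e 's/:/%3a/g'
}

# Build a "user:password" value. Each part is escaped separately,
# so the only raw ':' left in the result is the separator.
userpwd_escaped() {
    printf '%s:%s\n' "$(escape_cred "$1")" "$(escape_cred "$2")"
}

# Example with placeholder values:
# userpwd_escaped "jdoe@example.org" "a:b"
```

The result can be placed before the '@' in a URL or used as the value of
HTTP.CREDENTIALS.USERPASSWORD. It should not be used for the `.netrc` file
or for the separate USERNAME and PASSWORD keys, which are not escaped.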
162 
163 ## Redirection-Based Authentication {#auth_redir}
164 
165 Some sites provide authentication by using a third party site
166 to do the authentication. Examples include ESG, URS, RDA, and most oauth2-based
167 systems.
168 
169 The process is usually as follows.
170 
1. The client contacts the server of interest (SOI), the actual data provider,
typically using the _http_ protocol.
2. The SOI sends a redirect telling the client to connect to, for example, the URS system
using the _https_ protocol (note the use of _https_ instead of _http_).
175 3. The client authenticates with URS.
176 4. URS sends a redirect (with authorization information) to send
177 the client back to the SOI to actually obtain the data.
178 
179 It turns out that libcurl, by default, uses the password in the
180 `.ncrc` file (or from the url) for all connections that request
a password. This causes problems because only the specific
redirected connection actually requires the password.
183 This is where the `.netrc` file comes in. Libcurl will use `.netrc`
184 for the redirected connection. It is possible to cause libcurl
185 to use the `.ncrc` password always, but this introduces a
186 security hole because it may send the initial user+pwd to every
187 server in the redirection chain.
In summary, if you are using redirection, then you are
__strongly__ encouraged to create a `.netrc` file to hold the
password for the site to which the redirection is sent.
191 
192 The format of this `.netrc` file will contain lines that
193 typically look like this.
194 
195  machine mmmmmm login xxxxxx password yyyyyy
196 
197 where the machine, mmmmmm, is the hostname of the machine to
198 which the client is redirected for authorization, and the
199 login and password are those needed to authenticate on that machine.
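
Before relying on a `.netrc` file, it can be useful to confirm that it has
an entry for the redirect host. The following is a minimal sketch that
handles only the single-line form shown above; the general `.netrc` format
also allows entries split across lines, which this does not cover. The
function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the login stored for a machine in a .netrc file.
# Usage: netrc_login FILE HOSTNAME
# Exits with status 1 if no single-line entry for HOSTNAME is found.
netrc_login() {
    awk -v host="$2" '
        {
            m = ""; l = ""
            # Scan keyword/value pairs on this line.
            for (i = 1; i < NF; i++) {
                if ($i == "machine") m = $(i + 1)
                if ($i == "login")   l = $(i + 1)
            }
            if (m == host && l != "") { print l; found = 1; exit }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example with a placeholder host name:
# netrc_login "$HOME/.netrc" auth.example.org || echo "no entry found"
```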
200 
201 The location of the `.netrc` file can be specified by
202 putting the following line in your `.ncrc`/`.dodsrc` file.
203 
    HTTP.NETRC=<path to .netrc file>
205 
206 If not specified, then libcurl will look first in the current
207 directory, and then in the HOME directory.
208 
One final note. When using this, you MUST
specify a real file in the file system to act as the
cookie jar file (HTTP.COOKIEJAR) so that the
redirect site can properly pass back authorization information.
213 
214 ### Accessing *earthdata.nasa.gov*
215 
216 Since it is so common, here is a set of templates to use to
217 access *earthdata.nasa.gov*.
218 
219 #### *.ncrc* File
220 ````
221 HTTP.NETRC=/home/<user>/.netrc
222 HTTP.COOKIEJAR=/home/<user>/.urs_cookies
223 ````
224 
225 #### *.netrc* File
226 ````
227 machine urs.earthdata.nasa.gov login <user> password <password>
228 ````
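
The two templates can also be generated by a short script. This is a sketch
under stated assumptions: the function name is hypothetical, the target
directory is passed in rather than assumed to be the home directory, and any
existing `.ncrc` or `.netrc` in that directory is overwritten.

```shell
#!/bin/sh
# Sketch: write the .netrc and .ncrc templates for earthdata.nasa.gov.
# Usage: write_urs_config DIRECTORY USER PASSWORD
write_urs_config() {
    dir=$1
    user=$2
    password=$3
    # The .netrc file holds a password, so restrict it to the owner.
    ( umask 077 &&
      printf 'machine urs.earthdata.nasa.gov login %s password %s\n' \
          "$user" "$password" > "$dir/.netrc" ) || return 1
    chmod 600 "$dir/.netrc" || return 1
    # Point the rc file at the .netrc file and at a real cookie jar file.
    printf 'HTTP.NETRC=%s\nHTTP.COOKIEJAR=%s\n' \
        "$dir/.netrc" "$dir/.urs_cookies" > "$dir/.ncrc"
}

# Example with placeholder credentials:
# write_urs_config "$HOME" "jdoe" "secret"
```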
229 
230 ## Client-Side Certificates {#auth_clientcerts}
231 
Some systems, notably ESG (Earth System Grid), require
the use of client-side certificates, as well as being
[re-direction based](#auth_redir).
This requires setting the following entries:
236 
237 - HTTP.COOKIEJAR &mdash; a file path for storing cookies across re-direction.
238 - HTTP.NETRC &mdash; the path to the netrc file.
239 - HTTP.SSL.CERTIFICATE &mdash; the file path for the client side certificate file.
240 - HTTP.SSL.KEY &mdash; this should have the same value as HTTP.SSL.CERTIFICATE.
241 - HTTP.SSL.CAPATH &mdash; the path to a "certificates" directory.
242 - HTTP.SSL.VALIDATE &mdash; force validation of the server certificate.
243 
244 Note that the first two are there to support re-direction based authentication.
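
A quick consistency check of these entries can be sketched in shell. The
function names are hypothetical, and the sketch assumes plain `KEY=value`
lines with absolute paths; it does not expand `~` and is not a
reimplementation of the library's own rc-file parsing.

```shell
#!/bin/sh
# Sketch: print the value of one key from an rc file.
# The key is given as a sed pattern, so literal dots are written as '\.'.
rc_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Check that the certificate-related entries agree with the list above.
check_cert_keys() {
    rcfile=$1
    cert=$(rc_value "$rcfile" 'HTTP\.SSL\.CERTIFICATE')
    key=$(rc_value "$rcfile" 'HTTP\.SSL\.KEY')
    capath=$(rc_value "$rcfile" 'HTTP\.SSL\.CAPATH')
    [ -n "$cert" ] || { echo "HTTP.SSL.CERTIFICATE is not set" >&2; return 1; }
    [ "$cert" = "$key" ] || { echo "HTTP.SSL.KEY differs from HTTP.SSL.CERTIFICATE" >&2; return 1; }
    [ -r "$cert" ] || { echo "cannot read $cert" >&2; return 1; }
    [ -d "$capath" ] || { echo "HTTP.SSL.CAPATH is not a directory: $capath" >&2; return 1; }
}

# Example:
# check_cert_keys "$HOME/.ncrc" && echo "certificate entries look consistent"
```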
245 
246 ## References
247 
248 1. https://curl.haxx.se/libcurl/c/curl_easy_setopt.html
249 2. https://curl.haxx.se/docs/ssl-compared.html
250 
251 ## Authorization Appendix A. All RC-File Keys {#auth_allkeys}
252 
253 For completeness, this is the list of all rc-file keys.
254 If this documentation is out of date with respect to the actual code,
255 the code is definitive.
<table>
<tr><th>Key</th><th>curl_easy_setopt Option</th>
<tr valign="top"><td>HTTP.DEFLATE</td><td>CURLOPT_ACCEPT_ENCODING<br>with value "deflate,gzip"</td>
<tr><td>HTTP.VERBOSE</td><td>CURLOPT_VERBOSE</td>
<tr><td>HTTP.TIMEOUT</td><td>CURLOPT_TIMEOUT</td>
<tr><td>HTTP.USERAGENT</td><td>CURLOPT_USERAGENT</td>
<tr><td>HTTP.COOKIEJAR</td><td>CURLOPT_COOKIEJAR</td>
<tr><td>HTTP.COOKIE_JAR</td><td>CURLOPT_COOKIEJAR</td>
<tr valign="top"><td>HTTP.PROXY.SERVER</td><td>CURLOPT_PROXY,<br>CURLOPT_PROXYPORT,<br>CURLOPT_PROXYUSERPWD</td>
<tr valign="top"><td>HTTP.PROXY_SERVER</td><td>CURLOPT_PROXY,<br>CURLOPT_PROXYPORT,<br>CURLOPT_PROXYUSERPWD</td>
<tr><td>HTTP.SSL.CERTIFICATE</td><td>CURLOPT_SSLCERT</td>
<tr><td>HTTP.SSL.KEY</td><td>CURLOPT_SSLKEY</td>
<tr><td>HTTP.SSL.KEYPASSWORD</td><td>CURLOPT_KEYPASSWORD</td>
<tr><td>HTTP.SSL.CAINFO</td><td>CURLOPT_CAINFO</td>
<tr><td>HTTP.SSL.CAPATH</td><td>CURLOPT_CAPATH</td>
<tr><td>HTTP.SSL.VERIFYPEER</td><td>CURLOPT_SSL_VERIFYPEER</td>
<tr><td>HTTP.CREDENTIALS.USERPASSWORD</td><td>CURLOPT_USERPWD</td>
<tr><td>HTTP.CREDENTIALS.USERNAME</td><td>CURLOPT_USERNAME</td>
<tr><td>HTTP.CREDENTIALS.PASSWORD</td><td>CURLOPT_PASSWORD</td>
<tr><td>HTTP.NETRC</td><td>CURLOPT_NETRC,CURLOPT_NETRC_FILE</td>
</table>
277 
278 ## Authorization Appendix B. URS Access in Detail {#auth_ursdetail}
279 
280 It is possible to use the NASA Earthdata Login System (URS)
with netcdf by using the process specified in the
282 [redirection based authorization section](#auth_redir).
283 In order to access URS controlled datasets, however, it is necessary to
284 register as a user with NASA at this website (subject to change):
285 
286  https://uat.urs.earthdata.nasa.gov/
287 
288 ## Authorization Appendix C. ESG Access in Detail {#auth_esgdetail}
289 
It is possible to access Earth System Grid (ESG) datasets
291 from ESG servers through the netCDF API using the techniques
292 described in the section on [Client-Side Certificates](#auth_clientcerts).
293 
294 In order to access ESG datasets, however, it is necessary to
register as a user with ESG and to set up your environment
so that proper authentication is established between a netcdf
297 client program and the ESG data server. Specifically, it
298 is necessary to use what is called "client-side keys" to
299 enable this authentication. Normally, when a client accesses
300 a server in a secure fashion (using "https"), the server
301 provides an authentication certificate to the client.
302 With client-side keys, the client must also provide a
303 certificate to the server so that the server can know with
304 whom it is communicating. Note that this section is subject
305 to change as ESG changes its procedures.
306 
307 The netcdf library uses the _curl_ library and it is that
308 underlying library that must be properly configured.
309 
310 ### Terminology
311 
Using client-side keys requires the construction of
two "stores" on the client side.
314 
315 * Keystore - a repository to hold the client side key.
316 * Truststore - a repository to hold a chain of certificates
317 that can be used to validate the certificate
318 sent by the server to the client.
319 
320 The server actually has a similar set of stores, but the client
321 need not be concerned with those.
322 
323 ### Initial Steps
324 
325 The first step is to obtain authorization from ESG.
326 Note that this information may evolve over time, and
327 may be out of date.
328 This discussion is in terms of BADC and NCSA. You will need
329 to substitute as necessary.
330 
331 1. Register at http://badc.nerc.ac.uk/register
332  to obtain access to badc and to obtain an openid,
   which will look something like:
334  <pre>https://ceda.ac.uk/openid/Firstname.Lastname</pre>
335 
336 2. Ask BADC for access to whatever datasets are of interest.
337 
338 3. Obtain short term credentials at
339  _http://grid.ncsa.illinois.edu/myproxy/MyProxyLogon/_
340  You will need to download and run the MyProxyLogon program.
341  This will create a keyfile in, typically, the directory ".globus".
342  The keyfile will have a name similar to this: "x509up_u13615"
343  The other elements in ".globus" are certificates to use in
344  validating the certificate your client gets from the server.
345 
346 4. Obtain the program source ImportKey.java
347  from this location: _http://www.agentbob.info/agentbob/79-AB.html_
348  (read the whole page, it will help you understand the remaining steps).
349 
350 ### Building the KeyStore
351 
352 You will have to modify the keyfile in the previous step
353 and then create a keystore and install the key and a certificate.
354 The commands are these:
355 
    openssl pkcs8 -topk8 -nocrypt -in x509up_u13615 -inform PEM -out key.der -outform DER
    openssl x509 -in x509up_u13615 -inform PEM -out cert.der -outform DER
    java -classpath <path to ImportKey.class> -Dkeypassword="<password>" -Dkeystore=./<keystorefilename> ImportKey key.der cert.der
359 
360 Note, the file names "key.der" and "cert.der" can be whatever you choose.
361 It is probably best to leave the .der extension, though.
362 
363 ### Building the TrustStore
364 
Building the truststore is a bit tricky because, as provided, the
certificates in ".globus" need some massaging. See the script below
for the details. The primary command is this, which is executed for every
certificate, c, in ".globus". It adds the certificate to the file
named "truststore".
370 
371  keytool -trustcacerts -storepass "password" -v -keystore "truststore" -importcert -file "${c}"
372 
373 ### Running the C Client
374 
375 Refer to the section on [Client-Side Certificates](#auth_clientcerts).
376 The keys specified there must be set in the rc file to support ESG access.
377 
378 - HTTP.COOKIEJAR=~/.dods_cookies
379 - HTTP.NETRC=~/.netrc
380 - HTTP.SSL.CERTIFICATE=~/esgkeystore
381 - HTTP.SSL.KEY=~/esgkeystore
382 - HTTP.SSL.CAPATH=~/.globus
383 - HTTP.SSL.VALIDATE=1
384 
385 Of course, the file paths above are suggestions only;
386 you can modify as needed.
The HTTP.SSL.CERTIFICATE and HTTP.SSL.KEY
entries should have the same value, which is the file path for the
389 certificate produced by MyProxyLogon. The HTTP.SSL.CAPATH entry
390 should be the path to the "certificates" directory produced by
391 MyProxyLogon.
392 
393 As noted, ESG also uses re-direction based authentication.
394 So, when it receives an initial connection from a client, it
395 redirects to a separate authentication server. When that
396 server has authenticated the client, it redirects back to
397 the original url to complete the request.
398 
399 ### Script for creating Stores
400 
401 The following script shows in detail how to actually construct the key
402 and trust stores. It is specific to the format of the globus file
403 as it was when ESG support was first added. It may have changed
404 since then, in which case, you will need to seek some help
405 in fixing this script. It would help if you communicated
406 what you changed to the author so this document can be updated.
407 
    #!/bin/sh -x
    # Names of the stores to create and of the MyProxyLogon output.
    KEYSTORE="esgkeystore"
    TRUSTSTORE="esgtruststore"
    GLOBUS="globus"
    TRUSTROOT="certificates"
    CERT="x509up_u13615"
    TRUSTROOTPATH="$GLOBUS/$TRUSTROOT"
    CERTFILE="$GLOBUS/$CERT"
    # Store password; not named PWD, which the shell uses for the working directory.
    STOREPASS="password"

    D="-Dglobus=$GLOBUS"
    CCP="bcprov-jdk16-145.jar"
    CP="./build:${CCP}"
    JAR="myproxy.jar"

    # Initialize needed directories
    rm -fr build
    mkdir build
    rm -fr "$GLOBUS"
    mkdir "$GLOBUS"
    rm -f "$KEYSTORE"
    rm -f "$TRUSTSTORE"

    # Compile MyProxyCmd and ImportKey
    javac -d ./build -classpath "$CCP" *.java
    javac -d ./build ImportKey.java

    # Execute MyProxyCmd
    java -cp "$CP" myproxy.MyProxyCmd

    # Build the keystore
    openssl pkcs8 -topk8 -nocrypt -in "$CERTFILE" -inform PEM -out key.der -outform DER
    openssl x509 -in "$CERTFILE" -inform PEM -out cert.der -outform DER
    java -Dkeypassword="$STOREPASS" -Dkeystore="./${KEYSTORE}" -cp ./build ImportKey key.der cert.der

    # Clean up the certificates in the globus directory
    # (the '0,/---/' address form requires GNU sed)
    for c in "${TRUSTROOTPATH}"/*.0 ; do
        alias=`basename "$c" .0`
        sed -e '0,/---/d' <"$c" >"/tmp/${alias}"
        echo "-----BEGIN CERTIFICATE-----" >"$c"
        cat "/tmp/${alias}" >>"$c"
    done

    # Build the truststore
    for c in "${TRUSTROOTPATH}"/*.0 ; do
        alias=`basename "$c" .0`
        echo "adding: ${c}"
        echo "alias: $alias"
        yes | keytool -trustcacerts -storepass "$STOREPASS" -v -keystore "./$TRUSTSTORE" -alias "$alias" -importcert -file "${c}"
    done
    exit
459 
460 ## Point of Contact
461 
462 __Author__: Dennis Heimbigner<br>
__Email__: dmh at ucar dot edu<br>
464 __Initial Version__: 11/21/2014<br>
465 __Last Revised__: 08/24/2017
466