<?xml version="1.0" encoding="ascii"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>netCDF4</title>
<link rel="stylesheet" href="epydoc.css" type="text/css" />
<script type="text/javascript" src="epydoc.js"></script>
</head>
<body bgcolor="white" text="black" link="blue" vlink="#204080"
alink="#204080">
<!-- ==================== NAVIGATION BAR ==================== -->
<table class="navbar" border="0" width="100%" cellpadding="0"
bgcolor="#a0c0ff" cellspacing="0">
<tr valign="middle">
<!-- Home link -->
<th bgcolor="#70b0f0" class="navbar-select"
> Home </th>
<!-- Tree link -->
<th> <a
href="module-tree.html">Trees</a> </th>
<!-- Index link -->
<th> <a
href="identifier-index.html">Indices</a> </th>
<!-- Help link -->
<th> <a
href="help.html">Help</a> </th>
<th class="navbar" width="100%"></th>
</tr>
</table>
<table width="100%" cellpadding="0" cellspacing="0">
<tr valign="top">
<td width="100%">
<span class="breadcrumbs">
Module netCDF4
</span>
</td>
<td>
<table cellpadding="0" cellspacing="0">
<!-- hide/show private -->
</table>
</td>
</tr>
</table>
<!-- ==================== MODULE DESCRIPTION ==================== -->
<h1 class="epydoc">Module netCDF4</h1><p class="nomargin-top"></p>
<h1 class="heading">Introduction</h1>
<p>Python interface to the netCDF version 4 library. <a
href="http://www.unidata.ucar.edu/software/netcdf/netcdf-4"
target="_top">netCDF version 4</a> has many features not found in
earlier versions of the library and is implemented on top of <a
href="http://www.hdfgroup.org/HDF5" target="_top">HDF5</a>. This module
can read and write files in both the new netCDF 4 and the old netCDF 3
format, and can create files that are readable by HDF5 clients. The API
is modelled after <a
href="http://dirac.cnrs-orleans.fr/plone/software/scientificpython/"
target="_top">Scientific.IO.NetCDF</a>, and should be familiar to users
of that module.</p>
<p>Most new features of netCDF 4 are implemented, such as multiple
unlimited dimensions, groups and zlib data compression. All the new
numeric data types (such as 64 bit and unsigned integer types) are
implemented. Compound and variable length (vlen) data types are
supported, but the enum and opaque data types are not. Mixtures of
compound and vlen data types (compound types containing vlens, and
vlens containing compound types) are not supported.</p>
<h1 class="heading">Download</h1>
<ul>
<li>
Latest bleeding-edge code from the <a
href="http://github.com/Unidata/netcdf4-python"
target="_top">github repository</a>.
</li>
<li>
Latest <a href="https://pypi.python.org/pypi/netCDF4"
target="_top">releases</a> (source code and windows installers).
</li>
</ul>
<h1 class="heading">Requires</h1>
<ul>
<li>
Python 2.5 or later (python 3 works too).
</li>
<li>
numpy array module <a href="http://numpy.scipy.org"
target="_top">http://numpy.scipy.org</a>, version 1.3.0 or later
(1.5.1 or higher recommended, required if using python 3).
</li>
<li>
<a href="http://cython.org" target="_top">Cython</a> is optional -
if it is installed setup.py will use it to recompile the Cython
source code into C, using conditional compilation to enable
features in the netCDF API that have been added since version
4.1.1. If Cython is not installed, these features (such as the
ability to rename Group objects) will be disabled to preserve
backward compatibility with older versions of the netCDF library.
</li>
<li>
For Python &lt; 2.7, the ordereddict module <a
href="http://python.org/pypi/ordereddict"
target="_top">http://python.org/pypi/ordereddict</a>.
</li>
<li>
The HDF5 C library version 1.8.4-patch1 or higher (1.8.8 or higher
recommended) from <a href="ftp://ftp.hdfgroup.org/HDF5/current/src"
target="_top">ftp://ftp.hdfgroup.org/HDF5/current/src</a>. Be sure
to build with '<code>--enable-hl --enable-shared</code>'.
</li>
<li>
<a href="http://curl.haxx.se/libcurl/" target="_top">Libcurl</a>,
if you want <a href="http://opendap.org/" target="_top">OPeNDAP</a>
support.
</li>
<li>
<a href="http://www.hdfgroup.org/products/hdf4/"
target="_top">HDF4</a>, if you want to be able to read HDF4
"Scientific Dataset" (SD) files.
</li>
<li>
The netCDF-4 C library from <a
href="ftp://ftp.unidata.ucar.edu/pub/netcdf"
target="_top">ftp://ftp.unidata.ucar.edu/pub/netcdf</a>. Version
4.1.1 or higher is required (4.2 or higher recommended). Be sure to
build with '<code>--enable-netcdf-4 --enable-shared</code>', and
set <code>CPPFLAGS="-I $HDF5_DIR/include"</code> and
<code>LDFLAGS="-L $HDF5_DIR/lib"</code>, where
<code>$HDF5_DIR</code> is the directory where HDF5 was installed.
If you want <a href="http://opendap.org/" target="_top">OPeNDAP</a>
support, add '<code>--enable-dap</code>'. If you want HDF4 SD
support, add '<code>--enable-hdf4</code>' and add the location of
the HDF4 headers and library to <code>CPPFLAGS</code> and
<code>LDFLAGS</code>.
</li>
</ul>
<h1 class="heading">Install</h1>
<ul>
<li>
install the requisite python modules and C libraries (see above).
It's easiest if all the C libs are built as shared libraries.
</li>
<li>
optionally, set the <code>HDF5_DIR</code> environment variable to
point to where HDF5 is installed (the libs in
<code>$HDF5_DIR/lib</code>, the headers in
<code>$HDF5_DIR/include</code>). If the headers and libs are
installed in different places, you can use <code>HDF5_INCDIR</code>
and <code>HDF5_LIBDIR</code> to define the locations of the headers
and libraries independently.
</li>
<li>
optionally, set the <code>NETCDF4_DIR</code> (or
<code>NETCDF4_INCDIR</code> and <code>NETCDF4_LIBDIR</code>)
environment variable(s) to point to where the netCDF version 4
library and headers are installed.
</li>
<li>
If the locations of the HDF5 and netCDF libs and headers are not
specified with environment variables, some standard locations will
be searched.
</li>
<li>
if HDF5 was built as a static library with <a
href="http://www.hdfgroup.org/doc_resource/SZIP/"
target="_top">szip</a> support, you may also need to set the
<code>SZIP_DIR</code> (or <code>SZIP_INCDIR</code> and
<code>SZIP_LIBDIR</code>) environment variable(s) to point to where
szip is installed. Note that the netCDF library does not support
creating szip compressed files, but can read szip compressed files
if the HDF5 lib is configured to support szip.
</li>
<li>
if netCDF lib was built as a static library with HDF4 and/or
OPeNDAP support, you may also need to set <code>HDF4_DIR</code>,
<code>JPEG_DIR</code> and/or <code>CURL_DIR</code>.
</li>
<li>
Instead of using environment variables to specify the locations of
the required libraries, you can either let setup.py try to
auto-detect their locations, or use the file <code>setup.cfg</code>
to specify them. To use this method, copy the file
<code>setup.cfg.template</code> to <code>setup.cfg</code>, then
open <code>setup.cfg</code> in a text editor and follow the
instructions in the comments for editing. If you use
<code>setup.cfg</code>, environment variables will be ignored.
</li>
<li>
If you are using netcdf 4.1.2 or higher, instead of setting all
those environment variables defining where libs are installed, you
can just set one environment variable, USE_NCCONFIG, to 1. This
will tell python to run the netcdf nc-config utility to determine
where all the dependencies live.
</li>
<li>
run <code>python setup.py build</code>, then <code>python setup.py
install</code> (as root if necessary).
</li>
<li>
If using environment variables to specify build options, be sure to
run 'python setup.py build' *without* using sudo. sudo does not
pass environment variables. If you run 'setup.py build' first
without sudo, you can run 'setup.py install' with sudo.
</li>
<li>
run the tests in the 'test' directory by running <code>python
run_all.py</code>.
</li>
</ul>
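<p>The directory conventions in the list above can be sketched in a few
lines of Python. This is only an illustration of the stated layout
(<code>$HDF5_DIR/include</code>, <code>$HDF5_DIR/lib</code>, with
<code>HDF5_INCDIR</code> and <code>HDF5_LIBDIR</code> as independent
settings), not the logic used by setup.py. The function name
<code>resolve_dirs</code> is hypothetical, and the rule that the
<code>_INCDIR</code>/<code>_LIBDIR</code> variables win over
<code>_DIR</code> is an assumption.</p>

```python
import os


def resolve_dirs(prefix, env=None):
    """Return (include_dir, lib_dir) for a library such as 'HDF5' or 'NETCDF4'.

    Assumption: PREFIX_INCDIR and PREFIX_LIBDIR, when set, take precedence
    over the directories derived from PREFIX_DIR. Unset values are None.
    """
    env = os.environ if env is None else env
    base = env.get(prefix + '_DIR')
    incdir = env.get(prefix + '_INCDIR')
    libdir = env.get(prefix + '_LIBDIR')
    # Fall back to the conventional sub-directories of PREFIX_DIR.
    if incdir is None and base is not None:
        incdir = base + '/include'
    if libdir is None and base is not None:
        libdir = base + '/lib'
    return incdir, libdir


# Hypothetical settings, passed as a dict instead of the real environment.
example_env = {'HDF5_DIR': '/opt/hdf5', 'NETCDF4_LIBDIR': '/usr/local/lib'}
print(resolve_dirs('HDF5', example_env))
print(resolve_dirs('NETCDF4', example_env))
```

<p>A value of <code>None</code> in the result corresponds to the case
where no location was given and some standard locations would have to
be searched instead.</p>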
<h1 class="heading">Tutorial</h1>
<h2 class="heading">1) Creating/Opening/Closing a netCDF file</h2>
<p>To create a netCDF file from python, you simply call the <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>
constructor. This is also the method used to open an existing netCDF
file. If the file is open for write access (<code>w, r+</code> or
<code>a</code>), you may write any type of data including new
dimensions, groups, variables and attributes. netCDF files come in
several flavors (<code>NETCDF3_CLASSIC, NETCDF3_64BIT,
NETCDF4_CLASSIC</code>, and <code>NETCDF4</code>). The first two
flavors are supported by version 3 of the netCDF library.
<code>NETCDF4_CLASSIC</code> files use the version 4 disk format
(HDF5), but do not use any features not found in the version 3 API.
They can be read by netCDF 3 clients only if they have been relinked
against the netCDF 4 library. They can also be read by HDF5 clients.
<code>NETCDF4</code> files use the version 4 disk format (HDF5) and
use the new features of the version 4 API. The <code>netCDF4</code>
module can read and write files in any of these formats. When
creating a new file, the format may be specified using the
<code>format</code> keyword in the <code>Dataset</code> constructor.
The default format is <code>NETCDF4</code>. To see how a given file
is formatted, you can examine the <code>data_model</code> <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> attribute.
Closing the netCDF file is accomplished via the <a
href="netCDF4.Dataset-class.html#close" class="link">close</a> method
of the <a href="netCDF4.Dataset-class.html" class="link">Dataset</a>
instance.</p>
<p>Here's an example:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">from</span> netCDF4 <span class="py-keyword">import</span> Dataset
<span class="py-prompt">>>> </span>rootgrp = Dataset(<span class="py-string">'test.nc'</span>, <span class="py-string">'w'</span>, format=<span class="py-string">'NETCDF4'</span>)
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp.data_model
<span class="py-output">NETCDF4</span>
<span class="py-output"></span><span class="py-prompt">>>></span>
<span class="py-prompt">>>> </span>rootgrp.close()</pre>
<p>Remote <a href="http://opendap.org"
target="_top">OPeNDAP</a>-hosted datasets can be accessed for reading
over http if a URL is provided to the <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>
constructor instead of a filename. However, this requires that the
netCDF library be built with OPeNDAP support, via the
<code>--enable-dap</code> configure option (added in version
4.0.1).</p>
<h2 class="heading">2) Groups in a netCDF file</h2>
<p>netCDF version 4 added support for organizing data in hierarchical
groups, which are analogous to directories in a filesystem. Groups
serve as containers for variables, dimensions and attributes, as well
as other groups. A <code>netCDF4.Dataset</code> creates a
special group, called the 'root group', which is similar to the root
directory in a unix filesystem. To create <a
href="netCDF4.Group-class.html" class="link">Group</a> instances, use
the <a href="netCDF4.Dataset-class.html#createGroup"
class="link">createGroup</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> instance. <a
href="netCDF4.Dataset-class.html#createGroup"
class="link">createGroup</a> takes a single argument, a python string
containing the name of the new group. The new <a
href="netCDF4.Group-class.html" class="link">Group</a> instances
contained within the root group can be accessed by name using the
<code>groups</code> dictionary attribute of the <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> instance.
Only <code>NETCDF4</code> formatted files support Groups; if you try
to create a Group in a netCDF 3 file you will get an error
message.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>rootgrp = Dataset(<span class="py-string">'test.nc'</span>, <span class="py-string">'a'</span>)
<span class="py-prompt">>>> </span>fcstgrp = rootgrp.createGroup(<span class="py-string">'forecasts'</span>)
<span class="py-prompt">>>> </span>analgrp = rootgrp.createGroup(<span class="py-string">'analyses'</span>)
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp.groups
<span class="py-output">OrderedDict([('forecasts', &lt;netCDF4.Group object at 0x1b4b7b0&gt;),</span>
<span class="py-output">             ('analyses', &lt;netCDF4.Group object at 0x1b4b970&gt;)])</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Groups can exist within groups in a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>, just as
directories exist within directories in a unix filesystem. Each <a
href="netCDF4.Group-class.html" class="link">Group</a> instance has a
<code>'groups'</code> attribute dictionary containing all of the
group instances contained within that group. Each <a
href="netCDF4.Group-class.html" class="link">Group</a> instance also
has a <code>'path'</code> attribute that contains a simulated unix
directory path to that group.</p>
<p>Here's an example that shows how to navigate all the groups in a
<a href="netCDF4.Dataset-class.html" class="link">Dataset</a>. The
function <code>walktree</code> is a Python generator that is used to
walk the directory tree. Note that printing the <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> object yields
summary information about its contents.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>fcstgrp1 = fcstgrp.createGroup(<span class="py-string">'model1'</span>)
<span class="py-prompt">>>> </span>fcstgrp2 = fcstgrp.createGroup(<span class="py-string">'model2'</span>)
<span class="py-prompt">>>> </span><span class="py-keyword">def</span> <span class="py-defname">walktree</span>(top):
<span class="py-prompt">>>> </span>    <span class="py-builtin">values</span> = top.groups.values()
<span class="py-prompt">>>> </span>    yield <span class="py-builtin">values</span>
<span class="py-prompt">>>> </span>    <span class="py-keyword">for</span> value <span class="py-keyword">in</span> top.groups.values():
<span class="py-prompt">>>> </span>        <span class="py-keyword">for</span> children <span class="py-keyword">in</span> walktree(value):
<span class="py-prompt">>>> </span>            yield children
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp
<span class="py-prompt">>>> </span><span class="py-keyword">for</span> children <span class="py-keyword">in</span> walktree(rootgrp):
<span class="py-prompt">>>> </span>    <span class="py-keyword">for</span> child <span class="py-keyword">in</span> children:
<span class="py-prompt">>>> </span>        <span class="py-keyword">print</span> child
<span class="py-output">&lt;type 'netCDF4.Dataset'&gt;</span>
<span class="py-output">root group (NETCDF4 file format):</span>
<span class="py-output"> dimensions: </span>
<span class="py-output"> variables: </span>
<span class="py-output"> groups: forecasts, analyses</span>
<span class="py-output">&lt;type 'netCDF4.Group'&gt;</span>
<span class="py-output">group /forecasts:</span>
<span class="py-output"> dimensions:</span>
<span class="py-output"> variables:</span>
<span class="py-output"> groups: model1, model2</span>
<span class="py-output">&lt;type 'netCDF4.Group'&gt;</span>
<span class="py-output">group /analyses:</span>
<span class="py-output"> dimensions:</span>
<span class="py-output"> variables:</span>
<span class="py-output"> groups:</span>
<span class="py-output">&lt;type 'netCDF4.Group'&gt;</span>
<span class="py-output">group /forecasts/model1:</span>
<span class="py-output"> dimensions:</span>
<span class="py-output"> variables:</span>
<span class="py-output"> groups:</span>
<span class="py-output">&lt;type 'netCDF4.Group'&gt;</span>
<span class="py-output">group /forecasts/model2:</span>
<span class="py-output"> dimensions:</span>
<span class="py-output"> variables:</span>
<span class="py-output"> groups:</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
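<p>The same traversal can be tried without a netCDF file. The sketch
below uses a hypothetical stand-in class, <code>FakeGroup</code>, that
only mimics the <code>groups</code> dictionary attribute described
above; it is not part of the netCDF4 module. It is written for Python 3
and uses <code>yield from</code> in place of the inner loop.</p>

```python
class FakeGroup:
    """Hypothetical stand-in for a Group: a name plus a dict of child groups."""

    def __init__(self, name, children=()):
        self.name = name
        # Mirror the `groups` dictionary attribute of Dataset and Group.
        self.groups = {child.name: child for child in children}


def walktree(top):
    """Yield the list of child groups of `top`, then recurse into each child."""
    yield list(top.groups.values())
    for child in top.groups.values():
        yield from walktree(child)


root = FakeGroup('/', [
    FakeGroup('forecasts', [FakeGroup('model1'), FakeGroup('model2')]),
    FakeGroup('analyses'),
])

# Flatten the per-level lists into a single list of group names.
names = [group.name for level in walktree(root) for group in level]
print(names)
```

<p>The names come out in the same order as the groups printed in the
example above: the children of the root first, then the children of
each of those groups in turn.</p>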
<h2 class="heading">3) Dimensions in a netCDF file</h2>
<p>netCDF defines the sizes of all variables in terms of dimensions,
so before any variables can be created the dimensions they use must
be created first. A special case, not often used in practice, is that
of a scalar variable, which has no dimensions. A dimension is created
using the <a href="netCDF4.Dataset-class.html#createDimension"
class="link">createDimension</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> instance. A
Python string is used to set the name of the dimension, and an
integer value is used to set the size. To create an unlimited
dimension (a dimension that can be appended to), the size value is
set to <code>None</code> or 0. In this example, both the
<code>time</code> and <code>level</code> dimensions are unlimited.
Having more than one unlimited dimension is a new netCDF 4 feature;
in netCDF 3 files there may be only one, and it must be the first
(leftmost) dimension of the variable.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>level = rootgrp.createDimension(<span class="py-string">'level'</span>, None)
<span class="py-prompt">>>> </span>time = rootgrp.createDimension(<span class="py-string">'time'</span>, None)
<span class="py-prompt">>>> </span>lat = rootgrp.createDimension(<span class="py-string">'lat'</span>, 73)
<span class="py-prompt">>>> </span>lon = rootgrp.createDimension(<span class="py-string">'lon'</span>, 144)</pre>
<p>All of the <a href="netCDF4.Dimension-class.html"
class="link">Dimension</a> instances are stored in a python
dictionary.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp.dimensions
<span class="py-output">OrderedDict([('level', &lt;netCDF4.Dimension object at 0x1b48030&gt;),</span>
<span class="py-output">             ('time', &lt;netCDF4.Dimension object at 0x1b481c0&gt;),</span>
<span class="py-output">             ('lat', &lt;netCDF4.Dimension object at 0x1b480f8&gt;),</span>
<span class="py-output">             ('lon', &lt;netCDF4.Dimension object at 0x1b48a08&gt;)])</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Calling the python <code>len</code> function with a <a
href="netCDF4.Dimension-class.html" class="link">Dimension</a>
instance returns the current size of that dimension. The <a
href="netCDF4.Dimension-class.html#isunlimited"
class="link">isunlimited</a> method of a <a
href="netCDF4.Dimension-class.html" class="link">Dimension</a>
instance can be used to determine if the dimension is unlimited, or
appendable.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> len(lon)
<span class="py-output">144</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> lon.isunlimited()
<span class="py-output">False</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> time.isunlimited()
<span class="py-output">True</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Printing the <a href="netCDF4.Dimension-class.html"
class="link">Dimension</a> object provides useful summary info,
including the name and length of the dimension, and whether it is
unlimited.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">for</span> dimobj <span class="py-keyword">in</span> rootgrp.dimensions.values():
<span class="py-prompt">>>> </span>    <span class="py-keyword">print</span> dimobj
<span class="py-output">&lt;type 'netCDF4.Dimension'&gt; (unlimited): name = 'level', size = 0</span>
<span class="py-output">&lt;type 'netCDF4.Dimension'&gt; (unlimited): name = 'time', size = 0</span>
<span class="py-output">&lt;type 'netCDF4.Dimension'&gt;: name = 'lat', size = 73</span>
<span class="py-output">&lt;type 'netCDF4.Dimension'&gt;: name = 'lon', size = 144</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p><a href="netCDF4.Dimension-class.html" class="link">Dimension</a>
names can be changed using the <a
href="netCDF4.Dataset-class.html#renameDimension"
class="link">renameDimension</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> instance.</p>
<h2 class="heading">4) Variables in a netCDF file</h2>
<p>netCDF variables behave much like python multidimensional array
objects supplied by the <a href="http://numpy.scipy.org"
target="_top">numpy module</a>. However, unlike numpy arrays, netCDF4
variables can be appended to along one or more 'unlimited'
dimensions. To create a netCDF variable, use the <a
href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> instance. The
<a href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a> method has two mandatory arguments,
the variable name (a Python string), and the variable datatype. The
variable's dimensions are given by a tuple containing the dimension
names (defined previously with <a
href="netCDF4.Dataset-class.html#createDimension"
class="link">createDimension</a>). To create a scalar variable,
simply leave out the dimensions keyword. The variable primitive
datatypes correspond to the dtype attribute of a numpy array. You can
specify the datatype as a numpy dtype object, or anything that can be
converted to a numpy dtype object. Valid datatype specifiers
include: <code>'f4'</code> (32-bit floating point), <code>'f8'</code>
(64-bit floating point), <code>'i4'</code> (32-bit signed integer),
<code>'i2'</code> (16-bit signed integer), <code>'i8'</code> (64-bit
signed integer), <code>'i1'</code> (8-bit signed integer),
<code>'u1'</code> (8-bit unsigned integer), <code>'u2'</code> (16-bit
unsigned integer), <code>'u4'</code> (32-bit unsigned integer),
<code>'u8'</code> (64-bit unsigned integer), or <code>'S1'</code>
(single-character string). The old Numeric single-character
typecodes (<code>'f'</code>,<code>'d'</code>,<code>'h'</code>,
<code>'s'</code>,<code>'b'</code>,<code>'B'</code>,<code>'c'</code>,<code>'i'</code>,<code>'l'</code>),
corresponding to
(<code>'f4'</code>,<code>'f8'</code>,<code>'i2'</code>,<code>'i2'</code>,<code>'i1'</code>,<code>'i1'</code>,<code>'S1'</code>,<code>'i4'</code>,<code>'i4'</code>),
will also work. The unsigned integer types and the 64-bit integer
type can only be used if the file format is <code>NETCDF4</code>.</p>
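<p>The correspondence between the old Numeric typecodes and the
datatype specifiers listed above can be written out as a small lookup
table. This is a sketch for illustration only: the names
<code>TYPECODE_TO_SPECIFIER</code> and <code>normalize_datatype</code>
are hypothetical and are not taken from the module's source.</p>

```python
# Old Numeric single-character typecodes, in the order listed above.
OLD_TYPECODES = ('f', 'd', 'h', 's', 'b', 'B', 'c', 'i', 'l')
# The datatype specifiers they correspond to, in the same order.
NEW_SPECIFIERS = ('f4', 'f8', 'i2', 'i2', 'i1', 'i1', 'S1', 'i4', 'i4')

# Hypothetical lookup table pairing each old typecode with its specifier.
TYPECODE_TO_SPECIFIER = dict(zip(OLD_TYPECODES, NEW_SPECIFIERS))


def normalize_datatype(spec):
    """Return the specifier for an old typecode; pass anything else through."""
    return TYPECODE_TO_SPECIFIER.get(spec, spec)


print(normalize_datatype('d'))   # old typecode for 64-bit floating point
print(normalize_datatype('u4'))  # already a specifier, returned unchanged
```

<p>Note that two pairs of old typecodes (<code>'h'</code> and
<code>'s'</code>, <code>'i'</code> and <code>'l'</code>) map to the same
specifier, so the table cannot be inverted.</p>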
<p>The dimensions themselves are usually also defined as variables,
called coordinate variables. The <a
href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a> method returns an instance of the <a
href="netCDF4.Variable-class.html" class="link">Variable</a> class
whose methods can be used later to access and set variable data and
attributes.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>times = rootgrp.createVariable(<span class="py-string">'time'</span>,<span class="py-string">'f8'</span>,(<span class="py-string">'time'</span>,))
<span class="py-prompt">>>> </span>levels = rootgrp.createVariable(<span class="py-string">'level'</span>,<span class="py-string">'i4'</span>,(<span class="py-string">'level'</span>,))
<span class="py-prompt">>>> </span>latitudes = rootgrp.createVariable(<span class="py-string">'latitude'</span>,<span class="py-string">'f4'</span>,(<span class="py-string">'lat'</span>,))
<span class="py-prompt">>>> </span>longitudes = rootgrp.createVariable(<span class="py-string">'longitude'</span>,<span class="py-string">'f4'</span>,(<span class="py-string">'lon'</span>,))
<span class="py-prompt">>>> </span><span class="py-comment"># two dimensions unlimited.</span>
<span class="py-prompt">>>> </span>temp = rootgrp.createVariable(<span class="py-string">'temp'</span>,<span class="py-string">'f4'</span>,(<span class="py-string">'time'</span>,<span class="py-string">'level'</span>,<span class="py-string">'lat'</span>,<span class="py-string">'lon'</span>,))</pre>
<p>All of the variables in the <a href="netCDF4.Dataset-class.html"
class="link">Dataset</a> or <a href="netCDF4.Group-class.html"
class="link">Group</a> are stored in a Python dictionary, in the same
way as the dimensions:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp.variables
<span class="py-output">OrderedDict([('time', &lt;netCDF4.Variable object at 0x1b4ba70&gt;),</span>
<span class="py-output">             ('level', &lt;netCDF4.Variable object at 0x1b4bab0&gt;),</span>
<span class="py-output">             ('latitude', &lt;netCDF4.Variable object at 0x1b4baf0&gt;),</span>
<span class="py-output">             ('longitude', &lt;netCDF4.Variable object at 0x1b4bb30&gt;),</span>
<span class="py-output">             ('temp', &lt;netCDF4.Variable object at 0x1b4bb70&gt;)])</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>To get summary info on a <a href="netCDF4.Variable-class.html"
class="link">Variable</a> instance in an interactive session, just
print it.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp.variables[<span class="py-string">'temp'</span>]
<span class="py-output">&lt;type 'netCDF4.Variable'&gt;</span>
<span class="py-output">float32 temp(time, level, lat, lon)</span>
<span class="py-output"> least_significant_digit: 3</span>
<span class="py-output"> units: K</span>
<span class="py-output">unlimited dimensions: time, level</span>
<span class="py-output">current shape = (0, 0, 73, 144)</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p><a href="netCDF4.Variable-class.html" class="link">Variable</a>
names can be changed using the <a
href="netCDF4.Dataset-class.html#renameVariable"
class="link">renameVariable</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>
instance.</p>
<h2 class="heading">5) Attributes in a netCDF file</h2>
<p>There are two types of attributes in a netCDF file, global and
variable. Global attributes provide information about a group, or the
entire dataset, as a whole. <a href="netCDF4.Variable-class.html"
class="link">Variable</a> attributes provide information about one of
the variables in a group. Global attributes are set by assigning
values to <a href="netCDF4.Dataset-class.html"
class="link">Dataset</a> or <a href="netCDF4.Group-class.html"
class="link">Group</a> instance variables. <a
href="netCDF4.Variable-class.html" class="link">Variable</a>
attributes are set by assigning values to <a
href="netCDF4.Variable-class.html" class="link">Variable</a>
instance variables. Attributes can be strings, numbers or sequences.
Returning to our example,</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">import</span> time
<span class="py-prompt">>>> </span>rootgrp.description = <span class="py-string">'bogus example script'</span>
<span class="py-prompt">>>> </span>rootgrp.history = <span class="py-string">'Created '</span> + time.ctime(time.time())
<span class="py-prompt">>>> </span>rootgrp.source = <span class="py-string">'netCDF4 python module tutorial'</span>
<span class="py-prompt">>>> </span>latitudes.units = <span class="py-string">'degrees north'</span>
<span class="py-prompt">>>> </span>longitudes.units = <span class="py-string">'degrees east'</span>
<span class="py-prompt">>>> </span>levels.units = <span class="py-string">'hPa'</span>
<span class="py-prompt">>>> </span>temp.units = <span class="py-string">'K'</span>
<span class="py-prompt">>>> </span>times.units = <span class="py-string">'hours since 0001-01-01 00:00:00.0'</span>
<span class="py-prompt">>>> </span>times.calendar = <span class="py-string">'gregorian'</span></pre>
<p>The <a href="netCDF4.Dataset-class.html#ncattrs"
class="link">ncattrs</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>, <a
href="netCDF4.Group-class.html" class="link">Group</a> or <a
href="netCDF4.Variable-class.html" class="link">Variable</a> instance
can be used to retrieve the names of all the netCDF attributes. This
method is provided as a convenience, since using the built-in
<code>dir</code> Python function will return a bunch of private
methods and attributes that cannot (or should not) be modified by the
user.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">for</span> name <span class="py-keyword">in</span> rootgrp.ncattrs():
<span class="py-prompt">>>> </span> <span class="py-keyword">print</span> <span class="py-string">'Global attr'</span>, name, <span class="py-string">'='</span>, getattr(rootgrp,name)
<span class="py-output">Global attr description = bogus example script</span>
<span class="py-output">Global attr history = Created Mon Nov 7 10:30:56 2005</span>
<span class="py-output">Global attr source = netCDF4 python module tutorial</span></pre>
<p>The <code>__dict__</code> attribute of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>, <a
href="netCDF4.Group-class.html" class="link">Group</a> or <a
href="netCDF4.Variable-class.html" class="link">Variable</a> instance
provides all the netCDF attribute name/value pairs in a python
dictionary:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> rootgrp.__dict__
<span class="py-output">OrderedDict([(u'description', u'bogus example script'),</span>
<span class="py-output"> (u'history', u'Created Thu Mar 3 19:30:33 2011'), </span>
<span class="py-output"> (u'source', u'netCDF4 python module tutorial')])</span></pre>
<p>Attributes can be deleted from a netCDF <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a>, <a
href="netCDF4.Group-class.html" class="link">Group</a> or <a
href="netCDF4.Variable-class.html" class="link">Variable</a> using
the python <code>del</code> statement (e.g. <code>del grp.foo</code>
removes the attribute <code>foo</code> from the group
<code>grp</code>).</p>
<h2 class="heading">6) Writing data to and retrieving data from a netCDF variable</h2>
<p>Now that you have a netCDF <a href="netCDF4.Variable-class.html"
class="link">Variable</a> instance, how do you put data into it? You
can just treat it like an array and assign data to a slice.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">import</span> numpy
<span class="py-prompt">>>> </span>lats = numpy.arange(-90,91,2.5)
<span class="py-prompt">>>> </span>lons = numpy.arange(-180,180,2.5)
<span class="py-prompt">>>> </span>latitudes[:] = lats
<span class="py-prompt">>>> </span>longitudes[:] = lons
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'latitudes =\n'</span>,latitudes[:]
<span class="py-output">latitudes =</span>
<span class="py-output">[-90. -87.5 -85. -82.5 -80. -77.5 -75. -72.5 -70. -67.5 -65. -62.5</span>
<span class="py-output"> -60. -57.5 -55. -52.5 -50. -47.5 -45. -42.5 -40. -37.5 -35. -32.5</span>
<span class="py-output"> -30. -27.5 -25. -22.5 -20. -17.5 -15. -12.5 -10. -7.5 -5. -2.5</span>
<span class="py-output"> 0. 2.5 5. 7.5 10. 12.5 15. 17.5 20. 22.5 25. 27.5</span>
<span class="py-output"> 30. 32.5 35. 37.5 40. 42.5 45. 47.5 50. 52.5 55. 57.5</span>
<span class="py-output"> 60. 62.5 65. 67.5 70. 72.5 75. 77.5 80. 82.5 85. 87.5</span>
<span class="py-output"> 90. ]</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Unlike NumPy's array objects, netCDF <a
href="netCDF4.Variable-class.html" class="link">Variable</a> objects
with unlimited dimensions will grow along those dimensions if you
assign data outside the currently defined range of indices.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-comment"># append along two unlimited dimensions by assigning to slice.</span>
<span class="py-prompt">>>> </span>nlats = len(rootgrp.dimensions[<span class="py-string">'lat'</span>])
<span class="py-prompt">>>> </span>nlons = len(rootgrp.dimensions[<span class="py-string">'lon'</span>])
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'temp shape before adding data = '</span>,temp.shape
<span class="py-output">temp shape before adding data = (0, 0, 73, 144)</span>
<span class="py-output"></span><span class="py-prompt">>>></span>
<span class="py-prompt">>>> </span><span class="py-keyword">from</span> numpy.random <span class="py-keyword">import</span> uniform
<span class="py-prompt">>>> </span>temp[0:5,0:10,:,:] = uniform(size=(5,10,nlats,nlons))
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'temp shape after adding data = '</span>,temp.shape
<span class="py-output">temp shape after adding data = (5, 10, 73, 144)</span>
<span class="py-output"></span><span class="py-prompt">>>></span>
<span class="py-prompt">>>> </span><span class="py-comment"># levels have grown, but no values yet assigned.</span>
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'levels shape after adding pressure data = '</span>,levels.shape
<span class="py-output">levels shape after adding pressure data = (10,)</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Note that the size of the levels variable grows when data is
appended along the <code>level</code> dimension of the variable
<code>temp</code>, even though no data has yet been assigned to
levels.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-comment"># now, assign data to levels dimension variable.</span>
<span class="py-prompt">>>> </span>levels[:] = [1000.,850.,700.,500.,300.,250.,200.,150.,100.,50.]</pre>
<p>There are, however, some differences between NumPy and netCDF
variable slicing rules. Slices behave as usual, being specified as a
<code>start:stop:step</code> triplet. Using a scalar integer index
<code>i</code> takes the ith element and reduces the rank of the
output array by one. Boolean array and integer sequence indexing
behaves differently for netCDF variables than for numpy arrays. Only
1-d boolean arrays and integer sequences are allowed, and these
indices work independently along each dimension (similar to the way
vector subscripts work in fortran). This means that</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>temp[0, 0, [0,1,2,3], [0,1,2,3]]</pre>
<p>returns an array of shape (4,4) when slicing a netCDF variable,
but for a numpy array it returns an array of shape (4,). Similarly, a
netCDF variable of shape <code>(2,3,4,5)</code> indexed with
<code>[0, array([True, False, True]), array([False, True, True,
True]), :]</code> would return a <code>(2, 3, 5)</code> array. In
NumPy, this would raise an error since it would be equivalent to
<code>[0, [0,2], [1,2,3], :]</code>, and index lists of different
lengths cannot be broadcast together. While this behaviour can cause
some confusion for those used to NumPy's 'fancy indexing' rules, it
provides a very powerful way to extract data from multidimensional
netCDF variables by using logical operations on the dimension arrays
to create slices.</p>
<p>For example,</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>tempdat = temp[::2, [1,3,6], lats>0, lons>0]</pre>
<p>will extract time indices 0,2 and 4, pressure levels 850, 500 and
200 hPa, all Northern Hemisphere latitudes and Eastern Hemisphere
longitudes, resulting in a numpy array of shape (3, 3, 36, 71).</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'shape of fancy temp slice = '</span>,tempdat.shape
<span class="py-output">shape of fancy temp slice = (3, 3, 36, 71)</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
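<p>The two indexing rules can be compared using plain NumPy arrays.
The following is a minimal sketch, not the way netCDF4 implements
slicing: <code>numpy.ix_</code> is used here only to emulate the
independent, per-dimension indexing described above, and the array
<code>a</code> and the zero-filled stand-in <code>temp_like</code> are
assumptions made for illustration.</p>

```python
import numpy

a = numpy.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# NumPy fancy indexing pairs the two index lists element by element.
paired = a[0, 0, [0, 1, 2, 3], [0, 1, 2, 3]]

# Emulate netCDF-style indexing: each list selects along its own dimension.
orthogonal = a[0, 0][numpy.ix_([0, 1, 2, 3], [0, 1, 2, 3])]

# Boolean masks of different lengths, applied independently per dimension.
mask1 = numpy.array([True, False, True])
mask2 = numpy.array([False, True, True, True])
masked = a[0][numpy.ix_(numpy.flatnonzero(mask1), numpy.flatnonzero(mask2))]

# Same selection as temp[::2, [1,3,6], lats>0, lons>0], applied to an
# in-memory array that merely has the shape of 'temp'.
lats = numpy.arange(-90, 91, 2.5)
lons = numpy.arange(-180, 180, 2.5)
temp_like = numpy.zeros((5, 10, len(lats), len(lons)), dtype='f4')
subset = temp_like[numpy.ix_(numpy.arange(5)[::2], [1, 3, 6],
                             numpy.flatnonzero(lats > 0),
                             numpy.flatnonzero(lons > 0))]

print(paired.shape)
print(orthogonal.shape)
print(masked.shape)
print(subset.shape)
```

<p>The paired selection has shape (4,), while the emulated netCDF-style
selections have shapes (4, 4), (2, 3, 5) and (3, 3, 36, 71).</p>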
<p>Time coordinate values pose a special challenge to netCDF users.
Most metadata standards (such as CF and COARDS) specify that time
should be measured relative to a fixed date using a certain calendar,
with units specified like <code>hours since YY-MM-DD hh:mm:ss</code>.
These units can be awkward to deal with without a utility to convert
the values to and from calendar dates. The functions <a
href="netCDF4-module.html#num2date" class="link">num2date</a> and <a
href="netCDF4-module.html#date2num" class="link">date2num</a> are
provided with this package to do just that. Here's an example of how
they can be used:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-comment"># fill in times.</span>
<span class="py-prompt">>>> </span><span class="py-keyword">from</span> datetime <span class="py-keyword">import</span> datetime, timedelta
<span class="py-prompt">>>> </span><span class="py-keyword">from</span> netCDF4 <span class="py-keyword">import</span> num2date, date2num
<span class="py-prompt">>>> </span>dates = [datetime(2001,3,1)+n*timedelta(hours=12) <span class="py-keyword">for</span> n <span class="py-keyword">in</span> range(temp.shape[0])]
<span class="py-prompt">>>> </span>times[:] = date2num(dates,units=times.units,calendar=times.calendar)
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'time values (in units %s): '</span> % times.units+<span class="py-string">'\n'</span>,times[:]
<span class="py-output">time values (in units hours since 0001-01-01 00:00:00.0): </span>
<span class="py-output">[ 17533056. 17533068. 17533080. 17533092. 17533104.]</span>
<span class="py-output"></span><span class="py-prompt">>>></span>
<span class="py-prompt">>>> </span>dates = num2date(times[:],units=times.units,calendar=times.calendar)
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'dates corresponding to time values:\n'</span>,dates
<span class="py-output">dates corresponding to time values:</span>
<span class="py-output">[2001-03-01 00:00:00 2001-03-01 12:00:00 2001-03-02 00:00:00</span>
<span class="py-output"> 2001-03-02 12:00:00 2001-03-03 00:00:00]</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p><a href="netCDF4-module.html#num2date" class="link">num2date</a>
converts numeric values of time in the specified <code>units</code>
and <code>calendar</code> to datetime objects, and <a
href="netCDF4-module.html#date2num" class="link">date2num</a> does
the reverse. All the calendars currently defined in the <a
href="http://cf-pcmdi.llnl.gov/documents/cf-conventions/"
target="_top">CF metadata convention</a> are supported. A function
called <a href="netCDF4-module.html#date2index"
class="link">date2index</a> is also provided which returns the
indices of a netCDF time variable corresponding to a sequence of
datetime instances.</p>
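<p>The arithmetic behind such units can be sketched with the standard
library alone. This is only an illustration under stated assumptions:
the helpers <code>to_hours</code> and <code>from_hours</code> are
hypothetical, the reference date is taken to be 2001-01-01, and
Python's <code>datetime</code> uses a proleptic Gregorian calendar, so
the sketch does not reproduce the calendar handling of <code>date2num</code>
and <code>num2date</code>.</p>

```python
from datetime import datetime, timedelta

# Assumed time origin, i.e. units of 'hours since 2001-01-01 00:00:00'.
REFERENCE = datetime(2001, 1, 1)

def to_hours(date):
    """Convert a datetime to hours elapsed since REFERENCE."""
    return (date - REFERENCE).total_seconds() / 3600.0

def from_hours(hours):
    """Convert hours elapsed since REFERENCE back to a datetime."""
    return REFERENCE + timedelta(hours=hours)

dates = [datetime(2001, 3, 1) + n * timedelta(hours=12) for n in range(5)]
hours = [to_hours(d) for d in dates]
roundtrip = [from_hours(h) for h in hours]

print(hours)
print(roundtrip == dates)
```

<p>With this origin, 2001-03-01 00:00 is 59 days, or 1416 hours, after
the reference date, and each later entry adds 12 hours.</p>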
<h2 class="heading">7) Reading data from a multi-file netCDF dataset.</h2>
<p>If you want to read data from a variable that spans multiple
netCDF files, you can use the <a href="netCDF4.MFDataset-class.html"
class="link">MFDataset</a> class to read the data as if it were
contained in a single file. Instead of using a single filename to
create a <a href="netCDF4.Dataset-class.html"
class="link">Dataset</a> instance, create a <a
href="netCDF4.MFDataset-class.html" class="link">MFDataset</a>
instance with either a list of filenames, or a string with a wildcard
(which is then converted to a sorted list of files using the python
glob module). Variables in the list of files that share the same
unlimited dimension are aggregated together, and can be sliced across
multiple files. To illustrate this, let's first create a bunch of
netCDF files with the same variable (with the same unlimited
dimension). The files must be in <code>NETCDF3_64BIT</code>,
<code>NETCDF3_CLASSIC</code> or <code>NETCDF4_CLASSIC</code> format
(<code>NETCDF4</code> formatted multi-file datasets are not
supported).</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">for</span> nfile <span class="py-keyword">in</span> range(10):
<span class="py-prompt">>>> </span> f = Dataset(<span class="py-string">'mftest'</span>+repr(nfile)+<span class="py-string">'.nc'</span>,<span class="py-string">'w'</span>,format=<span class="py-string">'NETCDF4_CLASSIC'</span>)
<span class="py-prompt">>>> </span> f.createDimension(<span class="py-string">'x'</span>,None)
<span class="py-prompt">>>> </span> x = f.createVariable(<span class="py-string">'x'</span>,<span class="py-string">'i'</span>,(<span class="py-string">'x'</span>,))
<span class="py-prompt">>>> </span> x[0:10] = numpy.arange(nfile*10,10*(nfile+1))
<span class="py-prompt">>>> </span> f.close()</pre>
<p>Now read all the files back in at once with <a
href="netCDF4.MFDataset-class.html" class="link">MFDataset</a>:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">from</span> netCDF4 <span class="py-keyword">import</span> MFDataset
<span class="py-prompt">>>> </span>f = MFDataset(<span class="py-string">'mftest*nc'</span>)
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.variables[<span class="py-string">'x'</span>][:]
<span class="py-output">[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24</span>
<span class="py-output"> 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49</span>
<span class="py-output"> 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74</span>
<span class="py-output"> 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Note that MFDataset can only be used to read, not write,
multi-file datasets.</p>
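<p>Because a wildcard is expanded into a sorted list of file names, the
aggregation order presumably follows string order rather than numeric
order. The sketch below uses only the standard library to show what
that means for file naming; the files are empty placeholders rather
than netCDF files, and the helper <code>sorted_names</code> is
hypothetical.</p>

```python
import glob
import os
import shutil
import tempfile

def sorted_names(pattern, numbers):
    """Create empty placeholder files and return their names in sorted glob order."""
    tmpdir = tempfile.mkdtemp()
    try:
        for n in numbers:
            open(os.path.join(tmpdir, pattern % n), 'w').close()
        paths = sorted(glob.glob(os.path.join(tmpdir, 'mftest*nc')))
        return [os.path.basename(p) for p in paths]
    finally:
        # Remove the temporary directory and the placeholder files.
        shutil.rmtree(tmpdir)

plain = sorted_names('mftest%d.nc', (0, 1, 2, 10))
padded = sorted_names('mftest%02d.nc', (0, 1, 2, 10))

print(plain)
print(padded)
```

<p>With unpadded numbers, <code>mftest10.nc</code> sorts before
<code>mftest2.nc</code>; zero-padding the file number keeps string order
and numeric order the same. The ten files created above (numbered 0 to
9) are not affected.</p>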
<h2 class="heading">8) Efficient compression of netCDF variables</h2>
<p>Data stored in netCDF 4 <a href="netCDF4.Variable-class.html"
class="link">Variable</a> objects can be compressed and decompressed
on the fly. The parameters for the compression are determined by the
<code>zlib</code>, <code>complevel</code> and <code>shuffle</code>
keyword arguments to the <a
href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a> method. To turn on compression, set
<code>zlib=True</code>. The <code>complevel</code> keyword regulates
the speed and efficiency of the compression (1 being fastest, but
lowest compression ratio, 9 being slowest but best compression
ratio). The default value of <code>complevel</code> is 4. Setting
<code>shuffle=False</code> will turn off the HDF5 shuffle filter,
which de-interlaces a block of data before compression by reordering
the bytes. The shuffle filter can significantly improve compression
ratios, and is on by default. Setting the <code>fletcher32</code>
keyword argument to <a
href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a> to <code>True</code> (it's
<code>False</code> by default) enables the Fletcher32 checksum
algorithm for error detection. It's also possible to set the HDF5
chunking parameters and endian-ness of the binary data stored in the
HDF5 file with the <code>chunksizes</code> and <code>endian</code>
keyword arguments to <a
href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a>. These keyword arguments are only
relevant for <code>NETCDF4</code> and <code>NETCDF4_CLASSIC</code>
files (where the underlying file format is HDF5) and are silently
ignored if the file format is <code>NETCDF3_CLASSIC</code> or
<code>NETCDF3_64BIT</code>.</p>
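<p>The effect of byte shuffling before zlib compression can be explored
outside of HDF5 with the standard library. The following is a rough
sketch only: the functions <code>shuffle</code> and
<code>unshuffle</code> are hypothetical pure-Python stand-ins for the
HDF5 filter, which operates on chunks inside the library, and the
sample values are made up.</p>

```python
import struct
import zlib

def shuffle(raw, itemsize):
    """Group byte 0 of every element, then byte 1, and so on."""
    return b"".join(raw[i::itemsize] for i in range(itemsize))

def unshuffle(raw, itemsize):
    """Invert shuffle(): interleave the byte groups back into elements."""
    n = len(raw) // itemsize
    out = bytearray(len(raw))
    for i in range(itemsize):
        out[i::itemsize] = raw[i * n:(i + 1) * n]
    return bytes(out)

# Made-up, smoothly varying 4-byte float data (little-endian).
values = [273.0 + 0.01 * i for i in range(4096)]
raw = struct.pack("<%df" % len(values), *values)

# Compress with and without shuffling, using compression level 4.
plain_size = len(zlib.compress(raw, 4))
shuffled_size = len(zlib.compress(shuffle(raw, 4), 4))

print("zlib only: %d bytes, shuffle + zlib: %d bytes" % (plain_size, shuffled_size))
```

<p>Shuffling is lossless: <code>unshuffle(shuffle(raw, 4), 4)</code>
returns the original bytes. How much it helps depends on the data.</p>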
<p>If your data only has a certain number of digits of precision (say
for example, it is temperature data that was measured with a
precision of 0.1 degrees), you can dramatically improve zlib
compression by quantizing (or truncating) the data using the
<code>least_significant_digit</code> keyword argument to <a
href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a>. The least significant digit is the
power of ten of the smallest decimal place in the data that is a
reliable value. For example if the data has a precision of 0.1, then
setting <code>least_significant_digit=1</code> will cause the
data to be quantized using
<code>numpy.around(scale*data)/scale</code>, where scale = 2**bits,
and bits is determined so that a precision of 0.1 is retained (in
this case bits=4). Effectively, this makes the compression 'lossy'
instead of 'lossless', that is some precision in the data is
sacrificed for the sake of disk space.</p>
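<p>The quantization formula can be tried on a few numbers in pure
Python. This is a sketch under an assumption: the rule used here to
pick <code>bits</code> (the smallest power of two that resolves
<code>10**-least_significant_digit</code>) was chosen because it
reproduces <code>bits=4</code> for a precision of 0.1, and may differ
in detail from what the library does. The function name
<code>quantize</code> and the sample values are hypothetical.</p>

```python
import math

def quantize(value, least_significant_digit):
    """Round value to the nearest multiple of 1/2**bits.

    bits is assumed to be the smallest integer with
    2**bits >= 10**least_significant_digit.
    """
    bits = int(math.ceil(math.log(10.0 ** least_significant_digit, 2)))
    scale = 2.0 ** bits
    return round(scale * value) / scale, bits

temperatures = [273.15, 280.04, 291.96]  # made-up sample data
for t in temperatures:
    q, bits = quantize(t, 1)
    print("%.2f -> %.4f (bits=%d, error=%.4f)" % (t, q, bits, abs(q - t)))
```

<p>With <code>bits=4</code> the stored values are multiples of 1/16, so
the rounding error is at most 1/32, which stays within the stated
precision of 0.1.</p>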
<p>In our example, try replacing the line</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>temp = rootgrp.createVariable(<span class="py-string">'temp'</span>,<span class="py-string">'f4'</span>,(<span class="py-string">'time'</span>,<span class="py-string">'level'</span>,<span class="py-string">'lat'</span>,<span class="py-string">'lon'</span>,))</pre>
<p>with</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>temp = rootgrp.createVariable(<span class="py-string">'temp'</span>,<span class="py-string">'f4'</span>,(<span class="py-string">'time'</span>,<span class="py-string">'level'</span>,<span class="py-string">'lat'</span>,<span class="py-string">'lon'</span>,),zlib=True)</pre>
<p>and then</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>temp = rootgrp.createVariable(<span class="py-string">'temp'</span>,<span class="py-string">'f4'</span>,(<span class="py-string">'time'</span>,<span class="py-string">'level'</span>,<span class="py-string">'lat'</span>,<span class="py-string">'lon'</span>,),zlib=True,least_significant_digit=3)</pre>
<p>and see how much smaller the resulting files are.</p>
<h2 class="heading">9) Beyond homogeneous arrays of a fixed type - compound data types</h2>
<p>Compound data types map directly to numpy structured (a.k.a.
'record') arrays. Structured arrays are akin to C structs, or
derived types in Fortran. They allow for the construction of
table-like structures composed of combinations of other data types,
including other compound types. Compound types might be useful for
representing multiple parameter values at each point on a grid, or at
each time and space location for scattered (point) data. You can then
access all the information for a point by reading one variable,
instead of reading different parameters from different variables.
Compound data types are created from the corresponding numpy data
type using the <a
href="netCDF4.Dataset-class.html#createCompoundType"
class="link">createCompoundType</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> instance.
Since there is no native complex data type in netcdf, compound types
are handy for storing numpy complex arrays. Here's an example:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>f = Dataset(<span class="py-string">'complex.nc'</span>,<span class="py-string">'w'</span>)
<span class="py-prompt">>>> </span>size = 3 <span class="py-comment"># length of 1-d complex array</span>
<span class="py-prompt">>>> </span><span class="py-comment"># create sample complex data.</span>
<span class="py-prompt">>>> </span>datac = numpy.exp(1j*(1.+numpy.linspace(0, numpy.pi, size)))
<span class="py-prompt">>>> </span><span class="py-comment"># create complex128 compound data type.</span>
<span class="py-prompt">>>> </span>complex128 = numpy.dtype([(<span class="py-string">'real'</span>,numpy.float64),(<span class="py-string">'imag'</span>,numpy.float64)])
<span class="py-prompt">>>> </span>complex128_t = f.createCompoundType(complex128,<span class="py-string">'complex128'</span>)
<span class="py-prompt">>>> </span><span class="py-comment"># create a variable with this data type, write some data to it.</span>
<span class="py-prompt">>>> </span>f.createDimension(<span class="py-string">'x_dim'</span>,None)
<span class="py-prompt">>>> </span>v = f.createVariable(<span class="py-string">'cmplx_var'</span>,complex128_t,<span class="py-string">'x_dim'</span>)
<span class="py-prompt">>>> </span>data = numpy.empty(size,complex128) <span class="py-comment"># numpy structured array</span>
<span class="py-prompt">>>> </span>data[<span class="py-string">'real'</span>] = datac.real; data[<span class="py-string">'imag'</span>] = datac.imag
<span class="py-prompt">>>> </span>v[:] = data <span class="py-comment"># write numpy structured array to netcdf compound var</span>
<span class="py-prompt">>>> </span><span class="py-comment"># close and reopen the file, check the contents.</span>
<span class="py-prompt">>>> </span>f.close(); f = Dataset(<span class="py-string">'complex.nc'</span>)
<span class="py-prompt">>>> </span>v = f.variables[<span class="py-string">'cmplx_var'</span>]
<span class="py-prompt">>>> </span>datain = v[:] <span class="py-comment"># read in all the data into a numpy structured array</span>
<span class="py-prompt">>>> </span><span class="py-comment"># create an empty numpy complex array</span>
<span class="py-prompt">>>> </span>datac2 = numpy.empty(datain.shape,numpy.complex128)
<span class="py-prompt">>>> </span><span class="py-comment"># .. fill it with contents of structured array.</span>
<span class="py-prompt">>>> </span>datac2.real = datain[<span class="py-string">'real'</span>]; datac2.imag = datain[<span class="py-string">'imag'</span>]
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> datac.dtype,datac <span class="py-comment"># original data</span>
<span class="py-output">complex128 [ 0.54030231+0.84147098j -0.84147098+0.54030231j -0.54030231-0.84147098j]</span>
<span class="py-output"></span><span class="py-prompt">>>></span>
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> datac2.dtype,datac2 <span class="py-comment"># data from file</span>
<span class="py-output">complex128 [ 0.54030231+0.84147098j -0.84147098+0.54030231j -0.54030231-0.84147098j]</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Compound types can be nested, but you must create the 'inner' ones
first. All of the compound types defined for a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> are stored in
a Python dictionary, just like variables and dimensions. As always,
printing objects gives useful summary information in an interactive
session:</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> f
<span class="py-output"><type 'netCDF4.Dataset'></span>
<span class="py-output">root group (NETCDF4 file format):</span>
<span class="py-output"> dimensions: x_dim</span>
<span class="py-output"> variables: cmplx_var</span>
<span class="py-output"> groups:</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.variables[<span class="py-string">'cmplx_var'</span>]
<span class="py-output"><type 'netCDF4.Variable'></span>
<span class="py-output">compound cmplx_var(x_dim)</span>
<span class="py-output">compound data type: [('real', '<f8'), ('imag', '<f8')]</span>
<span class="py-output">unlimited dimensions: x_dim</span>
<span class="py-output">current shape = (3,)</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.cmptypes
<span class="py-output">OrderedDict([('complex128', <netCDF4.CompoundType object at 0x1029eb7e8>)])</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.cmptypes[<span class="py-string">'complex128'</span>]
<span class="py-output"><type 'netCDF4.CompoundType'>: name = 'complex128', numpy dtype = [(u'real','<f8'), (u'imag', '<f8')]</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
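<p>On the numpy side, nesting simply means using one structured dtype
as a field of another. The sketch below shows only the numpy part; the
outer record <code>station_obs</code> and its field names are
hypothetical, and no netCDF file is written.</p>

```python
import numpy

# Inner type first: a complex number stored as two float64 fields.
complex128 = numpy.dtype([('real', numpy.float64), ('imag', numpy.float64)])

# Outer type (hypothetical) reuses the inner dtype as one of its fields.
station_obs = numpy.dtype([('station_id', numpy.int32), ('signal', complex128)])

# Fill a small structured array field by field.
records = numpy.zeros(2, dtype=station_obs)
records['station_id'] = [7, 8]
records['signal']['real'] = [1.0, 2.0]
records['signal']['imag'] = [0.5, -0.5]

print(station_obs)
print(records)
```

<p>When such a dtype is stored in a file, the inner compound type would
be created before the outer one, as noted above.</p>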
<h2 class="heading">10) Variable-length (vlen) data types.</h2>
<p>NetCDF 4 has support for variable-length or "ragged"
arrays. These are arrays of variable length sequences having the
same type. To create a variable-length data type, use the <a
href="netCDF4.Dataset-class.html#createVLType"
class="link">createVLType</a> method of a <a
href="netCDF4.Dataset-class.html" class="link">Dataset</a> or <a
href="netCDF4.Group-class.html" class="link">Group</a> instance.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>f = Dataset(<span class="py-string">'tst_vlen.nc'</span>,<span class="py-string">'w'</span>)
<span class="py-prompt">>>> </span>vlen_t = f.createVLType(numpy.int32, <span class="py-string">'phony_vlen'</span>)</pre>
<p>The numpy datatype of the variable-length sequences and the name
of the new datatype must be specified. Any of the primitive datatypes
can be used (signed and unsigned integers, 32 and 64 bit floats, and
characters), but compound data types cannot. A new variable can then
be created using this datatype.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>x = f.createDimension(<span class="py-string">'x'</span>,3)
<span class="py-prompt">>>> </span>y = f.createDimension(<span class="py-string">'y'</span>,4)
<span class="py-prompt">>>> </span>vlvar = f.createVariable(<span class="py-string">'phony_vlen_var'</span>, vlen_t, (<span class="py-string">'y'</span>,<span class="py-string">'x'</span>))</pre>
<p>Since there is no native vlen datatype in numpy, vlen arrays are
represented in python as object arrays (arrays of dtype
<code>object</code>). These are arrays whose elements are Python
object pointers, and can contain any type of python object. For this
application, they must contain 1-D numpy arrays all of the same type
but of varying length. In this case, they contain 1-D numpy
<code>int32</code> arrays of random length between 1 and 10.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span><span class="py-keyword">import</span> random
<span class="py-prompt">>>> </span>data = numpy.empty(len(y)*len(x),object)
<span class="py-prompt">>>> </span><span class="py-keyword">for</span> n <span class="py-keyword">in</span> range(len(y)*len(x)):
<span class="py-prompt">>>> </span> data[n] = numpy.arange(random.randint(1,10),dtype=<span class="py-string">'int32'</span>)+1
<span class="py-prompt">>>> </span>data = numpy.reshape(data,(len(y),len(x)))
<span class="py-prompt">>>> </span>vlvar[:] = data
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'vlen variable =\n'</span>,vlvar[:]
<span class="py-output">vlen variable =</span>
<span class="py-output">[[[ 1 2 3 4 5 6 7 8 9 10] [1 2 3 4 5] [1 2 3 4 5 6 7 8]]</span>
<span class="py-output"> [[1 2 3 4 5 6 7] [1 2 3 4 5 6] [1 2 3 4 5]]</span>
<span class="py-output"> [[1 2 3 4 5] [1 2 3 4] [1]]</span>
<span class="py-output"> [[ 1 2 3 4 5 6 7 8 9 10] [ 1 2 3 4 5 6 7 8 9 10]</span>
<span class="py-output"> [1 2 3 4 5 6 7 8]]]</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f
<span class="py-output"><type 'netCDF4.Dataset'></span>
<span class="py-output">root group (NETCDF4 file format):</span>
<span class="py-output"> dimensions: x, y</span>
<span class="py-output"> variables: phony_vlen_var</span>
<span class="py-output"> groups:</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.variables[<span class="py-string">'phony_vlen_var'</span>]
<span class="py-output"><type 'netCDF4.Variable'></span>
<span class="py-output">vlen phony_vlen_var(y, x)</span>
<span class="py-output">vlen data type: int32</span>
<span class="py-output">unlimited dimensions:</span>
<span class="py-output">current shape = (4, 3)</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.VLtypes[<span class="py-string">'phony_vlen'</span>]
<span class="py-output"><type 'netCDF4.VLType'>: name = 'phony_vlen', numpy dtype = int32</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>Numpy object arrays containing python strings can also be written
as vlen variables. For vlen strings, you don't need to create a vlen
data type. Instead, simply use the python <code>str</code> builtin
(or a numpy string datatype with fixed length greater than 1) when
calling the <a href="netCDF4.Dataset-class.html#createVariable"
class="link">createVariable</a> method.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>z = f.createDimension(<span class="py-string">'z'</span>,10)
<span class="py-prompt">>>> </span>strvar = f.createVariable(<span class="py-string">'strvar'</span>, str, <span class="py-string">'z'</span>)</pre>
<p>In this example, an object array is filled with random python
strings with random lengths between 2 and 12 characters, and the data
in the object array is assigned to the vlen string variable.</p>
<pre class="py-doctest">
<span class="py-prompt">>>> </span>chars = <span class="py-string">'1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'</span>
<span class="py-prompt">>>> </span>data = numpy.empty(10,<span class="py-string">'O'</span>)
<span class="py-prompt">>>> </span><span class="py-keyword">for</span> n <span class="py-keyword">in</span> range(10):
<span class="py-prompt">>>> </span> stringlen = random.randint(2,12)
<span class="py-prompt">>>> </span> data[n] = <span class="py-string">''</span>.join([random.choice(chars) <span class="py-keyword">for</span> i <span class="py-keyword">in</span> range(stringlen)])
<span class="py-prompt">>>> </span>strvar[:] = data
<span class="py-prompt">>>> </span><span class="py-keyword">print</span> <span class="py-string">'variable-length string variable:\n'</span>,strvar[:]
<span class="py-output">variable-length string variable:</span>
<span class="py-output">[aDy29jPt jd7aplD b8t4RM jHh8hq KtaPWF9cQj Q1hHN5WoXSiT MMxsVeq td LUzvVTzj</span>
<span class="py-output"> 5DS9X8S]</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f
<span class="py-output"><type 'netCDF4.Dataset'></span>
<span class="py-output">root group (NETCDF4 file format):</span>
<span class="py-output"> dimensions: x, y, z</span>
<span class="py-output"> variables: phony_vlen_var, strvar</span>
<span class="py-output"> groups:</span>
<span class="py-output"></span><span class="py-prompt">>>> </span><span class="py-keyword">print</span> f.variables[<span class="py-string">'strvar'</span>]
<span class="py-output"><type 'netCDF4.Variable'></span>
<span class="py-output">vlen strvar(z)</span>
<span class="py-output">vlen data type: <type 'str'></span>
<span class="py-output">unlimited dimensions:</span>
<span class="py-output">current shape = (10,)</span>
<span class="py-output"></span><span class="py-prompt">>>></span></pre>
<p>It is also possible to set contents of vlen string variables with
numpy arrays of any string or unicode data type. Note, however, that
accessing the contents of such variables will always return numpy
arrays with dtype <code>object</code>.</p>
<p>All of the code in this tutorial is available in
<code>examples/tutorial.py</code>. Unit tests are in the
<code>test</code> directory.</p>
<hr />
<div class="fields"> <p><strong>Contact:</strong>
Jeffrey Whitaker <jeffrey.s.whitaker@noaa.gov>
</p>
<p><strong>Copyright:</strong>
2008 by Jeffrey Whitaker.
</p>
<p><strong>License:</strong>
Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both the copyright notice and this permission
notice appear in supporting documentation. THE AUTHOR DISCLAIMS ALL
WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
</p>
<p><strong>Version:</strong>
1.1.0
</p>
</div><!-- ==================== CLASSES ==================== -->
<a name="section-Classes"></a>
<table class="summary" border="1" cellpadding="3"
cellspacing="0" width="100%" bgcolor="white">
<tr bgcolor="#70b0f0" class="table-header">
<td align="left" colspan="2" class="table-header">
<span class="table-header">Classes</span></td>
</tr>
<tr>
<td width="15%" align="right" valign="top" class="summary">
<span class="summary-type"> </span>
</td><td class="summary">
<a href="netCDF4.CompoundType-class.html" class="summary-name">CompoundType</a><br />
A <a href="netCDF4.CompoundType-class.html"
class="link">CompoundType</a> instance is used to describe a
compound data type.
</td>
</tr>
<tr>
<td width="15%" align="right" valign="top" class="summary">
<span class="summary-type"> </span>
</td><td class="summary">
<a href="netCDF4.Dataset-class.html" class="summary-name">Dataset</a><br />
Dataset(self, filename, mode="r", clobber=True,
diskless=False, persist=False, keepweakref=False, format='NETCDF4')
</td>
</tr>
<tr>
<td width="15%" align="right" valign="top" class="summary">
<span class="summary-type"> </span>
</td><td class="summary">
<a href="netCDF4.Dimension-class.html" class="summary-name">Dimension</a><br />
Dimension(self, group, name, size=None)
</td>
</tr>
<tr>
<td width="15%" align="right" valign="top" class="summary">
<span class="summary-type"> </span>
</td><td class="summary">
<a href="netCDF4.Group-class.html" class="summary-name">Group</a><br />
Group(self, parent, name)
</td>
</tr>
<tr>
<td width="15%" align="right" valign="top" class="summary">
<span class="summary-type"> </span>
</td><td class="summary">
<a href="netCDF4.MFDataset-class.html" class="summary-name">MFDataset</a><br />
MFDataset(self, files, check=False, aggdim=None, exclude=[])
</td>
</tr>