<html>
<head>
<title>JCDF</title>
</head>
<body>
<h1>JCDF</h1>
<h2>Overview</h2>
<p>JCDF is a pure Java library capable of reading files in the
<a href="http://cdf.gsfc.nasa.gov/">Common Data Format</a> defined by NASA.
It runs on J2SE 1.5 or later and has no other dependencies:
it needs neither the official CDF C library nor any other Java class libraries.
</p>
<h2>Documentation</h2>
<p>The classes are provided with comprehensive
<a href="javadocs/index.html">javadocs</a>.
Start reading at the
<a href="javadocs/uk/ac/bristol/star/cdf/CdfContent.html"
><code>CdfContent</code></a> class
for high-level access to CDF data and metadata, or
<a href="javadocs/uk/ac/bristol/star/cdf/CdfReader.html"
><code>CdfReader</code></a>
for low-level access to the CDF internal records.
</p>
<h2>Comparison with the official CDF library</h2>
<p>JCDF is a completely separate implementation from the Java interface to
the official CDF library, which uses native code via JNI.
It was written mainly with reference to the CDF Internal Format Description
document (v3.4). Minor updates at version JCDF 1.1 are believed to
bring it into line with CDF v3.6.
</p>
<p>The main benefit of using JCDF, and the reason for developing it,
is that it is pure Java, so it can be deployed using only the JCDF jar file.
There is no need to install the system-dependent official CDF library.
</p>
<p>The API is very different from that of the official CDF library.
JCDF gives you a simple view of the CDF file, in terms of its
global attributes, variable attributes and variables.
This is fairly easy to use, but may or may not suit your purposes.
It's also possible to get a low-level view of the CDF file as a
sequence of CDF records.
</p>
<p>JCDF offers no capabilities for writing or editing CDF files;
it only reads them.
</p>
<h2>Implementation Notes</h2>
<p>JCDF is based on NIO mapped ByteBuffers, so it is expected to be reasonably
fast, but I haven't done any benchmarking.
</p>
<p>Use of mapped buffers leads to efficient serial and random I/O, but
does have some associated problems concerning release of allocated resources.
For reasons that are quite
<a href="http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4724038">complicated</a>,
it is hard to release mapped buffers after use, which means that for instance
files you have read using JCDF may be hard to delete,
and that reading large files may leak significant amounts of virtual memory.
There are some more or less nasty ways to work round this in user code or
in the library. If these issues cause you practical problems,
you are encouraged to contact me by email or open a GitHub issue.
</p>
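<p>As a rough illustration of the mapped-buffer approach described above,
the following sketch maps a small file read-only and reads values at
arbitrary offsets. It is generic NIO code, not JCDF's own implementation;
the class name MappedReadSketch and the methods readIntAt and writeTempFile
are hypothetical, and it needs Java 8 or later.
</p>

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration of random-access reads from a mapped file. */
public class MappedReadSketch {

    /** Maps the whole file read-only and reads a big-endian 32-bit int at a byte offset. */
    public static int readIntAt(Path file, int offset) {
        try (FileChannel chan = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping cannot exceed Integer.MAX_VALUE bytes,
            // so files larger than that would need several mappings.
            MappedByteBuffer buf = chan.map(FileChannel.MapMode.READ_ONLY, 0, chan.size());
            buf.order(ByteOrder.BIG_ENDIAN);
            // Absolute read: does not change the buffer position.
            return buf.getInt(offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Note: closing the channel does not unmap the buffer. The mapping
        // stays in place until the buffer object is garbage collected.
    }

    /** Writes the given bytes to a temporary file, for demonstration only. */
    public static Path writeTempFile(byte[] content) {
        try {
            Path tmp = Files.createTempFile("mapped-sketch", ".bin");
            Files.write(tmp, content);
            // Deletion may fail on some platforms while a mapping is still live.
            tmp.toFile().deleteOnExit();
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path tmp = writeTempFile(new byte[] {0, 0, 0, 1, 0, 0, 1, 0});
        System.out.println(readIntAt(tmp, 0)); // prints 1
        System.out.println(readIntAt(tmp, 4)); // prints 256
    }
}
```

<p>The comment after the try block marks the point at issue: nothing in
this code can force the mapping to be released once the read is done.
</p>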
<h2>Implementation Status</h2>
<p>Support for the CDF format is almost, but not totally, complete.
In particular:
</p>
<ul>
<li><strong>Versions:</strong>
The code was written with reference to version 3.4
(updated to 3.6) of the CDF Internal Format Description document.
Following comments in that document, it is believed that
versions 2.6, 2.7 and 3.* of the CDF format are supported.
</li>
<li><strong>Large files:</strong>
The library imposes no restriction on file size,
so >2GB files should be OK for v3 CDF files (CDF v2 did not
support 64-bit addressing) as long as a 64-bit JVM is in use.
Before JCDF v1.2 I didn't have any large files to test on,
and there was a bug which caused a failure.
Jeremy Faden kindly pointed me at some large CDFs
which allowed me to identify and fix the bug,
so at v1.2 and beyond this is tested and working.
</li>
<li><strong>Compression:</strong>
All formats supported (GZIP, HUFF, AHUFF, RLE).
</li>
<li><strong>Numeric encodings:</strong>
Normal big- and little-endian encodings supported,
but VMS D_FLOAT and G_FLOAT encodings are not supported.
</li>
<li><strong>Layout:</strong>
Single-file CDF files are supported, but multiple-file CDF files are not.
This could be added fairly easily if necessary.
</li>
<li><strong>I/O:</strong>
Access is read-only; there is no attempt or intention
to support write access.
</li>
<li><strong>Data types:</strong>
All CDF data types are supported, more or less.
Unsigned integer types are transformed to larger signed types on read,
because of the difficulty of handling unsigned integers in Java,
so for instance a CDF_UINT1 is read as a Java <code>short</code> (16-bit)
integer.
</li>
<li><strong>Record data access:</strong>
For array-valued variables you can currently only read a whole record
at a time, not just part of an array-valued record.
You can either get the raw elements or a shaped version.
This is considerably less flexible than a hyper-read.
</li>
</ul>
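<p>The widening of unsigned integers described under Data types above
can be sketched as follows. This is an illustrative helper, not JCDF's own
code: the class and method names are hypothetical, the mapping of 16-bit and
32-bit unsigned values to int and long is an assumption extended from the
CDF_UINT1 example, and the conversion methods used need Java 8 or later.
</p>

```java
/** Hypothetical helper showing how unsigned integers can be widened to signed Java types. */
public class UnsignedWidening {

    /** Interprets a byte as an 8-bit unsigned value (as for CDF_UINT1) and returns it as a short. */
    public static short uint1ToShort(byte b) {
        return (short) Byte.toUnsignedInt(b);
    }

    /** Interprets a short as a 16-bit unsigned value and returns it as an int (assumed mapping). */
    public static int uint2ToInt(short s) {
        return Short.toUnsignedInt(s);
    }

    /** Interprets an int as a 32-bit unsigned value and returns it as a long (assumed mapping). */
    public static long uint4ToLong(int i) {
        return Integer.toUnsignedLong(i);
    }

    public static void main(String[] args) {
        // The bit pattern 0xFF is -1 as a signed byte, but 255 when read as unsigned.
        System.out.println(uint1ToShort((byte) 0xFF));   // prints 255
        System.out.println(uint2ToInt((short) 0xFFFF));  // prints 65535
        System.out.println(uint4ToLong(0xFFFFFFFF));     // prints 4294967295
    }
}
```

<p>Each result fits in the wider signed type without losing the upper half
of the unsigned range, at the cost of doubling the storage per element.
</p>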
<h2>Utilities</h2>
<p>The library comes with a couple of simple utilities for examining
CDF files:
</p>
<dl>
<dt><strong><code>CdfList</code></strong>:</dt>
<dd>displays the metadata and data from a CDF file,
along the lines of the <code>cdfdump</code> command in the official
CDF distribution.
If the <code>-data</code> flag is supplied, record data as well as
metadata is shown.
See <a href="cdflist.html">CdfList examples</a>.
</dd>
<dt><strong><code>CdfDump</code></strong>:</dt>
<dd>displays a dump of the sequence of low-level
CDF records found in the CDF file, along the lines of the
<code>cdfirsdump</code> command in the official CDF distribution.
If the <code>-fields</code> flag is supplied, field information from
each record is shown. If the <code>-html</code> flag is supplied,
the output is in HTML with file offsets displayed as hyperlinks,
which is nice for chasing pointers.
See <a href="cdfdump.html">CdfDump examples</a>.
</dd>
</dl>
<h2>Downloads</h2>
<p>The source code is hosted on GitHub at
<a href="https://github.com/mbtaylor/jcdf">https://github.com/mbtaylor/jcdf</a>.
It comes with a makefile that can be used to build the jar file,
javadocs, and this documentation, and to run some tests.
</p>
<p>Pre-built copies of the jar file and documentation
for the current version (v1.2-5) can be found here:
</p>
<ul>
<li><a href="jcdf.jar">jcdf.jar</a></li>
<li><a href="javadocs/index.html">javadocs</a></li>
<li><a href="https://search.maven.org/artifact/uk.ac.starlink/jcdf"
>Maven central repository</a></li>
</ul>
<p>Previous versions may be available at
<a href="https://www.star.bristol.ac.uk/mbt/releases/jcdf/"
>https://www.star.bristol.ac.uk/mbt/releases/jcdf/</a>.
</p>
<h2>History</h2>
<dl>
<dt><strong>Version 0.1 (28 Jun 2013)</strong></dt>
<dd>Initial release.
Tested, documented and believed working, though could use some
more testing and perhaps functionality related to time-related DataTypes.
Support for some time types not complete.
</dd>
<dt><strong>Version 1.0 (13 Aug 2013)</strong></dt>
<dd><ul>
<li>More extensive tests added.</li>
<li>Fix failure when reading non-sparse variables with zero records.</li>
<li>Fix bug: pad values not explicitly defined are given default values
rather than causing an error.</li>
<li>Fix EPOCH16 bug.</li>
<li>Add EPOCH16 formatting.</li>
<li>Modify CdfList data output: better formatting, and distinguish
NOVARY values from virtual ones.</li>
<li>TIME_TT2000 values now handled correctly, including leap seconds,
and optional leap seconds table referenced by
<code>CDF_LEAPSECONDSTABLE</code> environment variable as for
NASA library. Internal leap seconds table is updated until
2012-07-01.</li>
</ul>
</dd>
<dt><strong>Version 1.1 (23 Apr 2015)</strong></dt>
<dd><ul>
<li>2015-07-01 leap second added to internal leap second table.</li>
<li>Updated to match v3.6 of the CDF library/format.
The GDR field <code>rfuD</code> is now renamed as
<code>leapSecondLastUpdated</code>.
It is also used when formatting TIME_TT2000 data values for output;
if the library leap second table is out of date with respect to
the data a warning is issued for information, and if the
time values are known to have leap seconds applied invalidly,
an error is thrown or a severe log message is issued.
This behaviour follows that of the official CDF library.</li>
</ul></dd>
<dt><strong>Version 1.2 (9 Sep 2015)</strong></dt>
<dd><ul>
<li>Fix a bug that caused a failure when accessing large (>2Gb) files.
</li>
<li>Update tests to use NASA's CDF library v3.6.0.4;
this fixes a missing leap second bug, which simplifies the tests
somewhat.</li>
</ul></dd>
<dt><strong>Version 1.2-1 (25 Sep 2015)</strong></dt>
<dd><ul>
<li>Fix a bug in leap second handling: it was sensitive to the JVM's
default time zone, and only ran correctly if the default time zone
matched UTC. Previously, formatted TIME_TT2000 data values
could be a second out for times within up to a day of the
occurrence of a leap second.</li>
<li>Add <code>Variable.getDescriptor</code> method.</li>
<li>Improve the build (mainly javadocs) a bit to reduce
some warnings that showed up in JDKs later than Java 5.
Also add a switch in the makefile to reduce noise
when building under JDK8.</li>
</ul></dd>
<dt><strong>Version 1.2-2 (4 Jan 2017)</strong></dt>
<dd><ul>
<li>2017-01-01 leap second added to internal leap second table.</li>
</ul></dd>
<dt><strong>Version 1.2-3 (16 Nov 2017)</strong></dt>
<dd><ul>
<li>Fix bugs in low-level reading code (<code>BankBuf</code>):
unsigned bytes could be read wrong in some cases, and
data could be read wrong near the boundaries of multi-buffer files
(only likely to show up for files >2Gbyte).
Thanks to Lukas Kvasnica (Brno) for identifying and fixing these.</li>
<li>Add unit tests to test the supplied <code>Buf</code>
implementations.</li>
<li>Some minor adjustments to the build/test framework to accommodate
Debian packaging (mostly replacement of test data files encumbered
by NASA copyright statement).
No change to distributed library code.</li>
</ul></dd>
<dt><strong>Version 1.2-4 (15 Dec 2021)</strong></dt>
<dd><ul>
<li>Some runtime log warnings about unexpected field values removed.
These generally concerned fields marked "Reserved for future use"
in the CDF Internal Format Description. The warnings reported deviations
from a particular CDF version (currently 3.4), but such fields may have
legitimate uses in later versions of the CDF format.</li>
<li>Fix an error in interpreting the <code>leapSecondLastUpdated</code>
field from the Global Descriptor Record; this could cause a
RuntimeException with a message about the library leap second table
being out of date during TT2000 date formatting.</li>
</ul></dd>
<dt><strong>Version 1.2-5 (27 Feb 2025)</strong></dt>
<dd><ul>
<li>Performance improvements achieved by removing synchronized blocks
in the code. Read performance typically improved by about 40%
for large files.</li>
<li><code>EpochFormatter</code> class instances are now thread-safe.</li>
</ul></dd>
</dl>
<h2>Context</h2>
<p>This software was written by
<a href="https://www.star.bristol.ac.uk/mbt/">Mark Taylor</a>
at the University of Bristol at the request of, and funded by,
the science archive group at ESA's European Space Astronomy Centre (ESAC).
</p>
<p>It is used within
<a href="http://www.starlink.ac.uk/topcat/">TOPCAT</a
>/<a href="http://www.starlink.ac.uk/stilts/">STILTS</a
>/<a href="http://www.starlink.ac.uk/stil/">STIL</a>
to enable access to CDF tables alongside other tabular data formats
by that software.
It is believed to be in some use also at ESAC and elsewhere.
If anybody out there is using it and is willing to be listed here,
let me know.
</p>
<p>It is licensed under the LGPL, though if you need a different licence
I can probably arrange it.
</p>
<p>My thanks to several people who have helped with development and
testing, including Christophe Arviset, Robert Candey, Michael Liu,
Jeremy Faden and Bogdan Nicula.
</p>
<p>Bug reports, questions, feedback and enhancement requests are welcome at
<a href="mailto:m.b.taylor@bristol.ac.uk">m.b.taylor@bristol.ac.uk</a>.
</p>
<address>
<hr />
Mark Taylor --
<a href='https://www.bristol.ac.uk/physics/research/astrophysics/'
>Astrophysics Group</a>,
<a href='https://www.bristol.ac.uk/physics/'>School of Physics</a>,
<a href='https://www.bristol.ac.uk/'>University of Bristol</a>
</address>
</body>
</html>