nxarray¶
xarray extension for NeXus input/output.
nxarray
extends xarray DataArrays and Datasets with a high-level Python interface for NeXus file input and output.
Installation¶
You can install nxarray with pip:
$ pip install nxarray
Prerequisites¶
nxarray is built on, and depends on, the nexusformat and xarray packages.
Usage¶
After installation, import nxarray with:
>>> import nxarray
Now the nxr.save()
method will be available to xarray Datasets. To save an existing Dataset to a NeXus file simply type:
>>> import xarray
>>> ds = xarray.Dataset()
>>> ds.nxr.save('path/to/file.nx')
To load a NeXus file into an xarray Dataset use the nxarray.load()
function:
>>> ds = nxarray.load('path/to/file.nx')
The default NXentry in the NeXus file will be loaded into the Dataset, with all its subgroups (NXdata, NXinstrument, NXsample…).
Note that only one NXentry can be loaded into a Dataset at a time. To load a different NXentry, specify it using the entry=
argument:
>>> ds = nxarray.load('path/to/file.nx', entry="myentry")
Upon loading, the fields in the NXdata groups within the NXentry are loaded into the data variables and coordinates of the Dataset, with their relevant attributes:
>>> ds
The NeXus tree of the NXentry with all the subgroups (NXinstrument, NXsample…) is stored in the NXtree
attribute of the Dataset (TAB completion can be used on NXtree
).
>>> ds.NXtree
data:NXdata
@axes = 'energy'
@energy_indices = 0
@signal = 'absorbed_beam'
instrument:NXinstrument
source:NXsource
current = 308.52
@units = 'mA'
>>> ds.NXtree.instrument
NXinstrument('instrument')
All xarray methods and attributes are accessible as usual. For example, to plot the default signal:
>>> ds.absorbed_beam.plot()
For more info on the resulting Dataset structure and the architecture of nxarray
look at the Design section.
Examples¶
Let’s start by importing:
import numpy as np
import xarray as xr
import nxarray as nxr
and creating a dataset ds
:
ds = xr.Dataset()
data = xr.DataArray(np.random.randn(2, 3),
                    dims=('x', 'y'),
                    coords={'x': [10, 20], 'y': [1, 2, 3]},
                    name='some_data')
ds['MyData'] = data
The ds
Dataset can be saved to a NeXus file on disk with:
ds.nxr.save('ds.nxs')
You can load it back, let’s say to another Dataset ds2
with:
ds2 = nxr.load('ds.nxs')
and you can check that the whole structure of your Dataset is preserved.
Additionally, the NXtree
attribute is present (in this example containing zero objects).
Naming conventions¶
Note that the nxr
accessor for xarray objects is always available under this name, regardless of the alias used when importing nxarray.
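The reason the accessor name does not follow the import alias can be illustrated with a toy model. xarray lets extension packages attach an accessor under a fixed string name (via `xarray.register_dataset_accessor`); assuming nxarray relies on that mechanism, which this page does not state explicitly, the name is chosen once by the package itself. The sketch below uses only the standard library, and `ToyDataset`, `register_toy_accessor` and `ToyNXAccessor` are hypothetical names, not part of xarray or nxarray.

```python
class ToyDataset:
    """Hypothetical stand-in for xarray.Dataset."""


def register_toy_accessor(name):
    """Attach an accessor class to ToyDataset under a fixed attribute name."""
    def decorator(accessor_cls):
        # The property builds a new accessor bound to the dataset on each access.
        setattr(ToyDataset, name, property(lambda ds: accessor_cls(ds)))
        return accessor_cls
    return decorator


@register_toy_accessor("nxr")  # name fixed by the extension, not by the importer
class ToyNXAccessor:
    def __init__(self, ds):
        self._ds = ds

    def save(self, path):
        # Toy behaviour only: report what would be saved and where.
        return f"would save {type(self._ds).__name__} to {path}"


ds = ToyDataset()
message = ds.nxr.save("ds.nxs")
print(message)
```

Whatever alias the importing code gives to the module that defines the accessor, the attribute on the dataset stays `nxr`, because the string passed at registration time is the only thing that determines it.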
Design¶
The architecture of a NeXus file resembles the structure of an xarray Dataset, with some important differences. In the following it is assumed the reader is familiar with the nomenclature of xarray and NeXus NXdata.
The following table summarizes the correspondence established by nxarray between NeXus and xarray objects and definitions.
NeXus | xarray |
---|---|
NXentry | Dataset.NXtree (*) |
NXdata.entries | Dataset.data_vars, Dataset.coords (**) |
signal | data variable |
NXdata.nxaxes | Dataset.dims |
axes | dimensions |
(*) The complete structure of the NXentry is loaded into the NXtree
Dataset attribute, with the exception of the entries in NXdata.entries
which are loaded into the Dataset data variables and coordinates as DataArrays (see below).
(**) The entries in NXdata.entries
are loaded into the Dataset data variables and coordinates as DataArrays, provided the attributes @signal
and @axes
are present in the NXdata group. NXlinks are resolved transparently and are kept when saving back to NeXus. The entry attributes are assigned to the corresponding DataArray. Additionally, the nxgroup
attribute is added to each DataArray and its value is set to the name of the NXdata group (NXdata.nxname
).
The identification of an entry as data variable or coordinate is performed as follows:
- An entry referred to by the @signal attribute of NXdata is considered a Dataset data variable.
- An entry is considered a coordinate if:
  - it is listed in the @axes attribute of NXdata, or
  - an attribute AXIS_indices (where AXIS is the entry name) is present in the NXdata group.
- Any other entry:
  - is considered a data variable if its shape matches the @signal field shape;
  - is disregarded otherwise.
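The identification rules above can be sketched in plain Python. This is only an illustration of the decision order as described here, not nxarray's actual implementation: `ToyNXdata` and `classify_entries` are hypothetical names, the group is modelled as a mapping of entry names to shapes plus a dict of group attributes, and the `drain_current` and `comment` entries are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNXdata:
    """Hypothetical stand-in for an NXdata group."""
    shapes: dict                               # entry name -> shape tuple
    attrs: dict = field(default_factory=dict)  # group attributes, e.g. {"signal": ...}


def classify_entries(group):
    """Split entry names into data variables, coordinates and disregarded entries."""
    signal = group.attrs.get("signal")
    axes = group.attrs.get("axes", [])
    if isinstance(axes, str):  # a single axis may be given as a plain string
        axes = [axes]

    data_vars, coords, disregarded = [], [], []
    for name, shape in group.shapes.items():
        if name == signal:
            data_vars.append(name)   # the entry named by @signal
        elif name in axes or f"{name}_indices" in group.attrs:
            coords.append(name)      # listed in @axes, or has an AXIS_indices attribute
        elif signal in group.shapes and shape == group.shapes[signal]:
            data_vars.append(name)   # same shape as the signal field
        else:
            disregarded.append(name)
    return data_vars, coords, disregarded


group = ToyNXdata(
    shapes={"absorbed_beam": (100,), "energy": (100,),
            "drain_current": (100,), "comment": ()},
    attrs={"signal": "absorbed_beam", "axes": "energy", "energy_indices": 0},
)
data_vars, coords, disregarded = classify_entries(group)
print(data_vars, coords, disregarded)
```

Note that the coordinate test runs before the shape test, so `energy` is treated as a coordinate even though its shape matches the signal.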
Motivations¶
Although xarray natively supports import and export of HDF5 (a file format designed to efficiently store and organize large amounts of data), it does not provide an integrated interface to the NeXus file format, the de facto standard for scientific data storage, which is based on HDF5 and increasingly adopted in laboratories and large-scale facilities all over the world.
This is where the nxarray package comes in, bridging xarray and the NeXus format. The package extends xarray by providing convenient loading and saving methods for NeXus files directly on Dataset objects. The architecture of a NeXus file resembles the structure of an xarray Dataset, and indeed both are specifically designed for handling scientific data together with its relevant metadata.
nxarray
is part of the reScipy project.
Feedback¶
Please report any feedback, bugs, or feature requests by opening an issue on the issue tracker of the code repository. Include as much information as possible to reproduce the problem, or a detailed description of the feature you would like.