Bayesian inference naturally produces high-dimensional data: in Markov chain Monte Carlo, it is common to run multiple independent simulations (chains) to facilitate calculations such as effective sample size and the r-hat statistic. Posterior samples then have at least two dimensions (chain and draw), and more for multivariate random variables. Storing the data as an xarray dataset allows labeled querying of this data, along with serialization and attached metadata. ArviZ is a software library that stores the multiple datasets produced by inference using netCDF groups, which are themselves built on HDF5. The InferenceData class implements this functionality. Using netCDF provides native handling of high-dimensional data and lets ArviZ reuse existing serialization and deserialization functions, so each function needs to be implemented only once.
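
To make the dimensionality argument concrete, the following is a minimal sketch of posterior draws held in a labeled xarray dataset. It uses xarray and NumPy directly rather than ArviZ's own API; the variable names `mu` and `theta`, the `group` coordinate, the `sampler` attribute, and the random draws are all illustrative assumptions, not output from a real sampler.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
n_chains, n_draws, n_groups = 4, 500, 3  # hypothetical sizes

# Simulated "posterior" draws (random numbers standing in for sampler output).
posterior = xr.Dataset(
    data_vars={
        # A scalar parameter has two dimensions: chain and draw.
        "mu": (("chain", "draw"), rng.normal(size=(n_chains, n_draws))),
        # A vector-valued parameter adds a third, labeled dimension.
        "theta": (
            ("chain", "draw", "group"),
            rng.normal(size=(n_chains, n_draws, n_groups)),
        ),
    },
    coords={
        "chain": np.arange(n_chains),
        "draw": np.arange(n_draws),
        "group": ["a", "b", "c"],
    },
    # Metadata travels with the dataset.
    attrs={"sampler": "hypothetical"},
)

# Labeled querying: select by coordinate name instead of positional index.
theta_b = posterior["theta"].sel(group="b")

# Reduce over a named dimension to get one mean per chain.
chain_means = posterior["mu"].mean(dim="draw")
```

Because every axis is named, selections and reductions refer to `chain`, `draw`, or `group` rather than to axis positions, which is what makes diagnostics computed across chains straightforward to express.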