Giacomo Debidda (IT)

I am a former biomedical/clinical engineer who fell in love with programming. I use Python for data science and JavaScript for data visualization.
Always leave the codebase cleaner than you found it.

Getting started with HDF5 and PyTables
Talk in English, Data Science Hall (-1.61)
Sunday, 11 March, 14:15

HDF5 is a data model, a library, and a file format for storing and managing large, complex data. PyTables is a Python package built on top of the HDF5 library and NumPy. It provides a high-level interface with advanced indexing and database-like query capabilities. PyTables is both easy to use and extremely fast, so it can be an invaluable tool if you need to work with large, hierarchical datasets. By the end of this talk you will know what HDF5 is, why it might be the right file format for you, and where PyTables fits in the Python data ecosystem.
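To make the "hierarchical datasets" idea concrete, here is a minimal sketch of writing and reading a small table with PyTables. The record layout (`Reading`), the group and table names, and the file name are all invented for illustration and are not taken from the talk.

```python
import os
import tempfile

import tables


# Hypothetical record layout, invented for this sketch.
class Reading(tables.IsDescription):
    sensor_id = tables.Int32Col()
    value = tables.Float64Col()


def write_and_read(path):
    """Create an HDF5 file with one group and one table, then read it back."""
    with tables.open_file(path, mode="w", title="Demo file") as h5:
        # Groups behave like directories inside the file.
        group = h5.create_group("/", "experiment", "Example group")
        table = h5.create_table(group, "readings", Reading, "Example table")

        # Fill the table row by row, then flush the buffered rows to disk.
        row = table.row
        for i in range(5):
            row["sensor_id"] = i
            row["value"] = i * 0.5
            row.append()
        table.flush()

    with tables.open_file(path, mode="r") as h5:
        # Nodes are addressed with filesystem-like paths.
        table = h5.get_node("/experiment/readings")
        return table.read()  # NumPy structured array with all rows


path = os.path.join(tempfile.mkdtemp(), "demo.h5")
data = write_and_read(path)
```

The result is a regular NumPy structured array, so `data["value"]` can be passed straight to any NumPy-based tool.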

- What is HDF5 and who uses it?
- Brief overview of the HDF5 data model
- First steps with PyTables
- PyTables tools
- Search big data with PyTables and NumExpr
- Additional resources to learn more
- Q&A
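
As a rough illustration of the "Search big data with PyTables and NumExpr" item, the sketch below runs a query expressed as a condition string, which PyTables evaluates with NumExpr while scanning the table. The column names, the data, and the condition are assumptions made up for this example.

```python
import os
import tempfile

import numpy as np
import tables


def build_table(path, n_rows=1000):
    """Write a table of synthetic readings (made-up data for this sketch)."""
    dtype = np.dtype([("sensor_id", np.int32), ("value", np.float64)])
    records = np.zeros(n_rows, dtype=dtype)
    records["sensor_id"] = np.arange(n_rows) % 10
    records["value"] = np.arange(n_rows, dtype=np.float64)
    with tables.open_file(path, mode="w") as h5:
        # The table description is inferred from the structured array.
        h5.create_table("/", "readings", obj=records)


def query(path):
    """Return the rows matching a condition string evaluated by NumExpr."""
    with tables.open_file(path, mode="r") as h5:
        table = h5.root.readings
        # Only the matching rows are returned as a NumPy structured array.
        return table.read_where("(sensor_id == 3) & (value >= 900)")


path = os.path.join(tempfile.mkdtemp(), "query_demo.h5")
build_table(path)
hits = query(path)
```

For results too large to hold in memory, `table.where(...)` accepts the same kind of condition string and yields matching rows one at a time instead.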