A file format is determined when a file is saved. It describes how the data in the file are structured and indicates the file's purpose and the application it belongs to. Application software uses this information to interpret the data and make the contents available. The format is indicated by an extension appended to the file name, which consists of a period followed by, typically, two to four letters (e.g. .jpg, .docx).
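As a small illustration of how software can read the extension from a file name, here is a minimal Python sketch using the standard library's pathlib module. The function name extension_of and the sample file names are made up for this example. Note that the extension is only a naming convention: renaming a file does not change the data inside it.

```python
from pathlib import Path


def extension_of(filename: str) -> str:
    """Return the lowercase extension without the period, or '' if there is none.

    Hypothetical helper for illustration only.
    """
    # Path.suffix gives the last extension including the period, e.g. ".XLSX"
    return Path(filename).suffix.lower().lstrip(".")


print(extension_of("survey_results.XLSX"))  # xlsx
print(extension_of("README"))               # (empty string: no extension)
```

For names with several periods, such as archive.tar.gz, this sketch returns only the last part (gz).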
With so-called proprietary formats, files can generally only be opened, edited and saved with the associated application, auxiliary or system programs (e.g. .doc/.docx, .xls/.xlsx). Open formats (e.g. .html, .jpg, .mp3, .gif), on the other hand, can be opened and edited with software from different manufacturers.
File formats can be changed deliberately by converting a file when saving it, but data may be lost in the process. In research, particular attention should be paid to compatibility, suitability for long-term archiving and the possibility of lossless conversion to alternative formats.
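One way to check whether a conversion is lossless is a round trip: convert the data to the target format, read it back, and compare the result with the original. The following Python sketch does this for CSV, assuming a small made-up record; the function name csv_round_trip and the field names are hypothetical. It is meant as an illustration, not as a complete validation procedure.

```python
import csv
import io


def csv_round_trip(rows):
    """Write rows to CSV text, read them back, and return what was recovered.

    Hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    buffer.seek(0)  # rewind so the reader starts at the beginning
    return list(csv.DictReader(buffer))


# Made-up sample record with a text, a numeric and a boolean value
original = [{"sample": "A1", "temperature_c": 21.5, "valid": True}]
recovered = csv_round_trip(original)

# CSV stores everything as text, so the number and the boolean come back as strings
is_lossless = recovered == original
print(recovered)
print("lossless:", is_lossless)
```

In this sketch the values themselves survive, but their data types do not. This kind of loss is easy to overlook, and it is one reason to document the expected types alongside the data.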
Duration: 5:12 min
Content: This short knowledge clip explains what file formats are, why they are important for research data management and what you should pay attention to.
Ghent University Data Stewards (2020). Knowledge clip: file formats. Available at: https://youtu.be/kxxlQnc8u1I
Licence: CC BY 4.0
You can find further information including descriptions of various file formats on the Library of Congress website.
You can download the data management best practices evaluation checklist from the UCSB Library for helpful tips on file formats and file organization.
FAIR Data Austria (2021). “File formats”. In: Research Data Management Open Educational Resources Collection. (https://fair-office.at/index.php/fileformates/?lang=en).
License: CC BY 4.0 unless otherwise stated.