
FileObject is the Croissant class used to represent individual files that are part of a dataset.

FileObject is a general-purpose class that inherits from the Schema.org class CreativeWork. It can be used to represent instances of more specific content types, such as DigitalDocument and MediaObject.

Most of the important properties needed to describe a FileObject are defined in the classes it inherits from:

| Property | ExpectedType | Cardinality | Description |
| --- | --- | --- | --- |
| sc:name | Text | ONE | The name of the file. As much as possible, the name should reflect the name of the file as downloaded, including the file extension, for example "images.zip". |
| sc:contentUrl | URL | ONE | Actual bytes of the media object, for example the image file or video file. |
| sc:contentSize | Text | ONE | File size in (mega/kilo/…)bytes. Defaults to bytes if a unit is not specified. |
| sc:encodingFormat | Text | ONE | The format of the file, given as a MIME type. |
| sc:sameAs | URL | MANY | URL (or local name) of a FileObject with the same content, but in a different format. |
| sc:sha256 | Text | ONE | Checksum for the file contents. |
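
As an illustration of how the sha256 property might be used, the following Python sketch computes the SHA-256 digest of a file's bytes and compares it with the value recorded on a FileObject. The helper names `sha256_hex` and `matches_checksum`, the `labels.csv` object, and its payload are hypothetical and exist only for this example. The sketch also assumes the checksum is stored as a hex string and is not taken from any Croissant library.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_checksum(file_object: dict, data: bytes) -> bool:
    """Compare the digest of `data` with the FileObject's sha256 value.

    Assumption: the checksum is stored as a hex string (case-insensitive).
    """
    expected = file_object.get("sha256")
    if expected is None:
        return False  # no checksum recorded, so nothing can be verified
    return sha256_hex(data) == expected.strip().lower()


# Hypothetical FileObject whose checksum is derived from in-memory bytes,
# standing in for the bytes that would be downloaded from contentUrl.
payload = b"id,label\n1,cat\n"
file_object = {
    "@type": "cr:FileObject",
    "@id": "labels.csv",
    "encodingFormat": "text/csv",
    "sha256": sha256_hex(payload),
}

print(matches_checksum(file_object, payload))
```

In practice the bytes would come from downloading the contentUrl, and a mismatch would typically signal a corrupted or changed file.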

In addition, FileObject defines the following property:

| Property | ExpectedType | Cardinality | Description |
| --- | --- | --- | --- |
| containedIn | Text | MANY | Another FileObject or FileSet that this one is contained in, e.g., in the case of a file extracted from an archive. When this property is present, the contentUrl is evaluated as a relative path within the container object. |

Let’s look at a few examples of FileObject definitions.

First, a single CSV file:

{
  "@type": "cr:FileObject",
  "@id": "pass_metadata.csv",
  "contentUrl": "https://zenodo.org/record/6615455/files/pass_metadata.csv",
  "encodingFormat": "text/csv",
  "sha256": "0b033707ea49365a5ffdd14615825511"
}

Next, an archive and some files extracted from it (represented via the containedIn property):

{
  "@type": "cr:FileObject",
  "@id": "ml-25m.zip",
  "contentUrl": "https://files.grouplens.org/datasets/movielens/ml-25m.zip",
  "encodingFormat": "application/zip",
  "sha256": "6b51fb2759a8657d3bfcbfc42b592ada"
},
{
  "@type": "cr:FileObject",
  "@id": "ratings-table",
  "contentUrl": "ratings.csv",
  "containedIn": { "@id": "ml-25m.zip" },
  "encodingFormat": "text/csv"
},
{
  "@type": "cr:FileObject",
  "@id": "movies-table",
  "contentUrl": "movies.csv",
  "containedIn": { "@id": "ml-25m.zip" },
  "encodingFormat": "text/csv"
}
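
To make the containedIn semantics concrete, here is a minimal Python sketch of how a consumer might resolve a contained file: it looks up the container by its `@id`, opens it as a zip archive, and reads the member whose path equals the contained object's contentUrl. The function `read_contained`, the `archives` mapping, and the toy CSV content are assumptions made for illustration. The archive is built in memory rather than downloaded, and the sketch handles only zip containers with a single containedIn reference.

```python
import io
import zipfile

# FileObject descriptions mirroring the archive example above.
file_objects = [
    {"@type": "cr:FileObject", "@id": "ml-25m.zip",
     "contentUrl": "https://files.grouplens.org/datasets/movielens/ml-25m.zip",
     "encodingFormat": "application/zip"},
    {"@type": "cr:FileObject", "@id": "ratings-table",
     "contentUrl": "ratings.csv",
     "containedIn": {"@id": "ml-25m.zip"},
     "encodingFormat": "text/csv"},
]
index = {fo["@id"]: fo for fo in file_objects}


def make_toy_zip() -> bytes:
    """Build a tiny in-memory zip with made-up rows (not the real dataset)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("ratings.csv", "userId,movieId,rating\n1,1,5.0\n")
    return buffer.getvalue()


# Hypothetical stand-in for downloading each container's contentUrl.
archives = {"ml-25m.zip": make_toy_zip()}


def read_contained(file_object: dict, index: dict, archives: dict) -> bytes:
    """Read a contained file, treating contentUrl as a path inside the container."""
    ref = file_object["containedIn"]
    if isinstance(ref, list):  # containedIn may hold several references
        ref = ref[0]           # assumption: this sketch uses only the first
    container = index[ref["@id"]]
    if container.get("encodingFormat") != "application/zip":
        raise ValueError("this sketch only handles zip containers")
    with zipfile.ZipFile(io.BytesIO(archives[container["@id"]])) as zf:
        return zf.read(file_object["contentUrl"])


ratings_bytes = read_contained(index["ratings-table"], index, archives)
print(ratings_bytes.decode("utf-8").splitlines()[0])  # prints "userId,movieId,rating"
```

Keeping the lookup keyed on `@id` means the contained objects never repeat the archive's URL. They only carry a relative path, which is what the containedIn definition describes.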