
The data type of values expected for a Field in a RecordSet. This class is inspired by the Datatype class in CSVW. In addition to simple atomic types, types can be semantic types, such as schema.org classes, as well as types defined in other vocabularies.

Key Features

Atomic Data Types

dataType     Usage
sc:Boolean   Describes a boolean
sc:Date      Describes a date
sc:Float     Describes a float
sc:Integer   Describes an integer
sc:Text      Describes a string
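
As an illustration of how a consumer might act on these atomic types, the following Python sketch maps each dataType to a parser for raw string values. The names `ATOMIC_PARSERS`, `parse_bool`, and `parse_value` are hypothetical and not part of Croissant or any library, and the accepted boolean spellings and the ISO 8601 date format are assumptions made for the example.

```python
from datetime import date


def parse_bool(text: str) -> bool:
    """Parse a boolean from text (accepted spellings are an assumption)."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical mapping from an atomic dataType to a Python parser.
ATOMIC_PARSERS = {
    "sc:Boolean": parse_bool,
    "sc:Date": date.fromisoformat,  # assumes ISO 8601 dates (YYYY-MM-DD)
    "sc:Float": float,
    "sc:Integer": int,
    "sc:Text": str,
}


def parse_value(data_type: str, raw: str):
    """Convert a raw string to the Python value implied by data_type."""
    try:
        parser = ATOMIC_PARSERS[data_type]
    except KeyError:
        raise ValueError(f"unsupported atomic dataType: {data_type}") from None
    return parser(raw)
```

For instance, `parse_value("sc:Integer", "42")` yields the integer 42, while an unlisted dataType raises `ValueError`.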

ML-Specific Data Types

dataType        Usage
sc:ImageObject  Describes a field containing the content of an image (pixels)
cr:BoundingBox  Describes the coordinates of a bounding box (4-number array)
cr:Split        Describes a RecordSet used to divide data into multiple sets according to their intended usage with regard to models
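
The "4-number array" shape of a cr:BoundingBox value can be checked with a few lines of Python. This is a minimal sketch: `is_bounding_box` is a hypothetical helper, and it checks only the shape of the value, since the coordinate order and units are not stated here.

```python
from numbers import Real


def is_bounding_box(value) -> bool:
    """Return True if value is a sequence of exactly four numbers."""
    if not isinstance(value, (list, tuple)) or len(value) != 4:
        return False
    # bool is a subclass of int in Python, so exclude it explicitly.
    return all(isinstance(v, Real) and not isinstance(v, bool) for v in value)
```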

Using Data Types from Other Vocabularies

Croissant datasets can use data types from other vocabularies, such as Wikidata. Tools that consume the data may support these types, but are not required to. For example:

dataType            Usage
wd:Q48277 (gender)  Describes a Field or a RecordSet whose values are indicative of someone’s gender
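
Prefixed names such as wd:Q48277 are shorthand for full IRIs. The Python sketch below shows one way a tool might expand them. The `PREFIXES` table and the `expand` helper are hypothetical, and the namespace IRIs in the table are assumptions that should be checked against the @context of the dataset being read.

```python
# Assumed prefix-to-namespace mapping; verify against the dataset's @context.
PREFIXES = {
    "sc": "https://schema.org/",
    "cr": "http://mlcommons.org/croissant/",
    "wd": "https://www.wikidata.org/wiki/",
}


def expand(name: str) -> str:
    """Expand a prefixed name to a full IRI; leave other strings unchanged."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name
```

Under these assumptions, `expand("wd:Q48277")` gives `https://www.wikidata.org/wiki/Q48277`, and a value that is already a full IRI passes through unchanged.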

Examples

Simple Field Type

{
  "@id": "images/color_sample",
  "@type": "cr:Field",
  "dataType": "sc:ImageObject"
}

Multiple Data Types

{
  "@id": "cities/url",
  "@type": "cr:Field",
  "dataType": ["https://schema.org/URL", "https://www.wikidata.org/wiki/Q515"]
}

This example shows a field whose values are expected to be URLs and whose semantic type is City (Wikidata Q515), so each value is a URL that refers to a city.
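
Because dataType may be a single string or a list, a consumer typically normalizes it before use. The Python sketch below does that and separates the types a tool recognizes from those it can ignore. `KNOWN_TYPES`, `data_types`, and `split_known` are hypothetical names, and the contents of `KNOWN_TYPES` are chosen only for this example.

```python
# Hypothetical set of types that this particular tool knows how to handle.
KNOWN_TYPES = {"https://schema.org/URL", "https://schema.org/Text"}


def data_types(field: dict) -> list[str]:
    """Return the field's dataType as a list, whether given as str or list."""
    value = field.get("dataType", [])
    return [value] if isinstance(value, str) else list(value)


def split_known(field: dict) -> tuple[list[str], list[str]]:
    """Split a field's data types into (recognized, unrecognized)."""
    types = data_types(field)
    known = [t for t in types if t in KNOWN_TYPES]
    other = [t for t in types if t not in KNOWN_TYPES]
    return known, other


cities_url = {
    "@id": "cities/url",
    "@type": "cr:Field",
    "dataType": ["https://schema.org/URL", "https://www.wikidata.org/wiki/Q515"],
}
known, other = split_known(cities_url)
```

Here `known` holds the URL type, which the tool can use to parse values, and `other` holds the Wikidata type, which a tool without Wikidata support can carry along as an annotation or skip.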