The data type of values expected for a Field in a RecordSet. This class is inspired by the Datatype class in CSVW. In addition to simple atomic types, types can be semantic types, such as schema.org classes, as well as types defined in other vocabularies.
A Field may have more than one dataType, in which case at least one must be an atomic data type (e.g., sc:Text), while other types can provide more semantic information, possibly in the context of ML.

Data types can be set on Fields and on entire RecordSets.

dataType | Usage |
---|---|
sc:Boolean | Describes a boolean |
sc:Date | Describes a date |
sc:Float | Describes a float |
sc:Integer | Describes an integer |
sc:Text | Describes a string |

dataType | Usage |
---|---|
sc:ImageObject | Describes a field containing the content of an image (pixels) |
cr:BoundingBox | Describes the coordinates of a bounding box (4-number array) |
cr:Split | Describes a RecordSet used to divide data into multiple sets according to their intended use with regard to models |

Croissant datasets can use data types from other vocabularies, such as Wikidata. Tools consuming the data may support these types, but are not required to. For example:

dataType | Usage |
---|---|
wd:Q48277 (gender) | Describes a Field or a RecordSet whose values are indicative of someone’s gender |

{
  "@id": "images/color_sample",
  "@type": "cr:Field",
  "dataType": "sc:ImageObject"
}

{
  "@id": "cities/url",
  "@type": "cr:Field",
  "dataType": ["https://schema.org/URL", "https://www.wikidata.org/wiki/Q515"]
}

This example shows a field that is expected to be a URL, whose semantic type is City, so values will be URLs referring to cities.
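
The rule that a field with several data types needs at least one atomic type can be checked mechanically. The following is a minimal Python sketch, not part of Croissant or of any Croissant library. The function names, the prefix map (chosen to match the full IRIs in the cities/url example) and the choice to leave fields with zero or one type unchecked are all assumptions made for illustration. The table of atomic types above does not list sc:URL, so the set of atomic types is a parameter that the caller can extend.

```python
from typing import Dict, FrozenSet, List

# Prefix-to-namespace map. These IRIs are assumptions taken from the
# cities/url example; check them against the @context of your dataset.
PREFIXES: Dict[str, str] = {
    "sc": "https://schema.org/",
    "wd": "https://www.wikidata.org/wiki/",
}

# The five atomic types listed in the first table. sc:URL is not in that
# table, so callers decide whether to treat it as atomic.
ATOMIC_TYPES: FrozenSet[str] = frozenset(
    {"sc:Boolean", "sc:Date", "sc:Float", "sc:Integer", "sc:Text"}
)


def compact(data_type: str) -> str:
    """Rewrite a full IRI to prefix:name form when a known namespace matches."""
    for prefix, namespace in PREFIXES.items():
        if data_type.startswith(namespace):
            return f"{prefix}:{data_type[len(namespace):]}"
    return data_type


def normalize(data_type) -> List[str]:
    """Accept a single string or a list and return a list of compact names."""
    values = [data_type] if isinstance(data_type, str) else list(data_type)
    return [compact(value) for value in values]


def has_valid_data_type(field: dict, atomic_types=ATOMIC_TYPES) -> bool:
    """With several data types, require that at least one is atomic."""
    types = normalize(field.get("dataType", []))
    if len(types) <= 1:
        # Assumption: the rule only constrains fields with multiple types.
        return True
    return any(t in atomic_types for t in types)


cities_url = {
    "@id": "cities/url",
    "@type": "cr:Field",
    "dataType": ["https://schema.org/URL", "https://www.wikidata.org/wiki/Q515"],
}

print(normalize(cities_url["dataType"]))  # ['sc:URL', 'wd:Q515']
# Passes only when the caller chooses to count sc:URL as atomic.
print(has_valid_data_type(cities_url, ATOMIC_TYPES | {"sc:URL"}))  # True
```

Normalizing first means that a field written with full IRIs and one written with prefixed names are checked the same way. A tool that does not recognize a semantic type such as wd:Q515 can simply ignore it, which is consistent with the statement that such types need not be supported.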