Argument Types

DLHub supports many types of data as inputs and outputs to servables. In this part of the guide, we describe what these types are and how to define them when describing an interface. A full listing of the types is maintained in the DLHub schemas repository. A utility for creating type definitions can also be found in the DLHub SDK.

float, integer, number, complex

There are a variety of ways to express numerical values in DLHub interfaces:

  - number implies any real numerical value, with no restriction on whether the data is a float or an integer.
  - float and integer are available if it is necessary to ensure the values are stored as floats or integers.
  - complex is used for complex numbers.
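The difference between the four numeric type names can be sketched in plain Python. The function numeric_type_matches below is a hypothetical helper written for illustration and is not part of the DLHub SDK; how it treats Booleans and complex values is an assumption.

```python
import numbers


def numeric_type_matches(value, type_name):
    """Hypothetical check of a Python value against a numeric type name."""
    # bool is a subclass of int in Python; this sketch does not treat it as numeric
    if isinstance(value, bool):
        return False
    if type_name == "number":
        # Any real value: both integers and floats qualify
        return isinstance(value, numbers.Real)
    if type_name == "integer":
        return isinstance(value, numbers.Integral)
    if type_name == "float":
        return isinstance(value, float)
    if type_name == "complex":
        # Assumption: only values stored as Python complex qualify
        return isinstance(value, complex)
    raise ValueError(f"not a numeric type name: {type_name}")


print(numeric_type_matches(3, "number"))        # True
print(numeric_type_matches(3, "float"))         # False
print(numeric_type_matches(2.5, "integer"))     # False
print(numeric_type_matches(1 + 2j, "complex"))  # True
```

In this sketch, number accepts both 3 and 2.5, while float and integer each accept only one of them.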

string

Used for any string values.

boolean

Used for Boolean values.

timedelta, datetime

These data types define values associated with time. timedelta and datetime represent a length of time and a specific time, respectively.
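As a minimal illustration using Python's standard datetime module, the two types correspond to a point in time and a duration. The variable names and the dictionary form of the definitions are assumptions that follow the pattern used for other types in this guide.

```python
from datetime import datetime, timedelta

# Type definitions, assuming the same {"type": ...} form used elsewhere in this guide
start_definition = {"type": "datetime"}
duration_definition = {"type": "timedelta"}

start = datetime(2020, 1, 1, 12, 0, 0)        # a specific time
duration = timedelta(hours=1, minutes=30)     # a length of time

# Adding a length of time to a specific time gives another specific time
end = start + duration
print(end.isoformat())  # 2020-01-01T13:30:00
```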

python object

The python object type is used for data that cannot be expressed by other types. The only required argument for the python object type is the Python type of the object, given by listing the Python class with the python_type keyword. For example, a Pandas DataFrame would be expressed as:

{
    "type": "python object",
    "python_type": "pandas.DataFrame"
}
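One possible way to generate such a definition from a class is sketched below. The helper python_object_definition is hypothetical and not part of the DLHub SDK. It relies on the module path that the class itself reports, which for some libraries is an internal module rather than the public import path, so the resulting string may need to be adjusted by hand.

```python
from fractions import Fraction


def python_object_definition(cls):
    """Hypothetical helper: build a "python object" definition from a class."""
    # __module__ and __qualname__ give the location the class reports for itself
    path = f"{cls.__module__}.{cls.__qualname__}"
    return {"type": "python object", "python_type": path}


definition = python_object_definition(Fraction)
print(definition)
```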

ndarray

ndarray values are N-dimensional arrays. The shape must be specified using the shape keyword, which takes a list of integers or None values. A None value (null in JSON) in a shape definition means that the dimension can take on any size. The type of each value in the array can be defined using the item_type keyword, which takes a type definition as its only argument. For example, an Nx3 array of integers can be represented by:

{
   "type": "ndarray",
   "item_type": {"type": "integer"},
   "shape": [null, 3]
}
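The wildcard behavior of null entries can be sketched as follows. The function shape_matches is a hypothetical helper, not part of the DLHub SDK, and it takes the shape of the data as a plain tuple so that the example needs only the standard library.

```python
import json

# Parsing the JSON definition turns null into Python's None
definition = json.loads(
    '{"type": "ndarray", "item_type": {"type": "integer"}, "shape": [null, 3]}'
)


def shape_matches(actual_shape, declared_shape):
    """Hypothetical check: None in the declared shape matches any size."""
    if len(actual_shape) != len(declared_shape):
        return False  # the number of dimensions must agree
    return all(
        declared is None or actual == declared
        for actual, declared in zip(actual_shape, declared_shape)
    )


print(shape_matches((10, 3), definition["shape"]))  # True
print(shape_matches((10, 4), definition["shape"]))  # False
print(shape_matches((3,), definition["shape"]))     # False
```

With an array library such as NumPy, the first argument would typically be the array's shape attribute.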

file

File inputs to a servable are supported by the file argument type. Users can specify which file types are allowed using MIME types. For example, an application that takes CSV files and Excel spreadsheets would be represented as:

{
   "type": "file",
   "file_types": [
      "text/csv",
      "application/vnd.ms-excel",
      "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
   ]
}

Files can only be combined with other types of inputs in a limited number of ways. The only acceptable function signatures involving files are:

  1. A single file is the sole input/output
  2. A list of files is the sole input/output
  3. Files or lists of files are one or more of the inputs or outputs of the function

In other words, files may not be part of a dictionary or tuple data structure, and may not appear in a list of lists.
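The rules above can be sketched as a check on the type definition of a single input or output. Both functions below are hypothetical helpers and not part of the DLHub SDK. The sketch assumes each argument of a function has its own definition, so a function with several arguments would be checked one argument at a time.

```python
def contains_file(definition):
    """Return True if a file type appears anywhere inside the definition."""
    kind = definition.get("type")
    if kind == "file":
        return True
    if kind == "list":
        return contains_file(definition.get("item_type", {}))
    if kind == "tuple":
        return any(contains_file(d) for d in definition.get("element_types", []))
    if kind == "dict":
        return any(contains_file(d) for d in definition.get("properties", {}).values())
    return False


def file_placement_ok(definition):
    """Hypothetical check that files appear only as a file or a list of files."""
    kind = definition.get("type")
    if kind == "file":
        return True  # a single file
    if kind == "list" and definition.get("item_type", {}).get("type") == "file":
        return True  # a flat list of files
    # Anything else is acceptable only if no file is nested inside it
    return not contains_file(definition)


csv_file = {"type": "file", "file_types": ["text/csv"]}
print(file_placement_ok({"type": "list", "item_type": csv_file}))         # True
print(file_placement_ok({"type": "dict", "properties": {"a": csv_file}}))  # False
```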

list

List types define an ordered collection of indefinite length in which all items have the same type. Only the item type needs to be defined, using the item_type keyword, which requires an argument type as its value. For example, a list of 1D ndarrays would be:

{
    "type": "list",
    "item_type": {
        "type": "ndarray",
        "shape": [null]
    }
}

tuple

Tuple types define an ordered collection of known length where each member can be a different type. The type of each member and, thereby, the length must be defined using the element_types keyword. A tuple of an integer, a float, and a list of strings would be:

{
    "type": "tuple",
    "element_types": [
        {"type": "integer"},
        {"type": "float"},
        {"type": "list", "item_type": {"type": "string"}}
    ]
}

dict

The dict argument type is used for dictionary objects. The name of each key and the type of its value must be defined using the properties keyword. For example, a dictionary with key “x” mapped to an integer and “y” mapped to a float would be:

{
    "type": "dict",
    "properties": {
        "x": {"type": "integer"},
        "y": {"type": "float"}
    }
}
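To show how the container types nest, the following is a minimal sketch of a recursive check of a Python value against a type definition. The function conforms is hypothetical and not part of the DLHub SDK. It covers only a few types, and it assumes that a dict value must contain exactly the keys listed in properties.

```python
def conforms(value, definition):
    """Hypothetical recursive check of a value against a type definition."""
    kind = definition["type"]
    if kind == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if kind == "float":
        return isinstance(value, float)
    if kind == "string":
        return isinstance(value, str)
    if kind == "list":
        # Any length, but every item must match the single item_type
        return isinstance(value, list) and all(
            conforms(item, definition["item_type"]) for item in value
        )
    if kind == "tuple":
        # The length is fixed by the number of entries in element_types
        members = definition["element_types"]
        return (
            isinstance(value, tuple)
            and len(value) == len(members)
            and all(conforms(v, d) for v, d in zip(value, members))
        )
    if kind == "dict":
        # Assumption: the keys must be exactly those listed in properties
        props = definition["properties"]
        return (
            isinstance(value, dict)
            and set(value) == set(props)
            and all(conforms(value[key], d) for key, d in props.items())
        )
    raise ValueError(f"type not covered by this sketch: {kind}")


point = {
    "type": "dict",
    "properties": {"x": {"type": "integer"}, "y": {"type": "float"}},
}
print(conforms({"x": 1, "y": 2.0}, point))  # True
print(conforms({"x": 1, "y": 2}, point))    # False, because y is not a float
```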