Record selector
The record selector translates an HTTP response into a list of Airbyte records. It extracts records from the response and can optionally filter and shape them based on a heuristic. Schema:
  HttpSelector:
    type: object
    anyOf:
      - "$ref": "#/definitions/RecordSelector"
  RecordSelector:
    type: object
    required:
      - extractor
    properties:
      "$parameters":
        "$ref": "#/definitions/$parameters"
      extractor:
        "$ref": "#/definitions/RecordExtractor"
      record_filter:
        "$ref": "#/definitions/RecordFilter"
The current record extraction implementation uses dpath to select records from the JSON-decoded HTTP response.
For nested structures, "*" can be used in the field path to iterate over array elements.
Schema:
  DpathExtractor:
    type: object
    additionalProperties: true
    required:
      - field_path
    properties:
      "$parameters":
        "$ref": "#/definitions/$parameters"
      field_path:
        type: array
        items:
          type: string
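As a rough illustration of how a field_path is resolved, the following sketch walks a decoded response body in plain Python. It is not the dpath library or the connector framework's own code: the helper names select and extract_records are hypothetical, and wrapping a single matched object in a list is modeled on the behavior described in the recipes in this section.

```python
from typing import Any, Iterator, List, Sequence


def select(node: Any, path: Sequence[str]) -> Iterator[Any]:
    """Yield every value reached by following `path` from `node`."""
    if not path:
        yield node
        return
    head, rest = path[0], path[1:]
    if head == "*":
        # "*" visits every element of a list, or every value of an object.
        if isinstance(node, list):
            children = node
        elif isinstance(node, dict):
            children = list(node.values())
        else:
            children = []
        for child in children:
            yield from select(child, rest)
    elif isinstance(node, dict) and head in node:
        yield from select(node[head], rest)
    # Any other combination matches nothing.


def extract_records(body: Any, field_path: Sequence[str]) -> List[Any]:
    """Hypothetical stand-in for the extractor: always returns a list."""
    matches = list(select(body, field_path))
    if "*" in field_path:
        extracted = matches  # one entry per wildcard match
    else:
        extracted = matches[0] if matches else []
    if isinstance(extracted, list):
        return extracted
    # A single object is wrapped in a list; an empty match yields no records.
    return [extracted] if extracted else []


body = {"data": [{"record": {"id": "1"}}, {"record": {"id": "2"}}]}
records = extract_records(body, ["data", "*", "record"])
# records == [{"id": "1"}, {"id": "2"}]
```

An empty field_path selects the root of the response, which is why the same helper covers both the "root is an array" and the "root is a single record" cases.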
Common recipes:
Here are some common patterns:
Selecting the whole response
If the root of the response is an array containing the records, the records can be extracted using the following definition:
selector:
  extractor:
    field_path: [ ]
If the root of the response is a JSON object representing a single record, the record is extracted and wrapped in an array. For example, given a response body of the form
{
  "id": 1
}
and a selector
selector:
  extractor:
    field_path: [ ]
The selected records will be
[
  {
    "id": 1
  }
]
Selecting a field
Given a response body of the form
{
  "data": [{"id": 0}, {"id": 1}],
  "metadata": {"api-version": "1.0.0"}
}
and a selector
selector:
  extractor:
    field_path: [ "data" ]
The selected records will be
[
  {
    "id": 0
  },
  {
    "id": 1
  }
]
Selecting an inner field
Given a response body of the form
{
  "data": {
    "records": [
      {
        "id": 1
      },
      {
        "id": 2
      }
    ]
  }
}
and a selector
selector:
  extractor:
    field_path: [ "data", "records" ]
The selected records will be
[
  {
    "id": 1
  },
  {
    "id": 2
  }
]
Selecting fields nested in arrays
Given a response body of the form
{
  "data": [
    {
      "record": {
        "id": "1"
      }
    },
    {
      "record": {
        "id": "2"
      }
    }
  ]
}
and a selector
selector:
  extractor:
    field_path: [ "data", "*", "record" ]
The selected records will be
[
  {
    "id": "1"
  },
  {
    "id": "2"
  }
]
Filtering records
Records can be filtered by adding a record_filter to the selector. The filter's condition is evaluated for each record as a boolean expression; a record is included only if the expression evaluates to true.
In this example, only records whose created_at field is less than the stream slice's start_time are kept. All records with a created_at greater than or equal to start_time are filtered out:
selector:
  extractor:
    field_path: [ ]
  record_filter:
    condition: "{{ record['created_at'] < stream_slice['start_time'] }}"
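The keep-if-true semantics of the condition above can be sketched in plain Python. This is only an illustration, not how the framework evaluates the template: filter_records and created_before_slice_start are hypothetical names, and the sample records assume created_at and start_time are ISO 8601 date strings, which compare correctly as text.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def filter_records(
    records: List[Record],
    stream_slice: Dict[str, Any],
    condition: Callable[[Record, Dict[str, Any]], bool],
) -> List[Record]:
    """Keep only the records for which `condition` returns True."""
    return [record for record in records if condition(record, stream_slice)]


def created_before_slice_start(record: Record, stream_slice: Dict[str, Any]) -> bool:
    # Python equivalent of: record['created_at'] < stream_slice['start_time']
    return record["created_at"] < stream_slice["start_time"]


# Sample data invented for this illustration.
records = [
    {"id": 1, "created_at": "2023-01-01"},
    {"id": 2, "created_at": "2023-02-01"},
    {"id": 3, "created_at": "2023-03-01"},
]
stream_slice = {"start_time": "2023-02-01"}

kept = filter_records(records, stream_slice, created_before_slice_start)
# kept == [{"id": 1, "created_at": "2023-01-01"}]
```

The record with id 2 is dropped because its created_at equals start_time and the comparison is strict.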
Transformations
Fields can be added to or removed from records by adding transformations to a stream's definition.
Schema:
  RecordTransformation:
    type: object
    anyOf:
      - "$ref": "#/definitions/AddFields"
      - "$ref": "#/definitions/RemoveFields"
Adding fields
Fields can be added with the AddFields transformation.
The first example after the schema adds a top-level field "field1" with the value "static_value".
Schema:
  AddFields:
    type: object
    required:
      - fields
    additionalProperties: true
    properties:
      "$parameters":
        "$ref": "#/definitions/$parameters"
      fields:
        type: array
        items:
          "$ref": "#/definitions/AddedFieldDefinition"
  AddedFieldDefinition:
    type: object
    required:
      - path
      - value
    additionalProperties: true
    properties:
      "$parameters":
        "$ref": "#/definitions/$parameters"
      path:
        "$ref": "#/definitions/FieldPointer"
      value:
        type: string
  FieldPointer:
    type: array
    items:
      type: string
Example:
stream:
  <...>
  transformations:
      - type: AddFields
        fields:
          - path: [ "field1" ]
            value: "static_value"
This example adds a top-level field "start_date", whose value is evaluated from the stream slice:
stream:
  <...>
  transformations:
      - type: AddFields
        fields:
          - path: [ "start_date" ]
            value: "{{ stream_slice['start_date'] }}"
Fields can also be added inside a nested object by writing the field's path as a list of keys.
Given a record of the following shape:
{
  "id": 0,
  "data":
  {
    "field0": "some_data"
  }
}
this definition will add a field in the "data" nested object:
stream:
  <...>
  transformations:
      - type: AddFields
        fields:
          - path: [ "data", "field1" ]
            value: "static_value"
resulting in the following record:
{
  "id": 0,
  "data":
  {
    "field0": "some_data",
    "field1": "static_value"
  }
}
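The effect of a path such as [ "data", "field1" ] can be sketched in plain Python as follows. The add_field helper is hypothetical and is not the framework's implementation. Creating missing intermediate objects and returning a copy rather than mutating the input are assumptions made for this illustration.

```python
import copy
from typing import Any, Dict, Sequence


def add_field(record: Dict[str, Any], path: Sequence[str], value: Any) -> Dict[str, Any]:
    """Return a copy of `record` with `value` set at the nested `path`."""
    result = copy.deepcopy(record)
    node = result
    for key in path[:-1]:
        # Assumption: intermediate objects are created when they are missing.
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return result


record = {"id": 0, "data": {"field0": "some_data"}}
updated = add_field(record, ["data", "field1"], "static_value")
# updated == {"id": 0, "data": {"field0": "some_data", "field1": "static_value"}}
```

A single-element path such as [ "field1" ] skips the loop entirely and sets a top-level field.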
Removing fields
Fields can be removed from records with the RemoveFields transformation.
Schema:
  RemoveFields:
    type: object
    required:
      - field_pointers
    additionalProperties: true
    properties:
      "$parameters":
        "$ref": "#/definitions/$parameters"
      field_pointers:
        type: array
        items:
          "$ref": "#/definitions/FieldPointer"
Given a record of the following shape:
{
  "path": 
  {
    "to":
    {
      "field1": "data_to_remove",
      "field2": "data_to_keep"
    }
  },
  "path2": "data_to_remove",
  "path3": "data_to_keep"
}
this definition removes both instances of "data_to_remove", which are found at "path2" and "path.to.field1":
stream:
  <...>
  transformations:
      - type: RemoveFields
        field_pointers:
          - [ "path", "to", "field1" ]
          - [ "path2" ]
resulting in the following record:
{
  "path": 
  {
    "to":
    {
      "field2": "data_to_keep"
    }
  },
  "path3": "data_to_keep"
}
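The removal above can be sketched in plain Python as follows. The remove_fields helper is hypothetical and is not the framework's implementation. Silently ignoring pointers that do not exist in the record and returning a copy are assumptions made for this illustration.

```python
import copy
from typing import Any, Dict, Sequence


def remove_fields(
    record: Dict[str, Any], field_pointers: Sequence[Sequence[str]]
) -> Dict[str, Any]:
    """Return a copy of `record` without the fields named by `field_pointers`."""
    result = copy.deepcopy(record)
    for pointer in field_pointers:
        node: Any = result
        for key in pointer[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # Assumption: a pointer to a missing parent is ignored.
        else:
            # Reached the parent of the target field; drop the field if present.
            if isinstance(node, dict):
                node.pop(pointer[-1], None)
    return result


record = {
    "path": {"to": {"field1": "data_to_remove", "field2": "data_to_keep"}},
    "path2": "data_to_remove",
    "path3": "data_to_keep",
}
cleaned = remove_fields(record, [["path", "to", "field1"], ["path2"]])
# cleaned == {"path": {"to": {"field2": "data_to_keep"}}, "path3": "data_to_keep"}
```

Only the final key of each pointer is removed, so the enclosing "path" and "to" objects remain in place with their other fields.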