# Cohort query format

> How to define the behavior of cohorts

The behavior of a cohort is defined by a 'query', which specifies the conditions a user must meet to fall into the cohort. When creating, updating or viewing cohorts, the Cohorts API uses a JSON format to represent these queries.

A query for a 'football lovers' cohort might say something like 'the user has viewed at least two pages with the word "football" in the URL in the last 30 days.' This particular query would be represented in the Cohorts API as:

```json theme={"dark"}
{
  "event": "Pageview",
  "frequency": {
    "greater_than_or_equal_to": 2
  },
  "where": {
    "property": "properties.client.url",
    "condition": {
      "contains": "football"
    }
  },
  "during": {
    "the_last": {
      "value": 30,
      "unit": "days"
    }
  }
}
```

The above query consists of a single 'clause'. An arbitrary number of clauses can be combined using 'and' or 'or' logic. For example, a query specifying 'the user has viewed at least two pages with the word "football" in the URL **OR** has viewed at least one page with "London" in the title' would be represented as:

```json theme={"dark"}
{
  "or": [
    {
      "event": "Pageview",
      "frequency": {
        "greater_than_or_equal_to": 2
      },
      "where": {
        "property": "properties.client.url",
        "condition": {
          "contains": "football"
        }
      }
    },
    {
      "event": "Pageview",
      "frequency": {
        "greater_than_or_equal_to": 1
      },
      "where": {
        "property": "properties.client.title",
        "condition": {
          "contains": "London"
        }
      }
    }
  ]
}
```

The most complex query logic supported is conjunctive normal form, that is, an AND of ORs of clauses. For example, the query specifying that '(the user has viewed at least two pages with the word "football" in the URL **OR** has viewed at least one page with "London" in the title) **AND** the user has not visited a page on the domain example.com' would be expressed as:

```json theme={"dark"}
{
  "and": [
    {
      "or": [
        {
          "event": "Pageview",
          "frequency": {
            "greater_than_or_equal_to": 2
          },
          "where": {
            "property": "properties.client.url",
            "condition": {
              "contains": "football"
            }
          }
        },
        {
          "event": "Pageview",
          "frequency": {
            "greater_than_or_equal_to": 1
          },
          "where": {
            "property": "properties.client.title",
            "condition": {
              "contains": "London"
            }
          }
        }
      ]
    },
    {
      "event": "Pageview",
      "frequency": {
        "equal_to": 0
      },
      "where": {
        "property": "properties.client.domain",
        "condition": {
          "equal_to": "example.com"
        }
      }
    }
  ]
}
```

<Tip>
  **Simplification**: The second half of the AND expression doesn't need to show the OR explicitly, since there is only one clause.
</Tip>

## Clause

A clause is a particular condition which must be met in order for a user to enter a cohort, and a query is composed of one or more clauses. A clause is one of the following types: an "expression" clause, an "engagement" clause, a "transition" clause, a "cohort membership" clause, or a "connections import" clause.

In the Behavior section of the Custom Cohort builder in the Permutive dashboard, a clause is represented as a single white box containing conditions on a single event type. The image below shows a cohort with two clauses:

<Image alt="2676" border={false} src="https://files.readme.io/d07a490-cohort-builder-clauses.png" title="cohort-builder-clauses.png" />
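
Clauses of different types can sit side by side in the same query. As a minimal sketch, assuming clause types can be mixed freely under `"and"`/`"or"` as the definitions above suggest, the following combines an expression clause with a cohort membership clause (both described in the sections below). The segment and provider values are placeholders:

```json theme={"dark"}
{
  "and": [
    {
      "event": "Pageview",
      "frequency": {
        "greater_than_or_equal_to": 2
      },
      "where": {
        "property": "properties.client.url",
        "condition": {
          "contains": "football"
        }
      }
    },
    {
      "in_third_party_segment": {
        "segment": "123456",
        "provider": "my_tpd_provider"
      }
    }
  ]
}
```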

## Expression Clause

An expression clause represents conditions relating to a particular type of event, which must be met in order for the user to fall into the cohort. It has the following top level fields:

| Top level key | Value                                                                      | Description                                                    |
| :------------ | :------------------------------------------------------------------------- | :------------------------------------------------------------- |
| "event"       | a string value                                                             | the name of the relevant event                                 |
| "frequency"   | a 'number comparison' object (see relevant subsection below)               | how often/how many times the condition must be met             |
| "during"      | a 'during' object (see relevant subsection below)                          | the time period within which the conditions must have been met |
| "where"       | An object representing conditions on the event (see `where` section below) | the conditions on the event which must be met                  |

The 'football lovers' query above is an example of an expression clause.

## Engagement Clause

An engagement clause selects users based on the time they spend active on-site (*engaged time*) and their page scroll-depth (*completion*). It has **one** of the following top level keys:

* `"engaged_time"` - identify users with a total amount of engaged time over a period, regardless of how many pageviews the user has had. For example, *users with 120 seconds or more engaged time in the last 7 days across pages about dogs*.
* `"engaged_completion"` - identify users with a specified maximum completion on the current page. For example, *users with at least 40% completion on the current page*.
* `"engaged_views"` - identify users who have had distinct page views each with some amount of engaged time or completion. For example, *users with 3 or more page views about dogs each with more than 30 seconds' engaged time*.

These keys would then point to an object with the following respective fields:

| Top level key         | Nested object key                                           | Value                                                                      | Description                                                             |
| :-------------------- | :---------------------------------------------------------- | :------------------------------------------------------------------------- | :---------------------------------------------------------------------- |
| "engaged\_time"       | "seconds"                                                   | A 'number comparison' object (see relevant section below)                  | How long the user must have spent on the page                           |
|                       | "during" (optional)                                         | An object representing a time period (see `during` section below)          | The time period within which the conditions must have been met          |
|                       | "where" (optional)                                          | An object representing conditions on the event (see `where` section below) | The conditions on the pageview which must be met                        |
| "engaged\_completion" | "completion"                                                | A 'number comparison' object (see relevant section below)                  | The fraction of the page which must have been completed                 |
|                       | "where" (optional)                                          | An object representing conditions on the event (see `where` section below) | The conditions on the pageview which must be met                        |
| "engaged\_views"      | "times"                                                     | A 'number comparison' object (see relevant section below)                  | How many times the condition must have been met                         |
|                       | "during" (optional)                                         | An object representing a time period (see `during` section below)          | The time period within which the conditions must have been met          |
|                       | "where" (optional)                                          | An object representing conditions on the event (see `where` section below) | The conditions on the pageview which must be met                        |
|                       | "engaged\_time" (must have EITHER this key OR "completion") | A 'number comparison' object (see relevant section below)                  | Condition on the number of seconds the user must have spent on the page |
|                       | "completion" (must have EITHER this key OR "engaged\_time") | A 'number comparison' object (see relevant section below)                  | The fraction of the page which must have been completed                 |

Here are some examples of engagement clause objects:

```json theme={"dark"}
{
  "engaged_time": {
    "where": {
      "property": "properties.article.title",
      "condition": {
        "equal_to": "My Interesting Article"
      }
    },
    "seconds": {
      "greater_than": 42
    }
  }
}
```

```json theme={"dark"}
{
  "engaged_completion": {
    "completion": {
      "greater_than": 0.5
    }
  }
}
```

```json theme={"dark"}
{
  "engaged_views": {
    "completion": {
      "greater_than": 0.3
    },
    "where": {
      "property": "properties.article.categories",
      "condition": {
        "list_contains": "sport"
      }
    },
    "during": {
      "the_last": {
        "value": 2,
        "unit": "days"
      }
    },
    "times": {
      "greater_than_or_equal_to": 2
    }
  }
}
```
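
The two prose examples from the key descriptions above can also be written out as clauses. These are sketches built only from the documented fields; the use of `properties.article.categories` to identify pages about dogs is an assumption about how the content is tagged. First, *users with 120 seconds or more engaged time in the last 7 days across pages about dogs*:

```json theme={"dark"}
{
  "engaged_time": {
    "seconds": {
      "greater_than_or_equal_to": 120
    },
    "during": {
      "the_last": {
        "value": 7,
        "unit": "days"
      }
    },
    "where": {
      "property": "properties.article.categories",
      "condition": {
        "list_contains": "dogs"
      }
    }
  }
}
```

And *users with 3 or more page views about dogs each with more than 30 seconds' engaged time*, which uses the nested `"engaged_time"` key in place of `"completion"`:

```json theme={"dark"}
{
  "engaged_views": {
    "engaged_time": {
      "greater_than": 30
    },
    "where": {
      "property": "properties.article.categories",
      "condition": {
        "list_contains": "dogs"
      }
    },
    "times": {
      "greater_than_or_equal_to": 3
    }
  }
}
```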

## Transition Clause

A transition clause selects users based on whether or not they have entered or left another particular cohort. It has **one** of the following top level keys:

* `"has_entered"` - the user has entered the given cohort
* `"has_not_entered"` - the user has **not** entered the given cohort
* `"has_exited"` - the user has exited the given cohort
* `"has_not_exited"` - the user has **not** exited the given cohort

Whichever one of these keys is used, the value must be an object with the following fields:

| Key                 | Value                                                                      | Description                                                                         |
| :------------------ | :------------------------------------------------------------------------- | :---------------------------------------------------------------------------------- |
| "segment"           | an integer value                                                           | the Short Cohort ID of the segment on which the condition is based                  |
| "during" (optional) | a 'during' object (see relevant subsection below)                          | the time period within which the entry/exit condition must have been met            |
| "where" (optional)  | An object representing conditions on the event (see `where` section below) | the conditions on the pageview which must hold when the entry/exit condition is met |

<Warning>
  **Cohort IDs**: Cohorts have two different types of ID, a long UUID and a short integer ID. Cohorts are addressed in API URLs by the long UUID, but when identifying a cohort in a transition clause the short integer ID must be used.
</Warning>

Here is an example of a transition clause object:

```json theme={"dark"}
{
  "has_entered": {
    "where": {
      "property": "properties.client.url",
      "condition": {
        "does_not_contain": "gossip"
      }
    },
    "segment": 1234,
    "during": {
      "after": "2021-08-10T00:00:00Z"
    }
  }
}
```
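
Because `"during"` and `"where"` are optional, a transition clause can be much shorter. As an illustrative sketch (the cohort ID `1234` is a placeholder), the following would select users who have exited the given cohort in the last 7 days:

```json theme={"dark"}
{
  "has_exited": {
    "segment": 1234,
    "during": {
      "the_last": {
        "value": 7,
        "unit": "days"
      }
    }
  }
}
```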

## Cohort Membership Clause

This clause represents the requirement for a user to belong to a given third party (or second party) cohort. It has **one** of the following top level keys:

* `"in_third_party_segment"` - the user is in the given third party cohort
* `"not_in_third_party_segment"` - the user is **not** in the given third party cohort
* `"in_second_party_segment"` - the user is in the given second party cohort
* `"not_in_second_party_segment"` - the user is **not** in the given second party cohort

For any of these keys, the value is an object with the following fields:

| Key        | Value  | Description                                                                                   |
| :--------- | :----- | :-------------------------------------------------------------------------------------------- |
| "provider" | string | a string identifying the second/third party data provider within the Permutive platform       |
| "segment"  | string | a string identifying the particular second/third party cohort on which the condition is based |

<Warning>
  **Provider and Cohort IDs**: The provider and cohort IDs used in Cohort Membership clauses are the identifiers used for the relevant entity within the Permutive platform. Depending on the use case, it might be necessary to request a list of these identifiers from the Support team or your Customer Success Manager.
</Warning>

Here are some examples of cohort membership clauses:

```json theme={"dark"}
{
  "in_third_party_segment": {
    "segment": "123456",
    "provider": "my_tpd_provider"
  }
}
```

```json theme={"dark"}
{
  "not_in_second_party_segment": {
    "segment": "1000",
    "provider": "test_2nd_party"
  }
}
```

## Connections Import Clause

A connections import clause matches users based on their membership of a Connectivity import (of either "User Profile" or "User Activity" data). You can optionally apply property filters, restrict the time window, and set frequency constraints. It has **one** of the following top level keys:

* `"in_connections_import_segment"` - the user is in the given connections import segment
* `"not_in_connections_import_segment"` - the user is **not** in the given connections import segment

For either of these keys, the value is an object with the following fields:

| Key                    | Value                                                        | Description                                                   |
| :--------------------- | :----------------------------------------------------------- | :------------------------------------------------------------ |
| "provider"             | a string value                                               | an identifier for the Connectivity import                     |
| "filters" (optional)   | a 'filters' object (see below)                               | conditions on properties of the connections import            |
| "during" (optional)    | a 'during' object (see relevant subsection below)            | the time period within which the condition must have been met |
| "frequency" (optional) | a 'number comparison' object (see relevant subsection below) | how many times the condition must be met                      |

<Warning>
  **Provider IDs**: The list of provider IDs corresponding to Connectivity imports is not currently visible within the Permutive dashboard - please contact Support or your Customer Success Manager to receive a list of available IDs.
</Warning>

### Filters

The `"filters"` field defines conditions on properties of the connections import segment. Property names reference the connections import schema directly and do not use the `properties.` prefix.

A single filter condition:

```json theme={"dark"}
{
  "property": "field_name",
  "condition": { ... }
}
```

Multiple filter conditions can be combined with `"and"` or `"or"`:

```json theme={"dark"}
{
  "and": [
    { "property": "field_name_1", "condition": { ... } },
    { "property": "field_name_2", "condition": { ... } }
  ]
}
```

The `"condition"` object uses the same format as described in the `"where"` section below (string, integer, float, date, boolean, and list conditions are all supported).

Here are some examples of connections import clause objects:

```json theme={"dark"}
{
  "in_connections_import_segment": {
    "provider": "my_data_provider"
  }
}
```

```json theme={"dark"}
{
  "not_in_connections_import_segment": {
    "provider": "my_data_provider",
    "filters": {
      "property": "interest",
      "condition": {
        "equal_to": "sports"
      }
    }
  }
}
```

```json theme={"dark"}
{
  "in_connections_import_segment": {
    "provider": "my_data_provider",
    "filters": {
      "and": [
        {
          "property": "education",
          "condition": {
            "equal_to": "GCSE"
          }
        },
        {
          "property": "age",
          "condition": {
            "greater_than": 18
          }
        }
      ]
    },
    "during": {
      "the_last": {
        "value": 30,
        "unit": "days"
      }
    },
    "frequency": {
      "greater_than_or_equal_to": 2
    }
  }
}
```

## Number Comparison

A 'number comparison' object is used in several different cases when a numeric property needs to be measured against a specific condition. The object consists of a single key-value pair, with the following available options:

| Key                            | Value                                                                           | Description                                                                                                              |
| :----------------------------- | :------------------------------------------------------------------------------ | :----------------------------------------------------------------------------------------------------------------------- |
| "equal\_to"                    | a single number                                                                 | condition is met if the property is exactly equal to the given number                                                    |
|                                | a non-empty list of numbers                                                     | condition is met if the property is exactly equal to any one of the given numbers                                        |
| "not\_equal\_to"               | a single number                                                                 | condition is met if the property is not exactly equal to the given number                                                |
|                                | a non-empty list of numbers                                                     | condition is met if the property is not exactly equal to any of the given numbers                                        |
| "between"                      | an object with two members, "start" and "end", each with a single numeric value | condition is met if the property is greater than or equal to the "start" value and less than or equal to the "end" value |
| "greater\_than"                | a single number                                                                 | condition is met if the property is greater than the given number                                                        |
| "less\_than"                   | a single number                                                                 | condition is met if the property is less than the given number                                                           |
| "greater\_than\_or\_equal\_to" | a single number                                                                 | condition is met if the property is greater than or equal to the given number                                            |
| "less\_than\_or\_equal\_to"    | a single number                                                                 | condition is met if the property is less than or equal to the given number                                               |
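
To show the two less obvious value shapes from the table in context, here is a sketch of an expression clause whose `"frequency"` uses `"between"`. It would match users with between 2 and 5 matching pageviews, inclusive at both ends:

```json theme={"dark"}
{
  "event": "Pageview",
  "frequency": {
    "between": {
      "start": 2,
      "end": 5
    }
  },
  "where": {
    "property": "properties.client.url",
    "condition": {
      "contains": "football"
    }
  }
}
```

The list form of `"equal_to"` and `"not_equal_to"` takes a non-empty array. For example, this number comparison is met when the value is neither 0 nor 1:

```json theme={"dark"}
{
  "not_equal_to": [0, 1]
}
```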

## "during"

This defines the time period during which the conditions must be met in order for a user to enter a cohort. It can be either a single string value, or an object with a single key pointing to a single value or a nested object. Valid values are:

| Key             | Nested object key | Value                                               | Description                                                                                           |
| :-------------- | :---------------- | :-------------------------------------------------- | :---------------------------------------------------------------------------------------------------- |
| "this\_session" | N/A               | N/A (string value only)                             | the condition must have been met during the current session                                           |
| "the\_last"     | "value"           | integer                                             | taken with "unit", gives the length of time before now during which the condition must have been met  |
|                 | "unit"            | "seconds"/"minutes"/"hours"/"days"/"weeks"/"months" | taken with "value", gives the length of time before now during which the condition must have been met |
| "in\_interval"  | "start"           | timestamp                                           | the start of the time window during which the condition must have been met                            |
|                 | "end"             | timestamp                                           | the end of the time window during which the condition must have been met                              |
| "before"        | N/A               | timestamp                                           | the time before which the condition must have been met                                                |
| "after"         | N/A               | timestamp                                           | the time after which the condition must have been met                                                 |
| "first"         | N/A               | integer                                             | the condition must have been met during the first N events of this type                               |
| "last"          | N/A               | integer                                             | the condition must have been met during the most recent N events of this type                         |
| "current\_view" | N/A               | N/A (string value only)                             | the condition must have been met within the current pageview                                          |

Here are examples of `"during"` conditions:

```json theme={"dark"}
{
  "in_interval": {
    "start": "2021-08-11T00:00:00Z",
    "end": "2021-08-12T00:00:00Z"
  }
}
```

```json theme={"dark"}
"this_session"
```

```json theme={"dark"}
{"last": 5}
```
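
For the string-only options, the string takes the place of the whole `"during"` object. As a sketch, an expression clause restricted to the current session might look like this:

```json theme={"dark"}
{
  "event": "Pageview",
  "frequency": {
    "greater_than_or_equal_to": 3
  },
  "where": {
    "property": "properties.client.url",
    "condition": {
      "contains": "football"
    }
  },
  "during": "this_session"
}
```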

## "where"

This defines conditions on properties of a given event which must be met in order for a user to enter a cohort. It consists of an object with two fields:

* `"property"` - this is the name of the property on which the condition is tested
* `"condition"` - this defines the condition to be tested

<Note>
  **Property naming conventions**: Property names consist of string segments separated by periods. All properties start with `"properties."`, although this is hidden in the Permutive Dashboard. The full name for an event property that appears in the dashboard as `client.title` would therefore be `"properties.client.title"`.
</Note>

The `"condition"` can be one of the following:

**Integer condition**\
This is a condition on an integer numeric value. It takes the form of a 'Number Comparison' object as described above.

**Float condition**\
This is a condition on a floating point numeric value. Its form is exactly the same as a normal 'number comparison' object, except that the top level key is prefixed with `float_`, for example `"float_equal_to"`, `"float_between"`, etc, and the actual comparison value is interpreted as a floating point number rather than an integer.
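
As a minimal sketch, a `"where"` object with a float condition might look like the following. The property name `properties.article.rating` is hypothetical and stands in for any floating point event property:

```json theme={"dark"}
{
  "property": "properties.article.rating",
  "condition": {
    "float_greater_than_or_equal_to": 4.5
  }
}
```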

**Date condition**\
This is a condition on a timestamp value. Its form is exactly the same as a normal 'number comparison' object, except that the top level key is prefixed with `date_`, for example `"date_equal_to"`, `"date_between"`, etc, and the actual comparison value is a string timestamp instead of an integer.
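
As a minimal sketch, a `"where"` object with a date condition might look like the following. The property name `properties.article.published_at` is hypothetical, and the timestamp format is assumed to match the one used in the `"during"` examples above:

```json theme={"dark"}
{
  "property": "properties.article.published_at",
  "condition": {
    "date_between": {
      "start": "2021-08-01T00:00:00Z",
      "end": "2021-08-31T23:59:59Z"
    }
  }
}
```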

**String condition**\
This is a condition on a string property. It consists either of a single string value, or an object with a single field. Valid values are:

| Key                  | Value                     | Description                                                                                  |
| :------------------- | :------------------------ | :------------------------------------------------------------------------------------------- |
| "equal\_to"          | string                    | the property value must exactly match the provided value                                     |
|                      | non-empty list of strings | the property value must exactly match any one of the provided values                         |
| "not\_equal\_to"     | string                    | the property value must not exactly match the provided value                                 |
|                      | non-empty list of strings | the property value must not exactly match any of the provided values                         |
| "contains"           | string                    | the property value must include the provided value as a substring                            |
|                      | non-empty list of strings | the property value must include at least one of the provided values as a substring           |
| "does\_not\_contain" | string                    | the property value must not include the provided value as a substring                        |
|                      | non-empty list of strings | the property value must not include any of the provided values as a substring                |
| "is\_empty"          | N/A (string value only)   | the property value must be an empty string, or the event must have no value for the property |
| "is\_not\_empty"     | N/A (string value only)   | the property value must be a non-empty string                                                |

**List condition**\
This is a condition on a list property. It consists either of a single string value, or an object with a single field. Valid values are:

| Key                                 | Value                     | Description                                                                                |
| :---------------------------------- | :------------------------ | :----------------------------------------------------------------------------------------- |
| "list\_contains"                    | string                    | the list must include the provided string value                                            |
|                                     | non-empty list of strings | the list must include at least one of the provided string values                           |
| "list\_does\_not\_contain"          | string                    | the list must not include the provided string value                                        |
|                                     | non-empty list of strings | the list must not include any of the provided string values                                |
| "list\_contains\_date"              | timestamp                 | the list must include the provided timestamp                                               |
| "list\_does\_not\_contain\_date"    | timestamp                 | the list must not include the provided timestamp value                                     |
| "list\_contains\_float"             | float                     | the list must include the provided floating point numeric value                            |
| "list\_does\_not\_contain\_float"   | float                     | the list must not include the provided floating point numeric value                        |
| "list\_contains\_integer"           | integer                   | the list must include the provided integer numeric value                                   |
| "list\_does\_not\_contain\_integer" | integer                   | the list must not include the provided integer numeric value                               |
| "list\_is\_empty"                   | N/A (string value only)   | the property value must be an empty list, or the event must have no value for the property |
| "list\_is\_not\_empty"              | N/A (string value only)   | the property value must be a non-empty list                                                |
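
As a sketch of the list form, the following `"where"` object would be met only when the list property contains none of the given values (the tag values are placeholders):

```json theme={"dark"}
{
  "property": "properties.article.tags",
  "condition": {
    "list_does_not_contain": ["gossip", "celebrity"]
  }
}
```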

**List summary condition**\
This is a condition on some aggregation of a property which is a list of objects. It consists of an object with four fields:

* `"property"` - the property within the listed objects which is to be aggregated
* `"condition"` - the condition to be applied to the specified property
* `"function"` - the type of aggregation to perform on the list
* `"where"` (optional) - an additional filter to apply to the listed objects before applying the aggregation

<Note>
  **Property naming**: The `"property"` naming convention within a list summary condition is to omit the common prefix. For example, say we have a list summary condition on a property `properties.slot.targeting`, which is a list of objects with two fields, `properties.slot.targeting.key` and `properties.slot.targeting.value`. Within the list summary condition, we would refer to those two properties as `key` and `value` respectively, since the first part of the property path is implicit.
</Note>

The `"condition"` object takes the same form as the `"condition"` clause within a normal `"where"` object (see relevant section above).

The `"function"` must be one of the following: `"any"`, `"all"`, `"sum"`, `"product"`, `"max"`, `"min"`, `"count"`, or `"mean"`. Some of these (sum, product, max, min and mean) can only be used on a numeric sub-property.

The `"where"` object takes the same form as the `"where"` component of a top level clause.

An example of a list summary condition is:

```json theme={"dark"}
{
  "property": "properties.slot.targeting",
  "condition": {
    "property": "key",
    "function": "any",
    "where": {
      "property": "value",
      "condition": {
        "list_contains": "efgh"
      }
    },
    "condition": {
      "contains": "abcd"
    }
  }
}
```
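
A numeric aggregation follows the same shape. The sketch below is illustrative only: `properties.basket.items` and its `price` sub-property are hypothetical names, and `price` is assumed to be a floating point value, hence the `float_` prefix on the condition. It would be met when the prices across all listed objects sum to more than 100:

```json theme={"dark"}
{
  "property": "properties.basket.items",
  "condition": {
    "property": "price",
    "function": "sum",
    "condition": {
      "float_greater_than": 100.0
    }
  }
}
```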

**Boolean condition**\
This is a condition on a Boolean property. It consists of an object with a single key, `"boolean_equal_to"`, and a value of either `"true"` or `"false"`.

## Compound `"where"` conditions

Anywhere a `"where"` object is expected, it is also possible to provide a list of conditions combined with either `"and"` or `"or"`. Only a single list is allowed, and it can only be one level deep (lists cannot be nested). For example:

```json theme={"dark"}
{
  "and": [
    {
      "property": "properties.article.title",
      "condition": {
        "equal_to": "abcd"
      }
    },
    {
      "property": "properties.article.tags",
      "condition": {
        "list_contains": "defg"
      }
    }
  ]
}
```

```json theme={"dark"}
{
  "or": [
    {
      "property": "properties.article.title",
      "condition": {
        "equal_to": "abcd"
      }
    },
    {
      "property": "properties.article.tags",
      "condition": {
        "list_contains": "defg"
      }
    }
  ]
}
```
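
To show where a compound condition sits within a complete clause, here is a sketch of an expression clause whose `"where"` is an `"and"` list. The property values are placeholders:

```json theme={"dark"}
{
  "event": "Pageview",
  "frequency": {
    "greater_than_or_equal_to": 1
  },
  "where": {
    "and": [
      {
        "property": "properties.article.title",
        "condition": {
          "contains": "football"
        }
      },
      {
        "property": "properties.article.tags",
        "condition": {
          "list_contains": "premier-league"
        }
      }
    ]
  },
  "during": {
    "the_last": {
      "value": 30,
      "unit": "days"
    }
  }
}
```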
