temporian.EventSet #

Actual temporal data.

Use tp.event_set() to create an EventSet manually, or tp.from_pandas() to create an EventSet from a pandas DataFrame.

creator `property` #

creator: Optional[Operator]

Creator.

The creator is the operator that produced this EventSet. Manually created EventSets have a creator of None.

__add__ #

__add__(other: Any) -> EventSetOrNode

Adds an EventSet or a scalar value to self element-wise.

If an EventSet, each feature in self is added to the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, other is added to each item in each feature in self.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f3": [-1, 1, 2], "f4": [1, -1, 5]},
...     same_sampling_as=a
... )

>>> c = a + b
>>> c
indexes: []
features: [('f1', int64), ('f2', int64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ -1 101 202]
        'f2': [ 11 -11 10]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )

>>> b = a + 3
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 3 103 203]
        'f2': [13 -7 8]
...

>>> b = 3 + a
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 3 103 203]
        'f2': [13 -7 8]
...

Cast dtypes example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [10., -10., 5.]}
... )

>>> # Cannot add: f1 is int64 but f2 is float64
>>> c = a["f1"] + a["f2"]
Traceback (most recent call last):
    ...
ValueError: ... corresponding features should have the same dtype. ...

>>> # Cast f1 to float
>>> c = a["f1"].cast(tp.float64) + a["f2"]
>>> c
indexes: []
features: [('f1', float64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ 10. 90. 205.]
...

Resample example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"fa": [1, 2, 3]},
... )
>>> b = tp.event_set(
...     timestamps=[-1, 1.5, 3, 5],
...     features={"fb": [-10, 15, 30, 50]},
... )

>>> # Cannot add different samplings
>>> c = a + b
Traceback (most recent call last):
    ...
ValueError: ... should have the same sampling. ...

>>> # Resample a to match b timestamps
>>> c = a.resample(b) + b
>>> c
indexes: []
features: [('fa', int64)]
events:
    (4 events):
        timestamps: [-1. 1.5 3. 5. ]
        'fa': [-10 16 33 53]
...

Reindex example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4],
...     features={
...         "cat": [1, 1, 2, 2],
...         "M": [10, 20, 30, 40]
...     },
...     indexes=["cat"]
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3, 4],
...     features={
...         "cat": [1, 1, 2, 2],
...         "N": [10, 20, 30, 40]
...     },
... )

>>> # Cannot add with different index (only 'a' is indexed by 'cat')
>>> c = a + b
Traceback (most recent call last):
    ...
ValueError: Arguments don't have the same index. ...

>>> # Add index 'cat' to b
>>> b = b.add_index("cat")
>>> # Make explicit same samplings and add
>>> c = a + b.resample(a)
>>> c
indexes: [('cat', int64)]
features: [('M', int64)]
events:
    cat=1 (2 events):
        timestamps: [1. 2.]
        'M': [20 40]
    cat=2 (2 events):
        timestamps: [3. 4.]
        'M': [60 80]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__and__ #

__and__(other: Any) -> EventSetOrNode

Computes logical and (self & other) element-wise with another EventSet.

Each feature in self is compared element-wise to the feature in other in the same position.

self and other must have the same sampling, the same number of features, and all feature types must be bool (see cast example below).

Example

>>> a = tp.event_set(timestamps=[1, 2, 3], features={"f1": [100, 150, 200]})

>>> # Sample boolean features
>>> b = a > 100
>>> c = a < 200

>>> d = b & c
>>> d
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False True False]
...

Example casting integer to boolean

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 1, 1], "f2": [1, 1, 0]}
... )
>>> b = a.cast(bool)
>>> c = b["f1"] & b["f2"]
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False True False]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet with only boolean features.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with result of the comparison.

__bool__ #

__bool__() -> None

Catches bool evaluation with an error message.
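
Comparison operators on an EventSet return another EventSet rather than a single boolean, so truth-testing one (for example in an `if` statement) is almost always a mistake. The sketch below shows the general Python pattern with a hypothetical `ElementwiseResult` class. It is not Temporian's implementation, and the exception type used here (`ValueError`) is an assumption for illustration only.

```python
class ElementwiseResult:
    """Hypothetical stand-in for an object holding element-wise booleans."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # Refuse implicit truth-testing instead of silently returning True.
        raise ValueError(
            "Truth value of an element-wise result is ambiguous; "
            "inspect the values explicitly."
        )


result = ElementwiseResult([True, False, True])

try:
    if result:  # triggers __bool__
        pass
    raised = False
except ValueError:
    raised = True

# Explicit reductions on the underlying values remain unambiguous.
any_true = any(result.values)
all_true = all(result.values)
```

Raising in `__bool__` turns a silent logic bug into an immediate, visible error.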

__floordiv__ #

__floordiv__(other: Any) -> EventSetOrNode

Divides self by an EventSet or a scalar value and takes the floor of the result, element-wise.

If an EventSet, each feature in self is divided by the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, each item in each feature in self is divided by other.

See examples in EventSet.__add__() to see how to match samplings, dtypes and index, in order to apply arithmetic operators in different EventSets.
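
Flooring rounds toward negative infinity, not toward zero, which is why `-10. // 3` yields `-4.` in the scalar example of this section. The plain-Python sketch below (no Temporian involved) reproduces that arithmetic and contrasts it with truncation. That Temporian follows the same convention is inferred from the documented output, not from its source.

```python
import math

values = [10.0, -10.0, 5.0]
divisor = 3

# Floor division: round the quotient toward negative infinity.
floored = [v // divisor for v in values]

# Truncation: round the quotient toward zero (shown only for contrast).
truncated = [float(math.trunc(v / divisor)) for v in values]

print(floored)    # [3.0, -4.0, 1.0]
print(truncated)  # [3.0, -3.0, 1.0]
```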

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [10, 3, 150]},
...     same_sampling_as=a
... )

>>> c = a // b
>>> c
indexes: []
features: [('f1', int64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ 0 33 1]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [1, 100, 200], "f2": [10., -10., 5.]}
... )

>>> b = a // 3
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 0 33 66]
        'f2': [ 3. -4. 1.]
...

>>> c = 300 // a
>>> c
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [300 3 1]
        'f2': [ 30. -30. 60.]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__ge__ #

__ge__(other: Any) -> EventSetOrNode

Computes greater than or equal (self >= other) element-wise with another EventSet or a scalar value.

If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.

If a scalar value, each item in each feature in self is compared to other.

Note that it will always return False on NaN elements.
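
The NaN rule matches ordinary IEEE 754 floating-point behavior. The plain-Python sketch below (no Temporian involved) shows that every ordering comparison with NaN is False, regardless of which side the NaN is on. That Temporian applies the same rule is taken from the note above.

```python
import math

nan = math.nan

checks = {
    "nan >= 1.0": nan >= 1.0,
    "1.0 >= nan": 1.0 >= nan,
    "nan >= nan": nan >= nan,
}

# All three comparisons evaluate to False.
all_false = not any(checks.values())

# NaN has to be detected explicitly, not through a comparison.
is_nan = math.isnan(nan)
```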

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [-10, 100, 5]},
...     same_sampling_as=a
... )

>>> c = a >= b
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True True True]
...

Example with scalar

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )

>>> b = a >= 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False True True]
        'f2': [False True False]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the comparison.

__getitem__ #

__getitem__(feature_names: Union[str, List[str]])

Creates an EventSet with a subset of the features.

__gt__ #

__gt__(other: Any) -> EventSetOrNode

Computes greater than (self > other) element-wise with another EventSet or a scalar value.

If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.

If a scalar value, each item in each feature in self is compared to other.

Note that it will always return False on NaN elements.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [-10, 100, 5]},
...     same_sampling_as=a
... )

>>> c = a > b
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True False True]
...

Example with scalar

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )

>>> b = a > 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False False True]
        'f2': [False False False]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the comparison.

__invert__ #

__invert__() -> EventSetOrNode

Inverts a boolean EventSet element-wise.

Swaps False <-> True.

Does not work on integers, they should be cast to tp.bool_ beforehand, using EventSet.cast().
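
One way to see why integers need a cast first: in plain Python, `~` on an integer is a bitwise complement, not a logical negation, so the result is not a 0/1 flag. The sketch below uses ordinary Python values only and is not Temporian code.

```python
flags = [0, 1, 1]

# Bitwise complement of integers: ~x == -x - 1, not a boolean flip.
bitwise = [~f for f in flags]

# Logical negation after converting each flag to bool.
logical = [not bool(f) for f in flags]

print(bitwise)  # [-1, -2, -2]
print(logical)  # [True, False, False]
```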

Example

>>> a = tp.event_set(
...     timestamps=[1, 2],
...     features={"M": [1, 5], "N": [1.0, 5.5]},
... )
>>> # Boolean EventSet
>>> b = a < 2
>>> b
indexes: ...
        'M': [ True False]
        'N': [ True False]
...

>>> # Inverted EventSet
>>> c = ~b
>>> c
indexes: ...
        'M': [False True]
        'N': [False True]
...

Returns:

Type	Description
`EventSetOrNode`	Inverted EventSet.

__le__ #

__le__(other: Any) -> EventSetOrNode

Computes less than or equal (self <= other) element-wise with another EventSet or a scalar value.

If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.

If a scalar value, each item in each feature in self is compared to other.

Note that it will always return False on NaN elements.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [-10, 100, 5]},
...     same_sampling_as=a
... )

>>> c = a <= b
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False True False]
...

Example with scalar

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )

>>> b = a <= 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True True False]
        'f2': [ True True True]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the comparison.

__lt__ #

__lt__(other: Any) -> EventSetOrNode

Computes less than (self < other) element-wise with another EventSet or a scalar value.

If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.

If a scalar value, each item in each feature in self is compared to other.

Note that it will always return False on NaN elements.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [-10, 100, 5]},
...     same_sampling_as=a
... )

>>> c = a < b
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False False False]
...

Example with scalar

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )

>>> b = a < 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True False False]
        'f2': [ True False True]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the comparison.

__mod__ #

__mod__(other: Any) -> EventSetOrNode

Computes modulo or remainder of division with another EventSet or a scalar value.

If an EventSet, each feature in self is reduced modulo the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, each item in each feature in self is reduced modulo other.

See examples in EventSet.__add__() to see how to match samplings, dtypes and index, in order to apply arithmetic operators in different EventSets.
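
The scalar outputs in this section (`-10. % 3` giving `2.` and `300 % -10.` giving `-0.`) are consistent with a remainder that takes the sign of the divisor, as Python's own `%` does. The plain-Python sketch below (no Temporian involved) reproduces those values and contrasts them with `math.fmod`, whose result takes the sign of the dividend. That Temporian uses this convention is inferred from the documented output.

```python
import math

values = [10.0, -10.0, 5.0]

# Python's % operator: the result has the sign of the divisor.
remainders = [v % 3 for v in values]

# math.fmod: the result has the sign of the dividend (shown for contrast).
fmod_remainders = [math.fmod(v, 3) for v in values]

# A zero remainder with a negative divisor is negative zero.
negative_zero = 300 % -10.0

print(remainders)       # [1.0, 2.0, 2.0]
print(fmod_remainders)  # [1.0, -1.0, 2.0]
print(negative_zero)    # -0.0
```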

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 7, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [10, 5, 150]},
...     same_sampling_as=a
... )

>>> c = a % b
>>> c
indexes: []
features: [('f1', int64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ 0 2 50]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [1, 100, 200], "f2": [10., -10., 5.]}
... )

>>> b = a % 3
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [1 1 2]
        'f2': [1. 2. 2.]
...

>>> c = 300 % a
>>> c
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 0 0 100]
        'f2': [ 0. -0. 0.]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__mul__ #

__mul__(other: Any) -> EventSetOrNode

Multiplies an EventSet or a scalar value with self element-wise.

If an EventSet, each feature in self is multiplied with the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, each item in each feature in self is multiplied with other.

See examples in EventSet.__add__() to see how to match samplings, dtypes and index, in order to apply arithmetic operators in different EventSets.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [10, 3, 2]},
...     same_sampling_as=a
... )

>>> c = a * b
>>> c
indexes: []
features: [('f1', int64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ 0 300 400]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )

>>> b = a * 2
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 0 200 400]
        'f2': [ 20 -20 10]
...

>>> b = 2 * a
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 0 200 400]
        'f2': [ 20 -20 10]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__ne__ #

__ne__(other: Any) -> EventSetOrNode

Computes not equal (self != other) element-wise with another EventSet or a scalar value.

If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.

If a scalar value, each item in each feature in self is compared to other.

Note that it will always return True on NaNs (even if both are).
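
This mirrors IEEE 754 floating-point semantics, where NaN is unequal to everything, including itself. The plain-Python sketch below (no Temporian involved) shows the rule and the common `v != v` idiom that follows from it. That Temporian behaves the same way is taken from the note above.

```python
import math

nan = math.nan

nan_ne_nan = nan != nan  # True: NaN is not equal to itself
nan_eq_nan = nan == nan  # False

# Because only NaN is unequal to itself, v != v marks the NaN positions.
values = [1.0, nan, 3.0, nan]
nan_mask = [v != v for v in values]

print(nan_mask)  # [False, True, False, True]
```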

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [-10, 100, 5]},
...     same_sampling_as=a
... )

>>> c = a != b
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True False True]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )

>>> b = a != 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True False True]
        'f2': [ True False True]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the comparison.

__neg__ #

__neg__() -> EventSetOrNode

Negates an EventSet element-wise.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2],
...     features={"M": [1, -5], "N": [-1.0, 5.5]},
... )
>>> -a
indexes: ...
        'M': [-1  5]
        'N': [ 1.  -5.5]
...

Returns:

Type	Description
`EventSetOrNode`	Negated EventSet.

__or__ #

__or__(other: Any) -> EventSetOrNode

Computes logical or (self | other) element-wise with another EventSet.

Each feature in self is compared element-wise to the feature in other in the same position.

self and other must have the same sampling, the same number of features, and all feature types must be bool.

See cast example in EventSet.__and__().

Example

>>> a = tp.event_set(timestamps=[1, 2, 3], features={"f1": [100, 150, 200]})

>>> # Sample boolean features
>>> b = a <= 100
>>> c = a >= 200

>>> d = b | c
>>> d
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True False True]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet with only boolean features.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with result of the comparison.

__pow__ #

__pow__(other: Any) -> EventSetOrNode

Computes power with another EventSet or a scalar value element-wise.

If an EventSet, each feature in self is raised to the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, each item in each feature in self is raised to other.

See examples in EventSet.__add__() to see how to match samplings, dtypes and index, in order to apply arithmetic operators in different EventSets.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [5, 2, 4]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [0, 3, 2]},
...     same_sampling_as=a
... )

>>> c = a ** b
>>> c
indexes: []
features: [('f1', int64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ 1 8 16]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 2, 3], "f2": [1., 2., 3.]}
... )

>>> b = a ** 3
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 0 8 27]
        'f2': [ 1. 8. 27.]
...

>>> c = 3 ** a
>>> c
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 1 9 27]
        'f2': [ 3. 9. 27.]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__repr__ #

__repr__() -> str

Text representation, showing schema and data.

__setitem__ #

__setitem__(feature_names: Any, value: Any) -> None

Fails, features cannot be assigned.

__sub__ #

__sub__(other: Any) -> EventSetOrNode

Subtracts an EventSet or a scalar value from self element-wise.

If an EventSet, each feature in other is subtracted from the feature in self in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, other is subtracted from each item in each feature in self.

See examples in EventSet.__add__() to see how to match samplings, dtypes and index, in order to apply arithmetic operators in different EventSets.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [10, 20, -5]},
...     same_sampling_as=a
... )

>>> c = a - b
>>> c
indexes: []
features: [('f1', int64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [-10 80 205]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )

>>> b = a - 3
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ -3  97 197]
        'f2': [ 7 -13   2]
...

>>> c = 3 - a
>>> c
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 3  -97 -197]
        'f2': [-7 13  -2]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__truediv__ #

__truediv__(other: Any) -> EventSetOrNode

Divides self by an EventSet or a scalar value element-wise.

If an EventSet, each feature in self is divided by the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.

If a scalar, each item in each feature in self is divided by other.

This operator cannot be used on features with dtype int32 or int64. Cast them to float first (see the casting example) or use EventSet.__floordiv__() instead.

See examples in EventSet.__add__() to see how to match samplings, dtypes and index, in order to apply arithmetic operators in different EventSets.

Example with EventSet

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0.0, 100.0, 200.0]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [10.0, 20.0, 50.0]},
...     same_sampling_as=a
... )

>>> c = a / b
>>> c
indexes: []
features: [('f1', float64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [0. 5. 4.]
...

Example casting integer features

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [10, 20, 50]},
...     same_sampling_as=a
... )

>>> # Cannot divide int64 features
>>> c = a / b
Traceback (most recent call last):
    ...
ValueError: Cannot use the divide operator on feature f1 of type int64. ...

>>> # Cast to tp.float64 or tp.float32 before
>>> c = a.cast(tp.float64) / b.cast(tp.float64)
>>> c
indexes: []
features: [('f1', float64)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [0. 5. 4.]
...

Example with scalar value

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0., 100., 200.], "f2": [10., -10., 5.]}
... )

>>> b = a / 2
>>> b
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [ 0. 50. 100.]
        'f2': [ 5. -5. 2.5]
...

>>> c = 1000 / a
>>> c
indexes: ...
        timestamps: [1. 2. 3.]
        'f1': [inf 10. 5.]
        'f2': [ 100. -100. 200.]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet or scalar value.	required

Returns:

Type	Description
`EventSetOrNode`	Result of the operation.

__xor__ #

__xor__(other: Any) -> EventSetOrNode

Computes logical xor (self ^ other) element-wise with another EventSet.

Each feature in self is compared element-wise to the feature in other in the same position.

self and other must have the same sampling, the same number of features, and all feature types must be bool.

See cast example in EventSet.__and__().

Example

>>> a = tp.event_set(timestamps=[1, 2, 3], features={"f1": [100, 150, 200]})

>>> # Sample boolean features
>>> b = a > 100
>>> c = a < 200

>>> d = b ^ c
>>> d
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [ True False True]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	EventSet with only boolean features.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with result of the comparison.

abs #

abs() -> EventSetOrNode

Gets the absolute value of an EventSet's features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"M":[np.nan, -1., 2.], "N":  [-1, -3, 5]},
... )
>>> a.abs()
indexes: ...
        'M': [nan 1. 2.]
        'N': [1 3 5]
...

Returns:

Type	Description
`EventSetOrNode`	EventSet with the absolute value of each feature.

add_index #

add_index(indexes: Union[str, List[str]]) -> EventSetOrNode

Adds indexes to an EventSet.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2, 1, 0, 1, 1],
...     features={
...         "f1": [1, 1, 1, 2, 2, 2],
...         "f2": [1, 1, 2, 1, 1, 2],
...         "f3": [1, 1, 1, 1, 1, 1]
...     },
... )

>>> # No index
>>> a
indexes: []
features: [('f1', int64), ('f2', int64), ('f3', int64)]
events:
    (6 events):
        timestamps: [0. 1. 1. 1. 1. 2.]
        'f1': [2 1 1 2 2 1]
        'f2': [1 1 2 1 2 1]
        'f3': [1 1 1 1 1 1]
...

>>> # Add only "f1" as index
>>> b = a.add_index("f1")
>>> b
indexes: [('f1', int64)]
features: [('f2', int64), ('f3', int64)]
events:
    f1=1 (3 events):
        timestamps: [1. 1. 2.]
        'f2': [1 2 1]
        'f3': [1 1 1]
    f1=2 (3 events):
        timestamps: [0. 1. 1.]
        'f2': [1 1 2]
        'f3': [1 1 1]
...

>>> # Add "f1" and "f2" as indices
>>> b = a.add_index(["f1", "f2"])
>>> b
indexes: [('f1', int64), ('f2', int64)]
features: [('f3', int64)]
events:
    f1=1 f2=1 (2 events):
        timestamps: [1. 2.]
        'f3': [1 1]
    f1=1 f2=2 (1 events):
        timestamps: [1.]
        'f3': [1]
    f1=2 f2=1 (2 events):
        timestamps: [0. 1.]
        'f3': [1 1]
    f1=2 f2=2 (1 events):
        timestamps: [1.]
        'f3': [1]
...

Parameters:

Name	Type	Description	Default
`indexes`	`Union[str, List[str]]`	List of feature names (strings) that should be added to the indexes. These feature names should already exist in the input.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with the extended index.

Raises:

Type	Description
`KeyError`	If any of the specified `indexes` are not found in the input.
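
Conceptually, adding an index partitions the events by the value of the indexed feature and removes that feature from the feature list. The sketch below models this with plain Python containers, using a hypothetical helper `group_by_index`. It is an illustration of the grouping shown in the usage example, not Temporian's implementation.

```python
from collections import defaultdict

# Same data as the usage example (only f2 is carried along for brevity).
timestamps = [1, 2, 1, 0, 1, 1]
f1 = [1, 1, 1, 2, 2, 2]
f2 = [1, 1, 2, 1, 1, 2]


def group_by_index(timestamps, index_values, feature_values):
    """Hypothetical helper: bucket (timestamp, feature) pairs by index value."""
    groups = defaultdict(list)
    for t, key, value in zip(timestamps, index_values, feature_values):
        groups[key].append((t, value))
    # Order each group by timestamp; the stable sort keeps input order
    # among events that share a timestamp.
    return {
        key: sorted(events, key=lambda event: event[0])
        for key, events in sorted(groups.items())
    }


grouped = group_by_index(timestamps, f1, f2)

for key, events in grouped.items():
    print(f"f1={key}", events)
```

With this data, the group for `f1=1` holds timestamps `[1, 1, 2]` with `f2` values `[1, 2, 1]`, matching the `add_index("f1")` output in the usage example.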

after #

after(
    timestamp: Union[int, float, datetime]
) -> EventSetOrNode

Filters the events of an EventSet, keeping those that happened after a particular timestamp.

The timestamp can be a datetime if the EventSet's timestamps are unix timestamps.

The comparison is strict, meaning that the obtained timestamps would be greater than (>) the provided timestamp.

This operation is equivalent to: input.filter(input.timestamps() > timestamp)
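
The strictness of the comparison matters at the boundary: an event whose timestamp equals the threshold is dropped. The sketch below models this with plain Python lists and a hypothetical helper `keep_after`. It is not Temporian code.

```python
timestamps = [0, 1, 5, 6]
f1 = [0, 10, 50, 60]


def keep_after(timestamps, values, threshold):
    """Hypothetical helper: keep events strictly later than `threshold`."""
    return [(t, v) for t, v in zip(timestamps, values) if t > threshold]


kept = keep_after(timestamps, f1, 4)
boundary = keep_after(timestamps, f1, 5)

print(kept)      # [(5, 50), (6, 60)]
print(boundary)  # [(6, 60)]: the event at t=5 is excluded
```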

Usage example

>>> a = tp.event_set(
...     timestamps=[0, 1, 5, 6],
...     features={"f1": [0, 10, 50, 60]},
... )

>>> a.after(4)
indexes: []
features: [('f1', int64)]
events:
     (2 events):
        timestamps: [5. 6.]
        'f1': [50 60]
...

>>> from datetime import datetime
>>> a = tp.event_set(
...     timestamps=[datetime(2022, 1, 1), datetime(2022, 1, 2)],
...     features={"f1": [1, 2]},
... )

>>> a.after(datetime(2022, 1, 1, 12))
indexes: []
features: [('f1', int64)]
events:
     (1 events):
        timestamps: ['2022-01-02T00:00:00']
        'f1': [2]
...

Parameters:

Name	Type	Description	Default
`timestamp`	`Union[int, float, datetime]`	Timestamp to compare against. Only events strictly after it are kept.	required

Returns:

Type	Description
`EventSetOrNode`	Filtered EventSet.

arccos #

arccos() -> EventSetOrNode

Calculates the inverse cosine of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"M": [1.0, 0, -1.0]},
... )
>>> a.arccos()
indexes: ...
        timestamps: [1. 2. 3.]
        'M': [0.     1.5708 3.1416]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with inverse cosine of input features.

arcsin #

arcsin() -> EventSetOrNode

Calculates the inverse sine of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"M": [0, 0.5, -0.5]},
... )
>>> a.arcsin()
indexes: ...
        timestamps: [1. 2. 3.]
        'M': [ 0.      0.5236 -0.5236]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with inverse sine of input features.

arctan #

arctan() -> EventSetOrNode

Calculates the inverse tangent of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4],
...     features={"M": [0, 1.0, -1.0, 5.0]},
... )
>>> a.arctan()
indexes: ...
        timestamps: [1. 2. 3. 4.]
        'M': [ 0.      0.7854 -0.7854  1.3734]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with inverse tangent of input features.

assign #

assign(**others: EventSetOrNode) -> EventSetOrNode

Assign new features to an EventSet.

If the name provided already exists on the EventSet, the feature is overridden.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2],
...     features={'A': [1, 2]},
... )
>>> b = tp.event_set(
...     timestamps=[1, 2],
...     features={'B': [3, 4]},
...     same_sampling_as=a,
... )
>>> ab = a.assign(new_name=b)
>>> ab
indexes: []
features: [('A', int64), ('new_name', int64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'A': [1 2]
        'new_name': [3 4]
...
>>> ab = a.assign(B=b, B2=b['B'] * 2)
>>> ab
indexes: []
features: [('A', int64), ('B', int64), ('B2', int64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'A': [1 2]
        'B': [3 4]
        'B2': [6 8]
...

Parameters:

Name	Type	Description	Default
`**others`	`EventSetOrNode`	The argument name is used as the new feature name. Each EventSet must have a single feature.	`{}`

Returns:

Type	Description
`EventSetOrNode`	EventSet with the added feature.

before #

before(
    timestamp: Union[int, float, datetime]
) -> EventSetOrNode

Filters the events of an EventSet, keeping those that happened before a particular timestamp.

The timestamp can be a datetime if the EventSet's timestamps are unix timestamps.

The comparison is strict, meaning that the obtained timestamps would be less than (<) the provided timestamp.

This operation is equivalent to: input.filter(input.timestamps() < timestamp)
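
When a datetime is passed, the comparison can be pictured as happening on unix timestamps (seconds since the epoch). The sketch below uses only the standard library and a hypothetical helper `to_unix_seconds`. Treating naive datetimes as UTC is an assumption made for this illustration, not a statement about Temporian's conversion rules.

```python
from datetime import datetime, timezone


def to_unix_seconds(dt):
    """Hypothetical helper: seconds since the epoch.

    Assumption for this sketch: naive datetimes are interpreted as UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


events = [datetime(2022, 1, 1), datetime(2022, 1, 2)]
threshold = datetime(2022, 1, 1, 12)

unix_events = [to_unix_seconds(dt) for dt in events]

# Strict comparison: keep only events earlier than the threshold.
kept = [t for t in unix_events if t < to_unix_seconds(threshold)]

print(kept)  # [1640995200.0], i.e. 2022-01-01T00:00:00 UTC
```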

Usage example

>>> a = tp.event_set(
...     timestamps=[0, 1, 5, 6],
...     features={"f1": [0, 10, 50, 60]},
... )

>>> a.before(5)
indexes: []
features: [('f1', int64)]
events:
     (2 events):
        timestamps: [0. 1.]
        'f1': [ 0 10]
...

>>> from datetime import datetime
>>> a = tp.event_set(
...     timestamps=[datetime(2022, 1, 1), datetime(2022, 1, 2)],
...     features={"f1": [1, 2]},
... )

>>> a.before(datetime(2022, 1, 1, 12))
indexes: []
features: [('f1', int64)]
events:
     (1 events):
        timestamps: ['2022-01-01T00:00:00']
        'f1': [1]
...

Parameters:

Name	Type	Description	Default
`timestamp`	`Union[int, float, datetime]`	Timestamp to compare against. Only events strictly before it are kept.	required

Returns:

Type	Description
`EventSetOrNode`	Filtered EventSet.

begin #

begin() -> EventSetOrNode

Generates a single timestamp at the beginning of the EventSet, per index group.

Usage example

>>> a = tp.event_set(
...     timestamps=[5, 6, 7, -1],
...     features={"f": [50, 60, 70, -10], "idx": [1, 1, 1, 2]},
...     indexes=["idx"]
... )

>>> a_ini = a.begin()
>>> a_ini
indexes: [('idx', int64)]
features: []
events:
    idx=1 (1 events):
        timestamps: [5.]
    idx=2 (1 events):
        timestamps: [-1.]
...

Returns:

Type	Description
`EventSetOrNode`	A feature-less EventSet with a single timestamp per index group.
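
Conceptually, the result holds the earliest timestamp of each index group. The sketch below models this with a plain dictionary over the data from the usage example. It is an illustration only, not Temporian's implementation.

```python
timestamps = [5, 6, 7, -1]
idx = [1, 1, 1, 2]

# Track the smallest timestamp seen for each index value.
first_per_group = {}
for t, key in zip(timestamps, idx):
    if key not in first_per_group or t < first_per_group[key]:
        first_per_group[key] = t

print(first_per_group)  # {1: 5, 2: -1}
```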

calendar_day_of_month #

calendar_day_of_month(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the day of month the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 1 and 31.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> a = tp.event_set(
...    timestamps=["2023-02-04", "2023-02-20", "2023-03-01", "2023-05-07"],
... )
>>> b = a.calendar_day_of_month()
>>> b
indexes: ...
features: [('calendar_day_of_month', int32)]
events:
    (4 events):
        timestamps: [...]
        'calendar_day_of_month': [ 4 20  1  7]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the day of the month each timestamp in `sampling` belongs to.

calendar_day_of_week #

calendar_day_of_week(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the day of the week the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers from 0 (Monday) to 6 (Sunday).

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> a = tp.event_set(
...    timestamps=["2023-06-19", "2023-06-21", "2023-06-25", "2023-07-03"],
... )
>>> b = a.calendar_day_of_week()
>>> b
indexes: ...
features: [('calendar_day_of_week', int32)]
events:
    (4 events):
        timestamps: [...]
        'calendar_day_of_week': [0  2  6  0]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the day of the week each timestamp in `sampling` belongs to.
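
The 0 (Monday) to 6 (Sunday) numbering is the same convention as Python's `datetime.date.weekday()`, so the example above can be cross-checked with the standard library alone. This is only a sanity check on the numbering, not a description of how the operator is implemented.

```python
from datetime import date

dates = ["2023-06-19", "2023-06-21", "2023-06-25", "2023-07-03"]

# weekday() returns 0 for Monday through 6 for Sunday.
days_of_week = [date.fromisoformat(d).weekday() for d in dates]
print(days_of_week)
```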

calendar_day_of_year #

calendar_day_of_year(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the day of year the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 1 and 366.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> a = tp.event_set(
...    timestamps=["2020-01-01", "2021-06-01", "2022-12-31", "2024-12-31"],
... )
>>> b = a.calendar_day_of_year()
>>> b
indexes: ...
features: [('calendar_day_of_year', int32)]
events:
    (4 events):
        timestamps: [...]
        'calendar_day_of_year': [ 1 152 365 366]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the day of the year each timestamp in `sampling` belongs to.

calendar_hour #

calendar_hour(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the hour the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 0 and 23.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See the example with timezone below.

Basic example with UTC datetimes

>>> from datetime import datetime
>>> a = tp.event_set(
...    timestamps=[datetime(2020,1,1,18,30), datetime(2020,1,1,23,59)],
... )
>>> b = a.calendar_hour()
>>> b
indexes: ...
features: [('calendar_hour', int32)]
events:
    (2 events):
        timestamps: [...]
        'calendar_hour': [18 23]
...

Example with timezone

>>> # UTC datetimes (unless datetime(tzinfo=...) is used)
>>> a = tp.event_set(timestamps=["2020-01-01 09:00",
...                              "2020-01-01 15:00"])

>>> # Option 1: specify UTC-3 offset in hours
>>> a.calendar_hour(tz=-3)
indexes: ...
        'calendar_hour': [ 6 12]
...

>>> # Option 2: specify timezone name (see pytz.all_timezones)
>>> a.calendar_hour(tz="America/Montevideo")
indexes: ...
        'calendar_hour': [ 6 12]
...

>>> # No timezone specified, get UTC hour
>>> a.calendar_hour()
indexes: ...
        'calendar_hour': [ 9 15]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the hour each timestamp in `sampling` belongs to.
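
To make the effect of an hour offset concrete, here is a standard-library sketch that converts naive UTC datetimes to a fixed UTC-3 offset and reads the hour. The helper name `hour_with_offset` is hypothetical, and the sketch only covers fixed offsets, not named timezones with daylight saving rules.

```python
from datetime import datetime, timedelta, timezone


def hour_with_offset(utc_dt, offset_hours):
    """Interprets a naive datetime as UTC and returns the hour at the given offset."""
    target = timezone(timedelta(hours=offset_hours))
    return utc_dt.replace(tzinfo=timezone.utc).astimezone(target).hour


utc_times = [datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 15, 0)]
hours_utc_minus_3 = [hour_with_offset(t, -3) for t in utc_times]
hours_utc = [hour_with_offset(t, 0) for t in utc_times]
print(hours_utc_minus_3, hours_utc)
```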

calendar_iso_week #

calendar_iso_week(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the ISO week the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 1 and 53.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> a = tp.event_set(
...    # Note: 2023-01-01 is Sunday in the same week as 2022-12-31
...    timestamps=["2022-12-31", "2023-01-01", "2023-01-02", "2023-12-20"],
... )
>>> b = a.calendar_iso_week()
>>> b
indexes: ...
features: [('calendar_iso_week', int32)]
events:
    (4 events):
        timestamps: [...]
        'calendar_iso_week': [52 52 1 51]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the ISO week each timestamp in `sampling` belongs to.
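
ISO week numbering can be surprising around New Year, as the note in the example points out. The standard-library check below uses `date.isocalendar()` on the same dates; it is a cross-check of the ISO 8601 convention, not the operator's implementation.

```python
from datetime import date

dates = ["2022-12-31", "2023-01-01", "2023-01-02", "2023-12-20"]

# isocalendar() returns (ISO year, ISO week, ISO weekday); index 1 is the week.
iso_weeks = [date.fromisoformat(d).isocalendar()[1] for d in dates]
print(iso_weeks)
```

Sunday 2023-01-01 still belongs to ISO week 52 of 2022 because ISO weeks start on Monday.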

calendar_minute #

calendar_minute(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the minute the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 0 and 59.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> from datetime import datetime
>>> a = tp.event_set(
...    timestamps=[datetime(2020,1,1,18,30), datetime(2020,1,1,23,59)],
...    name='random_hours'
... )
>>> b = a.calendar_minute()
>>> b
indexes: ...
features: [('calendar_minute', int32)]
events:
    (2 events):
        timestamps: [...]
        'calendar_minute': [30 59]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the minute each timestamp in `sampling` belongs to.

calendar_month #

calendar_month(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the month the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 1 and 12.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> a = tp.event_set(
...    timestamps=["2023-02-04", "2023-02-20", "2023-03-01", "2023-05-07"],
...    name='special_events'
... )
>>> b = a.calendar_month()
>>> b
indexes: ...
features: [('calendar_month', int32)]
events:
    (4 events):
        timestamps: [...]
        'calendar_month': [2 2 3 5]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the month each timestamp in `sampling` belongs to.

calendar_second #

calendar_second(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the second the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

Output feature contains numbers between 0 and 59.

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> from datetime import datetime
>>> a = tp.event_set(
...    timestamps=[datetime(2020,1,1,18,30,55), datetime(2020,1,1,23,59,0)],
...    name='random_hours'
... )
>>> b = a.calendar_second()
>>> b
indexes: ...
features: [('calendar_second', int32)]
events:
    (2 events):
        timestamps: [...]
        'calendar_second': [55 0]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the second each timestamp in `sampling` belongs to.

calendar_year #

calendar_year(
    tz: Union[str, float, int] = 0
) -> EventSetOrNode

Obtains the year the timestamps in an EventSet's sampling are in.

Features in the input are ignored, only the timestamps are used and they must be unix timestamps (is_unix_timestamp=True).

By default, the timezone is UTC unless the tz argument is specified, as an offset in hours or a timezone name. See EventSet.calendar_hour() for an example using timezones.

Usage example

>>> a = tp.event_set(
...    timestamps=["2021-02-04", "2022-02-20", "2023-03-01", "2023-05-07"],
...    name='random_moments'
... )
>>> b = a.calendar_year()
>>> b
indexes: ...
features: [('calendar_year', int32)]
events:
    (4 events):
        timestamps: [...]
        'calendar_year': [2021 2022 2023 2023]
...

Parameters:

Name	Type	Description	Default
`tz`	`Union[str, float, int]`	timezone name (see `pytz.all_timezones`) or UTC offset in hours.	`0`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with the year each timestamp in `sampling` belongs to.

cast #

cast(
    target: TargetDtypes, check_overflow: bool = True
) -> EventSetOrNode

Casts the data types of an EventSet's features.

Features not impacted by cast are kept.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2],
...     features={"A": [0, 2], "B": ['a', 'b'], "C": [5.0, 5.5]},
... )

>>> # Cast all input features to the same dtype
>>> b = a[["A", "C"]].cast(tp.float32)
>>> b
indexes: []
features: [('A', float32), ('C', float32)]
events:
    (2 events):
        timestamps: [1. 2.]
        'A': [0. 2.]
        'C': [5.  5.5]
...


>>> # Cast by feature name
>>> b = a.cast({'A': bool, 'C': int})
>>> b
indexes: []
features: [('A', bool_), ('B', str_), ('C', int64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'A': [False  True]
        'B': [b'a' b'b']
        'C': [5  5]
...

>>> # Map original_dtype -> target_dtype
>>> b = a.cast({float: int, int: float})
>>> b
indexes: []
features: [('A', float64), ('B', str_), ('C', int64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'A': [0. 2.]
        'B': [b'a' b'b']
        'C': [5  5]
...

Parameters:

Name	Type	Description	Default
`target`	`TargetDtypes`	Single dtype or a map. Providing a single dtype will cast all columns to it. The mapping keys can be either feature names or the original dtypes (and not both types mixed), and the values are the target dtypes for them. All dtypes must be Temporian types (see `dtype.py`).	required
`check_overflow`	`bool`	Flag to check overflow when casting to a dtype with a shorter range (e.g: `INT64`->`INT32`). Note that this check adds some computation overhead. Defaults to `True`.	`True`

Returns:

Type	Description
`EventSetOrNode`	New EventSet (or the same if no features actually changed dtype), with the same feature names as the input one, but with the new dtypes as specified in `target`.

Raises:

Type	Description
`ValueError`	If `check_overflow=True` and some value is out of the range of the `target` dtype.
`ValueError`	If trying to cast a non-numeric string to numeric dtype.
`ValueError`	If `target` is neither a dtype nor a mapping.
`ValueError`	If `target` is a mapping, but some of the keys are not a dtype nor a feature in `input.feature_names`, or if those types are mixed.
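
The role of `check_overflow` can be sketched in plain Python for an int64 to int32 cast. Everything here is illustrative: the helper `cast_to_int32` is hypothetical, and the wrap-around applied when the check is disabled is an assumption that mimics fixed-width integers, since this page does not say what the library does in that case.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def cast_to_int32(values, check_overflow=True):
    """Casts Python ints to the int32 range, optionally rejecting overflow."""
    if check_overflow:
        for v in values:
            if not INT32_MIN <= v <= INT32_MAX:
                raise ValueError(f"Overflow casting {v} to int32")
    # Assumption: without the check, values wrap like two's-complement integers.
    return [(v - INT32_MIN) % 2**32 + INT32_MIN for v in values]


print(cast_to_int32([1, 2, 3]))
print(cast_to_int32([2**31], check_overflow=False))
```

The check costs one pass over the data, which is the computation overhead mentioned in the parameter description.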

check_same_sampling #

check_same_sampling(other: EventSet)

Checks if two EventSets have the same sampling.

cos #

cos() -> EventSetOrNode

Calculates the cosine of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4, 5],
...     features={"M": [0, np.pi/3, np.pi/2, np.pi, 2*np.pi]},
... )
>>> a.cos()
indexes: ...
        timestamps: [1. 2. 3. 4. 5.]
        'M': [ 1.0000e+00  5.0000e-01  6.1232e-17 -1.0000e+00  1.0000e+00]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with cosine of input features.

cumprod #

cumprod(
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the cumulative product of values over each feature in an EventSet.

This operation only supports floating-point features.

Missing (NaN) values are not accounted for. The output will be NaN until the input contains at least one numeric value.

Warning: The cumprod function leverages an infinite window length for its calculations, which may lead to considerable computational overhead with increasing dataset sizes.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 3],
...     features={"value": [1.0, 2.0, 10.0, 12.0]},
... )

>>> b = a.cumprod()
>>> b
indexes: ...
    (4 events):
        timestamps: [0. 1. 2. 3.]
        'value': [  1.   2.  20. 240.]
...

Examples with sampling

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [1, 2, 10, 12, np.nan, 2]},
... )

>>> # Cumulative product at 5 and 10
>>> b = tp.event_set(timestamps=[5, 10])
>>> c = a.cumprod(sampling=b)
>>> c
indexes: ...
    (2 events):
        timestamps: [ 5. 10.]
        'value': [240. 480.]
...

>>> # Product all values in the EventSet
>>> c = a.cumprod(sampling=a.end())
>>> c
indexes: ...
    (1 events):
        timestamps: [7.]
        'value': [480.]
...

Parameters:

Name	Type	Description	Default
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	Cumulative product of each feature.
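
The interaction between NaN handling and `sampling` can be sketched in plain Python. The helper `cumprod_at` is hypothetical and deliberately naive (it rescans the input for every sampling timestamp); it assumes the window includes events located exactly at the sampling timestamp, which is consistent with the example above.

```python
import math


def cumprod_at(timestamps, values, sampling):
    """Product of all non-NaN values with timestamp <= each sampling timestamp."""
    out = []
    for s in sampling:
        seen = [
            v for t, v in zip(timestamps, values)
            if t <= s and not math.isnan(v)
        ]
        # NaN until at least one numeric value has been observed.
        out.append(math.prod(seen) if seen else math.nan)
    return out


result = cumprod_at(
    [0, 1, 2, 5, 6, 7], [1.0, 2.0, 10.0, 12.0, math.nan, 2.0], sampling=[5, 10]
)
print(result)
```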

cumsum #

cumsum(
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the cumulative sum of values over each feature in an EventSet.

For each timestamp, calculates the sum of the feature from the beginning. Shorthand for moving_sum(event, window_length=np.inf).

Missing (NaN) values are not accounted for. The output will be NaN until the input contains at least one numeric value.

If sampling is specified, the moving window is sampled at each timestamp in it; otherwise it is sampled at the input's timestamps.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )

>>> b = a.cumsum()
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'value': [ 0. 1.  6.  16.  31.  51.]
...

Examples with sampling

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )

>>> # Cumulative sum at 5 and 10
>>> b = tp.event_set(timestamps=[5, 10])
>>> c = a.cumsum(sampling=b)
>>> c
indexes: ...
    (2 events):
        timestamps: [ 5. 10.]
        'value': [16. 51.]
...

>>> # Sum all values in the EventSet
>>> c = a.cumsum(sampling=a.end())
>>> c
indexes: ...
    (1 events):
        timestamps: [7.]
        'value': [51.]
...

Parameters:

Name	Type	Description	Default
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	Cumulative sum of each feature.

drop #

drop(
    feature_names: Union[str, List[str]]
) -> EventSetOrNode

Removes a subset of features from an EventSet.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2],
...     features={"A": [1, 2], "B": ['s', 'm'], "C": [5.0, 5.5]},
... )

>>> # Drop single feature
>>> bc = a.drop('A')
>>> bc
indexes: []
features: [('B', str_), ('C', float64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'B': [b's' b'm']
        'C': [5.  5.5]
...

>>> # Drop multiple features
>>> c = a.drop(['A', 'B'])
>>> c
indexes: []
features: [('C', float64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'C': [5.  5.5]
...

Parameters:

Name	Type	Description	Default
`feature_names`	`Union[str, List[str]]`	Name or list of names of the features to drop from the input.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet containing all features except the ones dropped.

drop_index #

drop_index(
    indexes: Optional[Union[str, List[str]]] = None,
    keep: bool = True,
) -> EventSetOrNode

Removes indexes from an EventSet.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2, 1, 0, 1, 1],
...     features={
...         "f1": [1, 1, 1, 2, 2, 2],
...         "f2": [1, 1, 2, 1, 1, 2],
...         "f3": [1, 1, 1, 1, 1, 1]
...     },
...     indexes=["f1", "f2"]
... )

>>> # Both f1 and f2 are indices
>>> a
indexes: [('f1', int64), ('f2', int64)]
features: [('f3', int64)]
events:
    f1=1 f2=1 (2 events):
        timestamps: [1. 2.]
        'f3': [1 1]
    f1=1 f2=2 (1 events):
        timestamps: [1.]
        'f3': [1]
    f1=2 f2=1 (2 events):
        timestamps: [0. 1.]
        'f3': [1 1]
    f1=2 f2=2 (1 events):
        timestamps: [1.]
        'f3': [1]
...

>>> # Drop "f2", remove it from features
>>> b = a.drop_index("f2", keep=False)
>>> b
indexes: [('f1', int64)]
features: [('f3', int64)]
events:
    f1=1 (3 events):
        timestamps: [1. 1. 2.]
        'f3': [1 1 1]
    f1=2 (3 events):
        timestamps: [0. 1. 1.]
        'f3': [1 1 1]
...

>>> # Drop both indices, keep them as features
>>> b = a.drop_index(["f2", "f1"])
>>> b
indexes: []
features: [('f3', int64), ('f2', int64), ('f1', int64)]
events:
    (6 events):
        timestamps: [0. 1. 1. 1. 1. 2.]
        'f3': [1 1 1 1 1 1]
        'f2': [2 1 1 2 2 1]
        'f1': [1 2 1 2 1 1]
...

Parameters:

Name	Type	Description	Default
`indexes`	`Optional[Union[str, List[str]]]`	Index column(s) to be removed from the input. This can be a single column name (`str`) or a list of column names (`List[str]`). If not specified or set to `None`, all indexes in the input will be removed. Defaults to `None`.	`None`
`keep`	`bool`	Flag indicating whether the removed indexes should be kept as features in the output EventSet. Defaults to `True`.	`True`

Returns:

Type	Description
`EventSetOrNode`	EventSet with the specified indexes removed. If `keep` is set to `True`, the removed indexes will be included as features in it.

Raises:

Type	Description
`ValueError`	If an empty list is provided as the `indexes` argument.
`KeyError`	If any of the specified `indexes` are missing from the input's index.
`ValueError`	If a feature name coming from the indexes already exists in the input, and the `keep` flag is set to `True`.
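
Dropping an index merges the index groups that only differed by that index, and the merged events end up ordered by timestamp. The sketch below reproduces that regrouping for the `drop_index("f2", keep=False)` case with plain dictionaries; the data layout and the helper `drop_last_index` are assumptions made for illustration only.

```python
from collections import defaultdict

# (f1, f2) index key -> list of (timestamp, f3) events, as in the example above.
events = {
    (1, 1): [(1, 1), (2, 1)],
    (1, 2): [(1, 1)],
    (2, 1): [(0, 1), (1, 1)],
    (2, 2): [(1, 1)],
}


def drop_last_index(grouped):
    """Merges groups that share the same key once the last index is removed."""
    merged = defaultdict(list)
    for key, group_events in grouped.items():
        merged[key[:-1]].extend(group_events)
    # Events of each merged group are re-sorted by timestamp.
    return {k: sorted(v, key=lambda e: e[0]) for k, v in merged.items()}


regrouped = drop_last_index(events)
for key, group_events in regrouped.items():
    print(key, [t for t, _ in group_events])
```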

end #

end() -> EventSetOrNode

Generates a single timestamp at the end of an EventSet, per index key.

Usage example

>>> a = tp.event_set(
...     timestamps=[5, 6, 7, 1],
...     features={"f": [50, 60, 70, 10], "idx": [1, 1, 1, 2]},
...     indexes=["idx"]
... )

>>> a_end = a.end()
>>> a_end
indexes: [('idx', int64)]
features: []
events:
    idx=1 (1 events):
        timestamps: [7.]
    idx=2 (1 events):
        timestamps: [1.]
...

Returns:

Type	Description
`EventSetOrNode`	A feature-less EventSet with a single timestamp per index group.

enumerate #

enumerate() -> EventSetOrNode

Creates an int64 feature with the ordinal position of each event in an EventSet.

Each index group is enumerated independently.

Usage

>>> a = tp.event_set(
...    timestamps=[-1, 2, 3, 5, 0],
...    features={"cat": ["A", "A", "A", "A", "B"]},
...    indexes=["cat"],
... )
>>> b = a.enumerate()
>>> b
indexes: [('cat', str_)]
features: [('enumerate', int64)]
events:
    cat=b'A' (4 events):
        timestamps: [-1.  2.  3.  5.]
        'enumerate': [0 1 2 3]
    cat=b'B' (1 events):
        timestamps: [0.]
        'enumerate': [0]
...

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature with each event's ordinal position in its index group.

equal #

equal(other: Any) -> EventSetOrNode

Checks element-wise equality of an EventSet to another one or to a single value.

Each feature is compared element-wise to the feature in other in the same position. Note that it will always return False on NaN elements.

Inputs must have the same sampling and the same number of features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
...     timestamps=[1, 2, 3],
...     features={"f2": [-10, 100, 5]},
...     same_sampling_as=a
... )

>>> # WARN: Don't use this for element-wise comparison
>>> a == b
False

>>> # Element-wise comparison to a scalar value
>>> c = a.equal(100)
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False True False]
...

>>> # Element-wise comparison between two EventSets
>>> c = a.equal(b)
>>> c
indexes: []
features: [('f1', bool_)]
events:
    (3 events):
        timestamps: [1. 2. 3.]
        'f1': [False True False]
...

Parameters:

Name	Type	Description	Default
`other`	`Any`	Second EventSet or single value to compare.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with boolean features.
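
The NaN behaviour follows the usual floating-point rule that NaN is not equal to anything, including itself. A minimal plain-Python sketch (the helper `elementwise_equal` is hypothetical and works on lists rather than EventSets):

```python
import math


def elementwise_equal(a, b):
    """Compares two equally long sequences position by position."""
    return [x == y for x, y in zip(a, b)]


# The last position holds NaN on both sides and still compares as False.
result = elementwise_equal([0.0, 100.0, math.nan], [-10.0, 100.0, math.nan])
print(result)
```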

experimental_fast_fourier_transform #

experimental_fast_fourier_transform(
    *,
    num_events: int,
    hop_size: Optional[int] = None,
    window: Optional[str] = None,
    num_spectral_lines: Optional[int] = None
) -> EventSetOrNode

Computes the Fast Fourier Transform of an EventSet with a single tp.float32 feature.

WARNING: This operator is experimental. The implementation is not yet optimized for speed, and the operator signature might change in the future.

The window length is defined in number of events, instead of timestamp duration like most other operators. The 'num_events' argument must be passed as a keyword argument, i.e. experimental_fast_fourier_transform(num_events=5) instead of experimental_fast_fourier_transform(5).

The operator returns the amplitude of each spectral line as separate tp.float32 features named "a0", "a1", "a2", etc. By default, num_events // 2 spectral lines are returned.

Usage

>>> a = tp.event_set(
...    timestamps=[1,2,3,4,5,6],
...    features={"x": [4.,3.,2.,6.,2.,1.]},
... )
>>> b = a.experimental_fast_fourier_transform(num_events=4, window="hamming")
>>> b
indexes: []
features: [('a0', float64), ('a1', float64)]
events:
     (2 events):
        timestamps: [4. 6.]
        'a0': [4.65 6.4 ]
        'a1': [2.1994 4.7451]
...

Parameters:

Name	Type	Description	Default
`num_events`	`int`	Size of the FFT expressed as a number of events.	required
`window`	`Optional[str]`	Optional window function applied before the FFT. If None, no window is applied. Supported values are: "hamming".	`None`
`hop_size`	`Optional[int]`	Step, in number of events, between consecutive outputs. Defaults to num_events//2.	`None`
`num_spectral_lines`	`Optional[int]`	Number of returned spectral lines. If set, the operator returns the `num_spectral_lines` low frequency spectral lines. `num_spectral_lines` should be between 1 and `num_events`.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the amplitude of each frequency band of the Fourier Transform.
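
The windowing and hop logic can be sketched with a direct (slow) discrete Fourier transform in plain Python. This is not the operator's implementation: the function names are hypothetical, the Hamming window uses the common symmetric 0.54/0.46 coefficients, and an output is assumed to be produced at the last event of each full window, every `hop_size` events. Under those assumptions the sketch gives the same amplitudes as the usage example above.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def windowed_amplitudes(values, num_events, hop_size=None, num_spectral_lines=None):
    """Amplitude of the low-frequency DFT bins for each window of events."""
    hop_size = hop_size or num_events // 2
    num_spectral_lines = num_spectral_lines or num_events // 2
    window = hamming(num_events)
    outputs = []
    # `end` is the (exclusive) index just after the last event of each window.
    for end in range(num_events, len(values) + 1, hop_size):
        frame = [v * w for v, w in zip(values[end - num_events:end], window)]
        outputs.append([
            abs(sum(x * cmath.exp(-2j * math.pi * k * n / num_events)
                    for n, x in enumerate(frame)))
            for k in range(num_spectral_lines)
        ])
    return outputs


amps = windowed_amplitudes([4.0, 3.0, 2.0, 6.0, 2.0, 1.0], num_events=4)
for a0, a1 in amps:
    print(round(a0, 4), round(a1, 4))
```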

fillna #

fillna(value: float = 0.0) -> EventSetOrNode

Replaces all the NaN values with value.

Features that cannot contain NaN values (e.g. integer or bytes features) are not impacted.

Usage example

>>> import math
>>> a = tp.event_set(
...     timestamps=[0, 1, 3],
...     features={
...         "f1": [0., 10., math.nan],
...         "f2": ["a","b",""]},
... )

>>> a.fillna()
indexes: []
features: [('f1', float64), ('f2', str_)]
events:
     (3 events):
        timestamps: [0. 1. 3.]
        'f1': [ 0. 10.  0.]
        'f2': [b'a' b'b' b'']
...

Parameters:

Name	Type	Description	Default
`value`	`float`	Value to replace NaNs with.	`0.0`

Returns:

Type	Description
`EventSetOrNode`	EventSet without NaNs.

filter #

filter(
    condition: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Filters out events in an EventSet for which a condition is false.

Each timestamp in the input is only kept if the corresponding value for that timestamp in condition is True.

The input and condition must have the same sampling, and condition must have a single feature of boolean type.

filter(x) is equivalent to filter(x,x). filter(x) can be used to convert a boolean mask into timestamps.

Usage example

>>> a = tp.event_set(
...     timestamps=[0, 1, 5, 6],
...     features={"f1": [0, 10, 50, 60], "f2": [50, 100, 500, 600]},
... )

>>> # Example boolean condition
>>> condition = a["f1"] > 20
>>> condition
indexes: ...
        timestamps: [0. 1. 5. 6.]
        'f1': [False False  True  True]
...

>>> # Filter only True timestamps
>>> filtered = a.filter(condition)
>>> filtered
indexes: ...
        timestamps: [5. 6.]
        'f1': [50 60]
        'f2': [500 600]
...

Parameters:

Name	Type	Description	Default
`condition`	`Optional[EventSetOrNode]`	EventSet with a single boolean feature.	`None`

Returns:

Type	Description
`EventSetOrNode`	Filtered EventSet.

filter_empty_index #

filter_empty_index() -> EventSetOrNode

Filters out indexes without events.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4],
...     features={
...         "i1": [1, 1, 2, 2],
...         "f1": [10, 11, 12, 13],
...     },
...     indexes=["i1"]
... )

>>> filtered = a.filter(a["f1"] <= 11).filter_empty_index()
>>> filtered
indexes: [('i1', int64)]
features: [('f1', int64)]
events:
    i1=1 (2 events):
        timestamps: [1. 2.]
        'f1': [10 11]
...

Returns:

Type	Description
`EventSetOrNode`	Filtered EventSet.

filter_moving_count #

filter_moving_count(
    window_length: Duration,
) -> EventSetOrNode

Filters out events such that no more than one output event is within a trailing time window of window_length.

Filtering is applied in chronological order: An event received at time t is filtered out if there is a non-filtered out event in (t-window_length, t].

This operator is different from (evset.moving_count(window_length) == 0).filter(). In filter_moving_count a filtered event does not block following events.

Usage example

>>> a = tp.event_set(timestamps=[1, 2, 3])
>>> b = a.filter_moving_count(window_length=1.5)
>>> b
indexes: []
features: []
events:
     (2 events):
        timestamps: [1. 3.]
...

Returns:

Type	Description
`EventSetOrNode`	EventSet without features with the filtered events.
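
The chronological rule, and why it differs from a count-based filter, can be sketched in plain Python. Both helpers are hypothetical and operate on a list of timestamps for a single index group; they are an illustration of the rule stated above, not the library's code.

```python
def filter_moving_count(timestamps, window_length):
    """Keeps an event only if no already-kept event lies in (t - window_length, t]."""
    kept = []
    for t in sorted(timestamps):
        if not kept or kept[-1] <= t - window_length:
            kept.append(t)
    return kept


def isolated_events(timestamps, window_length):
    """Count-based alternative: keeps events with no other event in the window."""
    ts = sorted(timestamps)
    return [
        t for i, t in enumerate(ts)
        if not any(t - window_length < u <= t for j, u in enumerate(ts) if j != i)
    ]


kept = filter_moving_count([1, 2, 3], window_length=1.5)
isolated = isolated_events([1, 2, 3], window_length=1.5)
print(kept, isolated)
```

The event at 2 is filtered out in both cases, but only in the count-based variant does it still block the event at 3.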

get_arbitrary_index_data #

get_arbitrary_index_data() -> Optional[IndexData]

Gets data from an arbitrary index key.

If the EventSet is empty, return None.

get_arbitrary_index_key #

get_arbitrary_index_key() -> Optional[IndexKey]

Gets an arbitrary index key.

If the EventSet is empty, return None.

get_index_value #

get_index_value(
    index_key: IndexKey, normalize: bool = True
) -> IndexData

Gets the value for a specified index key.

The index key must be a tuple of values corresponding to the indexes of the EventSet.

isnan #

isnan() -> EventSetOrNode

Returns boolean features, True in the NaN elements of the EventSet.

Note that for int and bool this will always be False since those types don't support NaNs. It only makes actual sense to use on float (or tp.float32) features.

join #

join(
    other: EventSetOrNode,
    how: str = "left",
    on: Optional[str] = None,
) -> EventSetOrNode

Joins EventSets with different samplings.

Joins features from two EventSets based on timestamps. Optionally, joins on timestamps and an extra int64 feature. Joined EventSets should have the same index and non-overlapping feature names.

To concatenate EventSets with the same sampling, use tp.glue() instead. tp.glue() is almost free while EventSet.join() can be expensive.

To resample an EventSet according to another EventSet's sampling, use EventSet.resample() instead.

Example:

```python
>>> a = tp.event_set(timestamps=[0, 1, 2], features={"A": [0, 10, 20]})
>>> b = tp.event_set(timestamps=[0, 2, 4], features={"B": [0., 2., 4.]})

>>> # Left join
>>> c = a.join(b)
>>> c
indexes: []
features: [('A', int64), ('B', float64)]
events:
    (3 events):
        timestamps: [0. 1. 2.]
        'A': [ 0 10 20]
        'B': [ 0. nan 2.]
...

```

Example with an index and feature join:

```python
>>> a = tp.event_set(
...     timestamps=[0, 1, 1, 1],
...     features={
...         "idx": [1, 1, 2, 2],
...         "match": [1, 2, 4, 5],
...         "A": [10, 20, 40, 50],
...     },
...     indexes=["idx"]
... )
>>> b = tp.event_set(
...     timestamps=[0, 1, 0, 1, 1, 1],
...     features={
...         "idx": [1, 1, 2, 2, 2, 2],
...         "match": [1, 2, 3, 4, 5, 6],
...         "B": [10., 20., 30., 40., 50., 60.],
...     },
...     indexes=["idx"]
... )

>>> # Join by index and 'match'
>>> c = a.join(b, on="match")
>>> c
indexes: [('idx', int64)]
features: [('match', int64), ('A', int64), ('B', float64)]
events:
    idx=1 (2 events):
        timestamps: [0. 1.]
        'match': [1 2]
        'A': [10 20]
        'B': [10. 20.]
    idx=2 (2 events):
        timestamps: [1. 1.]
        'match': [4 5]
        'A': [40 50]
        'B': [40. 50.]
...

```

Parameters:

Name	Type	Description	Default
`other`	`EventSetOrNode`	Right EventSet to join.	required
`how`	`str`	Whether to perform a `"left"`, `"inner"`, or `"outer"` join. Currently, only `"left"` join is supported.	`'left'`
`on`	`Optional[str]`	Optional extra int64 feature name to join on.	`None`

Returns:

Type	Description
`EventSetOrNode`	The joined EventSets.
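
The left-join semantics on timestamps can be sketched with a dictionary lookup. The helper `left_join` is hypothetical, works on lists of `(timestamp, value)` pairs for a single index group, and assumes the right side has at most one event per timestamp.

```python
import math


def left_join(left, right):
    """Keeps every left event and attaches the right value at the same timestamp."""
    right_by_timestamp = dict(right)
    # Left events without a match on the right receive NaN.
    return [(t, a, right_by_timestamp.get(t, math.nan)) for t, a in left]


joined = left_join(
    left=[(0, 0), (1, 10), (2, 20)],
    right=[(0, 0.0), (2, 2.0), (4, 4.0)],
)
print(joined)
```

The right event at timestamp 4 has no counterpart on the left and is therefore absent from the result.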

lag #

lag(duration: Duration) -> EventSetOrNode

Adds a delay to an EventSet's timestamps.

In other words, shifts the timestamp values forwards in time.

Usage example

>>> a = tp.event_set(
...     timestamps=[0, 1, 5, 6],
...     features={"value": [0, 1, 5, 6]},
... )

>>> b = a.lag(tp.duration.seconds(2))
>>> b
indexes: ...
    (4 events):
        timestamps: [2. 3. 7. 8.]
        'value': [0 1 5 6]
...

Parameters:

Name	Type	Description	Default
`duration`	`Duration`	Duration to lag by.	required

Returns:

Type	Description
`EventSetOrNode`	Lagged EventSet.

leak #

leak(duration: Duration) -> EventSetOrNode

Subtracts a duration from an EventSet's timestamps.

In other words, shifts the timestamp values backward in time.

Note that this operator moves future data into the past, and should be used with caution to prevent unwanted future leakage. For instance, this op should generally not be used to compute the input features of a model.

Usage example

>>> a = tp.event_set(
...     timestamps=[0, 1, 5, 6],
...     features={"value": [0, 1, 5, 6]},
... )

>>> b = a.leak(tp.duration.seconds(2))
>>> b
indexes: ...
    (4 events):
        timestamps: [-2. -1. 3. 4.]
        'value': [0 1 5 6]
...

Parameters:

Name	Type	Description	Default
`duration`	`Duration`	Duration to leak by.	required

Returns:

Type	Description
`EventSetOrNode`	Leaked EventSet.
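
The relationship between lag and leak can be sketched in plain Python, without calling Temporian. The helper `shift_timestamps` is a hypothetical name used only for illustration; it treats timestamps as a list of numbers and reproduces the two usage examples above.

```python
def shift_timestamps(timestamps, seconds):
    """Return a copy of `timestamps` with `seconds` added to each value."""
    return [t + seconds for t in timestamps]


timestamps = [0, 1, 5, 6]

# A positive shift moves events later in time, as lag does.
lagged = shift_timestamps(timestamps, 2)
# A negative shift moves events earlier in time, as leak does.
leaked = shift_timestamps(timestamps, -2)

print(lagged)  # [2, 3, 7, 8]
print(leaked)  # [-2, -1, 3, 4]
```

In this sketch, lagging the leaked timestamps by the same duration restores the original values, which is why leak exposes each event before it actually occurred.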

log #

log() -> EventSetOrNode

Calculates the natural logarithm of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4, 5],
...     features={"M": [np.e, 1., 2., 10., -1.]},
... )
>>> a.log()
indexes: ...
        timestamps: [1. 2. 3. 4. 5.]
        'M': [1. 0. 0.6931 2.3026 nan]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with logarithm of input features.

map #

map(
    func: MapFunction,
    output_dtypes: Optional[TargetDtypes] = None,
    receive_extras: bool = False,
) -> EventSetOrNode

Applies a function on each value of an EventSet's features.

The function receives the scalar value, and if receive_extras is True, also a MapExtras object containing information about the value's position in the EventSet. The MapExtras object should not be modified by the function, since it is shared across all calls.

If the output of the function has a different dtype than the input, the output_dtypes argument must be specified.

This operator is slow. When possible, existing operators should be used.

A Temporian graph with a map operator is not serializable.

Usage example with lambda function

>>> a = tp.event_set(
...     timestamps=[0, 1, 2],
...     features={"value": [10, 20, 30]},
... )

>>> b = a.map(lambda v: v + 1)
>>> b
indexes: ...
    (3 events):
        timestamps: [0. 1. 2.]
        'value': [11 21 31]
...

Usage example with output_dtypes:

>>> a = tp.event_set(
...     timestamps=[0, 1, 2],
...     features={"a": [10, 20, 30], "b": ["100", "200", "300"]},
... )

>>> def f(value):
...     if value.dtype == np.int64:
...         return float(value) + 1
...     else:
...         return int(value) + 2

>>> b = a.map(f, output_dtypes={"a": float, "b": int})
>>> b
indexes: ...
    (3 events):
        timestamps: [0. 1. 2.]
        'a': [11. 21. 31.]
        'b': [102 202 302]
...

Usage example with MapExtras:

>>> a = tp.event_set(
...     timestamps=[0, 1, 2],
...     features={"value": [10, 20, 30]},
... )

>>> def f(value, extras):
...     return f"{extras.feature_name}-{extras.timestamp}-{value}"

>>> b = a.map(f, output_dtypes=str, receive_extras=True)
>>> b
indexes: ...
    (3 events):
        timestamps: [0. 1. 2.]
        'value': [b'value-0.0-10' b'value-1.0-20' b'value-2.0-30']
...

Parameters:

Name	Type	Description	Default
`func`	`MapFunction`	The function to apply on each value.	required
`output_dtypes`	`Optional[TargetDtypes]`	Expected dtypes of the output feature(s) after applying the function to them. If not provided, the output dtypes will be expected to be the same as the input ones. If a single dtype, all features will be expected to have that dtype. If a mapping, the keys can be either feature names or the input dtypes (and not both types mixed), and the values are the target dtypes for them. All dtypes must be Temporian types (see `dtype.py`).	`None`
`receive_extras`	`bool`	Whether the function should receive a `MapExtras` object as second argument.	`False`

Returns:

Type	Description
`EventSetOrNode`	EventSet with the function applied on each value.

memory_usage #

memory_usage() -> int

Gets the approximated memory usage of the EventSet in bytes.

Takes into account garbage collector overhead.

moving_count #

moving_count(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Gets the number of events in a sliding window.

Creates a tp.int32 feature containing the number of events in the time window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e. if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

Example without sampling

>>> a = tp.event_set(timestamps=[0, 1, 2, 5, 6, 7])
>>> b = a.moving_count(tp.duration.seconds(2))
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'count': [1 2 2 1 2 2]
...

Example with sampling

>>> a = tp.event_set(timestamps=[0, 1, 2, 5])
>>> b = tp.event_set(timestamps=[-1, 0, 1, 2, 3, 4, 5, 6, 7])
>>> c = a.moving_count(tp.duration.seconds(2), sampling=b)
>>> c
indexes: ...
    (9 events):
        timestamps: [-1. 0. 1. 2. 3. 4. 5. 6. 7.]
        'count': [0 1 2 2 1 0 1 1 0]
...

Example with variable window length

>>> a = tp.event_set(timestamps=[0, 1, 2, 5])
>>> b = tp.event_set(
...     timestamps=[0, 3, 3, 3, 9],
...     features={
...         "w": [1, 0.5, 3.5, 2.5, 5],
...     },
... )
>>> c = a.moving_count(window_length=b)
>>> c
indexes: []
features: [('count', int32)]
events:
    (5 events):
        timestamps: [0. 3. 3. 3. 9.]
        'count': [1 0 3 2 1]
...

Example with index

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 0, 1, 2],
...     features={
...         "idx": ["i1", "i1", "i1", "i2", "i2", "i2"],
...     },
...     indexes=["idx"],
... )
>>> b = a.moving_count(tp.duration.seconds(2))
>>> b
indexes: [('idx', str_)]
features: [('count', int32)]
events:
    idx=b'i1' (3 events):
        timestamps: [1. 2. 3.]
        'count': [1 2 2]
    idx=b'i2' (3 events):
        timestamps: [0. 1. 2.]
        'count': [1 2 2]
...

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in `input` are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the count of events in `input` in a moving window.
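
The half-open window (t - window_length, t] can be sketched in plain Python. This is an illustration of the counting rule only, not Temporian's implementation; `moving_count_sketch` is a hypothetical helper, and it handles a single index value with timestamps given as a list.

```python
def moving_count_sketch(timestamps, window_length, sampling=None):
    """Count events in (t - window_length, t] for each sampling time t."""
    if sampling is None:
        sampling = timestamps
    # A scalar window applies to every sampling time;
    # a list provides one window length per sampling time.
    if isinstance(window_length, (int, float)):
        window_length = [window_length] * len(sampling)
    return [
        sum(1 for s in timestamps if t - w < s <= t)
        for t, w in zip(sampling, window_length)
    ]


# Sampled on the input's own timestamps.
print(moving_count_sketch([0, 1, 2, 5, 6, 7], 2))
# [1, 2, 2, 1, 2, 2]

# Sampled on external timestamps.
print(moving_count_sketch([0, 1, 2, 5], 2, sampling=[-1, 0, 1, 2, 3, 4, 5, 6, 7]))
# [0, 1, 2, 2, 1, 0, 1, 1, 0]

# Variable window length: one length per sampling timestamp.
print(moving_count_sketch([0, 1, 2, 5], [1, 0.5, 3.5, 2.5, 5], sampling=[0, 3, 3, 3, 9]))
# [1, 0, 3, 2, 1]
```

The lower bound is exclusive and the upper bound inclusive, so an event exactly window_length seconds before t is not counted.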

moving_max #

moving_max(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the maximum in a sliding window over an EventSet.

For each t in sampling, and for each index and feature independently, returns at time t the max of non-nan values for the feature in the window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e. if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

If the window does not contain any values (e.g., all the values are missing, or the window does not contain any sampling), outputs missing values.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 1, 15, 20]},
... )

>>> b = a.moving_max(tp.duration.seconds(4))
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'value': [nan 1. 5. 5. 15. 20.]
...

See EventSet.moving_count() for examples with external sampling and indices.

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the max of each feature in the input.
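
As a rough illustration of how NaN values and empty windows are treated, the following plain-Python sketch applies a reducer over the window (t - window_length, t]. It does not call Temporian; `moving_window_sketch` and its `reduce` argument are hypothetical names, and only a single index value sampled on its own timestamps is handled.

```python
import math


def moving_window_sketch(timestamps, values, window_length, reduce):
    """Apply `reduce` to the non-NaN values in (t - window_length, t]."""
    out = []
    for t in timestamps:
        window = [
            v
            for s, v in zip(timestamps, values)
            if t - window_length < s <= t and not math.isnan(v)
        ]
        # A window without usable values yields a missing value.
        out.append(reduce(window) if window else math.nan)
    return out


ts = [0, 1, 2, 5, 6, 7]
maxima = moving_window_sketch(ts, [math.nan, 1.0, 5.0, 1.0, 15.0, 20.0], 4, max)
print(maxima)
# [nan, 1.0, 5.0, 5.0, 15.0, 20.0]
```

Passing `min` instead of `max` gives the corresponding moving minimum under the same window rule.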

moving_min #

moving_min(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the minimum of values in a sliding window over an EventSet.

For each t in sampling, and for each index and feature independently, returns at time t the minimum of non-nan values for the feature in the window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e. if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

If the window does not contain any values (e.g., all the values are missing, or the window does not contain any sampling), outputs missing values.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )

>>> b = a.moving_min(tp.duration.seconds(4))
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'value': [nan 1. 1. 5. 10. 10.]
...

See EventSet.moving_count() for examples of moving window operations with external sampling and indices.

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the minimum of each feature in the input.

moving_product #

moving_product(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the product of values in a sliding window over an EventSet.

This operation only supports floating-point features.

For each t in sampling, and for each feature independently, returns at time t the product of non-zero and non-NaN values for the feature in the window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e., if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

Zeros result in the accumulator's result being 0 for the window. NaN values are ignored in the calculation of the product. If the window does not contain any NaN, zero or any non-zero values (e.g., all values are missing), the output for that window is an empty array.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2],
...     features={"value": [np.nan, 1, 5]},
... )

>>> b = a.moving_product(tp.duration.seconds(1))
>>> b
indexes: ...
    (3 events):
        timestamps: [0. 1. 2.]
        'value': [nan 1. 5.]
...

See EventSet.moving_count() for examples of moving window operations with external sampling and indices.

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the moving product of each feature in the input, considering non-zero and non-NaN values only.

moving_standard_deviation #

moving_standard_deviation(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the standard deviation of values in a sliding window over an EventSet.

For each t in sampling, and for each feature independently, returns at time t the standard deviation for the feature in the window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e. if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

Missing values (such as NaNs) are ignored.

If the window does not contain any values (e.g., all the values are missing, or the window does not contain any sampling), outputs missing values.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )

>>> b = a.moving_standard_deviation(tp.duration.seconds(4))
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'value': [ nan 0.  2.  2.5  2.5  4.0825]
...

See EventSet.moving_count() for examples of moving window operations with external sampling and indices.

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the moving standard deviation of each feature in the input.
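
The values in the example above are consistent with a population standard deviation (dividing by n rather than n - 1). The plain-Python sketch below makes that assumption explicit; it is inferred from the example output, not from Temporian's source, and `moving_std_sketch` is a hypothetical helper for a single index value.

```python
import math


def moving_std_sketch(timestamps, values, window_length):
    """Population standard deviation of non-NaN values in (t - window_length, t]."""
    out = []
    for t in timestamps:
        window = [
            v
            for s, v in zip(timestamps, values)
            if t - window_length < s <= t and not math.isnan(v)
        ]
        if not window:
            out.append(math.nan)  # empty window: missing value
            continue
        mean = sum(window) / len(window)
        # Assumption: divide by n (population variance), as the example suggests.
        variance = sum((v - mean) ** 2 for v in window) / len(window)
        out.append(math.sqrt(variance))
    return out


result = moving_std_sketch(
    [0, 1, 2, 5, 6, 7], [math.nan, 1.0, 5.0, 10.0, 15.0, 20.0], 4
)
print([round(x, 4) for x in result])
# [nan, 0.0, 2.0, 2.5, 2.5, 4.0825]
```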

moving_sum #

moving_sum(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the sum of values in a sliding window over an EventSet.

For each t in sampling, and for each feature independently, returns at time t the sum of values for the feature in the window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e. if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

Missing values (such as NaNs) are ignored.

If the window does not contain any values (e.g., all the values are missing, or the window does not contain any sampling), outputs missing values.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )

>>> b = a.moving_sum(tp.duration.seconds(4))
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'value': [ 0. 1.  6.  15.  25.  45.]
...

See EventSet.moving_count() for examples of moving window operations with external sampling and indices.

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the moving sum of each feature in the input.

node #

node(force_new_node: bool = False) -> EventSetNode

Creates an EventSetNode able to consume this EventSet.

If called multiple times with force_new_node=False (default), the same node is returned.

Usage example

>>> my_evset = tp.event_set(
...     timestamps=[1, 2, 3, 4],
...     features={
...         "feature_1": [0.5, 0.6, np.nan, 0.9],
...         "feature_2": ["red", "blue", "red", "blue"],
...     },
... )
>>> my_node = my_evset.node()

Parameters:

Name	Type	Description	Default
`force_new_node`	`bool`	If false (default), return the same node each time `node` is called. If true, a new node is created each time.	`False`

Returns:

Type	Description
`EventSetNode`	An EventSetNode able to consume this EventSet.

notnan #

notnan() -> EventSetOrNode

Returns boolean features, False in the NaN elements of an EventSet.

Equivalent to ~evset.isnan(...).

Note that for int and bool this will always be True since those types don't support NaNs. It only makes actual sense to use on float (or tp.float32) features.

num_events #

num_events() -> int

Total number of events.

num_indexes #

num_indexes() -> int

Total number of index values.

plot #

plot(*args, **wargs) -> Any

Plots the EventSet. See tp.plot() for details.

Example usage:

```python
>>> evset = tp.event_set(timestamps=[1, 2, 3], features={"f1": [0, 42, 10]})
>>> evset.plot()

```

prefix #

prefix(prefix: str) -> EventSetOrNode

Adds a prefix to the names of the features in an EventSet.

Usage example

>>> a = tp.event_set(
...    timestamps=[0, 1],
...    features={"f1": [0, 2], "f2": [5, 6]}
... )
>>> b = a * 5

>>> # Prefix before glue to avoid duplicated names
>>> c = tp.glue(a.prefix("original_"), b.prefix("result_"))
>>> c
indexes: ...
        'original_f1': [0 2]
        'original_f2': [5 6]
        'result_f1': [ 0 10]
        'result_f2': [25 30]
...

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to add in front of the feature names.	required

Returns:

Type	Description
`EventSetOrNode`	Prefixed EventSet.

propagate #

propagate(
    sampling: EventSetOrNode, resample: bool = False
) -> EventSetOrNode

Propagates feature values over another EventSet's index.

Given the input and sampling where the input's indexes are a subset of sampling's (e.g., the indexes of the input are ["x"], and the indexes of sampling are ["x","y"]), duplicates the features of the input over the indexes of sampling.

Index values in self but not in sampling are removed. An index value without timestamps is created for each index value in sampling but not in self.

Example use case

>>> products = tp.event_set(
...     timestamps=[1, 2, 3, 1, 2, 3],
...     features={
...         "product": [1, 1, 1, 2, 2, 2],
...         "sales": [100., 200., 500., 1000., 2000., 5000.]
...     },
...     indexes=["product"],
... )
>>> store = tp.event_set(
...     timestamps=[1, 2, 3, 4, 5],
...     features={
...         "sales": [10000., 20000., 30000., 5000., 1000.]
...     },
... )

>>> # First attempt: divide to calculate fraction of total store sales
>>> products / store
Traceback (most recent call last):
    ...
ValueError: Arguments don't have the same index. ...

>>> # Second attempt: propagate index
>>> store_prop = store.propagate(products)
>>> products / store_prop
Traceback (most recent call last):
    ...
ValueError: Arguments should have the same sampling. ...

>>> # Third attempt: propagate + resample
>>> store_resample = store.propagate(products, resample=True)
>>> div = products / store_resample
>>> div
indexes: [('product', int64)]
features: [('sales', float64)]
events:
    product=1 (3 events):
        timestamps: [1. 2. 3.]
        'sales': [0.01   0.01   0.0167]
    product=2 (3 events):
        timestamps: [1. 2. 3.]
        'sales': [0.1    0.1    0.1667]
...

Parameters:

Name	Type	Description	Default
`sampling`	`EventSetOrNode`	EventSet with the indexes to propagate to.	required
`resample`	`bool`	If true, apply a `EventSet.resample()` before propagating, for the output to have the same sampling as `sampling`.	`False`

Returns:

Type	Description
`EventSetOrNode`	EventSet propagated over `sampling`'s index.
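
The use case above can be mirrored with plain dictionaries to show what propagation and resampling each contribute. This is a simplified sketch that does not call Temporian: the variable names are illustrative, and the resampling step is reduced to an exact timestamp lookup, which is enough here because every product timestamp also exists in the store data.

```python
# Un-indexed store sales: timestamp -> value.
store = {1: 10000.0, 2: 20000.0, 3: 30000.0, 4: 5000.0, 5: 1000.0}

# Product sales indexed by product id: index key -> {timestamp -> value}.
products = {
    1: {1: 100.0, 2: 200.0, 3: 500.0},
    2: {1: 1000.0, 2: 2000.0, 3: 5000.0},
}

# Propagate: duplicate the store events under every index key of `products`.
store_prop = {key: dict(store) for key in products}

# Resample (simplified): keep the store value at each product timestamp.
store_resampled = {
    key: {t: store_prop[key][t] for t in events}
    for key, events in products.items()
}

# Both sides now share index keys and timestamps, so division is element-wise.
fraction = {
    key: [round(events[t] / store_resampled[key][t], 4) for t in sorted(events)]
    for key, events in products.items()
}
print(fraction)
# {1: [0.01, 0.01, 0.0167], 2: [0.1, 0.1, 0.1667]}
```

Without the resampling step, each propagated copy would still carry the store's five timestamps, which is why the second attempt in the example fails with a sampling mismatch.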

rename #

rename(
    features: Optional[
        Union[str, Dict[str, str], List[str]]
    ] = None,
    indexes: Optional[
        Union[str, Dict[str, str], List[str]]
    ] = None,
) -> EventSetOrNode

Renames an EventSet's features and index.

If the input has a single feature, then the features can be a single string with the new name.

If the input has multiple features, then features can either be (1) a dictionary mapping old names to the new names, or (2) a list of new names of the same size as evtset.schema.feature_names().

The indexes renaming follows the same criteria, accepting a single string, a mapping, or a list.

Usage example

>>> a = tp.event_set(
...    timestamps=[0, 1],
...    features={"f1": [0, 2], "f2": [5, 6]}
... )
>>> b = 5 * a

>>> # Rename single feature
>>> b_1 = b["f1"].rename("f1_result")
>>> b_1
indexes: []
features: [('f1_result', int64)]
events:
    (2 events):
        timestamps: [0. 1.]
        'f1_result': [ 0 10]
...

>>> # Rename multiple features with a dictionary
>>> b_rename = b.rename({"f1": "5xf1", "f2": "5xf2"})
>>> b_rename
indexes: []
features: [('5xf1', int64), ('5xf2', int64)]
events:
    (2 events):
        timestamps: [0. 1.]
        '5xf1': [ 0 10]
        '5xf2': [25 30]
...

>>> # Rename multiple features with a list
>>> b_rename = b.rename(["5xf1", "5xf2"])
>>> b_rename
indexes: []
features: [('5xf1', int64), ('5xf2', int64)]
events:
    (2 events):
        timestamps: [0. 1.]
        '5xf1': [ 0 10]
        '5xf2': [25 30]
...

Parameters:

Name	Type	Description	Default
`features`	`Optional[Union[str, Dict[str, str], List[str]]]`	New feature name or mapping from old names to new names.	`None`
`indexes`	`Optional[Union[str, Dict[str, str], List[str]]]`	New index name or mapping from old names to new names.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet with renamed features and index.

resample #

resample(sampling: EventSetOrNode) -> EventSetOrNode

Resamples an EventSet at each timestamp of another EventSet.

If a timestamp in sampling does not have a corresponding timestamp in the input, the value at the latest earlier timestamp in the input is used instead. If no input timestamp is earlier than or equal to it, the value is replaced by dtype.MissingValue(...).

Example

>>> a = tp.event_set(
...     timestamps=[1, 5, 8, 9],
...     features={"f1": [1.0, 2.0, 3.0, 4.0]}
... )
>>> b = tp.event_set(timestamps=[-1, 1, 6, 10])
>>> c = a.resample(b)
>>> c
indexes: ...
        timestamps: [-1.  1.  6. 10.]
        'f1': [nan  1.  2.  4.]
...

Parameters:

Name	Type	Description	Default
`sampling`	`EventSetOrNode`	EventSet to use the sampling of.	required

Returns:

Type	Description
`EventSetOrNode`	Resampled EventSet, with same sampling as `sampling`.
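
The lookup rule can be sketched in plain Python with a binary search. This is an illustration under simplifying assumptions (one index value, one float feature, sorted input timestamps), not Temporian's implementation; `resample_sketch` is a hypothetical helper.

```python
import bisect
import math


def resample_sketch(timestamps, values, sampling):
    """For each sampling time, take the value of the latest event at or before it."""
    out = []
    for t in sampling:
        # Index of the latest input timestamp <= t, or -1 if there is none.
        i = bisect.bisect_right(timestamps, t) - 1
        out.append(values[i] if i >= 0 else math.nan)
    return out


resampled = resample_sketch([1, 5, 8, 9], [1.0, 2.0, 3.0, 4.0], [-1, 1, 6, 10])
print(resampled)
# [nan, 1.0, 2.0, 4.0]
```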

select #

select(
    feature_names: Union[str, List[str]]
) -> EventSetOrNode

Selects a subset of features from an EventSet.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2],
...     features={"A": [1, 2], "B": ['s', 'm'], "C": [5.0, 5.5]},
... )

>>> # Select single feature
>>> b = a.select('B')
>>> # Equivalent
>>> b = a['B']
>>> b
indexes: []
features: [('B', str_)]
events:
    (2 events):
        timestamps: [1. 2.]
        'B': [b's' b'm']
...

>>> # Select multiple features
>>> bc = a.select(['B', 'C'])
>>> # Equivalent
>>> bc = a[['B', 'C']]
>>> bc
indexes: []
features: [('B', str_), ('C', float64)]
events:
    (2 events):
        timestamps: [1. 2.]
        'B': [b's' b'm']
        'C': [5.  5.5]
...

Parameters:

Name	Type	Description	Default
`feature_names`	`Union[str, List[str]]`	Name or list of names of the features to select from the input.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet containing only the selected features.

select_index_values #

select_index_values(
    keys: Optional[IndexKeyList] = None,
    *,
    number: Optional[int] = None,
    fraction: Optional[float] = None
) -> EventSetOrNode

Selects a subset of index values from an EventSet.

Exactly one of keys, number, or fraction should be provided.

If number or fraction is specified, the index values are selected randomly.

If fraction is specified and fraction * len(index keys) doesn't result in an integer, the number of index values selected is rounded down.

If used in compiled or graph mode, the specified keys are compiled as-is along with the operator, which means that they must be available when loading and running the graph on new data.

Example with keys with a single index and a single key:

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 3],
...     features={
...         "f": [10, 20, 30, 40],
...         "x": ["A", "B", "A", "B"],
...     },
...     indexes=["x"],
... )
>>> b = a.select_index_values("A")
>>> b
indexes: [('x', str_)]
features: [('f', int64)]
events:
    x=b'A' (2 events):
        timestamps: [0. 2.]
        'f': [10 30]
...

Example with keys with multiple indexes and keys:

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 3],
...     features={
...         "f": [10, 20, 30, 40],
...         "x": [1, 1, 2, 2],
...         "y": ["A", "B", "A", "B"],
...     },
...     indexes=["x", "y"],
... )
>>> b = a.select_index_values([(1, "A"), (2, "B")])
>>> b
indexes: [('x', int64), ('y', str_)]
features: [('f', int64)]
events:
    x=1 y=b'A' (1 events):
        timestamps: [0.]
        'f': [10]
    x=2 y=b'B' (1 events):
        timestamps: [3.]
        'f': [40]
...

Example with number:

>>> import random
>>> random.seed(0)

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 3],
...     features={
...         "f": [10, 20, 30, 40],
...         "x": [1, 1, 2, 2],
...         "y": ["A", "B", "A", "B"],
...     },
...     indexes=["x", "y"],
... )
>>> b = a.select_index_values(number=2)
>>> b
indexes: [('x', int64), ('y', str_)]
features: [('f', int64)]
events:
    x=1 y=b'A' (1 events):
        timestamps: [0.]
        'f': [10]
    x=2 y=b'A' (1 events):
        timestamps: [2.]
        'f': [30]
...

Example with fraction:

>>> import random
>>> random.seed(0)

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 3],
...     features={
...         "f": [10, 20, 30, 40],
...         "x": [1, 1, 2, 2],
...         "y": ["A", "B", "A", "B"],
...     },
...     indexes=["x", "y"],
... )
>>> b = a.select_index_values(fraction=0.75)
>>> b
indexes: [('x', int64), ('y', str_)]
features: [('f', int64)]
events:
    x=1 y=b'A' (1 events):
        timestamps: [0.]
        'f': [10]
    x=2 y=b'A' (1 events):
        timestamps: [2.]
        'f': [30]
...

Parameters:

Name	Type	Description	Default
`keys`	`Optional[IndexKeyList]`	Index key or list of index keys to select from the EventSet.	`None`
`number`	`Optional[int]`	Number of index values to select. If `number` is greater than the number of index values, all the index values are selected.	`None`
`fraction`	`Optional[float]`	Fraction of index values to select, expressed as a float between 0 and 1.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet with a subset of the index values.
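
The sizing rules for random selection (rounding down for fraction, capping number at the available count) can be sketched as follows. This plain-Python helper is hypothetical and does not reproduce which keys Temporian actually picks; the `seed` argument and the use of `random.Random.sample` are assumptions made for the illustration.

```python
import math
import random


def pick_index_keys(keys, number=None, fraction=None, seed=None):
    """Randomly pick index keys by count or by fraction of the available keys."""
    if (number is None) == (fraction is None):
        raise ValueError("Provide exactly one of number or fraction")
    if fraction is not None:
        # A non-integer fraction * len(keys) is rounded down.
        number = math.floor(fraction * len(keys))
    # Asking for more keys than exist selects all of them.
    number = min(number, len(keys))
    return random.Random(seed).sample(keys, number)


keys = [(1, "A"), (1, "B"), (2, "A"), (2, "B")]
print(len(pick_index_keys(keys, fraction=0.75, seed=0)))  # 3
print(len(pick_index_keys(keys, fraction=0.6, seed=0)))   # 2
print(len(pick_index_keys(keys, number=10, seed=0)))      # 4
```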

set_index #

set_index(indexes: Union[str, List[str]]) -> EventSetOrNode

Replaces the index in an EventSet.

Usage example

>>> a = tp.event_set(
...     timestamps=[1, 2, 1, 0, 1, 1],
...     features={
...         "f1": [1, 1, 1, 2, 2, 2],
...         "f2": [1, 1, 2, 1, 1, 2],
...         "f3": [1, 1, 1, 1, 1, 1]
...     },
...     indexes=["f1"],
... )

>>> # "f1" is the current index
>>> a
indexes: [('f1', int64)]
features: [('f2', int64), ('f3', int64)]
events:
    f1=1 (3 events):
        timestamps: [1. 1. 2.]
        'f2': [1 2 1]
        'f3': [1 1 1]
    f1=2 (3 events):
        timestamps: [0. 1. 1.]
        'f2': [1 1 2]
        'f3': [1 1 1]
...

>>> # Set "f2" as the only index, remove "f1"
>>> b = a.set_index("f2")
>>> b
indexes: [('f2', int64)]
features: [('f3', int64), ('f1', int64)]
events:
    f2=1 (4 events):
        timestamps: [0. 1. 1. 2.]
        'f3': [1 1 1 1]
        'f1': [2 1 2 1]
    f2=2 (2 events):
        timestamps: [1. 1.]
        'f3': [1 1]
        'f1': [1 2]
...

>>> # Set both "f1" and "f2" as indices
>>> b = a.set_index(["f1", "f2"])
>>> b
indexes: [('f1', int64), ('f2', int64)]
features: [('f3', int64)]
events:
    f1=1 f2=1 (2 events):
        timestamps: [1. 2.]
        'f3': [1 1]
    f1=1 f2=2 (1 events):
        timestamps: [1.]
        'f3': [1]
    f1=2 f2=1 (2 events):
        timestamps: [0. 1.]
        'f3': [1 1]
    f1=2 f2=2 (1 events):
        timestamps: [1.]
        'f3': [1]
...

Parameters:

Name	Type	Description	Default
`indexes`	`Union[str, List[str]]`	List of index / feature names (strings) used as the new indexes. These names should be either indexes or features in the input.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with the updated indexes.

Raises:

Type	Description
`KeyError`	If any of the specified `indexes` are not found in the input.
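
Re-indexing amounts to regrouping the same events under new keys. The plain-Python sketch below reproduces the grouping shown in the usage example using flat rows; `set_index_sketch` is a hypothetical helper, and the ordering of events that share a timestamp is an assumption (a stable sort on the original row order) that happens to match the example output.

```python
from collections import defaultdict


def set_index_sketch(rows, index_names):
    """Group flat event rows (dicts with a 'timestamp' key) by the given columns."""
    groups = defaultdict(list)
    for row in rows:
        # Raises KeyError if a requested name is not a column of the row.
        key = tuple(row[name] for name in index_names)
        groups[key].append(row)
    # Within each index value, events are ordered by timestamp (stable sort).
    return {
        key: sorted(events, key=lambda r: r["timestamp"])
        for key, events in sorted(groups.items())
    }


rows = [
    {"timestamp": t, "f1": f1, "f2": f2, "f3": 1}
    for t, f1, f2 in zip([1, 2, 1, 0, 1, 1], [1, 1, 1, 2, 2, 2], [1, 1, 2, 1, 1, 2])
]

by_f2 = set_index_sketch(rows, ["f2"])
print([r["timestamp"] for r in by_f2[(1,)]])  # [0, 1, 1, 2]
print([r["f1"] for r in by_f2[(1,)]])         # [2, 1, 2, 1]

by_both = set_index_sketch(rows, ["f1", "f2"])
print({key: [r["timestamp"] for r in evts] for key, evts in by_both.items()})
# {(1, 1): [1, 2], (1, 2): [1], (2, 1): [0, 1], (2, 2): [1]}
```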

set_index_value #

set_index_value(
    index_key: IndexKey,
    value: IndexData,
    normalize: bool = True,
) -> None

Sets the value for a specified index key.

The index key must be a tuple of values corresponding to the indexes of the EventSet.

simple_moving_average #

simple_moving_average(
    window_length: WindowLength,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the average of values in a sliding window over an EventSet.

For each t in sampling, and for each feature independently, returns at time t the average value of the feature in the window (t - window_length, t].

sampling can't be specified if a variable window_length is specified (i.e. if window_length is an EventSet).

If sampling is specified or window_length is an EventSet, the moving window is sampled at each timestamp in them, else it is sampled on the input's.

Missing values (such as NaNs) are ignored.

If the window does not contain any values (e.g., all the values are missing, or the window does not contain any timestamp), outputs missing values.

Example

>>> a = tp.event_set(
...     timestamps=[0, 1, 2, 5, 6, 7],
...     features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )

>>> b = a.simple_moving_average(tp.duration.seconds(4))
>>> b
indexes: ...
    (6 events):
        timestamps: [0. 1. 2. 5. 6. 7.]
        'value': [ nan 1.  3. 7.5  12.5  15. ]
...

See EventSet.moving_count() for examples of moving window operations with external sampling and indices.

Parameters:

Name	Type	Description	Default
`window_length`	`WindowLength`	Sliding window's length.	required
`sampling`	`Optional[EventSetOrNode]`	Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used.	`None`

Returns:

Type	Description
`EventSetOrNode`	EventSet containing the moving average of each feature in the input.

sin #

sin() -> EventSetOrNode

Calculates the sine of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4, 5],
...     features={"M": [0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi]},
... )
>>> a.sin()
indexes: ...
        timestamps: [1. 2. 3. 4. 5.]
        'M': [ 0.0000e+00  1.0000e+00  1.2246e-16 -1.0000e+00 -2.4493e-16]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with sine of input features.

since_last #

since_last(
    steps: int = 1,
    sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode

Computes the amount of time since the last previous timestamp in an EventSet.

If a number of steps is provided, compute elapsed time after moving back that number of previous events.

Basic example with 1 and 2 steps

>>> a = tp.event_set(timestamps=[1, 5, 8, 8, 9])

>>> # Default: time since previous event
>>> b = a.since_last()
>>> b
indexes: ...
        timestamps: [1. 5. 8. 8. 9.]
        'since_last': [nan  4.  3.  0.  1.]
...

>>> # Time since 2 previous events
>>> b = a.since_last(steps=2)
>>> b
indexes: ...
        timestamps: [1. 5. 8. 8. 9.]
        'since_last': [nan  nan  7.  3.  1.]
...

If sampling is provided, the output will correspond to the time elapsed between each timestamp in sampling and the latest previous or equal timestamp in the input.

Example with sampling

>>> a = tp.event_set(timestamps=[1, 4, 5, 7])
>>> b = tp.event_set(timestamps=[-1, 2, 4, 6, 10])

>>> # Time elapsed between each sampling event
>>> # and the latest previous event in a
>>> c = a.since_last(sampling=b)
>>> c
indexes: ...
        timestamps: [-1. 2. 4. 6. 10.]
        'since_last': [nan  1.  0.  1. 3.]
...

>>> # 2 steps with sampling
>>> c = a.since_last(steps=2, sampling=b)
>>> c
indexes: ...
        timestamps: [-1. 2. 4. 6. 10.]
        'since_last': [nan  nan  3.  2. 5.]
...

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of previous events to compute elapsed time with.	`1`
`sampling`	`Optional[EventSetOrNode]`	EventSet to use the sampling of.	`None`

Returns:

Type	Description
`EventSetOrNode`	Resulting EventSet, with same sampling as `sampling` if provided, or as the input if not.
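
The two modes count events slightly differently: without sampling, an event never counts as its own predecessor, whereas with sampling an input event at exactly the sampling time does count. The plain-Python sketch below reproduces the four outputs above under those rules; `since_last_sketch` is a hypothetical helper for a single index value with sorted timestamps, not Temporian's implementation.

```python
import bisect
import math


def since_last_sketch(timestamps, steps=1, sampling=None):
    """Elapsed time since the event `steps` positions back."""
    out = []
    if sampling is None:
        # Without sampling: distance to the event `steps` positions earlier.
        for i, t in enumerate(timestamps):
            out.append(t - timestamps[i - steps] if i >= steps else math.nan)
    else:
        # With sampling: events at or before t count, starting with the latest.
        for t in sampling:
            i = bisect.bisect_right(timestamps, t) - steps
            out.append(t - timestamps[i] if i >= 0 else math.nan)
    return out


print(since_last_sketch([1, 5, 8, 8, 9]))
# [nan, 4, 3, 0, 1]
print(since_last_sketch([1, 5, 8, 8, 9], steps=2))
# [nan, nan, 7, 3, 1]
print(since_last_sketch([1, 4, 5, 7], sampling=[-1, 2, 4, 6, 10]))
# [nan, 1, 0, 1, 3]
print(since_last_sketch([1, 4, 5, 7], steps=2, sampling=[-1, 2, 4, 6, 10]))
# [nan, nan, 3, 2, 5]
```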

tan #

tan() -> EventSetOrNode

Calculates the tangent of an EventSet's features.

Can only be used on floating point features.

Example

>>> a = tp.event_set(
...     timestamps=[1, 2, 3, 4],
...     features={"M": [0, np.pi/4, np.pi/3, np.pi/6]},
... )
>>> a.tan()
indexes: ...
        timestamps: [1. 2. 3. 4.]
        'M': [0.     1.     1.7321 0.5774]
...

Returns:

Type	Description
`EventSetOrNode`	EventSetOrNode with tangent of input features.

tick #

tick(
    interval: Duration,
    align: bool = True,
    after_last: bool = True,
    before_first: bool = False,
) -> EventSetOrNode

Generates timestamps at regular intervals in the range of a guide EventSet.

Example with align

>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=True)
>>> b
indexes: ...
        timestamps: [ 6. 9. 12. 15. 18.]
...

Example without align

>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=False)
>>> b
indexes: ...
        timestamps: [ 5. 8. 11. 14. 17.]
...

Example with before_first

>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=True, before_first=True)
>>> b
indexes: ...
        timestamps: [ 3. 6. 9. 12. 15. 18.]
...

Example without after_last

>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=True, after_last=False)
>>> b
indexes: ...
        timestamps: [ 6. 9. 12. 15.]
...

Parameters:

Name	Type	Description	Default
`interval`	`Duration`	Tick interval.	required
`align`	`bool`	If False, the first tick is generated at the first timestamp (similar to `EventSet.begin()`). If True (default), ticks are generated on timestamps that are multiples of `interval`.	`True`
`after_last`	`bool`	If True, a tick after the last timestamp is included.	`True`
`before_first`	`bool`	If True, a tick before the first timestamp is included.	`False`

Returns:

Type	Description
`EventSetOrNode`	A feature-less EventSet with regular timestamps.
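The arithmetic behind `align`, `after_last` and `before_first` can be sketched in plain Python. `tick_sketch` is a hypothetical helper written for illustration, not the library's code. It assumes a single index and numeric timestamps, and it assumes that a boundary tick is added only when no tick already falls exactly on the first or last timestamp, a case the examples above do not cover.

```python
import math


def tick_sketch(timestamps, interval, align=True,
                after_last=True, before_first=False):
    first, last = min(timestamps), max(timestamps)
    # With align, start at the first multiple of `interval` that is >= first.
    start = math.ceil(first / interval) * interval if align else first
    # Number of ticks falling inside [first, last].
    count = max(0, math.floor((last - start) / interval) + 1)
    ticks = [float(start + k * interval) for k in range(count)]
    # Assumption: extra ticks are added only if the boundary is not already hit.
    if after_last and (not ticks or ticks[-1] < last):
        ticks.append(float(start + count * interval))
    if before_first and start > first:
        ticks.insert(0, float(start - interval))
    return ticks


guide = [5, 9, 16]
aligned = tick_sketch(guide, 3)
unaligned = tick_sketch(guide, 3, align=False)
with_before = tick_sketch(guide, 3, before_first=True)
without_after = tick_sketch(guide, 3, after_last=False)
```

On the guide timestamps `[5, 9, 16]` the sketch reproduces the four example outputs shown for `tick`.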

tick_calendar #

tick_calendar(
    second: Optional[Union[int, Literal["*"]]] = None,
    minute: Optional[Union[int, Literal["*"]]] = None,
    hour: Optional[Union[int, Literal["*"]]] = None,
    mday: Optional[Union[int, Literal["*"]]] = None,
    month: Optional[Union[int, Literal["*"]]] = None,
    wday: Optional[Union[int, Literal["*"]]] = None,
    after_last: bool = True,
    before_first: bool = False,
) -> EventSetOrNode

Generates events periodically at fixed times or dates e.g. each month.

Events are generated in the range of the input EventSet independently for each index.

The usage is inspired by the crontab format, where arguments can take a value of '*' to tick at all values, or a fixed integer to tick only at that precise value.

Non-specified values (None) are set to '*' if a finer resolution argument is specified, or fixed to the first valid value if a lower resolution argument is specified. For example, setting only tick_calendar(hour='*') is equivalent to tick_calendar(second=0, minute=0, hour='*', mday='*', month='*'), resulting in one tick at every exact hour of every day/month/year in the input guide range.
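One way to read this rule is sketched below. `resolve_calendar_args` is a hypothetical helper and not part of Temporian. It ignores `wday`, and rejecting a call with no argument at all is an assumption of the sketch rather than documented behaviour.

```python
# Calendar fields ordered from finest to coarsest resolution.
FIELDS = ["second", "minute", "hour", "mday", "month"]
FIRST_VALID = {"second": 0, "minute": 0, "hour": 0, "mday": 1, "month": 1}


def resolve_calendar_args(second=None, minute=None, hour=None,
                          mday=None, month=None):
    given = dict(zip(FIELDS, [second, minute, hour, mday, month]))
    specified = [i for i, name in enumerate(FIELDS) if given[name] is not None]
    if not specified:
        # Assumption of this sketch: at least one argument must be set.
        raise ValueError("Specify at least one argument.")
    finest, coarsest = specified[0], specified[-1]
    resolved = {}
    for i, name in enumerate(FIELDS):
        if given[name] is not None:
            resolved[name] = given[name]
        elif i < finest:
            # Finer than every specified field: fix to the first valid value.
            resolved[name] = FIRST_VALID[name]
        elif i > coarsest:
            # Coarser than every specified field: tick at all values.
            resolved[name] = "*"
        else:
            # None between two specified fields is ambiguous.
            raise ValueError(f"'{name}' must be set to '*' or an integer.")
    return resolved


hourly = resolve_calendar_args(hour="*")
february = resolve_calendar_args(month=2)
```

Here `hourly` matches the equivalence stated above, and `february` resolves to day 1 at 00:00:00 of month 2.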

The datetime timezone is always assumed to be UTC.

Examples:

>>> # Every day (at 00:00:00) in the period (exactly one year)
>>> a = tp.event_set(timestamps=["2021-01-01", "2021-12-31 23:59:59"])
>>> b = a.tick_calendar(hour=0)
>>> b
indexes: ...
events:
    (366 events):
        timestamps: [...]
...


>>> # Every day at 2:30am
>>> b = a.tick_calendar(hour=2, minute=30)
>>> tp.glue(b.calendar_hour(), b.calendar_minute())
indexes: ...
events:
    (366 events):
        timestamps: [...]
        'calendar_hour': [2 2 2 ... 2 2 2]
        'calendar_minute': [30 30 30 ... 30 30 30]
...


>>> # Day 5 of every month (at 00:00)
>>> b = a.tick_calendar(mday=5)
>>> b.calendar_day_of_month()
indexes: ...
events:
    (13 events):
        timestamps: [...]
        'calendar_day_of_month': [5 5 5 ... 5 5 5]
...


>>> # 1st of February of every year
>>> a = tp.event_set(timestamps=["2020-01-01", "2021-12-31"])
>>> b = a.tick_calendar(month=2)
>>> tp.glue(b.calendar_day_of_month(), b.calendar_month())
indexes: ...
events:
    (3 events):
        timestamps: [...]
        'calendar_day_of_month': [1 1 1]
        'calendar_month': [2 2 2]
...

>>> # Every second in the period  (2 hours -> 7200 seconds)
>>> a = tp.event_set(timestamps=["2020-01-01 00:00:00",
...                              "2020-01-01 01:59:59"])
>>> b = a.tick_calendar(second='*')
>>> b
indexes: ...
events:
    (7200 events):
        timestamps: [...]
...

>>> # Every second of the minute 30 of every hour (00:30 and 01:30)
>>> a = tp.event_set(timestamps=["2020-01-01 00:00",
...                              "2020-01-01 02:00"])
>>> b = a.tick_calendar(second='*', minute=30)
>>> b
indexes: ...
events:
    (121 events):
        timestamps: [...]
...

>>> # Not allowed: intermediate arguments (minute, hour) not specified
>>> b = a.tick_calendar(second=1, mday=1)  # ambiguous meaning
Traceback (most recent call last):
    ...
ValueError: Can't set argument to None because previous and
following arguments were specified. Set to '*' or an integer ...

>>> # not after_last
>>> a = tp.event_set(timestamps=["2020-02-01", "2020-04-01"])
>>> b = a.tick_calendar(mday=10, after_last=False)
>>> tp.glue(b.calendar_day_of_month(), b.calendar_month())
indexes: ...
events:
    (2 events):
        timestamps: [...]
        'calendar_day_of_month': [10 10]
        'calendar_month': [2 3]
...

>>> # before_first
>>> a = tp.event_set(timestamps=["2020-02-01", "2020-04-01"])
>>> b = a.tick_calendar(mday=10, before_first=True)
>>> tp.glue(b.calendar_day_of_month(), b.calendar_month())
indexes: ...
events:
    (4 events):
        timestamps: [...]
        'calendar_day_of_month': [10 10 10 10]
        'calendar_month': [1 2 3 4]
...

Parameters:

Name	Type	Description	Default
`second`	`Optional[Union[int, Literal['*']]]`	'*' (any second), None (auto) or number in range `[0-59]` to tick at specific second of each minute.	`None`
`minute`	`Optional[Union[int, Literal['*']]]`	'*' (any minute), None (auto) or number in range `[0-59]` to tick at specific minute of each hour.	`None`
`hour`	`Optional[Union[int, Literal['*']]]`	'*' (any hour), None (auto), or number in range `[0-23]` to tick at specific hour of each day.	`None`
`mday`	`Optional[Union[int, Literal['*']]]`	'*' (any day), None (auto) or number in range `[1-31]` to tick at specific day of each month. Note that months without some particular day may not have any tick (e.g: day 31 on February).	`None`
`month`	`Optional[Union[int, Literal['*']]]`	'*' (any month), None (auto) or number in range `[1-12]` to tick at one particular month of each year.	`None`
`wday`	`Optional[Union[int, Literal['*']]]`	'*' (any day), None (auto) or number in range `[0-6]` (Sun-Sat) to tick at particular day of week. Can only be specified if `mday` is `None`.	`None`
`after_last`	`bool`	If True, a tick after the last timestamp is included. Useful for window operations where you want the timestamps to be included in the range of the ticks.	`True`
`before_first`	`bool`	If True, a tick before the first timestamp is included. Useful for window operations where you want the timestamps to be included in the range of the ticks.	`False`

Returns:

Type	Description
`EventSetOrNode`	A feature-less EventSet with timestamps at specified interval.

timestamps #

timestamps() -> EventSetOrNode

Converts an EventSet's timestamps into a float64 feature.

Features in the input EventSet are ignored, only the timestamps are used.

Datetime timestamps are converted to unix timestamps.

Integer timestamps example

>>> from datetime import datetime
>>> a = tp.event_set(timestamps=[1, 2, 3, 5])
>>> b = a.timestamps()
>>> b
indexes: []
features: [('timestamps', float64)]
events:
    (4 events):
        timestamps: [1. 2. 3. 5.]
        'timestamps': [1. 2. 3. 5.]
...

Unix timestamps and filter example

>>> from datetime import datetime, timezone
>>> a = tp.event_set(
...    timestamps=[datetime(1970,1,1,0,0,30), datetime(2023,1,1,1,0,0)],
... )
>>> b = a.timestamps()

>>> # Filter using the timestamps
>>> max_date = datetime(2020, 1, 1, tzinfo=timezone.utc).timestamp()
>>> c = b.filter(b < max_date)

>>> # Operate like any other feature
>>> d = c * 5
>>> e = tp.glue(c.rename('filtered'), d.rename('multiplied'))
>>> e
indexes: []
features: [('filtered', float64), ('multiplied', float64)]
events:
    (1 events):
        timestamps: ['1970-01-01T00:00:30']
        'filtered': [30.]
        'multiplied': [150.]
...

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature named `timestamps` with each event's timestamp.
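The datetime-to-unix conversion can be checked with the standard library alone. The helper below is a hypothetical sketch: treating naive datetimes as UTC is an assumption based on the example output above (a naive `datetime(1970,1,1,0,0,30)` yields `30.`). Plain `datetime.timestamp()` on a naive value uses the local timezone instead, which is why the filter threshold in the example is built with `tzinfo=timezone.utc`.

```python
from datetime import datetime, timezone


def to_unix_seconds(dt):
    # Assumption: a naive datetime is interpreted as UTC.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


early = to_unix_seconds(datetime(1970, 1, 1, 0, 0, 30))
late = to_unix_seconds(datetime(2023, 1, 1, 1, 0, 0))
max_date = datetime(2020, 1, 1, tzinfo=timezone.utc).timestamp()

# Only the 1970 event passes the `< max_date` filter.
kept = [t for t in (early, late) if t < max_date]
```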

unique_timestamps #

unique_timestamps() -> EventSetOrNode

Removes events with duplicated timestamps from an EventSet.

Returns a feature-less EventSet where each timestamp from the original one only appears once. If the input is indexed, the unique operation is applied independently for each index.

Usage example

>>> a = tp.event_set(timestamps=[5, 9, 9, 16], features={'f': [1,2,3,4]})
>>> b = a.unique_timestamps()
>>> b
indexes: []
features: []
events:
    (3 events):
        timestamps: [ 5. 9. 16.]
...

Returns:

Type	Description
`EventSetOrNode`	Feature-less EventSet containing the unique timestamps of the input.

until_next #

until_next(
    sampling: EventSetOrNode, timeout: Duration
) -> EventSetOrNode

Gets the duration until the next sampling event for each input event.

If no sampling event is observed before timeout time-units, returns NaN.

until_next differs from since_last in which events carry the output: since_last returns one value for each sampling event, while until_next returns one value for each input event. In both cases, the sampling events come after the input events they are measured against.

The output EventSet has one event for each event in input, but with its timestamp moved forward to the nearest future event in sampling. If no timestamp in sampling is closer than timeout, it is moved by timeout into the future instead.

until_next is useful to measure the time it takes for an issue (input) to be detected by an alert (sampling).

Basic example

>>> a = tp.event_set(timestamps=[0, 10, 11, 20, 30])
>>> b = tp.event_set(timestamps=[1, 12, 21, 22, 42])
>>> c = a.until_next(sampling=b, timeout=5)
>>> c
indexes: []
features: [('until_next', float64)]
events:
    (5 events):
        timestamps: [ 1. 12. 12. 21. 35.]
        'until_next': [ 1.  2.  1.  1. nan]
...

Parameters:

Name	Type	Description	Default
`sampling`	`EventSetOrNode`	EventSet to use the sampling of.	required
`timeout`	`Duration`	Maximum amount of time to wait. If no sampling is observed before the timeout expires, the output feature value is NaN.	required

Returns:

Type	Description
`EventSetOrNode`	Resulting EventSet.
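The pairing of input events with sampling events can be modelled in plain Python. `until_next_sketch` is a hypothetical helper and not the library's implementation. Two boundary choices are assumptions of the sketch that the example above does not exercise: a sampling event at exactly the same time as an input event counts as the next one, and a wait of exactly `timeout` still counts as observed.

```python
import bisect
import math


def until_next_sketch(event_times, sampling_times, timeout):
    sampling = sorted(sampling_times)
    out_times, out_values = [], []
    for t in event_times:
        # Index of the first sampling timestamp >= t (assumption: inclusive).
        i = bisect.bisect_left(sampling, t)
        if i < len(sampling) and sampling[i] - t <= timeout:
            # Output event is moved to the sampling event.
            out_times.append(float(sampling[i]))
            out_values.append(float(sampling[i] - t))
        else:
            # No sampling event in time: move by `timeout`, value is NaN.
            out_times.append(float(t + timeout))
            out_values.append(math.nan)
    return out_times, out_values


times, values = until_next_sketch(
    [0, 10, 11, 20, 30], [1, 12, 21, 22, 42], timeout=5
)
```

The returned timestamps and values match the example above, including the final event that times out at 35.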

where #

where(
    on_true: Union[EventSetOrNode, Any],
    on_false: Union[EventSetOrNode, Any],
) -> EventSetOrNode

Choose event-wise feature values from on_true or on_false depending on the boolean value of self.

Given an input EventSet with a single boolean feature, create a new one using the same sampling, and choosing values from on_true when the input is True, otherwise take value from on_false.

Both on_true and on_false can be single values or EventSets with the same sampling as the boolean input and one single feature. In any case, both sources must have the same data type, or be explicitly cast to the same type beforehand.

Example with single values

>>> a = tp.event_set(timestamps=[5, 9, 9],
...                  features={'f': [True, True, False]})
>>> b = a.where(on_true='hello', on_false='goodbye')
>>> b
indexes: ...
events:
    (3 events):
        timestamps: [5. 9. 9.]
        'f': [b'hello' b'hello' b'goodbye']
...

Example with EventSets

>>> a = tp.event_set(timestamps=[5, 9, 10],
...                  features={'condition': [True, True, False],
...                            'yes': [1, 2, 3],
...                            'no': [-1, -2, -3]})

>>> b = a['condition'].where(a['yes'], a['no'])
>>> b
indexes: ...
events:
    (3 events):
        timestamps: [ 5. 9. 10.]
        'condition': [ 1 2 -3]
...

Example setting to NaN based on condition

>>> a = tp.event_set(timestamps=[5, 6, 7, 8, 9],
...                  features={'f': [1, 2, -3, -4, 5]})

>>> # Set values < 0 to nan (cast to float to support nan)
>>> b = (a['f'] >= 0).where(a['f'].cast(float), np.nan)
>>> b
indexes: ...
events:
    (5 events):
        timestamps: [5. 6. 7. 8. 9.]
        'f': [ 1. 2. nan nan 5.]
...

Parameters:

Name	Type	Description	Default
`on_true`	`Union[EventSetOrNode, Any]`	Source of values to use when the condition is True.	required
`on_false`	`Union[EventSetOrNode, Any]`	Source of values to use when the condition is False.	required

Returns:

Type	Description
`EventSetOrNode`	EventSet with a single feature and same sampling as input.
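The event-wise selection can be pictured with plain lists. `where_sketch` is a hypothetical helper for illustration only. It assumes each source is either a scalar or a list aligned with the condition, and it does not model Temporian's dtype checks.

```python
import math


def where_sketch(condition, on_true, on_false):
    def broadcast(source):
        # A scalar source is repeated once per event.
        if isinstance(source, list):
            return source
        return [source] * len(condition)

    true_values = broadcast(on_true)
    false_values = broadcast(on_false)
    return [
        t if c else f
        for c, t, f in zip(condition, true_values, false_values)
    ]


labels = where_sketch([True, True, False], "hello", "goodbye")
picked = where_sketch([True, True, False], [1, 2, 3], [-1, -2, -3])

# Replace negative values with NaN, as in the last example above.
f = [1, 2, -3, -4, 5]
masked = where_sketch([v >= 0 for v in f], [float(v) for v in f], math.nan)
```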
