temporian.EventSet #
Actual temporal data.
Use tp.event_set() to create an EventSet manually, or tp.from_pandas() to create an EventSet from a pandas DataFrame.
creator property #
creator: Optional[Operator]
Creator.
The creator is the operator that produced this EventSet. Manually created EventSets have a None creator.
__add__ #
__add__(other: Any) -> EventSetOrNode
Adds an EventSet or a scalar value to self element-wise.
If an EventSet, each feature in self is added to the feature in other in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.
If a scalar, other is added to each item in each feature in self.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f3": [-1, 1, 2], "f4": [1, -1, 5]},
... same_sampling_as=a
... )
>>> c = a + b
>>> c
indexes: []
features: [('f1', int64), ('f2', int64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ -1 101 202]
'f2': [ 11 -11 10]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )
>>> b = a + 3
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 3 103 203]
'f2': [13 -7 8]
...
>>> b = 3 + a
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 3 103 203]
'f2': [13 -7 8]
...
Cast dtypes example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [10., -10., 5.]}
... )
>>> # Cannot add: f1 is int64 but f2 is float64
>>> c = a["f1"] + a["f2"]
Traceback (most recent call last):
...
ValueError: ... corresponding features should have the same dtype. ...
>>> # Cast f1 to float
>>> c = a["f1"].cast(tp.float64) + a["f2"]
>>> c
indexes: []
features: [('f1', float64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ 10. 90. 205.]
...
Resample example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"fa": [1, 2, 3]},
... )
>>> b = tp.event_set(
... timestamps=[-1, 1.5, 3, 5],
... features={"fb": [-10, 15, 30, 50]},
... )
>>> # Cannot add different samplings
>>> c = a + b
Traceback (most recent call last):
...
ValueError: ... should have the same sampling. ...
>>> # Resample a to match b timestamps
>>> c = a.resample(b) + b
>>> c
indexes: []
features: [('fa', int64)]
events:
(4 events):
timestamps: [-1. 1.5 3. 5. ]
'fa': [-10 16 33 53]
...
Reindex example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4],
... features={
... "cat": [1, 1, 2, 2],
... "M": [10, 20, 30, 40]
... },
... indexes=["cat"]
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3, 4],
... features={
... "cat": [1, 1, 2, 2],
... "N": [10, 20, 30, 40]
... },
... )
>>> # Cannot add with different index (only 'a' is indexed by 'cat')
>>> c = a + b
Traceback (most recent call last):
...
ValueError: Arguments don't have the same index. ...
>>> # Add index 'cat' to b
>>> b = b.add_index("cat")
>>> # Make explicit same samplings and add
>>> c = a + b.resample(a)
>>> c
indexes: [('cat', int64)]
features: [('M', int64)]
events:
cat=1 (2 events):
timestamps: [1. 2.]
'M': [20 40]
cat=2 (2 events):
timestamps: [3. 4.]
'M': [60 80]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
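The resample example above yields a value even at the first timestamp of b, where a has no earlier event. The following plain-Python sketch (not Temporian's implementation) reproduces that result under two assumptions: resampling takes the most recent value at or before each target timestamp, and an integer feature with no earlier value is filled with 0. The helper name `resample_latest` is hypothetical.

```python
from bisect import bisect_right


def resample_latest(src_ts, src_vals, target_ts, missing=0):
    """Return, for each target timestamp, the latest source value at or before it.

    Assumption: integer features without an earlier value are filled with 0.
    """
    out = []
    for t in target_ts:
        i = bisect_right(src_ts, t)  # number of source timestamps <= t
        out.append(src_vals[i - 1] if i > 0 else missing)
    return out


a_ts, fa = [1, 2, 3], [1, 2, 3]
b_ts, fb = [-1, 1.5, 3, 5], [-10, 15, 30, 50]

fa_on_b = resample_latest(a_ts, fa, b_ts)
total = [x + y for x, y in zip(fa_on_b, fb)]
print(fa_on_b)  # [0, 1, 3, 3]
print(total)    # [-10, 16, 33, 53]
```

The printed sum matches the 'fa' row in the resample example.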
__and__ #
__and__(other: Any) -> EventSetOrNode
Computes logical and (self & other
) element-wise with another
EventSet
.
Each feature in self
is compared element-wise to the feature in
other
in the same position.
self
and other
must have the same sampling, the same number of
features, and all feature types must be bool
(see cast example below).
Example
>>> a = tp.event_set(timestamps=[1, 2, 3], features={"f1": [100, 150, 200]})
>>> # Sample boolean features
>>> b = a > 100
>>> c = a < 200
>>> d = b & c
>>> d
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False True False]
...
Example casting integer to boolean
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 1, 1], "f2": [1, 1, 0]}
... )
>>> b = a.cast(bool)
>>> c = b["f1"] & b["f2"]
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False True False]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet with only boolean features. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with result of the comparison. |
__floordiv__ #
__floordiv__(other: Any) -> EventSetOrNode
Divides self
by an EventSet
or a scalar
value and takes the floor of the result, element-wise.
If an EventSet, each feature in self
is divided by the feature in
other
in the same position. self
and other
must have the same
sampling, index, number of features and dtype for the features in the
same positions.
If a scalar, each item in each feature in self
is divided by other
.
See examples in EventSet.__add__()
to
see how to match samplings, dtypes and index, in order to apply
arithmetic operators in different EventSets.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [10, 3, 150]},
... same_sampling_as=a
... )
>>> c = a // b
>>> c
indexes: []
features: [('f1', int64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ 0 33 1]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [1, 100, 200], "f2": [10., -10., 5.]}
... )
>>> b = a // 3
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 0 33 66]
'f2': [ 3. -4. 1.]
...
>>> c = 300 // a
>>> c
indexes: ...
timestamps: [1. 2. 3.]
'f1': [300 3 1]
'f2': [ 30. -30. 60.]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
__ge__ #
__ge__(other: Any) -> EventSetOrNode
Computes greater than or equal (self >= other) element-wise with another EventSet or a scalar value.
If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.
If a scalar value, each item in each feature in self is compared to other.
Note that it will always return False on NaN elements.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [-10, 100, 5]},
... same_sampling_as=a
... )
>>> c = a >= b
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True True True]
...
Example with scalar
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )
>>> b = a >= 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False True True]
'f2': [False True False]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the comparison. |
__getitem__ #
__getitem__(feature_names: Union[str, List[str]])
Creates an EventSet with a subset of the features.
__gt__ #
__gt__(other: Any) -> EventSetOrNode
Computes greater than (self > other) element-wise with another EventSet or a scalar value.
If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.
If a scalar value, each item in each feature in self is compared to other.
Note that it will always return False on NaN elements.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [-10, 100, 5]},
... same_sampling_as=a
... )
>>> c = a > b
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True False True]
...
Example with scalar
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )
>>> b = a > 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False False True]
'f2': [False False False]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the comparison. |
__invert__ #
__invert__() -> EventSetOrNode
Inverts a boolean EventSet
element-wise.
Swaps False <-> True.
Does not work on integers, they should be cast to
tp.bool_
beforehand, using
EventSet.cast()
.
Example
>>> a = tp.event_set(
... timestamps=[1, 2],
... features={"M": [1, 5], "N": [1.0, 5.5]},
... )
>>> # Boolean EventSet
>>> b = a < 2
>>> b
indexes: ...
'M': [ True False]
'N': [ True False]
...
>>> # Inverted EventSet
>>> c = ~b
>>> c
indexes: ...
'M': [False True]
'N': [False True]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | Inverted EventSet. |
__le__ #
__le__(other: Any) -> EventSetOrNode
Computes less than or equal (self <= other) element-wise with another EventSet or a scalar value.
If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.
If a scalar value, each item in each feature in self is compared to other.
Note that it will always return False on NaN elements.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [-10, 100, 5]},
... same_sampling_as=a
... )
>>> c = a <= b
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False True False]
...
Example with scalar
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )
>>> b = a <= 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True True False]
'f2': [ True True True]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the comparison. |
__lt__ #
__lt__(other: Any) -> EventSetOrNode
Computes less than (self < other) element-wise with another EventSet or a scalar value.
If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.
If a scalar value, each item in each feature in self is compared to other.
Note that it will always return False on NaN elements.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [-10, 100, 5]},
... same_sampling_as=a
... )
>>> c = a < b
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False False False]
...
Example with scalar
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )
>>> b = a < 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True False False]
'f2': [ True False True]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the comparison. |
__mod__ #
__mod__(other: Any) -> EventSetOrNode
Computes modulo or remainder of division with another
EventSet
or a scalar value.
If an EventSet, each feature in self
is reduced modulo the feature in
other
in the same position. self
and other
must have the same
sampling, index, number of features and dtype for the features in the
same positions.
If a scalar, each item in each feature in self
is reduced modulo
other
.
See examples in EventSet.__add__()
to
see how to match samplings, dtypes and index, in order to apply
arithmetic operators in different EventSets.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 7, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [10, 5, 150]},
... same_sampling_as=a
... )
>>> c = a % b
>>> c
indexes: []
features: [('f1', int64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ 0 2 50]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [1, 100, 200], "f2": [10., -10., 5.]}
... )
>>> b = a % 3
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [1 1 2]
'f2': [1. 2. 2.]
...
>>> c = 300 % a
>>> c
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 0 0 100]
'f2': [ 0. -0. 0.]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
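The negative entries in the scalar examples of `__floordiv__` and `__mod__` are consistent with floored division, the convention Python itself uses for `//` and `%`. The plain-Python check below is independent of Temporian and only illustrates that convention. It reproduces the 'f2' rows of `a // 3` and `a % 3` shown above.

```python
f2 = [10.0, -10.0, 5.0]

floor_div = [x // 3 for x in f2]  # quotient rounded toward negative infinity
remainder = [x % 3 for x in f2]   # remainder takes the sign of the divisor

print(floor_div)  # [3.0, -4.0, 1.0]
print(remainder)  # [1.0, 2.0, 2.0]

# Floored division and modulo satisfy x == (x // d) * d + (x % d).
identity_holds = all(x == (x // 3) * 3 + (x % 3) for x in f2)
print(identity_holds)  # True
```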
__mul__ #
__mul__(other: Any) -> EventSetOrNode
Multiplies an EventSet
or a scalar value with
self
element-wise.
If an EventSet, each feature in self
is multiplied with the feature in
other
in the same position. self
and other
must have the same
sampling, index, number of features and dtype for the features in the
same positions.
If a scalar, each item in each feature in self
is multiplied with
other
.
See examples in EventSet.__add__()
to
see how to match samplings, dtypes and index, in order to apply
arithmetic operators in different EventSets.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [10, 3, 2]},
... same_sampling_as=a
... )
>>> c = a * b
>>> c
indexes: []
features: [('f1', int64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ 0 300 400]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )
>>> b = a * 2
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 0 200 400]
'f2': [ 20 -20 10]
...
>>> b = 2 * a
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 0 200 400]
'f2': [ 20 -20 10]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
__ne__ #
__ne__(other: Any) -> EventSetOrNode
Computes not equal (self != other) element-wise with another EventSet or a scalar value.
If an EventSet, each feature in self is compared element-wise to the feature in other in the same position. self and other must have the same sampling and the same number of features.
If a scalar value, each item in each feature in self is compared to other.
Note that it will always return True on NaNs (even if both are).
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [-10, 100, 5]},
... same_sampling_as=a
... )
>>> c = a != b
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True False True]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [-10, 100, 5]}
... )
>>> b = a != 100
>>> b
indexes: []
features: [('f1', bool_), ('f2', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True False True]
'f2': [ True False True]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the comparison. |
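The NaN notes on the comparison operators (ordering comparisons return False, `!=` returns True) match standard IEEE 754 floating-point behavior. The short plain-Python illustration below does not use Temporian. It only shows the underlying float semantics that the notes describe.

```python
import math

nan = math.nan
values = [1.0, nan, 3.0]

ge = [v >= 2.0 for v in values]  # ordering comparisons with NaN are False
lt = [v < 2.0 for v in values]
ne = [v != 2.0 for v in values]  # inequality with NaN is True

print(ge)          # [False, False, True]
print(lt)          # [True, False, False]
print(ne)          # [True, True, True]
print(nan != nan)  # True, even when both operands are NaN
```

One consequence is that `a >= x` and `a < x` are both False at a NaN element, so the two results are not complements of each other there.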
__neg__ #
__neg__() -> EventSetOrNode
Negates an EventSet
element-wise.
Example
>>> a = tp.event_set(
... timestamps=[1, 2],
... features={"M": [1, -5], "N": [-1.0, 5.5]},
... )
>>> -a
indexes: ...
'M': [-1 5]
'N': [ 1. -5.5]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | Negated EventSet. |
__or__ #
__or__(other: Any) -> EventSetOrNode
Computes logical or (self | other
) element-wise with another
EventSet
.
Each feature in self
is compared element-wise to the feature in
other
in the same position.
self
and other
must have the same sampling, the same number of
features, and all feature types must be bool
.
See cast example in EventSet.__and__()
.
Example
>>> a = tp.event_set(timestamps=[1, 2, 3], features={"f1": [100, 150, 200]})
>>> # Sample boolean features
>>> b = a <= 100
>>> c = a >= 200
>>> d = b | c
>>> d
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True False True]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet with only boolean features. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with result of the comparison. |
__pow__ #
__pow__(other: Any) -> EventSetOrNode
Computes power with another
EventSet
or a scalar value element-wise.
If an EventSet, each feature in self
is raised to the feature in
other
in the same position. self
and other
must have the same
sampling, index, number of features and dtype for the features in the
same positions.
If a scalar, each item in each feature in self
is raised to
other
.
See examples in EventSet.__add__()
to
see how to match samplings, dtypes and index, in order to apply
arithmetic operators in different EventSets.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [5, 2, 4]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [0, 3, 2]},
... same_sampling_as=a
... )
>>> c = a ** b
>>> c
indexes: []
features: [('f1', int64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ 1 8 16]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 2, 3], "f2": [1., 2., 3.]}
... )
>>> b = a ** 3
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 0 8 27]
'f2': [ 1. 8. 27.]
...
>>> c = 3 ** a
>>> c
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 1 9 27]
'f2': [ 3. 9. 27.]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
__setitem__ #
__setitem__(feature_names: Any, value: Any) -> None
Fails, features cannot be assigned.
__sub__ #
__sub__(other: Any) -> EventSetOrNode
Subtracts an EventSet or a scalar value from self element-wise.
If an EventSet, each feature in other is subtracted from the feature in self in the same position. self and other must have the same sampling, index, number of features and dtype for the features in the same positions.
If a scalar, other is subtracted from each item in each feature in self.
See examples in EventSet.__add__()
to
see how to match samplings, dtypes and index, in order to apply
arithmetic operators in different EventSets.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [10, 20, -5]},
... same_sampling_as=a
... )
>>> c = a - b
>>> c
indexes: []
features: [('f1', int64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [-10 80 205]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200], "f2": [10, -10, 5]}
... )
>>> b = a - 3
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ -3 97 197]
'f2': [ 7 -13 2]
...
>>> c = 3 - a
>>> c
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 3 -97 -197]
'f2': [-7 13 -2]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
__truediv__ #
__truediv__(other: Any) -> EventSetOrNode
Divides self
by an EventSet
or a scalar
value element-wise.
If an EventSet, each feature in self
is divided by the feature in
other
in the same position. self
and other
must have the same
sampling, index, number of features and dtype for the features in the
same positions.
If a scalar, each item in each feature in self
is divided by other
.
This operator cannot be used in features with dtypes int32
or int64
.
Cast to float before (see example) or use
EventSet.__floordiv__()
instead.
See examples in EventSet.__add__()
to
see how to match samplings, dtypes and index, in order to apply
arithmetic operators in different EventSets.
Example with EventSet
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0.0, 100.0, 200.0]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [10.0, 20.0, 50.0]},
... same_sampling_as=a
... )
>>> c = a / b
>>> c
indexes: []
features: [('f1', float64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [0. 5. 4.]
...
Example casting integer features
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [10, 20, 50]},
... same_sampling_as=a
... )
>>> # Cannot divide int64 features
>>> c = a / b
Traceback (most recent call last):
...
ValueError: Cannot use the divide operator on feature f1 of type int64. ...
>>> # Cast to tp.float64 or tp.float32 before
>>> c = a.cast(tp.float64) / b.cast(tp.float64)
>>> c
indexes: []
features: [('f1', float64)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [0. 5. 4.]
...
Example with scalar value
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0., 100., 200.], "f2": [10., -10., 5.]}
... )
>>> b = a / 2
>>> b
indexes: ...
timestamps: [1. 2. 3.]
'f1': [ 0. 50. 100.]
'f2': [ 5. -5. 2.5]
...
>>> c = 1000 / a
>>> c
indexes: ...
timestamps: [1. 2. 3.]
'f1': [inf 10. 5.]
'f2': [ 100. -100. 200.]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet or scalar value. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Result of the operation. |
__xor__ #
__xor__(other: Any) -> EventSetOrNode
Computes logical xor (self ^ other
) element-wise with another
EventSet
.
Each feature in self
is compared element-wise to the feature in
other
in the same position.
self
and other
must have the same sampling, the same number of
features, and all feature types must be bool
.
See cast example in EventSet.__and__()
.
Example
>>> a = tp.event_set(timestamps=[1, 2, 3], features={"f1": [100, 150, 200]})
>>> # Sample boolean features
>>> b = a > 100
>>> c = a < 200
>>> d = b ^ c
>>> d
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [ True False True]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | EventSet with only boolean features. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with result of the comparison. |
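The logical operators `&`, `|`, `^` and `~` can be compared side by side on the same pair of boolean features. The sketch below uses plain Python lists rather than EventSets, with the same values as the `__and__` and `__xor__` examples. The helper `elementwise` is a hypothetical name introduced only for this illustration.

```python
import operator

f1 = [100, 150, 200]
b = [v > 100 for v in f1]  # [False, True, True]
c = [v < 200 for v in f1]  # [True, True, False]


def elementwise(op, x, y):
    """Apply a binary operator position by position."""
    return [op(p, q) for p, q in zip(x, y)]


and_result = elementwise(operator.and_, b, c)
or_result = elementwise(operator.or_, b, c)
xor_result = elementwise(operator.xor, b, c)
not_b = [not p for p in b]

print(and_result)  # [False, True, False]
print(or_result)   # [True, True, True]
print(xor_result)  # [True, False, True]
print(not_b)       # [True, False, False]
```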
abs #
abs() -> EventSetOrNode
Gets the absolute value of an EventSet's features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"M":[np.nan, -1., 2.], "N": [-1, -3, 5]},
... )
>>> a.abs()
indexes: ...
'M': [nan 1. 2.]
'N': [1 3 5]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with positive valued features. |
add_index #
add_index(indexes: Union[str, List[str]]) -> EventSetOrNode
Adds indexes to an EventSet
.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2, 1, 0, 1, 1],
... features={
... "f1": [1, 1, 1, 2, 2, 2],
... "f2": [1, 1, 2, 1, 1, 2],
... "f3": [1, 1, 1, 1, 1, 1]
... },
... )
>>> # No index
>>> a
indexes: []
features: [('f1', int64), ('f2', int64), ('f3', int64)]
events:
(6 events):
timestamps: [0. 1. 1. 1. 1. 2.]
'f1': [2 1 1 2 2 1]
'f2': [1 1 2 1 2 1]
'f3': [1 1 1 1 1 1]
...
>>> # Add only "f1" as index
>>> b = a.add_index("f1")
>>> b
indexes: [('f1', int64)]
features: [('f2', int64), ('f3', int64)]
events:
f1=1 (3 events):
timestamps: [1. 1. 2.]
'f2': [1 2 1]
'f3': [1 1 1]
f1=2 (3 events):
timestamps: [0. 1. 1.]
'f2': [1 1 2]
'f3': [1 1 1]
...
>>> # Add "f1" and "f2" as indices
>>> b = a.add_index(["f1", "f2"])
>>> b
indexes: [('f1', int64), ('f2', int64)]
features: [('f3', int64)]
events:
f1=1 f2=1 (2 events):
timestamps: [1. 2.]
'f3': [1 1]
f1=1 f2=2 (1 events):
timestamps: [1.]
'f3': [1]
f1=2 f2=1 (2 events):
timestamps: [0. 1.]
'f3': [1 1]
f1=2 f2=2 (1 events):
timestamps: [1.]
'f3': [1]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indexes | Union[str, List[str]] | List of feature names (strings) that should be added to the indexes. These feature names should already exist in the input. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with the extended index. |
Raises:
Type | Description |
---|---|
KeyError | If any of the specified |
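Conceptually, adding an index splits the events into one group per distinct value of the index feature. The plain-Python sketch below is not Temporian's implementation. It sorts the events of the usage example by timestamp and buckets them by "f1", assuming ties keep their input order.

```python
from collections import defaultdict

timestamps = [1, 2, 1, 0, 1, 1]
f1 = [1, 1, 1, 2, 2, 2]
f2 = [1, 1, 2, 1, 1, 2]

# Sort events by timestamp (stable sort), then bucket them by the value of "f1".
events = sorted(zip(timestamps, f1, f2), key=lambda e: e[0])
groups = defaultdict(lambda: {"timestamps": [], "f2": []})
for ts, key, value in events:
    groups[key]["timestamps"].append(ts)
    groups[key]["f2"].append(value)

print(groups[1])  # {'timestamps': [1, 1, 2], 'f2': [1, 2, 1]}
print(groups[2])  # {'timestamps': [0, 1, 1], 'f2': [1, 1, 2]}
```

The two groups correspond to the f1=1 and f1=2 blocks in the output of `a.add_index("f1")`.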
after #
after(
timestamp: Union[int, float, datetime]
) -> EventSetOrNode
Filters the events of an EventSet, keeping those that happened after a particular timestamp.
The timestamp can be a datetime if the EventSet's timestamps are unix timestamps.
The comparison is strict, meaning that the obtained timestamps would be greater than (>) the provided timestamp.
This operation is equivalent to:
input.filter(input.timestamps() > timestamp)
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1, 5, 6],
... features={"f1": [0, 10, 50, 60]},
... )
>>> a.after(4)
indexes: []
features: [('f1', int64)]
events:
(2 events):
timestamps: [5. 6.]
'f1': [50 60]
...
>>> from datetime import datetime
>>> a = tp.event_set(
... timestamps=[datetime(2022, 1, 1), datetime(2022, 1, 2)],
... features={"f1": [1, 2]},
... )
>>> a.after(datetime(2022, 1, 1, 12))
indexes: []
features: [('f1', int64)]
events:
(1 events):
timestamps: ['2022-01-02T00:00:00']
'f1': [2]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timestamp | Union[int, float, datetime] | Timestamp after which events are kept (exclusive). | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Filtered EventSet. |
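The datetime variant of the example can be pictured as a comparison between Unix timestamps. The sketch below is plain Python, not Temporian code, and assumes that naive datetimes are interpreted as UTC. The helper `to_unix` is a hypothetical name.

```python
from datetime import datetime, timezone


def to_unix(dt):
    # Assumption: naive datetimes are interpreted as UTC.
    return dt.replace(tzinfo=timezone.utc).timestamp()


event_ts = [to_unix(datetime(2022, 1, 1)), to_unix(datetime(2022, 1, 2))]
f1 = [1, 2]
cutoff = to_unix(datetime(2022, 1, 1, 12))

# Keep only events strictly after the cutoff.
kept = [(ts, v) for ts, v in zip(event_ts, f1) if ts > cutoff]
kept_values = [v for _, v in kept]
print(kept_values)  # [2]
```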
arccos #
arccos() -> EventSetOrNode
Calculates the inverse cosine of an EventSet
's features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"M": [1.0, 0, -1.0]},
... )
>>> a.arccos()
indexes: ...
timestamps: [1. 2. 3.]
'M': [0. 1.5708 3.1416]
...
Returns:
Type | Description |
---|---|
EventSetOrNode
|
EventSetOrNode with inverse cosine of input features. |
arcsin #
arcsin() -> EventSetOrNode
Calculates the inverse sine of an EventSet
's features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"M": [0, 0.5, -0.5]},
... )
>>> a.arcsin()
indexes: ...
timestamps: [1. 2. 3.]
'M': [ 0. 0.5236 -0.5236]
...
Returns:
Type | Description |
---|---|
EventSetOrNode
|
EventSetOrNode with inverse sine of input features. |
arctan #
arctan() -> EventSetOrNode
Calculates the inverse tangent of an EventSet
's features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4],
... features={"M": [0, 1.0, -1.0, 5.0]},
... )
>>> a.arctan()
indexes: ...
timestamps: [1. 2. 3. 4.]
'M': [ 0. 0.7854 -0.7854 1.3734]
...
Returns:
Type | Description |
---|---|
EventSetOrNode
|
EventSetOrNode with inverse tangent of input features. |
assign #
assign(**others: EventSetOrNode) -> EventSetOrNode
Assign new features to an EventSet.
If the name provided already exists on the EventSet, the feature is overridden.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2],
... features={'A': [1, 2]},
... )
>>> b = tp.event_set(
... timestamps=[1, 2],
... features={'B': [3, 4]},
... same_sampling_as=a,
... )
>>> ab = a.assign(new_name=b)
>>> ab
indexes: []
features: [('A', int64), ('new_name', int64)]
events:
(2 events):
timestamps: [1. 2.]
'A': [1 2]
'new_name': [3 4]
...
>>> ab = a.assign(B=b, B2=b['B'] * 2)
>>> ab
indexes: []
features: [('A', int64), ('B', int64), ('B2', int64)]
events:
(2 events):
timestamps: [1. 2.]
'A': [1 2]
'B': [3 4]
'B2': [6 8]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**others | EventSetOrNode | The argument name is going to be used as the new feature name. The EventSets need to have a single feature. | {} |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with the added feature. |
before #
before(
timestamp: Union[int, float, datetime]
) -> EventSetOrNode
Filters the events of an EventSet, keeping those that happened before a particular timestamp.
The timestamp can be a datetime if the EventSet's timestamps are unix timestamps.
The comparison is strict, meaning that the obtained timestamps would be less than (<) the provided timestamp.
This operation is equivalent to:
input.filter(input.timestamps() < timestamp)
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1, 5, 6],
... features={"f1": [0, 10, 50, 60]},
... )
>>> a.before(5)
indexes: []
features: [('f1', int64)]
events:
(2 events):
timestamps: [0. 1.]
'f1': [ 0 10]
...
>>> from datetime import datetime
>>> a = tp.event_set(
... timestamps=[datetime(2022, 1, 1), datetime(2022, 1, 2)],
... features={"f1": [1, 2]},
... )
>>> a.before(datetime(2022, 1, 1, 12))
indexes: []
features: [('f1', int64)]
events:
(1 events):
timestamps: ['2022-01-01T00:00:00']
'f1': [1]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
timestamp | Union[int, float, datetime] | Timestamp before which events are kept (exclusive). | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Filtered EventSet. |
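Because both `before` and `after` use strict comparisons, an event located exactly at the given timestamp is returned by neither. The sketch below shows this with plain Python lists standing in for an EventSet, using the data from the usage example.

```python
timestamps = [0, 1, 5, 6]
f1 = [0, 10, 50, 60]
t = 5

before_t = [v for ts, v in zip(timestamps, f1) if ts < t]  # strict "<"
after_t = [v for ts, v in zip(timestamps, f1) if ts > t]   # strict ">"

print(before_t)  # [0, 10]
print(after_t)   # [60]
# The event at exactly t == 5 (value 50) appears in neither result.
```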
begin #
begin() -> EventSetOrNode
Generates a single timestamp at the beginning of the
EventSet
, per index group.
Usage example
>>> a = tp.event_set(
... timestamps=[5, 6, 7, -1],
... features={"f": [50, 60, 70, -10], "idx": [1, 1, 1, 2]},
... indexes=["idx"]
... )
>>> a_ini = a.begin()
>>> a_ini
indexes: [('idx', int64)]
features: []
events:
idx=1 (1 events):
timestamps: [5.]
idx=2 (1 events):
timestamps: [-1.]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | A feature-less EventSet with a single timestamp per index group. |
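The per-index behavior amounts to taking the smallest timestamp in each index group. The following plain-Python sketch is an illustration only, not Temporian's implementation. It computes that value for the data in the usage example.

```python
timestamps = [5, 6, 7, -1]
idx = [1, 1, 1, 2]

# Track the earliest timestamp seen for each index value.
first = {}
for ts, key in zip(timestamps, idx):
    first[key] = min(ts, first.get(key, ts))

print(first)  # {1: 5, 2: -1}
```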
calendar_day_of_month #
calendar_day_of_month(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the day of month the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 1 and 31.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> a = tp.event_set(
... timestamps=["2023-02-04", "2023-02-20", "2023-03-01", "2023-05-07"],
... )
>>> b = a.calendar_day_of_month()
>>> b
indexes: ...
features: [('calendar_day_of_month', int32)]
events:
(4 events):
timestamps: [...]
'calendar_day_of_month': [ 4 20 1 7]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | timezone name (see | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature with the day of the month each timestamp in |
calendar_day_of_week #
calendar_day_of_week(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the day of the week the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers from 0 (Monday) to 6 (Sunday).
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> a = tp.event_set(
... timestamps=["2023-06-19", "2023-06-21", "2023-06-25", "2023-07-03"],
... )
>>> b = a.calendar_day_of_week()
>>> b
indexes: ...
features: [('calendar_day_of_week', int32)]
events:
(4 events):
timestamps: [...]
'calendar_day_of_week': [0 2 6 0]
...
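The numbering convention can be cross-checked with the standard library. This is a minimal sketch, not Temporian's implementation: it assumes the string timestamps are read as midnight UTC, and relies on `datetime.weekday()`, which also counts from 0 (Monday) to 6 (Sunday).

```python
from datetime import datetime, timezone

dates = ["2023-06-19", "2023-06-21", "2023-06-25", "2023-07-03"]

# Parse each date as midnight UTC; weekday() returns 0 for Monday ... 6 for Sunday.
days_of_week = [
    datetime.strptime(d, "%Y-%m-%d").replace(tzinfo=timezone.utc).weekday()
    for d in dates
]
print(days_of_week)
```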
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the day of the week of each input timestamp. |
calendar_day_of_year #
calendar_day_of_year(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the day of year the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 1 and 366.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> a = tp.event_set(
... timestamps=["2020-01-01", "2021-06-01", "2022-12-31", "2024-12-31"],
... )
>>> b = a.calendar_day_of_year()
>>> b
indexes: ...
features: [('calendar_day_of_year', int32)]
events:
(4 events):
timestamps: [...]
'calendar_day_of_year': [ 1 152 365 366]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the day of the year of each input timestamp. |
calendar_hour #
calendar_hour(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the hour the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 0 and 23.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See the timezone example below.
Basic example with UTC datetimes
>>> from datetime import datetime
>>> a = tp.event_set(
... timestamps=[datetime(2020,1,1,18,30), datetime(2020,1,1,23,59)],
... )
>>> b = a.calendar_hour()
>>> b
indexes: ...
features: [('calendar_hour', int32)]
events:
(2 events):
timestamps: [...]
'calendar_hour': [18 23]
...
Example with timezone
>>> # UTC datetimes (unless datetime(tzinfo=...) is used)
>>> a = tp.event_set(timestamps=["2020-01-01 09:00",
... "2020-01-01 15:00"])
>>> # Option 1: specify UTC-3 offset in hours
>>> a.calendar_hour(tz=-3)
indexes: ...
'calendar_hour': [ 6 12]
...
>>> # Option 2: specify timezone name (see pytz.all_timezones)
>>> a.calendar_hour(tz="America/Montevideo")
indexes: ...
'calendar_hour': [ 6 12]
...
>>> # No timezone specified, get UTC hour
>>> a.calendar_hour()
indexes: ...
'calendar_hour': [ 9 15]
...
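The effect of a numeric `tz` offset can be reproduced with the standard library. The sketch below is illustrative only: `hours_at_offset` is a hypothetical helper, not part of Temporian, and it covers fixed offsets only. A named timezone may observe daylight saving time, so it is not always equivalent to a single fixed offset.

```python
from datetime import datetime, timedelta, timezone

utc_times = [
    datetime(2020, 1, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 15, 0, tzinfo=timezone.utc),
]

def hours_at_offset(times, offset_hours):
    """Hour of day of each UTC datetime, as seen from a fixed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return [t.astimezone(tz).hour for t in times]

hours_utc = hours_at_offset(utc_times, 0)       # no offset: UTC hours
hours_minus_3 = hours_at_offset(utc_times, -3)  # UTC-3
print(hours_utc, hours_minus_3)
```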
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the hour of each input timestamp. |
calendar_iso_week #
calendar_iso_week(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the ISO week the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 1 and 53.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> a = tp.event_set(
... # Note: 2023-01-01 is Sunday in the same week as 2022-12-31
... timestamps=["2022-12-31", "2023-01-01", "2023-01-02", "2023-12-20"],
... )
>>> b = a.calendar_iso_week()
>>> b
indexes: ...
features: [('calendar_iso_week', int32)]
events:
(4 events):
timestamps: [...]
'calendar_iso_week': [52 52 1 51]
...
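ISO week numbering can be surprising around the new year, as the comment in the example notes. A quick cross-check with the standard library, assuming the timestamps are plain calendar dates (this is not Temporian's implementation):

```python
from datetime import date

dates = ["2022-12-31", "2023-01-01", "2023-01-02", "2023-12-20"]

# isocalendar() returns (ISO year, ISO week, ISO weekday); keep the week.
iso_weeks = [date.fromisoformat(d).isocalendar()[1] for d in dates]
print(iso_weeks)
```

Sunday 2023-01-01 still belongs to the last ISO week of 2022, and ISO week 1 of 2023 starts on Monday 2023-01-02.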
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the ISO week of each input timestamp. |
calendar_minute #
calendar_minute(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the minute the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 0 and 59.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> from datetime import datetime
>>> a = tp.event_set(
... timestamps=[datetime(2020,1,1,18,30), datetime(2020,1,1,23,59)],
... name='random_hours'
... )
>>> b = a.calendar_minute()
>>> b
indexes: ...
features: [('calendar_minute', int32)]
events:
(2 events):
timestamps: [...]
'calendar_minute': [30 59]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the minute of each input timestamp. |
calendar_month #
calendar_month(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the month the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 1 and 12.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> a = tp.event_set(
... timestamps=["2023-02-04", "2023-02-20", "2023-03-01", "2023-05-07"],
... name='special_events'
... )
>>> b = a.calendar_month()
>>> b
indexes: ...
features: [('calendar_month', int32)]
events:
(4 events):
timestamps: [...]
'calendar_month': [2 2 3 5]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the month of each input timestamp. |
calendar_second #
calendar_second(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the second the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
Output feature contains numbers between 0 and 59.
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> from datetime import datetime
>>> a = tp.event_set(
... timestamps=[datetime(2020,1,1,18,30,55), datetime(2020,1,1,23,59,0)],
... name='random_hours'
... )
>>> b = a.calendar_second()
>>> b
indexes: ...
features: [('calendar_second', int32)]
events:
(2 events):
timestamps: [...]
'calendar_second': [55 0]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the second of each input timestamp. |
calendar_year #
calendar_year(
tz: Union[str, float, int] = 0
) -> EventSetOrNode
Obtains the year the timestamps in an
EventSet
's sampling are in.
Features in the input are ignored, only the timestamps are used and
they must be unix timestamps (is_unix_timestamp=True
).
By default, the timezone is UTC unless the tz
argument is specified,
as an offset in hours or a timezone name. See
EventSet.calendar_hour()
for an
example using timezones.
Usage example
>>> a = tp.event_set(
... timestamps=["2021-02-04", "2022-02-20", "2023-03-01", "2023-05-07"],
... name='random_moments'
... )
>>> b = a.calendar_year()
>>> b
indexes: ...
features: [('calendar_year', int32)]
events:
(4 events):
timestamps: [...]
'calendar_year': [2021 2022 2023 2023]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tz | Union[str, float, int] | Timezone name (see pytz.all_timezones) or UTC offset in hours. | 0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature holding the year of each input timestamp. |
cast #
cast(
target: TargetDtypes, check_overflow: bool = True
) -> EventSetOrNode
Casts the data types of an EventSet
's features.
Features not impacted by cast are kept.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2],
... features={"A": [0, 2], "B": ['a', 'b'], "C": [5.0, 5.5]},
... )
>>> # Cast all input features to the same dtype
>>> b = a[["A", "C"]].cast(tp.float32)
>>> b
indexes: []
features: [('A', float32), ('C', float32)]
events:
(2 events):
timestamps: [1. 2.]
'A': [0. 2.]
'C': [5. 5.5]
...
>>> # Cast by feature name
>>> b = a.cast({'A': bool, 'C': int})
>>> b
indexes: []
features: [('A', bool_), ('B', str_), ('C', int64)]
events:
(2 events):
timestamps: [1. 2.]
'A': [False True]
'B': [b'a' b'b']
'C': [5 5]
...
>>> # Map original_dtype -> target_dtype
>>> b = a.cast({float: int, int: float})
>>> b
indexes: []
features: [('A', float64), ('B', str_), ('C', int64)]
events:
(2 events):
timestamps: [1. 2.]
'A': [0. 2.]
'B': [b'a' b'b']
'C': [5 5]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target |
TargetDtypes
|
Single dtype or a map. Providing a single dtype will cast all
columns to it. The mapping keys can be either feature names or the
original dtypes (and not both types mixed), and the values are the
target dtypes for them. All dtypes must be Temporian types (see
|
required |
check_overflow |
bool
|
Flag to check overflow when casting to a dtype with a
shorter range (e.g: |
True
|
Returns:
Type | Description |
---|---|
EventSetOrNode
|
New EventSet (or the same if no features actually changed dtype),
with the same feature names as the input one, but with the new
dtypes as specified in |
Raises:
Type | Description |
---|---|
ValueError
|
If |
ValueError
|
If trying to cast a non-numeric string to numeric dtype. |
ValueError
|
If |
ValueError
|
If |
check_same_sampling #
check_same_sampling(other: EventSet)
Checks if two EventSets have the same sampling.
cos #
cos() -> EventSetOrNode
Calculates the cosine of an EventSet
's features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4, 5],
... features={"M": [0, np.pi/3, np.pi/2, np.pi, 2*np.pi]},
... )
>>> a.cos()
indexes: ...
timestamps: [1. 2. 3. 4. 5.]
'M': [ 1.0000e+00 5.0000e-01 6.1232e-17 -1.0000e+00 1.0000e+00]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSetOrNode with cosine of input features. |
cumprod #
cumprod(
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the cumulative product of values over each feature in an
EventSet
.
This operation only supports floating-point features.
Missing (NaN) values are not accounted for. The output will be NaN until the input contains at least one numeric value.
Warning: The cumprod
function leverages an infinite window length for
its calculations, which may lead to considerable computational overhead
with increasing dataset sizes.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 3],
... features={"value": [1.0, 2.0, 10.0, 12.0]},
... )
>>> b = a.cumprod()
>>> b
indexes: ...
(4 events):
timestamps: [0. 1. 2. 3.]
'value': [ 1. 2. 20. 240.]
...
Examples with sampling
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [1, 2, 10, 12, np.nan, 2]},
... )
>>> # Cumulative product at 5 and 10
>>> b = tp.event_set(timestamps=[5, 10])
>>> c = a.cumprod(sampling=b)
>>> c
indexes: ...
(2 events):
timestamps: [ 5. 10.]
'value': [240. 480.]
...
>>> # Product all values in the EventSet
>>> c = a.cumprod(sampling=a.end())
>>> c
indexes: ...
(1 events):
timestamps: [7.]
'value': [480.]
...
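The sampling semantics can be sketched in plain Python. This is an illustration under stated assumptions, not Temporian's implementation: `cumprod_at` is a hypothetical helper that multiplies every non-NaN value whose timestamp is at or before the sampling timestamp, and returns NaN when no numeric value has been seen yet. It rescans the input for each sampling point, so it is only meant for small inputs.

```python
import math

def cumprod_at(timestamps, values, sampling):
    """Product of all non-NaN values with timestamp <= t, for each t in sampling."""
    out = []
    for t in sampling:
        window = [
            v for ts, v in zip(timestamps, values)
            if ts <= t and not math.isnan(v)
        ]
        # No numeric value seen yet: the result is NaN.
        out.append(math.prod(window) if window else math.nan)
    return out

ts = [0, 1, 2, 5, 6, 7]
vals = [1.0, 2.0, 10.0, 12.0, math.nan, 2.0]
at_5_and_10 = cumprod_at(ts, vals, [5, 10])
print(at_5_and_10)
```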
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | Cumulative product of each feature. |
cumsum #
cumsum(
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the cumulative sum of values over each feature in an
EventSet
.
For each timestamp, calculates the sum of the feature from the beginning.
Shorthand for moving_sum(event, window_length=np.inf)
.
Missing (NaN) values are not accounted for. The output will be NaN until the input contains at least one numeric value.
If sampling
is specified, the moving window is sampled at each timestamp in it,
else it is sampled on the input's.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )
>>> b = a.cumsum()
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'value': [ 0. 1. 6. 16. 31. 51.]
...
Examples with sampling
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )
>>> # Cumulative sum at 5 and 10
>>> b = tp.event_set(timestamps=[5, 10])
>>> c = a.cumsum(sampling=b)
>>> c
indexes: ...
(2 events):
timestamps: [ 5. 10.]
'value': [16. 51.]
...
>>> # Sum all values in the EventSet
>>> c = a.cumsum(sampling=a.end())
>>> c
indexes: ...
(1 events):
timestamps: [7.]
'value': [51.]
...
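A plain-Python sketch of the same computation, for illustration only. `cumsum_at` is a hypothetical helper, not Temporian's code. It skips NaN values and, following the doctest output above, yields 0 when only NaN values have been seen so far.

```python
import math

def cumsum_at(timestamps, values, sampling):
    """Sum of all non-NaN values with timestamp <= t, for each t in sampling."""
    return [
        sum(
            v for ts, v in zip(timestamps, values)
            if ts <= t and not math.isnan(v)
        )
        for t in sampling
    ]

ts = [0, 1, 2, 5, 6, 7]
vals = [math.nan, 1.0, 5.0, 10.0, 15.0, 20.0]
running = cumsum_at(ts, vals, ts)          # sampled on the input's timestamps
at_5_and_10 = cumsum_at(ts, vals, [5, 10])  # sampled on external timestamps
print(running, at_5_and_10)
```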
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | Cumulative sum of each feature. |
drop #
drop(
feature_names: Union[str, List[str]]
) -> EventSetOrNode
Removes a subset of features from an EventSet
.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2],
... features={"A": [1, 2], "B": ['s', 'm'], "C": [5.0, 5.5]},
... )
>>> # Drop single feature
>>> bc = a.drop('A')
>>> bc
indexes: []
features: [('B', str_), ('C', float64)]
events:
(2 events):
timestamps: [1. 2.]
'B': [b's' b'm']
'C': [5. 5.5]
...
>>> # Drop multiple features
>>> c = a.drop(['A', 'B'])
>>> c
indexes: []
features: [('C', float64)]
events:
(2 events):
timestamps: [1. 2.]
'C': [5. 5.5]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_names | Union[str, List[str]] | Name or list of names of the features to drop from the input. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing all features except the ones dropped. |
drop_index #
drop_index(
indexes: Optional[Union[str, List[str]]] = None,
keep: bool = True,
) -> EventSetOrNode
Removes indexes from an EventSet
.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2, 1, 0, 1, 1],
... features={
... "f1": [1, 1, 1, 2, 2, 2],
... "f2": [1, 1, 2, 1, 1, 2],
... "f3": [1, 1, 1, 1, 1, 1]
... },
... indexes=["f1", "f2"]
... )
>>> # Both f1 and f2 are indices
>>> a
indexes: [('f1', int64), ('f2', int64)]
features: [('f3', int64)]
events:
f1=1 f2=1 (2 events):
timestamps: [1. 2.]
'f3': [1 1]
f1=1 f2=2 (1 events):
timestamps: [1.]
'f3': [1]
f1=2 f2=1 (2 events):
timestamps: [0. 1.]
'f3': [1 1]
f1=2 f2=2 (1 events):
timestamps: [1.]
'f3': [1]
...
>>> # Drop "f2", remove it from features
>>> b = a.drop_index("f2", keep=False)
>>> b
indexes: [('f1', int64)]
features: [('f3', int64)]
events:
f1=1 (3 events):
timestamps: [1. 1. 2.]
'f3': [1 1 1]
f1=2 (3 events):
timestamps: [0. 1. 1.]
'f3': [1 1 1]
...
>>> # Drop both indices, keep them as features
>>> b = a.drop_index(["f2", "f1"])
>>> b
indexes: []
features: [('f3', int64), ('f2', int64), ('f1', int64)]
events:
(6 events):
timestamps: [0. 1. 1. 1. 1. 2.]
'f3': [1 1 1 1 1 1]
'f2': [2 1 1 2 2 1]
'f1': [1 2 1 2 1 1]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indexes |
Optional[Union[str, List[str]]]
|
Index column(s) to be removed from the input. This can be a
single column name ( |
None
|
keep |
bool
|
Flag indicating whether the removed indexes should be kept
as features in the output EventSet. Defaults to |
True
|
Returns:
Type | Description |
---|---|
EventSetOrNode
|
EventSet with the specified indexes removed. If |
EventSetOrNode
|
|
Raises:
Type | Description |
---|---|
ValueError
|
If an empty list is provided as the |
KeyError
|
If any of the specified |
ValueError
|
If a feature name coming from the indexes already exists in
the input, and the |
end #
end() -> EventSetOrNode
Generates a single timestamp at the end of an
EventSet
, per index key.
Usage example
>>> a = tp.event_set(
... timestamps=[5, 6, 7, 1],
... features={"f": [50, 60, 70, 10], "idx": [1, 1, 1, 2]},
... indexes=["idx"]
... )
>>> a_end = a.end()
>>> a_end
indexes: [('idx', int64)]
features: []
events:
idx=1 (1 events):
timestamps: [7.]
idx=2 (1 events):
timestamps: [1.]
...
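Conceptually, `end()` keeps the largest timestamp of each index group, and `begin()` the smallest. A minimal plain-Python sketch of that grouping, using the data from the example (not Temporian's implementation):

```python
from collections import defaultdict

timestamps = [5, 6, 7, 1]
idx = [1, 1, 1, 2]

# Group timestamps by index key.
groups = defaultdict(list)
for t, key in zip(timestamps, idx):
    groups[key].append(t)

# One timestamp per index group: the last one for end(), the first for begin().
end_per_group = {key: max(ts) for key, ts in groups.items()}
begin_per_group = {key: min(ts) for key, ts in groups.items()}
print(end_per_group, begin_per_group)
```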
Returns:
Type | Description |
---|---|
EventSetOrNode | A feature-less EventSet with a single timestamp per index group. |
enumerate #
enumerate() -> EventSetOrNode
Creates an int64
feature with the ordinal position of each event in an
EventSet
.
Each index group is enumerated independently.
Usage
>>> a = tp.event_set(
... timestamps=[-1, 2, 3, 5, 0],
... features={"cat": ["A", "A", "A", "A", "B"]},
... indexes=["cat"],
... )
>>> b = a.enumerate()
>>> b
indexes: [('cat', str_)]
features: [('enumerate', int64)]
events:
cat=b'A' (4 events):
timestamps: [-1. 2. 3. 5.]
'enumerate': [0 1 2 3]
cat=b'B' (1 events):
timestamps: [0.]
'enumerate': [0]
...
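The per-group numbering can be sketched in plain Python. This is only an illustration of the semantics; it assumes events are ordered by timestamp within each index group, as in the output above.

```python
from collections import defaultdict

timestamps = [-1, 2, 3, 5, 0]
cat = ["A", "A", "A", "A", "B"]

# Collect the timestamps of each index group.
groups = defaultdict(list)
for t, key in zip(timestamps, cat):
    groups[key].append(t)

# Each event gets its 0-based position within its own group.
enumerated = {key: list(enumerate(sorted(ts))) for key, ts in groups.items()}
print(enumerated)
```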
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature with each event's ordinal position in its index group. |
equal #
equal(other: Any) -> EventSetOrNode
Checks element-wise equality of an EventSet
to another one or to a single value.
Each feature is compared element-wise to the feature in
other
in the same position.
Note that it will always return False on NaN elements.
Inputs must have the same sampling and the same number of features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f1": [0, 100, 200]}
... )
>>> b = tp.event_set(
... timestamps=[1, 2, 3],
... features={"f2": [-10, 100, 5]},
... same_sampling_as=a
... )
>>> # WARN: Don't use this for element-wise comparison
>>> a == b
False
>>> # Element-wise comparison to a scalar value
>>> c = a.equal(100)
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False True False]
...
>>> # Element-wise comparison between two EventSets
>>> c = a.equal(b)
>>> c
indexes: []
features: [('f1', bool_)]
events:
(3 events):
timestamps: [1. 2. 3.]
'f1': [False True False]
...
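The NaN rule stated above is consistent with ordinary floating-point comparison, where NaN is not equal to anything, including itself. A small plain-Python sketch with hypothetical data (`equal_elementwise` is an illustrative helper, not a Temporian function):

```python
import math

def equal_elementwise(left, right):
    """Element-wise ==; a NaN on either side always yields False."""
    return [l == r for l, r in zip(left, right)]

f1 = [0.0, 100.0, math.nan]
f2 = [-10.0, 100.0, math.nan]
result = equal_elementwise(f1, f2)
print(result)
```

To locate NaN values, use `isnan()` rather than an equality test.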
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Any | Second EventSet or single value to compare. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with boolean features. |
experimental_fast_fourier_transform #
experimental_fast_fourier_transform(
*,
num_events: int,
hop_size: Optional[int] = None,
window: Optional[str] = None,
num_spectral_lines: Optional[int] = None
) -> EventSetOrNode
Computes the Fast Fourier Transform of an
EventSet
with a single tp.float32 feature.
WARNING: This operator is experimental. The implementation is not yet optimized for speed, and the operator signature might change in the future.
The window length is defined in number of events, instead of timestamp duration like most other operators. The 'num_events' argument needs to be specified by kwarg i.e. fast_fourier_transform(num_events=5) instead of fast_fourier_transform(5).
The operator returns the amplitude of each spectral line as
separate tp.float32 features named "a0", "a1", "a2", etc. By default,
num_events // 2
spectral lines are returned.
Usage
>>> a = tp.event_set(
... timestamps=[1,2,3,4,5,6],
... features={"x": [4.,3.,2.,6.,2.,1.]},
... )
>>> b = a.experimental_fast_fourier_transform(num_events=4, window="hamming")
>>> b
indexes: []
features: [('a0', float64), ('a1', float64)]
events:
(2 events):
timestamps: [4. 6.]
'a0': [4.65 6.4 ]
'a1': [2.1994 4.7451]
...
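The amplitudes in this example can be reproduced with a naive discrete Fourier transform in plain Python. The sketch below is not the operator's implementation and is not a fast transform. It assumes a symmetric Hamming window (the same formula as `numpy.hamming`), a default hop of `num_events // 2`, and `num_events >= 2`. `hamming` and `windowed_amplitudes` are hypothetical helper names.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of length n (assumes n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def windowed_amplitudes(values, num_events, hop_size=None, num_lines=None):
    """Amplitude of the first spectral lines of each window of num_events values."""
    hop_size = hop_size or num_events // 2
    num_lines = num_lines or num_events // 2
    weights = hamming(num_events)
    rows = []
    # One output row per window; each window ends at event index `end - 1`.
    for end in range(num_events, len(values) + 1, hop_size):
        frame = [v * w for v, w in zip(values[end - num_events:end], weights)]
        rows.append([
            abs(sum(
                x * cmath.exp(-2j * math.pi * k * n / num_events)
                for n, x in enumerate(frame)
            ))
            for k in range(num_lines)
        ])
    return rows

x = [4.0, 3.0, 2.0, 6.0, 2.0, 1.0]
amplitudes = windowed_amplitudes(x, num_events=4)
print([[round(a, 4) for a in row] for row in amplitudes])
```

With six input events, a window of four events and a hop of two, only two windows fit, ending at the fourth and sixth events. This matches the two output timestamps in the example.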
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_events |
int
|
Size of the FFT expressed as a number of events. |
required |
window |
Optional[str]
|
Optional window function applied before the FFT. If None, no window is applied. Supported values are: "hamming". |
None
|
hop_size |
Optional[int]
|
Step, in number of events, between consecutive outputs. Defaults to num_events//2. |
None
|
num_spectral_lines |
Optional[int]
|
Number of returned spectral lines. If set, the
operator returns the |
None
|
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the amplitude of each frequency band of the Fourier Transform. |
fillna #
fillna(value: float = 0.0) -> EventSetOrNode
Replaces all the NaN values with value
.
Features that cannot contain NaN values (e.g. integer or bytes features) are not impacted.
Usage example
>>> import math
>>> a = tp.event_set(
... timestamps=[0, 1, 3],
... features={
... "f1": [0., 10., math.nan],
... "f2": ["a","b",""]},
... )
>>> a.fillna()
indexes: []
features: [('f1', float64), ('f2', str_)]
events:
(3 events):
timestamps: [0. 1. 3.]
'f1': [ 0. 10. 0.]
'f2': [b'a' b'b' b'']
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
value | float | Value to replace NaNs with. | 0.0 |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet without NaNs. |
filter #
filter(
condition: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Filters out events in an EventSet
for which a
condition is false.
Each timestamp in the input is only kept if the corresponding value for that
timestamp in condition
is True
.
The input and condition
must have the same sampling, and condition
must have a single feature of boolean type.
filter(x) is equivalent to filter(x, x). filter(x) can be used to convert a boolean mask into timestamps.
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1, 5, 6],
... features={"f1": [0, 10, 50, 60], "f2": [50, 100, 500, 600]},
... )
>>> # Example boolean condition
>>> condition = a["f1"] > 20
>>> condition
indexes: ...
timestamps: [0. 1. 5. 6.]
'f1': [False False True True]
...
>>> # Filter only True timestamps
>>> filtered = a.filter(condition)
>>> filtered
indexes: ...
timestamps: [5. 6.]
'f1': [50 60]
'f2': [500 600]
...
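Because the condition shares the input's sampling, filtering amounts to a per-event mask. A plain-Python sketch of the example above, for illustration only:

```python
timestamps = [0, 1, 5, 6]
f1 = [0, 10, 50, 60]
f2 = [50, 100, 500, 600]

# Boolean condition with the same sampling as the input: one value per event.
condition = [v > 20 for v in f1]

# Keep only the events whose condition value is True.
kept = [
    (t, a, b)
    for t, a, b, keep in zip(timestamps, f1, f2, condition)
    if keep
]
print(kept)
```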
Parameters:
Name | Type | Description | Default |
---|---|---|---|
condition | Optional[EventSetOrNode] | EventSet with a single boolean feature. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | Filtered EventSet. |
filter_empty_index #
filter_empty_index() -> EventSetOrNode
Filters out indexes without events.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4],
... features={
... "i1": [1, 1, 2, 2],
... "f1": [10, 11, 12, 13],
... },
... indexes=["i1"]
... )
>>> filtered = a.filter(a["f1"] <= 11).filter_empty_index()
>>> filtered
indexes: [('i1', int64)]
features: [('f1', int64)]
events:
i1=1 (2 events):
timestamps: [1. 2.]
'f1': [10 11]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | Filtered EventSet. |
filter_moving_count #
filter_moving_count(
window_length: Duration,
) -> EventSetOrNode
Filters out events such that no more than one output event is within
a trailing time window of window_length
.
Filtering is applied in chronological order: An event received at time t is filtered out if there is a non-filtered out event in (t-window_length, t].
This operator is different from (evset.moving_count(window_length)
== 0).filter()
. In filter_moving_count
a filtered event does not
block following events.
Usage example
>>> a = tp.event_set(timestamps=[1, 2, 3])
>>> b = a.filter_moving_count(window_length=1.5)
>>> b
indexes: []
features: []
events:
(2 events):
timestamps: [1. 3.]
...
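The chronological rule can be sketched in a few lines of plain Python. This illustrates the description above and is not Temporian's implementation. The function below is a hypothetical stand-in, and the `naive` list shows what happens if every input event, kept or not, is allowed to block later events.

```python
def filter_moving_count(timestamps, window_length):
    """Keep an event at time t only if no kept event lies in (t - window_length, t]."""
    kept = []
    for t in sorted(timestamps):
        # Only the most recent kept event can fall inside the trailing window.
        if not kept or kept[-1] <= t - window_length:
            kept.append(t)
    return kept

ts = [1, 2, 3]
w = 1.5
kept = filter_moving_count(ts, window_length=w)

# Contrast: count every input event in the window, including filtered ones.
naive = [t for t in ts if sum(1 for u in ts if t - w < u <= t) == 1]
print(kept, naive)
```

In the first result, the event at 2 is dropped and therefore does not block the event at 3. In the naive variant it does, so only the first event survives.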
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet without features with the filtered events. |
get_arbitrary_index_data #
get_arbitrary_index_data() -> Optional[IndexData]
Gets data from an arbitrary index key.
If the EventSet is empty, return None.
get_arbitrary_index_key #
get_arbitrary_index_key() -> Optional[IndexKey]
Gets an arbitrary index key.
If the EventSet is empty, return None.
get_index_value #
get_index_value(
index_key: IndexKey, normalize: bool = True
) -> IndexData
Gets the value for a specified index key.
The index key must be a tuple of values corresponding to the indexes of the EventSet.
isnan #
isnan() -> EventSetOrNode
Returns boolean features, True
in the NaN elements of the
EventSet
.
Note that for int
and bool
this will always be False
since those types
don't support NaNs. It only makes actual sense to use on float
(or
tp.float32
) features.
See also evset.notnan()
.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"M":[np.nan, 5., np.nan], "N": [-1, 0, 5]},
... )
>>> b = a.isnan()
>>> b
indexes: ...
'M': [ True False True]
'N': [False False False]
...
>>> # Count nans
>>> b["M"].cast(int).cumsum()
indexes: ...
timestamps: [1. 2. 3.]
'M': [1 1 2]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with boolean features. |
join #
join(
other: EventSetOrNode,
how: str = "left",
on: Optional[str] = None,
) -> EventSetOrNode
Joins EventSets
with different samplings.
Join features from two EventSets based on timestamps. Optionally, join on
timestamps and an extra int64
feature. Joined EventSets should have the
same index and non-overlapping feature names.
To concatenate EventSets with the same sampling, use
tp.glue()
instead. tp.glue()
is
almost free while EventSet.join()
can be expensive.
To resample an EventSet according to another EventSet's sampling, use
EventSet.resample()
instead.
Example:
```python
>>> a = tp.event_set(timestamps=[0, 1, 2], features={"A": [0, 10, 20]})
>>> b = tp.event_set(timestamps=[0, 2, 4], features={"B": [0., 2., 4.]})
>>> # Left join
>>> c = a.join(b)
>>> c
indexes: []
features: [('A', int64), ('B', float64)]
events:
(3 events):
timestamps: [0. 1. 2.]
'A': [ 0 10 20]
'B': [ 0. nan 2.]
...
```
Example with an index and feature join:
```python
>>> a = tp.event_set(
... timestamps=[0, 1, 1, 1],
... features={
... "idx": [1, 1, 2, 2],
... "match": [1, 2, 4, 5],
... "A": [10, 20, 40, 50],
... },
... indexes=["idx"]
... )
>>> b = tp.event_set(
... timestamps=[0, 1, 0, 1, 1, 1],
... features={
... "idx": [1, 1, 2, 2, 2, 2],
... "match": [1, 2, 3, 4, 5, 6],
... "B": [10., 20., 30., 40., 50., 60.],
... },
... indexes=["idx"]
... )
>>> # Join by index and 'match'
>>> c = a.join(b, on="match")
>>> c
indexes: [('idx', int64)]
features: [('match', int64), ('A', int64), ('B', float64)]
events:
idx=1 (2 events):
timestamps: [0. 1.]
'match': [1 2]
'A': [10 20]
'B': [10. 20.]
idx=2 (2 events):
timestamps: [1. 1.]
'match': [4 5]
'A': [40 50]
'B': [40. 50.]
...
```
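The left join of the first example can be sketched in plain Python as a timestamp lookup. This is a simplified illustration, not Temporian's implementation. `left_join` is a hypothetical helper, and it assumes the right-hand timestamps are unique and that no index or `on` feature is involved.

```python
import math

def left_join(left_ts, right_ts, right_values):
    """For each left timestamp, take the right value at that timestamp, else NaN."""
    lookup = dict(zip(right_ts, right_values))
    return [lookup.get(t, math.nan) for t in left_ts]

a_ts = [0, 1, 2]
b_ts, b_vals = [0, 2, 4], [0.0, 2.0, 4.0]
joined_b = left_join(a_ts, b_ts, b_vals)
print(joined_b)
```

The output keeps the left-hand timestamps only, so the right-hand event at timestamp 4 does not appear in the result.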
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other |
EventSetOrNode
|
Right EventSet to join. |
required |
how |
str
|
Whether to perform a |
'left'
|
on |
Optional[str]
|
Optional extra int64 feature name to join on. |
None
|
Returns:
Type | Description |
---|---|
EventSetOrNode | The joined EventSets. |
lag #
lag(duration: Duration) -> EventSetOrNode
Adds a delay to an EventSet
's timestamps.
In other words, shifts the timestamp values forwards in time.
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1, 5, 6],
... features={"value": [0, 1, 5, 6]},
... )
>>> b = a.lag(tp.duration.seconds(2))
>>> b
indexes: ...
(4 events):
timestamps: [2. 3. 7. 8.]
'value': [0 1 5 6]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
duration | Duration | Duration to lag by. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Lagged EventSet. |
leak #
leak(duration: Duration) -> EventSetOrNode
Subtracts a duration from an EventSet
's
timestamps.
In other words, shifts the timestamp values backward in time.
Note that this operator moves future data into the past, and should be used with caution to prevent unwanted future leakage. For instance, this op should generally not be used to compute the input features of a model.
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1, 5, 6],
... features={"value": [0, 1, 5, 6]},
... )
>>> b = a.leak(tp.duration.seconds(2))
>>> b
indexes: ...
(4 events):
timestamps: [-2. -1. 3. 4.]
'value': [0 1 5 6]
...
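`leak` and `lag` shift timestamps in opposite directions and leave feature values untouched. A plain-Python sketch on bare timestamps, with durations in seconds (the two functions are illustrative stand-ins, not the Temporian operators):

```python
def lag(timestamps, duration):
    """Shift timestamps forward in time by `duration`."""
    return [t + duration for t in timestamps]

def leak(timestamps, duration):
    """Shift timestamps backward in time by `duration`."""
    return [t - duration for t in timestamps]

ts = [0, 1, 5, 6]
leaked = leak(ts, 2)
round_trip = lag(leaked, 2)  # lagging by the same duration undoes the leak
print(leaked, round_trip)
```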
Parameters:
Name | Type | Description | Default |
---|---|---|---|
duration | Duration | Duration to leak by. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Leaked EventSet. |
log #
log() -> EventSetOrNode
Calculates the natural logarithm of an EventSet
's
features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4, 5],
... features={"M": [np.e, 1., 2., 10., -1.]},
... )
>>> a.log()
indexes: ...
timestamps: [1. 2. 3. 4. 5.]
'M': [1. 0. 0.6931 2.3026 nan]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSetOrNode with the logarithm of input features. |
map #
map(
func: MapFunction,
output_dtypes: Optional[TargetDtypes] = None,
receive_extras: bool = False,
) -> EventSetOrNode
Applies a function on each value of an
EventSet
's features.
The function receives the scalar value, and if receive_extras
is True,
also a MapExtras
object containing
information about the value's position in the EventSet. The MapExtras
object should not be modified by the function, since it is shared across
all calls.
If the output of the function has a different dtype than the input, the
output_dtypes
argument must be specified.
This operator is slow. When possible, existing operators should be used.
A Temporian graph with a map
operator is not serializable.
Usage example with lambda function
>>> a = tp.event_set(
... timestamps=[0, 1, 2],
... features={"value": [10, 20, 30]},
... )
>>> b = a.map(lambda v: v + 1)
>>> b
indexes: ...
(3 events):
timestamps: [0. 1. 2.]
'value': [11 21 31]
...
Usage example with output_dtypes
:
>>> a = tp.event_set(
... timestamps=[0, 1, 2],
... features={"a": [10, 20, 30], "b": ["100", "200", "300"]},
... )
>>> def f(value):
... if value.dtype == np.int64:
... return float(value) + 1
... else:
... return int(value) + 2
>>> b = a.map(f, output_dtypes={"a": float, "b": int})
>>> b
indexes: ...
(3 events):
timestamps: [0. 1. 2.]
'a': [11. 21. 31.]
'b': [102 202 302]
...
Usage example with MapExtras
:
>>> a = tp.event_set(
... timestamps=[0, 1, 2],
... features={"value": [10, 20, 30]},
... )
>>> def f(value, extras):
... return f"{extras.feature_name}-{extras.timestamp}-{value}"
>>> b = a.map(f, output_dtypes=str, receive_extras=True)
>>> b
indexes: ...
(3 events):
timestamps: [0. 1. 2.]
'value': [b'value-0.0-10' b'value-1.0-20' b'value-2.0-30']
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func |
MapFunction
|
The function to apply on each value. |
required |
output_dtypes |
Optional[TargetDtypes]
|
Expected dtypes of the output feature(s) after
applying the function to them. If not provided, the output
dtypes will be expected to be the same as the input ones. If a
single dtype, all features will be expected to have that dtype.
If a mapping, the keys can be either feature names or the
input dtypes (and not both types mixed), and the values are the
target dtypes for them. All dtypes must be Temporian types (see
|
None
|
receive_extras |
bool
|
Whether the function should receive a
|
False
|
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with the function applied on each value. |
memory_usage #
memory_usage() -> int
Gets the approximate memory usage of the EventSet in bytes.
Takes into account garbage collector overhead.
moving_count #
moving_count(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Gets the number of events in a sliding window.
Create a tp.int32 feature containing the number of events in the time window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e. if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
Example without sampling
>>> a = tp.event_set(timestamps=[0, 1, 2, 5, 6, 7])
>>> b = a.moving_count(tp.duration.seconds(2))
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'count': [1 2 2 1 2 2]
...
Example with sampling
>>> a = tp.event_set(timestamps=[0, 1, 2, 5])
>>> b = tp.event_set(timestamps=[-1, 0, 1, 2, 3, 4, 5, 6, 7])
>>> c = a.moving_count(tp.duration.seconds(2), sampling=b)
>>> c
indexes: ...
(9 events):
timestamps: [-1. 0. 1. 2. 3. 4. 5. 6. 7.]
'count': [0 1 2 2 1 0 1 1 0]
...
Example with variable window length
>>> a = tp.event_set(timestamps=[0, 1, 2, 5])
>>> b = tp.event_set(
... timestamps=[0, 3, 3, 3, 9],
... features={
... "w": [1, 0.5, 3.5, 2.5, 5],
... },
... )
>>> c = a.moving_count(window_length=b)
>>> c
indexes: []
features: [('count', int32)]
events:
(5 events):
timestamps: [0. 3. 3. 3. 9.]
'count': [1 0 3 2 1]
...
Example with index
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 0, 1, 2],
... features={
... "idx": ["i1", "i1", "i1", "i2", "i2", "i2"],
... },
... indexes=["idx"],
... )
>>> b = a.moving_count(tp.duration.seconds(2))
>>> b
indexes: [('idx', str_)]
features: [('count', int32)]
events:
idx=b'i1' (3 events):
timestamps: [1. 2. 3.]
'count': [1 2 2]
idx=b'i2' (3 events):
timestamps: [0. 1. 2.]
'count': [1 2 2]
...
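The window semantics used in the examples above can be reproduced in a few lines of plain Python. This is only a sketch, not Temporian's implementation: `moving_count_sketch` is a hypothetical helper that assumes a fixed numeric window length and a single index value. It illustrates that the window (t - window_length, t] is open on the left and closed on the right.

```python
from bisect import bisect_right


def moving_count_sketch(timestamps, window_length, sampling=None):
    """Count events in (t - window_length, t] for each sampling timestamp t."""
    events = sorted(timestamps)
    if sampling is None:
        # Default: sample the window on the input's own timestamps.
        sampling = events
    counts = []
    for t in sampling:
        # Events at exactly t - window_length are excluded (open left bound).
        lo = bisect_right(events, t - window_length)
        # Events at exactly t are included (closed right bound).
        hi = bisect_right(events, t)
        counts.append(hi - lo)
    return counts


without_sampling = moving_count_sketch([0, 1, 2, 5, 6, 7], 2)
with_sampling = moving_count_sketch(
    [0, 1, 2, 5], 2, sampling=[-1, 0, 1, 2, 3, 4, 5, 6, 7]
)
```

With the inputs of the first two examples, the sketch yields the same counts as the documented outputs.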
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode
|
EventSet containing the count of events in |
moving_max #
moving_max(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the maximum in a sliding window over an
EventSet
.
For each t in sampling, and for each index and feature independently, returns at time t the max of non-nan values for the feature in the window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e. if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
If the window does not contain any values (e.g., all the values are missing, or the window does not contain any events), outputs missing values.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 1, 15, 20]},
... )
>>> b = a.moving_max(tp.duration.seconds(4))
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'value': [nan 1. 5. 5. 15. 20.]
...
See EventSet.moving_count()
for
examples with external sampling and indices.
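As a rough illustration of how NaN values and empty windows are treated, the following plain-Python sketch recomputes the example above. `moving_max_sketch` is a hypothetical helper, not part of Temporian, and it assumes a fixed window length, a single feature and no index.

```python
import math


def moving_max_sketch(timestamps, values, window_length):
    """Max of the non-NaN values in (t - window_length, t] for each timestamp t."""
    out = []
    for t in timestamps:
        in_window = [
            v
            for ts, v in zip(timestamps, values)
            if t - window_length < ts <= t and not math.isnan(v)
        ]
        # A window without any usable value produces a missing value (NaN).
        out.append(max(in_window) if in_window else math.nan)
    return out


result = moving_max_sketch(
    [0, 1, 2, 5, 6, 7], [math.nan, 1, 5, 1, 15, 20], window_length=4
)
```

Replacing `max` with `min` gives the corresponding sketch for a moving minimum.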
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the max of each feature in the input. |
moving_min #
moving_min(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the minimum of values in a sliding window over an
EventSet
.
For each t in sampling, and for each index and feature independently, returns at time t the minimum of non-nan values for the feature in the window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e. if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
If the window does not contain any values (e.g., all the values are missing, or the window does not contain any events), outputs missing values.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )
>>> b = a.moving_min(tp.duration.seconds(4))
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'value': [nan 1. 1. 5. 10. 10.]
...
See EventSet.moving_count()
for
examples of moving window operations with external sampling and indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the minimum of each feature in the input. |
moving_product #
moving_product(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the product of values in a sliding window over an
EventSet
.
This operation only supports floating-point features.
For each t in sampling, and for each feature independently, returns at time t the product of non-zero and non-NaN values for the feature in the window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e., if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
Zeros result in the accumulator's result being 0 for the window. NaN values are ignored in the calculation of the product. If the window does not contain any NaN, zero or any non-zero values (e.g., all values are missing), the output for that window is an empty array.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2],
... features={"value": [np.nan, 1, 5]},
... )
>>> b = a.moving_product(tp.duration.seconds(1))
>>> b
indexes: ...
(3 events):
timestamps: [0. 1. 2.]
'value': [nan 1. 5.]
...
See EventSet.moving_count()
for
examples of moving window operations with external sampling and indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the moving product of each feature in the input, considering non-zero and non-NaN values only. |
moving_standard_deviation #
moving_standard_deviation(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the standard deviation of values in a sliding window over an
EventSet
.
For each t in sampling, and for each feature independently, returns at time t the standard deviation for the feature in the window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e. if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
Missing values (such as NaNs) are ignored.
If the window does not contain any values (e.g., all the values are missing, or the window does not contain any events), outputs missing values.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )
>>> b = a.moving_standard_deviation(tp.duration.seconds(4))
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'value': [ nan 0. 2. 2.5 2.5 4.0825]
...
See EventSet.moving_count()
for
examples of moving window operations with external sampling and indices.
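The example output above is consistent with a population standard deviation, which divides by the number of values in the window and not by that number minus one. The following plain-Python sketch reproduces those numbers with `statistics.pstdev`. It is an illustration under that assumption, not Temporian's implementation, and `moving_std_sketch` is a hypothetical name.

```python
import math
from statistics import pstdev


def moving_std_sketch(timestamps, values, window_length):
    """Population std of the non-NaN values in (t - window_length, t]."""
    out = []
    for t in timestamps:
        in_window = [
            v
            for ts, v in zip(timestamps, values)
            if t - window_length < ts <= t and not math.isnan(v)
        ]
        # NaN values are skipped; a window with no usable value yields NaN.
        out.append(pstdev(in_window) if in_window else math.nan)
    return out


stds = moving_std_sketch(
    [0, 1, 2, 5, 6, 7], [math.nan, 1, 5, 10, 15, 20], window_length=4
)
```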
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the moving standard deviation of each feature in the input. |
moving_sum #
moving_sum(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the sum of values in a sliding window over an
EventSet
.
For each t in sampling, and for each feature independently, returns at time t the sum of values for the feature in the window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e. if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
Missing values (such as NaNs) are ignored.
If the window does not contain any values (e.g., all the values are missing, or the window does not contain any events), outputs missing values.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )
>>> b = a.moving_sum(tp.duration.seconds(4))
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'value': [ 0. 1. 6. 15. 25. 45.]
...
See EventSet.moving_count()
for
examples of moving window operations with external sampling and indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the moving sum of each feature in the input. |
node #
node(force_new_node: bool = False) -> EventSetNode
Creates an EventSetNode
able to consume
this EventSet.
If called multiple times with force_new_node=False
(default), the same
node is returned.
Usage example
>>> my_evset = tp.event_set(
... timestamps=[1, 2, 3, 4],
... features={
... "feature_1": [0.5, 0.6, np.nan, 0.9],
... "feature_2": ["red", "blue", "red", "blue"],
... },
... )
>>> my_node = my_evset.node()
Parameters:
Name | Type | Description | Default |
---|---|---|---|
force_new_node |
bool
|
If false (default), return the same node each time
|
False
|
Returns:
Type | Description |
---|---|
EventSetNode | An EventSetNode able to consume this EventSet. |
notnan #
notnan() -> EventSetOrNode
Returns boolean features, False
in the NaN elements of an
EventSet
.
Equivalent to ~evset.isnan(...)
.
Note that for int
and bool
this will always be True
since those types
don't support NaNs. It only makes actual sense to use on float
(or
tp.float32
) features.
See also evset.isnan()
.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3],
... features={"M":[np.nan, 5., np.nan], "N": [-1, 0, 5]},
... )
>>> b = a.notnan()
>>> b
indexes: ...
'M': [False True False]
'N': [ True True True]
...
>>> # Filter only rows where "M" is not nan
>>> a.filter(b["M"])
indexes: ...
'M': [5.]
'N': [0]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with boolean features. |
plot #
plot(*args, **wargs) -> Any
Plots the EventSet. See tp.plot()
for details.
Example usage:
```python
>>> evset = tp.event_set(timestamps=[1, 2, 3], features={"f1": [0, 42, 10]})
>>> evset.plot()
```
prefix #
prefix(prefix: str) -> EventSetOrNode
Adds a prefix to the names of the features in an
EventSet
.
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1],
... features={"f1": [0, 2], "f2": [5, 6]}
... )
>>> b = a * 5
>>> # Prefix before glue to avoid duplicated names
>>> c = tp.glue(a.prefix("original_"), b.prefix("result_"))
>>> c
indexes: ...
'original_f1': [0 2]
'original_f2': [5 6]
'result_f1': [ 0 10]
'result_f2': [25 30]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prefix | str | Prefix to add in front of the feature names. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Prefixed EventSet. |
propagate #
propagate(
sampling: EventSetOrNode, resample: bool = False
) -> EventSetOrNode
Propagates feature values over another EventSet
's
index.
Given the input and sampling
where the input's indexes are a subset of
sampling
's (e.g., the indexes of the input are ["x"]
, and the indexes of
sampling
are ["x","y"]
), duplicates the features of the input over the
indexes of sampling
.
Index values in self but not in sampling are removed. For each index value
in sampling but not in self, an index value without timestamps is created.
Example use case
>>> products = tp.event_set(
... timestamps=[1, 2, 3, 1, 2, 3],
... features={
... "product": [1, 1, 1, 2, 2, 2],
... "sales": [100., 200., 500., 1000., 2000., 5000.]
... },
... indexes=["product"],
... )
>>> store = tp.event_set(
... timestamps=[1, 2, 3, 4, 5],
... features={
... "sales": [10000., 20000., 30000., 5000., 1000.]
... },
... )
>>> # First attempt: divide to calculate fraction of total store sales
>>> products / store
Traceback (most recent call last):
...
ValueError: Arguments don't have the same index. ...
>>> # Second attempt: propagate index
>>> store_prop = store.propagate(products)
>>> products / store_prop
Traceback (most recent call last):
...
ValueError: Arguments should have the same sampling. ...
>>> # Third attempt: propagate + resample
>>> store_resample = store.propagate(products, resample=True)
>>> div = products / store_resample
>>> div
indexes: [('product', int64)]
features: [('sales', float64)]
events:
product=1 (3 events):
timestamps: [1. 2. 3.]
'sales': [0.01 0.01 0.0167]
product=2 (3 events):
timestamps: [1. 2. 3.]
'sales': [0.1 0.1 0.1667]
...
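The index bookkeeping described above can be pictured with plain dictionaries. The sketch below is only an illustration and not how Temporian stores data: `propagate_sketch` and its argument layout (a dict from index-value tuples to lists of events) are assumptions made for this example, and the optional resampling step is left out.

```python
def propagate_sketch(input_data, input_index, sampling_keys, sampling_index):
    """Duplicate the events of each input index value over matching sampling keys.

    input_data maps a tuple of input index values to a list of events.
    sampling_keys is a list of tuples of sampling index values.
    """
    # Position of each input index name inside the sampling index names.
    positions = [sampling_index.index(name) for name in input_index]
    output = {}
    for key in sampling_keys:
        sub_key = tuple(key[p] for p in positions)
        # Sampling index values that are missing from the input get no events.
        output[key] = list(input_data.get(sub_key, []))
    return output


# An un-indexed store propagated over a "product" index with values 1 and 2.
store = {(): [(1, 10000.0), (2, 20000.0)]}
propagated = propagate_sketch(store, [], [(1,), (2,)], ["product"])
```

Because only the keys of `sampling_keys` are iterated, index values present in the input but absent from the sampling are dropped.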
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sampling |
EventSetOrNode
|
EventSet with the indexes to propagate to. |
required |
resample |
bool
|
If true, apply a |
False
|
Returns:
Type | Description |
---|---|
EventSetOrNode
|
EventSet propagated over |
rename #
rename(
features: Optional[
Union[str, Dict[str, str], List[str]]
] = None,
indexes: Optional[
Union[str, Dict[str, str], List[str]]
] = None,
) -> EventSetOrNode
Renames an EventSet
's features and index.
If the input has a single feature, then the features
can be a
single string with the new name.
If the input has multiple features, then features
can either be (1) a
dictionary mapping old names to the new names, or (2) a list of new
names of the same size as evset.schema.feature_names()
.
The indexes renaming follows the same criteria, accepting a single string, a mapping, or a list.
Usage example
>>> a = tp.event_set(
... timestamps=[0, 1],
... features={"f1": [0, 2], "f2": [5, 6]}
... )
>>> b = 5 * a
>>> # Rename single feature
>>> b_1 = b["f1"].rename("f1_result")
>>> b_1
indexes: []
features: [('f1_result', int64)]
events:
(2 events):
timestamps: [0. 1.]
'f1_result': [ 0 10]
...
>>> # Rename multiple features with a dictionary
>>> b_rename = b.rename({"f1": "5xf1", "f2": "5xf2"})
>>> b_rename
indexes: []
features: [('5xf1', int64), ('5xf2', int64)]
events:
(2 events):
timestamps: [0. 1.]
'5xf1': [ 0 10]
'5xf2': [25 30]
...
>>> # Rename multiple features with a list
>>> b_rename = b.rename(["5xf1", "5xf2"])
>>> b_rename
indexes: []
features: [('5xf1', int64), ('5xf2', int64)]
events:
(2 events):
timestamps: [0. 1.]
'5xf1': [ 0 10]
'5xf2': [25 30]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
features | Optional[Union[str, Dict[str, str], List[str]]] | New feature name or mapping from old names to new names. | None |
indexes | Optional[Union[str, Dict[str, str], List[str]]] | New index name or mapping from old names to new names. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with renamed features and index. |
resample #
resample(sampling: EventSetOrNode) -> EventSetOrNode
Resamples an EventSet
at each timestamp of
another EventSet
.
If a timestamp in sampling
does not have a corresponding timestamp in
the input, the value at the latest preceding timestamp in the input is used
instead. If no timestamp in the input precedes it, the value is replaced by
dtype.MissingValue(...)
.
Example
>>> a = tp.event_set(
... timestamps=[1, 5, 8, 9],
... features={"f1": [1.0, 2.0, 3.0, 4.0]}
... )
>>> b = tp.event_set(timestamps=[-1, 1, 6, 10])
>>> c = a.resample(b)
>>> c
indexes: ...
timestamps: [-1. 1. 6. 10.]
'f1': [nan 1. 2. 4.]
...
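The lookup rule can be sketched in plain Python with a binary search. `resample_sketch` is a hypothetical helper, not Temporian's implementation. It assumes sorted input timestamps and a single float feature, for which the missing value is NaN, as in the output above.

```python
import math
from bisect import bisect_right


def resample_sketch(timestamps, values, sampling):
    """Value of the latest input event at or before each sampling timestamp."""
    out = []
    for t in sampling:
        # Number of input events whose timestamp is at or before t.
        n = bisect_right(timestamps, t)
        # No earlier or equal event: emit a missing value (NaN for floats).
        out.append(values[n - 1] if n > 0 else math.nan)
    return out


resampled = resample_sketch([1, 5, 8, 9], [1.0, 2.0, 3.0, 4.0], [-1, 1, 6, 10])
```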
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sampling | EventSetOrNode | EventSet to use the sampling of. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode
|
Resampled EventSet, with same sampling as |
select #
select(
feature_names: Union[str, List[str]]
) -> EventSetOrNode
Selects a subset of features from an EventSet
.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2],
... features={"A": [1, 2], "B": ['s', 'm'], "C": [5.0, 5.5]},
... )
>>> # Select single feature
>>> b = a.select('B')
>>> # Equivalent
>>> b = a['B']
>>> b
indexes: []
features: [('B', str_)]
events:
(2 events):
timestamps: [1. 2.]
'B': [b's' b'm']
...
>>> # Select multiple features
>>> bc = a.select(['B', 'C'])
>>> # Equivalent
>>> bc = a[['B', 'C']]
>>> bc
indexes: []
features: [('B', str_), ('C', float64)]
events:
(2 events):
timestamps: [1. 2.]
'B': [b's' b'm']
'C': [5. 5.5]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_names | Union[str, List[str]] | Name or list of names of the features to select from the input. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing only the selected features. |
select_index_values #
select_index_values(
keys: Optional[IndexKeyList] = None,
*,
number: Optional[int] = None,
fraction: Optional[float] = None
) -> EventSetOrNode
Selects a subset of index values from an
EventSet
.
Exactly one of keys
, number
, or fraction
should be provided.
If number
or fraction
is specified, the index values are selected
randomly.
If fraction
is specified and fraction * len(index keys)
doesn't
result in an integer, the number of index values selected is rounded
down.
If used in compiled or graph mode, the specified keys are compiled as-is along with the operator, which means that they must be available when loading and running the graph on new data.
Example with keys
with a single index and a single key:
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 3],
... features={
... "f": [10, 20, 30, 40],
... "x": ["A", "B", "A", "B"],
... },
... indexes=["x"],
... )
>>> b = a.select_index_values("A")
>>> b
indexes: [('x', str_)]
features: [('f', int64)]
events:
x=b'A' (2 events):
timestamps: [0. 2.]
'f': [10 30]
...
Example with keys
with multiple indexes and keys:
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 3],
... features={
... "f": [10, 20, 30, 40],
... "x": [1, 1, 2, 2],
... "y": ["A", "B", "A", "B"],
... },
... indexes=["x", "y"],
... )
>>> b = a.select_index_values([(1, "A"), (2, "B")])
>>> b
indexes: [('x', int64), ('y', str_)]
features: [('f', int64)]
events:
x=1 y=b'A' (1 events):
timestamps: [0.]
'f': [10]
x=2 y=b'B' (1 events):
timestamps: [3.]
'f': [40]
...
Example with number
:
>>> import random
>>> random.seed(0)
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 3],
... features={
... "f": [10, 20, 30, 40],
... "x": [1, 1, 2, 2],
... "y": ["A", "B", "A", "B"],
... },
... indexes=["x", "y"],
... )
>>> b = a.select_index_values(number=2)
>>> b
indexes: [('x', int64), ('y', str_)]
features: [('f', int64)]
events:
x=1 y=b'A' (1 events):
timestamps: [0.]
'f': [10]
x=2 y=b'A' (1 events):
timestamps: [2.]
'f': [30]
...
Example with fraction
:
>>> import random
>>> random.seed(0)
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 3],
... features={
... "f": [10, 20, 30, 40],
... "x": [1, 1, 2, 2],
... "y": ["A", "B", "A", "B"],
... },
... indexes=["x", "y"],
... )
>>> b = a.select_index_values(fraction=0.5)
>>> b
indexes: [('x', int64), ('y', str_)]
features: [('f', int64)]
events:
x=1 y=b'A' (1 events):
timestamps: [0.]
'f': [10]
x=2 y=b'A' (1 events):
timestamps: [2.]
'f': [30]
...
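The rounding rule for fraction can be checked with a small stand-alone sketch. `select_keys_sketch` is a hypothetical helper written for this illustration. Its use of the standard `random` module and of a `seed` argument are assumptions, not a description of Temporian's internals.

```python
import math
import random


def select_keys_sketch(keys, number=None, fraction=None, seed=None):
    """Randomly pick index keys, rounding fraction * len(keys) down."""
    if (number is None) == (fraction is None):
        raise ValueError("Provide exactly one of number or fraction.")
    if fraction is not None:
        # A non-integer product is rounded down, as described above.
        number = math.floor(fraction * len(keys))
    rng = random.Random(seed)
    return rng.sample(list(keys), number)


keys = [(1, "A"), (1, "B"), (2, "A"), (2, "B")]
picked = select_keys_sketch(keys, fraction=0.6, seed=0)
```

With four index keys, `fraction=0.6` gives 2.4, so two keys are selected.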
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keys |
Optional[IndexKeyList]
|
index key or list of index keys to select from the EventSet. |
None
|
number |
Optional[int]
|
number of index values to select. If |
None
|
fraction |
Optional[float]
|
fraction of index values to select, expressed as a float between 0 and 1. |
None
|
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a subset of the index values. |
set_index #
set_index(indexes: Union[str, List[str]]) -> EventSetOrNode
Replaces the index in an EventSet
.
Usage example
>>> a = tp.event_set(
... timestamps=[1, 2, 1, 0, 1, 1],
... features={
... "f1": [1, 1, 1, 2, 2, 2],
... "f2": [1, 1, 2, 1, 1, 2],
... "f3": [1, 1, 1, 1, 1, 1]
... },
... indexes=["f1"],
... )
>>> # "f1" is the current index
>>> a
indexes: [('f1', int64)]
features: [('f2', int64), ('f3', int64)]
events:
f1=1 (3 events):
timestamps: [1. 1. 2.]
'f2': [1 2 1]
'f3': [1 1 1]
f1=2 (3 events):
timestamps: [0. 1. 1.]
'f2': [1 1 2]
'f3': [1 1 1]
...
>>> # Set "f2" as the only index, remove "f1"
>>> b = a.set_index("f2")
>>> b
indexes: [('f2', int64)]
features: [('f3', int64), ('f1', int64)]
events:
f2=1 (4 events):
timestamps: [0. 1. 1. 2.]
'f3': [1 1 1 1]
'f1': [2 1 2 1]
f2=2 (2 events):
timestamps: [1. 1.]
'f3': [1 1]
'f1': [1 2]
...
>>> # Set both "f1" and "f2" as indices
>>> b = a.set_index(["f1", "f2"])
>>> b
indexes: [('f1', int64), ('f2', int64)]
features: [('f3', int64)]
events:
f1=1 f2=1 (2 events):
timestamps: [1. 2.]
'f3': [1 1]
f1=1 f2=2 (1 events):
timestamps: [1.]
'f3': [1]
f1=2 f2=1 (2 events):
timestamps: [0. 1.]
'f3': [1 1]
f1=2 f2=2 (1 events):
timestamps: [1.]
'f3': [1]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indexes | Union[str, List[str]] | List of index / feature names (strings) used as the new indexes. These names should be either indexes or features in the input. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with the updated indexes. |
Raises:
Type | Description |
---|---|
KeyError
|
If any of the specified |
set_index_value #
set_index_value(
index_key: IndexKey,
value: IndexData,
normalize: bool = True,
) -> None
Sets the value for a specified index key.
The index key must be a tuple of values corresponding to the indexes of the EventSet.
simple_moving_average #
simple_moving_average(
window_length: WindowLength,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the average of values in a sliding window over an
EventSet
.
For each t in sampling, and for each feature independently, returns at time t the average value of the feature in the window (t - window_length, t].
sampling
can't be specified if a variable window_length
is
specified (i.e. if window_length
is an EventSet).
If sampling
is specified or window_length
is an EventSet, the moving
window is sampled at each timestamp in them, else it is sampled on the
input's.
Missing values (such as NaNs) are ignored.
If the window does not contain any values (e.g., all the values are missing, or the window does not contain any timestamp), outputs missing values.
Example
>>> a = tp.event_set(
... timestamps=[0, 1, 2, 5, 6, 7],
... features={"value": [np.nan, 1, 5, 10, 15, 20]},
... )
>>> b = a.simple_moving_average(tp.duration.seconds(4))
>>> b
indexes: ...
(6 events):
timestamps: [0. 1. 2. 5. 6. 7.]
'value': [ nan 1. 3. 7.5 12.5 15. ]
...
See EventSet.moving_count()
for
examples of moving window operations with external sampling and indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
window_length | WindowLength | Sliding window's length. | required |
sampling | Optional[EventSetOrNode] | Timestamps to sample the sliding window's value at. If not provided, timestamps in the input are used. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet containing the moving average of each feature in the input. |
sin #
sin() -> EventSetOrNode
Calculates the sine of an EventSet
's features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4, 5],
... features={"M": [0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi]},
... )
>>> a.sin()
indexes: ...
timestamps: [1. 2. 3. 4. 5.]
'M': [ 0.0000e+00 1.0000e+00 1.2246e-16 -1.0000e+00 -2.4493e-16]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSetOrNode with sine of input features. |
since_last #
since_last(
steps: int = 1,
sampling: Optional[EventSetOrNode] = None,
) -> EventSetOrNode
Computes the amount of time since the last previous timestamp in an
EventSet
.
If a number of steps
is provided, compute elapsed time after moving
back that number of previous events.
Basic example with 1 and 2 steps
>>> a = tp.event_set(timestamps=[1, 5, 8, 8, 9])
>>> # Default: time since previous event
>>> b = a.since_last()
>>> b
indexes: ...
timestamps: [1. 5. 8. 8. 9.]
'since_last': [nan 4. 3. 0. 1.]
...
>>> # Time since 2 previous events
>>> b = a.since_last(steps=2)
>>> b
indexes: ...
timestamps: [1. 5. 8. 8. 9.]
'since_last': [nan nan 7. 3. 1.]
...
If sampling
is provided, the output will correspond to the time elapsed
between each timestamp in sampling
and the latest previous or equal
timestamp in the input.
Example with sampling
>>> a = tp.event_set(timestamps=[1, 4, 5, 7])
>>> b = tp.event_set(timestamps=[-1, 2, 4, 6, 10])
>>> # Time elapsed between each sampling event
>>> # and the latest previous event in a
>>> c = a.since_last(sampling=b)
>>> c
indexes: ...
timestamps: [-1. 2. 4. 6. 10.]
'since_last': [nan 1. 0. 1. 3.]
...
>>> # 2 steps with sampling
>>> c = a.since_last(steps=2, sampling=b)
>>> c
indexes: ...
timestamps: [-1. 2. 4. 6. 10.]
'since_last': [nan nan 3. 2. 5.]
...
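Both behaviours, with and without sampling, can be mimicked in plain Python to check the outputs above. `since_last_sketch` is a hypothetical helper, not Temporian's implementation, and it assumes numeric timestamps and a single index value.

```python
import math
from bisect import bisect_right


def since_last_sketch(timestamps, steps=1, sampling=None):
    """Elapsed time between each output timestamp and the event `steps` back."""
    events = sorted(timestamps)
    out = []
    if sampling is None:
        # Each event looks back `steps` positions in the sorted input.
        for i, t in enumerate(events):
            out.append(float(t - events[i - steps]) if i >= steps else math.nan)
    else:
        for t in sampling:
            # Number of input events with a timestamp at or before t.
            n = bisect_right(events, t)
            out.append(float(t - events[n - steps]) if n >= steps else math.nan)
    return out


one_step = since_last_sketch([1, 5, 8, 8, 9])
two_steps = since_last_sketch([1, 5, 8, 8, 9], steps=2)
sampled = since_last_sketch([1, 4, 5, 7], sampling=[-1, 2, 4, 6, 10])
sampled_two = since_last_sketch([1, 4, 5, 7], steps=2, sampling=[-1, 2, 4, 6, 10])
```

When there are fewer than `steps` earlier events, the sketch emits NaN, matching the leading `nan` entries in the examples.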
Parameters:
Name | Type | Description | Default |
---|---|---|---|
steps | int | Number of previous events to compute elapsed time with. | 1 |
sampling | Optional[EventSetOrNode] | EventSet to use the sampling of. | None |
Returns:
Type | Description |
---|---|
EventSetOrNode
|
Resulting EventSet, with same sampling as |
tan #
tan() -> EventSetOrNode
Calculates the tangent of an EventSet
's features.
Can only be used on floating point features.
Example
>>> a = tp.event_set(
... timestamps=[1, 2, 3, 4],
... features={"M": [0, np.pi/4, np.pi/3, np.pi/6]},
... )
>>> a.tan()
indexes: ...
timestamps: [1. 2. 3. 4.]
'M': [0. 1. 1.7321 0.5774]
...
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSetOrNode with tangent of input features. |
tick #
tick(
interval: Duration,
align: bool = True,
after_last: bool = True,
before_first: bool = False,
) -> EventSetOrNode
Generates timestamps at regular intervals in the range of a guide
EventSet
.
Example with align
>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=True)
>>> b
indexes: ...
timestamps: [ 6. 9. 12. 15. 18.]
...
Example without align
>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=False)
>>> b
indexes: ...
timestamps: [ 5. 8. 11. 14. 17.]
...
Example with before_first
>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=True, before_first=True)
>>> b
indexes: ...
timestamps: [ 3. 6. 9. 12. 15. 18.]
...
Example without after_last
>>> a = tp.event_set(timestamps=[5, 9, 16])
>>> b = a.tick(interval=tp.duration.seconds(3), align=True, after_last=False)
>>> b
indexes: ...
timestamps: [ 6. 9. 12. 15.]
...
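The four examples above can be reproduced with a short plain-Python sketch. `tick_sketch` is a hypothetical helper and not Temporian's implementation. It assumes numeric timestamps and a single index value. Its handling of ticks that coincide exactly with the first or last timestamp is an assumption, since the examples do not cover that case.

```python
import math


def tick_sketch(timestamps, interval, align=True, after_last=True, before_first=False):
    """Regular ticks covering the range of the input timestamps."""
    first, last = min(timestamps), max(timestamps)
    # With align, ticks fall on multiples of interval; otherwise they start at first.
    start = math.ceil(first / interval) * interval if align else first
    ticks = []
    t = start
    while t <= last:
        ticks.append(t)
        t += interval
    # Assumption: the extra tick is added only if no tick sits on the last timestamp.
    if after_last and (not ticks or ticks[-1] < last):
        ticks.append(t)
    # Assumption: the extra tick is added only if no tick sits on the first timestamp.
    if before_first and start > first:
        ticks.insert(0, start - interval)
    return ticks


aligned = tick_sketch([5, 9, 16], 3)
unaligned = tick_sketch([5, 9, 16], 3, align=False)
with_before = tick_sketch([5, 9, 16], 3, before_first=True)
no_after = tick_sketch([5, 9, 16], 3, after_last=False)
```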
Parameters:
Name | Type | Description | Default |
---|---|---|---|
interval | Duration | Tick interval. | required |
align | bool | If false, the first tick is generated at the first timestamp (similar to EventSet.begin()). If true (default), ticks are generated on timestamps that are multiples of interval. | True |
after_last | bool | If True, a tick after the last timestamp is included. | True |
before_first | bool | If True, a tick before the first timestamp is included. | False |
Returns:
Type | Description |
---|---|
EventSetOrNode | A feature-less EventSet with regular timestamps. |
tick_calendar #
tick_calendar(
second: Optional[Union[int, Literal["*"]]] = None,
minute: Optional[Union[int, Literal["*"]]] = None,
hour: Optional[Union[int, Literal["*"]]] = None,
mday: Optional[Union[int, Literal["*"]]] = None,
month: Optional[Union[int, Literal["*"]]] = None,
wday: Optional[Union[int, Literal["*"]]] = None,
after_last: bool = True,
before_first: bool = False,
) -> EventSetOrNode
Generates events periodically at fixed times or dates, e.g., each month.
Events are generated in the range of the input
EventSet
independently for each index.
The interface is inspired by the crontab format: each argument can
take a value of '*'
to tick at all values, or a fixed integer to
tick only at that precise value.
Non-specified values (None
), are set to '*'
if a finer
resolution argument is specified, or fixed to the first valid value if
a lower resolution is specified. For example, setting only
tick_calendar(hour='*')
is equivalent to:
tick_calendar(second=0, minute=0, hour='*', mday='*', month='*')
, resulting in one tick at every exact hour of every day/month/year in
the input guide range.
The datetime timezone is always assumed to be UTC.
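The defaulting rule for unspecified arguments described above can be written down as a small stand-alone function. This is a sketch of the rule as stated in the text, not Temporian's code: `resolve_calendar_defaults` is a hypothetical name, `wday` is left out, the first valid values (0 for second, minute and hour, 1 for mday and month) are assumptions, and so is the error raised when nothing is specified.

```python
# Calendar fields ordered from finest to coarsest resolution (wday omitted).
FIELDS = ["second", "minute", "hour", "mday", "month"]
# Assumed first valid value of each field.
FIRST_VALID = {"second": 0, "minute": 0, "hour": 0, "mday": 1, "month": 1}


def resolve_calendar_defaults(**given):
    """Fill None arguments following the finer/coarser resolution rule."""
    specified = [i for i, name in enumerate(FIELDS) if given.get(name) is not None]
    if not specified:
        raise ValueError("At least one argument must be specified.")
    finest, coarsest = specified[0], specified[-1]
    resolved = {}
    for i, name in enumerate(FIELDS):
        value = given.get(name)
        if value is not None:
            resolved[name] = value
        elif i < finest:
            # Finer than every specified argument: fixed to the first valid value.
            resolved[name] = FIRST_VALID[name]
        elif i > coarsest:
            # Coarser than every specified argument: tick at all values.
            resolved[name] = "*"
        else:
            # None between two specified arguments is ambiguous.
            raise ValueError(f"Set {name} to '*' or an integer.")
    return resolved


hourly = resolve_calendar_defaults(hour="*")
```

Here `hourly` holds `second=0, minute=0, hour='*', mday='*', month='*'`, the same expansion as the one given in the text for `tick_calendar(hour='*')`.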
Examples:
>>> # Every day (at 00:00:00) in the period (exactly one year)
>>> a = tp.event_set(timestamps=["2021-01-01", "2021-12-31 23:59:59"])
>>> b = a.tick_calendar(hour=0)
>>> b
indexes: ...
events:
(366 events):
timestamps: [...]
...
>>> # Every day at 2:30am
>>> b = a.tick_calendar(hour=2, minute=30)
>>> tp.glue(b.calendar_hour(), b.calendar_minute())
indexes: ...
events:
(366 events):
timestamps: [...]
'calendar_hour': [2 2 2 ... 2 2 2]
'calendar_minute': [30 30 30 ... 30 30 30]
...
>>> # Day 5 of every month (at 00:00)
>>> b = a.tick_calendar(mday=5)
>>> b.calendar_day_of_month()
indexes: ...
events:
(13 events):
timestamps: [...]
'calendar_day_of_month': [5 5 5 ... 5 5 5]
...
>>> # 1st of February of every year
>>> a = tp.event_set(timestamps=["2020-01-01", "2021-12-31"])
>>> b = a.tick_calendar(month=2)
>>> tp.glue(b.calendar_day_of_month(), b.calendar_month())
indexes: ...
events:
(3 events):
timestamps: [...]
'calendar_day_of_month': [1 1 1]
'calendar_month': [2 2 2]
...
>>> # Every second in the period (2 hours -> 7200 seconds)
>>> a = tp.event_set(timestamps=["2020-01-01 00:00:00",
... "2020-01-01 01:59:59"])
>>> b = a.tick_calendar(second='*')
>>> b
indexes: ...
events:
(7200 events):
timestamps: [...]
...
>>> # Every second of the minute 30 of every hour (00:30 and 01:30)
>>> a = tp.event_set(timestamps=["2020-01-01 00:00",
... "2020-01-01 02:00"])
>>> b = a.tick_calendar(second='*', minute=30)
>>> b
indexes: ...
events:
(121 events):
timestamps: [...]
...
>>> # Not allowed: intermediate arguments (minute, hour) not specified
>>> b = a.tick_calendar(second=1, mday=1) # ambiguous meaning
Traceback (most recent call last):
...
ValueError: Can't set argument to None because previous and
following arguments were specified. Set to '*' or an integer ...
>>> # not after_last
>>> a = tp.event_set(timestamps=["2020-02-01", "2020-04-01"])
>>> b = a.tick_calendar(mday=10, after_last=False)
>>> tp.glue(b.calendar_day_of_month(), b.calendar_month())
indexes: ...
events:
(2 events):
timestamps: [...]
'calendar_day_of_month': [10 10]
'calendar_month': [2 3]
...
>>> # before_first
>>> a = tp.event_set(timestamps=["2020-02-01", "2020-04-01"])
>>> b = a.tick_calendar(mday=10, before_first=True)
>>> tp.glue(b.calendar_day_of_month(), b.calendar_month())
indexes: ...
events:
(4 events):
timestamps: [...]
'calendar_day_of_month': [10 10 10 10]
'calendar_month': [1 2 3 4]
...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
second | Optional[Union[int, Literal['*']]] | '*' (any second), None (auto) or number in range | None |
minute | Optional[Union[int, Literal['*']]] | '*' (any minute), None (auto) or number in range | None |
hour | Optional[Union[int, Literal['*']]] | '*' (any hour), None (auto), or number in range | None |
mday | Optional[Union[int, Literal['*']]] | '*' (any day), None (auto) or number in range | None |
month | Optional[Union[int, Literal['*']]] | '*' (any month), None (auto) or number in range | None |
wday | Optional[Union[int, Literal['*']]] | '*' (any day), None (auto) or number in range | None |
after_last | bool | If True, a tick after the last timestamp is included. Useful for window operations where you want the timestamps to be included in the range of the ticks. | True |
before_first | bool | If True, a tick before the first timestamp is included. Useful for window operations where you want the timestamps to be included in the range of the ticks. | False |
Returns:
Type | Description |
---|---|
EventSetOrNode | A feature-less EventSet with timestamps at the specified interval. |
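The effect of after_last and before_first can be illustrated with a minimal pure-Python sketch. It is not Temporian's implementation: monthly_ticks is a hypothetical helper that only handles the mday case (with mday at most 28), and it assumes the extra ticks are the nearest calendar ticks strictly outside the input range. How the library treats a timestamp that falls exactly on a tick is not covered here.

```python
from datetime import datetime


def monthly_ticks(first, last, mday, after_last=True, before_first=False):
    """Ticks at 00:00 on day `mday` of each month (hypothetical helper).

    Assumes mday <= 28 so that the day exists in every month.
    """

    def shift(year, month, delta):
        # Move (year, month) by `delta` months.
        idx = year * 12 + (month - 1) + delta
        return idx // 12, idx % 12 + 1

    # Candidate ticks from one month before `first` to one month after `last`.
    year, month = shift(first.year, first.month, -1)
    end = shift(last.year, last.month, 1)
    candidates = []
    while (year, month) <= end:
        candidates.append(datetime(year, month, mday))
        year, month = shift(year, month, 1)

    ticks = [t for t in candidates if first <= t <= last]
    before = [t for t in candidates if t < first]
    after = [t for t in candidates if t > last]
    if before_first and before:
        ticks = [before[-1]] + ticks  # nearest tick before the first timestamp
    if after_last and after:
        ticks = ticks + [after[0]]  # nearest tick after the last timestamp
    return ticks


first, last = datetime(2020, 2, 1), datetime(2020, 4, 1)
no_after = monthly_ticks(first, last, mday=10, after_last=False)
with_before = monthly_ticks(first, last, mday=10, before_first=True)
print([t.month for t in no_after])     # months of the ticks
print([t.month for t in with_before])
```

With the same inputs as the two examples above, the sketch yields ticks in months 2 and 3 when after_last=False, and in months 1 to 4 when before_first=True.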
timestamps #
timestamps() -> EventSetOrNode
Converts an EventSet's timestamps into a float64 feature.
Features in the input EventSet are ignored; only the timestamps are used.
Datetime timestamps are converted to Unix timestamps.
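As a rough illustration of the datetime conversion, the sketch below turns Python datetimes into float Unix timestamps using only the standard library. It is not Temporian code: to_unix_seconds is a hypothetical helper, and it assumes the unit is seconds and that naive datetimes are interpreted as UTC.

```python
from datetime import datetime, timezone


def to_unix_seconds(dt: datetime) -> float:
    """Seconds since 1970-01-01 00:00:00 UTC, as a float (hypothetical helper)."""
    if dt.tzinfo is None:
        # Assumption of this sketch: naive datetimes are interpreted as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


unix_values = [
    to_unix_seconds(datetime(2020, 1, 1)),
    to_unix_seconds(datetime(2020, 1, 2)),
]
print(unix_values)  # [1577836800.0, 1577923200.0]
```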
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature named |
unique_timestamps #
unique_timestamps() -> EventSetOrNode
Removes events with duplicated timestamps from an EventSet.
Returns a feature-less EventSet where each timestamp from the original one appears only once. If the input is indexed, the unique operation is applied independently for each index.
Usage example
>>> a = tp.event_set(timestamps=[5, 9, 9, 16], features={'f': [1,2,3,4]})
>>> b = a.unique_timestamps()
>>> b
indexes: []
features: []
events:
(3 events):
timestamps: [ 5. 9. 16.]
...
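The per-index behavior can be sketched in plain Python. This is only an illustration of the semantics described above, not the library's implementation: unique_timestamps_sketch and the index keys "a" and "b" are hypothetical, and the input is modeled as a dict mapping each index value to its list of timestamps.

```python
def unique_timestamps_sketch(timestamps_by_index):
    """Deduplicate timestamps independently for each index value."""
    return {
        key: sorted(set(timestamps))
        for key, timestamps in timestamps_by_index.items()
    }


result = unique_timestamps_sketch({"a": [5, 9, 9, 16], "b": [9, 9]})
print(result)  # {'a': [5, 9, 16], 'b': [9]}
```

Timestamp 9 is kept once in each index, since duplicates are only removed within the same index.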
Returns:
Type | Description |
---|---|
EventSetOrNode | Feature-less EventSet containing the unique timestamps of the input. |
until_next #
until_next(
sampling: EventSetOrNode, timeout: Duration
) -> EventSetOrNode
Gets the duration until the next sampling event for each input event.
If no sampling event is observed before timeout time-units, returns NaN.
until_next is different from since_last in that since_last returns one value for each sampling event (sampling events are after input events), while until_next returns one value for each input event (here again, sampling events are after input events).
The output EventSet has one event for each event in input, but with its timestamp moved forward to the nearest future event in sampling. If no timestamp in sampling is closer than timeout, it is moved by timeout into the future instead.
until_next is useful to measure the time it takes for an issue (input) to be detected by an alert (sampling).
Basic example with 1 and 2 steps
>>> a = tp.event_set(timestamps=[0, 10, 11, 20, 30])
>>> b = tp.event_set(timestamps=[1, 12, 21, 22, 42])
>>> c = a.until_next(sampling=b, timeout=5)
>>> c
indexes: []
features: [('until_next', float64)]
events:
(5 events):
timestamps: [ 1. 12. 12. 21. 35.]
'until_next': [ 1. 2. 1. 1. nan]
...
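The rule behind this output can be written as a minimal pure-Python sketch. It is not Temporian's implementation: until_next_sketch is a hypothetical helper, and two boundary cases are assumptions of the sketch, namely that a sampling event at exactly the same time as an input event counts as "next" and that a wait exactly equal to timeout still counts as observed.

```python
import math
from bisect import bisect_left


def until_next_sketch(input_ts, sampling_ts, timeout):
    """Return (output timestamps, durations) for each input timestamp."""
    sampling_ts = sorted(sampling_ts)
    out_ts, out_val = [], []
    for t in input_ts:
        # Position of the first sampling timestamp at or after t.
        i = bisect_left(sampling_ts, t)
        if i < len(sampling_ts) and sampling_ts[i] - t <= timeout:
            # Move the event to the sampling timestamp and record the wait.
            out_ts.append(float(sampling_ts[i]))
            out_val.append(float(sampling_ts[i] - t))
        else:
            # Nothing seen within the timeout: move by timeout, value is NaN.
            out_ts.append(float(t + timeout))
            out_val.append(math.nan)
    return out_ts, out_val


ts, vals = until_next_sketch(
    [0, 10, 11, 20, 30], [1, 12, 21, 22, 42], timeout=5
)
print(ts)    # [1.0, 12.0, 12.0, 21.0, 35.0]
print(vals)  # [1.0, 2.0, 1.0, 1.0, nan]
```

The last input event (30) has no sampling event within 5 time units, so it is moved to 35 and receives NaN, matching the example above.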
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sampling | EventSetOrNode | EventSet to use the sampling of. | required |
timeout | Duration | Maximum amount of time to wait. If no sampling is observed before the timeout expires, the output feature value is NaN. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | Resulting EventSet. |
where #
where(
on_true: Union[EventSetOrNode, Any],
on_false: Union[EventSetOrNode, Any],
) -> EventSetOrNode
Chooses event-wise feature values from on_true or on_false depending on the boolean value of self.
Given an input EventSet with a single boolean feature, creates a new one using the same sampling, taking values from on_true when the input is True and from on_false otherwise.
Both on_true and on_false can be single values or EventSets with the same sampling as the boolean input and one single feature. In any case, both sources must have the same data type, or be explicitly cast to the same type beforehand.
Example with single values
>>> a = tp.event_set(timestamps=[5, 9, 9],
... features={'f': [True, True, False]})
>>> b = a.where(on_true='hello', on_false='goodbye')
>>> b
indexes: ...
events:
(3 events):
timestamps: [5. 9. 9.]
'f': [b'hello' b'hello' b'goodbye']
...
Example with EventSets
>>> a = tp.event_set(timestamps=[5, 9, 10],
... features={'condition': [True, True, False],
... 'yes': [1, 2, 3],
... 'no': [-1, -2, -3]})
>>> b = a['condition'].where(a['yes'], a['no'])
>>> b
indexes: ...
events:
(3 events):
timestamps: [ 5. 9. 10.]
'condition': [ 1 2 -3]
...
Example setting to NaN based on condition
>>> a = tp.event_set(timestamps=[5, 6, 7, 8, 9],
... features={'f': [1, 2, -3, -4, 5]})
>>> # Set values < 0 to nan (cast to float to support nan)
>>> b = (a['f'] >= 0).where(a['f'].cast(float), np.nan)
>>> b
indexes: ...
events:
(5 events):
timestamps: [5. 6. 7. 8. 9.]
'f': [ 1. 2. nan nan 5.]
...
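The selection rule used in these examples can be sketched in plain Python. This is an illustration only, not the library's implementation: where_sketch is a hypothetical helper that models features as lists, broadcasts scalar sources to one value per event, and does not perform the dtype checks described above.

```python
def where_sketch(condition, on_true, on_false):
    """Pick on_true[i] where condition[i] is True, else on_false[i]."""
    n = len(condition)
    # Broadcast scalar sources to one value per event.
    true_vals = on_true if isinstance(on_true, list) else [on_true] * n
    false_vals = on_false if isinstance(on_false, list) else [on_false] * n
    if len(true_vals) != n or len(false_vals) != n:
        raise ValueError("sources must have one value per event")
    return [t if c else f for c, t, f in zip(condition, true_vals, false_vals)]


from_scalars = where_sketch([True, True, False], "hello", "goodbye")
from_lists = where_sketch([True, True, False], [1, 2, 3], [-1, -2, -3])
print(from_scalars)  # ['hello', 'hello', 'goodbye']
print(from_lists)    # [1, 2, -3]
```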
Parameters:
Name | Type | Description | Default |
---|---|---|---|
on_true | Union[EventSetOrNode, Any] | Source of values when the condition is True. | required |
on_false | Union[EventSetOrNode, Any] | Source of values when the condition is False. | required |
Returns:
Type | Description |
---|---|
EventSetOrNode | EventSet with a single feature and same sampling as input. |