Introducing SampleStream
How It Works
SampleStream lets you iterate over the results of a query, one fixed-size time window at a time.
SampleStream takes an AQL request, a window size, a number of samples and a step size.
Every time you ask SampleStream for the next sample, it returns window size seconds of data, broken into number of samples buckets.
For example, if you wanted to look at events happening over a 30 second window at 1 second resolution, your window size would be 30 and your number of samples would also be 30 (30 samples over 30 seconds = 1 sample per second).
Once you've set up a span to search through, you'll be able to request new samples of data. Each will be step size seconds away from the previous sample.
This makes it quite easy to scan over a period of time.
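The arithmetic behind window size, number of samples and step size can be sketched in a few lines of plain Python. This is only an illustration, not SampleStream's actual interface: the function names window_starts and bucket_starts are hypothetical, and the sketch assumes the stream steps forward through the span.

```python
from datetime import datetime, timedelta

def window_starts(span_start, span_end, window_size, step_size):
    """Yield the start time of every window that fits inside the span.

    Hypothetical helper: assumes windows step forward in time.
    """
    current = span_start
    while current + timedelta(seconds=window_size) <= span_end:
        yield current
        current += timedelta(seconds=step_size)

def bucket_starts(window_start, window_size, num_samples):
    """Return the start time of each bucket inside one window."""
    bucket_seconds = window_size / num_samples
    return [window_start + timedelta(seconds=i * bucket_seconds)
            for i in range(num_samples)]

# A two-minute span, scanned with 30 second windows that move 10 seconds each time.
span_start = datetime(2024, 1, 1, 0, 0, 0)
span_end = span_start + timedelta(minutes=2)
starts = list(window_starts(span_start, span_end, window_size=30, step_size=10))

# 30 samples over 30 seconds gives one bucket per second.
buckets = bucket_starts(starts[0], window_size=30, num_samples=30)
print(len(starts), "windows,", len(buckets), "buckets per window")
```

With a step size smaller than the window size, as here, consecutive windows overlap. Setting the step size equal to the window size would give back-to-back windows instead.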
Differences to a Normal Query
There are some key differences between using SampleStream and a normal ARDI query…
Long Time Frames
In the background, SampleStream only fetches part of the data range you've asked for at any one time (unless the time-frame is very small).
In long time-frames or very high-resolution data, there may be thousands or even millions of individual records to be processed. Rather than asking for them all at once - which can be stressful to the ARDI server and the systems it connects to - SampleStream requests a small window ahead of your current position.
This means your application never consumes too much memory or system resources at any one time. If you're searching for an event, it also means you can break out of the loop at any time and you won't request any redundant data from ARDI.
This is particularly useful if you don't know how far away your search target is - in some cases the data you're looking for is only seconds old, but in some applications the system might have been stopped or offline for days. SampleStream allows you to search for events over large durations without overwhelming your systems.
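The idea of fetching a little at a time and stopping early can be shown with a Python generator. This is a sketch of the general technique only: fetch_chunk is a made-up stand-in for a server query, the 600 second chunk size is an arbitrary assumption, and none of it reflects how SampleStream is actually implemented.

```python
def fetch_chunk(start, end):
    """Hypothetical stand-in for one server query.

    Returns one (timestamp, value) record per second. The value is 1 only
    at second 4000, which plays the role of the event we are searching for.
    """
    fetch_chunk.calls += 1
    return [(t, 1 if t == 4000 else 0) for t in range(start, end)]

fetch_chunk.calls = 0  # counts how many queries were actually made

def lazy_records(span_start, span_end, chunk_seconds=600):
    """Yield records across the span, fetching one chunk at a time."""
    position = span_start
    while position < span_end:
        chunk_end = min(position + chunk_seconds, span_end)
        # The next chunk is only fetched once the caller has consumed this one.
        yield from fetch_chunk(position, chunk_end)
        position = chunk_end

# Search a full day of data (86,400 seconds) for the event.
found_at = None
for timestamp, value in lazy_records(0, 86_400):
    if value == 1:
        found_at = timestamp
        break  # nothing after this point is ever requested

print("found at", found_at, "after", fetch_chunk.calls, "queries")
```

Covering the whole day would take 144 queries of this size. Because the loop breaks as soon as the event is found, only the chunks up to and including the one containing it are fetched.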
Consistent Time Buckets
ARDI's APIs don't guarantee consistent time gaps in the data you get back from a query. The returned time-stamps might be erratic, particularly if you're requesting discrete data (such as on/off signals).
The data returned from SampleStream is always consistently spaced.
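To illustrate what consistent spacing means, the sketch below turns irregularly time-stamped points into evenly spaced samples by holding the last known value. The hold strategy and the name resample_hold are assumptions made for this example. The text above doesn't say which method SampleStream uses internally.

```python
def resample_hold(points, start, window_size, num_samples):
    """Resample sorted (timestamp, value) points onto evenly spaced buckets.

    Each bucket takes the most recent value at or before the bucket's start
    time (sample-and-hold). Buckets before the first point get None.
    """
    bucket_seconds = window_size / num_samples
    samples = []
    index = 0
    current = None
    for i in range(num_samples):
        bucket_time = start + i * bucket_seconds
        # Advance past every point that has already happened by this time.
        while index < len(points) and points[index][0] <= bucket_time:
            current = points[index][1]
            index += 1
        samples.append(current)
    return samples

# Erratic time-stamps, as might come back from a normal query.
points = [(0.0, 10.0), (0.4, 12.0), (3.7, 15.0), (8.2, 11.0)]
samples = resample_hold(points, start=0.0, window_size=10, num_samples=10)
print(samples)
```

Whatever the spacing of the input, the output always has exactly num_samples values, one per bucket.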
Discrete Splitting
Following on from the previous point, SampleStream also splits each discrete signal into a separate channel for each of its values.
For example, if you ask for a property that has a value of ON or OFF, instead of getting a single property back, you'll get two - one for 'on' and one for 'off'.
For each channel, you'll get the percentage of the time window during which the property was equal to that value.
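A rough sketch of the splitting step, in plain Python, might look like the following. It assumes the percentage is calculated per bucket and expressed on a 0 to 100 scale. Both are assumptions for this example, as is the function name split_discrete.

```python
def split_discrete(changes, states, start, window_size, num_samples):
    """Split a discrete signal into one channel per state.

    changes: sorted (timestamp, state) pairs marking when the signal changed.
    Returns {state: [percentage of each bucket spent in that state]}.
    """
    bucket_seconds = window_size / num_samples
    end = start + window_size
    channels = {state: [0.0] * num_samples for state in states}
    for i, (changed_at, state) in enumerate(changes):
        # A state lasts from its change until the next change (or window end).
        next_change = changes[i + 1][0] if i + 1 < len(changes) else end
        seg_start = max(changed_at, start)
        seg_end = min(next_change, end)
        for b in range(num_samples):
            b_start = start + b * bucket_seconds
            b_end = b_start + bucket_seconds
            overlap = min(seg_end, b_end) - max(seg_start, b_start)
            if overlap > 0:
                channels[state][b] += 100.0 * overlap / bucket_seconds
    return channels

# OFF until 15 s, ON from 15 s to 20 s, then OFF again.
changes = [(0, "OFF"), (15, "ON"), (20, "OFF")]
channels = split_discrete(changes, ["ON", "OFF"],
                          start=0, window_size=30, num_samples=3)
print(channels)
```

In this example the middle bucket (10 to 20 seconds) is half ON and half OFF, so both channels read 50 for that bucket, and the two channels always add up to 100.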
AI Normalisation and Formatting
The samples include specific functions to return their results in formats that are suitable for use with AI frameworks, such as TensorFlow.
The output data is normalised and structured into arrays that can be used in recurrent (RNN) or convolutional (CNN) neural network applications.
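As an illustration of what normalised and structured can mean in practice, the sketch below scales each channel to the range 0 to 1 and arranges the samples as window, then time step, then channel, which is a layout commonly used for sequence models. The min-max scaling, the known value ranges and the names normalise and to_sequences are all assumptions for this example. They are not the functions that the samples themselves provide.

```python
def normalise(values, low, high):
    """Min-max scale values into the range 0 to 1 (assumed scaling method)."""
    span = high - low
    return [(v - low) / span if span else 0.0 for v in values]

def to_sequences(windows, ranges):
    """Arrange samples as nested lists shaped [window][time step][channel].

    windows: list of {channel name: [one value per bucket]}.
    ranges: {channel name: (low, high)} used for scaling.
    Channels are ordered alphabetically so the layout is stable.
    """
    names = sorted(ranges)
    batch = []
    for window in windows:
        scaled = [normalise(window[name], *ranges[name]) for name in names]
        # zip(*scaled) turns per-channel lists into per-time-step rows.
        batch.append([list(step) for step in zip(*scaled)])
    return batch

# Two windows of three buckets each, with made-up channel names and values.
windows = [
    {"temperature": [20.0, 30.0, 40.0], "pump_on": [0.0, 50.0, 100.0]},
    {"temperature": [40.0, 30.0, 20.0], "pump_on": [100.0, 100.0, 0.0]},
]
ranges = {"temperature": (20.0, 40.0), "pump_on": (0.0, 100.0)}
batch = to_sequences(windows, ranges)
print(len(batch), len(batch[0]), len(batch[0][0]))
```

Nested lists in this layout can be converted to an array by most numerical libraries before being passed to a model.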