Data Gap Filling


  • Time series queries can add extra data points in the gaps between entries. The values of these data points are calculated from the entries on either side of the gap. This is called interpolation.

  • There are two interpolation methods available:

    1. Nearest - add values equal to the value of the nearest entry.
    2. Linear - place the data points on a straight line between the entries on either side.


Syntax

To add interpolation to a time series query, start by grouping the data by some unit.
For example, suppose you have a time series with an entry for every hour, but several hours are missing (1am, 3pm, etc.), and you want to fill those gaps. You will want to group by 1 hour.
Or suppose you have a time series with an entry for every hour, and you want to fill in the gap with one data point per minute: you will group by 1 minute.
To learn more about grouping, see the documentation on aggregation in time series queries.

Next, use:

  • For RQL queries: interpolation()
  • For LINQ queries: the Interpolation option in TimeSeriesAggregationOptions.

The two interpolation modes are:

  1. Nearest: each added data point takes the value of the closest time series entry, whether that entry comes before or after it. If the data point is exactly midway between two entries, it takes the value of the earlier entry.

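  2. Linear: the data point is placed on a straight line between the two entries on either side of it. For example, if the entry for 1:00 PM has the value 100 and the entry for 2:00 PM has the value 130, the interpolated data point for 1:40 PM has the value 120: it lies two-thirds of the way across the hour, so it gets 100 plus two-thirds of the 30-point difference.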
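
The arithmetic behind the two modes can be sketched in plain C#. This is only an illustration of the rules described above, not RavenDB's implementation; the class InterpolationSketch and its methods Linear and Nearest are hypothetical names introduced for this example.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the RavenDB client API.
public static class InterpolationSketch
{
    // Linear: the value at time t on the straight line between (t0, v0) and (t1, v1).
    // Assumes t0 < t1 and t0 <= t <= t1.
    public static double Linear(DateTime t0, double v0, DateTime t1, double v1, DateTime t)
    {
        // Fraction of the gap that has elapsed at time t (0 at t0, 1 at t1).
        double fraction = (double)(t - t0).Ticks / (t1 - t0).Ticks;
        return v0 + (v1 - v0) * fraction;
    }

    // Nearest: the value of the closer entry. On an exact tie the earlier entry wins.
    public static double Nearest(DateTime t0, double v0, DateTime t1, double v1, DateTime t)
    {
        return (t - t0) <= (t1 - t) ? v0 : v1;
    }
}
```

With entries of 100 at 1:00 PM and 130 at 2:00 PM, Linear gives approximately 120 for 1:40 PM. Nearest gives 130 for the same point, because 2:00 PM is closer, and 100 for 1:30 PM, because the tie goes to the earlier entry.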

One data point is added for each aggregated time unit that does not contain any values. When time series entries have multiple values, an interpolation will add one data point for each pair of values found on both sides of the gap.
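
The sketch below shows how empty time units might be filled when entries carry several values. It is a simplified model under stated assumptions, not RavenDB's implementation: entries are assumed to be sorted by time and aligned to the group boundaries, each value position is interpolated independently, and only positions present on both sides of the gap are filled. GapFillSketch and FillLinear are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the RavenDB client API.
public static class GapFillSketch
{
    // Returns the original entries plus one linearly interpolated data point
    // for every empty time unit ("bucket") that lies between two entries.
    public static List<(DateTime Time, double[] Values)> FillLinear(
        IReadOnlyList<(DateTime Time, double[] Values)> entries, TimeSpan bucket)
    {
        if (bucket <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(bucket));

        var result = new List<(DateTime Time, double[] Values)>();
        for (int i = 0; i < entries.Count; i++)
        {
            result.Add(entries[i]);
            if (i + 1 == entries.Count) break;

            var (t0, v0) = entries[i];
            var (t1, v1) = entries[i + 1];

            // Walk the empty buckets strictly between the two entries.
            for (DateTime t = t0 + bucket; t < t1; t += bucket)
            {
                double fraction = (double)(t - t0).Ticks / (t1 - t0).Ticks;

                // Interpolate only the value positions that exist on both sides.
                var values = new double[Math.Min(v0.Length, v1.Length)];
                for (int k = 0; k < values.Length; k++)
                    values[k] = v0[k] + (v1[k] - v0[k]) * fraction;

                result.Add((t, values));
            }
        }
        return result;
    }
}
```

For example, with entries at 1:00 PM holding the values (100, 10) and at 4:00 PM holding (130, 40), grouping by 1 hour adds data points for 2:00 PM and 3:00 PM with values of approximately (110, 20) and (120, 30).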

Examples

// RQL: group HeartRates by 1 second and fill empty seconds using linear interpolation
var query = session.Advanced.RawQuery<TimeSeriesAggregationResult>(@"
    from People
    select timeseries(
        from HeartRates
        group by 1 second
        with interpolation(linear)
    )");
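
// LINQ (separate example): group HeartRates by 1 hour and fill empty hours using linear interpolation
var query = session.Query<People>()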
    .Select(p => RavenQuery.TimeSeries(p, "HeartRates")
        .GroupBy(g => g
            .Hours(1)
            .WithOptions(new TimeSeriesAggregationOptions
            {
                Interpolation = InterpolationType.Linear
            }))
        .ToList());