Data Binding

Simple Cases

SQL DB

If the time-series data is stored in a table data_table with contents like this (“long format”):

channel timestamp value
sccm.Alicat.Inj.Gas 2022-09-15 03:19:25.496212+00 0
V.ThermoCo.Diss.AS 2022-09-15 03:19:27.612427+00 6.605405e-05
mbar.IG.Vac.AS 2022-09-15 03:19:31.490579+00 2.26e-07
mbar.IG.Vac.MS 2022-09-15 03:19:31.529545+00 2e-09
mbar.IG.Vac.BS 2022-09-15 03:19:31.610188+00 4e-09

Then the data source part of the configuration looks like:

slowdash_project:
  data_source:
    url: postgresql://USER:PASS@localhost:5432/DBNAME
    time_series:
      schema: data_table[channel]@timestamp=value

For other SQL database systems, change postgresql in the URL to mysql or sqlite.
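
To illustrate the “long format” above, here is a minimal Python sketch that creates and fills such a table in an SQLite file; the file name test.db is an arbitrary choice for this example, and UNIX timestamps are used for the time column:

import sqlite3, time

conn = sqlite3.connect('test.db')
conn.execute(
    'CREATE TABLE IF NOT EXISTS data_table'
    '(channel TEXT, timestamp REAL, value REAL)'
)

# one row per (channel, timestamp, value): the "long format" shown above
conn.execute(
    'INSERT INTO data_table VALUES (?, ?, ?)',
    ('sccm.Alicat.Inj.Gas', time.time(), 0.0)
)
conn.commit()

With this layout the data source URL would be sqlite:///test.db, with the same schema line, data_table[channel]@timestamp=value.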

InfluxDB

If only one tag is used for the channel names and one field for data values (MEAS,channel=CH1 value=VAL1 TIME):

slowdash_project:
  data_source:
    url: influxdb://ORGANIZATION@localhost:8086/BUCKET/MEASUREMENT
    token: TOKEN
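
For reference, data of the form MEAS,channel=CH1 value=VAL1 TIME could be written with the official influxdb-client Python package, for example like this (a sketch; the upper-case names are the same placeholders as above):

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url='http://localhost:8086', token='TOKEN', org='ORGANIZATION')
write_api = client.write_api(write_options=SYNCHRONOUS)

# one point: measurement MEASUREMENT, tag channel=CH1, field value=VAL1
point = Point('MEASUREMENT').tag('channel', 'CH1').field('value', 3.14)
write_api.write(bucket='BUCKET', record=point)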

Redis

All the time-series entries will be taken from a database:

slowdash_project:
  data_source:
    url: redis://localhost:6379/DB

Concepts and Terminologies

Data Model

Time-Series Data

Time Representation

Data Indexing (Channels)

Schema Descriptor

For a table storing time-series data, a “schema descriptor” describes which columns are used for the timestamp, tag(s), and field(s).

Examples
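
The schema descriptors used in the configuration examples of this document include, for instance:

  data_table[channel]@timestamp=value
  numeric_data[endpoint]@timestamp(with timezone)=value_raw(default),value_cal

As these examples show, the table name comes first, the tag column (if any) follows in brackets, the time column comes after the @ (optionally with a time-type annotation in parentheses), and the field column(s) come after the =.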

Syntax

Restrictions

Channels and Data-Store Schemata

Note that the numeric data values shown here could also be of other scalar types (strings, etc.) or objects (histograms, etc.).

Case 1: Tag Values for Channel

metric timestamp value
psia.Alicat.Inj.Gas 2022-09-15 03:19:25.419417+00 9.6
degC.Alicat.Inj.Gas 2022-09-15 03:19:25.458695+00 23.42
sccm.Alicat.Inj.Gas 2022-09-15 03:19:25.496212+00 0
V.ThermoCo.Diss.AS 2022-09-15 03:19:27.612427+00 6.605405e-05
V.PS.Diss.AS 2022-09-15 03:19:29.387352+00 0.01
A.PS.Diss.AS 2022-09-15 03:19:29.416561+00 0
mbar.IG.Vac.AS 2022-09-15 03:19:31.490579+00 2.26e-07
mbar.IG.Vac.MS 2022-09-15 03:19:31.529545+00 2e-09
mbar.IG.Vac.VSS 2022-09-15 03:19:31.56965+00 1.3e-08
mbar.IG.Vac.BS 2022-09-15 03:19:31.610188+00 4e-09

Case 2: Field Names for Channel

RunNumber TimeStamp sccm.Alicat.Inj mbar.CC10.Inj K.ThrmCpl.Diss mbar.IG.AS
3098 1664916014 3 1.18467 340.58 5.38333e-05
3097 1664915456 3 1.256 503.275 5.36e-05
3096 1664914833 3 1.36833 745.743 5.38333e-05
3095 1664913608 3 1.447 1154.09 5.44e-05
3094 1664913032 3 1.48933 1501.14 5.46667e-05
3093 1664912407 3 1.533 1799.61 5.52667e-05
3092 1664911835 3 1.576 2060.59 5.56e-05
3091 1664910949 0.1 0.163633 2069.99 3.82667e-06
3090 1664910320 0.1 0.163533 1820.41 2.72333e-06
3089 1664909732 0.1 0.163533 1521.82 2.54e-06

Case 3: Tag Values and Field Names for Channel

metric timestamp value_raw value_cal
psia.Alicat.Inj.Gas 2022-09-15 03:19:25.419417+00 9.6 9.6
degC.Alicat.Inj.Gas 2022-09-15 03:19:25.458695+00 23.42 23.42
sccm.Alicat.Inj.Gas 2022-09-15 03:19:25.496212+00 0 0
V.ThermoCo.Diss.AS 2022-09-15 03:19:27.612427+00 6.605405e-05 6.605405e-05
V.PS.Diss.AS 2022-09-15 03:19:29.387352+00 0.01 0.01
A.PS.Diss.AS 2022-09-15 03:19:29.416561+00 0 0
mbar.IG.Vac.AS 2022-09-15 03:19:31.490579+00 2.26e-07 2.26e-07
mbar.IG.Vac.MS 2022-09-15 03:19:31.529545+00 2e-09 2e-09
mbar.IG.Vac.VSS 2022-09-15 03:19:31.56965+00 1.3e-08 1.3e-08
mbar.IG.Vac.BS 2022-09-15 03:19:31.610188+00 4e-09 4e-09

Case 4: Multiple Tags (Tag + Flag(s)) and Fields

metric set_or_ist timestamp value_raw value_cal
psia.Alicat.Inj.Gas ist 2022-09-15 03:19:25.419417+00 9.6 9.6
degC.Alicat.Inj.Gas ist 2022-09-15 03:19:25.458695+00 23.42 23.42
sccm.Alicat.Inj.Gas ist 2022-09-15 03:19:25.496212+00 0 0
V.ThermoCo.Diss.AS ist 2022-09-15 03:19:27.612427+00 6.605405e-05 6.605405e-05
V.PS.Diss.AS ist 2022-09-15 03:19:29.387352+00 0.01 0.01
A.PS.Diss.AS ist 2022-09-15 03:19:29.416561+00 0 0
mbar.IG.Vac.AS ist 2022-09-15 03:19:31.490579+00 2.26e-07 2.26e-07
mbar.IG.Vac.MS ist 2022-09-15 03:19:31.529545+00 2e-09 2e-09
mbar.IG.Vac.VSS ist 2022-09-15 03:19:31.56965+00 1.3e-08 1.3e-08
mbar.IG.Vac.BS ist 2022-09-15 03:19:31.610188+00 4e-09 4e-09

RDBMS (SQL Database)

Supported Database Systems

DBMS                        Python Module Used       Async   Server-side down sampling
PostgreSQL                  psycopg2                 no      yes
PostgreSQL - Async access   asyncpg                  yes     yes
MySQL                       mysql-connector-python   no      yes
MySQL - Async access        aiomysql                 yes     yes
SQLite                      (none)                   no      no
Others (generic)            sqlalchemy               no      no

Configuration

PostgreSQL

Write the SlowDash project configuration file (SlowdashProject.yaml) as shown below.

slowdash_project:
  data_source:
    url: postgresql://USER:PASS@HOST:PORT/DB
    schema: TABLE [ TAG_COLUMN ] @ TIME_COLUMN

MySQL

Write the SlowDash project configuration file (SlowdashProject.yaml) as shown below.

slowdash_project:
  data_source:
    url: mysql://USER:PASS@HOST:PORT/DB
    schema: TABLE [ TAG_COLUMN ] @ TIME_COLUMN

SQLite

Write the SlowDash project configuration file (SlowdashProject.yaml) as shown below. The SQLite DB file path is relative to the project directory.

slowdash_project:
  data_source:
    url: sqlite:///FILENAME
    schema: TABLE [ TAG_COLUMN ] @ TIME_COLUMN

Other SQL DBMS

slowdash_project:
  data_source:
    type: SQLAlchemy
    url: DBTYPE://USER:PASS@HOST:PORT/DB

Avoiding Channel Scanning Overhead

In order to obtain a list of channels, SlowDash makes a query like SELECT DISTINCT tag FROM table, which can be slow if the table is large. This can be avoided either by manually listing the channels or by providing an efficient SQL statement:

By a manual list

      time_series:
        schema: numeric_data[endpoint]@timestamp(with timezone)=value_raw
        tags:
           list: [ "Ch0", "Ch1", "Ch2", "Ch3" ]

By a SQL query

If the database has a channel table, separately from the data table:

      time_series:
        schema: numeric_data[endpoint]@timestamp(with timezone)=value_raw
        tags:
           sql: "select name from endpoint_table"

Time-stamp Considerations

SlowDash can handle several different data time-stamp types. Recommendations for each supported DBMS are given below.

PostgreSQL

PostgreSQL supports several different time types. Among them, TIMESTAMP WITH TIME ZONE is recommended for use with SlowDash.

MySQL

MySQL supports several different time types. The MySQL libraries used by SlowDash (aiomysql) cannot handle time zones with the TIMESTAMP type; therefore, if the TIMESTAMP type is used, it must be specified in the SlowDash configuration file as unspecified utc. Using UNIX timestamps is always safe.

UNIX timestamps stored as REAL are recommended for use with SlowDash.

SQLite

SQLite does not have a dedicated date-time type; time information can be stored either as a time string in TEXT or as UNIX timestamps in INTEGER/REAL. If a column is declared as DATETIME, SQLite stores the value as a TEXT time string in UTC.

If time strings in TEXT (or DATETIME) are used, this must be specified in the SlowDash configuration file as unspecified utc. Using UNIX timestamps is always safe.

UNIX timestamps stored as REAL are recommended for use with SlowDash.

Time-Series of Scalar Values

To access a table containing time-series data, write the schema in the time_series entry:

slowdash_project:
  data_source:
    url: postgresql://p8_db_user:****@localhost:5432/p8_sc_db
    time_series:
      schema: numeric_data[endpoint]@timestamp(with timezone)=value_raw
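
A row matching this schema could be written, for example, with psycopg2 (the module SlowDash itself uses for synchronous PostgreSQL access). This is only a sketch; the table and column names are taken from the schema line above:

import time
import psycopg2

# '****' is the password placeholder from the configuration above
conn = psycopg2.connect('postgresql://p8_db_user:****@localhost:5432/p8_sc_db')
with conn, conn.cursor() as cur:
    # endpoint is the tag (channel name), timestamp is TIMESTAMP WITH TIME ZONE,
    # and value_raw holds the data value
    cur.execute(
        'INSERT INTO numeric_data (endpoint, timestamp, value_raw)'
        ' VALUES (%s, to_timestamp(%s), %s)',
        ('sccm.Alicat.Inj.Gas', time.time(), 0.0)
    )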

Case 1: Tag Values for Channel

      time_series:
        schema: numeric_data[endpoint]@timestamp(with timezone)=value_raw

Case 2: Fields for Channel

      time_series:
        schema: numeric_data@timestamp(with timezone)=value_raw,value_cal

Case 3: Tag and Fields for Channel

      time_series:
        schema: numeric_data[endpoint]@timestamp(with timezone)=value_raw(default),value_cal

Case 4: Tag, Flags and Fields for Channel

[TODO] Flags are currently not supported. To use the case 4 schemata, use the where and suffix options.

      time_series:
        schema: numeric_data[endpoint]@timestamp(with timezone)=value_raw(default),value_cal
        where: ist_or_set='ist'
        suffix: ':ist'

Time-Series of Histograms, Graphs, Tables & Trees

slowdash_project:
  data_source:
    url: postgresql://p8_db_user:****@localhost:5432/p8_sc_db
    object_time_series:
      schema: histograms[channel]@timestamp(unix)=json

Time-Series of Blobs

[TODO] DB system specific?

SQL Query Result as a Table

slowdash_project:
  data_source:
    url: sqlite:///RunTable.db
    view:
      name: RunTable
      sql: select * from RunTable where TimeStamp >= ${FROM_UNIXTIME} and TimeStamp < ${TO_UNIXTIME}

InfluxDB

Time-Series of Scalar Values

For simple cases, just specify the Measurement name in the time_series entry:

slowdash_project:
  data_source:
    url: influxdb://SlowDash@localhost:8086/TestData
    token: MY_TOKEN_HERE
    time_series:
      - measurement: TestTimeSeries

or in a short form:

slowdash_project:
  data_source:
    url: influxdb://SlowDash@localhost:8086/TestData/TestTimeSeries
    token: MY_TOKEN_HERE

Non-minimal Configuration

Case 1: Tag Values for Channel (meas,channel=CH1 value=VAL1 TIME)

If there is only one Tag for channels and one Field for data values, the simple configuration above works. For data containing multiple tags and/or fields, specify the names using schema:

      time_series:
        schema: meas[channel]=value

Case 2: Fields for Channel (meas ch1=VAL1,ch2=VAL2 TIME)

If the data does not have any Tag and all the Fields are used, the simple configuration above works. For data containing multiple tags and/or fields, specify the names using schema:

      time_series:
        schema: meas=ch1,ch2

Case 3: Tag and Fields for Channel (meas,channel=CH1 value_raw=VALRAW,value_cal=VALCAL TIME)

Use schema to describe which tag and fields are used:

      time_series:
        schema: meas[channel]=value_raw(default),value_cal
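
Using the same influxdb-client package as in the earlier sketch, a point with this layout might be written as follows (the organization and bucket names are taken from the configuration examples above):

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url='http://localhost:8086', token='MY_TOKEN_HERE', org='SlowDash')
write_api = client.write_api(write_options=SYNCHRONOUS)

# meas,channel=CH1 value_raw=...,value_cal=... TIME
point = Point('meas').tag('channel', 'CH1').field('value_raw', 1.23).field('value_cal', 1.25)
write_api.write(bucket='TestData', record=point)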

Specifying Data Period

Channel names (tag values and fields) are obtained through InfluxDB queries; however, by default the scan does not extend far back in time. The time range used for the channel scan can be specified in the configuration:

      time_series:
        schema: meas
        tags:
          scan_date: 2025-01-01
          scan_length: 100d

The tag values and fields are scanned over the time range from scan_date - scan_length to scan_date. The default scan_date is now() and the default scan_length is 10d. For the scan length, the following units are available: s (seconds; default), m (minutes), h (hours), d (days), w (weeks), and y (years).

Time-Series of Histograms, Graphs, Tables & Trees

slowdash_project:
  data_source:
    url: influxdb://SlowDash@localhost:8086/TestData
    token: MY_TOKEN_HERE
    object_time_series:
      - measurement: TestTimeSeriesOfObjects

Time-Series of Blobs

Redis

Data Binding

Time-Series of Scalar Values

slowdash_project:
  data_source:
    url: redis://localhost:6379
    time_series: { db: 1 }

or in a short form:

slowdash_project:
  data_source:
    url: redis://localhost:6379/1
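
As a sketch of how such scalar time-series data might be written, assuming the RedisTimeSeries extension is used for storage (the key name ch00 is arbitrary; adapt this to however your data is actually produced):

import time
import redis

r = redis.Redis('localhost', 6379, db=1)

# TS.ADD creates the time-series key if it does not exist; timestamps are in milliseconds
r.ts().add('ch00', int(1000 * time.time()), 3.14)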

Time-Series of Histograms, Graphs, Tables & Trees

Implementation

Configuration File

slowdash_project:
  data_source:
    url: redis://localhost:6379
    object_time_series: { db: 2 }

Writer Utility

import time
import numpy as np
import slowpy as slp

# connect to Redis and create the data objects
redis = slp.DataStore_Redis(host='localhost', port=6379, db=1, retention_length=60)
histogram = slp.Histogram('test_histogram_01', 100, 0, 10)
graph = slp.Graph('test_graph_01', ['time', 'value'])

redis.flush_db()

tk, xk = 0, 0
while True:
    # fill the histogram with Gaussian random numbers
    for i in range(10):
        histogram.fill(np.random.normal(5, 2))

    # add one random-walk point to the graph
    tk = tk + 1
    xk += np.random.normal(0.1, 1)
    graph.add_point(tk, xk)

    # write the current objects as time-series entries
    redis.write_object_timeseries(histogram)
    redis.write_object_timeseries(graph)

    time.sleep(1)

Time-series of Blobs

Hash as a Tree
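
As the heading suggests, a Redis hash can be shown as a tree object. Such a hash might be written, for example, with redis-py (the key name Status and the field names are arbitrary choices for this sketch):

import redis

r = redis.Redis('localhost', 6379, db=1)

# each hash field becomes a node of the tree
r.hset('Status', mapping={'Temperature': 25.3, 'Pressure': 1.2e-6})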

“Current” Histograms, Graphs, Tables & Trees

import redis

# store a "current" histogram as a JSON object (RedisJSON) under the key hist00
r = redis.Redis('localhost', 6379, 1)
hist = {
  'bins': { 'min': 0, 'max': 100 },
  'counts': [ 3, 5, 8, 14, 11, 3, 6, 4, 4, 1 ]
}
r.json().set('hist00', '$', hist)

MongoDB

Time-Series of Scalars

[TODO] implement schema-based binding

Time-Series of Histograms, Graphs, Tables & Trees

Time-Series of Blobs

CouchDB

Schema

Create a CouchDB view with the key being the time-stamp, e.g.:

function (doc) {
    const record = {
        "tag": doc.channel,
        "field01": doc.value01,
        "field02": doc.value02,
        ...
    }
    emit(doc.timestamp, record);
}

Here the record fields can be scalars, JSON, or Blob-ID. Tags are optional. Multiple views can be created. Currently only the UNIX timestamp is accepted for the key.
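
For reference, a document that such a view would index might be written with the plain CouchDB HTTP API, for example via requests (the field names follow the view function above; this is only a sketch):

import time
import requests

doc = {
    'channel': 'sccm.Alicat.Inj.Gas',   # used as the tag by the view above
    'timestamp': time.time(),           # UNIX timestamp, used as the view key
    'value01': 0.0,
    'value02': 0.0,
}

# POST-ing to the database creates a new document
requests.post('http://localhost:5984/MyDB', json=doc, auth=('USER', 'PASS'))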

The SlowDash schema description for CouchDB is similar to that of SQL DB, except that a view is used instead of a table. As timestamps are used for the keys of the view, the time information is not necessary in the schema description.

VIEW_NAME [TAG] = FIELD1, FIELD2, ...

VIEW_NAME is DESIGN_DOCUMENT_NAME/INDEX_NAME.

If a view contains only data fields and all of them are taken as channels, the schema description can be just the view name.

Write the schema in the project configuration file:

slowdash_project:
  data_source:
    url: couchdb://USER:PASS@localhost:5984/MyDB
    time_series: 
      schema: MyDesignDoc/MyIndex

Channels are scanned from the data, but old channels that do not appear in the most recent segment of the data might not be found. In that case, explicitly list the channel names:

      time_series: 
        schema: MyDesignDoc/MyIndex[Tag] = Field01, Field02, ...
        tags: 
          list: [ 'Tag01', 'Tag02', ... ]

Time-Series of Scalars

Case 1: Tag Values for Channel

      time_series: 
        schema: MyDesignDoc/MyIndex[Tag]

Case 2: Fields for Channel

      time_series: 
        schema: MyDesignDoc/MyIndex = Field01, Field02,...

Case 3: Tag and Fields for Channel

      time_series: 
        schema: MyDesignDoc/MyIndex[Tag] = Field01(default), Field02, ...

Case 4: Tag, Flags and Fields for Channel

Flags are currently not supported. Modify the CouchDB view definition so that the tag includes the flags.

Time-Series of Histograms, Graphs, Tables & Trees

Store JSON objects as data values:

function (doc) {
    const record = {
        "spectrum": {
            "labels": [ "Frequency (Hz)", "FFT" ],
            "x": [ ... ],
            "y": [ ... ]
        },
        ...
    };
    emit(doc.timestamp, record);
}

Then specify the schema in object_time_series:

slowdash_project:
  data_source:
    url: couchdb://USER:PASS@localhost:5984/MyDB
    object_time_series: 
      schema: MyDesignDoc/MyIndex

Time-Series of Blobs

function (doc) {
    const record = {
        "photo": {
            mime: doc._attachments[doc.name].content_type,
            id: doc._id + '/' + doc.name,
            meta: doc.parameters
        },
        ...
    };
    emit(doc.timestamp, record);
}

View-Rows as a Table

A CouchDB view can be accessed as a table object with view_table:

slowdash_project:
  data_source:
    url: couchdb://USER:PASS@localhost:5984/MyDB
    view_table: 
      name: DataTable
      schema: MyDesignDoc/MyIndex = Field01, Field02, ..

View-Row as a Tree

A row of a CouchDB view can be accessed as a tree object with view_tree:

slowdash_project:
  data_source:
    url: couchdb://USER:PASS@localhost:5984/MyDB
    view_tree: 
      name: DataRecord
      schema: MyDesignDoc/MyIndex = Field01, Field02, ..

Database information as a Tree

Current database information such as the file size can be obtained as a tree data object. Just make a database_info entry in the data_source definition with a name:

slowdash_project:
  data_source:
    url: couchdb://USER:PASS@localhost:5984/MyDB
    database_info: 
      name: DBInfo

Local CSV Files

Time-Series of Scalar Values

To access a table containing time-series data, write the schema in the time_series entry:

slowdash_project:
  data_source:
    url: csv:///PATH/TO/DIRECTORY
    time_series:
      schema: FILE_ROOT_NAME [ TAG_COLUMN ] @ TIME_COLUMN
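
A file in that directory matching this schema might be produced like this (a sketch; presumably the file is named after FILE_ROOT_NAME, and the file and column names below are illustrative only):

import csv, time, os

path = os.path.join('/PATH/TO/DIRECTORY', 'data_table.csv')
write_header = not os.path.exists(path)

with open(path, 'a', newline='') as f:
    writer = csv.writer(f)
    if write_header:
        # the header row gives the column names referenced by the schema
        writer.writerow(['channel', 'timestamp', 'value'])
    writer.writerow(['sccm.Alicat.Inj.Gas', time.time(), 0.0])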

Time-Series of Histograms, Graphs, Tables & Trees

“Current” Histograms, Graphs, Tables & Trees

slowdash_project:
  data_source:
    url: csv:///PATH/TO/DIRECTORY
    object_time_series:
      schema: histograms[channel]=value

Local YAML Files

slowdash_project:
  data_source:
    type: YAML
    name: RunConfig
    file: RunConfig.yaml
    valid_from: now

Local ROOT Files

[TODO]