3 posts tagged with "data"

Spice.ai v0.3.1-alpha

November 2, 2021 · 4 min read

Founder and CEO of Spice AI

We are excited to announce the release of Spice.ai v0.3.1-alpha! 🎃

This point release focuses on fixes and improvements to v0.3-alpha. Highlights include the ability to specify both seed and runtime data, the ability to select custom named fields for time and tags, a new spice upgrade command, and several bug fixes.

A special acknowledgment to @Adm28, who added the new spice upgrade command. It enables the CLI to self-update, and the updated CLI will in turn auto-update the runtime.

Highlights in v0.3.1-alpha

Upgrade command

The CLI can now be updated using the new spice upgrade command. This command will check for, download, and install the latest Spice.ai CLI release, which will become active on its next run.

When run, the CLI will check for the matching version of the Spice.ai runtime, and will automatically download and install it as necessary.

The version of both the Spice.ai CLI and runtime can be checked with the spice version CLI command.

Seed data

When working with streaming data sources, like market prices, it's often also useful to seed the dataspace with historical data. Spice.ai enables this with the new seed_data node in the dataspace configuration. The syntax is exactly the same as the data syntax. For example:

dataspaces:
  - from: coinbase
    name: btcusd
    seed_data:
      connector:
        name: file
        params:
          path: path/to/seed/data.csv
      processor:
        name: csv
    data:
      connector:
        name: coinbase
        params:
          product_ids: BTC-USD
      processor:
        name: json

The seed data will be fetched first, before the runtime data is initialized. Both sets of connectors and processors use the dataspace-scoped measurements, categories, and tags for processing, and both data sources are merged into the pod-scoped observation timeline.
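The ordering described above can be sketched in a few lines of Python. This is only an illustration of "seed first, then merge into one timeline", not Spice.ai's implementation; the Observation type, the merge_timeline function, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Observation:
    """Hypothetical observation: a Unix timestamp and a price."""
    time: int
    price: float


def merge_timeline(seed: Iterable[Observation],
                   live: Iterable[Observation]) -> List[Observation]:
    """Load seed observations first, then merge live ones in time order."""
    timeline = list(seed)      # seed data is fetched first
    timeline.extend(live)      # runtime data arrives afterwards
    # sorted() is stable, so on equal timestamps seed entries stay first.
    return sorted(timeline, key=lambda o: o.time)


seed = [Observation(100, 60000.0), Observation(200, 60500.0)]
live = [Observation(300, 61000.0), Observation(250, 60800.0)]
timeline = merge_timeline(seed, live)
print([o.time for o in timeline])
```

Sorting by time keeps the merged timeline consistent even if streamed observations arrive out of order.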

Time field selectors

Before v0.3.1-alpha, data was required to include a specific time field. In v0.3.1-alpha, the JSON and CSV data processors can select a specific field to populate the time field. For example, a selector that uses the created_at column for time is:

data:
  processor:
    name: csv
    params:
      time_selector: created_at
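As a rough illustration of what a time selector does, the following Python sketch copies a chosen CSV column into a time field. It is a standalone example, not the Spice.ai CSV processor; the rows_with_time function and the sample data are assumptions made for illustration.

```python
import csv
import io
from typing import Dict, List


def rows_with_time(csv_text: str, time_selector: str = "time") -> List[Dict[str, str]]:
    """Parse CSV text and copy the selected column into a 'time' key."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if time_selector not in row:
            raise KeyError(f"time selector column not found: {time_selector}")
        out = dict(row)
        out["time"] = row[time_selector]  # selected column populates time
        rows.append(out)
    return rows


sample = "created_at,price\n1635811200,61000\n1635811260,61050\n"
parsed = rows_with_time(sample, time_selector="created_at")
print([r["time"] for r in parsed])
```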

Tag field selectors

Before v0.3.1-alpha, tags were required to be placed in a _tags field. In v0.3.1-alpha, any field can now be selected to populate tags. Tags are pod-unique string values, and the union of all selected fields will make up the resulting tag list. For example:

dataspace:
  from: twitter
  name: tweets
  tags:
    selectors:
      - tags
      - author_id
    values:
      - spice_ai
      - spicy
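The "union of all selected fields" can be pictured with a short Python sketch. It models only the selectors part of the configuration and is not how the runtime is implemented; the collect_tags function and the sample tweet record are hypothetical.

```python
from typing import Dict, Iterable, List


def collect_tags(record: Dict[str, object], selectors: Iterable[str]) -> List[str]:
    """Return the sorted union of string values found in the selected fields."""
    tags = set()
    for field in selectors:
        value = record.get(field)
        if value is None:
            continue
        # A selected field may hold one value or a list of values.
        values = value if isinstance(value, (list, tuple, set)) else [value]
        tags.update(str(v) for v in values)
    return sorted(tags)


tweet = {"tags": ["spicy", "spice_ai"], "author_id": "12345", "text": "hello"}
tags = collect_tags(tweet, selectors=["tags", "author_id"])
print(tags)
```

Using a set means a value that appears in more than one selected field is only listed once.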

New in this release

Adds a new spice upgrade command for self-upgrade of the Spice.ai CLI.
Adds a new seed_data node to the dataspace configuration, enabling the dataspace to be seeded with an alternative source of data.
Adds the ability to select a custom time field in JSON and CSV data processors with the time_selector parameter.
Adds the ability to select custom tag fields in the dataspace configuration with the selectors list.
Adds error reporting for AI engine crashes, which previously failed silently.
Fixes the dashboard pods list "jumping" around because it was unsorted.
Fixes rare cases where categorical data might be sent to the AI engine in the wrong format.

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved. We will also be starting a community call series soon!

Discord: https://discord.gg/kZnTfneP5u
Reddit: https://www.reddit.com/r/spiceai
Twitter: @spice_ai
Email: [email protected]

Spice.ai v0.3-alpha is now available

October 26, 2021 · 6 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

We are excited to announce the release of Spice.ai v0.3-alpha! 🎉

This release adds support for ingestion, automatic encoding, and training of categorical data, enabling more use cases and datasets beyond just numerical measurements. For example, perhaps you want to learn from data that includes a category of t-shirt sizes with discrete values such as small, medium, and large. The v0.3 engine now supports this and automatically encodes the categorical string values into numerical values that the AI engine can use. Also included is a preview of data visualizations in the dashboard, which is helpful for developers as they author Spicepods and dataspaces.

A screenshot of the data visualization preview

A special acknowledgment to @sboorlagadda, who submitted the first Spice.ai feature contribution from the community ever! He added the ability to list pods from the CLI with the new spice pods list command. Thank you, @sboorlagadda!!!

A screenshot of the new spice pods list command and output.

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.3-alpha

Categorical data

In v0.1, the runtime and AI engine only supported ingesting numerical data. In v0.2, tagged data was accepted and automatically encoded into fields available for learning. In this release, v0.3, categorical data can now also be ingested and automatically encoded into fields available for learning. This is a breaking change: the manifest format now separates numerical measurements from categorical data.

Pre-v0.3, the manifest author specified numerical data using the fields node.

In v0.3, numerical data is now specified under measurements and categorical data under categories. For example:

dataspaces:
  - from: event
    name: stream
    measurements:
      - name: duration
        selector: length_of_time
        fill: none
      - name: guest_count
        selector: num_guests
        fill: none
    categories:
      - name: event_type
        values:
          - dinner
          - party
      - name: target_audience
        values:
          - employees
          - investors
    tags:
      - tagA
      - tagB
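The release notes state that v0.3 uses one-hot encoding for categorical data. As a minimal sketch of that technique (not the AI engine's code), each allowed value becomes its own 0/1 field. The one_hot function and the field naming scheme below are assumptions for illustration.

```python
from typing import Dict, List


def one_hot(name: str, value: str, allowed: List[str]) -> Dict[str, int]:
    """Encode one categorical value as 0/1 fields, one per allowed value."""
    if value not in allowed:
        raise ValueError(f"unexpected value for {name}: {value}")
    # Hypothetical naming: "<category>_<value>" for each encoded field.
    return {f"{name}_{v}": int(v == value) for v in allowed}


encoded = one_hot("event_type", "party", ["dinner", "party"])
print(encoded)
```

Exactly one field is set to 1 per category, which is why the allowed values are listed up front in the manifest.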

Data visualizations preview

A top community request was the ability to visualize data. After first running Spice.ai, developers would often ask, "How do I see the data?" A preview of data visualizations is now included in the dashboard on the pod page.

Listing pods

Once the Spice.ai runtime has started, you can view the loaded pods on the dashboard and fetch them with an API call to localhost:8000/api/v0.1/pods. To make it even easier, we've added the ability to list them via the CLI with the new spice pods list command, which shows the list of pods and their manifest paths.

Coinbase data connector

A new Coinbase data connector is included in v0.3, enabling the streaming of live market ticker prices from Coinbase Pro. Enable it by specifying the coinbase data connector and providing a list of Coinbase Pro product IDs, for example "BTC-USD". A new sample that demonstrates the connector is also available, along with its associated Spicepod, from the spicerack.org registry. Get it with spice add samples/trader.

Tweet Recommendation Quickstart

A new Tweet Recommendation Quickstart has been added. Given the past tweet activity and metrics of a Twitter account, this app can recommend when to tweet, comment, or retweet to maximize that account's like count, interaction rates, and outreach.

Trader Sample

A new Trader Sample has been added in addition to the Trader Quickstart. The sample uses the new Coinbase data connector to stream live Coinbase Pro ticker data for learning.

New in this release

Adds support for ingesting, encoding, and training on categorical data. v0.3 uses one-hot encoding.
Changes the Spicepod manifest fields node to measurements and adds the categories node.
Adds the ability to select a field from the source data and map it to a different field name in the dataspace. See an example for measurements in docs.
Adds support for JSON content type when fetching from the /observations API. Previously, only CSV was supported.
Adds a preview version of data visualizations to the dashboard. The grid has several limitations, one of which is that it currently cannot be resized.
Adds the ability to select which learning algorithm to use via the CLI, the API, or the Spicepod manifest. Possible choices are currently "vpg" (Vanilla Policy Gradient) and "dql" (Deep Q-Learning). Shout out to @corentin-pro, who added this feature on his second day on the team!
Adds the ability to list loaded pods with the CLI command spice pods list.
Adds a new coinbase data connector for Coinbase Pro market prices.
Adds a new Tweet Recommendation Quickstart.
Adds a new Trader Sample.
Fixes bug where the /observations endpoint was not providing fully qualified field names.
Fixes issue where debugging messages were printed when using spice add.

Resources

Community

Discord: https://discord.gg/kZnTfneP5u
Reddit: https://www.reddit.com/r/spiceai
Twitter: @spice_ai
Email: [email protected]

Spice.ai v0.2-alpha is now available

October 4, 2021 · 4 min read

Luke Kim

Founder and CEO of Spice AI

We are excited to announce the release of Spice.ai v0.2-alpha! 🎉

This release is the first major version since the initial v0.1 announcement and includes significant improvements based upon community and customer feedback. If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.2-alpha

Tagged data

In the first release, the runtime and AI engine could only ingest numerical data. In v0.2, tagged data is accepted and automatically encoded into fields available for learning. For example, it's now possible to include a "liked" tag when using tweet data, automatically encoded to a 0/1 field for training. Both CSV and the new JSON observation formats support tags. The v0.3 release will add support for sets of categorical data.
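To make the 0/1 encoding concrete, here is a small Python sketch in which each known tag becomes a field set to 1 when the tag is present on an observation. It is an illustration only, not the runtime's encoder; the encode_tags function and the "retweeted" tag are hypothetical.

```python
from typing import Dict, Iterable, List


def encode_tags(row_tags: Iterable[str], known_tags: List[str]) -> Dict[str, int]:
    """Map each known tag to 1 if the observation carries it, else 0."""
    present = set(row_tags)
    return {tag: int(tag in present) for tag in known_tags}


encoded_tags = encode_tags(["liked"], known_tags=["liked", "retweeted"])
print(encoded_tags)
```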

Streaming data

Previously, the runtime would trigger each data connector to fetch on a 15-second interval. In v0.2, we upgraded the interface for data connectors to a push/streaming model, which enables continuous streaming data into the environment and AI engine.
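The difference from interval polling can be sketched as a connector that pushes each new item to subscribers as soon as it arrives. This is a generic callback pattern assumed for illustration, not the Spice.ai data connector interface; the PushConnector class and its method names are hypothetical.

```python
from typing import Callable, List


class PushConnector:
    """Hypothetical connector that pushes each new payload to subscribers."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[bytes], None]] = []

    def subscribe(self, handler: Callable[[bytes], None]) -> None:
        """Register a callback to be invoked for every new payload."""
        self._handlers.append(handler)

    def publish(self, payload: bytes) -> None:
        """Deliver a payload immediately instead of waiting for a poll."""
        for handler in self._handlers:
            handler(payload)


received: List[bytes] = []
connector = PushConnector()
connector.subscribe(received.append)
connector.publish(b'{"price": 61000}')
connector.publish(b'{"price": 61050}')
print(len(received))
```

With this shape, the consumer never asks for data on a timer; it simply reacts whenever the source has something new.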

Interpreted data

Spice.ai works together with your application code and works best when it's provided continuous feedback. This feedback could come from the application itself (for example, ratings, likes, thumbs-up/down, or profit from trades) or from external expertise. The interpretations API was introduced in v0.1.1, and v0.2 adds AI engine support, providing a way to give meaning, or an interpretation, to ranges of time-series data, which are then available within reward functions. For example, a time range of stock prices could be a "good time to buy," or perhaps Tuesday mornings are a "good time to tweet." An application or expert can teach the AI engine this through interpretations, providing a shortcut to its learning.
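As a rough sketch of the idea, an interpretation can be thought of as a labeled time range that a reward function consults. The example below is an assumption-laden illustration and does not reflect the actual interpretations API or reward function syntax; the Interpretation type, labels_at, and reward are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interpretation:
    """Hypothetical labeled time range, with inclusive start and end."""
    start: int
    end: int
    label: str


def labels_at(time: int, interpretations: List[Interpretation]) -> List[str]:
    """Return the labels of all interpretations covering the given time."""
    return [i.label for i in interpretations if i.start <= time <= i.end]


def reward(action: str, time: int, interpretations: List[Interpretation]) -> float:
    """Toy reward: 1.0 for buying inside a 'good time to buy' range."""
    if action == "buy" and "good time to buy" in labels_at(time, interpretations):
        return 1.0
    return 0.0


hints = [Interpretation(100, 200, "good time to buy")]
print(reward("buy", 150, hints), reward("buy", 250, hints))
```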

New in this release

Adds core runtime and AI engine tagged data support
Adds tagged data support to the CSV processor
Adds streaming data support to the engine and data connectors
Adds a new JSON data processor for ingesting JSON data
Adds a new Twitter data connector with JSON processor support
Adds a new /pods//dataspaces API
Adds support for using interpretations in reward functions
Adds support for downloading zipped pods from the spicerack.org registry
Adds support for adding data along with the pod manifest when adding a pod from the spicerack.org registry
Adds basic /pods//diagnostics API
Fixes pod period, interval, and granularity not being correctly set when trying to use a "d" format
Fixes the color scheme of action counts in the dashboard to improve readability

Resources

Community

Discord: https://discord.gg/kZnTfneP5u
Reddit: https://www.reddit.com/r/spiceai
Twitter: @spice_ai
Email: [email protected]
