Phillip LeBlanc

Co-Founder and CTO of Spice AI

Spice.ai v0.5-alpha

December 6, 2021 · 3 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

We are excited to announce the release of Spice.ai v0.5-alpha! 🥇

Highlights include a new learning algorithm called "Soft Actor-Critic" (SAC), fixes to the behavior of spice upgrade, and a more consistent authoring experience for reward functions.

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.5-alpha

Soft Actor-Critic (Discrete) (SAC) Learning Algorithm

The addition of the Soft Actor-Critic (Discrete) (SAC) learning algorithm is a significant improvement to the power of the AI engine. It is not yet the default algorithm, so to start using it, pass the --learning-algorithm sacd parameter to spice train. We'd love to get your feedback on how it's working!

Consistent reward authoring experience

With the addition of the reward function files that allow you to edit your reward function in a Python file, the behavior of starting a new training session by editing the reward function code was lost. With this release, that behavior is restored.

In addition, there is a breaking change to the variables used to access the observation state and interpretations. This change was made to better reflect the purpose of the variables and make them easier to work with in Python.

Previous (Type)	New (Type)
`prev_state` (SimpleNamespace)	`current_state` (dict)
`prev_state.interpretations` (list)	`current_state_interpretations` (list)
`new_state` (SimpleNamespace)	`next_state` (dict)
`new_state.interpretations` (list)	`next_state_interpretations` (list)
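
To make the difference in access style concrete, here is a small sketch. The field name price and the sample values are hypothetical and not taken from a real pod; the only point is that attribute access on a SimpleNamespace becomes key access on a dict, and that interpretations move into their own variables.

```python
from types import SimpleNamespace

# Previous style: state is a SimpleNamespace and interpretations hang off it.
# The "price" field and its values are hypothetical, for illustration only.
prev_state = SimpleNamespace(price=100.0, interpretations=[])
new_state = SimpleNamespace(price=105.0, interpretations=[])
old_style_change = new_state.price - prev_state.price

# New style: state is a plain dict and interpretations are separate lists.
current_state = {"price": 100.0}
current_state_interpretations = []
next_state = {"price": 105.0}
next_state_interpretations = []
new_style_change = next_state["price"] - current_state["price"]

print(old_style_change, new_style_change)
```

Both expressions compute the same price change; only the way the state is read differs.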

Improved spice upgrade behavior

The Spice.ai CLI will no longer recommend "upgrading" to an older version. An issue was also fixed where trying to upgrade the Spice.ai CLI using spice upgrade on Linux would return an error.

New in this release

Adds a new learning algorithm called "Soft Actor-Critic" (SAC).
Updates the reward function parameters for the YAML code blocks from prev_state and new_state to current_state and next_state to be consistent with the reward function files.
Fixes an issue where editing a reward function file would not automatically trigger training.
Fixes the normalization of values for the Deep-Q Learning algorithm to handle larger values.
Fixes an issue where the Spice.ai CLI would not upgrade on Linux with the spice upgrade command.
Fixes an issue where the Spice.ai CLI would recommend an "upgrade" to an older version.

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved. We will also be starting a community call series soon!

Discord: https://discord.gg/kZnTfneP5u
Reddit: https://www.reddit.com/r/spiceai
Twitter: @spice_ai
Email: [email protected]

AI needs AI-ready data

December 5, 2021 · 6 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

A significant challenge when developing an app powered by AI is providing the machine learning (ML) engine with data in a format that it can use to learn. To do that, you need to normalize the numerical data, one-hot encode categorical data, and decide what to do with incomplete data - among other things.

This data handling is often challenging! For example, to learn from Bitcoin price data, the prices work better when normalized to a range between -1 and 1. Values very close to 0 are also a problem because of the limited precision of floating-point representations (usually for values under 1e-5).

As a developer, if you are new to AI and machine learning, a great talk that explains the basics is Machine Learning Zero to Hero. Spice.ai makes the process of getting the data into an AI-ready format easy by doing it for you!

What is AI-ready data?

You write code with if statements and functions, but your machine only understands 1s and 0s. When you write code, you leverage tools, like a compiler, to translate that human-readable code into a machine-readable format.

Similarly, data for AI needs to be translated or "compiled" to be understood by the ML engine. You may have heard of tensors before; they are simply another word for a multi-dimensional array and they are the language of ML engines. All inputs to and all outputs from the engine are in tensors. You could use the following techniques when converting (or "compiling") source data to a tensor.

Normalization/standardization of the numerical input data. Many of the inputs and outputs in machine learning are interpreted as probability distributions. Much of the math that powers machine learning, such as softmax, tanh, and sigmoid, behaves best with inputs in a small range around zero, such as [-1, 1].

Figure 1. Normalizing Bitcoin price data.
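
As a rough illustration of the normalization step, the sketch below applies plain min-max scaling to map a list of prices into the [-1, 1] range. This is one common technique and not necessarily what the Spice.ai runtime does internally; the function name normalize and the sample prices are made up for the example.

```python
def normalize(values, low=-1.0, high=1.0):
    """Min-max scale a list of numbers linearly into [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series has no spread, so map everything to the midpoint.
        return [(low + high) / 2.0 for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) / span * (high - low) for v in values]


# Hypothetical Bitcoin prices in USD, for illustration only.
prices = [30000.0, 45000.0, 60000.0]
normalized = normalize(prices)
print(normalized)
```

The smallest price maps to -1, the largest to 1, and everything else falls proportionally in between.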

Conversion of categorical data into numerical data. For categorical data (i.e., colors such as "red," "blue," or "green"), you can achieve this through a technique called "One Hot Encoding." In one hot encoding, each possible value in the category appears as a column. The values in the column are assigned a binary value of 1 or 0 depending on whether the value exists or not.

Figure 2. A visualization of one-hot encoding.
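
The one-hot encoding described above can be sketched in a few lines. The helper name one_hot is hypothetical, and the colors are the same ones used in the example text.

```python
def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 everywhere else."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


colors = ["red", "blue", "green"]

# Each color becomes one row; each possible value is one column.
encoded = {color: one_hot(color, colors) for color in colors}
for color, row in encoded.items():
    print(color, row)
```

Each row contains exactly one 1, in the column that matches the original value.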

Several advanced techniques exist for "compiling" this source data - this process is known in the AI world as "feature engineering." This article goes into more detail on feature engineering techniques if you are interested in learning more.

There are excellent tools like Pandas, Numpy, scipy, and others that make the process of data transformation easier. However, most of these tools are Python libraries and frameworks - which means having to learn Python if you don't know it already. Plus, when building intelligent apps (instead of just doing pure data analysis), this all needs to work on real-time data in production.

Building intelligent apps

The tools mentioned above are not designed for building real-time apps. They are often designed for analytics/data science.

In your app, you will need to do this data compilation in real-time - and you can't rely on a local script to help process your data. It becomes trickier if the team responsible for the initial training of the machine learning model is not the team responsible for deploying it out into production.

How data is loaded and processed in a static dataset is likely very different from how the data is loaded and processed in real-time while your app is live. The result is often two separate codebases, maintained by different teams, that are both responsible for doing the same thing! Ensuring that those codebases stay consistent and evolve together is another challenge to tackle.

Spice.ai helps developers build apps with real-time ML

Spice.ai handles the "compilation" of data for you.

You specify the data that your ML should learn from in a Spicepod. The Spice.ai runtime handles the logistics of gathering the data and compiling it into an AI-ready format.

It does this by using many techniques described earlier, such as normalization and one-hot encoding. And because we're continuing to evolve Spice.ai, our data compilation will only get better over time.

In addition, the design of the Spice.ai runtime naturally ensures that the data used for both the training and real-time cases is consistent. Spice.ai uses the same data components and runtime logic to produce the data. You can also take this a step further and share your Spicepod with someone else, and they would be able to use the same AI-ready data for their applications.

Summary

Spice.ai handles the process of compiling your data into an AI-ready format in a way that is consistent both during the training and real-time stages of the ML engine. A Spicepod defines which data to get and where to get it. Sharing this Spicepod allows someone else to use the same AI-ready data format in their application.

Learn more and contribute

Building intelligent apps that leverage AI is still way too hard, even for advanced developers. Our mission is to make this as easy as creating a modern web page. If the vision resonates with you, join us!

Our Spice.ai Roadmap is public, and now that we have launched, the project and work are open for collaboration.

If you are interested in partnering, we'd love to talk. Try out Spice.ai, email us "hey," get in touch on Discord, or reach out on Twitter.

We are just getting started! 🚀

Phillip

Spice.ai v0.4-alpha

November 15, 2021 · 4 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

We are excited to announce the release of Spice.ai v0.4-alpha! 🏄‍♂️

Highlights include support for authoring reward functions in a code file, the ability to specify the time of recommendation, and ingestion support for transaction/correlation ids. Authoring reward functions in a code file is a significant improvement to the developer experience over specifying functions inline in the YAML manifest, and we are looking forward to your feedback on it!

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.4-alpha

Upgrade using spice upgrade

The spice upgrade command was added in the v0.3.1-alpha release, so you can now upgrade from v0.3.1 to v0.4 by simply running spice upgrade in your terminal. Special thanks to community member @Adm28 for contributing this feature!

Reward Function Files

In addition to defining reward code inline, it is now possible to author reward code in functions in a separate Python file.

The reward function file path is defined by the reward_funcs property.

A function defined in the code file is mapped to an action by authoring its name in the with property of the relevant reward.

Example:

training:
  reward_funcs: my_reward.py
  rewards:
    - reward: buy
      with: buy_reward
    - reward: sell
      with: sell_reward
    - reward: hold
      with: hold_reward

Learn more in the documentation: docs.spiceai.org/concepts/rewards/external
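
For orientation, a my_reward.py matching the manifest above might look roughly like the sketch below. The four-parameter signature is an assumption based on the variable names listed in the v0.5-alpha notes, and the price field and the reward logic are hypothetical; the linked documentation is the authority on the exact contract.

```python
# my_reward.py (hypothetical sketch, not taken from the Spice.ai docs)

def buy_reward(current_state: dict, current_state_interpretations: list,
               next_state: dict, next_state_interpretations: list) -> float:
    # Assumed logic: buying is rewarded when the price rises afterwards.
    return next_state["price"] - current_state["price"]


def sell_reward(current_state: dict, current_state_interpretations: list,
                next_state: dict, next_state_interpretations: list) -> float:
    # Assumed logic: selling is rewarded when the price falls afterwards.
    return current_state["price"] - next_state["price"]


def hold_reward(current_state: dict, current_state_interpretations: list,
                next_state: dict, next_state_interpretations: list) -> float:
    # Assumed logic: holding is neutral.
    return 0.0


# Quick local check with made-up states.
before, after = {"price": 100.0}, {"price": 105.0}
print(buy_reward(before, [], after, []))
print(sell_reward(before, [], after, []))
print(hold_reward(before, [], after, []))
```

Each function name matches the value of a with property in the manifest, which is how a reward is tied to its action.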

Time Categories

Spice.ai can now learn from cyclical patterns, such as daily, weekly, or monthly cycles.

To enable automatic cyclical field generation from the observation time, specify one or more time categories in the pod manifest, such as a month or weekday in the time section.

For example, by specifying month, the Spice.ai engine automatically creates a field in the AI engine data stream called time_month_{month}, with the value calculated from the month to which that timestamp relates.

Example:

time:
  categories:
    - month
    - dayofweek

Supported category values are: month, dayofmonth, dayofweek, and hour.
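
To give a feel for what a time category produces, the sketch below derives one field per month from a Unix timestamp. It is only an illustration: the assumptions are that {month} is the month number from 1 to 12, that the timestamp is interpreted in UTC, and that the value is 1 for the matching month and 0 otherwise. The engine's actual naming and encoding may differ.

```python
from datetime import datetime, timezone


def month_fields(unix_timestamp: int) -> dict:
    """Build hypothetical time_month_{month} fields for one observation time."""
    month = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc).month
    # One field per month; only the month of the timestamp is set to 1.
    return {f"time_month_{m}": int(m == month) for m in range(1, 13)}


# 1605729600 is 2020-11-18 20:00:00 UTC, so the November field is set.
fields = month_fields(1605729600)
print(fields)
```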

Learn more in the documentation: docs.spiceai.org/reference/pod/#time

Get recommendation for a specific time

It is now possible to specify the time of recommendations fetched from the /recommendation API.

Valid times are from pod epoch_time to epoch_time + period.

Previously the API only supported recommendations based on the time of the last ingested observation.

Requests are made in the following format: GET http://localhost:8000/api/v0.1/pods/{pod}/recommendation?time={unix_timestamp}

An example for quickstarts/trader:

GET http://localhost:8000/api/v0.1/pods/trader/recommendation?time=1605729600

Specifying {unix_timestamp} as 0 will return a recommendation based on the latest data. An invalid {unix_timestamp} will return a result that has the valid time range in the error message:

{
  "response": {
    "result": "invalid_recommendation_time",
    "message": "The time specified (1610060201) is outside of the allowed range: (1610057600, 1610060200)",
    "error": true
  }
}
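
A client might build the request URL and pre-check the time along the following lines. The helper names are hypothetical, and treating both ends of the range as inclusive is an assumption; the source only shows that a time one second past the upper bound is rejected. No request is sent here, so the sketch runs without a Spice.ai runtime.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000/api/v0.1"


def recommendation_url(pod: str, unix_timestamp: int = 0) -> str:
    """Build the recommendation URL; a time of 0 means "latest data"."""
    query = urlencode({"time": unix_timestamp})
    return f"{BASE_URL}/pods/{quote(pod, safe='')}/recommendation?{query}"


def is_valid_time(unix_timestamp: int, epoch_time: int, period_seconds: int) -> bool:
    """Check a time against [epoch_time, epoch_time + period] (assumed inclusive)."""
    if unix_timestamp == 0:
        return True
    return epoch_time <= unix_timestamp <= epoch_time + period_seconds


url = recommendation_url("trader", 1605729600)
print(url)

# Mirrors the error example above: 1610060201 is one second past the range.
print(is_valid_time(1610060201, 1610057600, 2600))
```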

New in this release

Adds time categories configuration to the pod manifest to enable learning from cyclical patterns in data - e.g. hour, day of week, day of month, and month.
Adds support for defining reward functions in a rewards functions code file.
Adds the ability to specify recommendation time making it possible to now see which action Spice.ai recommends at any time during the pod period.
Adds support for ingestion of transaction/correlation identifiers (e.g. order_id, trace_id) in the pod manifest.
Adds validation for invalid dataspace names in the pod manifest.
Adds the ability to resize columns to the dashboard observation data grid.
Updates to TensorFlow 2.7 and Keras 2.7.
Fixes a bug where data processors were using data connector params.
Fixes a dashboard issue in the pod observations data grid where a column might not be shown.
Fixes a crash on pod load if the training section is not included in the manifest.
Fixes an issue where data manager stats errors were incorrectly being printed to console.
Fixes an issue where selectors may not match due to surrounding whitespace.

Resources

Community

Discord: https://discord.gg/kZnTfneP5u
Reddit: https://www.reddit.com/r/spiceai
Twitter: @spice_ai
Email: [email protected]

Spice.ai v0.3-alpha is now available

October 26, 2021 · 6 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

We are excited to announce the release of Spice.ai v0.3-alpha! 🎉

This release adds support for ingestion, automatic encoding, and training of categorical data, enabling more use-cases and datasets beyond just numerical measurements. For example, perhaps you want to learn from data that includes a category of t-shirt sizes, with discrete values, such as small, medium, and large. The v0.3 engine now supports this and automatically encodes the categorical string values into numerical values that the AI engine can use. Also included is a preview of data visualizations in the dashboard, which is helpful for developers as they author Spicepods and dataspaces.

A screenshot of the data visualization preview

A special acknowledgment to @sboorlagadda, who submitted the first Spice.ai feature contribution from the community ever! He added the ability to list pods from the CLI with the new spice pods list command. Thank you, @sboorlagadda!!!

A screenshot of the new spice pods list command and output.

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.3-alpha

Categorical data

In v0.1, the runtime and AI engine only supported ingesting numerical data. In v0.2, tagged data was accepted and automatically encoded into fields available for learning. In this release, v0.3, categorical data can now also be ingested and automatically encoded into fields available for learning. This is a breaking change: the format of the manifest now separates numerical measurements from categorical data.

Pre-v0.3, the manifest author specified numerical data using the fields node.

In v0.3, numerical data is now specified under measurements and categorical data under categories. For example:

dataspaces:
  - from: event
    name: stream
    measurements:
      - name: duration
        selector: length_of_time
        fill: none
      - name: guest_count
        selector: num_guests
        fill: none
    categories:
      - name: event_type
        values:
          - dinner
          - party
      - name: target_audience
        values:
          - employees
          - investors
    tags:
      - tagA
      - tagB

Data visualizations preview

A top piece of community feedback was the ability to visualize data. After first running Spice.ai, we'd often hear from developers, "how do I see the data?". A preview of data visualizations is now included in the dashboard on the pod page.

Listing pods

Once the Spice.ai runtime has started, you can view the loaded pods on the dashboard and fetch them via API call localhost:8000/api/v0.1/pods. To make it even easier, we've added the ability to list them via the CLI with the new spice pods list command, which shows the list of pods and their manifest paths.

Coinbase data connector

A new Coinbase data connector is included in v0.3, enabling the streaming of live market ticker prices from Coinbase Pro. Enable it by specifying the coinbase data connector and providing a list of Coinbase Pro product ids, e.g. "BTC-USD". A new sample that demonstrates the connector is also available, with its associated Spicepod available from the spicerack.org registry. Get it with spice add samples/trader.

Tweet Recommendation Quickstart

A new Tweet Recommendation Quickstart has been added. Given past tweet activity and metrics of a given account, this app can recommend when to tweet, comment, or retweet to maximize for like count, interaction rates, and outreach of said given Twitter account.

Trader Sample

A new Trader Sample has been added in addition to the Trader Quickstart. The sample uses the new Coinbase data connector to stream live Coinbase Pro ticker data for learning.

New in this release

Adds support for ingesting, encoding, and training on categorical data. v0.3 uses one-hot-encoding.
Changes the Spicepod manifest fields node to measurements and adds the categories node.
Adds the ability to select a field from the source data and map it to a different field name in the dataspace. See an example for measurements in docs.
Adds support for JSON content type when fetching from the /observations API. Previously, only CSV was supported.
Adds a preview version of data visualizations to the dashboard. The grid has several limitations, one of which is it currently cannot be resized.
Adds the ability to select which learning algorithm to use via the CLI, the API, or the Spicepod manifest. Possible choices are currently "vpg" (Vanilla Policy Gradient) and "dql" (Deep Q-Learning). Shout out to @corentin-pro, who added this feature on his second day on the team!
Adds the ability to list loaded pods with the CLI command spice pods list.
Adds a new coinbase data connector for Coinbase Pro market prices.
Adds a new Tweet Recommendation Quickstart.
Adds a new Trader Sample.
Fixes bug where the /observations endpoint was not providing fully qualified field names.
Fixes issue where debugging messages were printed when using spice add.

Resources

Community

Discord: https://discord.gg/kZnTfneP5u
Reddit: https://www.reddit.com/r/spiceai
Twitter: @spice_ai
Email: [email protected]
