Tensors: Introduction

[Image: a 3x3 array of hand-drawn cubes representing a 2D array of 3D objects]

A tensor CAN be an array of arrays (of arrays of arrays of arrays)

What is a Tensor?

Note: I’m referring to tensors only from the perspective of ML, not physics or any other discipline, so I’d look elsewhere if you’re not here for information about ML. Continuing on!

This introduction from Machine Learning Mastery is very thorough, and this article from Wikipedia covers what tensors are in unbelievable (and somewhat incomprehensible) detail as well. But in most of the articles I’ve read, I’ve had trouble really understanding what they actually are. So let’s resolve that.

First, let’s list some things that tensors aren’t:

  • A single ‘node’ in an ML layer or neural network

  • ONLY an input to a model

  • ONLY an output of a model

  • ONLY a data container that holds information

How I Imagine What Tensors Are

Tensors are a way to generalize the idea of representing a set of values across a varying number of dimensions, depending on the application. These dimensions can range from 0D (a single scalar) through 1D, 2D, 3D, and beyond.

Similar to how we can do mathematical operations on vectors and matrices, we can do the same with tensors. That’s actually the point of them: structuring values this way allows us to define complex relationships and carry out computations across these dimensions.

So I like to think of a tensor as a “highly customizable value container” that is set up in a specific way so that it can have relationships to other functions, utilities, or other “value containers”.
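
To make the “value container” idea concrete, here’s a minimal sketch using PyTorch (any tensor library would look similar; the shapes below are just illustrative assumptions):

```python
import torch

# A 4D "value container": a batch of 32 RGB images, each 28x28 pixels.
images = torch.rand(32, 3, 28, 28)

# The same kind of container works with fewer dimensions:
grayscale = torch.rand(28, 28)   # 2D
pixel_row = torch.rand(28)       # 1D

# Computations run across all the dimensions in one call,
# e.g. normalizing the entire batch at once:
normalized = (images - images.mean()) / images.std()
print(normalized.shape)  # torch.Size([32, 3, 28, 28])
```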

It’s almost like an electrical plug

A tensor on its own can be very simple, like a standard household electrical plug. But it can also be extremely complex. I’ve seen electrical plugs that have as many as 64 pins. On their own, those pins are useless, the same as an empty tensor with defined dimensions. You don’t really know what happens before or after either of them without the full picture.

Thinking about it, there are actually a lot of things they share in common:

  1. Modularity - you can change devices with a plug, and you can change datasets via tensors.

  2. Standardization - plugs follow an electrical standard, and tensors have standard libraries and methods for how to use them

  3. Versatility - plugs carry a diverse range of electrical signals, and tensors can represent all sorts of “algebraic objects”

  4. Information Transmission - plugs can carry data transmission (Ethernet, USB, etc.), and a tensor’s primary job in ML is to contain data in specific shapes and orientations as it passes through the system.

  5. Hidden Complexity - plugs don’t reveal anything about the circuit they are connected to on their own, and neither do tensors!

  6. Dimensionality - a plug has a number of pins that represents its “dimensionality”, and a tensor can theoretically have anywhere from 0 dimensions to infinitely many. You’d be limited by the hardware before the concept of a tensor would limit you. I suppose the same is true for a plug too though!

That Sounds a lot like a Matrix

It does, and also a vector and a scalar. And I think you wouldn’t be wrong to call it any of those names so long as the dimensions match up. Vectors are 1D, matrices are 2D. Tensors can have any of these dimensions, and even more.

In each of these setups, the tensor contains data that is meant to have linear algebra done on it, the same as matrices or vectors or scalars. So the tensor is just a flexible structure that can take on whatever dimensions the application needs, while the math carried out on it, and the relationships established around it, are standardized in the practice of ML.
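
As a quick sketch of that relationship (again in PyTorch, with arbitrary example values), a scalar, a vector, and a matrix are just tensors of rank 0, 1, and 2, and the same linear algebra applies to all of them:

```python
import torch

scalar = torch.tensor(2.0)                    # rank 0
vector = torch.tensor([1.0, 2.0, 3.0])        # rank 1
matrix = torch.tensor([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])      # rank 2 (a 3x3 identity matrix)

print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2

# The same linear algebra works on all of them:
print(scalar * vector)   # scalar multiplication -> tensor([2., 4., 6.])
print(matrix @ vector)   # matrix-vector product -> tensor([1., 2., 3.])
```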

Tensor Operations and Properties

Tensors (like all math stuff) come along with their own set of jargon that describes things about them.

Let’s review some of their properties first:

  1. Rank (order): this describes how many dimensions a tensor has

  2. Shape: this describes how many members or values there are in each dimension

  3. Size: the complete number of members in the tensor (dimension1 x dimension2 x dimension3 x …)

  4. Data Type: the type of data that is contained within the tensor (int, float, bool, etc)
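
Here’s what those four properties look like on a small example tensor in PyTorch (the shape and data type are just illustrative):

```python
import torch

t = torch.zeros(2, 3, 4, dtype=torch.float32)

print(t.ndim)     # rank (order): 3
print(t.shape)    # shape: torch.Size([2, 3, 4])
print(t.numel())  # size: 24 (2 x 3 x 4)
print(t.dtype)    # data type: torch.float32
```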

Then let’s take a look at some types of operations that can be done on them:

  1. Basic Arithmetic - Add, Subtract, Scalar ops

  2. Reduction Operations - Sum, Mean, Max/Min

  3. Reordering Operations - Transpose, Squeeze, Reshape

  4. Linear Algebra - To me this overlaps a bit with Basic Arithmetic, but of course there are some big powers in the branch of Linear Algebra not covered there (matrix multiplication, dot products, and so on).

  5. Decompositions - I don’t really understand Decompositions yet so I’m not going to talk about them, but I assume it’s something that’s possible because it’s in that wiki article.

  6. Other Tasks - Stacking, Concatenating, Norms
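
And here’s a rough sketch of a few of those operation families in PyTorch (decompositions left out, matching the list above):

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

# 1. Basic arithmetic
print(a + b)        # elementwise addition
print(a * 2.0)      # scalar multiplication

# 2. Reduction operations
print(a.sum(), a.mean(), a.max())

# 3. Reordering operations
print(a.T)            # transpose
print(a.reshape(4))   # reshape into a 1D tensor of 4 elements

# 4. Linear algebra
print(a @ b)          # matrix multiplication

# 6. Other tasks
print(torch.stack([a, b]).shape)       # stacking -> torch.Size([2, 2, 2])
print(torch.cat([a, b], dim=0).shape)  # concatenating -> torch.Size([4, 2])
print(torch.linalg.norm(a))            # Frobenius norm
```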

That’s a lot

and that’s part of why they are so hard to describe succinctly. They do so many things, take so many different shapes, and contain data of so many different types that there just isn’t a good 1:1 analog in the everyday world that can encapsulate the capabilities of the tensor.

Conclusion

Tensors are a way to contain sets of data with a highly specific Size, Shape, and Rank (dimensions) so that tasks can be carried out with less computational expense, or so data can be organized more conveniently as the output of upstream functions or the input of downstream functions.

It’s an interface, a container, and a relationship describer all in one. It doesn’t have a great 1:1 analog in the real world, but shares quite a lot of similarities with an electrical plug.

And lastly, it’s an absolutely fundamental and foundational concept to understand prior to getting too deep into the weeds with ML systems.
