Lifting is the common term for this operation in functional programming (e.g. in Haskell); it probably has some roots in lambda calculus.
@hojaelee, during broadcasting the shape matching of the two inputs X and Y happens in reverse order, i.e. starting from the -1
axis. This (negative indexing) is also the preferred way to index an ndarray
or any NumPy-based tensor (in PyTorch or TF) instead of positive indexing; this way you will always know the correct shapes.
Consider this example:
import torch
X = torch.arange(12).reshape((12)) ## X.shape = [12]
Y = torch.arange(12).reshape((1,12)) ## Y.shape = [1,12]
Z = X+Y ## Z.shape = [1,12]
and contrast the above example with the one below:
import torch
X = torch.arange(12).reshape((12)) ## X.shape = [12]
Y = torch.arange(12).reshape((12,1)) ## Y.shape = [12, 1] <--- NOTE
Z = X+Y ## Z.shape = [12,12] <--- NOTE
In both of the examples above, broadcasting follows a very simple rule (a sketch in plain Python follows this list):
- Compare shapes from RIGHT to LEFT (i.e. negative indexing) instead of the conventional left-to-right order.
- If at any point the shape values mismatch:
  - (2.1): If either of the two values is 1, inflate that tensor along this axis to the OTHER value.
  - (2.2): Else, throw ERROR("dimension mismatch").
- Else, CONTINUE moving LEFT.
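A minimal sketch of that rule in plain Python (broadcast_shape here is just an illustrative helper, not a PyTorch function):
from itertools import zip_longest

def broadcast_shape(shape_x, shape_y):
    """Walk both shapes right-to-left and apply the rule above."""
    out = []
    # zip_longest pads the shorter shape with 1s, covering axes that do not exist
    for dx, dy in zip_longest(reversed(shape_x), reversed(shape_y), fillvalue=1):
        if dx == dy or dx == 1 or dy == 1:
            out.append(max(dx, dy))  # a size-1 axis is inflated to the other value
        else:
            raise ValueError(f"dimension mismatch: {dx} vs {dy}")
    return tuple(reversed(out))

print(broadcast_shape((12,), (1, 12)))   # (1, 12)
print(broadcast_shape((12,), (12, 1)))   # (12, 12)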
Hope it helps.
If anyone has any confusion related to broadcasting, this is how it actually looks in NumPy
(taken from the Python Data Science Handbook).
I’ve checked this information, but I have obtained a different result:
1. Run the code in this section. Change the conditional statement X == Y to X < Y or X > Y, and then see what kind of tensor you can get.
X = torch.arange(15).reshape(5,3)
Y = torch.arange(15, 0, -1).reshape(5,3)
X == Y, X > Y, X < Y
(tensor([[False, False, False],
[False, False, False],
[False, False, False],
[False, False, False],
[False, False, False]]),
tensor([[False, False, False],
[False, False, False],
[False, False, True],
[ True, True, True],
[ True, True, True]]),
tensor([[ True, True, True],
[ True, True, True],
[ True, True, False],
[False, False, False],
[False, False, False]]))
2. Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?
X = torch.arange(8).reshape(4, 2, 1)
Y = torch.arange(8).reshape(1, 2, 4) ## Y.shape = [1,2,4]
print(f"{X}, \n\n\n{Y}, \n\n\n{X + Y}")
tensor([[[0],
[1]],
[[2],
[3]],
[[4],
[5]],
[[6],
[7]]]),
tensor([[[0, 1, 2, 3],
[4, 5, 6, 7]]]),
tensor([[[ 0, 1, 2, 3],
[ 5, 6, 7, 8]],
[[ 2, 3, 4, 5],
[ 7, 8, 9, 10]],
[[ 4, 5, 6, 7],
[ 9, 10, 11, 12]],
[[ 6, 7, 8, 9],
[11, 12, 13, 14]]])
Yes, the result matches what I expected, as well as what I learned in this notebook.
Exercise-2. Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?
I understand this error in principle, but can someone clarify objectively what “non-singleton dimension” means?
c = torch.arange(6).reshape((3, 1, 2))
e = torch.arange(8).reshape((8, 1, 1))
c, e
(tensor([[[0, 1]],
[[2, 3]],
[[4, 5]]]),
tensor([[[0]],
[[1]],
[[2]],
[[3]],
[[4]],
[[5]],
[[6]],
[[7]]]))
c + e
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In [53], line 1
----> 1 c + e
RuntimeError: The size of tensor a (3) must match the size of tensor b (8) at non-singleton dimension 0
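For what it's worth, a "non-singleton dimension" is just an axis whose size is not 1. Here axis 0 has sizes 3 and 8, and since neither is 1, broadcasting cannot reconcile them. A small sketch (my own reshape of e, not part of the original post) showing that the addition works once the mismatched axis of one tensor becomes a singleton:
import torch

c = torch.arange(6).reshape((3, 1, 2))
e = torch.arange(8).reshape((8, 1, 1))

# Axis 0 has sizes 3 and 8; neither is a singleton (size 1), hence the error above.
# If e's size-8 axis is moved so it no longer collides with c's size-3 axis,
# every mismatched pair contains a 1 and broadcasting succeeds.
e2 = e.reshape((1, 8, 1))
print((c + e2).shape)  # torch.Size([3, 8, 2])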
in: (X>Y).dtype
out: torch.bool
in: X = torch.arange(12, dtype=torch.float32).reshape(3,4)
Y = torch.tensor([[1, 4, 3, 5]])
X.shape, Y.shape
(torch.Size([3, 4]), torch.Size([1, 4]))
Explanation of broadcasting (a concrete example follows these two rules):
Each tensor has at least one dimension.
When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
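To make that concrete with the shapes above (a small illustrative sketch, not from the original post):
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)  # shape [3, 4]
Y = torch.tensor([[1, 4, 3, 5]])                          # shape [1, 4]
Z = torch.tensor([10., 20., 30., 40.])                    # shape [4]

# Trailing axis: 4 == 4. Next axis: 3 vs 1, so Y is stretched to 3 rows.
print((X + Y).shape)  # torch.Size([3, 4])

# Z has no axis 0 at all ("does not exist"), so it is treated as [1, 4].
print((X + Z).shape)  # torch.Size([3, 4])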
This code:
before = id(X)
X += Y
id(X) == before
does not return True for me. I asked ChatGPT and it says this does not adjust the variable in place.
What am I doing wrong?
Thanks!
EDIT: It seems this only works with lists, not regular variables. Is this where I went wrong? Thanks!
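For anyone else hitting this: with a PyTorch tensor, X += Y does update in place (so id(X) stays the same), while with a plain Python int or float, += rebinds the name to a new object and the id changes. A minimal sketch of the difference, assuming X and Y are tensors of the same shape as in this section:
import torch

X = torch.arange(12, dtype=torch.float32).reshape((3, 4))
Y = torch.ones((3, 4))

before = id(X)
X += Y                  # in-place for tensors: same object, same storage
print(id(X) == before)  # True

n = 5
before = id(n)
n += 1                  # ints are immutable, so the name is rebound to a new object
print(id(n) == before)  # False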
Ex1.
import torch
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
X < Y
Output:
tensor([[ True, False, True, False],
[False, False, False, False],
[False, False, False, False]])
X > Y
Output:
tensor([[False, False, False, False],
[ True, True, True, True],
[ True, True, True, True]])
- As expected, the operators > and < perform element-wise comparison operations on two tensors of the same shape, as per the documentation.
Ex2.
- The broadcasting scheme expands the dimensions by copying the elements along length-1 axes, so that a binary operation can be feasible.
- Along each trailing dimension, the dimension sizes must either be: (1) equal, (2) one of them is 1, or (3) one of them does not exist.
- Take the example of a with shape (3, 1, 3) and b with shape (1, 3, 1); then the addition c = a + b yields a tensor of shape (3, 3, 3), defined element-wise as c[i, j, k] = a[i, 0, k] + b[0, j, 0].
a = torch.arange(9).reshape((3, 1, 3))
b = torch.arange(3).reshape((1, 3, 1))
a, b
Output:
(tensor([[[0, 1, 2]],
[[3, 4, 5]],
[[6, 7, 8]]]),
tensor([[[0],
[1],
[2]]]))
c = a + b
c
Output:
tensor([[[ 0, 1, 2],
[ 1, 2, 3],
[ 2, 3, 4]],
[[ 3, 4, 5],
[ 4, 5, 6],
[ 5, 6, 7]],
[[ 6, 7, 8],
[ 7, 8, 9],
[ 8, 9, 10]]])
# If not that straightforward to see, let's try an explicit broadcasting scheme.
c1 = torch.zeros((3, 3, 3))
for i in range(3):
    for j in range(3):
        for k in range(3):
            c1[i, j, k] = a[i, 0, k] + b[0, j, 0]
c1 - c
Output:
tensor([[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]]])
In Saving Memory the text mentions two reasons that creating new spaces in memory to store variables might be undesirable:
First, we do not want to run around allocating memory unnecessarily all the time. In machine learning, we often have hundreds of megabytes of parameters and update all of them multiple times per second. Whenever possible, we want to perform these updates in place . Second, we might point at the same parameters from multiple variables. If we do not update in place, we must be careful to update all of these references, lest we spring a memory leak or inadvertently refer to stale parameters.
I don’t understand the second reason. Can someone provide an example? When would you point at the same parameters from multiple variables and what does this look like?
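Not the author, but here is a small sketch of the kind of aliasing the text means (the variable names are just for illustration). Think of two names that refer to the same parameter tensor, e.g. one held by the model and one held by an optimizer. An in-place update keeps both views consistent; rebinding one name leaves the other pointing at stale values:
import torch

params = torch.zeros(3)   # the parameter tensor, e.g. owned by a model
params_ref = params       # a second reference, e.g. held by an optimizer

# In-place update: both names still see the new values.
params += 1.0
print(params_ref)         # tensor([1., 1., 1.])

# Rebinding instead of updating in place: params now points at a NEW tensor,
# while params_ref still points at the old (now stale) one.
params = params + 1.0
print(params, params_ref) # tensor([2., 2., 2.]) tensor([1., 1., 1.])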
np.ones() gives only ones as entries, so the above diagram is not correct.
Here is a sample:
v = np.ones((3, 1))
v
array([[1.],
[1.],
[1.]])
check it out
Thanks for including that. You can understand the concept instantly from the visual description.
import torch
x = torch.arange(12, dtype=torch.float32).reshape(3, 4)
y = torch.tensor([[2, 6, 7, 8], [1, 2, 3, 4], [4, 3, 2, 1]])
x < y, x > y, x == y
(tensor([[ True, True, True, True],
[False, False, False, False],
[False, False, False, False]]),
tensor([[False, False, False, False],
[ True, True, True, True],
[ True, True, True, True]]),
tensor([[False, False, False, False],
[False, False, False, False],
[False, False, False, False]]))
I went through these exercises a week or so ago, but I recall:
- X < Y or X > Y yields a boolean tensor which is the result of element-wise inequality operations.
- I don’t recall being surprised, but I had already read through the PyTorch documentation on broadcasting semantics. Of note: the tensors are aligned starting at the trailing dimension (see the sketch right below).
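A quick sketch of that trailing-dimension alignment with tensors of different ranks (my own example, not from the exercises):
import torch

A = torch.zeros((2, 3, 4))
B = torch.arange(4)       # shape [4]: aligned against A's last axis

# B is treated as if it had shape [1, 1, 4], then stretched to [2, 3, 4].
print((A + B).shape)      # torch.Size([2, 3, 4])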
Exercise 1
import torch
# Rewriting the tensors created in section 2.1.3 Operations
X = torch.arange(12, dtype=torch.float64).reshape((3, 4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
X, Y, X == Y, X < Y, X > Y
(tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]], dtype=torch.float64),
tensor([[2., 1., 4., 3.],
[1., 2., 3., 4.],
[4., 3., 2., 1.]]),
tensor([[False, True, False, True],
[False, False, False, False],
[False, False, False, False]]),
tensor([[ True, False, True, False],
[False, False, False, False],
[False, False, False, False]]),
tensor([[False, False, False, False],
[ True, True, True, True],
[ True, True, True, True]]))
Exercise 2
# Rewriting and modifying the tensors created in section 2.1.4 Broadcasting
a = torch.arange(12).reshape((2, 1, 6))
b = torch.arange(4).reshape((1, 4, 1))
c = a + b
a, b, c, a.shape, b.shape, c.shape
(tensor([[[ 0, 1, 2, 3, 4, 5]],
[[ 6, 7, 8, 9, 10, 11]]]),
tensor([[[0],
[1],
[2],
[3]]]),
tensor([[[ 0, 1, 2, 3, 4, 5],
[ 1, 2, 3, 4, 5, 6],
[ 2, 3, 4, 5, 6, 7],
[ 3, 4, 5, 6, 7, 8]],
[[ 6, 7, 8, 9, 10, 11],
[ 7, 8, 9, 10, 11, 12],
[ 8, 9, 10, 11, 12, 13],
[ 9, 10, 11, 12, 13, 14]]]),
torch.Size([2, 1, 6]),
torch.Size([1, 4, 1]),
torch.Size([2, 4, 6]))
1.
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
print(X,Y,X==Y, X<Y, X>Y)
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
tensor([[2., 1., 4., 3.],
[1., 2., 3., 4.],
[4., 3., 2., 1.]])
tensor([[False, True, False, True],
[False, False, False, False],
[False, False, False, False]])
tensor([[ True, False, True, False],
[False, False, False, False],
[False, False, False, False]])
tensor([[False, False, False, False],
[ True, True, True, True],
[ True, True, True, True]])
2.
a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a_3d = a.reshape((3,1,1))
b_3d = b.reshape((1,2,1))
print(a_3d, b_3d)
print(a_3d+b_3d)
tensor([[[0]],
[[1]],
[[2]]])
tensor([[[0],
[1]]])
tensor([[[0],
[1]],
[[1],
[2]],
[[2],
[3]]])
Looks correct; kindly tell me if something’s missing, as I am new to this.
Chapter 2.1
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
torch.cat((X, Y), dim=0)
Output:
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.],
[ 2., 1., 4., 3.],
[ 1., 2., 3., 4.],
[ 4., 3., 2., 1.]])
X > Y
Output:
tensor([[False, False, False, False],
[ True, True, True, True],
[ True, True, True, True]])
X < Y
tensor([[ True, False, True, False],
[False, False, False, False],
[False, False, False, False]])
Where elements of the tensors are equal, the result is False in both cases.
Take 2 3D tensors:
A = torch.tensor([[[1, 2]], [[3, 4]], [[5, 6]]])
B = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])
A + B
Output:
tensor([[[ 2,  4],
[ 4,  6]],
[[ 8, 10],
[10, 12]],
[[14, 16],
[16, 18]]])
Explanation:
A’s shape is (3,1,2)
B’s shape is (3,2,2)
Before summing up, A is broadcast along the 2nd dimension:
A = tensor([[[1, 2], [1, 2]],
[[3, 4], [3, 4]],
[[5, 6], [5, 6]]])
Then the usual element-by-element summation follows.
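One way to see that broadcast-expanded A explicitly is torch.broadcast_tensors, which returns the expanded views (a small verification sketch, not part of the original post):
import torch

A = torch.tensor([[[1, 2]], [[3, 4]], [[5, 6]]])   # shape (3, 1, 2)
B = torch.tensor([[[1, 2], [3, 4]],
                  [[5, 6], [7, 8]],
                  [[9, 10], [11, 12]]])            # shape (3, 2, 2)

A_expanded, _ = torch.broadcast_tensors(A, B)      # A's single row repeated along dim 1
print(A_expanded)
print(torch.equal(A + B, A_expanded + B))          # True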
Using candle in Rust to show tensor data manipulation:
use candle_core::{Device, Tensor};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create tensors on the CPU and on the Metal GPU device.
    let cpu = Device::Cpu;
    let g = Tensor::arange::<f32>(0., 12., &cpu)?;
    println!("cpu g = {g}");
    let g = g.reshape((3, 4))?;
    let gpu = Device::new_metal(0)?;
    let x = Tensor::arange::<f32>(0., 12., &gpu)?;
    println!("metal x = {x}");
    println!("x element count = {}", x.elem_count());
    println!("x shape = {:?}", x.shape());
    let x = x.reshape((3, 4))?;
    println!("x after reshape is\n{}, shape is {:?}", x, x.shape());

    // Construction helpers: zeros, ones, random normal, and explicit values.
    let zeros_tensor = Tensor::zeros((2, 3, 4), candle_core::DType::F32, &cpu)?;
    println!("tensor zeros:\n{}", zeros_tensor);
    println!(
        "tensor ones:\n{}",
        Tensor::ones((2, 3, 4), candle_core::DType::F32, &cpu)?
    );
    println!("tensor random:\n{}", Tensor::randn(0.0, 1.0, (3, 4), &cpu)?);
    println!(
        "tensor specified:\n{}",
        Tensor::new(&[[2_i64, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]], &cpu)?
    );

    // Indexing, slicing, and writing into slices.
    println!("x[-1] = {:?}", x.get(2)?.to_vec1::<f32>()?);
    println!(
        "x[1:3] = {:?}",
        x.index_select(&Tensor::new(&[1_i64, 2], &gpu)?, 0)?
            .to_vec2::<f32>()?
    );
    x.get(1)?.slice_set(&Tensor::new(&[17_f32], &gpu)?, 0, 2)?;
    println!("x = \n{}", x);
    let y = Tensor::from_slice(&[12_f32; 8], (2, 4), &gpu)?;
    let x = x.slice_assign(&[0..2, 0..4], &y)?;
    println!("x = \n{}", x);

    // Move to the CPU and apply an element-wise function.
    let z = x.to_device(&cpu)?;
    println!("x exp = \n{}", x.exp()?);
    println!("z exp = \n{}", z.exp()?);

    // Element-wise arithmetic between two tensors of the same shape.
    let p = Tensor::from_slice(&[1_f32, 2., 4., 8.], (1, 4), &gpu)?;
    let q = Tensor::from_slice(&[2_f32; 4], (1, 4), &gpu)?;
    println!("p = {p},\nq = {q}");
    println!("p + q = {}", (p.clone() + q.clone())?);
    println!("p - q = {}", (p.clone() - q.clone())?);
    println!("p * q = {}", (p.clone() * q.clone())?);
    println!("p / q = {}", (p.clone() / q.clone())?);
    println!("p ** q = {}", (p.clone().pow(&q))?);

    // Concatenation along rows (dim 0) and columns (dim 1), then comparisons and reduction.
    let gz0 = Tensor::cat(&[g.clone(), z.clone()], 0)?;
    let gz1 = Tensor::cat(&[g.clone(), z.clone()], 1)?;
    println!("gz0 = \n{gz0}");
    println!("gz1 = \n{gz1}");
    println!("z == g:\n{}", z.eq(&g)?);
    println!("z < g:\n{}", z.lt(&g)?);
    println!("g sum = {}", g.sum_all()?);

    // Broadcasting a (3, 1) tensor against a (1, 2) tensor.
    let a = Tensor::arange(0_i64, 3, &gpu)?.reshape((3, 1))?;
    println!("a = \n{a}");
    let b = Tensor::arange(0_i64, 2, &gpu)?.reshape((1, 2))?;
    println!("b = \n{b}");
    println!("a + b = \n{}", a.broadcast_add(&b)?);
    Ok(())
}