I have a tensor with this shape:

(batch_size, class_id, range_indices) -> (4, 3, 2)
int64
[[[1250 1302]
  [1324 1374]
  [1458 1572]]

 [[1911 1955]
  [1979 2028]
  [2120 2224]]

 [[2546 2599]
  [2624 2668]
  [2765 2871]]

 [[3223 3270]
  [3286 3347]
  [3434 3539]]]

How do I construct a dense representation, filled with values according to this rule?

Since there are 3 class IDs:

  1. Class ID 0: filled with 1
  2. Class ID 1: filled with 2
  3. Class ID 2: filled with 3
  4. Default: filled with 0

So the output should be a single vector like this:

[0 0 0 ...(until 1250)... 1 1 1 ...(until 1302)... 0 0 0 ...(until 1324)... 2 2 2 ...(until 1374)... and so on]
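
To make the rule concrete, here is a minimal reference sketch on a smaller, made-up tensor. The `toy` values are hypothetical (not my real data), and each end index is assumed to be exclusive:

```python
import numpy as np

# Hypothetical toy tensor, shape (batch_size=2, class_id=3, range_indices=2)
toy = np.array([[[1, 3], [4, 5], [6, 8]],
                [[10, 12], [12, 13], [15, 16]]])

# Assumption: end indices are exclusive, so toy.max() slots are enough
dense = np.zeros(toy.max(), dtype=int)
for batch in toy:
    for class_id, (start, end) in enumerate(batch):
        dense[start:end] = class_id + 1  # class ID 0 -> 1, class ID 1 -> 2, ...

print(dense)
# [0 1 1 0 2 0 3 3 0 0 1 1 2 0 0 3]
```

This plain loop is only meant as a readable baseline to compare vectorized versions against.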

Here is the data as copyable code:

data = np.array([[[1250, 1302],
                  [1324, 1374],
                  [1458, 1572]],

                 [[1911, 1955],
                  [1979, 2028],
                  [2120, 2224]],

                 [[2546, 2599],
                  [2624, 2668],
                  [2765, 2871]],

                 [[3223, 3270],
                  [3286, 3347],
                  [3434, 3539]]])

Here is code generated by ChatGPT, but I'm not sure it's the idiomatic NumPy way, since it uses list comprehensions:

import numpy as np

# Given tensor
tensor = np.array([[[1250, 1302],
                    [1324, 1374],
                    [1458, 1572]],

                   [[1911, 1955],
                    [1979, 2028],
                    [2120, 2224]],

                   [[2546, 2599],
                    [2624, 2668],
                    [2765, 2871]],

                   [[3223, 3270],
                    [3286, 3347],
                    [3434, 3539]]])

# Determine the maximum value in the tensor to define the size of the output array
max_value = tensor.max()

# Create an empty array filled with zeros of size max_value + 1
dense_representation = np.zeros(max_value + 1, dtype=int)

# Generate the class_ids array, replicated for each batch
class_ids = np.tile(np.arange(1, tensor.shape[1] + 1), tensor.shape[0])

# Generate start and end indices
start_indices = tensor[:, :, 0].ravel()
end_indices = tensor[:, :, 1].ravel()

# Create an array of indices to fill
indices = np.hstack([np.arange(start, end) for start, end in zip(start_indices, end_indices)])

# Create an array of values to fill
values = np.hstack([np.full(end - start, class_id) for start, end, class_id in zip(start_indices, end_indices, class_ids)])

# Fill the dense representation array
dense_representation[indices] = values

# The resulting dense representation
print(dense_representation)
print(dense_representation[1249:1303])
print(dense_representation[1323:1375])
print(dense_representation[1457:1573])
print(dense_representation[1910:1956])

Output:

[0 0 0 ... 3 3 0]
[0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0]
[0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0]
[0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 0]
[0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 0]
  • Would the output be one or four arrays?
    – mozway
    Commented Jul 10 at 10:07
  • @mozway One array (I said a dense vector). Commented Jul 10 at 10:11
  • It might be interesting to provide a smaller example (which could result in a printable array) with the exact expected output.
    – mozway
    Commented Jul 10 at 10:16
  • @mozway Sorry for the inconvenience. I know what you mean; it's because the tensor is my real-case data. Commented Jul 10 at 10:22

1 Answer

IIUC, you could build the output array with np.zeros, np.repeat and np.tile:

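start = data[..., 0].ravel()
end = data[..., 1].ravel()
slices = [slice(a, b) for a, b in zip(start, end)]
n = end - start  # length of each range
# end indices are exclusive, so data.max() slots are enough
out = np.zeros(data.max(), dtype='int')
# unpacking inside a subscript (np.r_[*slices]) needs Python 3.11+
out[np.r_[*slices]] = np.repeat(np.tile(np.arange(data.shape[1])+1, data.shape[0]), n)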
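
On interpreters that reject the starred subscript (as far as I know, anything before Python 3.11), passing the slices as a tuple should be equivalent. Below is a self-contained sketch of that variant; the `labels` name is just introduced here for readability:

```python
import numpy as np

data = np.array([[[1250, 1302], [1324, 1374], [1458, 1572]],
                 [[1911, 1955], [1979, 2028], [2120, 2224]],
                 [[2546, 2599], [2624, 2668], [2765, 2871]],
                 [[3223, 3270], [3286, 3347], [3434, 3539]]])

start = data[..., 0].ravel()
end = data[..., 1].ravel()
slices = [slice(a, b) for a, b in zip(start, end)]

# One label per range: 1, 2, 3 repeated for every batch
labels = np.tile(np.arange(data.shape[1]) + 1, data.shape[0])

out = np.zeros(data.max(), dtype=int)
# np.r_ expands each slice into its indices and concatenates them;
# indexing with a tuple avoids the 3.11-only unpacking syntax
out[np.r_[tuple(slices)]] = np.repeat(labels, end - start)

print(out[1249:1253])
# [0 1 1 1]
```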

Variant with boolean indexing:

start = data[..., 0].ravel()
end = data[..., 1].ravel()
out = np.zeros(data.max(), dtype='int')
idx = np.arange(len(out))
m = ((idx >= start[:, None]) & (idx < end[:, None])).any(axis=0)
n = end-start
out[m] = np.repeat(np.tile(np.arange(data.shape[1])+1, data.shape[0]), n)
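
To see what the broadcasted mask does, here is a small sketch on made-up ranges (the `start`, `end` and `idx` values below are hypothetical). Note that filling `out[m]` with repeated labels relies on the ranges being sorted and non-overlapping, and that the intermediate mask has one row per range, so it may get large for many ranges:

```python
import numpy as np

# Hypothetical toy ranges: sorted, non-overlapping, end-exclusive
start = np.array([1, 4])
end = np.array([3, 6])
idx = np.arange(7)

# Row i is True where idx falls inside range i: shape (n_ranges, len(idx))
per_range = (idx >= start[:, None]) & (idx < end[:, None])
# Collapse the rows: True wherever any range covers the position
m = per_range.any(axis=0)

print(per_range.astype(int))
# [[0 1 1 0 0 0 0]
#  [0 0 0 0 1 1 0]]
print(m.astype(int))
# [0 1 1 0 1 1 0]
```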

Or:

start = data[..., 0].ravel()
end = data[..., 1].ravel()
out = np.zeros(data.max(), dtype='int')
idx = np.arange(len(out))

m1 = ((idx >= start[:, None]) & (idx < end[:, None]))
m2 = m1.any(axis=0)
nums = np.tile(np.arange(data.shape[1])+1, data.shape[0])

out[m2] = nums[m1[:, m2].argmax(0)]

Output:

[0 0 0 ... 3 3 3]
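
Another possibility, not benchmarked here, is a difference-array approach: add each label at its start index, subtract it at its end index, and take a cumulative sum. This is a sketch that assumes the ranges never overlap (touching ranges are fine); the `labels` and `diff` names are introduced just for this example:

```python
import numpy as np

data = np.array([[[1250, 1302], [1324, 1374], [1458, 1572]],
                 [[1911, 1955], [1979, 2028], [2120, 2224]],
                 [[2546, 2599], [2624, 2668], [2765, 2871]],
                 [[3223, 3270], [3286, 3347], [3434, 3539]]])

start = data[..., 0].ravel()
end = data[..., 1].ravel()
labels = np.tile(np.arange(data.shape[1]) + 1, data.shape[0])

# One extra slot so that an end equal to data.max() is still a valid index
diff = np.zeros(data.max() + 1, dtype=int)
np.add.at(diff, start, labels)   # switch the label on at each start
np.add.at(diff, end, -labels)    # switch it off again at each (exclusive) end

# Running sum turns the on/off markers into filled ranges; drop the extra slot
out = np.cumsum(diff)[:-1]

print(out[1300:1304])
# [1 1 0 0]
```

This avoids both the Python-level loop over ranges and the (n_ranges, length) boolean mask, at the cost of being wrong if two ranges overlap.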
  • out[np.r_[*slices]] = np.repeat(np.tile(np.arange(data.shape[1])+1, data.shape[0]), n) raises "invalid syntax"; perhaps you forgot a comma in np.r_? Commented Jul 10 at 10:19
  • I think you have a too old python version for this syntax. I used 3.11
    – mozway
    Commented Jul 10 at 10:22
  • It's the Google Colaboratory runtime. But at least the boolean indexing version worked for me, and I have verified that np.array_equal(out, dense_representation) is True. Commented Jul 10 at 10:26
  • @MuhammadIkhwanPerwira If np.r_[*slices] does not work for you, the following should be equivalent and "backward-compatible": np.r_[tuple(slices)] (thus using an indexing tuple rather than an unpacked index)
    – simon
    Commented Jul 10 at 14:09
