Heap's algorithm generates all possible permutations of n objects. It was first proposed by B. R. Heap in 1963.[1] The algorithm minimizes movement: it generates each permutation from the previous one by interchanging a single pair of elements; the other n−2 elements are not disturbed. In a 1977 review of permutation-generating algorithms, Robert Sedgewick concluded that it was at that time the most effective algorithm for generating permutations by computer.[2]
The sequence of permutations of n objects generated by Heap's algorithm is the beginning of the sequence of permutations of n+1 objects. So there is one infinite sequence of permutations generated by Heap's algorithm (sequence A280318 in the OEIS).
Details of the algorithm
For a collection C containing n different elements, Heap found a systematic method for choosing at each step a pair of elements to switch in order to produce every possible permutation of these elements exactly once.
Described recursively as a decrease-and-conquer method, Heap's algorithm operates at each step on the k initial elements of the collection. Initially k = n and thereafter k < n. Each step generates the k! permutations that end with the same n − k final elements. It does this by calling itself once with the k-th element unaltered and then k − 1 times with the k-th element exchanged for each of the initial k − 1 elements. The recursive calls modify the initial k − 1 elements, and a rule is needed at each iteration to select which will be exchanged with the last. Heap's method says that this choice can be made by the parity of the number of elements operated on at this step. If k is even, then the final element is iteratively exchanged with each element index. If k is odd, the final element is always exchanged with the first.
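As a sketch (not from the original article), the recursive scheme just described might be written in Python as:

```python
def heap_permutations(k, A, output):
    """Generate all permutations of the first k elements of A.

    Heap's rule: recurse on k - 1 elements, then exchange the last
    (k-th) element with the element chosen by the parity of k.
    """
    if k == 1:
        output(A[:])
        return
    for i in range(k - 1):
        heap_permutations(k - 1, A, output)
        if k % 2 == 0:
            A[i], A[k - 1] = A[k - 1], A[i]   # k even: swap i-th with last
        else:
            A[0], A[k - 1] = A[k - 1], A[0]   # k odd: swap first with last
    heap_permutations(k - 1, A, output)       # final block of (k-1)! outputs

perms = []
heap_permutations(3, [1, 2, 3], perms.append)
# perms holds all 3! = 6 permutations, each one swap from the previous
```

Each output differs from the previous one by a single transposition, which is the minimal-movement property described above.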
One can also write the algorithm in a non-recursive format.[3]
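One well-known non-recursive formulation (a sketch, not taken from the cited talk) replaces the call stack with an array of loop counters:

```python
def heap_permutations_iterative(A):
    """Yield every permutation of list A; consecutive outputs differ
    by a single swap (iterative form of Heap's algorithm)."""
    n = len(A)
    c = [0] * n              # c[i] plays the role of the loop counter
    yield tuple(A)           # at recursion depth i of the recursive form
    i = 1
    while i < n:
        if c[i] < i:
            if i % 2 == 0:
                A[0], A[i] = A[i], A[0]        # even index: swap with first
            else:
                A[c[i]], A[i] = A[i], A[c[i]]  # odd index: swap with c[i]-th
            yield tuple(A)
            c[i] += 1
            i = 1            # restart the scan from the bottom
        else:
            c[i] = 0
            i += 1

perms = list(heap_permutations_iterative([1, 2, 3]))
```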
Proof
In this proof, we'll use the implementation below as Heap's algorithm. While it is not optimal (see the section on mis-implementations below), it is nevertheless correct and will produce all permutations. The reason for using this implementation is that the analysis is easier and certain patterns can be easily illustrated.
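The code listing the proof refers to did not survive extraction; the variant being analyzed (looping k times, with a swap after every recursive call, including the last) can be sketched as:

```python
def permutations_simple(k, A, output):
    """Simple (non-optimal) variant of Heap's algorithm analyzed in the
    proof: k loop iterations, each followed by a parity-chosen swap."""
    if k == 1:
        output(A[:])
        return
    for i in range(k):
        permutations_simple(k - 1, A, output)
        if k % 2 == 0:
            A[i], A[k - 1] = A[k - 1], A[i]   # k even: swap i-th with last
        else:
            A[0], A[k - 1] = A[k - 1], A[0]   # k odd: swap first with last

perms = []
A = [1, 2, 3, 4]
permutations_simple(4, A, perms.append)
# All 4! = 24 permutations are produced, and A ends up rotated to the
# right by 1 (n = 4 is even), exactly as the claim below states.
```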
Claim: If array A has length n, then performing Heap's algorithm will either result in A being 'rotated' to the right by 1 (i.e. each element is shifted to the right, with the last element occupying the first position) or result in A being unaltered, depending on whether n is even or odd, respectively.
Basis: The claim above trivially holds true for n = 1, as Heap's algorithm will simply return A unaltered.
Induction: Assume the claim holds true for some i ≥ 1. We will then need to handle two cases for i + 1: i + 1 is even or odd.
If, for A, n = i + 1 is even, then the subset of the first i elements will remain unaltered after performing Heap's algorithm on the subarray, as assumed by the induction hypothesis. By performing Heap's algorithm on the subarray and then performing the swapping operation, in the k-th iteration of the for-loop, where k ≤ i + 1, the k-th element in A will be swapped into the last position of A, which can be thought of as a kind of 'buffer'. By swapping the 1st and last element, then the 2nd and last, all the way until the n-th and last elements are swapped, the array at last experiences a rotation. To illustrate the above, look below for the case n = 4.
If, for A, n = i + 1 is odd, then the subset of the first i elements will be rotated after performing Heap's algorithm on the first i elements. Notice that, after one iteration of the for-loop, when performing Heap's algorithm on A, A is rotated to the right by 1. By the induction hypothesis, it is assumed that the first i elements will rotate. After this rotation, the first element of A will be swapped into the buffer, which, when combined with the previous rotation operation, will in essence perform a rotation on the array. Perform this rotation operation n times, and the array will revert to its original state. This is illustrated below for the case n = 5.
The induction proof for the claim is now complete, which leads to why Heap's algorithm creates all permutations of array A. Once again we will prove the correctness of Heap's algorithm by induction.
Basis: Heap's algorithm trivially permutes an array A of size 1, as outputting A is the one and only permutation of A.
Induction: Assume Heap's algorithm permutes an array of size i. Using the results from the previous proof, every element of A will be in the 'buffer' once when the first i elements are permuted. Because the permutations of an array can be made by removing an element x from A and then tacking x onto each permutation of the altered array, it follows that Heap's algorithm permutes an array of size i + 1, for the 'buffer' in essence holds the removed element, which is tacked onto the permutations of the subarray of size i. Because each iteration of Heap's algorithm has a different element of A occupying the buffer when the subarray is permuted, every permutation is generated, as each element of A has a chance to be tacked onto the permutations of the array A without the buffer element.
Frequent mis-implementations
It is tempting to simplify the recursive version given above by reducing the instances of recursive calls: for example, by looping k times and performing a swap after every recursive call, including the last, as in the simple implementation used in the proof above.
This implementation will succeed in producing all permutations but does not minimize movement. As the recursive call-stacks unwind, it results in additional swaps at each level. Half of these will be no-ops of A[i] and A[k − 1] where i = k − 1, but when k is odd, it results in additional swaps of the k-th with the 0th element.
n | n! − 1 | swaps | additional = swaps − (n! − 1) |
---|---|---|---|
1 | 0 | 0 | 0 |
2 | 1 | 1 | 0 |
3 | 5 | 6 | 1 |
4 | 23 | 27 | 4 |
5 | 119 | 140 | 21 |
6 | 719 | 845 | 126 |
7 | 5039 | 5922 | 883 |
8 | 40319 | 47383 | 7064 |
9 | 362879 | 426456 | 63577 |
These additional swaps significantly alter the order of the k − 1 prefix elements.
The additional swaps can be avoided either by adding an additional recursive call before the loop and looping k − 1 times (as above), or by looping k times and checking that i is less than k − 1, as in:
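A sketch of this second form (loop k times, but guard the swap so none follows the last recursive call):

```python
def heap_permutations_guarded(k, A, output):
    """Heap's algorithm, looping k times with a guard so that no swap
    follows the final recursive call."""
    if k == 1:
        output(A[:])
        return
    for i in range(k):
        heap_permutations_guarded(k - 1, A, output)
        if i < k - 1:                          # skip swap after the last call
            if k % 2 == 0:
                A[i], A[k - 1] = A[k - 1], A[i]
            else:
                A[0], A[k - 1] = A[k - 1], A[0]

perms = []
heap_permutations_guarded(3, [1, 2, 3], perms.append)
```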
The choice is primarily aesthetic, but the latter results in checking the value of i twice as often.
References
- ^ Heap, B. R. (1963). "Permutations by Interchanges" (PDF). The Computer Journal. 6 (3): 293–4. doi:10.1093/comjnl/6.3.293.
- ^ Sedgewick, R. (1977). "Permutation Generation Methods". ACM Computing Surveys. 9 (2): 137–164. doi:10.1145/356689.356692.
- ^ Sedgewick, Robert. "A Talk on Permutation Generation Algorithms" (PDF).
PyTorch provides a lot of methods for the Tensor type. Some of these methods may be confusing for new users. Here, I would like to talk about view() vs reshape(), and transpose() vs permute().
view() vs reshape()
Both view() and reshape() can be used to change the size or shape of tensors. But they are slightly different.
view() has existed for a long time. It returns a tensor with the new shape. The returned tensor shares the underlying data with the original tensor, so if you change a value in the returned tensor, the corresponding value in the original tensor also changes.
On the other hand, it seems that reshape() was introduced in version 0.4. According to the documentation, this method will:
Returns a tensor with the same data and number of elements as input, but with the specified shape. When possible, the returned tensor will be a view of input. Otherwise, it will be a copy. Contiguous inputs and inputs with compatible strides can be reshaped without copying, but you should not depend on the copying vs. viewing behavior.
It means that torch.reshape may return a copy or a view of the original tensor. You cannot count on it to return a view or a copy. According to the developer:
if you need a copy use clone() if you need the same storage use view(). The semantics of reshape() are that it may or may not share the storage and you don't know beforehand.
As a side note, I found that torch versions 0.4.1 and 1.0.1 behave differently when you print the id of the original tensor and the viewing tensor:
You see that the id of a.storage() and b.storage() is not the same. Isn't their underlying data the same? Why this difference?
I filed an issue in the PyTorch repo and got answers from the developer. It turns out that to find the data pointer, we have to use the data_ptr() method. You will find that their data pointers are the same.
view() vs transpose()
transpose(), like view(), can also be used to change the shape of a tensor, and it also returns a new tensor sharing the data with the original tensor:
Returns a tensor that is a transposed version of input. The given dimensions dim0 and dim1 are swapped.
The resulting out tensor shares its underlying storage with the input tensor, so changing the content of one would change the content of the other.
One difference is that view() can only operate on contiguous tensors, and the returned tensor is still contiguous. transpose() can operate on both contiguous and non-contiguous tensors. Unlike with view(), the returned tensor may no longer be contiguous.
But what does contiguous mean?
There is a good answer on SO which discusses the meaning of contiguous in NumPy. It also applies to PyTorch.

As I understand it, contiguous in PyTorch means that neighboring elements in the tensor are actually next to each other in memory. Let's take a simple example:
Tensors x and y in the above example share the same memory space.[1]
If you check their contiguity with is_contiguous(), you will find that x is contiguous but y is not.
Since x is contiguous, x[0][0] and x[0][1] are next to each other in memory. But y[0][0] and y[0][1] are not.
A lot of tensor operations require that the tensor be contiguous; otherwise, an error will be thrown. To make a non-contiguous tensor contiguous, call the contiguous() method, which will return a new contiguous tensor. In plain words, it creates a new memory space for the new tensor and copies the values from the non-contiguous tensor into it.
permute() and transpose() are similar. transpose() can only swap two dimensions, but permute() can rearrange all of them. For example:
Note that, in permute(), you must provide the new order of all the dimensions. In transpose(), you can only provide two dimensions. transpose() can be thought of as a special case of permute() for 2D tensors.
- tensor data pointers.
- view after transpose raises non-contiguous error.
- When to use which, permute, view, transpose.
- Difference between reshape() and view().
[1] To show a tensor's memory address, use tensor.data_ptr().