How To Find All Variables With Identical Id?
Solution 1:
There are 2 issues: how do you identify the variables that you want to compare, and how do you compare them.
Take the second first.
My version (1.8.2) does not have an np.shares_memory function. It does have np.may_share_memory.
https://github.com/numpy/numpy/pull/6166 is the pull request that adds shares_memory; it's dated last August. So you'd need a brand-new numpy to use it. Note that a definitive test is potentially hard, and it may issue a 'TOO HARD' error message. I imagine, for example, that there are some slices that share memory but are hard to identify by simply comparing buffer starting points.
https://github.com/numpy/numpy/blob/97c35365beda55c6dead8c50df785eb857f843f0/numpy/core/tests/test_mem_overlap.py is the unit test for these memory_overlap functions. Read it if you want to see what a daunting task it is to think of all the possible overlap conditions between 2 known arrays.
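To make the difference between the two functions concrete, here is a minimal sketch, assuming a NumPy recent enough to provide np.shares_memory. The variable names (evens, odds, bounds_overlap, exact_overlap) are mine, chosen for illustration. With default arguments, may_share_memory only compares memory bounds, so two interleaved slices look as if they overlap even though they have no element in common.

```python
import numpy as np

a = np.arange(10)
evens = a[::2]    # elements 0, 2, 4, 6, 8
odds = a[1::2]    # elements 1, 3, 5, 7, 9

# Default may_share_memory only checks whether the memory ranges of the
# two arrays overlap, so interleaved views are reported as "may share".
bounds_overlap = np.may_share_memory(evens, odds)

# shares_memory solves the exact problem: is any element common to both?
exact_overlap = np.shares_memory(evens, odds)

print(bounds_overlap, exact_overlap)
```

The exact test is the potentially expensive one, which is why a cheap bounds check exists alongside it.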
I like to look at the array's .__array_interface__. One item in that dictionary is 'data', which is a pointer to the data buffer. An identical pointer means the data is shared. But a view might start somewhere down the line. I wouldn't be surprised if shares_memory looks at this pointer.
Identical id means 2 variables reference the same object, but different array objects can share a data buffer.
All these tests require looking at specific references, so you still need to get some sort of list of references. Look at locals()? globals()? What about unnamed references, such as a list of arrays, or some user-defined dictionary?
An example IPython run:
Some variables and references:
In [1]: a = np.arange(10)
In [2]: b = a        # reference
In [3]: c = a[:]     # view
In [4]: d = a.copy() # copy
In [5]: e = a[2:]    # another view
In [6]: ll = [a, a[:], a[3:], a[[1,2,3]]]  # list
Compare id:
In [7]: id(a)
Out[7]: 142453472
In [9]: id(b)
Out[9]: 142453472
None of the others share the id, except ll[0].
In [10]: np.may_share_memory(a,b)
Out[10]: True
In [11]: np.may_share_memory(a,c)
Out[11]: True
In [12]: np.may_share_memory(a,d)
Out[12]: False
In [13]: np.may_share_memory(a,e)
Out[13]: True
In [14]: np.may_share_memory(a,ll[3])
Out[14]: False
That's about what I'd expect; views share memory, copies do not.
In [15]: a.__array_interface__
Out[15]:
{'version': 3,
'data': (143173312, False),
'typestr': '<i4',
'descr': [('', '<i4')],
'shape': (10,),
'strides': None}
In [16]: a.__array_interface__['data']
Out[16]: (143173312, False)
In [17]: b.__array_interface__['data']
Out[17]: (143173312, False)
In [18]: c.__array_interface__['data']
Out[18]: (143173312, False)
In [19]: d.__array_interface__['data']
Out[19]: (151258096, False) # copy - diff buffer
In [20]: e.__array_interface__['data']
Out[20]: (143173320, False) # differs by 8 bytes
In [21]: ll[1].__array_interface__['data']
Out[21]: (143173312, False) # same point
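The pointer comparison above can be wrapped in a couple of small helpers. This is only a sketch; data_pointer and offset_in_bytes are hypothetical names of my own, not NumPy functions. It shows that a view starting "down the line" differs from its source by a whole number of items.

```python
import numpy as np

def data_pointer(arr):
    """Address of the first element of arr, taken from __array_interface__."""
    return arr.__array_interface__['data'][0]

def offset_in_bytes(view, base):
    """Byte distance from the start of base's data to the start of view's."""
    return data_pointer(view) - data_pointer(base)

a = np.arange(10)
c = a[:]       # view starting at the same element
e = a[2:]      # view starting 2 elements in
d = a.copy()   # separate buffer

same_start = data_pointer(a) == data_pointer(c)
offset_e = offset_in_bytes(e, a)

print(same_start, offset_e, a.itemsize)
```

Equal pointers imply shared data, but unequal pointers do not rule it out, as e shows; that is the limitation of comparing starting points only.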
Just with this short session I have 76 items in locals(). But I can search it for a matching id with:
In [26]: [(k,v) for k,v in locals().items() if id(v)==id(a)]
Out[26]:
[('a', array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])),
('b', array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))]
Same for the other tests.
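As a sketch of what "the other tests" could look like, the same kind of search can use np.may_share_memory instead of id. The helper name names_sharing_memory is mine, and I pass an explicit dictionary rather than locals() so the result is reproducible; in a session you might pass a snapshot such as dict(globals()).

```python
import numpy as np

def names_sharing_memory(target, namespace):
    """Names bound to arrays that may share memory with target."""
    return sorted(
        name for name, value in namespace.items()
        if isinstance(value, np.ndarray)
        and np.may_share_memory(value, target)
    )

a = np.arange(10)
namespace = {
    'a': a,
    'b': a,          # reference
    'c': a[:],       # view
    'd': a.copy(),   # copy
    'e': a[2:],      # another view
    'n': 3,          # non-array entries are skipped
}
matches = names_sharing_memory(a, namespace)
print(matches)
```

Unlike the id search, this also picks up the views c and e, while still excluding the copy d.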
I can search ll in the same way:
In [28]: [n for n,l in enumerate(ll) if id(l)==id(a)]
Out[28]: [0]
And I could add a layer to the locals() search by testing if an item is a list or dictionary, and doing a search within that.
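A rough sketch of that extra layer, assuming it is enough to look one level down into lists, tuples and dictionaries. The function find_references and the sample namespace are hypothetical, written only to illustrate the idea; deeper nesting or custom containers would need more work.

```python
import numpy as np

def find_references(target, namespace):
    """Yield a label for each reference to target found in namespace.

    Checks top-level names, then one level into lists, tuples and dicts.
    `x is target` is the same test as id(x) == id(target) for live objects.
    """
    for name, value in list(namespace.items()):
        if value is target:
            yield name
        elif isinstance(value, (list, tuple)):
            for i, item in enumerate(value):
                if item is target:
                    yield '%s[%d]' % (name, i)
        elif isinstance(value, dict):
            for key, item in list(value.items()):
                if item is target:
                    yield '%s[%r]' % (name, key)

a = np.arange(10)
namespace = {
    'a': a,
    'b': a,
    'll': [a, a[:], a[3:]],
    'cfg': {'data': a},
    'd': a.copy(),
}
found = sorted(find_references(a, namespace))
print(found)
```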
So even if we settle on the testing method, it isn't trivial to search for all possible references.
I think the best approach is to just understand your own use of variables, so that you can clearly identify references, views and copies. In selected cases you can perform tests like may_share_memory or comparing data buffers. But there isn't an inexpensive, definitive test. When in doubt it is cheaper to make a copy than to risk overwriting something. In my years of numpy use I've never felt the need for a definitive answer to this question.
I don't find the OWNDATA flag very useful. Consider the above variables:
In [35]: a.flags['OWNDATA']
Out[35]: True
In [36]: b.flags['OWNDATA']  # ref
Out[36]: True
In [37]: c.flags['OWNDATA']  # view
Out[37]: False
In [38]: d.flags['OWNDATA']  # copy
Out[38]: True
In [39]: e.flags['OWNDATA']  # view
Out[39]: False
While I can predict the OWNDATA value in these simple cases, its value doesn't say much about shared memory or shared id. False suggests the array was created from another array, and thus may share memory. But that's just a 'may'.
I often create a sample array by reshaping a range.
In [40]: np.arange(3).flags['OWNDATA']
Out[40]: True
In [41]: np.arange(4).reshape(2,2).flags['OWNDATA']
Out[41]: False
There's clearly no other reference to the data, but the reshaped array does not 'own' its own data. The same would happen with
temp = np.arange(4); temp = temp.reshape(2,2)
I'd have to do
temp = np.arange(4); temp.shape = (2,2)
to keep OWNDATA true. A False OWNDATA means something right after creating the new array object, but it doesn't change if the original reference is redefined or deleted. It easily becomes out of date.
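The OWNDATA behaviour described above can be checked in a few lines. This sketch also looks at the array's .base attribute, which is not mentioned above but is where NumPy keeps the object that actually owns the buffer; the variable names are mine.

```python
import numpy as np

# Reshaping a temporary: nothing else names the data, yet OWNDATA is False
# because the result is a view whose .base holds the owning array.
reshaped = np.arange(4).reshape(2, 2)
owns_reshaped = reshaped.flags['OWNDATA']
base_is_array = isinstance(reshaped.base, np.ndarray)

# Setting .shape changes the existing object, so it keeps its own data.
in_place = np.arange(4)
in_place.shape = (2, 2)
owns_in_place = in_place.flags['OWNDATA']

# Deleting the original name does not update the flag on the view.
a = np.arange(10)
c = a[:]
del a
owns_after_del = c.flags['OWNDATA']

print(owns_reshaped, base_is_array, owns_in_place, owns_after_del)
```

So a False flag only says the array was built as a view of something; it does not say whether any other name still reaches that data.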
Solution 2:
The assignment b=a does not create a view on the original array a but simply creates a reference to it. In other words, b is just a different name for a. Both variables a and b refer to the same array, which owns its data, so the OWNDATA flag is set. Modifying b will modify a.
The assignment b=a.copy() creates a copy of the original array. That is, a and b refer to separate arrays which both own their data, so the OWNDATA flag is set on each. Modifying b will not modify a.
However, if you make the assignment b=a[:], you will create a view of the original array and b will not own its data. Modifying b will modify a.
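The three cases can be verified side by side. This is a small sketch with illustrative names (ref, dup, view) standing in for b in each case.

```python
import numpy as np

a = np.arange(5)

ref = a            # b = a: another name for the same object
ref[0] = 100       # changes a

dup = a.copy()     # b = a.copy(): independent array with its own data
dup[1] = 200       # does not change a

view = a[:]        # b = a[:]: new array object on the same buffer
view[2] = 300      # changes a

print(a)
print(ref is a, dup.flags['OWNDATA'], view.flags['OWNDATA'])
```

Only the write through dup leaves a untouched; the reference and the view both write into a's buffer.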
The shares_memory function is what you are looking for. It does what it says on the box: it checks whether two arrays a and b share memory and thus affect each other.