
How To Find All Variables With Identical Id?

Let's say I have a numpy array a and create b like this: a = np.arange(3) b = a If I now change b e.g. like this b[0] = 100 and print a, b, their ids and .flags print a print a.f

Solution 1:

There are 2 issues: how do you identify the variables that you want to compare, and how do you compare them.

Take the second first.

My version (1.8.2) does not have a np.shares_memory function. It does have a np.may_share_memory.

https://github.com/numpy/numpy/pull/6166 is the pull request that adds shares_memory; it's dated last August. So you'd have to have a brand new numpy to use it. Note that a definitive test is potentially hard, and it may issue a 'TOO HARD' error message. I imagine, for example, that there are some slices that share memory but are hard to identify by simply comparing buffer starting points.

https://github.com/numpy/numpy/blob/97c35365beda55c6dead8c50df785eb857f843f0/numpy/core/tests/test_mem_overlap.py is the unit test for these memory_overlap functions. Read it if you want to see what a daunting task it is to think of all the possible overlap conditions between 2 known arrays.
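To make the difference between the two functions concrete, here is a minimal sketch, assuming a NumPy recent enough to have np.shares_memory. The names evens and odds are made up for illustration. The two slices interleave inside the same buffer, so the cheap bounds test says "maybe" while the exact test finds no common element.

```python
import numpy as np

a = np.arange(10)
evens = a[::2]    # elements 0, 2, 4, 6, 8
odds = a[1::2]    # elements 1, 3, 5, 7, 9

# Bounds check only: the two byte ranges overlap, so the answer is "maybe".
quick = np.may_share_memory(evens, odds)

# Exact check: the slices interleave, so no element is actually shared.
exact = np.shares_memory(evens, odds)

print(quick, exact)
```

So a True from may_share_memory is only a hint, while shares_memory does the harder exact computation.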

I like to look at the array's .__array_interface__. One item in that dictionary is 'data', which holds a pointer to the start of the data buffer. An identical pointer means the data is shared. But a view might start somewhere down the line, so a different pointer does not rule out sharing. I wouldn't be surprised if shares_memory looks at this pointer.

Identical id means 2 variables reference the same object, but different array objects can share a data buffer.
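A rough sketch of the distinction between object identity and buffer sharing follows. The helper data_pointer is a hypothetical name introduced here, not a numpy function; it just reads the address out of __array_interface__.

```python
import numpy as np

def data_pointer(arr):
    """Address of the array's first element (hypothetical helper)."""
    return arr.__array_interface__['data'][0]

a = np.arange(10)
b = a          # same object
c = a[:]       # different array object, same buffer
d = a.copy()   # different object, different buffer
e = a[2:]      # view that starts two elements into the buffer

same_object = b is a                                 # identical id
view_is_new_object = c is not a                      # different id ...
view_same_start = data_pointer(c) == data_pointer(a) # ... same buffer start
copy_same_start = data_pointer(d) == data_pointer(a)
offset = data_pointer(e) - data_pointer(a)           # 2 * a.itemsize bytes

print(same_object, view_is_new_object, view_same_start, copy_same_start, offset)
```

The offset line shows why comparing start pointers alone misses views that begin further along the buffer.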

All these tests require looking at specific references, so you still need to get some sort of list of references. Look at locals()? globals()? What about unnamed references, such as a list of arrays, or some user-defined dictionary?

An example Ipython run:

Some variables and references:

In [1]: a = np.arange(10)
In [2]: b = a                                # reference
In [3]: c = a[:]                             # view
In [4]: d = a.copy()                         # copy
In [5]: e = a[2:]                            # another view
In [6]: ll = [a, a[:], a[3:], a[[1,2,3]]]    # list

Compare id:

In [7]: id(a)
Out[7]: 142453472
In [9]: id(b)
Out[9]: 142453472

None of the others share the id, except ll[0].

In [10]: np.may_share_memory(a,b)
Out[10]: True
In [11]: np.may_share_memory(a,c)
Out[11]: True
In [12]: np.may_share_memory(a,d)
Out[12]: False
In [13]: np.may_share_memory(a,e)
Out[13]: True
In [14]: np.may_share_memory(a,ll[3])
Out[14]: False

That's about what I'd expect; views share memory, copies do not.

In [15]: a.__array_interface__
Out[15]: 
{'version': 3,
 'data': (143173312, False),
 'typestr': '<i4',
 'descr': [('', '<i4')],
 'shape': (10,),
 'strides': None}
In [16]: a.__array_interface__['data']
Out[16]: (143173312, False)
In [17]: b.__array_interface__['data']
Out[17]: (143173312, False)
In [18]: c.__array_interface__['data']
Out[18]: (143173312, False)
In [19]: d.__array_interface__['data']
Out[19]: (151258096, False)            # copy - diff buffer
In [20]: e.__array_interface__['data'] 
Out[20]: (143173320, False)            # differs by 8 bytes
In [21]: ll[1].__array_interface__['data']
Out[21]: (143173312, False)            # same point

Just with this short session I have 76 items in locals(). But I can search it for a matching id with:

In [26]: [(k,v) for k,v in locals().items() if id(v)==id(a)]
Out[26]: 
[('a', array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])),
 ('b', array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))]

Same for the other tests.

I can search ll in the same way:

In [28]: [n for n,l in enumerate(ll) if id(l)==id(a)]
Out[28]: [0]

And I could add a layer to the locals() search by testing if an item is a list or dictionary, and doing a search within that.

So even if we settle on the testing method, it isn't trivial to search for all possible references.
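As a sketch of that extra layer, the function below checks each name in a namespace and looks one level into lists, tuples and dictionaries. find_same_id and the sample namespace ns are hypothetical names made up for this illustration; in a real session you might pass globals() instead, and deeper nesting would need recursion.

```python
import numpy as np

def find_same_id(target, namespace):
    """Return names (and one-level container slots) that refer to target."""
    hits = []
    for name, value in list(namespace.items()):
        if value is target:
            hits.append(name)
        elif isinstance(value, (list, tuple)):
            # Look one level into sequences.
            hits.extend(f"{name}[{i}]" for i, item in enumerate(value)
                        if item is target)
        elif isinstance(value, dict):
            # Look one level into dictionaries.
            hits.extend(f"{name}[{key!r}]" for key, item in value.items()
                        if item is target)
    return hits

a = np.arange(10)
ns = {
    "a": a,
    "b": a,                                   # plain reference
    "ll": [a, a[:], a[3:]],                   # only ll[0] is the same object
    "registry": {"first": a, "copy": a.copy()},
}
hits = find_same_id(a, ns)
print(hits)
```

The `is` test matches on identity only. Finding views as well would mean replacing it with a memory test on array values, with the cost and caveats already discussed.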

I think the best approach is to just understand your own use of variables, so that you can clearly identify references, views and copies. In selected cases you can perform tests like may_share_memory or compare data buffers. But there isn't an inexpensive, definitive test. When in doubt it is cheaper to make a copy than to risk overwriting something. In my years of numpy use I've never felt the need for a definitive answer to this question.


I don't find the OWNDATA flag very useful. Consider the above variables

In [35]: a.flags['OWNDATA']
Out[35]: True
In [36]: b.flags['OWNDATA']   # ref
Out[36]: True
In [37]: c.flags['OWNDATA']   # view
Out[37]: False
In [38]: d.flags['OWNDATA']   # copy
Out[38]: True
In [39]: e.flags['OWNDATA']   # view
Out[39]: False

While I can predict the OWNDATA value in these simple cases, its value doesn't say much about shared memory, or shared id. False suggests it was created from another array, and thus may share memory. But that's just a 'may'.

I often create a sample array by reshaping a range.

In [40]: np.arange(3).flags['OWNDATA']
Out[40]: True
In [41]: np.arange(4).reshape(2,2).flags['OWNDATA']
Out[41]: False

There's clearly no other reference to the data, but the reshaped array does not 'own' its own data. Same would happen with

temp = np.arange(4); temp = temp.reshape(2,2)

I'd have to do

temp = np.arange(4); temp.shape = (2,2)

to keep OWNDATA true. False OWNDATA means something right after creating the new array object, but it doesn't change if the original reference is redefined or deleted. It easily becomes out of date.
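A small sketch of the "out of date" point, with made-up variable names: the flag on the reshaped array is fixed when that array is created and does not flip once the original name is gone.

```python
import numpy as np

temp = np.arange(4)
view = temp.reshape(2, 2)             # reshape returns a view of temp here

owns_before = view.flags['OWNDATA']   # view borrows temp's buffer
del temp                              # drop the only other name for that buffer
owns_after = view.flags['OWNDATA']    # flag is unchanged by the deletion

print(owns_before, owns_after)
```

After the del no other variable can reach the data, yet the flag still reports that the array does not own it.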

Solution 2:

The assignment b=a does not create a view on the original array a but simply creates a reference to it. In other words, b is just a different name for a. Both variables a and b refer to the same array which owns its data such that the OWNDATA flag is set. Modifying b will modify a.

The assignment b=a.copy() creates a copy of the original array. That is, a and b refer to separate arrays which both own their data such that the OWNDATA flag is set. Modifying b will not modify a.

However, if you make the assignment b=a[:], you will create a view of the original array and b will not own its data. Modifying b will modify a.

The shares_memory function is what you are looking for. It does what it says on the box: it checks whether two arrays a and b share memory and thus affect each other.
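The three assignments described above can be checked side by side. This is only a sketch, assuming a NumPy version that provides np.shares_memory; the names ref, cpy and view are illustrative.

```python
import numpy as np

a = np.arange(3)
ref = a          # second name for the same array
cpy = a.copy()   # independent array with its own buffer
view = a[:]      # new array object on the same buffer

shared = {
    "ref": bool(np.shares_memory(a, ref)),
    "cpy": bool(np.shares_memory(a, cpy)),
    "view": bool(np.shares_memory(a, view)),
}

view[0] = 100    # writes through to a (and ref), but not to cpy
print(shared, a[0], cpy[0])
```

Only the copy is insulated from the write; the reference and the view both see the new value.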
