DOC,DISCUSS: A How-To guide for ndarray indexing #19586

Closed
Mukulikaa opened this issue Jul 29, 2021 · 7 comments

Comments

@Mukulikaa
Contributor

Hi, all! I'm planning to write a how-to doc for ndarray indexing as per discussion in #19407. I would love to know which use-cases you'd like to see in the doc. These are a few I have in mind (with inspiration from StackOverflow):

  • Accessing specific/random rows and columns
  • Indexing along a specific axis
  • Creating a sub-ndarray from a larger matrix
  • Using variables for slicing e.g. having start and stop relative to each other
  • Indexing based on conditional arguments
  • Generating index arrays based on specific conditions
  • Fetching indices of N max/min values
  • Getting the number of indices generated for an indexing expression without performing the operation
  • Indexing with index arrays of shapes different from the ndarray
  • Indexing the same ndarray multiple times efficiently
  • One-hot encoding with indices
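A few of these use cases can be sketched in a handful of lines. This is only an illustration, not a proposed outline for the doc; the array contents and variable names are made up for the example.

```python
import numpy as np

arr = np.array([[3, 9, 1],
                [7, 2, 8]])

# Indexing based on a condition: a boolean mask selects the matching elements.
mask = arr > 5
selected = arr[mask]                 # 1-D array, in row-major order

# Generating index arrays from a condition: one (row, column) pair per match.
coords = np.argwhere(mask)

# Number of elements the mask would select, without doing the indexing.
n_selected = np.count_nonzero(mask)

# Indices of the N largest values (here N = 2), computed on the flattened
# array and then converted back to (row, column) coordinates.
# The order of the returned indices is not guaranteed.
n = 2
flat_top = np.argpartition(arr, -n, axis=None)[-n:]
top_rows, top_cols = np.unravel_index(flat_top, arr.shape)

# One-hot encoding: index the rows of an identity matrix with integer labels.
labels = np.array([2, 0, 1])
one_hot = np.eye(3, dtype=int)[labels]
```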

As a new user of NumPy, I'm not sure if some of these cases are trivial or might be out of scope for a doc focused on indexing; I would appreciate any opinions regarding this.

cc: @melissawm @rossbar

@AnjaTRPES

AnjaTRPES commented Jul 30, 2021

Hi Mukulikaa,
these all sound great to me! I have used numpy for a while, and it took me a long time to figure out how to index arrays efficiently (tbh, I'd rather say I am still in the process of it). Could I help you with writing this how-to doc?

@Mukulikaa
Contributor Author

Hi @AnjaTRPES, I would love your help! I'm curious to know if you referred to any specific resources or used special tricks to learn how to index arrays efficiently. I will make a draft PR outlining the doc soon (to make it easier to collaborate) and I'd appreciate your input on best practices there.

@melissawm
Member

Nice @Mukulikaa - it might be useful to ask on the mailing list as well and see if there are any extra suggestions there.

@adeak
Contributor

adeak commented Aug 2, 2021

I can instantly think of two different patterns that are hard to figure out on your own, and confusing when you see them for the first time, but fairly easy to understand and learn to use if you've seen them before (and an "indexing howto" might be a potential place for this!).


The first pattern, which can confound new users even though it is a fairly basic use case for broadcasting and fancy indexing, is:

I have a 2d array of shape (n, m), and for each row I have k column indices. How do I get the (n, k)-shaped values I'm looking for?

(Not sure if this is included in your "Indexing with index arrays of shapes different from the ndarray" bullet point.) The solution is to index the leading dimension with a range that has a trailing singleton axis:

import numpy as np

arr = np.arange(3*4).reshape(3, 4)
column_indices = [[1, 3], [0, 2], [2, 2]]  # k = 2 column indices per row
print(arr[np.arange(arr.shape[0])[:,None], column_indices])
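To spell out why this works, here is the same lookup with the shapes written out. The comparison with `np.take_along_axis` is my addition rather than part of the suggestion above; for this particular "pick along the last axis" case it should give the same values.

```python
import numpy as np

arr = np.arange(3 * 4).reshape(3, 4)
column_indices = np.array([[1, 3], [0, 2], [2, 2]])   # shape (3, 2)

# Row indices as a column vector, shape (3, 1).
rows = np.arange(arr.shape[0])[:, None]

# The (3, 1) and (3, 2) index arrays broadcast to (3, 2), so element [i, j]
# of the result is arr[i, column_indices[i, j]].
picked = arr[rows, column_indices]

# Assumed alternative: let NumPy build the row indices for you.
same = np.take_along_axis(arr, column_indices, axis=1)
```

Here `picked` has shape (3, 2), one row of selected values per row of `arr`.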

More generally, you'd need an open grid (np.ogrid, or the dense np.mgrid) over every leading dimension, each with an additional trailing singleton dimension so that it broadcasts against the index array.
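A minimal sketch of that generalisation for a 3-d array, assuming an array of shape (a, b, m) and an index array of shape (a, b, k). The shapes and the random indices are made up for the example.

```python
import numpy as np

arr = np.arange(2 * 3 * 4).reshape(2, 3, 4)            # shape (a, b, m) = (2, 3, 4)
rng = np.random.default_rng(0)
idx = rng.integers(0, arr.shape[-1], size=(2, 3, 2))   # shape (a, b, k) = (2, 3, 2)

# Open grids over the two leading dimensions: shapes (2, 1) and (1, 3).
i, j = np.ogrid[:arr.shape[0], :arr.shape[1]]

# A trailing singleton axis turns them into (2, 1, 1) and (1, 3, 1), which
# broadcast against idx, so result[p, q, r] == arr[p, q, idx[p, q, r]].
result = arr[i[..., None], j[..., None], idx]
```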


The second pattern is only a partial match for the "indexing" topic, because there are multiple ways to solve the problem. It is:

If I give you a permutation index array, can you tell me the inverse?

I.e. if you have b = a[inds], how can you get a = b[inverse_inds]? I know two solutions off the top of my head:

import numpy as np

a = np.arange(20, 10, -1)
rng = np.random.default_rng()
inds = rng.permutation(a.size)
b = a[inds]

# solution 1: argsort the indices
inverse_inds = inds.argsort()

# solution 2: index into the left-hand side
inverse_inds2 = np.empty_like(inds)
inverse_inds2[inds] = np.arange(inds.size)

print(np.array_equal(inverse_inds, inverse_inds2))  # True
print(np.array_equal(a, b[inverse_inds]))  # True
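One place this shows up in practice is undoing a sort. The sketch below is my own illustration with made-up values: it applies solution 2 to the permutation returned by `argsort`, then uses the inverse to restore the original order.

```python
import numpy as np

values = np.array([30, 10, 20, 40])

order = values.argsort()            # permutation that sorts `values`
sorted_values = values[order]

# Inverse permutation via assignment into the left-hand side (solution 2).
inverse = np.empty_like(order)
inverse[order] = np.arange(order.size)

# Indexing the sorted array with the inverse restores the original order.
restored = sorted_values[inverse]
```

As a side effect, `inverse[i]` is the rank of `values[i]` in the sorted order.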

@rossbar
Contributor

rossbar commented Aug 3, 2021

@Mukulikaa your suggested list of topics in the OP seems like a great start for a how-to. Keep in mind that the document doesn't have to be completely comprehensive, so I would recommend not worrying about getting all of the bullets into the first version of the document. I think this will be a very nice addition to the docs!

@Mukulikaa
Contributor Author

Hey @adeak, thanks for your helpful input!

And, @rossbar - will definitely keep your point in mind 🙂

@melissawm
Member

I believe there's nothing else to do here so closing. Thanks everyone!

5 participants