By guyinmatsci


2019-07-11 18:06:18 8 Comments

I've got a numpy array of strings (str8192) where the second column holds the names of things. For the sake of this question, let's say the array is called thingList. I have two strings, string1 and string2. I'm trying to get a list of every item in the second column of thingList that appears in string1 or in string2. Currently I do this with a for loop, but I was hoping there is a faster way that I don't know about; I'm pretty new to programming.

Once I find a match, I also want to record what is in the first column but the same row as the match.

Any help speeding this up is greatly appreciated, as thingList is pretty large and this function is run quite a lot with various arrays.

tempThing = []
tempCode = []

for i in range(thingList.shape[0]):
    if thingList[i][1].lower() in string1.lower() or thingList[i][1].lower() in string2.lower():
        tempThing.append(thingList[i][1])
        tempCode.append(thingList[i][0])

This code works fine, but it definitely is the bottleneck in my program and is slowing it down a lot.

2 comments

@C.Nivs 2019-07-11 19:19:42

Iterating over a 2-D NumPy array yields its rows by default, so there is no need for for i in range(...):

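import numpy as np

# the two rows must be wrapped in one outer list; a second positional
# argument to np.array would be interpreted as the dtype
x = np.array([list(range(3)), list(range(3, 6))])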

for i in x:
    print(i)

[0 1 2]
[3 4 5]

# This yields the same result, so use the former
for i in range(x.shape[0]):
    print(x[i])

[0 1 2]
[3 4 5]

Next, you are spending a ton of time doing str.lower() over and over again. I'd probably pre-lower all of your strings ahead of time:

y = np.array([list('ABC'), list('DEF')])

np.char.lower(y)
array([['a', 'b', 'c'],
       ['d', 'e', 'f']],
      dtype='<U1')

# apply this to string1 and string2
l_str1, l_str2 = string1.lower(), string2.lower()

Now your loop should look like:

l_str1, l_str2 = string1.lower(), string2.lower()

for code, thing in thingList:
    to_check = thing.lower()

    if to_check in l_str1 or to_check in l_str2:
        tempThing.append(thing)  # second column: the matched name
        tempCode.append(code)    # first column of the same row

Now you can turn this into a list comprehension:

# zip each original row with its lowercased copy so that str.lower()
# is not called inside the if condition
tmp = [tuple(uprow) for uprow, (a, b) in zip(thingList, np.char.lower(thingList))
       if b in l_str1 or b in l_str2]

# this unpacks the (code, thing) pairs; it raises ValueError if tmp is empty
tempCode, tempThing = zip(*tmp)
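
As a variation on the same idea, here is a minimal sketch that builds a boolean mask and indexes the array with it. The sample rows and the names lowered_names and mask are made up for illustration; only thingList, string1, string2, tempThing and tempCode come from the question. The substring test itself is still a Python-level loop, so this is not a fully vectorized solution. Its main advantage is that it also works when nothing matches, whereas unpacking zip(*tmp) raises ValueError on an empty result.

```python
import numpy as np

# Hypothetical sample data: column 0 holds a code, column 1 holds a name.
thingList = np.array([["T1", "Bo"], ["T2", "xyz"], ["T3", "Patrick"]])
string1 = "bobby"
string2 = "patrick neils"

l_str1, l_str2 = string1.lower(), string2.lower()

# Lowercase the whole name column once, outside the loop.
lowered_names = np.char.lower(thingList[:, 1])

# One boolean per row: True when the lowered name is a substring
# of either search string.
mask = np.array(
    [name in l_str1 or name in l_str2 for name in lowered_names],
    dtype=bool,
)

# Boolean indexing keeps the original (non-lowered) values of matching rows.
tempThing = thingList[mask, 1].tolist()
tempCode = thingList[mask, 0].tolist()

print(tempThing)  # ['Bo', 'Patrick']
print(tempCode)   # ['T1', 'T3']
```

If no row matches, both lists simply come back empty.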

@vlemaistre 2019-07-11 18:56:43

You could use a list comprehension, which is usually faster than an equivalent for loop with append calls. There are also a few minor improvements that would make your code run faster:

thing_list = [['Thing1', 'bo'], ['Thing2', 'b'], ['Thing3', 'ca'],
              ['Thing4', 'patrick']] * 100
string1 = 'bobby'
string2 = 'patrick neils'

# Compute your lower strings before the for loops to avoid
# calling the function at each loop
st1_lower = string1.lower()
st2_lower = string2.lower()

# You can store both the item and the name in the same array to reduce
# the computing time and do it in one list comprehension
result = [[x[0], x[1]] for x in thing_list
          if (x[1].lower() in st1_lower) or (x[1].lower() in st2_lower) ]

Output (shown for a single copy of the four rows, i.e. without the * 100):

[['Thing1', 'bo'], ['Thing2', 'b'], ['Thing4', 'patrick']]

Performance :

For loops : 172 µs ± 9.59 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

List comprehension : 81.1 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
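
The answer does not show how these timings were taken. One way to reproduce a comparison of this kind is sketched below with the standard timeit module. The function names with_for_loop and with_comprehension are hypothetical, and the for-loop version is an assumption that adapts the question's loop to a plain list. Absolute numbers will differ from machine to machine.

```python
import timeit

thing_list = [['Thing1', 'bo'], ['Thing2', 'b'], ['Thing3', 'ca'],
              ['Thing4', 'patrick']] * 100
string1 = 'bobby'
string2 = 'patrick neils'


def with_for_loop():
    # Question-style loop: lowercases both search strings on every iteration.
    result = []
    for i in range(len(thing_list)):
        if (thing_list[i][1].lower() in string1.lower()
                or thing_list[i][1].lower() in string2.lower()):
            result.append([thing_list[i][0], thing_list[i][1]])
    return result


def with_comprehension():
    # Answer-style comprehension: search strings are lowercased once.
    st1_lower = string1.lower()
    st2_lower = string2.lower()
    return [[x[0], x[1]] for x in thing_list
            if (x[1].lower() in st1_lower) or (x[1].lower() in st2_lower)]


# Both versions must agree before their speed is compared.
assert with_for_loop() == with_comprehension()

runs = 1000
for fn in (with_for_loop, with_comprehension):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} us per call")
```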
