By guyinmatsci


2019-07-11 18:06:18 8 Comments

I've got a numpy array of strings (str8192) where the second column holds the names of things. For the sake of this question, let's say the array is called thingList. I have two strings, string1 and string2. I'm trying to get a list of every item in the second column of thingList that appears in string1 or in string2. Currently I do this with a for loop, but I was hoping there was a faster way that I don't know about; I'm pretty new to programming.

Once I find a match, I also want to record what is in the first column but the same row as the match.

Any help speeding this up is greatly appreciated, as thingList is pretty large and this function is run quite a lot with various arrays.

tempThing = []
tempCode = []

for i in range(thingList.shape[0]):
    if thingList[i][1].lower() in string1.lower() or thingList[i][1].lower() in string2.lower():
        tempThing.append(thingList[i][1])
        tempCode.append(thingList[i][0])

This code works fine, but it is definitely the bottleneck in my program and is slowing it down a lot.

2 comments

@C.Nivs 2019-07-11 19:19:42

Numpy arrays iterate over their rows by default, so there is no need to write for i in range(...):

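import numpy as np

# the two rows must be wrapped in one outer list; a second positional
# argument to np.array would be interpreted as the dtype
x = np.array([list(range(3)), list(range(3, 6))])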

for i in x:
    print(i)

[0 1 2]
[3 4 5]

# This yields the same result, so use the former
for i in range(x.shape[0]):
    print(x[i])

[0 1 2]
[3 4 5]

Next, you are spending a ton of time doing str.lower() over and over again. I'd probably pre-lower all of your strings ahead of time:

y = np.array([list('ABC'), list('DEF')])

np.char.lower(y)
array([['a', 'b', 'c'],
       ['d', 'e', 'f']],
      dtype='<U1')

# apply this to string1 and string2
l_str1, l_str2 = string1.lower(), string2.lower()

Now your loop should look like:

l_str1, l_str2 = string1.lower(), string2.lower()

# each row unpacks as (code, name): column 0 is the code, column 1 is the name
for code, name in thingList:
    to_check = name.lower()

    if to_check in l_str1 or to_check in l_str2:
        tempThing.append(name)
        tempCode.append(code)

Now you can turn this into a generator expression:

# zip the original rows with their lower-cased copies so that
# str.lower() is not called inside the if statement
tmp = (tuple(uprow) for uprow, (a, b) in zip(thingList, np.char.lower(thingList))
       if b in l_str1 or b in l_str2)

# this unpacks the (code, name) pairs into two tuples;
# note that it raises ValueError if nothing matched
tempCode, tempThing = zip(*tmp)
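
For what it's worth, the same filtering can also be written with a boolean mask over the rows. This is only a minimal sketch: the sample rows, the codes C1 to C4, and the two strings are made up for illustration, and the substring test itself still runs in plain Python, so I would not assume it is faster without timing it on your data.

```python
import numpy as np

# Hypothetical sample data: column 0 is the code, column 1 is the name.
thingList = np.array([["C1", "Bo"],
                      ["C2", "b"],
                      ["C3", "ca"],
                      ["C4", "Patrick"]])
string1 = "Bobby"
string2 = "patrick Neils"

# Lower-case the two search strings once, outside any loop.
l_str1, l_str2 = string1.lower(), string2.lower()

# Lower-case the whole name column in one call.
lowered_names = np.char.lower(thingList[:, 1])

# Build one boolean per row: True when the name occurs in either string.
mask = np.array([name in l_str1 or name in l_str2 for name in lowered_names],
                dtype=bool)

# Use the mask to pull the matching names and codes in their original case.
tempThing = thingList[mask, 1].tolist()
tempCode = thingList[mask, 0].tolist()

print(tempThing)
print(tempCode)
```

Unlike zip(*tmp), the mask version simply gives two empty lists when nothing matches.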

@vlemaistre 2019-07-11 18:56:43

You could use a list comprehension, which is faster than a traditional for loop here. There are also a few minor improvements you could make so that your code runs faster:

thing_list = [['Thing1', 'bo'], ['Thing2', 'b'], [ 'Thing3', 'ca'],
              ['Thing4', 'patrick']]*100
string1 = 'bobby'
string2 = 'patrick neils'

# Compute your lower strings before the for loops to avoid
# calling the function at each loop
st1_lower = string1.lower()
st2_lower = string2.lower()

# You can store both the item and the name in the same array to reduce
# the computing time and do it in one list comprehension
result = [[x[0], x[1]] for x in thing_list
          if (x[1].lower() in st1_lower) or (x[1].lower() in st2_lower) ]

Output :

[['Thing1', 'bo'], ['Thing2', 'b'], ['Thing4', 'patrick']]

Performance :

For loops : 172 µs ± 9.59 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

List comprehension : 81.1 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
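
If you want to reproduce this kind of comparison on your own machine, something like the following should do. It is a rough sketch: the function names with_for_loop and with_comprehension are just labels I picked, the data is the sample thing_list from above, and the numbers you get will differ from the ones quoted.

```python
import timeit

thing_list = [['Thing1', 'bo'], ['Thing2', 'b'], ['Thing3', 'ca'],
              ['Thing4', 'patrick']] * 100
string1 = 'bobby'
string2 = 'patrick neils'


def with_for_loop():
    # Mirrors the original approach: lower() is called on every iteration.
    result = []
    for row in thing_list:
        if row[1].lower() in string1.lower() or row[1].lower() in string2.lower():
            result.append([row[0], row[1]])
    return result


def with_comprehension():
    # Lower-case the search strings once, then filter in a comprehension.
    st1_lower, st2_lower = string1.lower(), string2.lower()
    return [[x[0], x[1]] for x in thing_list
            if x[1].lower() in st1_lower or x[1].lower() in st2_lower]


# Both versions must agree before the timings mean anything.
assert with_for_loop() == with_comprehension()

loop_time = timeit.timeit(with_for_loop, number=1000)
comp_time = timeit.timeit(with_comprehension, number=1000)
print(f"for loop:           {loop_time / 1000 * 1e6:.1f} µs per call")
print(f"list comprehension: {comp_time / 1000 * 1e6:.1f} µs per call")
```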
