Best approach to game sprite object detection?
I am trying to get OpenCV to recognize the player in One Finger Death Punch. I am currently using template matching, but I am not getting any appropriate matches. Here is my template. Here is my test image.
Here is my code:
import numpy as np
from PIL import ImageGrab
import cv2
import time
from matplotlib import pyplot as plt

def screen_record():
    last_time = time.time()
    while(True):
        # 800x600 windowed mode
        #printscreen = np.array(ImageGrab.grab(bbox=(0,40,800,490)))
        img_rgba = cv2.imread('Resources/testimg.png',cv2.IMREAD_UNCHANGED)
        img_gray = cv2.cvtColor(img_rgba, cv2.COLOR_BGRA2GRAY)
        template = cv2.cvtColor(cv2.imread('Resources/base2.png',cv2.IMREAD_UNCHANGED),cv2.COLOR_BGRA2GRAY)
        print('loop took {} seconds'.format(time.time()-last_time))
        last_time = time.time()
        w, h = template.shape[::-1]
        res = cv2.matchTemplate(img_gray,template,cv2.TM_CCOEFF_NORMED)
        threshold = 0.5
        loc = np.where(res >= threshold)
        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgba, pt, (pt[0] + w, pt[1] + h), (0,255,255), 2)
        #cv2.imshow('window',cv2.cvtColor(printscreen, cv2.COLOR_BGR2RGB))
        cv2.imshow('window2', img_rgba)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

screen_record()
Eventually I want it to detect the player and enemies while the game is running, hence the function name, the commented-out code, and the variable names. Any advice on how best to accomplish this would be appreciated.
I just watched a video of One Finger Death Punch. It isn't a sprite-based fighting game; the graphics are vector-style, so the figures will rarely match a fixed template and template matching won't work. It is also quite a slow method.
What you could try:
Could you point me to some good documentation on these methods? I am having trouble finding information on color-based segmentation as opposed to marker-based segmentation. How to use HOG is also a bit confusing. For example, in both cases, how do I find the location of what I detect?
Edit: OK, through some dedicated research I managed to figure out and implement your first suggestions. Here is my current code. I got it to keep track of the player; although the player technically doesn't move much, the detection still sticks with its position. I am not sure I really need it to know about the player at all, since the player is always in the middle and the enemies come to you. Anyway, I got it to detect a single enemy, but detecting multiple enemies doesn't work.
OK, that's a good start!
To detect multiple enemies, don't use image moments: computed over the whole mask, they give only one centroid. cv2.connectedComponentsWithStats is better, as long as the enemies don't touch each other.
To detect enemies that touch each other, you can try several tricks:
OK, this is going to get lengthy, but here it goes.
Alright, I worked through your suggestions. I first tried connectedComponentsWithStats, which worked, but as you mentioned it doesn't work well when enemies touch, so I moved on to the Hough circle transform. This worked too, although it's very finicky. For example, on a plain grayscale image it was great once I got the settings right for the test image. When I ran it on the game, though, it kept dropping the detection. Also, although it didn't happen during the game test, during the image test it detected the player, which I didn't want. So I filtered out the player using my old detection method and then ran the result through the Hough circle transform, and again it was somewhat finicky. It was better, but it still cut out every once in a while...
Well, I just wrote a ton more and the site said forbidden, so the gist is: I added a Gaussian blur and it helped, but detection still cuts out, and the environment in the game also interferes with detection because of layering. See this for example, and the thresholded version here. Note how the pole and overhang block the heads. So how do I make it 100% consistent when they aren't blocked, and how do I deal with them being blocked? Also, here is my latest code.
Good news: I have made significant progress, to the point where it gets through the tutorial level! It still picks up some false positives every once in a while, which cause it to mess up, but it has gotten better recently now that I have switched to using pywin32 to capture the screen instead of PIL, as that sped up the time between captures a bit. Here is my latest code; let me know if you have any ideas for optimizations. One thing I would like to do is somehow remove an enemy I just hit from the list of enemies, as it may be detecting the dead enemy before it disappears and thus giving a false positive.
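One possible way to approach the "remove an enemy I just hit" idea is to remember where each hit landed and ignore detections near that spot for a few frames. The sketch below is only an illustration under that assumption; the class name `RecentHits`, the 40-pixel radius, and the frame count are hypothetical and untested against the actual game.

```python
from math import hypot

class RecentHits:
    """Remember recent hit positions and drop detections that fall near them."""

    def __init__(self, radius=40.0, ttl_frames=10):
        self.radius = radius          # how close a detection must be to be ignored
        self.ttl_frames = ttl_frames  # how many frames a hit is remembered
        self._hits = []               # each entry is [x, y, frames_left]

    def mark_hit(self, x, y):
        """Record that an enemy was just hit at (x, y)."""
        self._hits.append([x, y, self.ttl_frames])

    def filter(self, detections):
        """Call once per frame; returns the detections not near a recent hit."""
        kept = [
            (x, y)
            for x, y in detections
            if all(hypot(x - hx, y - hy) > self.radius for hx, hy, _ in self._hits)
        ]
        # Age every remembered hit by one frame and forget the expired ones.
        for hit in self._hits:
            hit[2] -= 1
        self._hits = [hit for hit in self._hits if hit[2] > 0]
        return kept

tracker = RecentHits(radius=40, ttl_frames=2)
tracker.mark_hit(100, 200)
frame1 = tracker.filter([(110, 205), (300, 200)])  # dying enemy near the hit is ignored
frame2 = tracker.filter([(110, 205), (300, 200)])  # still ignored on the second frame
frame3 = tracker.filter([(110, 205), (300, 200)])  # hit has expired, detection returns
print(frame1, frame2, frame3)
```

The catch is that a new enemy stepping into the same spot would also be ignored until the hit expires, so the radius and frame count would have to be kept small.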