Optimization for screen grab, thresholding, and houghcircles

So I've been working on a project to make a bot visually play One Finger Death Punch. I'm looking for ways to improve and optimize it and need advice. Here is the code for my screen grab, image processing, and enemy detection using houghcircles.

Screen Grab

import win32gui
import win32ui
import win32con
import win32api
import numpy as np
import pywintypes

winlist = []
def enum_cb(hwnd, results):
    results.append((hwnd, win32gui.GetWindowText(hwnd)))

def get_screen(windowName):
    del winlist[:]  # clear between calls so the list does not grow with every grab
    win32gui.EnumWindows(enum_cb, winlist)

    matches = [(hwnd, title) for hwnd, title in winlist if windowName in title.lower()]
    if not matches or matches[0][1] != 'One Finger Death Punch':
        return
    hwnd = matches[0][0]

    try:
        if win32gui.GetForegroundWindow() != hwnd:
            win32gui.SetForegroundWindow(hwnd)
        left, top, right, bot = win32gui.GetWindowRect(hwnd)
        w = right - left
        h = bot - top
        if w != 800 or h != 470:
            win32gui.MoveWindow(hwnd, 0, 0, 800, 470, True)
        wDC = win32gui.GetWindowDC(hwnd)
        dcObj = win32ui.CreateDCFromHandle(wDC)
        cDC = dcObj.CreateCompatibleDC()
        dataBitMap = win32ui.CreateBitmap()
        dataBitMap.CreateCompatibleBitmap(dcObj, w - 18, h - 48)
        cDC.SelectObject(dataBitMap)
        cDC.BitBlt((0, 0), (w - 18, h - 48), dcObj, (left + 9, top + 37), win32con.SRCCOPY)

        signedIntsArray = dataBitMap.GetBitmapBits(True)
        img = np.frombuffer(signedIntsArray, dtype='uint8')  # np.fromstring is deprecated
        img = img.reshape((h - 48, w - 18, 4)).copy()  # copy so the array is writable

        dcObj.DeleteDC()
        cDC.DeleteDC()
        win32gui.ReleaseDC(hwnd, wDC)
        win32gui.DeleteObject(dataBitMap.GetHandle())

        return img, win32gui.GetWindowRect(hwnd)
    except pywintypes.error:
        return

Image Processing

import cv2
import numpy as np

def roi(img, vertices):
    # blank mask the same size as the input
    mask = np.zeros_like(img)
    for vert in vertices:
        # fill each polygon into the mask
        cv2.fillPoly(mask, vert, (255, 255, 255))
    # keep only the area covered by the mask
    return cv2.bitwise_and(img, mask)

def process_img(original_image, low_threshold, high_threshold, vertices):
    img_hsv = cv2.cvtColor(original_image, cv2.COLOR_BGR2HSV)

    img_thresholded = cv2.inRange(img_hsv, low_threshold, high_threshold)
    erode_size = 2   # was tuned with cv2.getTrackbarPos('erode', 'Control')
    dilate_size = 2  # was tuned with cv2.getTrackbarPos('dilate', 'Control')
    ekernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (erode_size, erode_size))
    dkernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (dilate_size, dilate_size))
    img_thresholded = cv2.erode(img_thresholded, ekernel, iterations=1)
    img_thresholded = cv2.dilate(img_thresholded, dkernel, iterations=1)

    processed_img = roi(img_thresholded, [vertices])

    return processed_img

Enemy Detection

import cv2
import numpy as np
from imageprocessing import roi, process_img

def findEnemies(screen, bbox, player_x):
    low_threshold = np.array([0, 0, 120])
    high_threshold = np.array([0, 0, 160])

    vert1 = np.array([[0, 155], [0, 120], [bbox[2] - 430, 120], [bbox[2] - 430, 155]], np.int32)
    vert2 = np.array([[411, 155], [411, 120], [bbox[2], 120], [bbox[2], 155]], np.int32)
    vertices = np.array([[vert1], [vert2]])  # enemy vision bands left and right of the player

    enemyview = process_img(screen, low_threshold, high_threshold, vertices)

    circles = cv2.HoughCircles(enemyview, cv2.HOUGH_GRADIENT, 2, 23,
                               param1=100, param2=20, minRadius=6, maxRadius=11)
    if circles is not None:
        circles = np.uint16(np.around(circles))
        for i in circles[0, :]:
            cv2.circle(screen, (i[0], i[1]), i[2], (0, 0, 255), 1)  # draw the detected circle
            cv2.putText(screen, str(abs(int(i[0]) - player_x)), (i[0], i[1]),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2)

    return enemyview, circles

Measuring the time between screen grabs, it starts out between 0.05 and 0.08 seconds, but it slowly climbs to between 0.08 and 0.1 once the level gets going, and in recent testing it seems to keep creeping higher; I am not sure why. This happens regardless of how low I set the sleep between and after my inputs.
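To figure out which stage is actually slowing down, I could accumulate per-stage timings with time.perf_counter and compare averages over a run; in the sketch below the wrapped lambdas are just stand-ins for my real get_screen/process_img calls:

```python
import time
from collections import defaultdict

class StageTimer:
    """Accumulate wall-clock time per named stage to see which one creeps up."""
    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def time(self, name, func, *args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def report(self):
        # average seconds per call for each stage
        return {name: self.totals[name] / self.counts[name] for name in self.totals}

# usage sketch: wrap each stage of the main loop
timer = StageTimer()
frame = timer.time('grab', lambda: [0] * 100)           # stand-in for get_screen
view = timer.time('process', lambda f: f[:50], frame)   # stand-in for process_img
print(timer.report())
```

Logging the report every few hundred frames would show whether the grab, the processing, or the Hough step is the part that drifts upward.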

There are a couple of issues I am having that relate to timing and how well HoughCircles works. The faster the images are processed, the more accurate my measurement of each enemy's distance. The problem is that the bot has no way to differentiate an enemy that is still active from one it just hit that is sitting in hitstun before it flies away, so even though that enemy can no longer attack, the bot still strikes in that direction as if a live enemy were there. This causes it to miss, which usually leads to it getting hit, both of which I am trying to avoid.
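One idea I am considering for the hitstun problem (a sketch, not yet tried in the game): remember the x positions I just attacked and ignore detections near them for a short cooldown, so a circle left over from an enemy in hitstun does not trigger another attack. The cooldown length and pixel tolerance below are guesses I would have to tune:

```python
import time

class HitSuppressor:
    """Ignore detections near recently attacked x positions for a short cooldown."""
    def __init__(self, cooldown=0.15, tolerance=15):
        self.cooldown = cooldown    # seconds to suppress after a hit (guess)
        self.tolerance = tolerance  # pixel distance counted as "same enemy" (guess)
        self.recent_hits = []       # list of (x, timestamp)

    def record_hit(self, x, now=None):
        now = time.monotonic() if now is None else now
        self.recent_hits.append((x, now))

    def is_live(self, x, now=None):
        now = time.monotonic() if now is None else now
        # drop entries whose cooldown has expired
        self.recent_hits = [(hx, t) for hx, t in self.recent_hits
                            if now - t < self.cooldown]
        return all(abs(x - hx) > self.tolerance for hx, _ in self.recent_hits)
```

In findEnemies I could then filter the Hough results with something like `live = [c for c in circles[0] if suppressor.is_live(c[0])]` and call `record_hit` whenever the bot attacks.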

On top of this, enemy detection can be finicky, sometimes resulting in the bot staring at an enemy until it's too late and it gets hit. I think a couple of things cause this. First, the game is layered, so some objects in the foreground block the view of enemies for a few frames; depending on how close those enemies are to the bot, this can cause it to miss them, though with fast enough image processing this should be a minor problem. Second, the weapons and hats enemies wear sometimes keep the detection from working, or at least from working well enough, until they are either closer or their hat is hit or knocked off.
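For the occlusion part, a little temporal persistence might help: keep a detection alive for a few frames after it disappears, so a foreground object passing in front does not make the bot lose the enemy entirely. A sketch of what I have in mind (the frame count and match tolerance are guesses):

```python
class DetectionTracker:
    """Carry detections over for a few frames so brief occlusions don't drop enemies."""
    def __init__(self, max_missed=3, tolerance=20):
        self.max_missed = max_missed  # frames a detection survives unseen (guess)
        self.tolerance = tolerance    # pixels to match a track to a detection (guess)
        self.tracks = []              # list of [x, missed_count]

    def update(self, detections):
        """Feed the x positions found this frame; returns all x positions still live."""
        matched = [False] * len(self.tracks)
        for x in detections:
            for idx, track in enumerate(self.tracks):
                if abs(track[0] - x) <= self.tolerance:
                    track[0] = x        # follow the enemy as it moves
                    track[1] = 0        # reset the missed counter
                    matched[idx] = True
                    break
            else:
                self.tracks.append([x, 0])  # unseen before: start a new track
                matched.append(True)
        # age unmatched tracks and drop any missing for too long
        for idx, track in enumerate(self.tracks):
            if not matched[idx]:
                track[1] += 1
        self.tracks = [t for t in self.tracks if t[1] <= self.max_missed]
        return [t[0] for t in self.tracks]
```

Feeding it the x coordinates from HoughCircles each frame would smooth over the frames where a foreground layer hides an enemy, at the cost of detections lingering slightly after an enemy is really gone.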

You can get the full project here if you want to see the rest of my code.