
OpenCV real-time desktop image capturing using D3DShot

Background

I have been studying how to use OpenCV and image detection to make a bot for online gaming. The gist is that you use PyAutoGUI or Pillow to capture desktop screenshots in a loop, and then try to detect a smaller image within each screenshot using OpenCV.

The first challenge you will run into is that this is very slow. Capturing my 3440 x 1440 desktop with Pillow, I only get ~10 FPS on a modern computer. The frame rate scales with resolution: the lower the resolution, the higher the FPS. So you might then try using pywin32 to capture only the game window, since the window can be made smaller than the full desktop. The problem is that this only works with certain programs. For many games, it keeps returning only the first frame of the window (or a black screen) rather than the current frame. The problem is described here: https://stackoverflow.com/questions/68396303/python-windows-screen-capture-continuous-streaming-issue . This brings you back to capturing the entire desktop.

Another problem you might encounter is that the methods above only let you capture your primary desktop or all of your desktops at once. I found no way to use them to capture just a secondary monitor.

Solution

Use D3DShot. In my tests it gives about double the performance of Pillow, PyAutoGUI, or pywin32, and it lets you specify which desktop you want to capture (I could find no other way to do this). The catch is that I could only get it working with Python 3.8. The code is below. It gives me ~20 FPS versus the ~10 FPS I got with the other solutions. I left in the Pillow code so you can do your own comparisons.

import numpy as np # used in ImageGrab version, not in D3DShot
import cv2
from PIL import ImageGrab # used in ImageGrab version, not in D3DShot
from time import time, sleep
import d3dshot


desktop = d3dshot.create(capture_output="numpy")
desktop.display = desktop.displays[0] # change 0 in order to get different desktops

loop_time = time()
sleep(0.1) #use this otherwise we get a divide by 0 error for our FPS timer

while True:

    # get updated image of screen

    # uncomment the 3 lines below (and comment out the D3DShot line) to use ImageGrab instead
    #img = ImageGrab.grab()
    #img_np = np.array(img)
    #frame = cv2.cvtColor(img_np, cv2.COLOR_RGB2GRAY)  # Pillow returns RGB, so convert from RGB

    # D3DShot capture (comment this line out when using ImageGrab above)
    # the channel swap puts the captured RGB frame into the BGR order cv2.imshow expects
    frame = cv2.cvtColor(desktop.screenshot(), cv2.COLOR_RGB2BGR)

    cv2.imshow("Frame", frame)
    print('FPS {}'.format(1 / (time() - loop_time)))
    loop_time = time()

    if cv2.waitKey(1) == ord('q'):
        cv2.destroyAllWindows()
        break

print('Done.')
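
The loop above prints an instantaneous FPS for every frame, which jumps around a lot. As an alternative, here is a minimal sketch of a counter that averages over the last few frames and guards against a zero interval itself, so no sleep is needed before the loop. FpsCounter, its window size, and the busy-work line standing in for the capture are all hypothetical and only meant to show the idea.

```python
from collections import deque
from time import perf_counter


class FpsCounter:
    """Average FPS over the last `window` frames (hypothetical helper)."""

    def __init__(self, window=30):
        # Only the most recent `window` timestamps are kept
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the averaged FPS (0.0 until two frames exist)."""
        self.stamps.append(perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        # Guard against a zero interval instead of sleeping before the loop
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0


fps = FpsCounter(window=30)
for _ in range(5):
    sum(range(100_000))  # stand-in for the capture and imshow work
    print('FPS {:.1f}'.format(fps.tick()))
```

In the capture loop, one call to fps.tick() per frame would replace the loop_time bookkeeping. The optional now argument exists only so the counter can be checked with fixed timestamps.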
