Looking through the unanswered Python questions on StackOverflow, I found one that seemed interesting: "Python Get Screen Pixel Value in OS X" - how to access screen pixel values without the overhead of calling the screencapture command, then loading the resulting image.
After a bit of searching, the best-supported way of grabbing a screenshot is provided by the CoreGraphics API, part of Quartz, specifically CGWindowListCreateImage.
Since CoreGraphics is a C-based API, the calls map almost directly to Python function calls. It's also simplified a bit, because PyObjC handles most of the memory management (when the wrapping Python object goes out of scope, the underlying object is freed).
Getting the image
After finding some sample iOS code with sane arguments (which can also be found via Apple's docs), I ended up with a CGImage containing the screenshot:
```python
>>> import Quartz.CoreGraphics as CG
>>> image = CG.CGWindowListCreateImage(
...     CG.CGRectInfinite,
...     CG.kCGWindowListOptionOnScreenOnly,
...     CG.kCGNullWindowID,
...     CG.kCGWindowImageDefault)
>>> print image
<CGImage 0x106b8eff0>
```
Hurray. We can get the width/height of the image with help from this SO question:
```python
>>> width = CG.CGImageGetWidth(image)
>>> height = CG.CGImageGetHeight(image)
```
Extracting pixel values
Then it was a case of working out how to extract the pixel values, which took far longer than all of the above. The simplest way I found of doing this is:
- Use CGImageGetDataProvider to get an intermediate representation of the data
- Pass the DataProvider to CGDataProviderCopyData. In Python this returns a string, which is really a byte-array containing 8-bit unsigned chars, suitable for unpacking with the handy struct module
- Calculate the correct offset for a given (x, y) coordinate, as described here
Like so:
```python
>>> prov = CG.CGImageGetDataProvider(image)
>>> data = CG.CGDataProviderCopyData(prov)
>>> print prov
<CGDataProvider 0x7fc19b1022f0>
>>> print type(data)
<objective-c class __NSCFData at 0x7fff78073cf8>
```
...and calculate the offset:

```python
>>> x, y = 100, 200  # pixel coordinate to get value for
>>> offset = 4 * ((width*int(round(y))) + int(round(x)))
>>> print offset
1344400
```
Finally, we can unpack the pixels at that offset with struct.unpack_from - B is an unsigned char:

```python
>>> import struct
>>> b, g, r, a = struct.unpack_from("BBBB", data, offset=offset)
>>> print (r, g, b, a)
(23, 23, 23, 255)
```
Note that the values are stored as BGRA (not RGBA).
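As a sanity check, the offset-and-unpack steps can be exercised without Quartz at all. Here the two-pixel data buffer is a made-up stand-in for the bytes copied out of the CGDataProvider (stored BGRA, four bytes per pixel):

```python
import struct

# Hypothetical stand-in for the copied screenshot bytes: a 2x1 image,
# stored BGRA - one blue pixel, then one red pixel
width = 2
data = struct.pack("BBBB", 255, 0, 0, 255) + struct.pack("BBBB", 0, 0, 255, 255)

# Offset of the pixel at (x, y) = (1, 0): 4 bytes per pixel, row-major
x, y = 1, 0
offset = 4 * ((width * int(round(y))) + int(round(x)))

# Unpack as BGRA, then reorder to RGBA
b, g, r, a = struct.unpack_from("BBBB", data, offset=offset)
print((r, g, b, a))  # -> (255, 0, 0, 255), i.e. the red pixel
```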
Verification, and code
To verify this wasn't generating nonsense values, I used the nice and simple pngcanvas to write the screenshot to a PNG file (pngcanvas is a useful module because it's pure Python and a single self-contained .py file - much lighter-weight than something like PIL, good for when you just want to write pixels to an image file).
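If pngcanvas isn't to hand, writing RGBA pixels to a PNG is small enough to do with just the standard library. This write_png helper is my own sketch (assuming 8-bit RGBA input and no scanline filtering), not part of pngcanvas:

```python
import struct
import zlib

def write_png(path, width, height, pixels):
    """Write 8-bit RGBA pixels (row-major list of (r, g, b, a)) as a PNG."""
    def chunk(tag, data):
        # Each PNG chunk: length, tag, data, then CRC over tag+data
        return (struct.pack(">I", len(data)) + tag + data +
                struct.pack(">I", zlib.crc32(tag + data) & 0xffffffff))

    # Prefix every scanline with filter byte 0 (no filtering)
    raw = b"".join(
        b"\x00" + b"".join(struct.pack("BBBB", *px)
                           for px in pixels[y * width:(y + 1) * width])
        for y in range(height))

    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)  # 8-bit RGBA
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")  # PNG signature
        f.write(chunk(b"IHDR", ihdr))
        f.write(chunk(b"IDAT", zlib.compress(raw)))
        f.write(chunk(b"IEND", b""))

# Example: a 2x2 image, top row red, bottom row green
write_png("test.png", 2, 2,
          [(255, 0, 0, 255), (255, 0, 0, 255),
           (0, 255, 0, 255), (0, 255, 0, 255)])
```

The same loop as in the verification code would then feed it the screenshot: collect sp.pixel(x, y) for every coordinate, and pass the list in.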
The performance was definitely better than the screencapture solution. The screencapture command took about 80ms to write a TIFF file, and then there would be additional time to open and parse the TIFF file in Python. The PyObjC code takes about 70ms to take the screenshot and make the values accessible to Python.
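Those timings came from wrapping the calls in a simple context-manager stopwatch. A standalone sketch (the timer name and dict-based reporting are my own choices here):

```python
import contextlib
import time

@contextlib.contextmanager
def timer(label, results):
    # Record elapsed wall-clock time, in milliseconds, under the given label
    start = time.time()
    yield
    results[label] = (time.time() - start) * 1000.0

results = {}
with timer("sleep", results):
    time.sleep(0.05)  # stand-in for sp.capture()

print("%s: %.02fms" % ("sleep", results["sleep"]))
```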
Finally, the result - it's best to view the code on my StackOverflow answer (as there might be other, better answers, or edits to the code). I'll include the code here too, for completeness' sake:
```python
import time
import struct

import Quartz.CoreGraphics as CG


class ScreenPixel(object):
    """Captures the screen using CoreGraphics, and provides access to
    the pixel values.
    """

    def capture(self, region=None):
        """region should be a CGRect, something like:

        >>> import Quartz.CoreGraphics as CG
        >>> region = CG.CGRectMake(0, 0, 100, 100)
        >>> sp = ScreenPixel()
        >>> sp.capture(region=region)

        The default region is CG.CGRectInfinite (captures the full screen)
        """
        if region is None:
            region = CG.CGRectInfinite
        else:
            # TODO: Odd widths cause the image to warp. This is likely
            # caused by offset calculation in ScreenPixel.pixel, and
            # could be modified to allow odd widths
            if region.size.width % 2 > 0:
                emsg = "Capture region width should be even (was %s)" % (
                    region.size.width)
                raise ValueError(emsg)

        # Create screenshot as CGImage
        image = CG.CGWindowListCreateImage(
            region,
            CG.kCGWindowListOptionOnScreenOnly,
            CG.kCGNullWindowID,
            CG.kCGWindowImageDefault)

        # Intermediate step, get pixel data as CGDataProvider
        prov = CG.CGImageGetDataProvider(image)

        # Copy data out of CGDataProvider, becomes string of bytes
        self._data = CG.CGDataProviderCopyData(prov)

        # Get width/height of image
        self.width = CG.CGImageGetWidth(image)
        self.height = CG.CGImageGetHeight(image)

    def pixel(self, x, y):
        """Get pixel value at given (x, y) screen coordinates

        Must call capture first.
        """

        # Pixel data is unsigned char (8bit unsigned integer),
        # and there are four per pixel: (blue, green, red, alpha)
        data_format = "BBBB"

        # Calculate offset, based on
        # http://www.markj.net/iphone-uiimage-pixel-color/
        offset = 4 * ((self.width*int(round(y))) + int(round(x)))

        # Unpack data from string into Python'y integers
        b, g, r, a = struct.unpack_from(data_format, self._data, offset=offset)

        # Return BGRA as RGBA
        return (r, g, b, a)


if __name__ == '__main__':
    # Timer helper-function
    import contextlib

    @contextlib.contextmanager
    def timer(msg):
        start = time.time()
        yield
        end = time.time()
        print "%s: %.02fms" % (msg, (end-start)*1000)

    # Example usage
    sp = ScreenPixel()

    with timer("Capture"):
        # Take screenshot (takes about 70ms for me)
        sp.capture()

    with timer("Query"):
        # Get pixel value (takes about 0.01ms)
        print sp.width, sp.height
        print sp.pixel(0, 0)

    # To verify screen-cap code is correct, save all pixels to PNG,
    # using http://the.taoofmac.com/space/projects/PNGCanvas
    from pngcanvas import PNGCanvas
    c = PNGCanvas(sp.width, sp.height)
    for x in range(sp.width):
        for y in range(sp.height):
            c.point(x, y, color=sp.pixel(x, y))

    with open("test.png", "wb") as f:
        f.write(c.dump())
```