SikuliX API 1.1.0 is on Maven Central

It is now available for dependency download in your projects; this is the index entry.

If you used the snapshot version until now, just remove the -SNAPSHOT from the version in the dependency specification.
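For a Maven project, the dependency entry should then look roughly like the following (coordinates as published on Maven Central; double-check the group and artifact IDs against the index entry):

```xml
<dependency>
  <!-- verify these coordinates against the Maven Central index entry -->
  <groupId>com.sikulix</groupId>
  <artifactId>sikulixapi</artifactId>
  <version>1.1.0</version>
</dependency>
```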

The OSSRH repository entry is no longer needed, unless you decide to use snapshots of version 1.1.1 again towards the end of this year.


New in 1.1.0 — observe revised with new features

The observe feature (see the docs, especially for usage with Java) allows you to wait for an image to appear or vanish, or to wait for changes of the pixel content in a region on the screen. This observation can be done inline (the script waits until the observe ends) or in parallel to the continuing workflow, meaning that the observation is delegated to a thread (hence runs in the background).

Before running the observation, you register one or more events (appear, vanish or change) with the region where you expect the events to happen. Since the beginning of the observe feature in Sikuli, you have been able to specify a handler function (callback) that is called when the event happens and where you can handle the situation, with the option to stop the observation at that point. With one region object only one observation can run at any time (though you can have different objects for the same area on the screen), but you may register as many events as needed (… or as make sense).

Here I only want to talk about background observes, which are the more interesting case (an inline observe is just a few bundled waits, only a bit more sophisticated).

So the usual pattern in a Python script is:

def handler(event):
    # do something
    pass

reg = <some region object>
reg.onAppear("someImage.png", handler)
reg.observeInBackground(FOREVER) # observe in background
# script immediately continues here

The moment someImage.png appears in the given region, we want to do something (which is done in the handler function). Since the observe runs in the background, our script continues, and the handler does its job in parallel to our script the moment the event happens.

When the script ends all running observations are stopped automatically.
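This background behaviour can be pictured as a daemon thread polling for the event. The following plain-Python sketch (not SikuliX code; the name observe_in_background and its parameters are made up for illustration) models the idea — daemon threads die with the main script, just as SikuliX stops all running observations automatically:

```python
import threading
import time

def observe_in_background(check, handler, interval=0.05, timeout=2.0):
    # Poll check() in a daemon thread and call handler() once it returns True.
    # A daemon thread stops when the main script ends, mirroring how SikuliX
    # automatically stops running observations at script end.
    def loop():
        deadline = time.time() + timeout
        while time.time() < deadline:
            if check():
                handler()
                return
            time.sleep(interval)
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t

appeared = threading.Event()   # stands in for "someImage.png shows up"
seen = []
observe_in_background(appeared.is_set, lambda: seen.append("handled"))
# the main workflow continues immediately ...
appeared.set()                 # ... and the "image" appears a moment later
time.sleep(0.3)                # give the background watcher time to react
print(seen)                    # -> ['handled']
```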

With the revision in 1.1.0 the observe feature now is very reliable and produces valuable debug information if needed.

These are the major changes and new features:

  • observation of appear and vanish events is paused when they happen – the events have to be explicitly told to continue their observation
  • there is a count of how often an event has happened so far
  • events get a unique name at registration time that can later be used to access the event information (region, image, match, count, …)
  • the event attributes are accessed using getters
  • you can inactivate (pause) the observation of an event and activate it again later
  • at any time you can get relevant information about the state of an observation

Another helpful new feature: mouse actions are now handled like transactions. Even in parallel workflows (such as a main script with background handlers), only one compound mouse action can be processed at any one time. This means a mouse click in a handler completes before a parallel click in the main workflow is processed, and vice versa. This rather complex feature is worth its own article.
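The transaction idea can be sketched in plain Python (again not SikuliX internals, just an illustration of the concept): a shared lock guarantees that the steps of one compound mouse action never interleave with another thread's action.

```python
import threading
import time

mouse_lock = threading.RLock()   # only one compound mouse action at a time
log = []

def compound_click(who):
    # a "transaction": move + press + release must not interleave
    with mouse_lock:
        log.append(who + ":move")
        time.sleep(0.01)
        log.append(who + ":press")
        time.sleep(0.01)
        log.append(who + ":release")

t = threading.Thread(target=compound_click, args=("handler",))
t.start()
compound_click("main")           # runs in parallel with the handler's click
t.join()

# each transaction's three steps stay contiguous in the log
for i in range(0, len(log), 3):
    who = log[i].split(":")[0]
    assert [e.split(":")[0] for e in log[i:i + 3]] == [who] * 3
print("no interleaving")
```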

The same example as above, now continuing the observation after the event happened:

def handler(event):
    # do something
    if event.getCount() < 5:
        event.repeat(1) # continue observing after a 1 second pause

reg = <some region object>
reg.onAppear("someImage.png", handler)
reg.observeInBackground(FOREVER) # observe in background
# script immediately continues here

Besides showing the getters it is in principle the same as above. But now we limit the appearances to 5 (after that the event is deactivated) and say that the observation for this event should continue only after a pause of 1 second after returning from the handler.

With this last example I want to show the usage of named events.

The challenge is to scroll a webpage until an image becomes visible. I know that this can be done rather easily with exists() too; it is done here with observe for demonstration only.

For this snippet to work, the mouse must be positioned at a place where the focused window accepts mouse wheeling.

reg = <some region object>
imgStop = "imgStop.png" # the stop image to wait for

stopAppeared = reg.onAppear(imgStop) # no handler, saving the name
reg.observeInBackground(FOREVER) # start observation
while not reg.isObserving(): wait(0.3) # hack currently needed

while reg.isObserving(): # until observing is stopped
    wheel(WHEEL_UP, 10) # wheel (might be WHEEL_DOWN on Windows)

m = reg.getEvent(stopAppeared).getMatch() # get the event's match
hover(m) # move the mouse there
m.highlight(2) # highlight the match for 2 seconds

We register an appear event without specifying a handler, but store the event’s name in stopAppeared. We use Region.isObserving() to wait for the observation to stop, which happens because our single event has occurred. While the observe is running, we use the mouse wheel to scroll the page. After the observation has ended, we get the event’s match, move the mouse there and add a 2-second highlight.

The mentioned hack after starting the observation is currently needed to bridge the startup time of the observation (loading the image and preparing the threaded search loop). This will be revised in version 1.1.1.

Another possibility would have been to use Region.hasEvents() or Region.hasEvent(stopAppeared) instead of Region.isObserving(), like this:

while len(reg.hasEvents()) == 0: # returns a possibly empty list
    wheel(WHEEL_UP, 10)
# or
while not reg.hasEvent(stopAppeared): # returns the event or None
    wheel(WHEEL_UP, 10)

Hope this gives some ideas about the new possibilities in version 1.1.0.


AnkuLua: Game automation – SikuliX-like scripting on Android

A Taiwanese fan of the SikuliX API has created a visual automation tool for Android that was inspired by the SikuliX features. The app runs directly on the device, and the scripting itself is done in the Lua scripting language.

The app name AnkuLua is based on: Android Sikuli Lua.

The entry on Google Play says:

Let AnkuLua run applications (like playing games) for you.

✓no root required
✓one script for all devices
✓use simple Lua script language
✓straightforward usage
✓fast image matching
✓auto click/tap

with AnkuLua, users can do the following, but are not limited to
✓click on pictures (with offset)
✓wait for pictures to appear in specified time
✓wait for pictures to vanish in specified time
✓type texts
✓send key events (like home, back)
✓drag and drop from one picture to another one
✓set similarity to compared pictures
✓search only some regions of the screen

According to the developer, the implementation is mainly written in Java and utilizes OpenCV; running Lua scripts is supported by a service. The next steps are planned to make AnkuLua more robust and stable.

The app on Google Play: AnkuLua

Here you can get more information.


SikuliX – how does it find images on the screen?

SikuliX uses the OpenCV package for finding an image on the screen.
The SikuliX feature is based on OpenCV’s method matchTemplate(), which is rather well explained on this example page. If you are not familiar with how it works, have a look there first, then come back and read on.
A basic feature in Sikulix is to wait for an image to appear in a given region:
# top left part of the screen
aRegion = Region(0, 0, 500, 500)
# a png image file on the file system
# this is the image we want to look for in the given Region
aImage = "someImage.png"
# search and get the result
aMatch = aRegion.find(aImage)
To keep things simple here, I will not talk about how you create aImage – we just assume it is there and accessible.
matchTemplate() expects an equally sized or larger image (base) in which the given image (target) should be searched. To prepare that, we internally take a screenshot (using the Java Robot class) of the screen area defined by the given aRegion. This is now the base image and is held in memory. The target image is also created as an in-memory image, read from the image file. Both images are then converted to the needed OpenCV objects (CVMat).
Now we run the matchTemplate() function and get a matrix the size of the base image that contains, for each pixel, a similarity score for the target image compared pixel by pixel with its top left corner at this pixel location. If this is not clear, go back to the OpenCV example mentioned above and try to understand it. The score values at each pixel location vary between 0.0 and 1.0: the lower the value, the lower the probability that the area with its top left corner at this pixel location contains the target image. Score values above 0.7 – 0.8 signal a high probability that the image is here.
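The idea of a per-offset score matrix can be shown with a toy version in plain Python (nested lists of grayscale values; the scoring formula here is a simple mean-squared-difference mapped to 0.0–1.0, not OpenCV's actual correlation methods):

```python
# Toy matchTemplate: slide the target over the base image and compute a
# similarity score (1.0 = identical) for every top-left offset.
def match_template(base, target):
    bh, bw = len(base), len(base[0])
    th, tw = len(target), len(target[0])
    scores = []
    for y in range(bh - th + 1):
        row = []
        for x in range(bw - tw + 1):
            # mean squared difference, mapped to a 0.0 .. 1.0 score
            sq = sum((base[y + j][x + i] - target[j][i]) ** 2
                     for j in range(th) for i in range(tw))
            row.append(1.0 - sq / (th * tw * 255.0 ** 2))
        scores.append(row)
    return scores

base = [[0,   0,   0,   0],
        [0, 255, 255,   0],
        [0, 255, 255,   0],
        [0,   0,   0,   0]]
target = [[255, 255],
          [255, 255]]
scores = match_template(base, target)
# the best score marks the most probable top left corner of the target
best = max((s, (x, y)) for y, r in enumerate(scores) for x, s in enumerate(r))
print(best)   # -> (1.0, (1, 1)): exact match with top left corner at (1, 1)
```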
In the next step, we use another OpenCV method to get the relevant maximum value (the result score) from the mentioned result matrix, meaning that we are looking for the pixel that is most probably the top left corner of the target image in the base image.
If nothing else is specified, only a result score > 0.7 is taken as found; lower values signal a FindFailed exception. Depending on various aspects of the target image (mainly how much plain background is contained towards its edges), one usually gets result scores > 0.8 or even 0.9. If one follows SikuliX’s recommendations on how to create target images, one should in most cases get result scores > 0.95 or even > 0.99 (internally taken as an exact match with 1.0).
If the result score is accepted as found, we next create a match object that denotes the region on the screen that most probably contains the image (aMatch in the above snippet).
If the image is not found (result score not acceptable), we either terminate the search operation signalling FindFailed, or start a new search with a new screenshot of the given region. This is repeated until the image is either found or the given or implicit waiting time (3 seconds by default) has elapsed, which also results in a FindFailed signal. The rate of this repetition can be specified to reduce SikuliX’s CPU usage, since the search process is pure number crunching.
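This repeat-until-found-or-timeout loop can be sketched as follows (plain Python, not SikuliX internals; wait_for, find_once and the FindFailed class here are illustrative stand-ins, with the default timeout and a configurable scan rate as described above):

```python
import time

class FindFailed(Exception):
    pass

def wait_for(find_once, timeout=3.0, scan_rate=3.0):
    # Repeat the search until found or until `timeout` seconds have elapsed
    # (SikuliX default: 3 seconds); `find_once` stands in for one
    # screenshot-plus-matchTemplate pass, `scan_rate` limits CPU usage.
    deadline = time.time() + timeout
    while True:
        match = find_once()
        if match is not None:
            return match
        if time.time() >= deadline:
            raise FindFailed("image not found within %.1fs" % timeout)
        time.sleep(1.0 / scan_rate)

# usage: the "image" appears on the third screenshot
attempts = {"n": 0}
def fake_find():
    attempts["n"] += 1
    return "match" if attempts["n"] >= 3 else None

print(wait_for(fake_find, timeout=2.0, scan_rate=10.0))   # -> match
```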
A word on elapsed time for search operations: the larger the base image, the longer the search; and the smaller the size difference between the two images, the faster it is. On modern systems with large monitors, searching for a small to medium sized image (up to 10,000 pixels) may take between 0.5 and 1 second or even more. The usual approach to reducing search time is to shrink the search region as much as possible to the area where one expects the target image to appear. Small images of some 10 pixels in search regions of some 1,000 pixels are found within some 10 milliseconds or even faster.
The current version 1.1.0 of SikuliX implements a still-there feature: before searching in the search region, it first checks whether the image is still in the same place as at the time of the last search (provided the search region contains this last match). On success, this preflight operation usually takes only a few milliseconds, which speeds up workflows enormously if they contain repetitive tasks with the same images.
People who do not know the magic behind SikuliX’s search feature and the matchTemplate() function often wonder why images showing up multiple times on the screen are not found in some regular order (e.g. top left to bottom right). The reason is the implementation of matchTemplate() as a statistical numeric matrix calculation. So never expect SikuliX to return the top left occurrence of a visual that appears more than once on the screen at search time: the result is not predictable in this sense.
If you want to find a specific item of these multiple occurrences, you have to restrict the search region, so that only the one you are looking for is found.
For cases where this is not suitable, or if you want to cycle through all appearances, there is the findAll() method, which returns a list of matches in decreasing result-score order. You can work through this list according to position on the screen by using the (x, y) top left corner coordinates of the matches. Internally, findAll evaluates the search result matrix by repeatedly looking for the next maximum value after having “switched off” some area around the last one.
There are convenience functions available, that return the list of matches along rows or columns starting with the match at top left.
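The repeated "take the maximum, then switch off its neighborhood" idea behind findAll can be sketched on a toy score matrix (plain Python; find_all, the threshold and the suppression radius are illustrative choices, not SikuliX's actual values):

```python
# Sketch of findAll's evaluation of the score matrix: repeatedly take the
# global maximum, then zero out a neighborhood around it so the next-best,
# non-overlapping location wins on the following pass.
def find_all(scores, threshold=0.7, radius=1):
    grid = [row[:] for row in scores]          # work on a copy
    matches = []
    while True:
        best, bx, by = -1.0, -1, -1
        for y, row in enumerate(grid):
            for x, s in enumerate(row):
                if s > best:
                    best, bx, by = s, x, y
        if best < threshold:
            break                              # no acceptable score left
        matches.append((bx, by, best))
        for y in range(max(0, by - radius), min(len(grid), by + radius + 1)):
            for x in range(max(0, bx - radius), min(len(grid[0]), bx + radius + 1)):
                grid[y][x] = 0.0               # "switch off" this area
    return matches                             # in decreasing score order

scores = [[0.1, 0.95, 0.1, 0.1],
          [0.1, 0.1,  0.1, 0.85],
          [0.9, 0.1,  0.1, 0.1]]
print(find_all(scores))   # -> [(1, 0, 0.95), (0, 2, 0.9), (3, 1, 0.85)]
```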
To learn more about SikuliX you should have a look at the docs.