Consistent Selenium Testing in Python


Back in April, I learned about Timestrap, a self-hostable, Django-based time-tracking project, from a post on Hacker News by Isaac Bythewood. As I have been learning Python in the past year or so, I reached out to Isaac and started contributing to the project. After getting familiar with the core application, I turned my attention to testing and eventually found my way to Selenium, a collection of browser automation tools used for frontend testing.

I had never worked with Selenium or other automated testing products, so it struck me as a great opportunity to get my feet wet in something new. After getting things up and running, we quickly learned that the test results were quite inconsistent across development environments - even to a point that occasionally tests would succeed when run individually, but fail with the full test case.

After much trial and error, we have settled on a (mostly) consistent setup for testing with Selenium, Python and SauceLabs. This produces much better results than testing in development environments and crossing fingers during CI. Hopefully this primer will help others facing similar challenges (as we had a lot of trouble finding good material on the subject).

Use pip to install the selenium package (perhaps in a virtual environment):

pip install selenium

Selenium needs a WebDriver before it can do anything useful. There are currently drivers for Firefox, Chrome, Edge and Safari. We originally started out with Firefox's geckodriver, but in initial attempts to fight inconsistency moved to chromedriver hoping for better results. Ultimately, both seem to have their shortfalls but we have stuck with chromedriver since the original change so that is what I will use in examples here.

The installation process is pretty simple: chromedriver just needs to be executable on the development system so Selenium can interact with it during testing. Check the ChromeDriver Downloads page for the latest version of the driver to download and install. On Linux, this may look like:

curl -L https://chromedriver.storage.googleapis.com/2.32/chromedriver_linux64.zip -o chromedriver.zip
sudo mkdir -p /usr/local/bin/
sudo unzip chromedriver.zip -d /usr/local/bin/
sudo chmod +x /usr/local/bin/chromedriver

The above set of commands

  1. downloads chromedriver,
  2. places it in a common $PATH location, and
  3. sets it to be executable.

With chromedriver ready to go, all that is left is to import the WebDriver package from Selenium and tell it to use chromedriver. E.g.

from selenium import webdriver

driver = webdriver.Chrome()

Running this code should invoke a window in Chrome, but nothing will happen because this example does not give the WebDriver any instruction.

Let's try a simple task: getting the "word of the day" from Merriam-Webster's website. A quick look at the source of M-W's word of the day page reveals where the word can be found in markup:

<article>
  ...
  <div class="quick-def-box">
    <div class="word-header">
      <div class="word-and-pronunciation">
        <h1>confrere</h1>
        ...
      </div>
    </div>
  ...
</article>

So the actual word of the day, "confrere" today, is found in an h1 child of a div element with the class word-and-pronunciation. Searching the page reveals that this class is unique, so it can be used by Selenium to identify the element and get its content like so:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.merriam-webster.com/word-of-the-day')
element = driver.find_element_by_css_selector('.word-and-pronunciation h1')
print(element.text)
driver.close()

Running the above should invoke a Chrome window that loads the word of the day page and then closes. The Python script should output the word before exiting. And there you have it! This example uses a CSS selector with Selenium's find_element_by_css_selector method, but there are many other find_element_by_* methods available for page "navigation".

There are lots of important Selenium classes and methods that will be used extensively for testing web pages. Here is a short list of some key functionality to know about -

The example above uses WebDriver.find_element_by_css_selector and there are eight of these methods in total (plus eight more in plural form):

  1. find_element_by_class_name
  2. find_element_by_css_selector
  3. find_element_by_id
  4. find_element_by_link_text
  5. find_element_by_name
  6. find_element_by_partial_link_text
  7. find_element_by_tag_name
  8. find_element_by_xpath

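All of these methods are pretty descriptive (and long), so a nice helper is the By class from selenium.webdriver.common.by. Used with the generic find_element method, By replaces the longer method names with a simpler shorthand. (Selenium 4 removed the find_element_by_* methods, so this form is also the one that works on current releases.) The previous code example could be replaced with: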

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.merriam-webster.com/word-of-the-day')
element = driver.find_element(By.CSS_SELECTOR, '.word-and-pronunciation h1')
print(element.text)
driver.close()

While this code is not necessarily shorter, I suggest taking it a bit further and creating a wrapper method for finding elements. This should significantly reduce the effort of typing these methods out as test size and complexity increase. Here is an example wrapper I have used in test cases:

def find(self, by, value):
    elements = self.driver.find_elements(by, value)
    # Compare by value; "is 1" tests object identity and is not reliable.
    if len(elements) == 1:
        return elements[0]
    else:
        return elements

This uses the plural find_elements method and returns either a list or a single item depending on what is found. With this, I can use find(By.ID, 'my-id') instead of driver.find_element_by_id('my-id'). This form should produce much cleaner code, particularly when jumping between the various available find methods.
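To see how the return value changes shape, here is a small sketch that exercises the same logic against a stand-in driver, so it runs without a browser. StubDriver and its canned results are hypothetical, the wrapper is written as a plain function that takes the driver, and the locator strings 'id' and 'tag name' are the values that By.ID and By.TAG_NAME hold.

```python
class StubDriver:
    """Hypothetical stand-in for a WebDriver; returns canned results."""

    def __init__(self, results):
        self._results = results

    def find_elements(self, by, value):
        # Mimic find_elements: always a list, empty when nothing matches.
        return list(self._results.get((by, value), []))


def find(driver, by, value):
    """Return the single match, or the full list when there are zero or many."""
    elements = driver.find_elements(by, value)
    if len(elements) == 1:
        return elements[0]
    return elements


driver = StubDriver({
    ('id', 'main'): ['<div#main>'],
    ('tag name', 'li'): ['<li 1>', '<li 2>'],
})

one = find(driver, 'id', 'main')        # a single element
many = find(driver, 'tag name', 'li')   # a list of elements
none = find(driver, 'id', 'missing')    # an empty list, not an exception
```

Note that a missing element comes back as an empty list rather than a NoSuchElementException, so a test that expects exactly one element may want to check the type of the result before using it.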

Most web app projects will deal with some degree of input and Selenium can support that fairly well. Every WebElement instance (the result of the various find_element methods) has a send_keys method that can be used to simulate typing in an element. Let's try to use this functionality to search "Python" on Wikipedia -

A quick look at Wikipedia's page source reveals that the search input element uses the id searchInput. With this, Selenium can find the element and send some keys to it:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.wikipedia.org/')
el = driver.find_element(By.ID, 'searchInput')
el.send_keys('Python')

The above code should result in an open Chrome window with the Wikipedia page loaded and "Python" in the search input field. This window stays open because the code does not include the driver.close() command that is used in previous examples.

There are a couple of different ways to actually submit a form: calling submit() on an element inside the form, locating and "clicking" the form's submit button, or sending the return key to an input. In general I have found no real difference between any of the options, but I tend to fall back on locating and "clicking" the form's submit button when possible.

For the key-based option, Selenium has a set of key codes that can be used to simulate "special" (non-alphanumeric) keys. These codes are found in selenium.webdriver.common.keys. In order to submit the form, the code will need to use the return (or enter) key, so a revised version of the Wikipedia search code looks like this:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get('https://www.wikipedia.org/')
el = driver.find_element(By.ID, 'searchInput')
el.send_keys('Python')
el.send_keys(Keys.RETURN)

Just like the two previous examples, this script should exit leaving a Chrome page open to the Wikipedia search results for "Python".

This is perhaps the cleanest way to get a form submitted because it doesn't require finding other elements, but a thorough tester may want to consider testing multiple submission methods to ensure functionality.

While Selenium does offer a WebElement.clear() method, its implementation is inconsistent across browsers and its behavior can be defined differently depending on the app and element being tested. For these reasons, I don't think it should be used to clear form input fields. Instead, Selenium's Keys class can be used to simulate pressing the backspace key multiple times in a field.

Here is a simple function to handle this -

from selenium.webdriver.common.keys import Keys


def clear(element):
    value = element.get_attribute('value')
    if len(value) > 0:
        for char in value:
            element.send_keys(Keys.BACK_SPACE)

This clear function will take a WebElement, get the length of its value attribute, and simulate hitting the BACK_SPACE until all text is removed from the field.
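As a variation, the sketch below sends all of the backspaces in a single send_keys call and treats a missing value attribute as an empty string. It is only an illustration: FakeInput is a hypothetical stand-in for a text input so the logic can be tried without a browser, and the fallback literal is, as far as I know, the code point behind Keys.BACK_SPACE and is there only so the example runs without Selenium installed.

```python
try:
    from selenium.webdriver.common.keys import Keys
    BACK_SPACE = Keys.BACK_SPACE
except ImportError:
    # Assumption: this is the code point Keys.BACK_SPACE stands for. The
    # fallback only exists so the sketch runs without Selenium installed.
    BACK_SPACE = '\ue003'


class FakeInput:
    """Hypothetical stand-in for a text input WebElement."""

    def __init__(self, value):
        self.value = value

    def get_attribute(self, name):
        return self.value if name == 'value' else None

    def send_keys(self, keys):
        # Apply each key in order: backspace removes the last character.
        for key in keys:
            if key == BACK_SPACE:
                self.value = self.value[:-1]
            else:
                self.value += key


def clear(element):
    """Send one backspace per character currently in the field."""
    value = element.get_attribute('value') or ''
    if value:
        element.send_keys(BACK_SPACE * len(value))


field = FakeInput('selenium')
clear(field)
```

With a real driver, the same clear() function would be passed the WebElement returned by find_element.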

Let's use Selenium to load Google and search for "selenium". Google's search input element does not have a unique ID or class, but it does use a name attribute with the value "q". This can be used to find the element and send the keys. Expanding on the previous example:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys


def clear(element):
    value = element.get_attribute('value')
    if len(value) > 0:
        for char in value:
            element.send_keys(Keys.BACK_SPACE)


driver = webdriver.Chrome()
driver.get('https://www.google.com/')
el = driver.find_element(By.NAME, 'q')
el.send_keys('selenium')
el.send_keys(Keys.RETURN)

This should produce the Google search results page for "selenium".

On the results page, the search field still has a name value of "q" and now is pre-filled with "selenium" for a value. Although the name has not changed, Selenium will need to find the element again because the page has changed. Add the following to the code to locate the element and use the custom clear() function to clear it:

el = driver.find_element(By.NAME, 'q')
clear(el)

And it's gone!

Overall, this BACK_SPACE approach should be much more reliable than the WebElement.clear() method.

"Waiting" in Selenium can be a deceptively complex problem. Up to this point, all examples have relied on Selenium's own ability to wait for a page to finish loading before taking any particular action. For simple tests, this may be a perfectly sufficient course. But as tests and applications become more complex this method may not always do the job.

Selenium provides some useful tools for addressing this issue -

The easiest way to add some wiggle room is the WebDriver.implicitly_wait() method. This method accepts an integer input that defines how many seconds to wait when executing any of the find_element methods.

The default implicit wait is zero (or no wait), so if a particular element is not found immediately Selenium will raise a NoSuchElementException. Let's try to find an element with a name attribute "query" on GitHub (there isn't one):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get('https://www.github.com/')
el = driver.find_element(By.NAME, 'query')

This code should result in a NoSuchElementException pretty quickly after Chrome loads GitHub's homepage.

Now, let's try the code below, which sets an implicit wait time of five seconds for the same impossible task:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.implicitly_wait(5)
driver.get('https://www.github.com/')
el = driver.find_element(By.NAME, 'query')

This code will produce the exact same exception, but it will wait five seconds before doing so.

While these examples paint a very simple picture, the reality is that various conditions of any test environment or application will impact Selenium's ability to determine when a page is loaded or whether or not an element exists.

I recommend all tests set a 10 second implicit wait time. This should help to prevent intermittent exceptions caused by issues with underlying elements like network connection or buggy web servers.

When implicit waits are not enough, expected conditions are extremely valuable. The WebDriverWait class provides the until() and until_not() methods that can be used with expected_conditions to create more complex and nuanced wait conditions.

There are many expected conditions available, but the one that I have frequently come back to in my testing is presence_of_element_located().

presence_of_element_located() takes a locator tuple (a By strategy and a value) and, once a matching element exists in the DOM, returns that element. This can be used with WebDriverWait.until() and a wait time (in seconds) like so:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec

driver = webdriver.Chrome()
WebDriverWait(driver, 5).until(ec.presence_of_element_located((By.ID, 'html-id')))

For a real example, the website webcountdown.net creates a countdown timer that creates a pop-up in the DOM when the timer finishes. Selenium can handle this using the above template:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec

driver = webdriver.Chrome()
driver.get('http://www.webcountdown.net/?c=3')  # Starts a 3 second timer.
if WebDriverWait(driver, 5).until(ec.presence_of_element_located((By.ID, 'popupiframe'))):
    print('Popup located!')

The above code should open the countdown page, run a three second countdown and print "Popup located!" after the three second countdown completes. This works because WebDriver is told to wait for up to five seconds for the popup to appear.

If, for example, this were modified with a two second timeout for the WebDriverWait class, Selenium would raise a selenium.common.exceptions.TimeoutException because the timer does not finish (and therefore does not create the element with ID "popupiframe") before the two seconds are up.
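Conceptually, until() behaves like a polling loop: it calls the condition repeatedly and either returns the first truthy result or gives up when the time budget runs out. The sketch below illustrates that idea in plain Python. It is not Selenium's implementation; poll_until and its parameters are hypothetical, and it raises the built-in TimeoutError rather than Selenium's TimeoutException.

```python
import time


def poll_until(condition, timeout, poll_frequency=0.5):
    """Call condition() until it returns something truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within {}s'.format(timeout))
        time.sleep(poll_frequency)


start = time.monotonic()
# The "popup" appears 0.3 seconds from now; a 1 second budget is enough.
found = poll_until(lambda: time.monotonic() - start >= 0.3, timeout=1,
                   poll_frequency=0.1)

try:
    # A 0.2 second budget for a condition that never holds times out.
    poll_until(lambda: False, timeout=0.2, poll_frequency=0.1)
    timed_out = False
except TimeoutError:
    timed_out = True
```

The two calls mirror the countdown example: a budget longer than the timer succeeds, and a budget shorter than the timer ends in a timeout.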

What is WebDriverWait good for? Briefly - single page apps (SPAs).

Testing may require traversing an app's navigational elements and if the page is not fully reloading, Selenium will need to use WebDriverWait to do things like wait for a new section or table of data to load after an AJAX-style API call.

Other expected conditions will follow pretty much the same syntax and mostly have (very) verbose names. Two of the others that I have found useful in practice are text_to_be_present_in_element() and element_to_be_clickable().
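When none of the built-in conditions fit, until() also accepts any callable that takes the driver, so a custom condition can be written as a small class. The sketch below is one possible shape, tried against stand-in objects so it runs without a browser. text_equals, StubDriver, StubElement and the 'timer' ID are all hypothetical, and the string 'id' is the value that By.ID holds.

```python
class text_equals:
    """Hypothetical condition: the located element's text matches exactly."""

    def __init__(self, locator, expected):
        self.locator = locator
        self.expected = expected

    def __call__(self, driver):
        element = driver.find_element(*self.locator)
        # Return the element when it matches so until() can hand it back.
        return element if element.text == self.expected else False


class StubElement:
    """Stand-in for a WebElement with only a text attribute."""

    def __init__(self, text):
        self.text = text


class StubDriver:
    """Stand-in driver whose only element reads '0:02'."""

    def find_element(self, by, value):
        return StubElement('0:02')


driver = StubDriver()
matched = text_equals(('id', 'timer'), '0:02')(driver)
unmatched = text_equals(('id', 'timer'), '0:05')(driver)
```

With a real driver this would be used as WebDriverWait(driver, 5).until(text_equals((By.ID, 'timer'), '0:02')).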

Lastly, I have also used a workaround method to do simple, explicit time-based waits without any expected conditions. One area where this happened to come in handy for me is testing the result of a JavaScript-based "stop watch" that updates in real time. As part of a test, I initiate the stop watch, wait for two seconds and then verify the displayed time to be correct. To achieve this, I created a method that essentially does an expected conditional wait that times out intentionally:

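from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait


def wait(self, seconds):
    try:
        # The condition is never true, so until() always times out.
        WebDriverWait(self.driver, seconds).until(lambda driver: 1 == 0)
    except TimeoutException:
        pass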

This method can be used, for example, to wait five seconds by calling wait(5). WebDriverWait will raise a TimeoutException after five seconds because the until() argument is a simple lambda that will always return False. By catching and passing on the exception, this method just waits for the specified number of seconds and nothing else. Handy!
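For comparison, the standard library's time.sleep gives the same kind of fixed pause without involving WebDriverWait. This is a minimal sketch of that alternative, not the helper described above, and whether it fits depends on the test.

```python
import time


def wait(seconds):
    """Pause the test for a fixed number of seconds."""
    time.sleep(seconds)


start = time.monotonic()
wait(0.2)
elapsed = time.monotonic() - start
```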

These basics are enough to get things going in Selenium, but over time as test complexities increase and multiple developer environments evolve, consistency will become a considerable pain. In our experience developing Timestrap, there were inconsistencies causing test failures based on development OS (Windows, OS X, Linux flavors, etc.), web drivers (Firefox, Chrome, gecko, etc.), and seemingly the phases of the moon.

After trying many different things to stabilize environments, we eventually found and started using SauceLabs. SauceLabs provides a number of services related to testing and a few free tiers for open source projects, including Cross Browser Testing. Using this service can help bring stability and consistency to Selenium tests regardless of the local development environment.

To get started, SauceLabs requires an existing, publicly accessible open source repository (e.g. on GitHub, GitLab, etc.). Use the OSS Sign Up page with the "Open Sauce" plan to get started. Once signed up and logged in, there are a couple of different ways to take advantage of SauceLabs testing:

If you have an Internet accessible project available, Manual Tests can be used to poke around and get a feel for the various environments supported. This can serve as a wonderfully quick and easy way to do some prodding from a virtual browser in iOS, Android, OS X, Windows, Linux using various versions of Safari, Chrome, Firefox, Internet Explorer and Opera. Once a session is complete, the dashboard will have a log with screenshots and videos available to view or download.

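While manual testing is quick and convenient, automated testing is the important feature necessary to improve the consistency of Selenium tests in Python overall. Running Python's Selenium tests through SauceLabs requires three key things: a SauceLabs access key, a remote WebDriver pointed at SauceLabs, and (for apps that are not publicly reachable) the Sauce Connect Proxy.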

From a logged in SauceLabs account, the access key can be found on the User Settings page. This key and the associated username will need to be available in the local test environment in order to execute the Selenium-driven tests on SauceLabs.

I recommend getting used to using the environment variables SAUCE_USERNAME and SAUCE_ACCESS_KEY as these will be used by the Sauce Connect Proxy Client for local development testing.

On Linux this can be achieved with:

export SAUCE_USERNAME={sauce-username}
export SAUCE_ACCESS_KEY={sauce-access-key}

Selenium provides a webdriver.Remote class for interacting with a remote server that speaks the WebDriver protocol. In the Selenium version used here, the class is initialized with two arguments: command_executor, a URL pointing to the remote command endpoint, and desired_capabilities, a dictionary of settings for the executor. (Newer Selenium releases take an options object in place of desired_capabilities.)

For SauceLabs, the command_executor should be set to http://SAUCE_USERNAME:SAUCE_ACCESS_KEY@ondemand.saucelabs.com/wd/hub where SAUCE_USERNAME and SAUCE_ACCESS_KEY represent the properties outlined in the previous section of this post.

The desired_capabilities dictionary is used to provide the environment settings to SauceLabs. SauceLabs has a wonderful Platform Configurator tool for easily selecting from the available options.

To use the example below, the local environment must provide two variables: SAUCE_USERNAME and SAUCE_ACCESS_KEY. With these variables set, the following code will create a remote WebDriver set up to access SauceLabs using Chrome 48 on a PC running Linux:

import os

from selenium import webdriver

# Get the user name and access key from the environment.
sauce_username = os.environ['SAUCE_USERNAME']
sauce_access_key = os.environ['SAUCE_ACCESS_KEY']

# Build the command executor URL.
url = 'http://{}:{}@ondemand.saucelabs.com/wd/hub'.format(
    sauce_username, sauce_access_key)

# Build the capabilities dictionary (from Platform Configurator).
caps = {'browserName': "chrome"}
caps['platform'] = "Linux"
caps['version'] = "48.0"

driver = webdriver.Remote(command_executor=url, desired_capabilities=caps)
driver.get('https://www.google.com')
driver.quit()

After executing the above sequence, the SauceLabs Dashboard should show a new job with video and screenshots of Chrome on Linux loading Google. Neat!
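If either environment variable is missing, the script above stops with a bare KeyError. A small helper can make that failure easier to read. The sketch below is one way to do it: sauce_executor_url is a hypothetical name, the env parameter exists only so the function can be tried with a plain dictionary, and the credentials shown are placeholders.

```python
import os


def sauce_executor_url(env=None):
    """Build the SauceLabs command executor URL from environment variables."""
    env = os.environ if env is None else env
    names = ('SAUCE_USERNAME', 'SAUCE_ACCESS_KEY')
    # Collect every missing or empty variable so the error names them all.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return 'http://{}:{}@ondemand.saucelabs.com/wd/hub'.format(
        env['SAUCE_USERNAME'], env['SAUCE_ACCESS_KEY'])


# Placeholder credentials, passed in directly instead of via the environment.
url = sauce_executor_url({'SAUCE_USERNAME': 'demo-user',
                          'SAUCE_ACCESS_KEY': 'abc-123'})
```

Called with no arguments, the helper reads os.environ, and its result can be passed as the command_executor argument.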

When using Chrome, a chromeOptions dictionary can also be provided in the desired_capabilities dictionary with some more specific settings. Within that dictionary, a prefs dictionary can also be used to set further preferences. For instance, if testing needs to be done on an app that requires login, it may be helpful to use this chromeOptions dictionary:

caps['chromeOptions'] = {
    'prefs': {
        'credentials_enable_service': False,
        'profile': {
            'password_manager_enabled': False
        }
    }
}

Very simply, this prevents the "Do you want to save your password?" sort of dialog box from appearing in all screenshots of a test session after login.

All of this works great just as described... if the app being tested happens to be available on the public Internet. If that is not the case (and it probably isn't), SauceLabs provides the Sauce Connect Proxy to connect to your local app.

On Linux, for example, the proxy client can be installed like so:

wget https://saucelabs.com/downloads/sc-4.4.9-linux.tar.gz
sudo mkdir -p /usr/local/bin/
tar xzf sc-4.4.9-linux.tar.gz
sudo mv sc-4.4.9-linux/bin/sc /usr/local/bin/
sudo chmod +x /usr/local/bin/sc
sc --version
#Sauce Connect 4.4.9, build 3688 098cbcf -dirty

The sc command will make use of the SAUCE_USERNAME and SAUCE_ACCESS_KEY environment variables. When executed with no parameters, the proxy client will run through some initialization leading to the message, Sauce Connect is up, you may start your tests. From here the client will simply sit and listen for commands and the SauceLabs Tunnels page should show the client as active.

With all of this in place, tests against a local development server can now be proxied up to SauceLabs and run in a considerably more consistent environment!

This method significantly improved the test infrastructure for Timestrap and allowed us to refocus on development instead of testing.

We can bring all this together in one (admittedly somewhat complex) Python test file:

import os
import threading
import time
import unittest
from http.server import BaseHTTPRequestHandler, HTTPServer
from selenium import webdriver
from selenium.webdriver.common.by import By


class TestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        self.wfile.write(b'<html><head><title>Python Selenium!</title></head>')
        self.wfile.write(b'<body><div id="main">Hello!</div></body>')
        self.wfile.write(b'</html>')


class TestRequest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Serve the test page from a background thread.
        server = HTTPServer(('127.0.0.1', 8000), TestHandler)
        cls.server_thread = threading.Thread(target=server.serve_forever, daemon=True)
        cls.server_thread.start()
        time.sleep(1)

        # Connect to SauceLabs with credentials from the environment.
        sauce_username = os.environ['SAUCE_USERNAME']
        sauce_access_key = os.environ['SAUCE_ACCESS_KEY']
        url = 'http://{}:{}@ondemand.saucelabs.com/wd/hub'.format(
            sauce_username, sauce_access_key)
        caps = {'browserName': "chrome"}
        caps['platform'] = "Linux"
        caps['version'] = "48.0"
        cls.driver = webdriver.Remote(command_executor=url, desired_capabilities=caps)

    @classmethod
    def tearDownClass(cls):
        cls.driver.quit()

    def test_request(self):
        self.driver.get('http://127.0.0.1:8000')
        self.assertEqual('Hello!', self.driver.find_element(By.ID, 'main').text)


if __name__ == '__main__':
    unittest.main()

TestHandler.do_GET() is a very simple method for http.server that returns the following HTML:

<html>
  <head>
    <title>Python Selenium!</title>
  </head>
  <body>
    <div id="main">Hello!</div>
  </body>
</html>

TestRequest.setUpClass() does three important things before running the tests:

  1. Establishes the HTTPServer instance.
  2. Starts the HTTP server in a thread (to prevent blocking).
  3. Establishes the WebDriver.Remote instance using SauceLabs as the command executor.

TestRequest.tearDownClass() simply shuts down the web driver.

Lastly, TestRequest.test_request() is the single test in this "suite". It simply loads the test server index page and asserts that the text "Hello!" is present inside div#main (which it should be).
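The test file gives the server a fixed one second head start with time.sleep(1). One alternative, sketched below under my own assumptions, is to poll the port until it accepts connections. wait_for_port is a hypothetical helper, and the example binds to port 0 (any free port) and uses the bare BaseHTTPRequestHandler only to stay self-contained.

```python
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def wait_for_port(host, port, timeout=5.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            # Not accepting connections yet; try again shortly.
            time.sleep(0.1)
    return False


# Port 0 asks the OS for any free port, which avoids clashes on 8000.
server = HTTPServer(('127.0.0.1', 0), BaseHTTPRequestHandler)
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

ready = wait_for_port('127.0.0.1', server.server_port)

# Stop the serve_forever() loop and release the socket.
server.shutdown()
server.server_close()
```

In setUpClass, a call like wait_for_port('127.0.0.1', 8000) could stand in for the fixed sleep, and the shutdown() and server_close() calls would belong in tearDownClass if the server object were stored on the class.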

Let's give it a try! Remember to set the SAUCE_USERNAME and SAUCE_ACCESS_KEY environment variables, first:

export SAUCE_USERNAME={sauce-username}
export SAUCE_ACCESS_KEY={sauce-access-key}
python tests.py
#E
#======================================================================
#ERROR: test_request (__main__.TestRequest)
#----------------------------------------------------------------------
#Traceback (most recent call last):
#[...]
#selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"id","selector":"main"}
# (Session info: chrome=48.0.2564.97)
# (Driver info: chromedriver=2.21.371459 (36d3d07f660ff2bc1bf28a75d1cdabed0983e7c4),platform=Linux 3.13.0-83-generic x86)
#[...]

Oh no! What happened? The important bit in the traceback is this: Message: no such element: Unable to locate element: {"method":"id","selector":"main"}. For some reason, Selenium was not able to find the div#main element. Since this test ran in SauceLabs, the SauceLabs Dashboard has information and a replay of the test session which reveals... oh... SauceLabs was trying to access the local network (127.0.0.1) and we forgot to start the proxy client. Oops!

Let's try that one more time, this time starting up the Sauce Connect Proxy (sc) before running the tests...

export SAUCE_USERNAME={sauce-username}
export SAUCE_ACCESS_KEY={sauce-access-key}
sc &
#[...]
#Sauce Connect is up, you may start your tests.
python tests.py
#[...]
#.
#----------------------------------------------------------------------
#Ran 1 test in 6.026s
#
#OK

Note: Don't forget to kill the sc process with, for example, pkill -x sc.

Hooray! This time the test ran successfully because SauceLabs was able to use the proxy client to access the local test server.

This means that the local development environment can still be used for testing without having to deploy between tests.

There you have it. With a local test server up and running, getting consistent results from Selenium can be incredibly smooth and save many, many testing headaches as the code base and developer contributions expand (hopefully!).
