WebDriver

The biggest change in Selenium recently has been the inclusion of the WebDriver API. It drives a browser natively, as a user would, either locally or on a remote machine using the Selenium server, and marks a leap forward in browser automation.

Selenium WebDriver fits in the same role as RC did, and has incorporated the original 1.x bindings. It refers to both the language bindings and the implementations of the individual browser controlling code. This is commonly referred to as just WebDriver or sometimes as Selenium 2.

Selenium 1.0 + WebDriver = Selenium 2.0

Understanding The Components

Building a test suite using WebDriver will require you to understand and effectively use a number of different components. As with everything in software, different people use different terms for the same idea. Below is a breakdown of how terms are used in this description.

Terminology

The Parts and Pieces

At its minimum, WebDriver talks to a browser through a driver. Communication is two way: WebDriver passes commands to the browser through the driver, and receives information back via the same route.

The driver is specific to the browser, such as ChromeDriver for Google's Chrome/Chromium, GeckoDriver for Mozilla's Firefox, etc. The driver runs on the same system as the browser. This may, or may not be, the same system where the tests themselves are executing.

This simple example above is direct communication. Communication to the browser may also be remote communication through Selenium Server or RemoteWebDriver. RemoteWebDriver runs on the same system as the driver and the browser.

Remote communication can also take place using Selenium Server or Selenium Grid, both of which in turn talk to the driver on the host system.

Where Frameworks Fit In

WebDriver has one job and one job only: communicate with the browser via any of the methods above. WebDriver doesn't know a thing about testing: it doesn't know how to compare things, assert pass or fail, and it certainly doesn't know a thing about reporting or Given/When/Then grammar.

This is where various frameworks come into play. At a minimum you'll need a test framework that matches the language bindings, e.g. NUnit for .NET, JUnit for Java, RSpec for Ruby, etc.

The test framework is responsible for running and executing your WebDriver and related steps in your tests. As such, you can think of it looking akin to the following image.

Natural language frameworks/tools such as Cucumber may exist as part of that Test Framework box in the figure above, or they may wrap the Test Framework entirely in their own implementation.

Driver requirements

Through WebDriver, Selenium supports all major browsers on the market such as Chrom(ium), Firefox, Internet Explorer, Opera, and Safari. Where possible, WebDriver drives the browser using the browser's built-in support for automation, although not all browsers have official support for remote control.

WebDriver's aim is to emulate a real user's interaction with the browser as closely as possible. This is possible at varying levels in different browsers. For more details on the different driver idiosyncrasies, please see Driver Idiosyncrasies.

Even though all the drivers share a single user-facing interface for controlling the browser, they have slightly different ways of setting up browser sessions. Since many of the driver implementations are provided by third parties, they are not included in the standard Selenium distribution.

Driver instantiation, profile management, and various browser specific settings are examples of parameters that have different requirements depending on the browser. This section explains the basic requirements for getting you started with the different browsers.

Adding Executables to your PATH

Most drivers require an extra executable for Selenium to communicate with the browser. You can manually specify where the executable lives before starting WebDriver, but this can make your tests less portable, as the executables will need to be in the same place on every machine, or included within your test code repository.

By adding a folder containing WebDriver's binaries to your system's path, Selenium will be able to locate the additional binaries without requiring your test code to locate the exact location of the driver.

Quick reference

Browser            Supported OS                Maintained by     Download   Issue Tracker
Chromium/Chrome    Windows, macOS, Linux       Google            Downloads  Issues
Firefox            Windows, macOS, Linux       Mozilla           Downloads  Issues
Edge               Windows 10                  Microsoft         Downloads  Issues
Internet Explorer  Windows                     Selenium Project  Downloads  Issues
Safari             macOS El Capitan and newer  Apple             Built in   Issues
Opera              Windows, macOS, Linux       Opera             Downloads  Issues

Chromium/Chrome

To drive Chrome or Chromium, you have to download chromedriver and put it in a folder that is on your system's path.

On Linux or macOS, this means modifying the PATH environmental variable. You can see what directories, separated by a colon, make up your system's path by executing the following command:

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

To include chromedriver on the path if it isn't already, make sure you include the chromedriver binary's parent directory. The following line will set the PATH environmental variable to its current content, plus an additional path added after the colon:

$ export PATH="$PATH:/path/to/chromedriver"

When chromedriver is available on your path, you should be able to execute the chromedriver executable from any directory.
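The same check can be made from code before a test run starts. The following is a minimal sketch using Python's standard-library shutil.which, which searches the directories listed in PATH; the helper name locate_driver is purely illustrative and is not part of Selenium.

```python
import shutil


def locate_driver(name="chromedriver", search_path=None):
    """Return the full path to the executable `name`, or None if not found.

    `search_path` is an optional string of directories joined by os.pathsep;
    when it is None, the PATH environment variable is searched.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    found = locate_driver()
    if found is None:
        print("chromedriver is not on PATH")
    else:
        print("chromedriver found at", found)
```

If the function returns None, the driver's parent directory still needs to be added to PATH as described above.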

To instantiate a Chrome/Chromium session, you can do the following:

 import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

WebDriver driver = new ChromeDriver();
 #Simple assignment
from selenium.webdriver import Chrome

driver = Chrome()
 #Or use the context manager
from selenium.webdriver import Chrome

with Chrome() as driver:
    #your code inside this indent
 using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

IWebDriver driver = new ChromeDriver();
 require "selenium-webdriver"

driver = Selenium::WebDriver.for :chrome
 const {Builder} = require('selenium-webdriver');

(async function myFunction() {
    let driver = await new Builder().forBrowser('chrome').build();
    //your code inside this block
})();

Remember that you have to set the path to the chromedriver executable. This is possible using the following line:

 System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");
 Chrome(executable_path='/path/to/chromedriver')
 Selenium::WebDriver::Chrome.driver_path = "/path/to/chromedriver"

The chromedriver is implemented as a WebDriver remote server that by exposing Chrome's internal automation proxy interface instructs the browser what to do.

Firefox

Starting with Selenium 3, Mozilla has taken over implementation of the Firefox driver with geckodriver, which works with Firefox 48 and newer. Since the Firefox WebDriver implementation is still under development, the newer the Firefox version, the better the support.

As geckodriver is the new default way of launching Firefox, you can instantiate Firefox in the same way as Selenium 2:

 import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

WebDriver driver = new FirefoxDriver();
 #Simple assignment
from selenium.webdriver import Firefox

driver = Firefox()
 #Or use the context manager
from selenium.webdriver import Firefox

with Firefox() as driver:
   #your code inside this indent
 using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

IWebDriver driver = new FirefoxDriver();
 require "selenium-webdriver"

driver = Selenium::WebDriver.for :firefox
 const {Builder} = require('selenium-webdriver');

(async function myFunction() {
   let driver = await new Builder().forBrowser('firefox').build();
   //your code inside this block
})();

If you prefer not to set geckodriver's location using PATH, set the geckodriver binary location programmatically:

  System.setProperty("webdriver.gecko.driver", "/path/to/geckodriver");
  Firefox(executable_path='/path/to/geckodriver')
  Selenium::WebDriver::Firefox.driver_path = "/path/to/geckodriver"

It is also possible to set the property at run time:

mvn test -Dwebdriver.gecko.driver=/path/to/geckodriver

It is currently possible to revert to the older, more feature complete Firefox driver, by installing Firefox 47.0.1 or 45 ESR and specifying a desired capability of marionette as false. Later releases of Firefox are no longer compatible.

Edge

Edge is Microsoft's newest browser, included with Windows 10 and Server 2016. Updates to Edge are bundled with major Windows updates, so you'll need to download a binary which matches the build number of your currently installed build of Windows. The Edge Developer site contains links to all the available binaries. Bugs against the EdgeDriver implementation can be raised with Microsoft. If you'd like to run tests against Edge, but aren't running Windows 10, Microsoft offer free VMs for testers on the Edge Developer site.

 import org.openqa.selenium.WebDriver;
import org.openqa.selenium.edge.EdgeDriver;

WebDriver driver = new EdgeDriver();
#Simple assignment
from selenium.webdriver import Edge

driver = Edge()
#Or use the context manager
from selenium.webdriver import Edge

with Edge() as driver:
   #your code inside this indent
using OpenQA.Selenium;
using OpenQA.Selenium.Edge;

IWebDriver driver = new EdgeDriver();
require "selenium-webdriver"

driver = Selenium::WebDriver.for :edge
const {Builder} = require('selenium-webdriver');

(async function myFunction() {
   let driver = await new Builder().forBrowser('MicrosoftEdge').build();
   //your code inside this block
})();

If Edge driver is not present in your path, you can set the path using the following line:

  System.setProperty("webdriver.edge.driver", "C:/path/to/MicrosoftWebDriver.exe");
  Edge(executable_path='/path/to/MicrosoftWebDriver.exe')
  Selenium::WebDriver::Edge.driver_path = "C:/path/to/MicrosoftWebDriver.exe"

Internet Explorer

Internet Explorer was Microsoft's default browser until Windows 10, although it is still included in Windows 10. Internet Explorer Driver is the only driver maintained by the Selenium project itself. The project aims to support the same releases Microsoft considers current. Older releases may work, but will be unsupported.

While the Selenium project provides binaries for both the 32-bit and 64-bit versions of Internet Explorer, the 64-bit driver has some limitations with Internet Explorer 10 & 11; the 32-bit driver continues to work well. Note that because Internet Explorer preferences are saved against the logged-in user's account, some additional setup is required.

 import org.openqa.selenium.WebDriver;
import org.openqa.selenium.ie.InternetExplorerDriver;

WebDriver driver = new InternetExplorerDriver();
 #Simple assignment
from selenium.webdriver import Ie

driver = Ie()
 #Or use the context manager
from selenium.webdriver import Ie

with Ie() as driver:
   #your code inside this indent
 using OpenQA.Selenium;
using OpenQA.Selenium.IE;

IWebDriver driver = new InternetExplorerDriver();
 require "selenium-webdriver"

driver = Selenium::WebDriver.for :internet_explorer
 const {Builder} = require('selenium-webdriver');

(async function myFunction() {
   let driver = await new Builder().forBrowser('internet explorer').build();
   //your code inside this block
})();

If Internet Explorer driver is not present in your path, you can set the path using the following line:

  System.setProperty("webdriver.ie.driver", "C:/path/to/IEDriver.exe");
  Ie(executable_path='/path/to/IEDriverServer.exe')
  Selenium::WebDriver::IE.driver_path = "C:/path/to/IEDriver.exe"

Microsoft also offer a WebDriver binary for Internet Explorer 11 on Windows 7 & 8.1. It has not been updated since 2014 and is based on a draft version of the W3C specification. Jim Evans has an excellent writeup on Microsoft's implementation.

Opera

Current releases of Opera are built on top of the Chromium engine, and WebDriver is now supported via the closed-source Opera Chromium Driver, which can be added to your PATH or as a system property.

Instantiating a driver session is similar to Firefox and Chromium:

 import org.openqa.selenium.WebDriver;
import org.openqa.selenium.opera.OperaDriver;

WebDriver driver = new OperaDriver();
#Simple assignment
from selenium.webdriver import Opera

driver = Opera()
#Or use the context manager
from selenium.webdriver import Opera

with Opera() as driver:
   #your code inside this indent
using OpenQA.Selenium;
using OpenQA.Selenium.Opera;

IWebDriver driver = new OperaDriver();
require "selenium-webdriver"

driver = Selenium::WebDriver.for :opera

Safari

Starting with Safari 10 on macOS El Capitan and Sierra, WebDriver support is included with each release of the browser. To enable support:

  1. Enable the Developer menu from Safari preferences
  2. Check the Allow Remote Automation option from within the Develop menu
  3. Run
    /usr/bin/safaridriver -p 1337
    from the terminal for the first time and type your password at the prompt to authorise WebDriver

You can then start a driver session using:

 import org.openqa.selenium.WebDriver;
import org.openqa.selenium.safari.SafariDriver;

WebDriver driver = new SafariDriver();
#Simple assignment
from selenium.webdriver import Safari

driver = Safari()
 #Or use the context manager
from selenium.webdriver import Safari

with Safari() as driver:
   #your code inside this indent
 using OpenQA.Selenium;
using OpenQA.Selenium.Safari;

IWebDriver driver = new SafariDriver();
 require "selenium-webdriver"

driver = Selenium::WebDriver.for :safari
 const {Builder} = require('selenium-webdriver');

(async function myFunction() {
   let driver = await new Builder().forBrowser('safari').build();
   //your code inside this block
})();

Those looking to automate Safari on iOS should look to the Appium project. Whilst Safari was previously available for Windows, Apple has long since dropped support, making it a poor choice of test platform.

Mock browsers

HtmlUnit

HtmlUnit is a "GUI-Less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, etc. It has JavaScript support and is able to work with AJAX libraries, simulating Chrome, Firefox or Internet Explorer depending on the configuration used. It has been moved to a new location.

The source is maintained on svn.

PhantomJS

PhantomJS is a headless browser based on Webkit, albeit a version much older than that used by Google Chrome or Safari. Whilst historically a popular choice, it would now be wise to avoid PhantomJS. The project has been unmaintained since the 5th of August, so whilst the web will continue to change, PhantomJS will not be updated. This was after Google announced the ability to run Chrome headlessly, something also now offered by Mozilla's Firefox.

Browser launching and manipulation

Ruby

Ruby is not installed by default on Windows. Download the latest version and run the installer. You can leave all settings at their default values, except on the Installation Destination and Optional Tasks screen, where you should check the Add Ruby executables to your PATH checkbox. To drive any browser, you have to install the selenium-webdriver Ruby gem. To install it, open a command prompt and type this:

$ gem install selenium-webdriver

Or, if you use Bundler, add this line to your application's Gemfile:

gem "selenium-webdriver"

And then execute the following command in prompt:

$ bundle install

Internet Explorer

Internet Explorer is installed by default on Windows, so no installation is needed. To drive Internet Explorer on Windows, you have to download the latest Internet Explorer Driver and put the file into a folder that is in PATH. To find out which directories are in PATH, type echo %PATH% in command prompt.

$ echo %PATH%
C:\Ruby200\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem

C:\Ruby200\bin looks like a good place. Unzip the IEDriverServer file and move IEDriverServer.exe there. This should open a new Internet Explorer window:

require "selenium-webdriver"
driver = Selenium::WebDriver.for :internet_explorer

Browser Navigation

Navigate To

The first thing you will want to do after launching a browser is to open your website. This can be achieved in a single line:

 
//Convenient
driver.get("https://www.seleniumhq.org");

//Longer way
driver.navigate().to("https://seleniumhq.github.io/docs/");
 
 driver.get("https://www.seleniumhq.org")
 
# Convenient
driver.get 'https://www.seleniumhq.org'

# Longer way
driver.navigate.to 'https://seleniumhq.github.io/docs/'
 
 driver.Navigate().GoToUrl(@"http://google.com");
 await driver.get("https://seleniumhq.github.io/docs/");

Get Current URL

You can read the current URL from the browser's address bar using:

 driver.getCurrentUrl();
 driver.current_url
 driver.current_url
 driver.Url;
 await driver.getCurrentUrl();

Back

Pressing the browser's back button:

 driver.navigate().back();
 driver.back()
 driver.navigate.back
 driver.Navigate().Back();
 await driver.navigate().back();

Forward

Pressing the browser's forward button:

  driver.navigate().forward();
  driver.forward()
  driver.navigate.forward
  driver.Navigate().Forward();
  await driver.navigate().forward();
 

Refresh

Refresh the current page:

  driver.navigate().refresh();
  driver.refresh()
  driver.navigate.refresh
  driver.Navigate().Refresh();
  await driver.navigate().refresh();
 

Get Title

You can read the current page title from the browser:

  driver.getTitle();
  driver.title
  driver.title
  driver.Title;
  await driver.getTitle();
 

Windows and tabs

WebDriver doesn't make the distinction between windows and tabs. If your site opens a new tab or window, Selenium will let you work with it using a window handle. Each window has a unique identifier which remains persistent in a single session. You can get the window handle of the current window by using:

 driver.getWindowHandle();
 driver.current_window_handle
 driver.CurrentWindowHandle
 driver.window_handle
 await driver.getWindowHandle();

Switching windows or tabs

Clicking a link which opens in a new window will visibly focus the new window or tab on screen, but WebDriver will not know which window the Operating System considers active. To work with the new window you will need to switch to it. If you have only two tabs or windows open, and you know which window you start with, by the process of elimination you can loop over both windows or tabs that WebDriver can see, and switch to the one which is not the original.

 //Store the ID of the original window
String originalWindow = driver.getWindowHandle();

//Check we don't have other windows open already
assert driver.getWindowHandles().size() == 1;

//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click();

//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2));

//Loop through until we find a new window handle
for (String windowHandle : driver.getWindowHandles()) {
    if(!originalWindow.contentEquals(windowHandle)) {
        driver.switchTo().window(windowHandle);
        break;
    }
}

//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"));
 from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Start the driver
with webdriver.Firefox() as driver:
    # Open URL
    driver.get("https://seleniumhq.github.io/docs/wd.html")

    # Setup wait for later
    wait = WebDriverWait(driver, 10)

    # Store the ID of the original window
    original_window = driver.current_window_handle

    # Check we don't have other windows open already
    assert len(driver.window_handles) == 1

    # Click the link which opens in a new window
    driver.find_element_by_link_text("new window").click()

    # Wait for the new window or tab
    wait.until(EC.number_of_windows_to_be(2))

    # Loop through until we find a new window handle
    for window_handle in driver.window_handles:
        if window_handle != original_window:
            driver.switch_to.window(window_handle)
            break

    # Wait for the new tab to finish loading content
    wait.until(EC.title_is("Selenium documentation"))
 //Store the ID of the original window
String originalWindow = driver.CurrentWindowHandle;

//Check we don't have other windows open already
Assert.AreEqual(1, driver.WindowHandles.Count);

//Click the link which opens in a new window
driver.FindElement(By.LinkText("new window")).Click();

//Wait for the new window or tab
wait.Until(wd => wd.WindowHandles.Count == 2);

//Loop through until we find a new window handle
foreach(String window in driver.WindowHandles)
{
    if(originalWindow != window)
    {
        driver.SwitchTo().Window(window);
        break;
    }
}
//Wait for the new tab to finish loading content
wait.Until(wd => wd.Title == "Selenium documentation");
 #Store the ID of the original window
original_window = driver.window_handle

#Check we don't have other windows open already
assert(driver.window_handles.length == 1, 'Expected one window')

#Click the link which opens in a new window
driver.find_element(link: 'new window').click

#Wait for the new window or tab
wait.until { driver.window_handles.length == 2 }

#Loop through until we find a new window handle
driver.window_handles.each do |handle|
    if handle != original_window
        driver.switch_to.window handle
        break
    end
end

#Wait for the new tab to finish loading content
wait.until { driver.title == 'Selenium documentation'}
 
 //Store the ID of the original window
const originalWindow = await driver.getWindowHandle();

//Check we don't have other windows open already
assert((await driver.getAllWindowHandles()).length === 1);

//Click the link which opens in a new window
await driver.findElement(By.linkText('new window')).click();

//Wait for the new window or tab
await driver.wait(function() {
    return driver.getAllWindowHandles().then(function(windows) {
        return windows.length === 2;
    });
}, 10000);

//Loop through until we find a new window handle
const windows = (await driver.getAllWindowHandles());
for (let i = 0; i < windows.length; i++) {
    if(windows[i]!==originalWindow) {
        await driver.switchTo().window(windows[i]);
    }
}

//Wait for the new tab to finish loading content
await driver.wait(until.titleIs('Selenium documentation'), 10000);
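
The process of elimination used in the samples above does not depend on the browser, so it can be sketched as a plain function. This is an illustrative sketch only: find_new_window is a hypothetical helper, and the handle values are made-up stand-ins for the opaque strings a real driver returns.

```python
def find_new_window(all_handles, original_handle):
    """Return the first handle that is not the original one, or None."""
    for handle in all_handles:
        if handle != original_handle:
            return handle
    return None


# Stand-in values; real handles are opaque strings returned by the driver.
handles = ["window-1", "window-2"]
new_window = find_new_window(handles, "window-1")
print(new_window)  # prints: window-2
```

With a real Python session you would pass driver.window_handles and the stored original handle, then hand the result to driver.switch_to.window.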

Closing a window or tab

When you are finished with a window or tab and it is not the last window or tab open in your browser, you should close it and switch back to the window you were using previously. Assuming you followed the code sample in the previous section you will have the previous window handle stored in a variable. Put this together and you will get:

 //Close the tab or window
driver.close();

//Switch back to the old tab or window
driver.switchTo().window(originalWindow);
 #Close the tab or window
driver.close()

#Switch back to the old tab or window
driver.switch_to.window(original_window)
 //Close the tab or window
driver.Close();

//Switch back to the old tab or window
driver.SwitchTo().Window(originalWindow);
 #Close the tab or window
driver.close

#Switch back to the old tab or window
driver.switch_to.window original_window
 //Close the tab or window
await driver.close();

//Switch back to the old tab or window
await driver.switchTo().window(originalWindow);
 

Forgetting to switch back to another window handle after closing a window will leave WebDriver executing on the now closed page, and will trigger a No Such Window Exception. You must switch back to a valid window handle in order to continue execution.

Quitting the browser at the end of a session

When you are finished with the browser session you should call quit, instead of close:

 driver.quit();
 driver.quit()
 driver.Quit();
 driver.quit
 await driver.quit();

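Quit will close all the windows and tabs associated with that WebDriver session, and shut down the browser process and the background driver process.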

Failure to call quit will leave extra background processes and ports running on your machine which could cause you problems later.

Some test frameworks offer methods and annotations which you can hook into to tear down at the end of a test.

 /**
 * Example using JUnit
 * https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
 */
@AfterAll
public static void tearDown() {
    driver.quit();
}
 /*
    Example using Visual Studio's UnitTesting
    https://msdn.microsoft.com/en-us/library/microsoft.visualstudio.testtools.unittesting.aspx
*/
[TestCleanup]
public void TearDown()
{
    driver.Quit();
}
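
For comparison, the same teardown hook can be sketched in Python with the standard-library unittest module. To keep the sketch runnable without a browser, StubDriver is a hypothetical stand-in that only records whether quit was called; a real suite would create an actual WebDriver in setUp instead.

```python
import unittest


class StubDriver:
    """Hypothetical stand-in for a WebDriver object; it only records quit()."""

    def __init__(self):
        self.title = "Example page"
        self.quit_called = False

    def quit(self):
        self.quit_called = True


class ExampleTest(unittest.TestCase):
    def setUp(self):
        # A real suite would start a browser here, e.g. webdriver.Firefox()
        self.driver = StubDriver()

    def tearDown(self):
        # unittest runs tearDown after each test method, whether the test
        # passed or failed (provided setUp itself succeeded)
        self.driver.quit()

    def test_title(self):
        self.assertEqual(self.driver.title, "Example page")
```

Saved to a file, this can be run with python -m unittest.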

If not running WebDriver in a test context, you may consider using try / finally which is offered by most languages so that an exception will still clean up the WebDriver session.

 try {
    //WebDriver code here...
} finally {
    driver.quit();
}
 try:
    #WebDriver code here...
finally:
    driver.quit()
 try {
    //WebDriver code here...
} finally {
    driver.Quit();
}
 begin
    #WebDriver code here...
ensure
    driver.quit
end
 try {
    //WebDriver code here...
} finally {
    await driver.quit();
}

Python's WebDriver now supports the Python context manager protocol, which, when used with the with keyword, automatically quits the driver at the end of execution.

 with webdriver.Firefox() as driver:
    #WebDriver code here...

#WebDriver will automatically quit after indentation
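
To illustrate why the with form is safe even when a test fails, here is a minimal sketch of the context manager protocol. FakeDriver is a hypothetical stand-in, not Selenium's implementation; it only shows that the exit hook runs, and quit is called, when the block raises.

```python
class FakeDriver:
    """Hypothetical stand-in for a driver; not Selenium's implementation."""

    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always quit; returning False lets any exception propagate.
        self.quit()
        return False


driver = FakeDriver()
try:
    with driver:
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass

print(driver.quit_called)  # prints: True
```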

Frames and Iframes

Frames are a now deprecated means of building a site layout from multiple documents on the same domain. You are unlikely to work with them unless you are working with a pre-HTML5 webapp. Iframes allow the insertion of a document from an entirely different domain, and are still commonly used.

If you need to work with frames or iframes, WebDriver allows you to work with them in the same way. Consider a button within an iframe. If we inspect the element using the browser development tools, we might see the following:

<div id="modal">
  <iframe id="buttonframe" name="myframe"  src="https://seleniumhq.github.io/docs/iframe.html">
   <button>Click here</button>
 </iframe>
</div>

If it wasn't for the iframe we would expect to click on the button using something like:

//This won't work
driver.findElement(By.tagName("button")).click();

However, if there are no buttons outside of the iframe, you might instead get a no such element error. This happens because Selenium is only aware of the elements in the top level document. To interact with the button, we will need to first switch to the frame, in a similar way to how we switch windows. WebDriver offers three ways of switching to a frame.

Using a webelement

Switching using a webelement is the most flexible option. You can find the frame using your preferred selector and switch to it.

 //Store the web element
WebElement iframe = driver.findElement(By.cssSelector("#modal>iframe"));

//Switch to the frame
driver.switchTo().frame(iframe);

//Now we can click the button
driver.findElement(By.tagName("button")).click();
 

Using a name or ID

If your frame or iframe has an id or name attribute, this can be used instead. If the name or ID is not unique on the page, then the first one found will be switched to.

 //Using the ID
driver.switchTo().frame("buttonframe");

//Or using the name instead
driver.switchTo().frame("myframe");

//Now we can click the button
driver.findElement(By.tagName("button")).click();

Using an index

It is also possible to use the index of the frame, such as can be queried using window.frames in JavaScript.

 //Switches to the second frame
driver.switchTo().frame(1);

Leaving a frame

To leave an iframe or frameset, switch back to the default content like so:

 //Return to the top level
driver.switchTo().defaultContent();

Window Management

Screen resolution can impact how your web application renders, so WebDriver provides mechanisms for moving and resizing the browser window.

Get Window Size

Fetches the size of the browser window in pixels.

 //Access each dimension individually
int width = driver.manage().window().getSize().getWidth();
int height = driver.manage().window().getSize().getHeight();

//Or store the dimensions and query them later
Dimension size = driver.manage().window().getSize();
int width1 = size.getWidth();
int height1 = size.getHeight();
 #Access each dimension individually
width = driver.get_window_size().get("width")
height = driver.get_window_size().get("height")

#Or store the dimensions and query them later
size = driver.get_window_size()
width1 = size.get("width")
height1 = size.get("height")

Set Window Size

Restores the window and sets the window size.

 driver.manage().window().setSize(new Dimension(1024, 768));
 driver.set_window_size(1024,768)

Get Window Position

Fetches the coordinates of the top-left corner of the browser window.

 //Access each dimension individually
int x = driver.manage().window().getPosition().getX();
int y = driver.manage().window().getPosition().getY();

//Or store the dimensions and query them later
Point position = driver.manage().window().getPosition();
int x1 = position.getX();
int y1 = position.getY();
 #Access each dimension individually
x = driver.get_window_position().get('x')
y = driver.get_window_position().get('y')

#Or store the dimensions and query them later
position = driver.get_window_position()
x1 = position.get('x')
y1 = position.get('y')

Set Window Position

Moves the window to the chosen position.

 //Move the window to the top left of the primary monitor
driver.manage().window().setPosition(new Point(0,0));
 #Move the window to the top left of the primary monitor
driver.set_window_position(0, 0)

Maximise Window

Enlarges the window. For most operating systems, the window will fill the screen, without blocking the operating system's own menus and toolbars.

 driver.manage().window().maximize();
 driver.maximize_window()

Fullscreen Window

Fills the entire screen, similar to pressing F11 in most browsers.

 driver.manage().window().fullscreen();
 driver.fullscreen_window()

Waits

WebDriver can generally be said to have a blocking API. Because it is an out-of-process library that instructs the browser what to do, and because the web platform has an intrinsically asynchronous nature, WebDriver doesn't track the active, real-time state of the DOM. This comes with some challenges that we will discuss here.

From experience, most intermittent failures that arise from the use of Selenium and WebDriver are connected to race conditions between the browser and the user's instructions. An example could be that the user instructs the browser to navigate to a page, then gets a no such element error when trying to find an element.

Consider the following document:

<!doctype html>
<meta charset=utf-8>
<title>Race Condition Example</title>

<script>
  var initialised = false;
  window.addEventListener("load", function() {
    var newElement = document.createElement("p");
    newElement.textContent = "Hello from JavaScript!";
    document.body.appendChild(newElement);
    initialised = true;
  });
</script>

The WebDriver instructions might look innocent enough:

driver.get("file:///race_condition.html")
el = driver.find_element_by_tag_name("p")
assert el.text == "Hello from JavaScript!"

driver.get("file:///race_condition.html");
WebElement element = driver.findElement(By.tagName("p"));
assertEquals(element.getText(), "Hello from JavaScript!");

The issue here is that the default page load strategy used in WebDriver listens for the document.readyState to change to "complete" before returning from the call to navigate. Because the p element is added after the document has completed loading, this WebDriver script might be intermittent. It “might” be intermittent because no guarantees can be made about elements or events that trigger asynchronously without explicitly waiting—or blocking—on those events.

Fortunately, the normal instruction set available on the WebElement interface (such as WebElement.click and WebElement.sendKeys) is guaranteed to be synchronous, in that the function calls won't return (or the callback won't trigger in callback-style languages) until the command has been completed in the browser. The advanced user interaction APIs, Keyboard and Mouse, are exceptions as they are explicitly intended as “do what I say” asynchronous commands.

Waiting means pausing the automated task for a certain amount of time before it continues with the next step.

To overcome the problem of race conditions between the browser and your WebDriver script, most Selenium clients ship with a wait package. When employing a wait, you are using what is commonly referred to as an explicit wait.

Explicit Wait

Explicit waits are available to Selenium clients for imperative, procedural languages. They allow your code to halt program execution, or freeze the thread, until the condition you pass in resolves. The condition is called with a certain frequency until the timeout of the wait has elapsed. This means that for as long as the condition returns a falsy value, it will keep trying and waiting.

Since explicit waits allow you to wait for a condition to occur, they make a good fit for synchronising the state between the browser and its DOM, and your WebDriver script.

To remedy our buggy instruction set from earlier, we could employ a wait to have the findElement call wait until the dynamically added element from the script has been added to the DOM:

	
WebDriver driver = new ChromeDriver();
driver.get("https://google.com/ncr");
driver.findElement(By.name("q")).sendKeys("cheese" + Keys.ENTER);
// Wait up to 10 seconds for the first result link to become clickable
WebElement firstResult = new WebDriverWait(driver, 10)
        .until(ExpectedConditions.elementToBeClickable(By.xpath("//a/h3")));
// Print the first result
System.out.println(firstResult.getText());      
	
	
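from selenium.webdriver.support.ui import WebDriverWait

def document_initialised(driver):
    # Read the page's "initialised" flag, which the load handler sets to true
    return driver.execute_script("return initialised")

driver.get("file:///race_condition.html")
# The Python WebDriverWait requires a timeout; 10 seconds is an illustrative value
WebDriverWait(driver, timeout=10).until(document_initialised)
el = driver.find_element_by_tag_name("p")
assert el.text == "Hello from JavaScript!"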
  

We pass in the condition as a function reference that the wait will run repeatedly until its return value is truthy. A “truthful” return value is anything that evaluates to boolean true in the language at hand, such as a string, number, a boolean, an object (including a WebElement), or a populated (non-empty) sequence or list. That means an empty list evaluates to false. When the condition is truthful and the blocking wait is aborted, the return value from the condition becomes the return value of the wait.
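Which values count as truthy depends on the language. As a small illustration for Python specifically (where an empty string and the number 0 are falsy, unlike in some other languages), the following sketch lists a few possible condition return values and how a truthiness-based wait would treat them. The labels and the `candidates` list are made up for this example.

```python
# Possible return values from a condition, paired with a descriptive label.
candidates = [
    ("non-empty string", "Hello"),
    ("empty string", ""),
    ("non-zero number", 1),
    ("zero", 0),
    ("True", True),
    ("None", None),
    ("populated list", ["element"]),
    ("empty list", []),
    ("plain object", object()),
]

for label, value in candidates:
    # A truthy value ends the wait and is handed back to the caller;
    # a falsy value means the condition is tried again.
    outcome = "wait returns it" if value else "wait keeps polling"
    print(f"{label:18} -> {outcome}")
```

A practical consequence is that a condition returning a list of elements only completes the wait once the list is non-empty.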

With this knowledge, and because the wait utility ignores no such element errors by default, we can refactor our instructions to be more concise:

from selenium.webdriver.support.ui import WebDriverWait

driver.get("file:///race_condition.html")
# The timeout is required in Python; 10 seconds is an illustrative value
el = WebDriverWait(driver, timeout=10).until(lambda d: d.find_element_by_tag_name("p"))
assert el.text == "Hello from JavaScript!"

In that example, we pass in an anonymous function (but we could also define it explicitly as we did earlier so it may be reused). The first and only argument that is passed to our condition is always a reference to our driver object, WebDriver (called d in the example). In a multi-threaded environment, you should be careful to operate on the driver reference passed in to the condition rather than the reference to the driver in the outer scope.

Because the wait will swallow no such element errors that are raised when the element isn't found, the condition will retry until the element is found. Then it will take the return value, a WebElement, and pass it back through to our script.

If the condition fails, i.e. a truthful return value from the condition is never reached, the wait will throw/raise an error/exception called a timeout error.
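The behaviour described in this section (poll the condition, swallow no such element errors, return the first truthy value, raise a timeout error otherwise) can be modelled without a browser. The following is a minimal sketch in plain Python, not the implementation used by any Selenium client. The names `wait_until`, `NoSuchElementError`, `WaitTimeoutError` and `FakeDriver` are hypothetical stand-ins invented for this example.

```python
import time


class NoSuchElementError(Exception):
    """Stand-in for a client's "no such element" error (hypothetical name)."""


class WaitTimeoutError(Exception):
    """Stand-in for a client's timeout error (hypothetical name)."""


def wait_until(driver, condition, timeout=10.0, poll_interval=0.5,
               ignored=(NoSuchElementError,)):
    """Call condition(driver) until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition(driver)
            if value:
                # The truthy value becomes the return value of the wait.
                return value
        except ignored:
            # Swallow the ignored errors and try again on the next poll.
            pass
        if time.monotonic() >= deadline:
            raise WaitTimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakeDriver:
    """Pretend driver whose element only appears on the third lookup."""

    def __init__(self):
        self.lookups = 0

    def find_element(self):
        self.lookups += 1
        if self.lookups < 3:
            raise NoSuchElementError("p")
        return "<p>Hello from JavaScript!</p>"


driver = FakeDriver()
# The condition receives the driver as its only argument, called d here.
element = wait_until(driver, lambda d: d.find_element(),
                     timeout=2.0, poll_interval=0.01)
print(element, driver.lookups)
```

The first two lookups raise and are swallowed, and the third returns the element, which `wait_until` passes back to the caller. A condition that never becomes truthy ends with `WaitTimeoutError` once the deadline has passed.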

Options

The wait condition can be customised to match your needs. Sometimes it's unnecessary to wait the full extent of the default timeout, as the penalty for not hitting a successful condition can be expensive.

The wait lets you pass in an argument to override the timeout:

  
//new WebDriverWait(driver,3).until(some_condition(WebElement))
new WebDriverWait(driver, 3).until(ExpectedConditions.elementToBeClickable(By.xpath("//a/h3")));
  
  
WebDriverWait(driver, timeout=3).until(some_condition)
  

Expected conditions

Because it's quite a common occurrence to have to synchronise the DOM and your instructions, most clients also come with a set of predefined expected conditions. As might be obvious by the name, they are conditions that are predefined for frequent wait operations.

The conditions available in the different language bindings vary, but this is a non-exhaustive list of a few:

alert is present
element exists
element is visible
title contains
title is
element staleness
visible text

You can refer to the API documentation for each client binding to find an exhaustive list of expected conditions.

Implicit Wait

There is a second type of wait that is distinct from explicit wait called implicit wait. By implicitly waiting, WebDriver polls the DOM for a certain duration when trying to find any element. This can be useful when certain elements on the webpage are not available immediately and need some time to load.

Implicit waiting for elements to appear is disabled by default and will need to be manually enabled on a per-session basis. Mixing explicit waits and implicit waits will cause unintended consequences, namely waits sleeping for the maximum time even if the element is available or the condition is true.

Warning: Do not mix implicit and explicit waits. Doing so can cause unpredictable wait times. For example, setting an implicit wait of 10 seconds and an explicit wait of 15 seconds could cause a timeout to occur after 20 seconds.
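The arithmetic behind the 20 second figure can be traced with a simulated clock. This is a simplified model, not the code of any Selenium client: it assumes every failed element lookup blocks for the full implicit wait, and that the explicit wait only checks its deadline between lookups. The 0.5 second poll interval and all names below are assumptions made for this sketch.

```python
class SimulatedClock:
    """Fake clock so the example runs instantly and deterministically."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds


IMPLICIT_WAIT = 10.0     # seconds each failed element lookup blocks
EXPLICIT_TIMEOUT = 15.0  # seconds the explicit wait is willing to wait
POLL_INTERVAL = 0.5      # assumed pause between explicit-wait polls


def find_missing_element(clock):
    """Look up an element that never appears: blocks for the implicit wait."""
    clock.sleep(IMPLICIT_WAIT)
    return None


def explicit_wait(clock):
    """Poll until the element is found or the explicit timeout has passed."""
    deadline = clock.now + EXPLICIT_TIMEOUT
    while True:
        if find_missing_element(clock):
            return True
        # The deadline can only be checked once the blocking lookup returns.
        if clock.now >= deadline:
            return False
        clock.sleep(POLL_INTERVAL)


clock = SimulatedClock()
found = explicit_wait(clock)
print(found, clock.now)
```

In this model the first lookup returns at 10 seconds, which is still inside the 15 second budget, so a second lookup starts at 10.5 seconds and blocks until 20.5 seconds. The explicit wait therefore gives up well after its own 15 second timeout.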

An implicit wait tells WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0, meaning disabled. Once set, the implicit wait is set for the life of the session.

  
WebDriver driver = new FirefoxDriver();
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
driver.get("http://somedomain/url_that_delays_loading");
WebElement myDynamicElement = driver.findElement(By.id("myDynamicElement"));
  
  
from selenium.webdriver import Firefox

driver = Firefox()
driver.implicitly_wait(10)
driver.get("http://somedomain/url_that_delays_loading")
my_dynamic_element = driver.find_element_by_id("myDynamicElement")
  

FluentWait

A FluentWait instance defines the maximum amount of time to wait for a condition, as well as the frequency with which to check the condition.

Users may configure the wait to ignore specific types of exceptions whilst waiting, such as NoSuchElementExceptions when searching for an element on the page.

// Waiting 30 seconds for an element to be present on the page, checking
// for its presence once every 5 seconds.
Wait<WebDriver> wait = new FluentWait<WebDriver>(driver)
  .withTimeout(30, SECONDS)
  .pollingEvery(5, SECONDS)
  .ignoring(NoSuchElementException.class);

WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
    return driver.findElement(By.id("foo"));
  }
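});

// Second example: poll every 100 ms for up to 1 second until a TEXTAREA is displayed.
// `browser` is assumed to be an existing WebDriver instance.
FluentWait<By> fluentWait = new FluentWait<By>(By.tagName("TEXTAREA"));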
fluentWait.pollingEvery(100, TimeUnit.MILLISECONDS);
fluentWait.withTimeout(1000, TimeUnit.MILLISECONDS);
fluentWait.until(new Predicate<By>() {
  public boolean apply(By by) {
    try {
      return browser.findElement(by).isDisplayed();
    } catch (NoSuchElementException ex) {
      return false;
    }
  }
});
browser.findElement(By.tagName("TEXTAREA")).sendKeys("text to enter");

Support classes

JavaScript Alerts, Prompts and Confirmations

WebDriver provides an API for working with the three types of native popup message offered by JavaScript. These popups are styled by the browser and offer limited customisation.

Alerts

The simplest of these is referred to as an alert, which shows a custom message, and a single button which dismisses the alert, labelled in most browsers as OK. It can also be dismissed in most browsers by pressing the close button, but this will always do the same thing as the OK button. See an example alert.

WebDriver can get the text from the popup and accept or dismiss these alerts.

 
//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click();

//Wait for the alert to be displayed and store it in a variable
Alert alert = wait.until(ExpectedConditions.alertIsPresent());

//Store the alert text in a variable
String text = alert.getText();

//Press the OK button
alert.accept();
 
 
# Click the link to activate the alert
driver.find_element_by_link_text("See an example alert").click()

# Wait for the alert to be displayed and store it in a variable
alert = wait.until(expected_conditions.alert_is_present())

# Store the alert text in a variable
text = alert.text

# Press the OK button
alert.accept()
 

Confirm

A confirm box is similar to an alert, except the user can also choose to cancel the message. See a sample confirm.

This example also shows a different approach to storing an alert:

 
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click();

//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent());

//Store the alert in a variable for reuse
Alert alert = driver.switchTo().alert();

//Store the alert text in a variable
String text = alert.getText();

//Press the Cancel button
alert.dismiss();
 
 
# Click the link to activate the alert
driver.find_element_by_link_text("See a sample confirm").click()

# Wait for the alert to be displayed
wait.until(expected_conditions.alert_is_present())

# Store the alert in a variable for reuse
alert = driver.switch_to.alert

# Store the alert text in a variable
text = alert.text

# Press the Cancel button
alert.dismiss()
 

Prompt

Prompts are similar to confirm boxes, except they also include a text input. Similar to working with form elements, you can use WebDriver's send keys to fill in a response. This will completely replace the placeholder text. Pressing the cancel button will not submit any text. See a sample prompt.

 
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click();

//Wait for the alert to be displayed and store it in a variable
Alert alert = wait.until(ExpectedConditions.alertIsPresent());

//Type your message
alert.sendKeys("Selenium");

//Press the OK button
alert.accept();
 
 
# Click the link to activate the alert
driver.find_element_by_link_text("See a sample prompt").click()

# Wait for the alert to be displayed
wait.until(expected_conditions.alert_is_present())

# Store the alert in a variable for reuse
# (Alert is imported from selenium.webdriver.common.alert)
alert = Alert(driver)

# Type your message
alert.send_keys("Selenium")

# Press the OK button
alert.accept()
 

HTTP proxies

Page Load Strategy

Web Element

Represents a DOM element. WebElements can be found by searching from the document root using a WebDriver instance, or by searching under another WebElement:

  
driver.get('http://www.google.com')
  .then(() =>   driver.findElement(By.tagName('form')) )
  .then((searchForm) => searchForm.findElement(By.name('q')) )
  .then((searchBox) => searchBox.sendKeys('webdriver') );
  
  
from selenium.webdriver import Firefox

driver = Firefox()
driver.get("http://www.google.com")
search_form = driver.find_element_by_tag_name("form")
search_box = search_form.find_element_by_name("q")
search_box.send_keys("webdriver")
  
  
WebDriver driver = new FirefoxDriver();
driver.get("http://www.google.com");
WebElement searchForm = driver.findElement(By.tagName("form"));
WebElement searchbox = searchForm.findElement(By.name("q"));
searchbox.sendKeys("webdriver");
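The difference between searching from the document root and searching under another element can be shown without a browser. The sketch below uses Python's standard `xml.etree.ElementTree` as a stand-in for the DOM. It is an analogy only and does not use the WebDriver API, and the page markup and `id` values are invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up page with two inputs named "q": one in the header, one in a form.
PAGE = """
<html>
  <body>
    <input name="q" id="header-search"/>
    <form>
      <input name="q" id="form-search"/>
    </form>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# Searching from the document root returns the first match in document order.
from_root = root.find(".//input[@name='q']")

# Searching under the form only considers the form's descendants.
form = root.find(".//form")
from_form = form.find(".//input[@name='q']")

print(from_root.get("id"))
print(from_form.get("id"))
```

The two lookups use the same locator but return different elements, which is why scoping a search to a parent element can matter on pages that repeat a name or class.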
  

Keyboard

Mouse