Selenium supports automation of all the major browsers on the market through the use of WebDriver. WebDriver is an API and protocol that defines a language-neutral interface for controlling the behaviour of web browsers. Each browser is backed by a specific WebDriver implementation, called a *driver*. The driver is the component responsible for delegating down to the browser, and it handles communication between Selenium and the browser.
This separation is part of a conscious effort to have browser vendors take responsibility for the implementation for their browsers. Selenium makes use of these third-party drivers where possible, but the project also maintains its own drivers for the cases where no vendor driver is available.
The Selenium framework ties all of these pieces together through a user-facing interface that enables the different browser backends to be used transparently, enabling cross-browser and cross-platform automation.
More details about drivers can be found in Driver Idiosyncrasies.
The Selenium framework officially supports the following browsers:
| Browser | Maintainer | Versions supported |
| --- | --- | --- |
| Firefox | Mozilla | 54 and newer |
| Internet Explorer | Selenium | 6 and newer |
| Opera | Opera Chromium / Presto | 10.5 and newer |
| Safari | Apple | 10 and newer |
There is also a set of specialized browsers, typically used in development environments. Some of these browsers can be used for automation purposes as well, and Selenium ties in support for the following specialized drivers:
| Driver name | Purpose | Maintainer |
| --- | --- | --- |
| PhantomJSDriver | Headless PhantomJS browser backed by QtWebKit. | GhostDriver project |
| HtmlUnitDriver | Headless browser emulator backed by Rhino. | Selenium project |
Selenium can be extended through the use of plugins. Here are a number of plugins created and maintained by third parties. For more information on how to create your own plugin or have it listed, consult the docs.
Please note that these plugins are not supported, maintained, hosted, or endorsed by the Selenium project. In addition, be advised that the plugins listed below are not necessarily licensed under the Apache License v.2.0. Some of the plugins are available under another free and open source software license; others are only available under a proprietary license. Any questions about plugins and their license of distribution need to be raised with their respective developer(s).
| Driver | Latest release | Changelog | Issues | Wiki |
| --- | --- | --- | --- | --- |
| Google ChromeDriver | 2.29 - 2017-04-04 | changelog | issues | wiki |
One of the most fundamental techniques to learn when using WebDriver is how to find elements on the page. WebDriver offers a number of built-in selector types, amongst them finding an element by its ID attribute:
WebElement cheese = driver.findElement(By.id("cheese"));
As seen in the example, locating elements in WebDriver is done on the WebDriver instance object. The findElement(By) method returns another fundamental object type, the WebElement.
- WebDriver represents the browser
- WebElement represents a particular DOM node (a control, e.g. a link or input field, etc.)
Once you have a reference to a web element that's been “found”, you can narrow the scope of your search by using the same call on that object instance:
    // Java
    WebElement cheese = driver.findElement(By.id("cheese"));
    WebElement cheddar = cheese.findElement(By.id("cheddar"));

    # Python
    cheese = driver.find_element_by_id("cheese")
    cheddar = cheese.find_element_by_id("cheddar")

    # Ruby
    cheese = driver.find_element(id: "cheese")
    cheddar = cheese.find_element(id: "cheddar")
You can do this because both the WebDriver and WebElement types implement the SearchContext interface. In WebDriver, this is known as a role-based interface. Role-based interfaces allow you to determine whether a particular driver implementation supports a given feature. These interfaces are clearly defined and try to adhere to having only a single role of responsibility. You can read more about WebDriver's design and what roles are supported in which drivers in the [Some Other Section Which Must Be Named](#).
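Because both types expose the same search methods, helper code can be written against that shared role rather than against a concrete type. The following is a minimal Python sketch, assuming the two-argument (strategy, value) form of find_element that the Python bindings offer alongside the find_element_by_* helpers. The SearchContext protocol is declared locally for illustration and is not imported from Selenium, and find_cheddar is a hypothetical helper name.

```python
from typing import Any, List, Protocol


class SearchContext(Protocol):
    """Local stand-in for the search role shared by drivers and elements."""

    def find_element(self, by: str, value: str) -> Any: ...

    def find_elements(self, by: str, value: str) -> List[Any]: ...


def find_cheddar(context: SearchContext) -> Any:
    """Look up the element with ID "cheddar" inside the given context.

    context may be the driver (search the whole page) or a previously
    found element (search only that element's descendants).
    """
    return context.find_element("id", "cheddar")
```

Calling find_cheddar(driver) would search the whole document, while find_cheddar(cheese) would search only below the cheese element.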
The By interface used above also supports a number of additional locator strategies. A nested lookup might not be the most effective cheese location strategy, since it requires two separate commands to be issued to the browser: first a search of the DOM for an element with ID “cheese”, then a search for “cheddar” in the narrowed context.
To improve the performance slightly, we should try to use a more specific locator. WebDriver supports looking up elements by CSS selector, allowing us to combine the two previous locators into one search:
cheddar = driver.find_element_by_css_selector("#cheese #cheddar")
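When the chain of IDs is only known at run time, the combined selector can be assembled from its parts. This is a small sketch only: descendant_selector is a hypothetical helper, and it assumes every ID is a plain identifier that needs no CSS escaping.

```python
from typing import Iterable


def descendant_selector(ids: Iterable[str]) -> str:
    """Join element IDs into one CSS descendant selector.

    Assumes each ID is a plain identifier (letters, digits, "-", "_")
    that needs no CSS escaping.
    """
    return " ".join("#" + element_id for element_id in ids)


selector = descendant_selector(["cheese", "cheddar"])
print(selector)  # prints "#cheese #cheddar"
```

The resulting string can be handed to a single CSS selector lookup, so the browser receives one command instead of two.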
It's possible that the document we are working with may turn out to have an ordered list of the cheese we like the best:
    <ol id=cheese>
     <li id=cheddar>…
     <li id=brie>…
     <li id=rochefort>…
     <li id=camembert>…
    </ol>
Since more cheese is indisputably better, and it would be cumbersome to have to retrieve each of the items individually, a superior technique for retrieving cheese is to make use of the pluralized findElements(By). This method returns a collection of web elements. If only one element is found, it will still return a collection (of one element). If no elements match the locator, an empty list will be returned.
List<WebElement> muchoCheese = driver.findElements(By.cssSelector("#cheese li"));
mucho_cheese = driver.find_elements_by_css_selector("#cheese li")
mucho_cheese = driver.find_elements(css: "#cheese li")
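Since an empty result is an ordinary list rather than an error, the plural lookup also lends itself to presence checks and counting. The sketch below assumes the two-argument (strategy, value) form of find_elements in the Python bindings, with the strategy given by its plain string name. The helper names is_present and count_cheeses are hypothetical.

```python
from typing import Any


def is_present(context: Any, by: str, value: str) -> bool:
    """Return True if at least one element matches the locator.

    Relies on find_elements returning an empty list, rather than
    raising an error, when nothing matches.
    """
    return len(context.find_elements(by, value)) > 0


def count_cheeses(context: Any) -> int:
    """Count the list items under the element with ID "cheese"."""
    return len(context.find_elements("css selector", "#cheese li"))
```

Both helpers accept either the driver or a previously found element as the context.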
There are eight different built-in element location strategies in WebDriver:
| Locator | Description |
| --- | --- |
| class name | Locates elements whose class name contains the search value (compound class names are not permitted) |
| css selector | Locates elements matching a CSS selector |
| id | Locates elements whose ID attribute matches the search value |
| name | Locates elements whose NAME attribute matches the search value |
| link text | Locates anchor elements whose visible text matches the search value |
| partial link text | Locates anchor elements whose visible text partially matches the search value |
| tag name | Locates elements whose tag name matches the search value |
| xpath | Locates elements matching an XPath expression |
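The class name strategy takes a single class only. One common way around that restriction, sketched here rather than prescribed, is to express the compound class list as a CSS selector and use the css selector strategy instead. class_names_to_css is a hypothetical helper, and it assumes the class names need no CSS escaping.

```python
def class_names_to_css(class_names: str) -> str:
    """Turn a space-separated class list into a single CSS selector.

    "btn primary" becomes ".btn.primary", which matches elements that
    carry both classes. Assumes the names need no CSS escaping.
    """
    return "".join("." + name for name in class_names.split())


print(class_names_to_css("btn primary"))  # prints ".btn.primary"
```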
In general, if HTML IDs are available, unique, and consistently predictable, they are the preferred method for locating an element on a page. They tend to work very quickly, and forgo much of the processing that comes with complicated DOM traversals.
If unique IDs are unavailable, a well-written CSS selector is the preferred method of locating an element. XPath is a workable alternative to CSS selectors, but its syntax is complicated and frequently difficult to debug. Though XPath selectors are very flexible, they are typically not performance tested by browser vendors and tend to be quite slow.
Selection strategies based on link text and partial link text have drawbacks in that they only work on link elements. Additionally, they call down to XPath selectors internally in WebDriver.
Tag name can be a dangerous way to locate elements, because there are frequently multiple elements with the same tag on the page. This strategy is mostly useful when calling the findElements(By) method, which returns a collection of elements.
The recommendation is to keep your locators as compact and readable as possible. Asking WebDriver to traverse the DOM structure is an expensive operation, and the more you can narrow the scope of your search, the better.
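The preference order described above (ID first, then CSS selector, then XPath) can be written down as a small selection function. This is an illustrative sketch: choose_locator and its parameter names are hypothetical, and the returned pair uses the plain strategy names from the table of built-in strategies.

```python
from typing import Optional, Tuple


def choose_locator(
    element_id: Optional[str] = None,
    css: Optional[str] = None,
    xpath: Optional[str] = None,
) -> Tuple[str, str]:
    """Pick a (strategy, value) pair, preferring ID, then CSS, then XPath."""
    if element_id:
        return ("id", element_id)
    if css:
        return ("css selector", css)
    if xpath:
        return ("xpath", xpath)
    raise ValueError("no locator candidates were supplied")
```

The pair can then be unpacked into a lookup, for example driver.find_element(*choose_locator(css="#cheese li")), assuming the two-argument form of find_element in the Python bindings.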
You can set an element's text using the sendKeys method as follows:
    // Java
    String name = "Charles";
    driver.findElement(By.name("name")).sendKeys(name);

    # Python
    name = "Charles"
    driver.find_element_by_name("name").send_keys(name)

    # Ruby
    name = "Charles"
    driver.find_element(name: "name").send_keys(name)
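Typing into a field adds to whatever text is already there, so a field that may be pre-filled is usually cleared first. The sketch below wraps both steps. It assumes the two-argument (strategy, value) form of find_element and the clear and send_keys methods on elements in the Python bindings; fill_field is a hypothetical helper name.

```python
from typing import Any


def fill_field(context: Any, field_name: str, text: str) -> Any:
    """Replace the content of the input whose name attribute is field_name."""
    field = context.find_element("name", field_name)
    field.clear()          # remove any existing text first
    field.send_keys(text)  # then type the new value
    return field
```

A call such as fill_field(driver, "name", "Charles") would leave the field containing only the new value.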
    // Java: drag the source element and drop it onto the target element
    import org.openqa.selenium.interactions.Actions;

    WebElement source = driver.findElement(By.id("source"));
    WebElement target = driver.findElement(By.id("target"));
    new Actions(driver).dragAndDrop(source, target).build().perform();

    # Python
    from selenium.webdriver import ActionChains

    source = driver.find_element_by_id("source")
    target = driver.find_element_by_id("target")
    ActionChains(driver).drag_and_drop(source, target).perform()

    # Ruby
    source = driver.find_element(id: "source")
    target = driver.find_element(id: "target")
    driver.action.drag_and_drop(source, target).perform
You can click on an element using the click method: