Setting Up a Selenium-WebDriver Project

Installing Selenium means setting up a project in a development environment so that you can write programs that use Selenium. How you do this depends on your programming language and your development environment.

Java

The easiest way to set up a Selenium 2.0 Java project is to use Maven. Maven downloads the Java bindings (the Selenium 2.0 Java client library) and all their dependencies, and creates the project for you from a Maven pom.xml (project configuration) file. Once you’ve done this, you can import the Maven project into your preferred IDE, such as IntelliJ IDEA or Eclipse.

First, create a folder to contain your Selenium project files. Then, to use Maven, you need a pom.xml file, which you can create with a text editor. We won’t cover the details of pom.xml files or of using Maven, since excellent references already exist. Create this file in the folder you created for your project. It will look something like this.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns=""

Be sure you specify the most current version. At the time of writing, the version listed above was the most current; however, there were frequent releases immediately after the release of Selenium 2.0. Check the Maven download page for the current release and edit the above dependency accordingly.

Now, from the command line, cd into the project directory and run Maven as follows.

mvn clean install

This will download Selenium and all its dependencies and will add them to the project.

Finally, import the project into your preferred development environment. For those not familiar with this, we’ve provided an appendix which shows this.

Importing a Maven project into IntelliJ IDEA
Importing a Maven project into Eclipse


C#

As of Selenium 2.2.0, the C# bindings are distributed as a set of signed DLLs along with other dependency DLLs. Prior to 2.2.0, all Selenium DLLs were unsigned. To include Selenium in your project, simply download the latest selenium-dotnet zip file. If you are using Windows Vista or above, you should unblock the zip file before unzipping it: right-click the zip file, click “Properties”, click “Unblock”, and click “OK”.

Unzip the contents of the zip file, and add a reference to each of the unzipped dlls to your project in Visual Studio (or your IDE of choice).

Official NuGet packages: RC, WebDriver, WebDriverBackedSelenium, Support


Python

If you are using Python for test automation, then you are probably already familiar with developing in Python. To add Selenium to your Python environment, run the following command from the command line.

pip install selenium

This command requires pip to be installed; pip in turn depends on setuptools.

Teaching Python development itself is beyond the scope of this document. However, there are many resources on Python, and developers in your organization can likely help you get up to speed.


Ruby

If you are using Ruby for test automation, then you are probably already familiar with developing in Ruby. To add Selenium to your Ruby environment, run the following command from the command line.

gem install selenium-webdriver

Teaching Ruby development itself is beyond the scope of this document. However, there are many resources on Ruby, and developers in your organization can likely help you get up to speed.


Perl

Perl bindings are provided by a third party. Please refer to their documentation on how to install them and get started. There is one known Perl binding as of this writing.


PHP

PHP bindings are provided by third parties. Please refer to their documentation on how to install them and get started. There are three known bindings at this time: by Chibimagic, by Lukasz Kolczynski, and by Facebook.

Migrating from Selenium 1.0

For those who already have test suites written using Selenium 1.0, we have provided tips on how to migrate your existing code to Selenium 2.0. Simon Stewart, the lead developer for Selenium 2.0, has written an article on migrating from Selenium 1.0. We’ve included this as an appendix.

Migrating From Selenium RC to Selenium WebDriver

Introducing the Selenium-WebDriver API by Example

WebDriver is a tool for automating web application testing, and in particular for verifying that applications work as expected. It aims to provide a friendly API that’s easy to explore and understand and easier to use than the Selenium-RC (1.0) API, which will help make your tests easier to read and maintain. It’s not tied to any particular test framework, so it can be used equally well in a unit testing project or from a plain old “main” method. This section introduces WebDriver’s API and helps you get started with it. Start by setting up a WebDriver project if you haven’t already. This was described in the previous section, Setting Up a Selenium-WebDriver Project.

Once your project is set up, you can see that WebDriver acts just as any normal library: it is entirely self-contained, and you usually don’t need to remember to start any additional processes or run any installers before using it, as opposed to the proxy server with Selenium-RC.

Note: additional steps are required to use Chrome Driver, Opera Driver, Android Driver and iPhone Driver.

You’re now ready to write some code. An easy way to get started is this example, which searches for the term “Cheese” on Google and then outputs the result page’s title to the console.

package org.openqa.selenium.example;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Selenium2Example  {
    public static void main(String[] args) {
        // Create a new instance of the Firefox driver
        // Notice that the remainder of the code relies on the interface,
        // not the implementation.
        WebDriver driver = new FirefoxDriver();

        // And now use this to visit Google
        driver.get("http://www.google.com");
        // Alternatively the same thing can be done like this
        // driver.navigate().to("");

        // Find the text input element by its name
        WebElement element = driver.findElement(By.name("q"));

        // Enter something to search for
        element.sendKeys("Cheese!");

        // Now submit the form. WebDriver will find the form for us from the element
        element.submit();

        // Check the title of the page
        System.out.println("Page title is: " + driver.getTitle());

        // Google's search is rendered dynamically with JavaScript.
        // Wait for the page to load, timeout after 10 seconds
        (new WebDriverWait(driver, 10)).until(new ExpectedCondition<Boolean>() {
            public Boolean apply(WebDriver d) {
                return d.getTitle().toLowerCase().startsWith("cheese!");
            }
        });

        // Should see: "cheese! - Google Search"
        System.out.println("Page title is: " + driver.getTitle());

        // Close the browser
        driver.quit();
    }
}

In upcoming sections, you will learn more about how to use WebDriver for things such as navigating forward and backward in your browser’s history, and how to test web sites that use frames and windows. We also provide more thorough discussion and examples.

Selenium-WebDriver API Commands and Operations

Fetching a Page

The first thing you’re likely to want to do with WebDriver is navigate to a page. The normal way to do this is by calling “get”:

driver.get("http://www.google.com");
Depending on several factors, including the OS/browser combination, WebDriver may or may not wait for the page to load. In some circumstances, WebDriver may return control before the page has finished, or even started, loading. To ensure robustness, you need to wait for the element(s) to exist in the page using Explicit and Implicit Waits.
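
The idea behind such a wait can be sketched in plain Java. This is only an illustration of the polling technique, not WebDriver's own wait implementation: the class name PollingWait, the method waitUntil, and the counter used as a stand-in condition are all hypothetical. In a real test the condition would check for the element in the page.

```java
import java.util.function.BooleanSupplier;

public class PollingWait {
    /**
     * Polls the condition until it is true or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for "the element now exists in the page":
        // the condition becomes true on the third check.
        int[] checks = {0};
        boolean met = waitUntil(() -> ++checks[0] >= 3, 2000, 50);
        System.out.println("Condition met: " + met + " after " + checks[0] + " checks");
    }
}
```

The design choice worth noting is that the condition is checked once before any sleeping, so a page that is already ready costs no delay.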

Locating UI Elements (WebElements)

Locating elements in WebDriver can be done on the WebDriver instance itself or on a WebElement. Each of the language bindings exposes a “Find Element” and a “Find Elements” method. The first returns a WebElement object, or throws an exception if no element matches. The latter returns a list of WebElements, which is empty if no DOM elements match the query.

The “Find” methods take a locator or query object called “By”. “By” strategies are listed below.


By ID

This is the most efficient and preferred way to locate an element. Common pitfalls for UI developers are non-unique ids on a page and auto-generated ids; both should be avoided. A class on an HTML element is more appropriate than an auto-generated id.

Example of how to find an element that looks like this:

<div id="coolestWidgetEvah">...</div>

WebElement element = driver.findElement(By.id("coolestWidgetEvah"));

By Class Name

“Class” in this case refers to the attribute on the DOM element. Often in practical use there are many DOM elements with the same class name, thus finding multiple elements becomes the more practical option over finding the first element.

Example of how to find an element that looks like this:

<div class="cheese"><span>Cheddar</span></div><div class="cheese"><span>Gouda</span></div>
List<WebElement> cheeses = driver.findElements(By.className("cheese"));

By Tag Name

The DOM Tag Name of the element.

Example of how to find an element that looks like this:

<iframe src="..."></iframe>
WebElement frame = driver.findElement(By.tagName("iframe"));

By Name

Find the input element with matching name attribute.

Example of how to find an element that looks like this:

<input name="cheese" type="text"/>

WebElement cheese = driver.findElement(By.name("cheese"));


By CSS

As the name implies, this is a locator strategy based on CSS selectors. Native browser support is used by default, so please refer to the W3C CSS selectors specification for a list of generally available CSS selectors. If a browser does not have native support for CSS queries, then Sizzle is used. IE 6, 7 and FF 3.0 currently use Sizzle as the CSS query engine.

Beware that not all browsers were created equal; some CSS that works in one version may not work in another.

Example of how to find the cheese below:

<div id="food"><span class="dairy">milk</span><span class="dairy aged">cheese</span></div>
WebElement cheese = driver.findElement(By.cssSelector("#food span.dairy.aged"));


By XPath

At a high level, WebDriver uses a browser’s native XPath capabilities wherever possible. On those browsers that don’t have native XPath support, we have provided our own implementation. This can lead to some unexpected behaviour unless you are aware of the differences in the various XPath engines.

Driver                    Tag and Attribute Name  Attribute Values            Native XPath Support
HtmlUnit Driver           Lower-cased             As they appear in the HTML  Yes
Internet Explorer Driver  Lower-cased             As they appear in the HTML  No
Firefox Driver            Case insensitive        As they appear in the HTML  Yes

This is a little abstract, so for the following piece of HTML:

<input type="text" name="example" />
<INPUT type="text" name="other" />
List<WebElement> inputs = driver.findElements(By.xpath("//input"));

The following number of matches will be found:

XPath expression  HtmlUnit Driver  Firefox Driver  Internet Explorer Driver
//input           1 (“example”)    2               2
//INPUT           0                2               0

Sometimes HTML elements do not need attributes to be explicitly declared because they will default to known values. For example, the “input” tag does not require the “type” attribute because it defaults to “text”. The rule of thumb when using xpath in WebDriver is that you should not expect to be able to match against these implicit attributes.
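
To see why engine differences and implicit attributes matter, here is a small sketch that runs XPath with the JDK's own javax.xml.xpath engine against the two inputs above plus a third input that omits “type” (that third element and the class name XPathCaseDemo are additions for illustration). This engine is strictly case-sensitive and is not one of the WebDriver drivers, so its counts are not expected to match any column in the table; it simply shows one more way an engine can behave.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathCaseDemo {
    // The two inputs from the text, plus one without an explicit "type" attribute
    static final String MARKUP =
        "<form>"
        + "<input type=\"text\" name=\"example\" />"
        + "<INPUT type=\"text\" name=\"other\" />"
        + "<input name=\"implicit\" />"
        + "</form>";

    /** Counts the nodes that the XPath expression matches in MARKUP. */
    public static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(MARKUP)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Tag names are matched exactly as written
        System.out.println("//input -> " + count("//input"));
        System.out.println("//INPUT -> " + count("//INPUT"));
        // The input without an explicit type attribute is not matched here
        System.out.println("//input[@type='text'] -> " + count("//input[@type='text']"));
    }
}
```

The last expression matches only the element that declares type="text" explicitly, which is the behaviour the rule of thumb above tells you to expect.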

Using JavaScript

You can execute arbitrary javascript to find an element and as long as you return a DOM Element, it will be automatically converted to a WebElement object.

Simple example on a page that has jQuery loaded:

WebElement element = (WebElement) ((JavascriptExecutor)driver).executeScript("return $('.cheese')[0]");

Finding all the input elements for every label on a page:

List<WebElement> labels = driver.findElements(By.tagName("label"));
List<WebElement> inputs = (List<WebElement>) ((JavascriptExecutor)driver).executeScript(
    "var labels = arguments[0], inputs = []; for (var i=0; i < labels.length; i++){" +
    "inputs.push(document.getElementById(labels[i].getAttribute('for'))); } return inputs;", labels);

User Input - Filling In Forms

We’ve already seen how to enter text into a textarea or text field, but what about the other elements? You can “toggle” the state of checkboxes, and you can use “click” to set something like an OPTION tag selected. Dealing with SELECT tags isn’t too bad:

WebElement select = driver.findElement(By.tagName("select"));
List<WebElement> allOptions = select.findElements(By.tagName("option"));
for (WebElement option : allOptions) {
    System.out.println(String.format("Value is: %s", option.getAttribute("value")));
    // Select this option
    option.click();
}

This will find the first “SELECT” element on the page, and cycle through each of its OPTIONs in turn, printing out their values, and selecting each in turn. As you will notice, this isn’t the most efficient way of dealing with SELECT elements. WebDriver’s support classes include one called “Select”, which provides useful methods for interacting with these.

Select select = new Select(driver.findElement(By.tagName("select")));
select.deselectAll();
select.selectByVisibleText("Edam");

This will deselect all OPTIONs from the first SELECT on the page, and then select the OPTION with the displayed text of “Edam”.

Once you’ve finished filling out the form, you probably want to submit it. One way to do this would be to find the “submit” button and click it:

driver.findElement(By.id("submit")).click();

Alternatively, WebDriver has the convenience method “submit” on every element. If you call this on an element within a form, WebDriver will walk up the DOM until it finds the enclosing form and then call submit on that. If the element isn’t in a form, then the NoSuchElementException will be thrown:

element.submit();

Moving Between Windows and Frames

Some web applications have many frames or multiple windows. WebDriver supports moving between named windows using the “switchTo” method:

driver.switchTo().window("windowName");

All calls to driver will now be interpreted as being directed to the particular window. But how do you know the window’s name? Take a look at the JavaScript or link that opened it:

<a href="somewhere.html" target="windowName">Click here to open a new window</a>

Alternatively, you can pass a “window handle” to the “switchTo().window()” method. Knowing this, it’s possible to iterate over every open window like so:

for (String handle : driver.getWindowHandles()) {
    driver.switchTo().window(handle);
}

You can also switch from frame to frame (or into iframes):

driver.switchTo().frame("frameName");

It’s possible to access subframes by separating the path with a dot, and you can specify the frame by its index too. That is:

driver.switchTo().frame("frameName.0.child");

would go to the frame named “child” of the first subframe of the frame called “frameName”. All frames are evaluated as if from *top*.
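
As a reading aid, the sketch below breaks a dotted frame path such as "frameName.0.child" into its steps, treating purely numeric parts as zero-based indexes (so 0 is the first subframe) and everything else as frame names. The class FramePath and its describe method are hypothetical; this is not how WebDriver resolves frames internally, only a way to show how the path is read.

```java
import java.util.ArrayList;
import java.util.List;

public class FramePath {
    /** Splits a dotted frame path and describes each step, starting from the top. */
    public static List<String> describe(String path) {
        List<String> steps = new ArrayList<>();
        for (String part : path.split("\\.")) {
            if (part.matches("\\d+")) {
                // Numeric parts select a subframe by zero-based index
                steps.add("subframe at index " + part);
            } else {
                // Anything else is treated as a frame name
                steps.add("frame named \"" + part + "\"");
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        for (String step : describe("frameName.0.child")) {
            System.out.println(step);
        }
    }
}
```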


Cookies

Before we leave these next steps, you may be interested in understanding how to use cookies. First of all, you need to be on the domain that the cookie will be valid for. If you are trying to preset cookies before you start interacting with a site and your homepage is large or takes a while to load, an alternative is to find a smaller page on the site; typically the 404 page is small.

// Go to the correct domain

// Now set the cookie. This one's valid for the entire domain
Cookie cookie = new Cookie("key", "value");
driver.manage().addCookie(cookie);

// And now output all the available cookies for the current URL
Set<Cookie> allCookies = driver.manage().getCookies();
for (Cookie loadedCookie : allCookies) {
    System.out.println(String.format("%s -> %s", loadedCookie.getName(), loadedCookie.getValue()));
}

// You can delete cookies in 3 ways
// By name
driver.manage().deleteCookieNamed("key");
// By Cookie
driver.manage().deleteCookie(cookie);
// Or all of them
driver.manage().deleteAllCookies();

Changing the User Agent

This is easy with the Firefox Driver:

FirefoxProfile profile = new FirefoxProfile();
profile.addAdditionalPreference("general.useragent.override", "some UA string");
WebDriver driver = new FirefoxDriver(profile);

Drag And Drop

Here’s an example of using the Actions class to perform a drag and drop. Native events are required to be enabled.

WebElement element = driver.findElement(By.name("source"));
WebElement target = driver.findElement(By.name("target"));

(new Actions(driver)).dragAndDrop(element, target).perform();

Driver Specifics and Tradeoffs

Selenium-WebDriver’s Drivers

WebDriver is the name of the key interface against which tests should be written, but there are several implementations. These include:

HtmlUnit Driver

This is currently the fastest and most lightweight implementation of WebDriver. As the name suggests, this is based on HtmlUnit. HtmlUnit is a Java-based implementation of a web browser without a GUI. For any language binding other than Java, the Selenium Server is required to use this driver.


Usage

WebDriver driver = new HtmlUnitDriver();

Pros

  • Fastest implementation of WebDriver
  • A pure Java solution and so it is platform independent.
  • Supports JavaScript

Cons

  • Emulates other browsers’ JavaScript behaviour (see below)

JavaScript in the HtmlUnit Driver

None of the popular browsers uses the JavaScript engine used by HtmlUnit (Rhino). If you test JavaScript using HtmlUnit the results may differ significantly from those browsers.

When we say “JavaScript” we actually mean “JavaScript and the DOM”. Although the DOM is defined by the W3C each browser has its own quirks and differences in their implementation of the DOM and in how JavaScript interacts with it. HtmlUnit has an impressively complete implementation of the DOM and has good support for using JavaScript, but it is no different from any other browser: it has its own quirks and differences from both the W3C standard and the DOM implementations of the major browsers, despite its ability to mimic other browsers.

With WebDriver, we had to make a choice; do we enable HtmlUnit’s JavaScript capabilities and run the risk of teams running into problems that only manifest themselves there, or do we leave JavaScript disabled, knowing that there are more and more sites that rely on JavaScript? We took the conservative approach, and by default have disabled support when we use HtmlUnit. With each release of both WebDriver and HtmlUnit, we reassess this decision: we hope to enable JavaScript by default on the HtmlUnit at some point.

Enabling JavaScript

If you can’t wait, enabling JavaScript support is very easy:

HtmlUnitDriver driver = new HtmlUnitDriver(true);

This will cause the HtmlUnit Driver to emulate Firefox 3.6’s JavaScript handling by default.

Firefox Driver

Controls the Firefox browser using a Firefox plugin. The Firefox profile that is used is stripped down from what is installed on the machine to include only the Selenium WebDriver.xpi (plugin). A few settings are also changed by default (see the source to see which ones). Firefox Driver runs and is tested on Windows, Mac, and Linux, currently on versions 3.6, 10, latest - 1, and latest.


Usage

WebDriver driver = new FirefoxDriver();



Modifying the Firefox Profile

Suppose that you wanted to modify the user agent string (as above), but you’ve got a tricked out Firefox profile that contains dozens of useful extensions. There are two ways to obtain this profile. Assuming that the profile has been created using Firefox’s profile manager (firefox -ProfileManager):

ProfilesIni allProfiles = new ProfilesIni();
FirefoxProfile profile = allProfiles.getProfile("WebDriver");
profile.setPreference("", 23);
WebDriver driver = new FirefoxDriver(profile);

Alternatively, if the profile isn’t already registered with Firefox:

File profileDir = new File("path/to/top/level/of/profile");
FirefoxProfile profile = new FirefoxProfile(profileDir);
WebDriver driver = new FirefoxDriver(profile);

As we develop features in the Firefox Driver, we expose the ability to use them. For example, until we feel native events are stable on Firefox for Linux, they are disabled by default. To enable them:

FirefoxProfile profile = new FirefoxProfile();
profile.setEnableNativeEvents(true);
WebDriver driver = new FirefoxDriver(profile);


See the Firefox section in the wiki page for the most up to date info.

Internet Explorer Driver

This driver is controlled by a .dll and is thus only available on Windows. Each Selenium release has its core functionality tested against versions 6, 7 and 8 on XP, and 9 on Windows 7.


Usage

WebDriver driver = new InternetExplorerDriver();

Pros

  • Runs in a real browser and supports JavaScript with all the quirks your end users see.

Cons

  • Obviously the Internet Explorer Driver will only work on Windows!
  • Comparatively slow (though still pretty snappy :)
  • XPath is not natively supported in most versions. Sizzle is injected automatically, which is significantly slower than in other browsers and slower than CSS selectors in the same browser.
  • CSS is not natively supported in versions 6 and 7. Sizzle is injected instead.
  • CSS selectors in IE 8 and 9 are native, but those browsers don’t fully support CSS3.


See the Internet Explorer section of the wiki page for the most up to date info. Please take special note of the Required Configuration section.

Chrome Driver

Chrome Driver is maintained and supported by the Chromium project itself. WebDriver works with Chrome through the chromedriver binary (found on the Chromium project’s download page). You need to have both chromedriver and a version of the Chrome browser installed. chromedriver needs to be placed somewhere on your system’s path in order for WebDriver to discover it automatically. The Chrome browser itself is discovered by chromedriver in the default installation path. Both can be overridden by environment variables. Please refer to the wiki for more information.


Usage

WebDriver driver = new ChromeDriver();

Pros

  • Runs in a real browser and supports JavaScript
  • Because Chrome is a Webkit-based browser, the Chrome Driver may allow you to verify that your site works in Safari. Note that since Chrome uses its own V8 JavaScript engine rather than Safari’s Nitro engine, JavaScript execution may differ.



See our wiki for the most up to date info. More info can also be found on the downloads page

Getting running with Chrome Driver

Download the Chrome Driver executable and follow the other instructions on the wiki page

Opera Driver

See the Opera Driver wiki article in the Selenium Wiki for information on using the Opera Driver.

iPhone Driver

See the iPhone Driver wiki article in the Selenium Wiki for information on using the Mac iOS Driver.

Android Driver

See the Android Driver wiki article in the Selenium Wiki for information on using the Android Driver.

Alternative Back-Ends: Mixing WebDriver and RC Technologies

WebDriver-Backed Selenium-RC

The Java version of WebDriver provides an implementation of the Selenium-RC API. This means that you can use the underlying WebDriver technology through the Selenium-RC API. This is primarily provided for backwards compatibility. It allows those who have existing test suites using the Selenium-RC API to use WebDriver under the covers, and it helps ease the migration path to Selenium-WebDriver. It also allows you to use both APIs side by side in the same test code.

Selenium-WebDriver is used like this:

// You may use any WebDriver implementation. Firefox is used here as an example
WebDriver driver = new FirefoxDriver();

// A "base url", used by selenium to resolve relative URLs
 String baseUrl = "";

// Create the Selenium implementation
Selenium selenium = new WebDriverBackedSelenium(driver, baseUrl);

// Perform actions with selenium
selenium.open("");
selenium.type("name=q", "cheese");
selenium.click("name=btnG");

// Get the underlying WebDriver implementation back. This will refer to the
// same WebDriver instance as the "driver" variable above.
WebDriver driverInstance = ((WebDriverBackedSelenium) selenium).getWrappedDriver();

// Finally, close the browser. Call stop on the WebDriverBackedSelenium instance
// instead of calling driver.quit(). Otherwise, the JVM will continue running after
// the browser has been closed.
selenium.stop();


Pros

  • Allows for the WebDriver and Selenium APIs to live side-by-side
  • Provides a simple mechanism for a managed migration from the Selenium RC API to WebDriver’s
  • Does not require the standalone Selenium RC server to be run

Cons

  • Does not implement every method
  • More advanced Selenium usage (using “browserbot” or other built-in JavaScript methods from Selenium Core) may not work
  • Some methods may be slower due to underlying implementation differences

Backing WebDriver with Selenium

WebDriver doesn’t support as many browsers as Selenium RC does, so in order to provide that support while still using the WebDriver API, you can make use of the SeleneseCommandExecutor.

Safari is supported in this way with the following code (be sure to disable pop-up blocking):

DesiredCapabilities capabilities = new DesiredCapabilities();
CommandExecutor executor = new SeleneseCommandExecutor(new URL("http://localhost:4444/"), new URL(""), capabilities);
WebDriver driver = new RemoteWebDriver(executor, capabilities);

There are currently some major limitations with this approach, notably that findElements doesn’t work as expected. Also, because we’re using Selenium Core for the heavy lifting of driving the browser, you are limited by the JavaScript sandbox.

Running Standalone Selenium Server for use with RemoteDrivers

From Selenium’s Download page, download selenium-server-standalone-<version>.jar and optionally IEDriverServer. If you plan to work with Chrome, download chromedriver from Google Code.

Unpack IEDriverServer and/or chromedriver and put them in a directory which is on the $PATH / %PATH% - the Selenium Server should then be able to handle requests for IE / Chrome without additional modifications.

Start the server on the command line with

java -jar <path_to>/selenium-server-standalone-<version>.jar

If you want to use native events functionality, indicate this on the command line with the option

For other command line options, execute

java -jar <path_to>/selenium-server-standalone-<version>.jar -help

In order to function properly, the following ports should be allowed incoming TCP connections: 4444, 7054-5 (or twice as many ports as the number of concurrent instances you plan to run). Under Windows, you may need to unblock the applications as well.
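
Before starting the server, it can help to confirm that nothing else on the machine is already listening on these ports. The sketch below does only that: it tries to bind each port locally and reports the result. It does not test firewall rules or whether incoming connections are allowed, and the class name PortCheck is hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {
    /** Returns true if a server socket can currently be bound to the port. */
    public static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            // Binding succeeded; the socket is closed again on exit
            return true;
        } catch (IOException e) {
            // Typically means another process already holds the port
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports named in the text: the server port and the 7054-5 range
        int[] ports = {4444, 7054, 7055};
        for (int port : ports) {
            System.out.println("Port " + port + (isFree(port) ? " is free" : " is already in use"));
        }
    }
}
```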

Additional Resources

You can find further resources for WebDriver in WebDriver’s wiki

Of course, don’t hesitate to do an internet search on any Selenium topic, including Selenium-WebDriver’s drivers. There are quite a few blogs on Selenium along with numerous posts on various user forums. Additionally the Selenium User’s Group is a great resource.

Next Steps

This chapter has simply been a high-level walkthrough of WebDriver and some of its key capabilities. Once you are familiar with the Selenium-WebDriver API, you will want to learn how to build test suites for maintainability, extensibility, and reduced fragility when features of the AUT frequently change. The approach most Selenium experts now recommend is to design your test code using the Page Object Design Pattern, possibly along with a Page Factory. Selenium-WebDriver supports this by supplying a PageFactory class in Java and C#. This is presented, along with other advanced topics, in the next chapter. Also, for a high-level description of this technique, you may want to look at the Test Design Considerations chapter. Both of these chapters present techniques for writing more maintainable tests by making your test code more modular.
