Introduction to Selenium
Selenium is a powerful tool for automating web browsers. It is a popular open-source framework used for automated testing of web applications across different browsers and platforms. You can write test scripts in various programming languages, including Python, JavaScript, Java, C#, and Ruby.
Selenium WebDriver
This is the most widely used component. It allows you to programmatically control a web browser (e.g., Chrome, Firefox, Edge) as if a human were interacting with it. It is commonly used for end-to-end testing.
Selenium IDE
Selenium IDE is a browser extension for Chrome and Firefox that provides a simple record-and-playback tool for creating test cases without writing code. It is a good starting point for beginners.
Selenium Grid
Selenium Grid allows you to run tests on multiple machines and browsers in parallel. This is useful for speeding up test execution and cross-browser testing.
Common Use Cases
- Automated testing of websites and web applications
- Regression testing (to ensure old features still work)
- Continuous integration and delivery (CI/CD) pipelines
1. Setup & Installation
Install Selenium & WebDriver Manager
pip install selenium webdriver-manager
Import Required Modules
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Launch Chrome Browser
Basic setup
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get("https://example.com")
With options (maximized, headless)
You can launch the browser with specific options, such as maximizing the window or running in headless mode (without a GUI).
chrome_options = Options()
chrome_options.add_argument("--start-maximized")
chrome_options.add_argument("--headless") # Run without GUI
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)
2. Locating Elements
To interact with elements on a web page, you first need to locate them. Selenium provides several methods for this.
| Locator | Method | Example |
|---|---|---|
| ID | `find_element(By.ID, "id")` | `driver.find_element(By.ID, "username")` |
| Name | `find_element(By.NAME, "name")` | `driver.find_element(By.NAME, "email")` |
| Class Name | `find_element(By.CLASS_NAME, "class")` | `driver.find_element(By.CLASS_NAME, "btn")` |
| Tag Name | `find_element(By.TAG_NAME, "tag")` | `driver.find_element(By.TAG_NAME, "input")` |
| Link Text | `find_element(By.LINK_TEXT, "text")` | `driver.find_element(By.LINK_TEXT, "Sign Up")` |
| Partial Link Text | `find_element(By.PARTIAL_LINK_TEXT, "partial")` | `driver.find_element(By.PARTIAL_LINK_TEXT, "Sign")` |
| CSS Selector | `find_element(By.CSS_SELECTOR, "selector")` | `driver.find_element(By.CSS_SELECTOR, "#login .btn")` |
| XPath | `find_element(By.XPATH, "xpath")` | `driver.find_element(By.XPATH, "//input[@name='username']")` |
3. Interacting with Elements
Send Text to Input Field
element = driver.find_element(By.ID, "username")
element.send_keys("testuser")
Clear Input Field
element.clear()
Click a Button/Link
driver.find_element(By.ID, "submit-btn").click()
Check if Element is Displayed/Enabled/Selected
element = driver.find_element(By.ID, "checkbox")
print(element.is_displayed()) # True/False
print(element.is_enabled()) # True/False
print(element.is_selected()) # True/False (for checkboxes/radio buttons)
Get Element Text & Attributes
element = driver.find_element(By.ID, "header")
print(element.text) # Get visible text
print(element.get_attribute("href")) # Get attribute value
4. Dropdowns (Select Class)
For interacting with dropdowns (HTML select elements), Selenium provides the `Select` class.
from selenium.webdriver.support.select import Select
dropdown = Select(driver.find_element(By.ID, "country"))
dropdown.select_by_visible_text("USA") # Select by text
dropdown.select_by_value("us") # Select by value
dropdown.select_by_index(1) # Select by index
5. Handling Alerts & Popups
Selenium allows you to interact with JavaScript alert, confirm, and prompt dialogs.
Accept alert
alert = driver.switch_to.alert
print(alert.text) # Get alert text
alert.accept() # Click OK
Dismiss alert
alert.dismiss()
Send text to prompt
alert.send_keys("Yes")
alert.accept()
6. Waits (Implicit & Explicit)
Web pages often load dynamically, so it's crucial to implement waits to ensure elements are present and interactive before attempting to interact with them. Explicit waits are preferred over `time.sleep()` for better efficiency.
Implicit Wait (Global Wait for All Elements)
driver.implicitly_wait(10) # Wait up to 10 sec for elements
Explicit Wait (Wait for a Specific Condition)
Explicit waits allow you to wait for a specific condition to be met before proceeding.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
element = WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.ID, "dynamic-element"))
)
Common Expected Conditions:
- `EC.presence_of_element_located` (Element exists in DOM)
- `EC.visibility_of_element_located` (Element is visible)
- `EC.element_to_be_clickable` (Element is clickable)
- `EC.title_contains("Welcome")` (Page title contains text)
7. Keyboard & Mouse Actions
Keyboard Actions (Send Special Keys)
from selenium.webdriver.common.keys import Keys
driver.find_element(By.ID, "search").send_keys("Selenium" + Keys.ENTER)
Mouse Actions (Hover, Drag & Drop, Right-Click)
The `ActionChains` class allows you to perform complex mouse interactions.
from selenium.webdriver.common.action_chains import ActionChains
element = driver.find_element(By.ID, "menu")
action = ActionChains(driver)
action.move_to_element(element).click().perform() # Hover & Click
action.context_click(element).perform() # Right-Click
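source = driver.find_element(By.ID, "draggable") # Element to drag (hypothetical ID)
target = driver.find_element(By.ID, "droppable") # Drop destination (hypothetical ID)
action.drag_and_drop(source, target).perform() # Drag & Drop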
8. Handling Frames & Windows
Switch to Frame
Web pages can contain iframes. You need to switch to the frame to interact with elements inside it.
driver.switch_to.frame("frame-name") # By name, ID, or index
driver.switch_to.default_content() # Switch back to main page
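Forgetting to switch back leaves every later lookup searching inside the frame. One way to guard against that is a small context manager; this is a sketch, and the name `inside_frame` is an assumption rather than a Selenium API.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a `with` block.

    `frame_reference` is passed unchanged to driver.switch_to.frame().
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Always return to the main page, even if the body raised
        driver.switch_to.default_content()
```

Usage would look like `with inside_frame(driver, "frame-name"): ...`, with the element lookups placed inside the `with` block.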
Switch to New Window/Tab
When a new window or tab opens, you can switch the driver's focus to it.
driver.find_element(By.LINK_TEXT, "Open New Tab").click()
driver.switch_to.window(driver.window_handles[1]) # Switch to new tab
driver.close() # Close current tab
driver.switch_to.window(driver.window_handles[0]) # Switch back
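Indexing `window_handles` works in many cases, but the WebDriver specification does not guarantee the order of the handles. A more defensive sketch records the handles before the click and then switches to whichever handle is new. The helper name `switch_to_new_window` is hypothetical.

```python
def switch_to_new_window(driver, handles_before):
    """Switch to the window that was not open when `handles_before` was captured.

    Returns the new window's handle. Raises RuntimeError if no new window exists.
    """
    new_handles = [h for h in driver.window_handles if h not in handles_before]
    if not new_handles:
        raise RuntimeError("No new window or tab was opened")
    driver.switch_to.window(new_handles[0])
    return new_handles[0]


# Intended usage with a live driver (not executed here):
#   handles_before = driver.window_handles
#   driver.find_element(By.LINK_TEXT, "Open New Tab").click()
#   switch_to_new_window(driver, handles_before)
```

If the new tab opens slowly, waiting on `EC.new_window_is_opened(handles_before)` before calling the helper may be needed.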
9. Closing Browser
Close Current Window
driver.close() # Close current window
Quit Entire Browser Session
driver.quit() # Quit entire browser session
Final Notes
- Always use `try-finally` to ensure the browser closes properly.
- Prefer Explicit Waits over `time.sleep()` for better efficiency.
- Use XPath/CSS Selectors for complex element locating.
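The `try-finally` advice can be sketched as a small wrapper. The names `run_with_driver`, `create_driver`, and `task` are assumptions for illustration; `create_driver` is any zero-argument callable that returns a driver.

```python
def run_with_driver(create_driver, task):
    """Create a driver, run `task(driver)`, and always quit the browser."""
    driver = create_driver()
    try:
        return task(driver)
    finally:
        # Runs whether task() returned or raised, so no browser is left open
        driver.quit()


# Intended usage with a live browser (not executed here):
#   def page_title(driver):
#       driver.get("https://example.com")
#       return driver.title
#
#   title = run_with_driver(
#       lambda: webdriver.Chrome(service=Service(ChromeDriverManager().install())),
#       page_title,
#   )
```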
Selenium Python Script Examples
Example (1): Demonstrates the following:
- Importing Selenium webdriver module
- Importing time module
- Adding option to start Chrome browser
- Initializing Chrome browser
- get() - method/function
- sleep() - method/function
- quit() - method/function
- title - property
# First Selenium Program
from selenium import webdriver # Import the main Selenium webdriver module
from selenium.webdriver.chrome.service import Service # Import Service class to manage ChromeDriver service
from selenium.webdriver.chrome.options import Options # Import Options class to set Chrome browser options
from webdriver_manager.chrome import ChromeDriverManager # Import ChromeDriverManager to automatically manage ChromeDriver
import time # Import time module to use sleep for delays
chrome_options = Options() # Create an instance of Chrome Options
chrome_options.add_argument("--start-maximized") # Add option to start the Chrome browser maximized
# chrome_options.add_argument("--headless") # (Optional) Run Chrome in headless mode (without UI)
# Initialize Chrome browser using ChromeDriverManager to auto-download driver and apply options
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),
options=chrome_options)
# Open the specified URL in the Chrome browser
driver.get("https://skillzam.com")
# Get the title of the current web page and store it in a variable
webTitle = driver.title
print(webTitle) # Print the page title to the console
# Pause for 2 seconds so the loaded page stays visible before the browser closes
time.sleep(2)
# Close the browser window and end the session
driver.quit()
Example (2): Demonstrates the following:
- Importing 'By' class
- Importing 'Keys' class
- find_element() - method/function
- send_keys() - method/function
- clear() - method/function
- close() - method/function
- current_url - property
- Keys.RETURN - property
from selenium import webdriver # Import Selenium's webdriver to control the browser
from selenium.webdriver.chrome.service import Service # Import Service class to manage the ChromeDriver
from selenium.webdriver.chrome.options import Options # Import Options to customize Chrome's behavior
from webdriver_manager.chrome import ChromeDriverManager # Automatically handles downloading and setting up ChromeDriver
from selenium.webdriver.common.by import By # Import By class to locate elements on a web page
from selenium.webdriver.common.keys import Keys # Import Keys class to simulate keyboard key presses
import time # Import time module for adding delays
chrome_options = Options() # Create an Options object to set Chrome browser settings
chrome_options.add_argument("--start-maximized") # Set Chrome to start in maximized window
# Initialize the Chrome browser with the specified options and automatically managed driver
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),
options=chrome_options)
# Open the Google homepage
driver.get("https://google.com")
# Print the title of the page (should be "Google")
print(driver.title)
# Locate the search bar using its 'name' attribute (name="q" is used by Google's search bar)
search_bar = driver.find_element(By.NAME, "q") # Find the search input field
search_bar.clear() # Clear any pre-existing text in the search bar
search_bar.send_keys("getting started with python") # Type the search query into the search bar
search_bar.send_keys(Keys.RETURN) # Press "Enter" to submit the search
# Print the current URL after the search is performed
print(driver.current_url)
# Close the browser window
driver.close()