NbShare
How To Use Selenium WebDriver To Crawl Websites

I have Selenium version 4.7.2 installed. You can check yours with the following command.

In [11]:
!pip show selenium
Name: selenium
Version: 4.7.2
Summary: 
Home-page: https://www.selenium.dev
Author: 
Author-email: 
License: Apache 2.0
Location: /home/anaconda3/envs/condapy38/lib/python3.8/site-packages
Requires: certifi, trio, trio-websocket, urllib3
Required-by: 

Let us first import the necessary packages and start a headless Chrome driver.

In [8]:
from selenium import webdriver

chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument("--headless")                    # run Chrome without a visible window
chromeOptions.add_argument("--remote-debugging-port=9222")  # expose DevTools on a fixed port
chromeOptions.add_argument("--no-sandbox")                  # needed in some container environments
# Spoof a desktop browser user agent so the site serves the normal page
chromeOptions.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0;Win64) AppleWebkit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36")
wd = webdriver.Chrome(options=chromeOptions)

Let us try to crawl the following URL (a LinkedIn job search page)...

In [9]:
url = 'https://www.linkedin.com/jobs/search?keywords=&location=San%20Francisco%2C%20California%2C%20United%20States&locationId=&geoId=102277331&f_TPR=&distance=100&position=1&pageNum=0'
wd.get(url)
no_of_jobs = int(wd.find_element_by_css_selector('h1>span').get_attribute('innerText'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_1309880/129965872.py in <cell line: 2>()
      1 wd.get(url)
----> 2 no_of_jobs = int(wd.find_element_by_css_selector('h1>span').get_attribute('innerText'))

AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

How To Fix AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

Note -
In Selenium 4.0 the find_element_by_* and find_elements_by_* helper methods were deprecated, and in Selenium 4.3 they were removed entirely, in favor of the unified find_element() and find_elements() methods.

To locate an element by its CSS selector, pass By.CSS_SELECTOR to the find_element() method, like this:

In [2]:
from selenium.webdriver.common.by import By
In [5]:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Wait up to 20 seconds for the job-count element to become visible
element = WebDriverWait(wd, 20).until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, "h1>span"))
)
In [7]:
element.text
Out[7]:
'231,000+'
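One more caveat: the original cell wrapped the element text directly in int(), but int('231,000+') raises a ValueError, so the count needs cleaning before conversion. A minimal sketch, where parse_job_count is our own helper (not part of the original notebook):

```python
def parse_job_count(text):
    """Convert a LinkedIn-style count such as '231,000+' to an int."""
    # Drop the trailing '+' and the thousands separators before int()
    return int(text.rstrip('+').replace(',', ''))

# With the Selenium 4 API, the failing cell then becomes:
# no_of_jobs = parse_job_count(
#     wd.find_element(By.CSS_SELECTOR, 'h1>span').get_attribute('innerText'))
print(parse_job_count('231,000+'))  # → 231000
```

When you are done, call wd.quit() to shut down the browser and free its resources.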

