NbShare
How To Use Selenium WebDriver To Crawl Websites

I have Selenium version 4.7.2 installed. You can check yours with the following command.

In [11]:
!pip show selenium
Name: selenium
Version: 4.7.2
Summary: 
Home-page: https://www.selenium.dev
Author: 
Author-email: 
License: Apache 2.0
Location: /home/anaconda3/envs/condapy38/lib/python3.8/site-packages
Requires: certifi, trio, trio-websocket, urllib3
Required-by: 

Let us first import the necessary packages and start a headless Chrome driver.

In [8]:
from selenium import webdriver

chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument("--headless")                    # run Chrome without a visible window
chromeOptions.add_argument("--remote-debugging-port=9222")  # expose DevTools on a fixed port
chromeOptions.add_argument("--no-sandbox")                  # needed in some container environments
# Spoof a desktop browser user agent so the site serves the normal page
chromeOptions.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0;Win64) AppleWebkit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36")
wd = webdriver.Chrome(options=chromeOptions)

Let us try to crawl the following URL (a LinkedIn job search page)...

In [9]:
url = 'https://www.linkedin.com/jobs/search?keywords=&location=San%20Francisco%2C%20California%2C%20United%20States&locationId=&geoId=102277331&f_TPR=&distance=100&position=1&pageNum=0'
wd.get(url)
no_of_jobs = int(wd.find_element_by_css_selector('h1>span').get_attribute('innerText'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_1309880/129965872.py in <cell line: 2>()
      1 wd.get(url)
----> 2 no_of_jobs = int(wd.find_element_by_css_selector('h1>span').get_attribute('innerText'))

AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

How To Fix AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

Note -
In Selenium 4.0 the find_element_by_* and find_elements_by_* helper methods were deprecated, and in Selenium 4.3 they were removed entirely, in favor of the unified find_element() and find_elements() methods.

To locate an element by its CSS selector, pass By.CSS_SELECTOR to the find_element() method, like this:

In [2]:
from selenium.webdriver.common.by import By
In [5]:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Wait up to 20 seconds for the job-count element to become visible
element = WebDriverWait(wd, 20).until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, "h1>span"))
)
In [7]:
element.text
Out[7]:
'231,000+'
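One more caveat: the original cell wrapped the element text directly in int(), but int('231,000+') raises a ValueError, so the count needs cleaning before conversion. A minimal sketch, where parse_job_count is our own helper (not part of the original notebook):

```python
def parse_job_count(text):
    """Convert a LinkedIn-style count such as '231,000+' to an int."""
    # Drop the trailing '+' and the thousands separators before int()
    return int(text.rstrip('+').replace(',', ''))

# With the Selenium 4 API, the failing cell then becomes:
# no_of_jobs = parse_job_count(
#     wd.find_element(By.CSS_SELECTOR, 'h1>span').get_attribute('innerText'))
print(parse_job_count('231,000+'))  # → 231000
```

When you are done, call wd.quit() to shut down the browser and free its resources.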

