I’m having trouble with a web scraping project. I’m using Python and Selenium to scrape an e-commerce website. The code runs without any errors, but I’m not getting any output. Here’s what I’ve tried:
- Used Selenium 4.8.2
- Set up Chrome WebDriver (it opens successfully)
- Added time.sleep() to wait for page load
- Tried different XPaths to locate elements
- Attempted to print() values for debugging
My code is supposed to:
- Open the website
- Find product divs
- Extract title, price, and rating
- Save data to a JSON file
But nothing is being saved or printed. I’ve looked at other solutions on forums, but they haven’t helped. Any ideas on what might be going wrong or how to troubleshoot this?
Here’s a simplified version of what I’m trying to do:
from selenium import webdriver
from selenium.webdriver.common.by import By
import json
import time
driver = webdriver.Chrome()
driver.get('https://example-shop.com/products')
time.sleep(5)
items = driver.find_elements(By.CLASS_NAME, 'product-card')
product_data = []
for item in items:
    name = item.find_element(By.CLASS_NAME, 'product-name').text
    price = item.find_element(By.CLASS_NAME, 'product-price').text
    product_data.append({'name': name, 'price': price})
with open('products.json', 'w') as f:
    json.dump(product_data, f)
driver.quit()
Any help would be greatly appreciated!
hey max, had similar issues b4. try adding explicit waits instead of time.sleep(). like:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
items = wait.until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'product-card')))
might help if the site loads stuff dynamically. good luck!
I’ve encountered similar issues when scraping e-commerce sites. One potential problem could be that the website is using JavaScript to render content dynamically. In this case, Selenium might not be able to find the elements immediately after the page loads.
To address this, you could try implementing a more robust waiting strategy. Instead of using time.sleep(), consider using WebDriverWait with expected conditions. This approach waits for specific elements to be present before proceeding.
Additionally, you might want to check if the website has any anti-scraping measures in place. Some sites use CAPTCHAs or IP-based blocking to prevent automated access. In such cases, you may need to implement additional techniques like rotating user agents or using proxy servers.
Lastly, have you verified that the class names you’re using in your selectors are correct and consistent across different pages? Sometimes, e-commerce sites use slightly different class names for various product categories or during sales events.
Hey Max_31Surf! Web scraping can be tricky, especially with dynamic sites. Have you tried checking if the elements are in an iframe or shadow DOM? Sometimes that can trip things up. Also, what about using a different locator strategy? Maybe CSS selectors instead of class names?
Just curious, what made you choose Selenium over something like requests-html or Scrapy? I’ve found those can be easier for some e-commerce sites.
Oh, and don’t forget to check the network tab in your browser’s dev tools. It might give you clues about any AJAX calls or hidden APIs the site is using. Could be a goldmine!
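If you do find a JSON endpoint in the network tab, you often don't need Selenium at all. Here's a rough sketch using only the standard library; the URL and the `products`/`title`/`price` field names are guesses you'd replace with whatever the real response contains:

```python
import json
from urllib.request import Request, urlopen

API_URL = 'https://example-shop.com/api/products'  # hypothetical endpoint

def parse_products(payload):
    """Pull name/price pairs out of a payload shaped like {'products': [...]}."""
    return [
        {'name': p.get('title'), 'price': p.get('price')}
        for p in payload.get('products', [])
    ]

def fetch_products():
    # A browser-like UA header avoids the most basic bot filtering.
    req = Request(API_URL, headers={'User-Agent': 'Mozilla/5.0'})
    with urlopen(req, timeout=10) as resp:
        return parse_products(json.load(resp))
```

Scraping the API directly is usually faster and far less brittle than parsing rendered HTML.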
Keep us posted on what you find out. Web scraping can be a fun puzzle to solve!