Extracting product information from an e-commerce site using Python and BeautifulSoup

I’m trying to get product details from an online store using Python and BeautifulSoup. Here’s what I’ve got so far:

import requests
from bs4 import BeautifulSoup

url = 'https://example-ecommerce.com'
headers = {'User-Agent': 'Mozilla/5.0'}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

product_container = soup.find('div', class_='product-list')
if product_container:
    product_items = product_container.find_all('li', class_='product-item')
    for item in product_items:
        product_name = item.find('h3', class_='product-name')
        product_price = item.find('span', class_='product-price')
        if product_name and product_price:
            print(f'Name: {product_name.text.strip()}, Price: {product_price.text.strip()}')
else:
    print('Product container not found')

This code finds the main product list and tries to get the names and prices. But I’m not sure if I’m doing it correctly. How can I capture all the necessary product details, including their links? I’d appreciate any advice on improving this script. Thanks!

Hey Ethan_Cosmos! Your code’s looking solid so far. Have you thought about grabbing any other details like product descriptions or ratings? Those can be super useful too.

Maybe you could try something like:

product_desc = item.find('p', class_='product-description')
product_rating = item.find('div', class_='product-rating')

Just curious - what kind of products are you looking at? I’ve found that different sites structure their HTML differently, so sometimes you gotta get creative with the selectors.

Oh, and have you run into any issues with rate limiting? Some sites can get a bit touchy if you’re scraping too fast. Might be worth adding a small delay between requests if you haven’t already.
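
Here’s a minimal sketch of that delay idea, not a definitive implementation. The helper name fetch_all, the fetch callable, and the one-second default are all assumptions for illustration; the right pause depends on the site.

```python
import time


def fetch_all(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    fetch_all, fetch, and delay_seconds are hypothetical names for this sketch.
    `sleep` is injectable so the pause can be replaced in tests.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

With requests, fetch could be something like lambda u: requests.get(u, headers=headers, timeout=10), passed along with whatever list of page URLs you build for the site.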

Keep us posted on how it goes! This kind of web scraping can be pretty fun once you get the hang of it.

Hey there! Your code looks pretty good, but to get product links, try adding:

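link_tag = item.find('a', class_='product-link')
# find() returns None when there is no match, so guard before reading href
product_link = link_tag.get('href') if link_tag else None
print(f'Link: {product_link}')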

Put that inside your loop. Also, consider storing each product’s info in a list instead of just printing it; that way you can save or process the data more easily later on. Hope this helps!
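
A rough sketch of the list idea, assuming the class names from the original post (product-list, product-item, product-name, product-price) plus the product-link class suggested above. The SAMPLE_HTML markup and the helper names text_or_none and extract_products are made up for illustration; the real site’s HTML may well differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<div class="product-list">
  <ul>
    <li class="product-item">
      <h3 class="product-name"> Widget </h3>
      <span class="product-price">$9.99</span>
      <a class="product-link" href="/products/widget">View</a>
    </li>
    <li class="product-item">
      <h3 class="product-name">Gadget</h3>
      <span class="product-price">$19.50</span>
    </li>
  </ul>
</div>
"""


def text_or_none(tag):
    """Return the tag's stripped text, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag else None


def extract_products(html, base_url):
    """Collect one dict per product item instead of printing each one."""
    soup = BeautifulSoup(html, 'html.parser')
    container = soup.find('div', class_='product-list')
    if container is None:
        return []

    products = []
    for item in container.find_all('li', class_='product-item'):
        link_tag = item.find('a', class_='product-link')
        href = link_tag.get('href') if link_tag else None
        products.append({
            'name': text_or_none(item.find('h3', class_='product-name')),
            'price': text_or_none(item.find('span', class_='product-price')),
            # Resolve relative links against the page URL.
            'link': urljoin(base_url, href) if href else None,
        })
    return products


products = extract_products(SAMPLE_HTML, 'https://example-ecommerce.com')
for product in products:
    print(product)
```

Missing fields come back as None rather than raising an error, so one incomplete item doesn’t stop the loop. From there the list can be handed to csv.DictWriter or json.dump for saving.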