
How to Scrape E-commerce Prices with Screaming Frog and XPath

For any digital marketer or e-commerce specialist, having up-to-date data on competitors’ products is essential. Manually collecting prices, SKUs, or product descriptions is a time-consuming and often inaccurate process. Thankfully, with the right tools, you can automate this task efficiently.

This guide will walk you through a powerful, no-code method for scraping specific elements from web pages using the Screaming Frog SEO Spider and XPATH.

What is Screaming Frog?

The Screaming Frog SEO Spider is a desktop program that crawls websites and collects data. While it’s most famous for technical SEO audits, its “Custom Extraction” feature turns it into a powerful web scraper.

Understanding XPath

XPath (XML Path Language) is a query language for selecting nodes from an XML document. Since HTML documents share the same tree structure, XPath lets you navigate a page’s code to find and extract very specific data points. Think of it as a set of directions to a particular piece of information on a web page.

While a browser’s “Inspect Element” feature can give you a basic XPath, you often need to refine it to make it robust enough for a large-scale crawl.

Step 1: Find the right XPath for your data

Before you open Screaming Frog, you need to find the correct XPath for the data you want to scrape. In our example, we’ll focus on a product’s price.

  • Go to the product page you want to scrape and right-click on the price.
  • Select “Inspect” from the context menu. This will open your browser’s Developer Tools and highlight the corresponding HTML element.
  • Right-click on the highlighted code in the Elements panel, and go to Copy > Copy XPATH.

The copied XPath will likely look something like this: /html/body/div[2]/div[1]/div[3]/div[1]/div/div/span/span[2]. This is an absolute XPath, which is fragile and can break whenever the website’s structure changes. For a more reliable solution, we’ll create a relative XPath.

Creating a more robust XPath:

A better approach is to find a unique, stable attribute near the element, such as an id or a specific class name.

  • Look at the HTML code for the price. You might see a div or span with a class like "product-price" or "price-sale".
  • A more reliable XPath would look like this: //span[@class='product-price']. This tells the crawler to find any <span> tag anywhere (//) on the page whose class attribute (@class) has the value "product-price".
  • If the price is inside a parent div with a unique ID, you can make the path even more specific: //div[@id='product-info']//span[@class='product-price'].
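Before wiring an expression into a crawler, it can help to sanity-check it locally in a few lines of Python. The sketch below uses the standard library’s ElementTree, which supports only a subset of XPath and requires well-formed markup, so treat it as a quick local test (real-world pages usually need a forgiving HTML parser such as lxml). The snippet and its class/id names are invented for illustration.

```python
# Quick local test of a relative XPath, using only the standard library.
# The HTML below is a hypothetical product snippet; real pages will differ.
import xml.etree.ElementTree as ET

snippet = """
<html><body>
  <div id="product-info">
    <h1>Example Widget</h1>
    <span class="product-price">19.99</span>
  </div>
</body></html>
"""

root = ET.fromstring(snippet)

# Relative lookup: find the span anywhere in the tree.
price_el = root.find(".//span[@class='product-price']")
print(price_el.text)  # 19.99

# Anchored to a stable parent id for extra precision (done in two steps,
# since ElementTree's XPath subset is limited).
container = root.find(".//div[@id='product-info']")
anchored = container.find("span[@class='product-price']")
print(anchored.text)  # 19.99
```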

Step 2: Set up custom extraction in Screaming Frog

Now that you have your XPath, it’s time to set up the Screaming Frog crawler.

  • Open the Screaming Frog SEO Spider and go to Configuration > Custom > Extraction.
  • In the extraction window, click the Add button.
  • Name your extractor in the Extractor Name field (e.g., “Product Price”).
  • Select XPath from the Extractor Type dropdown.
  • Paste your robust XPATH into the XPath text box.
  • Choose the Extract Text filter. This will ensure you only get the price value, not the surrounding HTML. For more complex elements, you can also use Extract HTML Element or Extract Inner HTML.

Step 3: Run the crawl and export the data

With your extractor configured, you’re ready to start the crawl.

  • Paste the URL(s) you want to crawl into the address bar at the top of Screaming Frog. For a full site, use the homepage URL. For a specific list of products, switch to Mode > List and paste your list of URLs.
  • Click the Start button.

The crawler will begin visiting each page. On every page it crawls, it will evaluate your XPath expression and record any extracted values in a dedicated column.

Step 4: Analyze your scraped data

Once the crawl is complete, you can review your data.

  • Go to the Custom Extraction tab in the main window to view all of your scraped data.
  • For a comprehensive view, you can also filter the Internal tab by HTML and scroll to the right to see your custom column alongside other data like page titles and meta descriptions.
  • Export the data to a CSV file by clicking the Export button. From there, you can open it in Google Sheets or Excel for further analysis.
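Exports usually need a little cleanup before analysis, because scraped prices arrive as text with currency symbols and thousands separators. Below is a hedged post-processing sketch using only the standard library; the column names ("Address", "Product Price 1") follow a typical Screaming Frog custom extraction export but may differ in your version, so check your own file’s header row first.

```python
# Clean scraped price strings from a Screaming Frog CSV export.
# Column names and file contents are assumptions for illustration.
import csv
import io

# Stand-in for open("custom_extraction_all.csv", newline="")
sample = io.StringIO(
    'Address,Product Price 1\n'
    'https://example.com/product-123,$19.99\n'
    '"https://example.com/product-456","$1,204.50"\n'
)

prices = {}
for row in csv.DictReader(sample):
    # Strip currency symbols and thousands separators before converting.
    raw = row["Product Price 1"].replace("$", "").replace(",", "")
    prices[row["Address"]] = float(raw)

print(prices)
```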

This automated process allows you to quickly gather intelligence on your competitors’ pricing strategies, enabling you to make data-driven decisions and stay ahead in the market.

Building a dynamic price tracking dashboard

After mastering the art of scraping e-commerce prices, the next logical step is to build a system that monitors these prices over time. A dynamic dashboard can transform your raw data into actionable insights, helping you react quickly to market shifts and competitive changes.

This guide will show you how to set up a simple yet effective price tracking dashboard using Google Sheets and a few key formulas.

Step 1: Set up your Google Sheet

First, you need a structured place to store your scraped data. Create a new Google Sheet with the following columns:

  • URL: The URL of the product page.
  • Product Name: The name of the product (optional, but helpful for context).
  • Date: The date the data was scraped.
  • Price: The price extracted from the page.

You can manually paste the data from your Screaming Frog CSV exports into this sheet. To automate this, you could use a script, but for now, we’ll focus on the dashboard itself.
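If you do later want to script that manual paste, the step amounts to reshaping the export into the sheet’s column order (URL, Product Name, Date, Price) and stamping each row with the crawl date. A minimal sketch, with invented file contents and assumed column names:

```python
# Reshape a Screaming Frog export into the Raw Data column order and
# stamp each row with today's date. The StringIO objects stand in for
# real files; filenames and column names are assumptions.
import csv
import datetime
import io

# Stand-in for open("screaming_frog_export.csv", newline="")
export = io.StringIO(
    "Address,Product Name 1,Product Price 1\n"
    "https://example.com/product-123,Example Widget,19.99\n"
)

# Stand-in for open("raw_data.csv", "a", newline="")
raw_data = io.StringIO()
writer = csv.writer(raw_data)

today = datetime.date.today().isoformat()
for row in csv.DictReader(export):
    writer.writerow(
        [row["Address"], row["Product Name 1"], today, row["Product Price 1"]]
    )

print(raw_data.getvalue())
```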

Step 2: Create a Master Data tab

For a clean and manageable dashboard, dedicate a single tab (e.g., “Raw Data”) to all your scraped information. This is where you’ll paste new data from your Screaming Frog crawls.

Next, create a separate tab called “Dashboard.” This tab will contain all your visual and summary information.

Step 3: Use formulas to pull key data

On your “Dashboard” tab, you will use formulas to analyze the “Raw Data” tab. We’ll start with a simple setup for a single product.

Let’s assume the product data you want to analyze is for a product on https://example.com/product-123.

Latest Price

To get the most recent price for a specific product, you can use a combination of FILTER, MAX, and VLOOKUP.

FILTER narrows the date column down to the rows for a given URL, MAX finds the most recent of those dates, and VLOOKUP then returns the price recorded on that date.

=VLOOKUP(MAX(FILTER('Raw Data'!C:C, 'Raw Data'!A:A="https://example.com/product-123")), 'Raw Data'!C:D, 2, FALSE)

FILTER('Raw Data'!C:C, 'Raw Data'!A:A="https://example.com/product-123"): This filters all dates (C:C) in the “Raw Data” tab for a specific URL.

MAX(...): This finds the latest date from the filtered list.

VLOOKUP(..., 'Raw Data'!C:D, 2, FALSE): This looks up that latest date (...) in the date and price columns (C:D) and returns the price (the 2nd column). One caveat: VLOOKUP returns the first row whose date matches, so if two different products were crawled on the same date it can pick up the wrong product’s price. If that applies to your data, filter by both URL and date instead (for example, with INDEX and FILTER).

Historical Price

To find the earliest recorded price, you can use a similar formula, replacing MAX with MIN.

=VLOOKUP(MIN(FILTER('Raw Data'!C:C, 'Raw Data'!A:A="https://example.com/product-123")), 'Raw Data'!C:D, 2, FALSE)
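Both lookups can be cross-checked in plain Python: filter the raw rows to one URL, then take the price at the maximum or minimum date. Because the rows are filtered by URL first, this also sidesteps VLOOKUP’s first-match ambiguity when two products share a date. The rows below are invented and mirror the Raw Data columns.

```python
# The spreadsheet lookups restated in plain Python. ISO dates sort
# correctly as strings, which keeps the max/min comparison simple.
rows = [
    # (URL, Date, Price) -- same columns as the Raw Data tab
    ("https://example.com/product-123", "2024-05-01", 19.99),
    ("https://example.com/product-123", "2024-05-08", 17.99),
    ("https://example.com/product-456", "2024-05-08", 4.50),
]

url = "https://example.com/product-123"
history = [(date, price) for (u, date, price) in rows if u == url]

latest_price = max(history)[1]    # latest date, like MAX + VLOOKUP
earliest_price = min(history)[1]  # earliest date, like the MIN variant

print(latest_price, earliest_price)  # 17.99 19.99
```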

Step 4: Visualize your data

Visualizing the data is where a dashboard truly shines. You can create charts to track price trends over time.

  • On the “Raw Data” tab, select the Date and Price columns.
  • Go to Insert > Chart.

Choose a Line Chart to show how the price has changed over time.

Step 5: Create a price change alert system

A true dynamic dashboard should alert you to significant changes. We can use a simple conditional formatting rule to highlight products that have changed price since the last crawl.

On your “Dashboard” tab, you can add a column for “Price Change” and use the following formula:

=IF(LATEST_PRICE_CELL <> HISTORICAL_PRICE_CELL, "Price Changed!", "Stable")

Then, you can apply a conditional formatting rule to this column to turn the cell red if the text is “Price Changed!”.
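The IF rule above translates directly to code, which may be handy if you track many products outside of Sheets. A minimal sketch with an invented price history:

```python
# Flag any product whose latest price differs from its earliest price.
# The history dictionary below is invented for illustration.
history = {
    "https://example.com/product-123": {"earliest": 19.99, "latest": 17.99},
    "https://example.com/product-456": {"earliest": 4.50, "latest": 4.50},
}

alerts = {
    url: "Price Changed!" if p["latest"] != p["earliest"] else "Stable"
    for url, p in history.items()
}

print(alerts)
```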

This dashboard setup provides a solid foundation for competitive analysis, allowing you to quickly spot trends, identify pricing opportunities, and react to your competitors’ moves.

Enjoy!

