For any digital marketer or e-commerce specialist, having up-to-date data on competitors’ products is essential. Manually collecting prices, SKUs, or product descriptions is a time-consuming and often inaccurate process. Thankfully, with the right tools, you can automate this task efficiently.
This guide will walk you through a powerful, no-code method for scraping specific elements from web pages using the Screaming Frog SEO Spider and XPATH.
What is Screaming Frog?
The Screaming Frog SEO Spider is a desktop program that crawls websites and collects data. While it’s most famous for technical SEO audits, its “Custom Extraction” feature turns it into a powerful web scraper.
Understanding XPATH
XPATH (XML Path Language) is a query language for selecting nodes or elements from an XML document. Since HTML pages are essentially structured in a similar way, XPATH allows you to navigate the page’s code to find and extract very specific data points. Think of it as a set of directions to a particular piece of information on a web page.
While a browser’s “Inspect Element” feature can give you a basic XPATH, you often need to refine it to make it more robust for a large-scale crawl.
Step 1: Find the right XPATH for your data
Before you open Screaming Frog, you need to find the correct XPATH for the data you want to scrape. In our example, we’ll focus on a product’s price.
- Go to the product page you want to scrape and right-click on the price.
- Select “Inspect” from the context menu. This will open your browser’s Developer Tools and highlight the corresponding HTML element.
- Right-click on the highlighted code in the Elements panel and go to Copy > Copy XPath.
The copied XPATH will likely look something like this: /html/body/div[2]/div[1]/div[3]/div[1]/div/div/span/span[2]. This is an absolute XPATH, which is fragile and can break if the website’s structure changes. For a more reliable solution, we’ll create a relative XPATH.
Creating a more robust XPATH:
A better approach is to find a unique, stable attribute near the element, such as an id or a specific class name.
- Look at the HTML code for the price. You might see a div or span with a class like "product-price" or "price-sale".
- A more reliable XPATH would look like this: //span[@class='product-price']. This tells the crawler to find any <span> tag anywhere (//) on the page that has the class attribute (@class) with the value "product-price".
- If the price is inside a parent div with a unique ID, you can make the path even more specific: //div[@id='product-info']//span[@class='product-price'].
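To sanity-check an XPATH before a full crawl, you can run it against a saved snippet of the page. The sketch below uses Python's standard-library XML parser with made-up HTML and the class/id names from the examples above; real pages are rarely valid XML, so in practice a tolerant HTML parser (such as lxml.html) is the better fit.

```python
import xml.etree.ElementTree as ET

# Made-up product HTML for illustration; the structure mirrors the examples above.
sample_html = """
<html><body>
  <div id="product-info">
    <h1>Example Widget</h1>
    <span class="product-price">$19.99</span>
  </div>
</body></html>"""

root = ET.fromstring(sample_html)

# Relative match: any <span> with class="product-price", wherever it sits.
price = root.find(".//span[@class='product-price']")
print(price.text)  # $19.99

# Scoped match: first narrow to the div with the unique id, then search inside it.
info = root.find(".//div[@id='product-info']")
scoped = info.find(".//span[@class='product-price']")
```

If either find() returns None after a site redesign, the XPATH needs updating — the same failure mode a Screaming Frog crawl surfaces as empty extraction cells.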
Step 2: Set up custom extraction in Screaming Frog
Now that you have your XPATH, it’s time to set up the Screaming Frog crawler.
- Open the Screaming Frog SEO Spider and go to Configuration > Custom > Extraction.
- In the extraction window, click the Add button.
- Name your extractor in the Extractor Name field (e.g., "Product Price").
- Select XPath from the Extractor Type dropdown.
- Paste your robust XPATH into the XPath text box.
- Choose the Extract Text filter. This will ensure you only get the price value, not the surrounding HTML. For more complex elements, you can also use Extract HTML Element or Extract Inner HTML.
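The difference between the three filters can be made concrete with a small sketch. This is an approximation using Python's standard library (the sample snippet is invented); Screaming Frog's own output may differ in whitespace and attribute handling.

```python
import xml.etree.ElementTree as ET

snippet = '<span class="product-price">Now: <b>$19.99</b></span>'
elem = ET.fromstring(snippet)

# "Extract Text": the visible text only, with markup stripped.
text = "".join(elem.itertext())

# "Extract HTML Element": the element itself, outer tags included.
html_element = ET.tostring(elem, encoding="unicode")

# "Extract Inner HTML": everything inside the element, without the outer tag.
inner_html = (elem.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in elem
)

print(text)        # Now: $19.99
print(inner_html)  # Now: <b>$19.99</b>
```

For a simple price, Extract Text is almost always what you want; the HTML variants matter when you need to keep child markup or attributes.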
Step 3: Run the crawl and export the data
With your extractor configured, you’re ready to start the crawl.
- Paste the URL(s) you want to crawl into the address bar at the top of Screaming Frog. For a full site, use the homepage URL. For a specific list of products, switch to Mode > List and paste your list of URLs.
- Click the Start button.
The crawler will begin visiting each page. As it encounters the XPATH you defined, it will scrape the data and populate it in a dedicated column.
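Conceptually, list mode boils down to a loop: visit each URL, apply the XPATH, and store one row per page. The sketch below stubs the fetch step with an in-memory dict of invented URLs and HTML; a real crawler would request the pages over HTTP and handle failures.

```python
import xml.etree.ElementTree as ET

# Stubbed pages (invented URLs and HTML) standing in for live HTTP fetches.
pages = {
    "https://example.com/product-123": '<html><body><span class="product-price">$19.99</span></body></html>',
    "https://example.com/product-456": '<html><body><span class="product-price">$34.50</span></body></html>',
}

results = []
for url, html in pages.items():
    root = ET.fromstring(html)
    match = root.find(".//span[@class='product-price']")
    # One row per URL, mirroring the dedicated extraction column in Screaming Frog.
    results.append({"url": url, "price": match.text if match is not None else ""})

for row in results:
    print(row["url"], row["price"])
```

An empty price string in a row is the signal that the XPATH did not match on that page.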
Step 4: Analyze your scraped data
Once the crawl is complete, you can review your data.
- Go to the Custom Extraction tab in the main window to view all of your scraped data.
- For a comprehensive view, you can also filter the Internal tab by HTML and scroll to the right to see your custom column alongside other data like page titles and meta descriptions.
- Export the data to a CSV file by clicking the Export button. From there, you can open it in Google Sheets or Excel for further analysis.
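The export itself is just a flat file of your extraction rows. As a rough equivalent, this writes some assumed scraped rows to a CSV that Sheets or Excel can open directly.

```python
import csv
import os
import tempfile

# Assumed rows from a crawl like the one sketched above.
rows = [
    {"url": "https://example.com/product-123", "price": "$19.99"},
    {"url": "https://example.com/product-456", "price": "$34.50"},
]

path = os.path.join(tempfile.gettempdir(), "scraped_prices.csv")
with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["url", "price"])
    writer.writeheader()  # header row, as in Screaming Frog's export
    writer.writerows(rows)
```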
This automated process allows you to quickly gather intelligence on your competitors’ pricing strategies, enabling you to make data-driven decisions and stay ahead in the market.
Building a dynamic price tracking dashboard
After mastering the art of scraping e-commerce prices, the next logical step is to build a system that monitors these prices over time. A dynamic dashboard can transform your raw data into actionable insights, helping you react quickly to market shifts and competitive changes.
This guide will show you how to set up a simple yet effective price tracking dashboard using Google Sheets and a few key formulas.
Step 1: Set up your Google Sheet
First, you need a structured place to store your scraped data. Create a new Google Sheet with the following columns:
- URL: The URL of the product page.
- Product Name: The name of the product (optional, but helpful for context).
- Date: The date the data was scraped.
- Price: The price extracted from the page.
You can manually paste the data from your Screaming Frog CSV exports into this sheet. To automate this, you could use a script, but for now, we’ll focus on the dashboard itself.
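If you do later script that hand-off, the core of it is stamping each export row with the scrape date and appending it in the four-column layout above. A minimal sketch, with invented product data:

```python
from datetime import date

raw_data = []  # the master "Raw Data" table: one row per product per scrape

def append_export(export_rows, scrape_date):
    """Add one day's Screaming Frog export in the URL/Name/Date/Price layout."""
    for r in export_rows:
        raw_data.append({
            "URL": r["url"],
            "Product Name": r.get("name", ""),
            "Date": scrape_date.isoformat(),
            "Price": r["price"],
        })

append_export(
    [{"url": "https://example.com/product-123", "name": "Example Widget", "price": "$19.99"}],
    date(2024, 1, 15),
)
```

Storing dates in ISO format (YYYY-MM-DD) keeps them sortable both in code and in Sheets.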
Step 2: Create a Master Data tab
For a clean and manageable dashboard, dedicate a single tab (e.g., “Raw Data”) to all your scraped information. This is where you’ll paste new data from your Screaming Frog crawls.
Next, create a separate tab called “Dashboard.” This tab will contain all your visual and summary information.
Step 3: Use formulas to pull key data
On your “Dashboard” tab, you will use formulas to analyze the “Raw Data” tab. We’ll start with a simple setup for a single product.
Let’s assume the product data you want to analyze is for a product on https://example.com/product-123.
Latest Price
To get the most recent price for a specific product, you can use a combination of VLOOKUP and MAX.
The MAX function will find the most recent date for a given URL, and VLOOKUP will then find the price associated with that date.
=VLOOKUP(MAX(FILTER('Raw Data'!C:C, 'Raw Data'!A:A="https://example.com/product-123")), 'Raw Data'!C:D, 2, FALSE)
FILTER('Raw Data'!C:C, 'Raw Data'!A:A="https://example.com/product-123"): This filters all dates (C:C) in the “Raw Data” tab for a specific URL.
MAX(...): This finds the latest date from the filtered list.
VLOOKUP(..., 'Raw Data'!C:D, 2, FALSE): This looks up that latest date (...) in the date and price columns (C:D) and returns the price (the 2nd column).
Historical Price
To find the earliest recorded price, you can use a similar formula, replacing MAX with MIN.
=VLOOKUP(MIN(FILTER('Raw Data'!C:C, 'Raw Data'!A:A="https://example.com/product-123")), 'Raw Data'!C:D, 2, FALSE)
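Both lookups are easier to audit in plain code. One caveat with the sheet formulas: VLOOKUP returns the price from the first row whose date matches, so if two products were scraped on the same date it can pick the wrong row. The sketch below (with invented data) filters by URL first, which avoids that.

```python
# Invented sample of the "Raw Data" tab: URL, Date, Price.
rows = [
    {"url": "https://example.com/product-123", "date": "2024-01-10", "price": 21.99},
    {"url": "https://example.com/product-123", "date": "2024-01-15", "price": 19.99},
    {"url": "https://example.com/other", "date": "2024-01-15", "price": 5.00},
]

def price_at_extreme(data, url, latest=True):
    """FILTER by URL, take the MAX (or MIN) date, return that row's price."""
    mine = [r for r in data if r["url"] == url]  # FILTER(... A:A=url)
    picker = max if latest else min              # MAX(...) / MIN(...)
    return picker(mine, key=lambda r: r["date"])["price"]  # the VLOOKUP step

latest_price = price_at_extreme(rows, "https://example.com/product-123")
earliest_price = price_at_extreme(rows, "https://example.com/product-123", latest=False)
```

ISO-formatted dates compare correctly as strings, which keeps the MAX/MIN step simple.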
Step 4: Visualize your data
Visualizing the data is where a dashboard truly shines. You can create charts to track price trends over time.
- On the "Raw Data" tab, select the Date and Price columns.
- Go to Insert > Chart.
- Choose a Line Chart to show how the price has changed over time.
Step 5: Create a price change alert system
A true dynamic dashboard should alert you to significant changes. We can use a simple conditional formatting rule to highlight products that have changed price since the last crawl.
On your “Dashboard” tab, you can add a column for “Price Change” and use the following formula:
=IF(LATEST_PRICE_CELL <> HISTORICAL_PRICE_CELL, "Price Changed!", "Stable")
Then, you can apply a conditional formatting rule to this column to turn the cell red if the text is “Price Changed!”.
This dashboard setup provides a solid foundation for competitive analysis, allowing you to quickly spot trends, identify pricing opportunities, and react to your competitors’ moves.
Enjoy!