Screaming Frog is known and loved for its versatility. One of its most useful features is Custom Extraction, which lets us pull specific information from the pages of a site using a set of rules such as regex, CSS selectors, or XPath.

On an e-commerce site we expect to find some duplicated content, for example in the descriptions on product pages. Think of the typical situation of non-canonicalized filters and sorting options.

As an example, let's take a product page from a historic leather goods brand.

[Image: example product sheet]

Once we have identified the section containing the description, we can look at the underlying code with Chrome's Inspect tool.
[Image: product sheet code example]

This shows that the Special Data description sits inside a p tag identified by the class description. We expect the pages to be built with the same logic, so all product descriptions should be identified by the same class.

We can now build a CSS selector that delimits the portion of the page we want to extract. In our case the selector is simply .description.

New to CSS selectors? There are two options: review this guide or let Google Chrome help you. In the Inspect panel, click the portion of code you are interested in, then right-click it. Chrome lets you copy a ready-to-use CSS selector or XPath.

[Image: example product sheet code, copy selectors]

Now that we have our selector, we set up the Custom Extraction (path: Configuration > Custom > Extraction) and then launch the site crawl. For this kind of analysis it is best to configure the crawler to respect noindex and canonical.

Here is the setting that extracts all the product descriptions for the e-commerce site used in our example.

[Image: custom extraction screaming frog]

We choose the Extract Text option to make the data easier to read.
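
To sanity-check a selector like .description before launching a full crawl, the same idea can be approximated outside Screaming Frog. The following is only a minimal sketch using Python's standard library. It is not how Screaming Frog works internally, and the names `DescriptionExtractor`, `extract_descriptions`, and the sample HTML are hypothetical. It assumes reasonably well-formed HTML in which the matching element has an explicit closing tag.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DescriptionExtractor(HTMLParser):
    """Collects the text of every element whose class list contains class_name."""

    def __init__(self, class_name="description"):
        super().__init__()
        self.class_name = class_name
        self.depth = 0          # > 0 while we are inside a matching element
        self.parts = []         # text fragments of the current match
        self.descriptions = []  # one cleaned string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            # Nested element inside a match: track it so we know when the match ends.
            self.depth += 1
        elif self.class_name in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Join the fragments and collapse runs of whitespace into single spaces.
            self.descriptions.append(" ".join("".join(self.parts).split()))

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_descriptions(html, class_name="description"):
    """Return the text of all elements matching the class, in document order."""
    parser = DescriptionExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.descriptions


# Hypothetical product page fragment, loosely modelled on the example above.
sample_html = """
<div class="product">
  <h1>Leather bag</h1>
  <p class="description">Soft <b>full-grain</b> leather,
     handmade in Italy.</p>
  <p class="price">120.00</p>
</div>
"""

print(extract_descriptions(sample_html))
```

Only the text inside the element with the description class is returned; the title and price are ignored, which mirrors what the Extract Text option is meant to give us for each crawled URL.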

Once the crawl is finished, we open the Custom Extraction tab. Clicking the selector label (product description in our case) sorts the information alphabetically, so blocks of similar or identical content stand out at a glance.

[Image: custom extraction result screaming frog]

Once you have the data set, export it and open it in a spreadsheet such as Excel or Google Sheets. We recommend Excel because it lets you quickly highlight duplicates with conditional formatting rules. If you want to know more, please leave us a comment.

This feature is also suited to other strategic uses. For example, it can give you useful input for analyzing competitors: how do they write their product descriptions? Do the descriptions follow rigid patterns, or are they written naturally? We can also set up a custom extraction to collect product prices and analyze pricing.
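
As an alternative to conditional formatting, the duplicate check on the exported data can also be scripted. The sketch below is only an illustration under stated assumptions. It assumes the export is a CSV with one URL column and one description column, here called Address and product description 1. Those column names, the function `find_duplicate_descriptions`, and the sample rows are assumptions, so adjust them to match your own export.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_descriptions(rows, url_column="Address",
                                text_column="product description 1"):
    """Group URLs by description and keep only descriptions used more than once.

    The column names are assumptions; pass the ones found in your export.
    """
    groups = defaultdict(list)
    for row in rows:
        # Normalize case and whitespace so trivial differences do not hide duplicates.
        normalized = " ".join((row.get(text_column) or "").lower().split())
        if normalized:
            groups[normalized].append(row[url_column])
    return {text: urls for text, urls in groups.items() if len(urls) > 1}


# Hypothetical export: a sorted (non-canonicalized) URL repeats the same description.
sample_export = io.StringIO(
    "Address,product description 1\n"
    "https://example.com/bag,Soft leather bag.\n"
    "https://example.com/bag?sort=price,Soft  leather bag.\n"
    "https://example.com/wallet,Compact wallet.\n"
)

duplicates = find_duplicate_descriptions(csv.DictReader(sample_export))
for text, urls in duplicates.items():
    print(text, urls)
```

With a real export, replace the in-memory sample with a file opened via `open(path, newline="", encoding="utf-8")`. This only catches descriptions that are identical after normalization; similar but not identical texts still need the alphabetical sort described above or a manual review.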