URLsMatch.eu is an SEO copywriting tool that analyzes up to 3 different URLs and highlights their common terms, along with much other information useful for on-page SEO. With it you can find the relevant keywords used by top competitors who rank on the first page of Google.
With URLsMatch.eu we can analyze, for example, the top 3 sites that Google returns for a specific search and view the terms used on those pages, filtering for the words they have in common. If all three websites use certain keywords, you should probably use them too if you want to rank, because both Google and users expect to find those concepts covered.
How URLsMatch.eu works
In this section I explain all the options available in this new tool for SEO copywriters.
Form for entering data
- URL A / B / C: in these fields you can enter 1 to 3 URLs to be analyzed. With a single URL, the tool performs a keyword density analysis; with 2 or 3 URLs, it compares the words used and highlights those in common.
- Stop Words: stop words are words to be excluded from the analysis, such as articles and conjunctions. You can edit the list to include the stop words you want.
- UserAgent: the term UserAgent refers to the clients that access the World Wide Web. Besides browsers, web user agents include search engine crawlers, cell phones, screen readers and the braille browsers used by blind people. When a user visits a website, the client usually sends a text string that lets the server identify the user agent. This string is part of the HTTP request, in a header prefixed with “User-agent:” or “User-Agent:”, and typically includes information such as the client application name, version, operating system, and language. Bots often include their owner’s web address and email address as well, so that the site administrator can contact the owner. The user-agent string is also one of the criteria by which bots can be excluded from certain pages through the robots.txt file. Webmasters who believe that some parts of their site (or the whole site) should not be included in the data collected by a particular bot, or that the bot is using too much bandwidth, can block its access to those pages this way. The UserAgent of this tool can be changed at will by replacing the string in the field.
- Character Limit: the character limit sets the minimum length of the words to be analyzed. The default setting is 4, which means that all words with fewer than 4 characters are excluded.
- Repetitions: minimum number of repetitions for a keyword to be considered in the analysis. The default setting is 1; raising the limit restricts the analysis to keywords repeated on the page at least that many times, up to a maximum of 6.
- Number of Words: this field sets the number of words to consider in the comparison. The default value is 1; setting it to 2 analyzes word pairs, and so on up to 6 consecutive words.
- Analyze: pressing this button starts the data analysis.
- Excel download: the tool lets you export the data by generating an Excel file on the client side. If a warning is shown, press “ok” to continue.
- Canonical: if the rel canonical tag is found, the tool activates the link to the canonical URL.
- HTTP Status: status code (200, 3xx, 4xx and 5xx).
- Hops: number of redirects needed to reach the requested page.
- Headers: HTTP headers of the web server response.
- Content Words: count of the words between the H1 tag and the end of the post or page. The algorithm tries to automatically exclude the words in the footer, the sidebar, the navigation bar and any comments at the end of the article.
- Total Words: count of the words included in the HTML body tag.
- Words Ratio: ratio of Content Words to Total Words.
- Text characters: count of the characters included in the HTML body tag.
- HTML Characters: character count of the entire HTML.
- Character Ratio: ratio of Text characters to HTML Characters.
- Title: title tag of the requested page.
- Title Length: length of the title tag in characters.
- Description: meta description of the requested page.
- Description Length: length of the meta description tag in characters.
- Keywords: meta keywords field.
- Keywords Length: length of the keywords meta tag in characters.
- NoIndex: this field shows YES if the requested page is tagged NoIndex.
- NoFollow: this field shows YES if the requested page is tagged NoFollow.
- Sitemap: if the tool finds Sitemap.xml, it displays the link to the XML file.
- Robots: if the tool finds Robots.txt, it displays the link to the TXT file.
- H1: H1 heading tag.
- H1 Multiple: this field shows YES if the tool detects more than one H1 tag.
- Number of Images: number of images found on the page. Clicking the number opens a popup that displays all the alt texts.
- Internal Link Number: number of internal links found on the page. Clicking the number opens a popup that displays the anchor texts.
- Outbound Link Number: number of outbound links found on the page. Clicking the number opens a popup that displays the anchor texts.
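To make the Stop Words, Character Limit, Repetitions and Number of Words options more concrete, here is a minimal Python sketch of how such a count could work. It is not the tool's actual code, which is not published here: the function name `count_terms`, the sample stop-word list, and the choice to drop short words and stop words before building word groups are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample list; the real tool ships its own editable stop words.
SAMPLE_STOP_WORDS = frozenset({"the", "and", "for", "with", "that", "this"})


def count_terms(text, stop_words=SAMPLE_STOP_WORDS, char_limit=4,
                repetitions=1, number_of_words=1):
    """Count groups of `number_of_words` consecutive words in `text`.

    Assumption: words shorter than `char_limit` or listed in `stop_words`
    are removed first, then word groups are built from what is left, and
    groups seen fewer than `repetitions` times are discarded.
    """
    # Lowercase the text and keep runs of letters only (Unicode-aware).
    words = re.findall(r"[^\W\d_]+", text.lower())
    kept = [w for w in words if len(w) >= char_limit and w not in stop_words]
    groups = (
        " ".join(kept[i:i + number_of_words])
        for i in range(len(kept) - number_of_words + 1)
    )
    counts = Counter(groups)
    return {term: n for term, n in counts.items() if n >= repetitions}


if __name__ == "__main__":
    sample = ("SEO copywriting tool. The SEO copywriting tool compares "
              "keyword density and keyword density.")
    # Word pairs that occur at least twice:
    print(count_terms(sample, repetitions=2, number_of_words=2))
    # {'copywriting tool': 2, 'keyword density': 2}
```

With the default Character Limit of 4, "SEO" itself is dropped in this sketch because it has only 3 characters, which is worth keeping in mind when choosing the limit.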
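Several of the fields above (Title, Title Length, Canonical, NoIndex, NoFollow, H1 Multiple, image alt texts) can be read directly from a page's HTML. The following sketch uses Python's standard `html.parser` module to show one possible way to collect them. The class name `PageFields` and its attribute names are hypothetical, and the tool itself may extract these values differently.

```python
from html.parser import HTMLParser


class PageFields(HTMLParser):
    """Collects a few on-page SEO fields while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.canonical = None
        self.noindex = False
        self.nofollow = False
        self.image_alts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img":
            # A missing alt attribute is recorded as an empty string.
            self.image_alts.append(a.get("alt") or "")
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            content = (a.get("content") or "").lower()
            directives = {d.strip() for d in content.split(",")}
            self.noindex = "noindex" in directives
            self.nofollow = "nofollow" in directives

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


if __name__ == "__main__":
    page = PageFields()
    page.feed(
        "<html><head><title>Red Shoes</title>"
        '<meta name="robots" content="noindex, follow">'
        '<link rel="canonical" href="https://example.com/shoes">'
        "</head><body><h1>Shoes</h1><h1>More</h1>"
        '<img src="a.png" alt="red shoe"><img src="b.png"></body></html>'
    )
    print(page.title, len(page.title))      # Title and Title Length
    print(page.canonical)                   # Canonical
    print(page.noindex, page.nofollow)      # NoIndex / NoFollow
    print(page.h1_count > 1)                # H1 Multiple
    print(page.image_alts)                  # alt texts of the images
```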
Data in column
- Filters: when you enter 2 or 3 URLs to be analyzed, you can use filters to highlight common words. The filter A + B + C (green) highlights the words common to all 3 analyzed pages. Filters A + B (yellow), A + C (orange) and B + C (blue) highlight the words that two pages have in common.
- Search: the Search field filters the analyzed words in real time.
- Occurrences (All): repetitions of the keyword within the HTML of the analyzed URLs.
- Occurrences (Content): repetitions of the keyword within the body text of the analyzed URLs.
- TF-IDF: the TF-IDF value indicates how distinctive each term is in relation to all the terms identified in the analyzed URLs. The tf-idf (term frequency-inverse document frequency) weighting function is used in information retrieval to measure the importance of a term with respect to a document or a collection of documents. The value increases in proportion to the number of times the term appears in the document, and decreases as the term becomes more frequent across the collection. The idea behind this behavior is to give more importance to terms that appear in the document but are infrequent in general.
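The filters and the TF-IDF column can be illustrated with a short Python sketch. It uses the classic textbook formulation, tf = occurrences / total terms on the page and idf = log(N / number of pages containing the term); whether URLsMatch.eu uses exactly this variant is an assumption, and the function names `shared_filter` and `tf_idf` are made up for this example.

```python
import math
from collections import Counter


def shared_filter(term, pages):
    """Return a label such as 'A+B+C' or 'A+C' naming the pages with `term`.

    `pages` maps a page label ('A', 'B', 'C') to that page's term counts.
    Returns None when the term appears on fewer than two pages.
    """
    present = [label for label in sorted(pages) if term in pages[label]]
    return "+".join(present) if len(present) >= 2 else None


def tf_idf(term, label, pages):
    """Classic tf-idf of `term` for the page `label` (assumed variant)."""
    counts = pages[label]
    total = sum(counts.values())
    if total == 0 or term not in counts:
        return 0.0
    tf = counts[term] / total
    # Number of analyzed pages that contain the term at least once.
    df = sum(1 for c in pages.values() if term in c)
    return tf * math.log(len(pages) / df)


if __name__ == "__main__":
    pages = {
        "A": Counter({"seo": 2, "tool": 1, "rank": 1}),
        "B": Counter({"seo": 1, "tool": 3}),
        "C": Counter({"seo": 4}),
    }
    for term in ("seo", "tool", "rank"):
        print(term, shared_filter(term, pages), tf_idf(term, "A", pages))
```

In this formulation a term found on all analyzed pages, like "seo" in the sample data, gets a TF-IDF of 0, while a term found on only one page gets the highest weight. This shows why the value is best read together with the filters rather than on its own.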