Sitemap URL List Cleaner

Clean and normalize URL lists before sitemap submission.

How This Tool Works

The Sitemap URL List Cleaner processes a raw list of URLs: it removes duplicates, strips UTM tracking parameters (utm_source, utm_medium, etc.), normalizes trailing slashes for consistency, and filters the list to a single domain. Run it before generating or submitting a sitemap. Tracking parameters can turn a single page into thousands of 'unique' URLs, which wastes crawl budget and can cause duplicate content issues. The sitemap protocol, which Google follows, allows at most 50,000 URLs (and 50 MB uncompressed) per sitemap file, so cleaning keeps junk URLs from using up slots.
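The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `clean_url_list`, the reason labels, and the "no trailing slash except on the root" rule are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def clean_url_list(raw_urls, domain):
    """Return (cleaned_urls, removal_counts) for a raw list of URLs.

    Hypothetical helper: names and rules are illustrative assumptions.
    """
    seen = set()
    cleaned = []
    removed = Counter()
    for line in raw_urls:
        url = line.strip()
        if not url:
            continue  # blank lines are ignored, not counted
        parts = urlsplit(url)
        # Filter to a single domain (exact host match; subdomains are rejected).
        if (parts.hostname or "") != domain:
            removed["other_domain"] += 1
            continue
        # Drop utm_* parameters and keep the rest in their original order.
        query = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if not key.lower().startswith("utm_")
        ]
        # Assumed trailing-slash rule: no trailing slash except on the root.
        path = parts.path.rstrip("/") or "/"
        # Fragments are dropped; they are never sent to the server.
        normalized = urlunsplit(
            (parts.scheme, parts.netloc.lower(), path, urlencode(query), "")
        )
        if normalized in seen:
            removed["duplicate"] += 1
            continue
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned, removed


raw = [
    "https://example.com/page/?utm_source=news",
    "https://example.com/page",
    "https://other.com/page",
    "https://example.com/shop?id=2&utm_medium=email",
]
cleaned, removed = clean_url_list(raw, "example.com")
print(cleaned)
print(dict(removed))
```

Duplicates are detected after cleaning rather than before, so that a tracked and an untracked copy of the same page count as one URL.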

How to Use

  1. Paste your raw URL list in field A (one URL per line).
  2. Click Run. The cleaned URL list is returned with duplicates removed and tracking parameters stripped.
  3. The result shows how many URLs were removed and why.
  4. Copy the output directly into your sitemap generator or migration tool.

Common Questions

Why do tracking parameters cause problems in sitemaps?

Tracking parameters, such as the UTM tags (utm_source, utm_medium, utm_campaign) and the click identifiers fbclid and gclid, don't change page content. A URL with and without them serves the same page. In a sitemap, however, each unique URL string is treated as a separate page, which can lead Google to crawl and index the tracked versions and split ranking signals between them.
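To make this concrete, the sketch below strips tracking parameters and shows several tracked variants collapsing to one URL. The function name `strip_tracking` and the blocklist are assumptions for illustration; a real list would need to match the tags your own campaigns use.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed blocklist for this example; extend it as needed.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def strip_tracking(url):
    """Remove tracking parameters and keep all other query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    # Note: urlencode re-encodes the kept values, so their escaping may
    # differ cosmetically from the input (for example, spaces become "+").
    return urlunsplit(parts._replace(query=urlencode(kept)))


variants = [
    "https://example.com/pricing",
    "https://example.com/pricing?utm_source=newsletter&utm_medium=email",
    "https://example.com/pricing?fbclid=xyz",
]
unique = {strip_tracking(u) for u in variants}
print(unique)
```

Parameters that do change the page, such as a product ID or a page number, are left in place because only the listed keys are removed.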

What is URL normalization?

Normalization ensures that equivalent URLs are recognized as the same URL. Typical steps: lowercase the scheme and domain, remove default ports (:80 for http, :443 for https), sort query parameters consistently, apply a uniform trailing-slash rule, and decode unnecessarily percent-encoded characters (for example, %7E to ~).
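A rough sketch of these steps, using only the Python standard library, might look like the following. The name `normalize_url` and the chosen trailing-slash rule (no trailing slash except on the root) are assumptions; the example also drops fragments and any user:password part, and does not handle IPv6 hosts.

```python
import re
import string
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
# Characters that never need percent-encoding in a URL (RFC 3986 "unreserved").
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def _decode_unreserved(text):
    """Decode %XX escapes only when they stand for an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        # Escapes that must stay encoded (such as %2F) are only uppercased.
        return char if char in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def normalize_url(url):
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    port = parts.port
    if port in (None, DEFAULT_PORTS.get(scheme)):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # Assumed trailing-slash rule: strip it everywhere except the root.
    path = _decode_unreserved(parts.path).rstrip("/") or "/"
    # Sort raw "key=value" pairs so parameter order no longer matters.
    pairs = sorted(p for p in parts.query.split("&") if p)
    query = _decode_unreserved("&".join(pairs))
    return urlunsplit((scheme, netloc, path, query, ""))


print(normalize_url("HTTPS://Example.COM:443/Blog/%7Euser/?b=2&a=1"))
```

Only the scheme and host are lowercased here. Paths are left case-sensitive because many servers treat /Blog and /blog as different pages.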

Should noindex pages be in the sitemap?

No. Including noindex pages in your sitemap sends contradictory signals: the sitemap asks Google to crawl and index the page, while the noindex directive tells it not to. Remove all noindex pages, URLs that redirect, non-canonical URLs, and login/admin pages from sitemaps.