I have crawled over 200 travel and tourism websites with Screaming Frog. Hotel chains with 50,000 pages of location content. DMO sites with decade-old redirects pointing to deleted pages. Tour operators running three different booking systems that create infinite URL parameters. Every crawl teaches me something new about how travel sites break in ways other industries never experience.
This walkthrough covers how I actually configure Screaming Frog for travel site audits. Not the generic settings you find in every tutorial, but the specific configurations that matter when you are dealing with multi-language hotel pages, seasonal tour content, and booking engines that generate thousands of junk URLs.
Why Travel Sites Need Custom Crawl Settings
Travel websites are structurally different from most sites Screaming Frog users crawl. A B2B SaaS site might have 500 clean pages with consistent templates. A mid-sized DMO site easily hits 15,000 pages across destination guides, event listings, partner directories, blog archives, and filtered search results.
The problems compound quickly. That resort website looks simple until you realize their booking widget generates unique URLs for every date combination. That tourism board appears organized until you discover they have seven different URL structures from seven different CMSs they have used since 2008.
Default Screaming Frog settings will either miss critical pages or waste hours crawling booking calendar URLs that add zero SEO value. Custom configuration is not optional for travel sites.
Initial Setup: License and Memory Allocation
Before touching any crawl settings, configure your system properly. Travel site crawls are resource intensive.
Open Screaming Frog and go to File > Configuration > System. Increase memory allocation to at least 4GB if you have it available. For sites over 20,000 pages, I use 8GB. Running out of memory mid-crawl on a hotel chain site with 40,000 location pages means starting over from scratch.
The free version limits you to 500 URLs. That is useless for any real travel site audit. The £199 annual license pays for itself on the first client project. If you are serious about technical SEO for tourism, this is non-negotiable.
Under Configuration > Speed, I start travel site crawls at 2 URLs per second with 2 concurrent threads. Travel sites often run on shared hosting or legacy infrastructure that cannot handle aggressive crawling. I have crashed client staging environments by crawling too fast. Slow and complete beats fast and blocked.
Essential Configuration for Tourism Site Crawls
URL Handling and Exclusions
Go to Configuration > Exclude to set up your exclusion patterns before starting the crawl. For travel sites, I always exclude:
- Booking engine parameters: Add patterns like /book?, /reservation?, /availability?, and any date-related parameters. One resort client had a booking system generating over 800,000 unique URLs from date range combinations. Without exclusions, the crawl would run for days and tell you nothing useful.
- Calendar and date filtering: Patterns like /calendar/, /events?date=, and /tours?month= create infinite crawl loops. Exclude them unless you specifically need to audit date-based pages.
- Print and PDF versions: Many travel sites generate print-friendly versions of pages. These duplicate content without adding value to your audit.
- Session IDs and tracking parameters: Add patterns for ?sid=, ?session=, ?utm_, and similar parameters. Travel sites often append tracking to internal links from booking confirmations and email integrations.
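Screaming Frog's Exclude field takes one regex per line, so it is worth sanity-checking your patterns before committing to a long crawl. A minimal Python sketch, using hypothetical URLs, to confirm the patterns match what you expect:

```python
import re

# Hypothetical exclusion patterns mirroring the list above.
# Screaming Frog's Exclude field takes one regex per line.
EXCLUSIONS = [
    r".*/book\?.*",
    r".*/reservation\?.*",
    r".*/availability\?.*",
    r".*/calendar/.*",
    r".*/events\?date=.*",
    r".*/tours\?month=.*",
    r".*[?&](sid|session|utm_[a-z]+)=.*",
]

def is_excluded(url: str) -> bool:
    """Return True if the URL matches any exclusion pattern."""
    return any(re.fullmatch(p, url) for p in EXCLUSIONS)

# Spot-check a few URLs before pasting the patterns into the crawler.
tests = [
    "https://example.com/book?checkin=2026-06-01",  # should be excluded
    "https://example.com/tours?month=july",         # should be excluded
    "https://example.com/tours/costa-rica",         # should be crawled
]
for url in tests:
    print(url, "->", "excluded" if is_excluded(url) else "crawled")
```

Five minutes of testing here beats discovering a typo'd pattern eight hours into a crawl.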
Crawl Settings for Multi-Language Sites
Most tourism sites serve multiple languages. Go to Configuration > Spider and check these settings:
Enable Crawl All Subdomains if the site uses subdomains for languages (de.hotel.com, fr.hotel.com). Many European hotel groups use this structure.
For subdirectory languages (/en/, /es/, /de/), the default settings work, but pay attention to hreflang tags later in your analysis. Travel sites have the highest rate of hreflang errors I see across any industry.
Under Configuration > Extraction, set up custom extraction for hreflang values. I use XPath extraction with //link[@rel='alternate']/@hreflang to pull all declared language versions. This lets me quickly identify pages missing from the hreflang cluster.
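If you want to verify the extraction logic outside Screaming Frog, the same XPath runs in Python with lxml. A rough sketch, with stand-in markup playing the part of a fetched hotel page:

```python
from lxml import html

# Sample markup standing in for a fetched page (hypothetical URLs).
page = html.fromstring("""
<html><head>
  <link rel="alternate" hreflang="en" href="https://hotel.example/en/rome"/>
  <link rel="alternate" hreflang="de" href="https://hotel.example/de/rom"/>
  <link rel="alternate" hreflang="x-default" href="https://hotel.example/rome"/>
</head><body></body></html>
""")

# The same XPath used in the Screaming Frog custom extraction.
declared = page.xpath("//link[@rel='alternate']/@hreflang")
print(declared)  # ['en', 'de', 'x-default']

# Compare each page's declared set against the site's full language
# list to spot pages missing from the hreflang cluster.
expected = {"en", "de", "fr", "x-default"}
print("missing:", expected - set(declared))  # {'fr'}
```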
JavaScript Rendering Settings
Go to Configuration > Spider and set Rendering to JavaScript. Modern travel sites load critical content through JavaScript, especially:
- Hotel room availability and pricing
- Tour date selections
- Review aggregations from third-party platforms
- Interactive maps and location content
- Dynamic filtering for search results
JavaScript rendering increases crawl time significantly. For initial audits, I often run two crawls: one HTML-only for speed, then a JavaScript-rendered crawl of pages I want to inspect more closely.
Set a reasonable AJAX timeout under Configuration > Spider > Rendering. I use 20 seconds for JavaScript rendering. Travel sites with heavy booking widget integrations sometimes take 15+ seconds to fully render.
What to Look for in Travel Site Crawl Data
Indexation Problems
Export your crawl and filter for pages with noindex directives. Tourism sites frequently noindex pages that should be indexed and vice versa.
Common patterns I find:
- Noindexed location pages: A hotel chain I audited had 3,400 city pages set to noindex because a developer was testing something in 2019 and forgot to revert it. Those pages could have been ranking for years.
- Indexed filtered results: Tour operator sites often fail to noindex or canonicalize filtered search pages. You end up with hundreds of indexed pages like /tours?duration=3-days&difficulty=easy that compete with your main category pages.
- Inconsistent canonicalization: Check the Canonicals tab. Travel sites commonly have pages that canonical to themselves (correct) mixed with pages that canonical to category parents (sometimes wrong) and pages with no canonical at all (needs attention).
Redirect Chain Analysis
Travel sites accumulate redirect chains faster than any other industry I work with. DMOs rebrand every few years. Hotels change ownership. Tour operators merge booking platforms.
In Screaming Frog, filter for Redirection (3xx) in the Response Codes tab, then export the full paths with the Redirect Chains report under the Reports menu. I have found chains 8-9 hops deep on government tourism sites that have been through multiple administrations and website redesigns.
Each redirect adds latency. Chains over 3 hops cause crawl budget waste. Chains over 5 hops risk not passing link equity effectively. Export all chains and create a redirect cleanup document prioritized by inbound link value.
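To spot-check chain depth on individual URLs outside the crawler, you can follow redirects one hop at a time. A minimal sketch with the requests library (the URL is hypothetical):

```python
import requests

def redirect_chain(url: str, max_hops: int = 10) -> list[str]:
    """Follow redirects one hop at a time and return the full chain."""
    chain = [url]
    for _ in range(max_hops):
        resp = requests.head(url, allow_redirects=False, timeout=10)
        if resp.status_code not in (301, 302, 303, 307, 308):
            break  # reached a non-redirect response
        # Location headers may be relative, so resolve against current URL.
        url = requests.compat.urljoin(url, resp.headers["Location"])
        chain.append(url)
    return chain

chain = redirect_chain("http://old-tourism-board.example/attractions")
print(f"{len(chain) - 1} hops:")
for hop in chain:
    print(" ->", hop)
```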
Internal Linking Structure
Use the Site Structure visualization (Visualisations > Crawl Tree Graph) to see how content is organized. Travel sites should have clear hierarchies:
Homepage > Destinations > Regions > Cities > Individual Properties/Tours
What I often find instead: orphaned pages buried 7+ clicks deep, critical location pages only linked from footer navigation, and internal links pointing to redirected URLs instead of final destinations.
Export the Internal tab and filter by Unique Inlinks. Pages with fewer than 3 internal links pointing to them need attention, especially if those pages target competitive keywords like “things to do in [destination]” or “best hotels in [city].”
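Since the Internal tab exports to CSV, this filter is easy to script. A short pandas sketch, assuming the default export column names ("Address", "Unique Inlinks") and whatever filename you saved the export under:

```python
import pandas as pd

# Load the Internal tab export (filename is whatever you saved it as).
df = pd.read_csv("internal_all.csv")

# Pages with fewer than 3 unique internal links pointing at them.
weak = df[df["Unique Inlinks"] < 3].sort_values("Unique Inlinks")
print(weak[["Address", "Unique Inlinks"]].head(20))
```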
Page Speed Indicators
Screaming Frog provides basic timing data under the Response Time column. For travel sites, I flag any page over 2 seconds. These deserve PageSpeed Insights analysis.
Travel sites have specific speed problems:
- Unoptimized hero images (I have seen 8MB destination photos)
- Third-party booking widget scripts blocking render
- Multiple map embeds loading on a single page
- Review aggregation widgets pulling from external APIs
Sort by Response Time descending to identify your slowest pages. These are usually either image-heavy destination guides or pages with multiple third-party integrations.
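The same export works for speed triage. A quick pandas sketch flagging everything over the 2-second threshold (column names assumed from the default export, where Response Time is in seconds):

```python
import pandas as pd

df = pd.read_csv("internal_all.csv")

# Flag pages over the 2-second threshold, slowest first.
slow = df[df["Response Time"] > 2.0].sort_values(
    "Response Time", ascending=False
)
print(f"{len(slow)} pages over 2 seconds")
print(slow[["Address", "Response Time"]].head(20))
```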
Duplicate and Thin Content
Use the Content tab to analyze page content. Filter for pages with identical or near-identical Page Titles and H1s. Hotel sites are notorious for this, using the same template title across hundreds of room pages.
Check the Word Count column. Filter for pages under 300 words. Tour operator product pages often contain nothing but booking widgets and bullet points, which creates thin content signals even if the page has commercial value.
Enable the Near Duplicates check under Configuration > Content > Duplicates. This lets you use the Near Duplicates filter to find content that has been copied across multiple pages with minor variations. Tourism boards often do this with boilerplate destination descriptions.
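If you prefer to replicate the near-duplicate check on exported body text, a crude similarity ratio gets you most of the way. A sketch using Python's difflib, with stand-in page text and a 90% threshold chosen to roughly mirror Screaming Frog's default similarity setting (treat that figure as an assumption):

```python
from difflib import SequenceMatcher
from itertools import combinations

# Extracted body text keyed by URL (stand-in data; in practice, load
# the text exported from your crawl).
pages = {
    "/destinations/lake-bled": "Lake Bled is a stunning alpine lake...",
    "/destinations/lake-bohinj": "Lake Bohinj is a stunning alpine lake...",
    "/tours/wine-tasting": "Join our expert sommeliers for a day...",
}

THRESHOLD = 0.90  # flag pairs at or above 90% similarity

for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
    ratio = SequenceMatcher(None, text_a, text_b).ratio()
    if ratio >= THRESHOLD:
        print(f"{url_a} ~ {url_b}: {ratio:.0%} similar")
```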
Common Issues I Find on Travel Site Crawls
Faceted Navigation Disasters
Tour search pages with filters for destination, duration, difficulty, price range, and activity type can generate thousands of URL combinations. Without proper handling, these all get crawled and potentially indexed.
During your crawl, watch for URL patterns like /tours?dest=costa-rica&duration=7&type=adventure&price=1000-2000. If these are being crawled in high numbers, the site needs faceted navigation cleanup: parameter handling in Search Console, canonical tags to main category pages, or noindex directives on filtered results.
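You can quantify the problem from a URL export by counting how many filter parameters each crawled URL carries. A quick sketch with urllib.parse (the URLs and parameter names are hypothetical):

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse

# Crawled URLs (in practice, read these from your export).
urls = [
    "https://example.com/tours?dest=costa-rica&duration=7&type=adventure",
    "https://example.com/tours?dest=costa-rica",
    "https://example.com/tours/costa-rica",
]

# Distribution of filter-parameter counts across crawled URLs.
facet_counts = Counter(len(parse_qs(urlparse(u).query)) for u in urls)
for n_params, n_urls in sorted(facet_counts.items()):
    print(f"{n_urls} URLs with {n_params} filter parameter(s)")

# URLs with 2+ stacked filters are prime candidates for exclusion,
# canonicalization, or noindex.
```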
Expired Event and Tour Pages
Tourism sites accumulate dead content. That festival page from 2019 still exists and gets crawled. Those 47 tour pages for trips that stopped running in 2021 are still indexed.
Use custom extraction to pull date information from event pages. Filter for anything with dates in the past. These pages either need to be redirected, updated with current information, or consolidated into evergreen content.
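Once the dates are extracted, filtering for expired pages takes a few lines. A sketch assuming the extraction produced ISO-format dates (adjust the parsing to whatever format the site actually uses):

```python
from datetime import date

# URL -> extracted event/tour date (stand-in data from custom extraction).
extracted = {
    "/events/jazz-festival-2019": "2019-07-12",
    "/tours/spring-blossom": "2021-04-03",
    "/events/christmas-market": "2025-12-01",
}

# Keep only pages whose date has already passed.
expired = {
    url: d for url, d in extracted.items()
    if date.fromisoformat(d) < date.today()
}
for url, d in sorted(expired.items()):
    print(f"{url} expired on {d}")
```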
Broken Booking Integration Links
Travel sites frequently have broken internal links to booking pages or partner sites. Check the External tab for links returning 4xx or 5xx errors. Broken booking links directly impact revenue.
Also check for soft 404s. Some booking systems return 200 status codes for unavailable products instead of proper 404s. These need to be identified manually by reviewing page content.
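Manual review scales badly past a few hundred pages, so I sometimes script a first pass: fetch pages that returned 200 and flag any whose body contains unavailability wording. A rough heuristic sketch (the phrase list is an assumption you should tune per booking system, and the URL is hypothetical):

```python
import requests

# Phrases that often indicate a soft 404 on booking pages (tune per site).
SOFT_404_PHRASES = [
    "no longer available",
    "tour not found",
    "this property is not accepting reservations",
    "0 results",
]

def looks_like_soft_404(url: str) -> bool:
    """Fetch a page and check a 200 response for unavailability wording."""
    resp = requests.get(url, timeout=15)
    if resp.status_code != 200:
        return False  # a real error code, not a soft 404
    body = resp.text.lower()
    return any(phrase in body for phrase in SOFT_404_PHRASES)

print(looks_like_soft_404("https://example.com/tours/retired-trip"))
```

Anything this flags still needs a human look; the script just shrinks the review queue.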
Missing or Incorrect Schema Markup
Under Configuration > Spider > Extraction, enable Structured Data extraction. Travel sites should have:
- LocalBusiness or LodgingBusiness schema for hotels
- TouristAttraction schema for destination pages
- Event schema for tours and activities with specific dates
- Organization schema on the homepage
- BreadcrumbList schema for navigation structure
Export the Structured Data tab and check for validation errors. I find incorrect schema on roughly 60% of travel sites I audit, usually outdated schema types or required fields missing from otherwise valid markup.
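For a quick sanity check outside Screaming Frog, you can pull JSON-LD blocks yourself and compare declared types against the list above. A minimal sketch with BeautifulSoup, using stand-in HTML for a lodging page:

```python
import json
from bs4 import BeautifulSoup

# Types we expect somewhere on a travel site (from the list above).
EXPECTED_TYPES = {"LodgingBusiness", "LocalBusiness", "TouristAttraction",
                  "Event", "Organization", "BreadcrumbList"}

# Stand-in HTML; in practice, use the stored HTML from your crawl.
html_doc = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "LodgingBusiness",
 "name": "Seaside Hotel"}
</script>
"""

soup = BeautifulSoup(html_doc, "html.parser")
found = set()
for tag in soup.find_all("script", type="application/ld+json"):
    data = json.loads(tag.string)
    found.add(data.get("@type", ""))

print("declared:", found)
print("recognised:", found & EXPECTED_TYPES)
```

This catches missing or misspelled @type values; full required-field validation is still a job for a proper schema validator.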
Creating Your Audit Report
After completing the crawl, I organize findings by priority:
- Critical (fix immediately): Indexation issues on revenue pages, broken booking links, site-wide redirect chains, canonical errors on primary landing pages.
- High (fix within 30 days): Duplicate content across location pages, thin content on tour product pages, faceted navigation creating crawl bloat, slow page speed on key entry pages.
- Medium (fix within 90 days): Internal linking improvements, expired content cleanup, schema implementation, hreflang corrections.
For each issue, include the specific URLs affected, the crawl data showing the problem, and a clear recommendation. Travel clients need actionable specifics, not vague suggestions to “improve site structure.”
Frequently Asked Questions
How long should a Screaming Frog crawl take for a travel site?
A 10,000-page travel site with JavaScript rendering enabled takes 4-8 hours at conservative speed settings. Larger sites can take overnight. I start crawls in the evening for big sites and review results the next morning. Do not rush crawl speed on travel infrastructure.
Should I crawl staging or production for travel site audits?
Always crawl production unless you are specifically auditing a pre-launch redesign. Staging environments for travel sites are notoriously different from production. Booking integrations, CDN configurations, and third-party scripts often only exist on production. Staging crawls can miss critical issues.
How do I handle password-protected areas like member booking portals?
Go to Configuration > Authentication and set up form-based or standard authentication. For travel sites with member areas, I usually exclude these from initial crawls since they are typically personalized content that should not be indexed anyway. Only include authenticated areas if there are SEO-relevant pages behind login.
What if the travel site is blocking Screaming Frog?
First, check robots.txt for crawl restrictions. If the site is actively blocking, use Configuration > User-Agent to set a different user agent string. For persistent blocking, coordinate with the client’s development team to whitelist your IP address. Some travel sites have aggressive bot protection that requires whitelisting.
How often should travel sites run Screaming Frog crawls?
Monthly for sites with frequent content updates like DMOs with event listings. Quarterly for more static sites like individual hotels. Always run a full crawl after any significant site changes: CMS migrations, booking system updates, or structural redesigns. I have clients on monitoring schedules where I run automated crawls weekly and flag changes.
Get Your Travel Site Audited Properly
Running Screaming Frog is straightforward. Interpreting the data for travel-specific contexts and turning findings into prioritized action plans requires experience with tourism site architecture and business models.
If you want a complete technical SEO audit of your travel or tourism website, including Screaming Frog analysis combined with log file review, competitive analysis, and implementation guidance, get in touch. I work with DMOs, hotel groups, and tour operators who want actionable recommendations, not generic audit reports that sit in a folder forever.

Written by Peter Sawicki, an experienced strategist with a background spanning multiple industries, from private enterprises to government projects. Having worked across different countries and markets, I bring a global perspective and practical insights to every SEO strategy I design. As a diver and adventure seeker, I’ve learned to balance attention to detail with a drive to explore new solutions, a mix that shapes both my work and my life.