Mastering Sitecore Search: Creating Sources and Utilizing Crawlers

Sitecore Search is Sitecore’s new artificial intelligence (AI) search tool. It creates automated, personalized search experiences using content drawn from many different sources. The platform lets you configure how your content is extracted and indexed, whether into a single index or several. Sitecore Search tracks user behavior anonymously to learn what visitors are viewing and doing, then uses that to deliver personalized, focused search results.

How you configure and set up your content sources determines what gets pulled into the site’s search index. A well-configured source attributes content properly and ingests it in a way that makes sense, so it can be found easily.

I (Casey Stanutz, Senior Sitecore Account Manager) had the chance to beta test the upcoming Sitecore Search learning course for business users. The course gives a great overview of the platform and shows how you can tailor and configure Sitecore Search to meet specific use cases on your website and beyond. It covers getting started with Sitecore Search, Sources, Pages, Widgets, Content Collections, and Analytics. It goes into depth on each topic, then ties them together through a practical lab and a case study built on real-life examples.

If you’re interested in learning more about how to successfully implement Sitecore Search, check out the following episode of the Sitecore Water Cooler, “How to Successfully Implement Sitecore Search.”


Understanding Sources and Crawlers in Sitecore Search

Sitecore Search provides versatile methods for content indexing, primarily categorized into pull sources and push sources. Understanding these options is essential for effectively utilizing Sitecore's capabilities to manage and index web content.

What are Sources in Sitecore Search?

A source defines where Sitecore Search gets the content it indexes. Sources come in two main types: pull sources and push sources.

Pull sources include a web crawler, an advanced web crawler, and an API crawler. The web and advanced web crawlers can either start from a given URL and follow hyperlinks, or work from a sitemap you provide. If you want to index your website and all the pages listed in your sitemap, the standard web crawler is the recommended choice. The advanced web crawler supports more complex requirements, such as content protected behind authentication or content localized for a specific region or language. It also allows the use of JavaScript to pull specific content attributes and metadata from your content.
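
To make the sitemap option more concrete, here is a minimal Python sketch of the first thing a sitemap-driven crawl has to do: read the sitemap and collect the page URLs to visit. It is an illustration only and not Sitecore Search's crawler; the sample sitemap, the example.com URLs, and the function name urls_from_sitemap are all made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; a real crawler would fetch this over HTTP.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/blog/sitecore-search</loc></url>
  <url><loc>https://www.example.com/products/widgets</loc></url>
</urlset>"""


def urls_from_sitemap(sitemap_xml: str) -> List[str]:
    """Return every <loc> URL listed in a sitemap document, in order."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in urls_from_sitemap(SAMPLE_SITEMAP):
        print(url)
```

The takeaway for business users is that a sitemap-based crawl can only pick up the pages the sitemap lists, so keeping the sitemap current matters.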

Push sources are the other type of source. They are fed through an API and can be designed very specifically for an index of products or content coming from an external system. A developer can set up this option using the Ingestion API.
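
As a rough idea of what a developer does for a push source, the Python sketch below maps a record from an external system to a document that could be pushed into an index. The payload shape, the field names (name, description, type, url), and the sample product are assumptions made for illustration; the actual request format and endpoint are defined in Sitecore's Ingestion API reference, and the sketch deliberately does not send anything.

```python
import json
from typing import Any, Dict


def build_push_document(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map an external-system record to a document payload (hypothetical shape)."""
    return {
        "document": {
            # Use the external system's identifier so repeated pushes update the same item.
            "id": str(record["sku"]),
            "fields": {
                "name": record["title"].strip(),
                "description": record.get("summary", ""),
                "type": "product",
                "url": record["url"],
            },
        }
    }


# Hypothetical product record exported from an external commerce system.
product = {
    "sku": 1042,
    "title": " Trail Running Shoe ",
    "summary": "Lightweight shoe for uneven terrain.",
    "url": "https://www.example.com/products/1042",
}

payload = build_push_document(product)

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

Keeping the mapping in one small function like this makes it easier to agree with the business on which attributes each indexed item should carry before any integration work starts.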

Step-By-Step Guide to Creating a Source in Sitecore Search

Here’s how to create a source in Sitecore Search:

  1. Select a Source Type: Choose one of the available source types (e.g., pull sources or push sources).
  2. Refer to Sitecore Documentation: Consult Sitecore's references to determine the most suitable source type for your specific use case.
  3. Create Triggers: Set up one or more triggers to direct the crawler where to find the specific content to index. Triggers can be configured with full lists of URLs or a Sitemap URL.
  4. Configure Crawler Settings: Define the settings for your crawler to control how content is indexed.
  5. Set Up Document Extractors: Configure document extractors to determine which content attributes (e.g., content type, description, title) will be pulled from the URLs.
  6. Customize Content Attribution: Decide the level of attribution for each item, ensuring control over how much detail is extracted.
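
Step 5 is easier to picture with an example. The Python sketch below shows the kind of work a document extractor performs: reading a page and pulling out the title, description, and content type. It uses Python's standard HTML parser purely for illustration and is not Sitecore Search's extractor syntax; the sample page, the choice of the og:type meta tag for content type, and the names AttributeExtractor and extract_attributes are assumptions.

```python
from html.parser import HTMLParser
from typing import Dict

# Hypothetical page markup; a crawler would receive this from the crawled URL.
SAMPLE_PAGE = """<html><head>
<title>Mastering Sitecore Search</title>
<meta name="description" content="How to create sources and use crawlers.">
<meta property="og:type" content="article">
</head><body><h1>Mastering Sitecore Search</h1></body></html>"""


class AttributeExtractor(HTMLParser):
    """Collects title, description, and content type from a page's <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.attributes: Dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            content = attr_map.get("content") or ""
            if attr_map.get("name") == "description":
                self.attributes["description"] = content
            elif attr_map.get("property") == "og:type":
                self.attributes["type"] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept; all other page text is ignored.
        if self._in_title:
            self.attributes["title"] = self.attributes.get("title", "") + data.strip()


def extract_attributes(html: str) -> Dict[str, str]:
    """Return the attributes found in the given HTML document."""
    parser = AttributeExtractor()
    parser.feed(html)
    parser.close()
    return parser.attributes


if __name__ == "__main__":
    print(extract_attributes(SAMPLE_PAGE))
```

The extractor can only return what the page exposes, so consistent titles and meta tags across your site lead to cleaner attributes in the index.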

For more information on how to create a Source in Sitecore Search, check out the Sitecore documentation.

Best Practices for Naming and Describing Sources

Use standard naming conventions for your sources, and keep each name and description consistent with the content you are indexing into that source.

Americaneagle.com is an original Sitecore development agency partner. For more than 10 years, we’ve provided excellent services to customers in all industries. Browse our Sitecore website examples to learn more.

Widgets and Components in Sitecore Search

Widgets and components in Sitecore Search play a crucial role in enhancing the search experience by offering a variety of tools and functionalities. These elements work together to create dynamic and personalized search pages, improving user interaction and content discovery.

Intro to Sitecore Search Widgets and Components

Sitecore Search has a toolkit of components that you can use to build your search experience. The search page and its widgets work alongside each other to create unique search experiences across your website.

Building out your pages starts with a search results page and a preview search widget. The search results page acts as the container for your widgets. Sitecore Search also offers great flexibility through its page variation framework. The framework allows for a default experience, a global experience, and, where a specific search experience needs to be tuned or configured separately, a variation.

The three main widgets in Sitecore Search are the search results widget, the preview search widget, and the recommendation widget.

  • The search results widget displays full-page search results after a user enters a query in the search box and executes the search.
  • The preview search widget is a pop-up that suggests additional content or indexed items based on what the user is looking for. Suggestions can be based on spelling correction or on predictive technology that tries to complete the user’s query. The widget can be set up as standard and use the default analyzer throughout the site, or set up on a specific page to tailor different suggestions.
  • The recommendation widget displays recommended content for the users browsing the website or viewing a specific page.

Each of these widgets can be tailored with rules that show items based on what a user is doing, exclude certain results from being returned, or boost certain results over others.
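
The effect of exclude and boost rules can be sketched in a few lines of Python. This is a simplified illustration of the idea, not how Sitecore Search scores or ranks results internally; the Result type, the scores, and the apply_rules function are hypothetical, and in the product these rules are configured on the widget rather than written as code.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Result:
    id: str
    content_type: str
    score: float


def apply_rules(
    results: List[Result],
    excluded_ids: Set[str],
    boosted_type: str,
    boost_factor: float = 2.0,
) -> List[Result]:
    """Drop excluded results, boost one content type, then sort by score."""
    kept = [r for r in results if r.id not in excluded_ids]
    rescored = [
        Result(
            r.id,
            r.content_type,
            r.score * boost_factor if r.content_type == boosted_type else r.score,
        )
        for r in kept
    ]
    return sorted(rescored, key=lambda r: r.score, reverse=True)


# Hypothetical results as they might come back before any rules are applied.
results = [
    Result("a", "blog", 0.9),
    Result("b", "product", 0.6),
    Result("c", "blog", 0.8),
    Result("d", "product", 0.3),
]

ranked = apply_rules(results, excluded_ids={"c"}, boosted_type="product")

if __name__ == "__main__":
    for r in ranked:
        print(r.id, r.content_type, r.score)
```

In this sketch, excluding item "c" and boosting products moves product "b" above blog post "a", which is the kind of outcome a merchandiser or content owner would aim for with a boost rule.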

Personalize Commerce Search with Sitecore Discover

Sitecore Discover revolutionizes personalized commerce search by utilizing advanced AI and machine learning algorithms to understand user behaviors and preferences. It enables ecommerce businesses to deliver highly relevant search results and product recommendations tailored to individual shoppers. This personalization enhances the shopping experience by predicting customer needs and presenting products that are more likely to interest them, leading to increased engagement and higher conversion rates. Additionally, Sitecore Discover supports seamless integration with existing systems, allowing for a unified and consistent customer experience across various digital touchpoints.

Partner with an Expert Sitecore Development Agency

Sitecore Search is an excellent choice in the search and relevance space. As a Sitecore development agency, Americaneagle.com has completed many implementations of the tools and seen many successful customer stories. We hope that this breakdown of the key capabilities and features of Sitecore Search inspires you to reach out and learn more about one of the top Sitecore implementation partners in the world.

Americaneagle.com can help you plan, strategize, and implement Sitecore Search. Contact us at [email protected] or +1(877) 932-6691 to get started today.

About the Author


Casey Stanutz

Casey joined Americaneagle.com in 2015 and dove head first into the Sitecore solution; he hasn’t looked back since! Currently, Casey is a Senior Account Manager within the Americaneagle.com Sitecore practice, where he works with an array of clients, enabling them to get the most out of their Sitecore instances. A lot of his days are spent planning and working through client backlogs and enhancements within the Sitecore platform. Outside of Sitecore and Americaneagle.com, Casey really enjoys being outside golfing, as well as cooking all types of meals and cuisines.