Getting Started with Sitecore Cortex Content Tagging

Last February, Americaneagle.com participated in the 2020 Sitecore Hackathon competition, and our team, XTeam, was honored to win first place. Our module used the latest Sitecore 9.3 components and directed them toward meeting the core needs of the Sitecore Marketplace website with minimal overhead in a 24-hour window. We needed a powerful content tagging service to drive the search with minimal effort to integrate it with Sitecore, and we found our answer in Cortex Content Tagging. So what is that?

The Sitecore Cortex Content Tagging feature integrates Sitecore XP with machine learning-based natural language processing (NLP) engines that process the content and return metadata to Sitecore, where it can be classified based on a user-defined taxonomy.

Sitecore made it really easy to kick-start content tagging using Refinitiv Intelligent Tagging Open Calais, a service by Thomson Reuters that analyzes the content text and generates highly accurate and detailed metadata. If you wish to know more about this great service, you can visit the official site, Intelligent Tagging Text Analytics.

It literally takes minutes to set up Open Calais on Sitecore:

- First, you need to register for a Thomson Reuters Permanent Identifier (PermID). You can register for a free PermID at: https://permid.org/

- You can find the configuration file at: App_Config/Sitecore/ContentTagging/Sitecore.ContentTagging.OpenCalais.config.
Make sure it is enabled, then create a patch file to update CalaisAccessToken, CalaisLanguage, and CalaisEndpoint if needed. The patch file can be as simple as:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore role:require="Standalone or ContentManagement">
    <settings>
      <setting name="Sitecore.ContentTagging.OpenCalais.CalaisEndpoint" value="https://api.thomsonreuters.com/permid/calais" />
      <setting name="Sitecore.ContentTagging.OpenCalais.CalaisAccessToken" value="XXXXXXXXXXXXXXXXXXXX" />
      <setting name="Sitecore.ContentTagging.OpenCalais.CalaisLanguage" value="English" />
    </settings>
  </sitecore>
</configuration>

At this point, everything is set and you can start tagging items. By default, however, Sitecore only includes the content of Multi-Line Text and Rich Text fields that are named Title. You can check this setting in the config file “App_Config\Sitecore\ContentTagging\Sitecore.ContentTagging.Core.config”.
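For orientation, the default field map described above would look roughly like the fragment below. This is a sketch reconstructed from the description in this article, not a copy of the shipped file, so treat the exact layout as an assumption and check your own Sitecore.ContentTagging.Core.config for the real thing.

```xml
<!-- Sketch of the default field map (assumed layout; verify against your own config) -->
<contentTagging>
  <fieldMap>
    <!-- Only fields of these types are read by default -->
    <fieldTypes>
      <fieldType fieldTypeName="Multi-Line Text"/>
      <fieldType fieldTypeName="Rich Text"/>
    </fieldTypes>
    <!-- Only fields with these names are read by default -->
    <fieldNames>
      <field fieldName="Title"/>
    </fieldNames>
  </fieldMap>
</contentTagging>
```

A field has to pass both filters, which is why a Rich Text field named Content would be skipped until the field map is extended.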

You will probably have more fields to consider, so you will need another patch file. In my case, I added the Single-Line Text field type and the Content and Readme field names:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore role:require="Standalone or ContentManagement">
    <contentTagging>
      <fieldMap>
        <!--
        FieldTypes
          Specifies list of fields type names allowed to be used in tagging process.
        -->
        <fieldTypes>
          <fieldType fieldTypeName="Single-Line Text"/>
          <fieldType fieldTypeName="Multi-Line Text"/>
          <fieldType fieldTypeName="Rich Text"/>
        </fieldTypes>
        <!--
        FieldNames
          Specifies list of fields names allowed to be used in tagging process.
        -->
        <fieldNames>
          <field fieldName="Title"/>
          <field fieldName="Content"/>
          <field fieldName="Readme"/>
        </fieldNames>
      </fieldMap>
    </contentTagging>
  </sitecore>
</configuration>
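If you prefer not to restate the defaults, a shorter patch that lists only the additions may also work. This is an untested sketch: it assumes Sitecore's standard patch merging appends child elements that do not match an existing entry, so confirm the merged result on the /sitecore/admin/showconfig.aspx page before relying on it.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore role:require="Standalone or ContentManagement">
    <contentTagging>
      <fieldMap>
        <fieldTypes>
          <!-- Added on top of the default field types -->
          <fieldType fieldTypeName="Single-Line Text"/>
        </fieldTypes>
        <fieldNames>
          <!-- Added on top of the default Title field -->
          <field fieldName="Content"/>
          <field fieldName="Readme"/>
        </fieldNames>
      </fieldMap>
    </contentTagging>
  </sitecore>
</configuration>
```

Listing every entry explicitly, as in the full patch file, is the safer choice when you want the patch to document the complete field map in one place.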
 

Now you have a running, high-quality tagging service with basic customization in minutes!

Run the Content Tagging Process

With everything set up, the author needs to run the service against the Sitecore items to generate tags. The author must have permission to edit the target item. To tag it, navigate to the item and select Tag Item from the Home ribbon. As with many similar commands in Sitecore, you can choose to tag the item alone or with its subitems.

Just remember: if the ribbon button is disabled, check the configuration one more time.



 

After the progress indicator closes, the new tags are added to the Semantics field in the Tagging section.




That’s all. Sitecore Cortex Content Tagging is a powerful way to integrate a content tagging service with Sitecore. This article covers the out-of-the-box integration, but like everything else in Sitecore, it is highly customizable as well. That’s a story for another article.


About Author

NaimAl
Naim holds a Bachelor's degree in Information Technology and an MBA in Finance. He started his programming career in 2001, building desktop financial applications. In 2004, he shifted his focus to developing websites. In 2016, Naim joined Americaneagle.com as a Senior Sitecore Developer. In 2020, Naim won first place in the Sitecore Hackathon 2020 competition with the American Eagle XTeam, competing against over 82 teams from more than 23 countries. When he is not coding, he likes spending time with his wife and two kids, watching TV, or relaxing in his backyard.

