SharePoint Search by default crawls the contents of all site columns so that they are available for search queries. While most would consider that a good thing, there are times when it is a hindrance, especially on public-facing websites, where search can surface content that is not relevant to the content on the page or the metadata about the page.
The search index is the center of SharePoint Search. What is in your search index determines what people will find when they look for information by entering search queries or by interacting with internet or intranet pages. The search schema is used to build up the search index. The search schema contains the mapping from crawled properties to managed properties and the settings on the managed properties.
Crawled and managed properties
A crawled property is content and metadata that is extracted from an item, such as a document or a URL, during a crawl. A crawled property can be an author, title, or subject. To include the content and metadata of crawled properties in the search index, you map crawled properties to managed properties. Managed properties can have a large number of settings, or attributes. These attributes determine how the contents are shown in search results. The search schema contains the attributes on managed properties and the mapping between crawled properties and managed properties.
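To make the relationship concrete, here is a minimal Python sketch of the mapping idea. It is a toy model, not SharePoint's implementation: the `SCHEMA` dictionary, the `to_managed` function, the sample item, and the specific crawled-to-managed pairings are all assumptions chosen for illustration.

```python
# Toy model of a search schema: crawled property name -> managed property name.
# The pairings below are illustrative assumptions, not the real default mappings.
SCHEMA = {
    "ows_Title": "Title",
    "ows__Author": "Author",
}


def to_managed(crawled_item, schema):
    """Return the managed properties an item gets from its crawled properties.

    Crawled properties without a mapping are dropped, mirroring the idea that
    an unmapped crawled property does not become a managed property.
    """
    managed = {}
    for crawled_name, value in crawled_item.items():
        managed_name = schema.get(crawled_name)
        if managed_name is not None:
            managed[managed_name] = value
    return managed


# A hypothetical crawled item; "ows_Internal" has no mapping in SCHEMA.
item = {"ows_Title": "Annual report", "ows__Author": "Jane Doe", "ows_Internal": "x"}
managed_item = to_managed(item, SCHEMA)
print(managed_item)
```

In this sketch the unmapped `ows_Internal` value never reaches the managed side, which is the part of the schema the rest of this post is concerned with.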
Irrelevant crawled and managed properties for public facing websites
In an intranet scenario you can imagine it is relevant for search results to show information about who created a certain document or page, who recently edited it, or to whom it is checked out. But for a public-facing website this information is irrelevant or even uncomfortable, since you might expose user names or account names. A recent study of this matter identified the following irrelevant properties:
Managed Property: |
---|
Author |
AuthorOWSUSER |
CheckoutUserOWSUSER |
DisplayAuthor |
DMSDocAuthor |
EditorOWSUSER |
MetadataAuthor |
ModifiedBy |
PublishingContactOWSUSER |
People |

Crawled Property: |
---|
ows__Author |
ows__CheckinComment |
ows__ModerationComments |
ows_CheckoutUser |
ows_Created_x0020_By |
ows_MetadataFacetInfo |
ows_ModifiedBy |
ows_Modified_x0020_By |
ows_PublishingContact |
Make it not Searchable
To make sure the previously mentioned managed properties are not shown in search results, we have to disable the Searchable flag.
From the search administration in Central Administration, navigate via the Search Schema link under Queries and Results on the quick launch. This will display a list of all the managed properties. Browse through the list and find the appropriate managed property. Locate the ‘Searchable’ checkbox, make sure it is not selected and press OK to save your changes.
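With ten managed properties to change by hand, it is easy to miss one. The following Python sketch is one possible way to keep track while working through the list. It does not talk to SharePoint: the `recorded` dictionary is a hypothetical record you would fill in yourself, and `still_enabled` is a helper name invented for this example.

```python
# Managed property names copied from the table earlier in this post.
MANAGED_TO_HIDE = [
    "Author", "AuthorOWSUSER", "CheckoutUserOWSUSER", "DisplayAuthor",
    "DMSDocAuthor", "EditorOWSUSER", "MetadataAuthor", "ModifiedBy",
    "PublishingContactOWSUSER", "People",
]


def still_enabled(names, settings, flag):
    """Return the names whose flag is not yet recorded as switched off.

    `settings` maps a property name to a dict of booleans (a hypothetical
    format). A name missing from `settings` counts as still enabled, because
    nobody has confirmed it was changed.
    """
    return [name for name in names if settings.get(name, {}).get(flag, True)]


# Hypothetical progress so far: only "Author" has been unchecked.
recorded = {
    "Author": {"Searchable": False},
    "People": {"Searchable": True},
}
todo = still_enabled(MANAGED_TO_HIDE, recorded, "Searchable")
print(todo)
```

Treating unrecorded properties as still enabled is a deliberate choice here: the list only shrinks when a change has actually been noted.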
Exclude metadata from the full-text index
Even though a managed property is not searchable, it does not mean it will not show up in search results. We also have to make sure the crawled property is not included in the full-text index. From the search administration in Central Administration, navigate via the Search Schema link under Queries and Results on the quick launch. Navigate to the Crawled Properties section, browse through the list and find the appropriate crawled property. Locate the ‘Include in full-text index’ checkbox, make sure it is not selected and press OK to save your changes.
Note: SharePoint already gives us a hint in the description of the ‘Include in full-text index’ checkbox: “Include the content of this crawled property in the full-text index. This enables searching for the content of this crawled property without mapping to a managed property. Use this setting if the content of this property may be relevant for end-user queries, but you do not see a need for a managed property that contains this content. … Including unnecessary properties in the full-text index may have a negative effect on search relevance and performance.”
IMPORTANT: Make sure to run a full crawl after having made any changes to a managed or crawled property! It may also be necessary to reset the search index first before you run a full crawl.
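Why both checkboxes matter can be sketched with a small Python model. This is a simplification of the behaviour described above, not SharePoint's actual indexing logic: the class names, the `full_text_values` function, the sample item, and the assumed mapping from `ows_ModifiedBy` to `ModifiedBy` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CrawledProp:
    name: str
    in_full_text_index: bool  # models the 'Include in full-text index' checkbox


@dataclass
class ManagedProp:
    name: str
    searchable: bool          # models the 'Searchable' checkbox
    mapped_from: tuple        # names of crawled properties mapped to it


def full_text_values(item, crawled, managed):
    """Return the lower-cased values that reach the toy full-text index."""
    values = set()
    # Path 1: a crawled property flagged for direct inclusion.
    for cp in crawled:
        if cp.in_full_text_index and cp.name in item:
            values.add(item[cp.name].lower())
    # Path 2: a searchable managed property fed by a crawled property.
    for mp in managed:
        if mp.searchable:
            for source in mp.mapped_from:
                if source in item:
                    values.add(item[source].lower())
    return values


# Hypothetical crawled item exposing a user name.
item = {"ows_ModifiedBy": "Jane Doe"}


def exposed(searchable, in_index):
    """Report whether the user name is findable for a given flag combination."""
    crawled = [CrawledProp("ows_ModifiedBy", in_index)]
    managed = [ManagedProp("ModifiedBy", searchable, ("ows_ModifiedBy",))]
    return "jane doe" in full_text_values(item, crawled, managed)


print(exposed(searchable=True, in_index=True))    # both flags on
print(exposed(searchable=False, in_index=True))   # only Searchable cleared
print(exposed(searchable=False, in_index=False))  # both flags cleared
```

In this model the name stays findable as long as either path is open, which is why clearing Searchable alone is not enough.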
Summary
SharePoint Search by default crawls the contents of all site columns so that they are available for search queries. While user-related information in search results is relevant in most scenarios, on a public-facing website it is best to remove it.
References
Overview of crawled and managed properties in SharePoint Server 2013: http://technet.microsoft.com/en-us/library/jj219630(v=office.15).aspx
Overview of the search schema in SharePoint Server 2013: http://technet.microsoft.com/en-us/library/jj219669(v=office.15).aspx
Originally posted at http://blog.marnixdevrije.nl/removing-unwanted-user-related-search-results-sharepoint-2013-websites/