Tuesday 20 October 2009

SharePoint Advanced Search Properties Don't Work - the skinny on Created By, Modified By, Author and more

If you've done much work with the Advanced Search web part, it won't take long to discover that most of the default property searches simply don't work. At the time of writing there were dozens of very long forum threads describing the problem, with no comprehensive fix in sight from Microsoft or anyone else. Perhaps until now...?

What's not Broken?

Despite the claims of many, the Size and (Created/Modified) Date properties are not completely broken. They simply require input in the correct format. It's also worth noting that property searches require the exact value. No wildcards or partials!
  • Size - takes a value in bytes, so 1000000 = 1 MB.
  • Dates - require an xx/xx/xx or xx/xx/xxxx format. The order of the days/months will depend on your regional settings. If you're still getting no results, then check the metadata mappings and XSLT references below.
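As a quick illustration of those two input formats, here is a minimal Python sketch for working out what to type into the search boxes. The helper names (megabytes_to_bytes, format_search_date) and the day_first flag are my own inventions for this example and have nothing to do with SharePoint itself; the 1 MB = 1,000,000 bytes convention simply follows the Size bullet above.

```python
from datetime import date


def megabytes_to_bytes(megabytes: float) -> int:
    """Convert a size in MB to the byte value the Size property expects.

    Uses the convention from the post: 1 MB = 1,000,000 bytes.
    """
    return int(round(megabytes * 1_000_000))


def format_search_date(d: date, day_first: bool = True) -> str:
    """Format a date as xx/xx/xxxx.

    day_first=True gives dd/mm/yyyy, day_first=False gives mm/dd/yyyy.
    Which one is right depends on the regional settings in use.
    """
    pattern = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return d.strftime(pattern)


if __name__ == "__main__":
    print(megabytes_to_bytes(1))
    print(format_search_date(date(2009, 10, 20)))
    print(format_search_date(date(2009, 10, 20), day_first=False))
```

If a date search still returns nothing, try the other day/month order before assuming the property is broken.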
Although uncovering the correct format was a pain, it was nothing compared with what followed.

What is Broken and Why?

As many have discovered, the Created By, Last Modified By, Created Date, Last Modified Date and Author properties are all affected to some degree.
The root cause is improper mapping of crawled properties to their managed properties within Search Administration, and the problem carries all the way through to the XSLT for the Advanced Search and Search Core Results web parts.
The reasons behind these poor relationships become obvious shortly after you begin looking for a solution. To be frank, it also becomes obvious why most people gave up trying!

Thanks to...

Many thanks go to Anne Stenberg and her 6-part series entitled Mystery Solved - Crawled Properties in SharePoint.
In this series Anne patiently and painstakingly goes through every last property in each defined category, providing a description for many. I'm not entirely sure where she came by all this information but it proved invaluable when it came to identifying and testing the results of the many changes I went on to make in my metadata property mappings.

Please explain!

Using a combination of Anne's tables, the U2U CAML Query Builder feature, the ever useful SharePoint Manager, and the XSLT within the search web parts - it quickly becomes obvious that it's going to take more than a packet of off-the-shelf headache tablets to get through this.
Without going into too much detail - ignorance being bliss - let's take a look at something as simple as Author.
  • We have a visible Author column whose internal name is _Author.
  • A hidden Created By column whose internal name is Author.
  • And a managed property called Author that seems to want to hedge its bets by trying to cover all these bases as well as a few more.
But that's just the beginning - Created By and Modified By searches will invariably return zero results and also have their fair share of possible mappings and hidden values. What the heck is Write anyway?? Apparently just another value for Modified Date...but more on that later. I'm sure anyone who's that interested can do their own research. I won't bore everyone else any further.

What's the fix already!?

OK, OK. Keep your propeller hat on.
After days of stuffing around, tweaking mappings, modifying web part properties and performing a full crawl each time(!) I have finally found - I think - a solution. At least, a number of searches - using Author, Created By and Last Modified By properties with the AND operator - all returned correct results.
It's also worth noting that this solution is not Office-centric and will work with any document type.

First, the Metadata

This assumes a good knowledge of Central Administration. If you require detailed steps they can be found elsewhere.
You can use all or some of the settings shown below but the only ones that really matter are the Mappings themselves. After you've added the crawled properties, be sure to click each one and check the "Include values for this property in the search index" checkbox, otherwise it won't get added to the index! In all cases I went with the default "Include values from all crawled properties mapped" option.
Also note that there are often TWO properties with exactly the same name - e.g. Office:4(Text). Picking the right one is essential and I have provided the Property Set IDs below where this is relevant.
And, remember, what follows is in no way Gospel - it's just what worked for me.
NB: Don't forget to run a Full Crawl after making these changes.
Property Name      Type            May be deleted   Use in scopes   Mappings
Author             Text            No               Yes             _Author(Text), ows__Author(Text)
Created            Date and Time   Yes              No              Office:12(Date and Time), Basic:15(Date and Time)
CreatedBy          Text            Yes              No              Office:4(Text), ows_Created_x0020_By(Text)
LastModifiedTime   Date and Time   No               Yes             Basic:14(Date and Time), Basic:16(Date and Time), ows_Modified(Date and Time)
ModifiedBy         Text            Yes              Yes             Office:8(Text)

Property Set IDs for the ambiguous names:
  • Office:12(Date and Time) - f29f85e0-4ff9-1068-ab91-08002b27b3d9
  • Basic:15(Date and Time) - b725f130-47ef-101a-a5f1-02608c9eebac
  • Office:4(Text) - f29f85e0-4ff9-1068-ab91-08002b27b3d9
  • Office:8(Text) - f29f85e0-4ff9-1068-ab91-08002b27b3d9
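Because two crawled properties can share a name, it can help to think of each one as a (Property Set ID, name) pair rather than a name alone. The sketch below is only a checklist aid, assuming you transcribe by hand what Search Administration currently shows; it does not talk to SharePoint at all. CrawledProperty, EXPECTED and missing_mappings are hypothetical names made up for this example, and only the mappings with a Property Set ID listed above (plus ows_Created_x0020_By, which has none listed) are included.

```python
from typing import Dict, NamedTuple, Set


class CrawledProperty(NamedTuple):
    property_set: str  # Property Set ID (GUID), or "" where the post lists none
    name: str


# Property Set IDs copied from the list above.
OFFICE = "f29f85e0-4ff9-1068-ab91-08002b27b3d9"
BASIC = "b725f130-47ef-101a-a5f1-02608c9eebac"

# Managed property -> crawled properties it should be mapped to (from the table).
EXPECTED: Dict[str, Set[CrawledProperty]] = {
    "Created": {
        CrawledProperty(OFFICE, "Office:12"),
        CrawledProperty(BASIC, "Basic:15"),
    },
    "CreatedBy": {
        CrawledProperty(OFFICE, "Office:4"),
        CrawledProperty("", "ows_Created_x0020_By"),
    },
    "ModifiedBy": {CrawledProperty(OFFICE, "Office:8")},
}


def missing_mappings(
    actual: Dict[str, Set[CrawledProperty]],
) -> Dict[str, Set[CrawledProperty]]:
    """Return the expected mappings that are absent from `actual`."""
    missing = {}
    for managed, wanted in EXPECTED.items():
        gap = wanted - actual.get(managed, set())
        if gap:
            missing[managed] = gap
    return missing
```

A same-named property picked from the wrong property set shows up as missing, which is exactly the mistake the Property Set IDs are there to catch.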

Advanced Search XSLT

The following go in PropertyDefs. There are many default values here; I'm just providing the full block. You'll then need to add the same 'Name' references to each ResultType in the order you prefer.
<PropertyDef Name="Author" DataType="text" DisplayName="Author"/>
<PropertyDef Name="Size" DataType="integer" DisplayName="Size"/>
<PropertyDef Name="Keywords" DataType="text" DisplayName="Keywords"/>
<PropertyDef Name="CreatedBy" DataType="text" DisplayName="Created By"/>
<PropertyDef Name="Created" DataType="datetime" DisplayName="Created Date"/>
<PropertyDef Name="ModifiedBy" DataType="text" DisplayName="Last Modified By"/>
<PropertyDef Name="LastModifiedTime" DataType="datetime" DisplayName="Last Modified Date"/>
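Adding the 'Name' references to every ResultType by hand gets tedious, so here is a rough Python sketch of that step using the standard library's xml.etree.ElementTree. Treat it as an illustration under assumptions rather than a tested tool: the sample XML is a trimmed-down, made-up stand-in for the web part's Properties XML, the PropertyDef / ResultType / PropertyRef element names and their casing are what I'd expect from the stock XML, and add_property_refs is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, hypothetical stand-in for the Advanced Search Properties XML.
SAMPLE_PROPERTIES_XML = """
<root>
  <PropertyDefs>
    <PropertyDef Name="Author" DataType="text" DisplayName="Author"/>
    <PropertyDef Name="CreatedBy" DataType="text" DisplayName="Created By"/>
  </PropertyDefs>
  <ResultTypes>
    <ResultType DisplayName="All Results" Name="default">
      <Query/>
      <PropertyRef Name="Author"/>
    </ResultType>
    <ResultType DisplayName="Documents" Name="documents">
      <Query>IsDocument=1</Query>
    </ResultType>
  </ResultTypes>
</root>
"""


def add_property_refs(properties_xml: str) -> str:
    """Give every ResultType a PropertyRef for each PropertyDef it lacks."""
    root = ET.fromstring(properties_xml.strip())
    # Names in the order they are defined under PropertyDefs.
    names = [d.get("Name") for d in root.iter("PropertyDef")]
    for result_type in root.iter("ResultType"):
        existing = {r.get("Name") for r in result_type.findall("PropertyRef")}
        for name in names:
            if name not in existing:
                # New references are appended after any existing children.
                ET.SubElement(result_type, "PropertyRef", {"Name": name})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_property_refs(SAMPLE_PROPERTIES_XML))
```

The sketch appends missing references in PropertyDefs order and leaves existing ones where they are, so reorder them afterwards if you prefer a different sequence in the property drop-down. Keep a copy of the original XML before pasting anything back into the web part.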

Search Core Results XSLT

Unless you're trying to provide custom results using some of the values described above you won't need to make any changes here. Quite frankly it's a little daunting but great things can be done - such as displaying Size, Author and a custom link to open the containing folder for each result. I'll probably leave this for another post as it's a topic in itself.

In conclusion...

So, hopefully, if you've done everything right and performed a full crawl, you should now be able to search using one or all of the properties we've discussed here.
One thing you may find still doesn't work is the "Does not equal" operator. You might also notice that it's not described anywhere in the web part code but is managed by a separate core JavaScript file. I'm just not willing to look into this right now - or the reasons why "Contains" and "Doesn't contain" aren't available for partial search term querying. If anyone else has any ideas - performance notwithstanding - feel free to drop me a line.
I look forward to any further insight and feedback others might have and hope that all my hard work isn't undone with the next upgrade!