By leveraging SharePoint metadata to tag our documents correctly, we can build world-class search functionality that allows users to easily find content dispersed across SharePoint farms or tenants.
Once the metadata is in place, we can easily create search refiners. However, one of the difficulties, especially when first migrating to SharePoint from an unstructured source such as file shares, is adding the initial tags. This can be a very error-prone and time-consuming process.
Of course, there are tools and methods available to extract certain tags from folder structures and apply these as pieces of metadata. However, as users have often dumped documents into folders over time, the folder names don’t always add richness to the data.
One approach SR1 Development have taken lately is to leverage Microsoft Cognitive Services. We worked with a large utility company to migrate nearly 250,000 drawings from file shares into SharePoint. During the migration, we ran each drawing through Microsoft Custom Vision, using a set of trained models to identify the drawing type, then used OCR to extract its text and compared that text to a database of known words (for example, site names).
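The comparison step can be illustrated with a minimal sketch. It assumes the drawing type label and the OCR text have already been returned by the respective services, and it uses approximate string matching from Python's standard library to tolerate OCR misreads. The fuzzy matching, the site names, the field names and the function names are all assumptions made for illustration, not the matching logic actually used in the project.

```python
import difflib
import re

# Hypothetical known words; in practice these would be loaded from a database.
KNOWN_SITE_NAMES = ["Riverside", "Northgate", "Hillcrest"]


def match_known_words(ocr_text, known_words, cutoff=0.8):
    """Return the known words that approximately match a token in the OCR text."""
    # Split the OCR output into alphanumeric tokens so misreads such as
    # "R1VERSIDE" stay together as a single token.
    tokens = re.findall(r"[A-Za-z0-9]+", ocr_text)
    lookup = {word.lower(): word for word in known_words}
    matches = []
    for token in tokens:
        close = difflib.get_close_matches(
            token.lower(), list(lookup), n=1, cutoff=cutoff
        )
        if close and lookup[close[0]] not in matches:
            matches.append(lookup[close[0]])
    return matches


def build_metadata(drawing_type, ocr_text, known_words):
    """Combine the classifier label and matched words into one metadata record.

    The keys "DrawingType" and "SiteNames" are hypothetical column names.
    """
    return {
        "DrawingType": drawing_type,
        "SiteNames": match_known_words(ocr_text, known_words),
    }


if __name__ == "__main__":
    sample_text = "PUMP STATION LAYOUT R1VERSIDE rev 3"
    print(build_metadata("Schematic", sample_text, KNOWN_SITE_NAMES))
```

The cutoff controls how tolerant the matching is: a lower value catches more OCR errors but risks false matches, so it would need tuning against real drawings.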
From this, we were able to add the drawings into SharePoint programmatically with a rich set of metadata. A search front end was then developed to help users easily find the required content.
Moving forward, we will be looking to implement this as an event receiver, so that whenever a new document is added there will be the option to run it through this process automatically.
If this sounds of interest, don’t hesitate to get in touch to explore how a similar solution may work for your organisation.