Information Governance

A good information governance program can apply policies, procedures, processes, and controls to all enterprise information, whether it's corporate records, business data, email, or office documents. Check out our blog to help you make the most of your information governance initiative.

Building a File Plan Part 3: Creating Classification

March 9, 2017 by Andrew Borgschulte

Tags: File Plan, Classification

We are re-posting this series because we recently held a webinar on this topic (which can be found here).

This is Part 3 in a series of 6 posts about creating and executing an effective file plan for your organization. Click the following links to read the previous posts: Part 1, Part 2.

Read More

Adding yet more class to Information Governance (Part 3)

March 7, 2016 by Peter Sloan

Tags: Classification

The following is the third post in a series of three about information governance from Peter Sloan, a partner for Husch Blackwell LLP. The original post can be found here.

Read Part One and Part Two to get caught up.

Read More

Adding more class to Information Governance (Part 2)

February 29, 2016 by Peter Sloan

Tags: Classification

The following is the second post in a series of three from Peter Sloan, a partner for Husch Blackwell LLP. The original post can be found here.

Read More

Adding some class to Information Governance

February 22, 2016 by Peter Sloan

Tags: PII, Classification

The following is the first post in a series of three from Peter Sloan, a partner for Husch Blackwell LLP. The original post can be found here.

When governing information, it works well to identify and bundle rules (for legal compliance, risk, and value), identify and bundle information (by content and context), and then attach the rule bundles to the information bundles. Classification is a great means to that end, by both framing the questions and supplying the answers. With a classification scheme, we have an upstream “if-then” (if it’s this kind of information, then it has this classification), followed by a downstream “if-then” (if it’s information with this classification, then we treat it this way). A classification scheme is simply a logical paradigm, and frankly, the simpler, the better. For day-to-day efficiency, once the rules and classifications are set, we automate as much and as broadly as possible, thereby avoiding laborious individual decisions that reinvent the wheel.
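The two "if-then" steps can be sketched as a pair of lookup tables. This is only a minimal illustration: the information types, classification names, and handling rules below are hypothetical and do not come from the original post.

```python
# Hypothetical classification scheme: every name and rule here is an
# illustrative assumption, not a recommendation from the post.

# Upstream "if-then": if it's this kind of information, it has this classification.
UPSTREAM = {
    "customer_ssn": "confidential",
    "employee_review": "internal",
    "press_release": "public",
}

# Downstream "if-then": if it has this classification, we treat it this way.
DOWNSTREAM = {
    "confidential": {"encrypt": True, "retention_years": 7, "manual_review": False},
    "internal": {"encrypt": False, "retention_years": 3, "manual_review": False},
    "public": {"encrypt": False, "retention_years": 1, "manual_review": False},
    # Anything the scheme does not recognize is routed to a person.
    "unclassified": {"encrypt": True, "retention_years": None, "manual_review": True},
}


def classify(info_type: str) -> str:
    """Apply the upstream rule; unknown types fall back to 'unclassified'."""
    return UPSTREAM.get(info_type, "unclassified")


def handling_for(info_type: str) -> dict:
    """Chain the upstream and downstream rules for one kind of information."""
    return DOWNSTREAM[classify(info_type)]


if __name__ == "__main__":
    for kind in ("customer_ssn", "press_release", "meeting_notes"):
        print(kind, "->", classify(kind), "->", handling_for(kind))
```

Keeping the two tables separate mirrors the idea of bundling rules and bundling information independently, then attaching one to the other.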

Read More

10 rules for creating a successful mailroom classification project

September 2, 2015 by Team RecordLion

Tags: Mailroom, Classification

This post comes from Alexander Goerke, CEO and Founder at Skilja. The original article can be seen here.

Automatic, context-based classification for mailrooms has proven over the last few years to generate significant ROI and to accelerate processes. But we have also seen failures and disappointments. I have managed and monitored many of these projects in the past and would like to share 10 golden rules derived from my experience to make a mailroom classification process successful.

  1. Plan enough time to prepare: Changing how business processes originate is a major organizational change for the company. Although work in the mailroom is often considered of minor importance, this is the area where everything starts. Any error here has significant effects on quality of service and response time. Any improvement will ensure that responses to requests are faster and customer satisfaction is maintained. So don’t begin with implementation right away. Forget about technology for a moment, put aside tools, and spend time analyzing existing processes up front. Define clearly which of them you can automate and what the best way to approach this task would be. Document the findings and have them signed off by the customer.
  2. Involve stakeholders: Classification drives and initiates business processes. In each organization you will find stakeholders and subject matter experts (SMEs) who are familiar with the processes and have done manual sorting for years. They know the documents that arrive, and they know the explicit rules as well as many implicit procedures. Identify the SMEs and invite them to the team that defines the classification scheme. You can learn a lot from them. Create incentives for their active participation so that they share the hidden knowledge you need to understand their business. Often a simple, straightforward show of appreciation for the work they have been doing until now is enough to get them involved.
  3. Define goals: Clearly define goals and have them signed off by the team. Set expectations and explain to the team what classification can achieve, and what it can’t! Very often clients have unrealistic expectations of the new system's performance and, at the same time, wrong assumptions about the manual process. Depending on the kind of documents, manual classification produces as much as 5% errors and misclassifications. It makes sense to hold a general educational session about classification technologies and the preconditions for successful classification. If clients are introduced (at a high level) to the technological foundation, they will better understand that even an automation rate of 70% with an error rate of 3% can be a big success, and that 99% is not realistic. Make sure that everybody understands that quality can only be measured statistically and that it makes no sense to focus on single documents that might have been misclassified.
  4. Use good data sets: Get a large and representative set of documents from several weeks to account for changes in content by weekday. Typically, 1,000 to 2,000 documents need to be reviewed. Work with the SME team to sort the documents manually and make sure you understand why they sort them as they do. Often the reasons they give for sorting contain valuable hints about exceptions that need to be coded as rules and cannot be trained automatically. Create three data sets from these documents: a training or development set, a test set, and a reference set, and tag them with the correct classes. The reference set will be touched only for the final quality measurement needed to achieve sign-off. The test set will be used continuously to measure the effects of training and rules during development.
  5. Use clean data: Make sure that you clean the documents before using them for classification. Documents contain “noise” such as footers, general terms, and disclaimers that are not relevant to the content. These need to be removed by prior analysis (e.g., font size), as they will certainly mislead the classifier. For e-mails, make sure to remove text from old threads (which can be identified by indents), which often cascades over many responses.
  6. Start small: Do not attempt to solve the complete categorization problem at once. Focus on the main categories and start with them. In the beginning, use the 10 major classes (the classes with the highest number of documents) and create a working scheme. Then expand by adding more classes in steps of 10 or 20 while continuously measuring quality.
  7. Use hierarchy: If the classification software allows you to use hierarchies (it should!), make intelligent use of them. Main classes can be easily identified, and subsequent classification can further break them down into the desired target classes corresponding to business processes. As the differences between documents become smaller and smaller the closer the topics are, it is easier to handle these distinctions in a hierarchy. Together with rule 6, hierarchy provides an easy and straightforward way to reduce complexity and build the scheme step by step.
  8. Run continuous tests: Make sure that you test after each change. Quality can be assured if you know what you are doing. Running automated tests after each change allows you to better understand the reasons for possible deterioration. Searching for these reasons later might prove very difficult.
  9. Ramp up production: Don’t attempt to process the complete volume from day one when you start production. Instead, start with a fraction of the volume (10%) or, if possible, run automated classification in parallel for some days. This allows the business users to familiarize themselves with the system and allows you to correct errors that show up only in production. Ramp up the volume step by step until you process the complete volume after a few weeks.
  10. Monitor in production: Measure and monitor quality in production. To achieve this, you need statistics that show how many documents had to be classified manually and for how many the class was changed by users. Deterioration starts on day one. This is not necessarily the fault of the classification software; more often it is due to slow but steady changes in the document content over time. By monitoring the system, you can tune it during production so that it stays up to date. If you are using a system with automated learning, monitoring is essential to find out whether the quality really goes up.
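
As a rough illustration of rules 3, 4, and 10, the sketch below splits a tagged document collection into the three data sets and computes an automation rate and an error rate. It is only a sketch under stated assumptions: the function names and the 60/20/20 split are hypothetical (the rules above give no ratio), and the error rate is assumed to be measured over the automatically classified documents only.

```python
import random


def split_documents(docs, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle tagged documents and split them into training, test, and
    reference sets. The 60/20/20 default is an assumption for illustration."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * fractions[0])
    n_test = int(len(shuffled) * fractions[1])
    training = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    reference = shuffled[n_train + n_test:]  # set aside until final sign-off
    return training, test, reference


def quality_stats(results):
    """Compute (automation_rate, error_rate) from (predicted, true) pairs.

    A predicted class of None means the document was routed to manual
    classification. The error rate is taken over automated documents only,
    which is an assumption about how the rates are defined.
    """
    total = automated = errors = 0
    for predicted, true in results:
        total += 1
        if predicted is not None:
            automated += 1
            if predicted != true:
                errors += 1
    automation_rate = automated / total if total else 0.0
    error_rate = errors / automated if automated else 0.0
    return automation_rate, error_rate


if __name__ == "__main__":
    training, test, reference = split_documents(range(1000))
    print(len(training), len(test), len(reference))
    sample = [("invoice", "invoice")] * 3 + [("invoice", "claim"), (None, "claim")]
    print(quality_stats(sample))
```

Running the same `quality_stats` function on the test set after every change, and on user-corrected production data later on, gives one consistent statistical measure across development and production.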

I hope you find these practical tips useful for your projects. Please leave a comment below to share your thoughts.

If you want to learn more about auto-classification, please check our regular blog on www.skilja.com.

Read More

5 Steps to Curb Information Sprawl [SlideShare]

February 3, 2015 by Team RecordLion

Tags: Policies, Classification, Cost

Digital information within your organization is doubling in size at least every 24 months. The cost of finding lost information and the cost of eDiscovery are also on the rise. Other problems with information sprawl are the reduced value of analytics when there is too much noise (obsolete information) and the fact that, although the price of a hard drive may continue to drop, the cost of full-time employees, electricity, and other data center-related expenses will continue to increase.

Read More