Solutions for automated invoice capturing are a dime a dozen. Evaluating an invoice capturing solution for a company is not an easy task – and selecting the right solution is even more difficult.

High-quality solutions let you test individual invoices directly online, which gives you a first impression of the quality. If your company is medium-sized or larger, you should not rely on intuition alone but also carry out a measurement. The next test should use at least a representative set of your company's own invoices. To arrive at a statement that holds for the entire invoice volume, however, you must first examine your own invoice data. We often do this on behalf of our clients and call this step the basic analysis.

The result of the basic analysis is a statement about the data quality of the company's invoices. It is the foundation for any evaluation of a system for automated invoice capturing; all further assessments build on it. We also carry out so-called potential analyses, i.e. predictions of how well our product (with or without model customisation) can extract data from the invoices of the company in question and make it available for further processing (recognition rate per individual feature as well as the overall recognition rate across all required features).
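
To make the two figures mentioned above concrete, here is a minimal sketch of how a recognition rate per feature and an overall recognition rate could be computed from a hand-checked test set. The function name, the field names and the exact-match comparison are assumptions for illustration; they are not the method used in the potential analysis.

```python
def recognition_rates(ground_truth, extracted, required_fields):
    """Return (rate per field, share of invoices with ALL required fields correct).

    ground_truth and extracted are lists of dicts, one dict per invoice,
    in the same order. A field counts as recognised only on an exact match
    (an assumption; real evaluations may normalise dates or amounts first).
    """
    if len(ground_truth) != len(extracted):
        raise ValueError("both lists must describe the same invoices")
    if not ground_truth:
        raise ValueError("the test set must not be empty")

    total = len(ground_truth)
    pairs = list(zip(ground_truth, extracted))

    per_field = {}
    for field in required_fields:
        hits = sum(1 for truth, got in pairs if got.get(field) == truth[field])
        per_field[field] = hits / total

    fully_correct = sum(
        1
        for truth, got in pairs
        if all(got.get(field) == truth[field] for field in required_fields)
    )
    return per_field, fully_correct / total


# Hypothetical two-invoice test set: the second total was misread.
truth = [
    {"invoice_number": "A-1", "total": "100.00"},
    {"invoice_number": "A-2", "total": "59.90"},
]
output = [
    {"invoice_number": "A-1", "total": "100.00"},
    {"invoice_number": "A-2", "total": "58.90"},
]
per_field, overall = recognition_rates(truth, output, ["invoice_number", "total"])
print(per_field, overall)
```

The overall rate is usually lower than any single per-field rate, because one wrong field is enough for an invoice to need manual correction.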

Below you will find our best practices and cornerstones of a sound basic analysis:

Before the basic analysis: Know your minimum goal and take small steps from there. Define the minimum goal as the first step. What does automated invoice capturing have to achieve for your accounts payable department to feel a first noticeable relief (MUST criteria)? Then define an outlook. What would be desirable beyond the minimum goal (NICE-TO-HAVE criteria)?

Do not assume that a recognition rate achieved at another company is transferable to your invoices. The quality of the incoming invoices determines the recognition and extraction quality you can expect, and it varies from company to company. Scan quality, document types, number of suppliers and required invoice details are just a few of the factors that play a significant role here.

Always compare systems from different manufacturers with the same set of test invoices.

Pay particular attention to selecting a representative test sample from your incoming invoices. We recommend looking at a sample that covers at least one entire month. If this sample is too large, draw a random subset from it. For large companies, the sample should consist of at least 1,000 invoices.
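
The random draw described above can be sketched in a few lines. The function name, the invoice identifiers and the fixed seed are assumptions for illustration; the seed only serves to make the draw repeatable, so that every system under test receives exactly the same invoices.

```python
import random


def draw_sample(invoice_ids, sample_size=1000, seed=42):
    """Draw a reproducible random sample without replacement.

    If the month contains no more invoices than sample_size,
    the whole month is returned unchanged.
    """
    invoice_ids = list(invoice_ids)
    if len(invoice_ids) <= sample_size:
        return invoice_ids
    rng = random.Random(seed)  # local generator, does not touch global state
    return rng.sample(invoice_ids, sample_size)


# Hypothetical month with 5,000 incoming invoices.
month = [f"INV-{i:05d}" for i in range(5000)]
sample = draw_sample(month)
print(len(sample))
```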

Analyse the quality of your incoming channels (scan, email, portal, PDF). Scans generally yield worse results than digitally transmitted PDF files. Text-based PDFs are ideal because they need no text recognition, and text recognition errors are generally difficult to correct later on. What is your scan quality (dpi), and how much does it vary? Does the scan quality match your system? Ask the vendor which DPI the system in question prefers. Higher DPI does not always mean better text recognition!

What types of documents do you receive? Usually, it’s not just perfect A4 invoices. In our analyses, we regularly see a wide variety of document types, such as cash register receipts, cash register closing reports, bank statements, parking tickets, public transport tickets, notifications etc. Most systems handle A4 invoices well; beyond that, things often get difficult. If you have a high proportion of unusual document types, consider a separate evaluation for each of these types!

How diverse are your invoice layouts? You can estimate this easily from the number of suppliers in your sample. (A professional provider will also run a cluster analysis on the sample and thus identify genuinely similar layouts rather than relying on suppliers alone.) We, of course, have the advantage that we can compare the results with those of other companies.
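
As a rough first approximation of layout diversity, the supplier count can be complemented by the share of invoices that come from the largest suppliers. The sketch below assumes a list with one supplier name per invoice; the function name and the `top_n` parameter are hypothetical, and this is not a substitute for the cluster analysis mentioned above.

```python
from collections import Counter


def supplier_diversity(suppliers, top_n=10):
    """Summarise how the sample is spread across suppliers.

    suppliers: one supplier name per invoice in the sample.
    Returns the number of distinct suppliers and the share of invoices
    issued by the top_n most frequent suppliers.
    """
    counts = Counter(suppliers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the sample must not be empty")
    top_invoices = sum(count for _, count in counts.most_common(top_n))
    return {
        "distinct_suppliers": len(counts),
        "top_share": top_invoices / total,
    }


# Hypothetical sample of ten invoices from three suppliers.
sample = ["Acme"] * 6 + ["Globex"] * 3 + ["Initech"]
print(supplier_diversity(sample, top_n=2))
```

A high top share suggests that a few layouts dominate, while a low share points to a long tail of rarely seen layouts.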

How many pages do your invoices have on average, and how are page counts distributed across the sample? Caution: Some system suppliers charge per page, not per document. A common average is between 1 and 3 pages.
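
The page statistics can be derived directly from the page count of each document in the sample. The following sketch uses hypothetical names and made-up page counts; the total number of pages is the figure that matters when a supplier charges per page.

```python
from collections import Counter
from statistics import mean


def page_statistics(page_counts):
    """Summarise page counts for a sample (one integer per document)."""
    if not page_counts:
        raise ValueError("the sample must not be empty")
    return {
        "documents": len(page_counts),
        "total_pages": sum(page_counts),
        "mean_pages": mean(page_counts),
        # Maps page count -> number of documents with that many pages.
        "distribution": dict(sorted(Counter(page_counts).items())),
    }


# Hypothetical sample of six documents.
stats = page_statistics([1, 1, 2, 3, 1, 2])
print(stats)
```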

How many different countries and languages does your sample cover in absolute and relative terms? If you have a high proportion of certain countries/languages, then consider a language-dependent evaluation.
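
The absolute and relative figures per language can be tallied as sketched below. The language codes, the function name and the 20 % threshold for flagging a language for its own evaluation are assumptions chosen for illustration; the appropriate threshold depends on your invoice volume.

```python
from collections import Counter


def language_breakdown(languages, threshold=0.2):
    """Count invoices per language and flag languages above a share threshold.

    languages: one language code per invoice in the sample.
    Returns (absolute counts, relative shares, flagged languages),
    with flagged languages ordered from most to least frequent.
    """
    counts = Counter(languages)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the sample must not be empty")
    shares = {lang: count / total for lang, count in counts.items()}
    flagged = [lang for lang, _ in counts.most_common() if shares[lang] >= threshold]
    return dict(counts), shares, flagged


# Hypothetical sample: seven German, two English and one Italian invoice.
counts, shares, flagged = language_breakdown(["de"] * 7 + ["en"] * 2 + ["it"])
print(counts, shares, flagged)
```

The same tally works for countries if you pass country codes instead of language codes.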

Which invoice details are necessary for the minimum goal, and which do you only need in a subsequent step?

The basic analysis and all subsequent steps are only as good as the initial selection of the representative set. Our mission is to bring real added value to each of our clients through our system, and the basic analysis is both the first and the most important step in this direction.

What is the second step? Determining the degree of automation – the BLU DELTA potential analysis


Author: Christian Weiler is the former General Manager of a global IT company based in Seattle/US. Since 2016, Christian Weiler has been increasingly active in the field of artificial intelligence in a wide variety of roles and has strengthened the management team of Blumatix Intelligence GmbH since 2018.