How to Create an iOS Document Scanner
August 1, 2025 by Marvin

Document Scanner App

A document scanner app is a professional tool that uses the device's camera to take a picture of a physical (paper) document and transforms it into a digital version. The main goal is to make the result look as if it had been scanned with a desktop scanner, that is, with crisp text and high contrast between background and text. To achieve this, the document scanner needs to perform a few processing steps:

  • Detect the document edges within the image in order to crop the document
  • Correct the perspective so the document looks as if it had been scanned from directly above
  • Apply a filter that enhances the colors and makes the background mostly white
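
To make the last step more concrete, here is a minimal sketch of a background-whitening filter in plain Swift. It is not how Docutain or any other scanner app implements its filters. The function name whitenBackground and the percentile parameter are made up for illustration, and the input is assumed to be a flat array of grayscale values.

```swift
import Foundation

/// Stretches grayscale values so that the paper background becomes white.
/// `pixels` holds values from 0 (black) to 255 (white).
/// Hypothetical helper for illustration only.
func whitenBackground(_ pixels: [UInt8], backgroundPercentile: Double = 0.9) -> [UInt8] {
    guard !pixels.isEmpty else { return [] }
    let sorted = pixels.sorted()
    // Estimate the paper brightness from a high percentile,
    // so that a few glare pixels do not dominate the estimate.
    let index = min(sorted.count - 1, Int(Double(sorted.count) * backgroundPercentile))
    let paper = Double(sorted[index])
    let ink = Double(sorted[0])
    // A flat image has nothing to stretch.
    guard paper > ink else { return pixels }
    return pixels.map { value in
        let scaled = (Double(value) - ink) / (paper - ink) * 255.0
        return UInt8(max(0.0, min(255.0, scaled.rounded())))
    }
}

// One dark ink pixel (40), grayish paper (200) and one glare pixel (250).
let filtered = whitenBackground([40, 200, 200, 200, 200, 200, 200, 200, 200, 200, 250])
print(filtered) // ink becomes 0, paper becomes 255, glare is clamped to 255
```

A real scanner filter has to cope with uneven lighting, shadows and colored content, which is why a single global stretch like this one is only a starting point.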

Most successful scanner apps, such as Adobe Scan or CamScanner, recognize the document within the camera frame in real time to make the scanning process as easy and as error-free as possible.

If you take a look at more sophisticated scanner apps or document management apps like our app Docutain, they go a few steps further. Once the document has been scanned successfully, they run OCR text recognition. With this, you can get two things:


Building a high-quality document scanner app for iOS from scratch takes a lot of time, effort and knowledge across many different specializations. Even if you build a document scanner successfully, your job isn't done. The vast variety of devices and OS versions causes a lot of maintenance work to keep your document scanner compatible and of high quality. All in all, it costs a lot of money, time and manpower to have a successful document scanner. Trust us, we've been there.

Should you need support with the implementation, we also offer developer support.


So what if I told you that you can get a world-class document scanner and integrate it into your iOS app in no time?

This short tutorial will guide you through the process of adding a top notch scanner into your iOS app by using our Docutain Document Scanner SDK.

Prerequisites

Prepare the project

Open Xcode and create a new iOS App project. Provide a name (e.g., "iOS Document Scanner"), choose Storyboard as the interface, and Swift as the language.

Docutain SDK Dependencies

Once the app project is initialized and opened, you need to add the Docutain SDK as a dependency. You can either download the XCFramework from our website and add it manually or use CocoaPods to install it. You can find a detailed explanation here: Docutain SDK Dependencies.

Camera Permission

Open your App Target’s Info tab and add a Privacy – Camera Usage Description key with a value such as "In order to scan documents, you need to grant camera access."
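
In the Info.plist source, this entry is stored under the raw key NSCameraUsageDescription. If you want to catch a missing or empty entry early, a small check like the following sketch can help. The function name cameraUsageDescription is hypothetical and not part of any SDK.

```swift
import Foundation

/// Returns the camera usage description from an Info.plist dictionary,
/// or nil if the entry is missing or contains only whitespace.
/// Hypothetical helper for illustration only.
func cameraUsageDescription(in info: [String: Any]?) -> String? {
    guard let text = info?["NSCameraUsageDescription"] as? String,
          !text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
    else { return nil }
    return text
}

// For example at app launch in debug builds:
if cameraUsageDescription(in: Bundle.main.infoDictionary) == nil {
    print("Missing NSCameraUsageDescription in Info.plist")
}
```

Without this entry, iOS terminates the app as soon as it tries to access the camera, so a check like this mainly saves debugging time.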

Initialize the Docutain SDK

To use any functionality of the Docutain SDK, you must initialize it first. For testing purposes, the Docutain SDK can run for 60 seconds without a valid license key. If you already have a free trial license or even a production license, you can replace the empty key with your license key. Add the following code to your AppDelegate class:

import UIKit
import DocutainSdk

func application(_ application: UIApplication,
    didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {

    // An empty license key starts the 60-second test mode described above
    if !DocutainSDK.initSDK(licenseKey: "") {
        // Init of the Docutain SDK failed, get the last error message
        let error = DocutainSDK.getLastError()
        print("Docutain SDK init failed: \(error)")
        // Your logic to deactivate access to SDK functionality
        // ...
    }

    return true
}

Implement the Document Scanner Feature

A great Document Scanner SDK stands out by providing the functionality with just a few lines of code, so that's what we did. Add a button to your layout and connect it to an action called scanButtonClicked. To start the document scanner, you just need to add the following lines of code:

import UIKit
import DocutainSdk

class ViewController: UIViewController, ScanDelegate {

    // Called by the SDK when the scan process ends
    func didFinishScan(withResult result: Bool) {
        if result {
            //user finished scan process, continue with your workflow
            //generate PDF by using Document.writePDF()
            //get detected Text by using DocumentDataReader.getText()
            //get data by using DocumentDataReader.analyze()
        } else {
            //user canceled scan process
        }
    }

    @objc private func scanButtonClicked() {
        let scanConfig = DocumentScannerConfiguration()
        UI.scanDocument(scanDelegate: self, scanConfig: scanConfig)
    }
}

Now when you click the button, the scanner should start, and you should be able to scan documents like in the video below:

Document Scanner Configuration

The Docutain Document Scanner SDK comes with UI components and predefined settings, theming and layouts. The default settings will be a good choice in most cases. The predefined layouts are the result of our long-standing experience with our Docutain Document Management App, which includes the Docutain SDK and is used by millions of users around the world daily.

If your use case requires different default values, e.g. single page scanning instead of multi-page scanning, you can of course change them, either by fixing them in your code or by providing a custom settings page that lets users change these values themselves. It is also possible to alter the default theming to match the scanner to your corporate branding.
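
If you go for a custom settings page, the chosen values have to be stored somewhere between app launches. The following is a minimal sketch of such an app-side store, assuming two options (automatic capture and multi-page scanning, both enabled by default as described in this article). The types ScannerSettings and ScannerSettingsStore and the storage key are hypothetical and not part of the Docutain SDK.

```swift
import Foundation

/// Hypothetical app-side model for the options a settings page could expose.
struct ScannerSettings: Codable, Equatable {
    var autoCapture = true
    var multiPage = true
}

/// Loads and saves the settings in UserDefaults under a hypothetical key.
struct ScannerSettingsStore {
    let defaults: UserDefaults
    let key = "scannerSettings"

    /// Returns the stored settings, or the defaults if nothing valid is stored.
    func load() -> ScannerSettings {
        guard let data = defaults.data(forKey: key),
              let settings = try? JSONDecoder().decode(ScannerSettings.self, from: data)
        else { return ScannerSettings() }
        return settings
    }

    /// Stores the settings as JSON data.
    func save(_ settings: ScannerSettings) {
        if let data = try? JSONEncoder().encode(settings) {
            defaults.set(data, forKey: key)
        }
    }
}

// Usage: read the values before starting the scanner and transfer them to
// the scanner configuration (see the SDK documentation for the property names).
let store = ScannerSettingsStore(defaults: .standard)
var settings = store.load()
settings.multiPage = false
store.save(settings)
```

Keeping the model separate from the SDK configuration means the settings page can be built and tested without the scanner being involved.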

Most common Document Scanner Settings

Default scan filter

Preparing camera images is one of the tasks that requires a lot of effort. Removing the background, and especially processing embedded color images, is one of our strengths.

Our iOS Document Scanner Plugin currently has a set of 8 predefined image filters. All of them have their own strengths; however, the one we have defined as default will provide you with the best possible scan quality in almost all cases. Nonetheless, it is of course possible to set any of the predefined filters as the default scan filter.

In the following video you can see for yourself the outstanding quality of our filters.

The unfiltered scanned document looks grayish and has poor contrast. Our automatic filter gives you a perfectly white document background (like the original paper document), increasing text contrast to a maximum while preserving all color information.



Automatic Capture

Docutain's Document Scanner captures the image at the perfect moment automatically. The only thing the user needs to do is to hold the phone above the document to be scanned. However, if you need the user to capture the image manually, you can disable automatic capture.

Multi Page Scan

Scanning multiple pages at once is the default setting, which ensures maximum efficiency when scanning. If you need single page scanning, this is of course also possible.

Theming

Theming is an important part when it comes to matching the scanner to your corporate branding. You can find all possible theming options in our documentation: iOS Document Scanner Color Configuration. The Docutain SDK supports native dark mode theming as well.

There are a lot more configuration options. If you are interested in them, feel free to check out our documentation or sample app on GitHub.

Document Scanner Onboarding

You might want to onboard your users and show some explanation about the scanner. For this, the Docutain SDK provides two optional Onboarding UI components. You can either use the defaults or alter the contents according to your needs. You can find details here: Onboarding.

iOS document scanner onboarding

Scan Tips

Even though the Docutain SDK provides the best possible scan results in almost all scenarios, you might still want to provide your users a screen showing some tips on how to achieve the best scan results. You can optionally enable such functionality and use either the defaults or alter the contents according to your needs. You can find details here: Scan Tips.

iOS document scanner scan tips for best scan results

Add PDF Export Feature

For most use cases you will want to export the scan as a PDF document for further processing. This lets you keep a multi-page document within a single file, and if you have OCR text recognition enabled, a text layer will be included, making the PDF document searchable and selectable. You can also compress the PDF file size and define a custom page format, e.g. A4 or letter. For more details, visit the PDF Export documentation.

To export the scan as a PDF document, add the following line of code after a successful scan. Here, path is assumed to be the file URL of the location the PDF should be written to:

let pdfUrl = Document.writePDF(fileUrl: path, fileName: "Test")

If you open the PDF, you can see that the text is searchable and selectable.

iOS document scanner pdf export searchable

Add Image Export Feature

As an alternative to a PDF export, you can also export the scanned pages as image files. You can choose between a local file and in-memory images, as well as the page source type, e.g. the cut and filtered image or the cut-only image. For more details, visit the Image Export documentation.

To export the scan as image files, add the following lines of code after a successful scan:

let paths = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)
let pageCount = Document.pageCount()
// Page numbers are 1-based; stride yields nothing if there are no pages
for page in stride(from: 1, through: pageCount, by: 1) {
    let path = paths[0].appendingPathComponent("Image_\(page).jpg")
    let savedPath = Document.writeImage(page: page, fileUrl: path)
}

Add OCR Text Recognition Feature

Adding OCR text recognition to your document scanner is just as easy. Just add the following line of code once the scan process has finished:

let text = DocumentDataReader.getText()

This will provide you with the text of the scanned document. If it has multiple pages, it will return the text of all scanned pages. Of course, you can get the text of a specific page by providing the page number. OCR text recognition also works on imported files (images and PDF files).
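
Once you have the text, you will typically want to do something with it, for example look for a keyword. The following sketch shows one way to do this in plain Swift. It assumes that the recognized text is available as a non-optional String (unwrap it first if the SDK returns an optional); the function name matchingLines and the sample text are made up for illustration.

```swift
import Foundation

/// Returns the non-empty lines of `text` that contain `query`, ignoring case.
/// Hypothetical helper for illustration only.
func matchingLines(in text: String, containing query: String) -> [String] {
    guard !query.isEmpty else { return [] }
    return text.split(whereSeparator: \.isNewline)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty && $0.range(of: query, options: .caseInsensitive) != nil }
}

// Made-up stand-in for the text returned by the OCR step.
let recognizedText = "Invoice No. 4711\n\n  Total: 19.99 EUR \nThank you"
let hits = matchingLines(in: recognizedText, containing: "total")
print(hits) // ["Total: 19.99 EUR"]
```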

OCR Text Recognition sample text

Add Data Extraction Feature

You can probably guess it already. Adding data extraction features to your iOS app is also possible by adding just one line of code:

let jsonData = DocumentDataReader.analyze()

The returned data will be JSON formatted; you can find samples here: Data Extraction Documentation. Currently the best results will be achieved for documents in German. If you need to support different languages or you need to detect specific kinds of data, we would love to hear from you: Contact us.
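
To work with the result, you need to parse the JSON. The following sketch uses Foundation's JSONSerialization to read the top-level fields without committing to a fixed schema. It assumes the result is available as a String; the function name topLevelFields and the sample JSON with its keys are made up for illustration, so check the Data Extraction Documentation for the actual structure.

```swift
import Foundation

/// Parses a JSON object string and returns its top-level entries.
/// Returns an empty dictionary if the input is not a JSON object.
/// Hypothetical helper for illustration only.
func topLevelFields(ofJSON json: String) -> [String: Any] {
    guard let data = json.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data),
          let fields = object as? [String: Any]
    else { return [:] }
    return fields
}

// Made-up sample; the real keys are listed in the documentation.
let sample = #"{"Date": "2025-08-01", "Amount": "19.99"}"#
let fields = topLevelFields(ofJSON: sample)
for key in fields.keys.sorted() {
    print(key, fields[key] ?? "")
}
```

Once you know which fields your use case needs, a Codable struct is usually the more convenient choice than a dictionary.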

Learn more about the Docutain SDK. Besides the Document Scanner SDK we also offer a Barcode Scanner SDK and a Photo Payment SDK. All SDKs are available for all major platforms/frameworks.

If you want to test the Document Scanner SDK without writing any lines of code, check out our Showcase App.

Contact us and receive your quote

Our pricing is tailored to your use case. Let our colleague Harry Beck know how we can help and receive your quote.