News

Article

YYC Data Society Showcases Pandell for its Innovative Use of AI to Digitize Ancient Land Documents for Energy

Mar 01 2023 | Community

Calgary, Alberta, Canada – The YYC Data Society, a network of 4,000 people working as a community to advance Calgary’s thriving data science scene, recently featured an article on Pandell’s innovative use of AI to digitize ancient documents for the energy industry. The article highlights how Pandell automates land document digitalization to help their clients' speed up project delivery time and save on valuable capital budget dollars. [View original post here.]

Imagine you are a large utility or pipeline company overwhelmed with the myriad of land documents required to operate and maintain powerlines or pipelines over landowners’ properties. These documents include:

  • Leases - The rights to use a piece of land for a specific time.
  • Easements - Agreements to cross one’s land.
  • Deeds - Agreements to transfer land ownership.
  • Titles - Ownerships of land.
  • Contracts - Agreements for the land and real estate on the land purchase with the seller financing the buyer.
  • Permits - Agreements for the specific uses of a land dictated by governments.

The land-owning parties can be a single person, a company, a government organization, a non-profit organization, a crown corporation, an Indigenous group, Etc. You have tens of thousands of such documents to track, so unless you are playing monopoly and only need to deal with a few people, you will need help.

This is where Pandell comes in. Pandell's software suite helps their clients in oil & gas, renewables, utilities, and pipelines by enabling organizational functions like accounting, land management, operations, and project management to work together seamlessly in a single, centralized digital hub, similar to a maestro orchestrating a symphony. The success of their solutions has resulted in a client list that includes Chevron, Shell, and BP Wind Energy.

Improve Your Asset Management

Discover How LandWorks Can Help

When a client first chooses Pandell's land management software, their historical land records must be pre-loaded to ensure they are productive from day one. Customers must make scheduled lease payments, manage joint interests, view assets on maps, manage chains of titles, and track obligations and provisions, such as the right to cut down trees or if an owner needs notification when accessing their land. While helping clients manage their land problems, Pandell has a problem when trying to load the land data.

In data science, digital data is categorized into structured, semi-structured, and unstructured data. The data is often on paper; lots and lots and lots of paper. About 300,000 pieces of unscanned paper pages per year, representing approximately 85,000 documents, are given to Pandell annually. They date back as far as the 1800s, with many pages barely legible by humans. Spills, ink blots, rips, imperfect copies (often copies of copies), and general fading abound. Could it be that Woodrow Wilson spilled his tea while signing a New Jersey land agreement in 1912? Quite possibly.

An image of original hand written paper land documents.

Click to enlarge

We sat down with Pandell's VP of R&D, David Beresford, to discuss how they historically tackled importing this data and how they are now making creative use of AI to streamline this process. In the past, David told us, there was no magic (sorry, Indy). Pandell employs a team of twenty dedicated Land Specialists who read, interpret and digitize documents manually, working year-round on the data. As skilled as the team is, the process is labor-intensive, prone to human error, and generally costly. "Large customers can provide tens of thousands of documents. The old manual process could take months to process and easily cost the client over a million dollars." David said.

The legacy process started when the Pandell team visited client offices and manually scanned the paper found in boxes and cabinets. PDF files were generated, one per box or drawer, and documents were individually separated to make it easier for scanning. When complete, the arduous task of examining every page in the PDFs began, with the team rotating pages to correct orientation, splitting them out into individual land agreements, and then manually inputting relevant data from the documents into Pandell's land system.

Earlier this year, Pandell set out to use modern technology to streamline this process. The first step was obvious - they needed a high-quality text extraction from the scanned pages. Traditional Optical Character Recognition (OCR) technology did not perform well due to the low quality of the documents. The latest AI Computer Vision-based services from Amazon (AWS), Google (GCP), and Microsoft (Azure) were each evaluated. Each vendor performed well in some areas and less so in others. AWS was proficient at reading handwritten text yet could not read rotated text; about 7% of the text was missing. Azure read the rotated text but struggled at text region detection (keeping related text blocks together). Google excelled at region detection but struggled to recognize handwriting. Pandell's answer was to combine data from each Cloud's AI service vendor to produce a consummate, "best-of-breed" result.

Duplicate documents were an immediate problem for the land team, causing the same work to be completed multiple times. Duplicates have been created over the years as clients transferred copies between field offices. With 45,000 documents in the backlog, the team had no way to identify these copies. By applying a document similarity algorithm against the newly extracted OCR text, 3,500 documents out of 46,000 were quickly flagged and removed. It's estimated that over 45 days worth of work was saved on an initial project that they experimented with this process on.

This "deduplication" provided the first proof of the business value of automation. Pandell then expanded this initial proof-of-concept into a comprehensive land document digitization pipeline. This included detecting PDF split points, orientating pages, classifying documents, linking documents to sub-documents, auto extracting data, cleaning data, and deduplication. David said it was also critical to build a business process around the technical models, so Pandell incorporated a workflow for quality assurance, reporting, forecasting, and billing.

Using OCR to detect duplicate documents.

Click to enlarge

To speed up the production rollout of the new framework, the first iteration implemented only "low hanging fruits". For example, only 20 of 40 document types were trained for classification, and about 20% of common entities were trained for value extraction (e.g., lessee, lessor, document date, document name). Despite being short of full automation, the system was still able to save much time and was rolled out for the land team to use in the real world. As the team validated classification results, the new engine achieved a success rate between 84% and 93%, depending on the document type, with about 80% of the documents successfully extracted data values (for currently trained entities). These were encouraging results, given the incomplete models. The news got even better when the document data-entry rates were computed - the entry speed was double historical norms.

Not resting on their laurels, Pandell began tackling an even more challenging problem - the automatic generation of GIS polygons from text descriptions. Land agreements contain "legal descriptions" that describe land boundaries that clients often want to visualize. In this case, Pandell Mappers read these descriptions, which can be several paragraphs long, and plot each point described using ESRI ArcGIS software.

The resulting polygons are imported into Pandell's software for display on a map. Pandell has a prototype that extracts this land description text, then parses and rewrites them into a normalized format. Next, Pandell will attempt to auto-plot points and auto-build polygons. Since mapping polygons can take twice as long as text entry, the potential time savings are enormous.

As with many companies attempting to apply AI and ML to their operations, Pandell is still exploring the endless possibilities. One future target is invoice recognition for their financial products. Customers find themselves reviewing and entering bills by hand as they arrive in hardcopy via snail mail or as PDF attachments in email, and automation can certainly deliver meaningful time savings.

In a relatively short order, Pandell has proven the real business value that innovative AI solutions bring clients, by reducing delivery times, saving costs, and increasing quality. With this AI solution already in use by international clients, it is a testament that Calgary tech is a real player on the world stage


About YYC Data Society

The YYC Data Society is Calgary's hub for data science, data engineering, and analytics. Born from grassroots communities, it is comprised of individuals and a consortium of partners including Data for Good, Calgary AI, PyData, Untapped Energy, Women in Data, and ADA.


About Pandell

Pandell is an industry leader in delivering Software-as-a-Service (SaaS) solutions to 500+ energy companies in Canada, the USA, and abroad. Our customers range from startups to major enterprises across energy sectors including oil & gas, pipelines, utilities, mining, and renewable energy. Our cloud-hosted product suite helps finance, land, and field operations run their business more effectively; while our enterprise division builds and manages large-scale web portal applications that facilitate work across organizations. Combining the strength of our industry experience, Lithium™ technology, and practical software subscription model, we are Crafting the Future of Energy Software.

Social and Info

Our Latest Posts Pandell's LinkedIN

Spreadsheets have their place, but in today's energy sector they're increasingly falling short 📊

Our latest article explores why many land, operations, and finance teams are moving away from spreadsheets as their primary tool. It highlights common challenges like errors, limited collaboration, and difficulty scaling as projects grow, while offering practical insights on improving workflows and data management.

Read the full article here: https://hubs.ly/Q03WX3Jj0

#DataManagement #LandManagement #GIS #DigitalTransformation #EnergyIndustry

We're excited to kick off the holiday season as a sponsor of this year's Canadian Association of Land and Energy Professionals, IRWA Right of Way, and Land and Energy Management Association of Canada Holiday Party! 🎉

Join us on December 4th at National on 10th for an evening of festive fun, great food, music, and good company. It's always one of the best events of the year!

#EnergyIndustry #Networking #OilandGas #LandManagement

Big moves on the horizon: Mark Carney has signed a major energy agreement with Alberta, a deal that could open the door to new oil pipeline projects, boost energy investment, and reshape Canada’s export strategy.

The pact also includes commitments to carbon pricing reforms and carbon capture initiatives, showing how energy growth and environmental responsibility might align.

🔎 Worth a read for anyone tracking Canada’s energy transition and infrastructure plans: https://hubs.ly/Q03W43nQ0

#CanadaEnergy #OilAndGas #EnergyIndustry #Pipelines

Turning Years of Lost Revenue into Real Value with Pandell Roads🚦

When Harvest Energy discovered their road-use billing hadn’t been completed in three years, they needed a solution fast. Their existing land system couldn’t back-bill, and revenue was slipping away.

After implementing Pandell Roads, a solution built for surface land professionals to automate and generate road use agreements and rental invoices, they cleared years of backlog, automated invoicing, and recovered 40× the cost of the system.

A major win for the team, and a powerful example of how the right technology can uncover hidden value.

#EnergyIndustry #RoadUse #LandManagement #OilandGas #Pipelines #Utilities

Happy GIS Day! 🌎

GIS brings clarity to complexity, helping us see patterns, make informed choices, and take action with confidence. Behind the scenes, it’s the mapping and data technology shaping business decisions and changing the world piece by piece.

As an Esri Partner, it's easily one of our favorite days of the year!

#GIS #GeographicInformationSystem #EnergyIndustry #ESRI #GISMapping

In the energy industry, data is abundant, but without true visibility and structure, it often stays hidden. GIS turns your raw land and infrastructure records into interactive maps and intelligence that drive smarter decisions and faster operations.

Learn more on how energy companies are embracing GIS to unlock their full potential.

Read the full article here: https://hubs.ly/Q03SS81l0

#GIS #LandManagement #GeospatialData #EnergyIndustry #EnergyGIS

☕ We had a fantastic time at the Land and Energy Management Association of Canada Annual Conference yesterday! It was great connecting with so many industry professionals and sponsoring the coffee break.

If we didn’t get a chance to chat, we’d still love to connect! Reach out to either Samantha Bush or Dean Ruston and set up a meeting.

#LEMAC #EnergyIndustry #LandManagement #OilandGas

View All Pandell Posts