News

Article

YYC Data Society Showcases Pandell for its Innovative Use of AI to Digitize Ancient Land Documents for Energy

Mar 01 2023 | Community

Calgary, Alberta, Canada – The YYC Data Society, a network of 4,000 people working as a community to advance Calgary’s thriving data science scene, recently featured an article on Pandell’s innovative use of AI to digitize ancient documents for the energy industry. The article highlights how Pandell automates land document digitalization to help their clients' speed up project delivery time and save on valuable capital budget dollars. [View original post here.]

Imagine you are a large utility or pipeline company overwhelmed with the myriad of land documents required to operate and maintain powerlines or pipelines over landowners’ properties. These documents include:

  • Leases - The rights to use a piece of land for a specific time.
  • Easements - Agreements to cross one’s land.
  • Deeds - Agreements to transfer land ownership.
  • Titles - Ownerships of land.
  • Contracts - Agreements for the land and real estate on the land purchase with the seller financing the buyer.
  • Permits - Agreements for the specific uses of a land dictated by governments.

The land-owning parties can be a single person, a company, a government organization, a non-profit organization, a crown corporation, an Indigenous group, Etc. You have tens of thousands of such documents to track, so unless you are playing monopoly and only need to deal with a few people, you will need help.

This is where Pandell comes in. Pandell's software suite helps their clients in oil & gas, renewables, utilities, and pipelines by enabling organizational functions like accounting, land management, operations, and project management to work together seamlessly in a single, centralized digital hub, similar to a maestro orchestrating a symphony. The success of their solutions has resulted in a client list that includes Chevron, Shell, and BP Wind Energy.

Improve Your Asset Management

Discover How LandWorks Can Help

When a client first chooses Pandell's land management software, their historical land records must be pre-loaded to ensure they are productive from day one. Customers must make scheduled lease payments, manage joint interests, view assets on maps, manage chains of titles, and track obligations and provisions, such as the right to cut down trees or if an owner needs notification when accessing their land. While helping clients manage their land problems, Pandell has a problem when trying to load the land data.

In data science, digital data is categorized into structured, semi-structured, and unstructured data. The data is often on paper; lots and lots and lots of paper. About 300,000 pieces of unscanned paper pages per year, representing approximately 85,000 documents, are given to Pandell annually. They date back as far as the 1800s, with many pages barely legible by humans. Spills, ink blots, rips, imperfect copies (often copies of copies), and general fading abound. Could it be that Woodrow Wilson spilled his tea while signing a New Jersey land agreement in 1912? Quite possibly.

An image of original hand written paper land documents.

Click to enlarge

We sat down with Pandell's VP of R&D, David Beresford, to discuss how they historically tackled importing this data and how they are now making creative use of AI to streamline this process. In the past, David told us, there was no magic (sorry, Indy). Pandell employs a team of twenty dedicated Land Specialists who read, interpret and digitize documents manually, working year-round on the data. As skilled as the team is, the process is labor-intensive, prone to human error, and generally costly. "Large customers can provide tens of thousands of documents. The old manual process could take months to process and easily cost the client over a million dollars." David said.

The legacy process started when the Pandell team visited client offices and manually scanned the paper found in boxes and cabinets. PDF files were generated, one per box or drawer, and documents were individually separated to make it easier for scanning. When complete, the arduous task of examining every page in the PDFs began, with the team rotating pages to correct orientation, splitting them out into individual land agreements, and then manually inputting relevant data from the documents into Pandell's land system.

Earlier this year, Pandell set out to use modern technology to streamline this process. The first step was obvious - they needed a high-quality text extraction from the scanned pages. Traditional Optical Character Recognition (OCR) technology did not perform well due to the low quality of the documents. The latest AI Computer Vision-based services from Amazon (AWS), Google (GCP), and Microsoft (Azure) were each evaluated. Each vendor performed well in some areas and less so in others. AWS was proficient at reading handwritten text yet could not read rotated text; about 7% of the text was missing. Azure read the rotated text but struggled at text region detection (keeping related text blocks together). Google excelled at region detection but struggled to recognize handwriting. Pandell's answer was to combine data from each Cloud's AI service vendor to produce a consummate, "best-of-breed" result.

Duplicate documents were an immediate problem for the land team, causing the same work to be completed multiple times. Duplicates have been created over the years as clients transferred copies between field offices. With 45,000 documents in the backlog, the team had no way to identify these copies. By applying a document similarity algorithm against the newly extracted OCR text, 3,500 documents out of 46,000 were quickly flagged and removed. It's estimated that over 45 days worth of work was saved on an initial project that they experimented with this process on.

This "deduplication" provided the first proof of the business value of automation. Pandell then expanded this initial proof-of-concept into a comprehensive land document digitization pipeline. This included detecting PDF split points, orientating pages, classifying documents, linking documents to sub-documents, auto extracting data, cleaning data, and deduplication. David said it was also critical to build a business process around the technical models, so Pandell incorporated a workflow for quality assurance, reporting, forecasting, and billing.

Using OCR to detect duplicate documents.

Click to enlarge

To speed up the production rollout of the new framework, the first iteration implemented only "low hanging fruits". For example, only 20 of 40 document types were trained for classification, and about 20% of common entities were trained for value extraction (e.g., lessee, lessor, document date, document name). Despite being short of full automation, the system was still able to save much time and was rolled out for the land team to use in the real world. As the team validated classification results, the new engine achieved a success rate between 84% and 93%, depending on the document type, with about 80% of the documents successfully extracted data values (for currently trained entities). These were encouraging results, given the incomplete models. The news got even better when the document data-entry rates were computed - the entry speed was double historical norms.

Not resting on their laurels, Pandell began tackling an even more challenging problem - the automatic generation of GIS polygons from text descriptions. Land agreements contain "legal descriptions" that describe land boundaries that clients often want to visualize. In this case, Pandell Mappers read these descriptions, which can be several paragraphs long, and plot each point described using ESRI ArcGIS software.

The resulting polygons are imported into Pandell's software for display on a map. Pandell has a prototype that extracts this land description text, then parses and rewrites them into a normalized format. Next, Pandell will attempt to auto-plot points and auto-build polygons. Since mapping polygons can take twice as long as text entry, the potential time savings are enormous.

As with many companies attempting to apply AI and ML to their operations, Pandell is still exploring the endless possibilities. One future target is invoice recognition for their financial products. Customers find themselves reviewing and entering bills by hand as they arrive in hardcopy via snail mail or as PDF attachments in email, and automation can certainly deliver meaningful time savings.

In a relatively short order, Pandell has proven the real business value that innovative AI solutions bring clients, by reducing delivery times, saving costs, and increasing quality. With this AI solution already in use by international clients, it is a testament that Calgary tech is a real player on the world stage


About YYC Data Society

The YYC Data Society is Calgary's hub for data science, data engineering, and analytics. Born from grassroots communities, it is comprised of individuals and a consortium of partners including Data for Good, Calgary AI, PyData, Untapped Energy, Women in Data, and ADA.


About Pandell

Pandell is an industry leader in delivering Software-as-a-Service (SaaS) solutions to 500+ energy companies in Canada, the USA, and abroad. Our customers range from startups to major enterprises across energy sectors including oil & gas, pipelines, utilities, mining, and renewable energy. Our cloud-hosted product suite helps finance, land, and field operations run their business more effectively; while our enterprise division builds and manages large-scale web portal applications that facilitate work across organizations. Combining the strength of our industry experience, Lithium™ technology, and practical software subscription model, we are Crafting the Future of Energy Software.

Social and Info

Our Latest Posts Pandell's LinkedIN

☕ We had a fantastic time at the Land and Energy Management Association of Canada Annual Conference yesterday! It was great connecting with so many industry professionals and sponsoring the coffee break.

If we didn’t get a chance to chat, we’d still love to connect! Reach out to either Samantha Bush or Dean Ruston and set up a meeting.

#LEMAC #EnergyIndustry #LandManagement #OilandGas

We're excited to be the Coffee Break sponsor of the 2025 LEMAC Annual Conference tomorrow! It'll be a day dedicated to exploring how Canada's land and energy sectors are evolving through innovation, collaboration, and sustainability.

We hope to see you there!

#LEMAC #EnergyIndustry #Sustainability #LandManagement

We had an amazing time connecting and golfing this week at #IMGIS2025 in Palm Springs! ⛳

Some of our customers who integrated Whitestar’s clean, standardized land data into their Pandell land system saw up to 80% less time spent on land administration, a true game changer for efficiency.

Pandell’s Enterprise Land unifies land workflows and geospatial data in one system. And since Pandell and Whitestar have been Esri partners for years, everything works seamlessly with the GIS tools your team already knows and trusts.

If you missed us and want to learn how to step up your land management, reach out, we’d love to connect!

#Esri #EsriPartner #GISMapping #LandManagement #GeospatialData

The latest McKinsey & Company Global Energy Perspective reveals that global energy demand continues to grow, even as decarbonization efforts accelerate. From regional shifts and rising electricity needs to the ongoing role of fossil fuels, the report highlights both opportunity and urgency for the energy sector.

Read more: https://hubs.ly/Q03PWX3K0

As the energy landscape evolves, our integrated land and financial software helps companies to plan, track and manage their business operations more efficiently.

#EnergyTransition #GlobalEnergy #Utilities #RenewableEnergy #OilAndGas

⚡Are your data systems holding your energy operations back?

As energy companies grow and adopt new technologies, managing data across multiple vendors can quickly become a major challenge. Disconnected systems lead to inefficiencies, higher costs, and slower decision-making.

Our article explores how integrated data management platforms can help unify information, break down silos, and deliver real-time insights for better, faster decisions.

👉 Read more: https://hubs.ly/Q03PxBwV0

#EnergyIndustry #DataManagement #DigitalTransformation #EnergyInnovation #Utilities #OilAndGas #Renewables

Exhibiting at Electricity Transformation Canada 2025 last week in Toronto was a blast! We loved exchanging ideas on how renewable energy companies can transform their land management. It was great to show how quickly land teams can access information by bringing all their land data into an integrated solution.

Check out Pandell LandWorks https://hubs.ly/Q03NCwSd0

#LandManagement #RenewableEnergy #CleanEnergy #ETC2025

It was so much fun sponsoring the Masquerade Ball at the Canadian Association of Land and Energy Professionals Conference last week in Saskatoon!

Unmask your land data with confidence - our experts guide you from start to finish.

Pandell Projects - where energy operators and land services partners can track, communicate and report land acquisition project details in real-time. https://hubs.ly/Q03NHZtc0

#LandManagement #CALEP2025 #LandAndEnergy #EnergyLeadership

View All Pandell Posts