Enhancing Geolocation Classification with LLMs & Google Maps API

The Challenge of Location Classification

 

Accurate location classification is important for any organisation seeking to understand and optimise the distribution of physical infrastructure, such as hospitals, schools, logistics hubs, or government facilities. In this context, location classification refers to the process of assigning places to a standardised geographic hierarchy (such as country, region, province, district, or city) based on their address or coordinates, thus ensuring that each facility or site is consistently placed within known administrative boundaries.

 

In many cases, organisations already have basic location data (the names of clinics, schools, offices, etc.), but it’s not classified into a clear geographic hierarchy. This could be because data is entered manually, comes from different sources, or is inherited from older systems without proper location tags, meaning that the organisation can’t plan or analyse properly until the data is standardised.

 

Dealing with geolocation data can be challenging, particularly when it comes to extracting precise location information: there may be inconsistencies in data sources and varying levels of detail. For example, facility names may be recorded differently across databases, making it difficult to match or geocode them accurately; addresses may be incomplete, misspelt, or use non-standard formats, especially when collected manually. What’s more, geographic boundaries can change over time, leading to outdated or conflicting classifications. All this makes it difficult to evaluate service coverage, identify underserved areas, or plan logistics that depend on accurate mapping.

 

We first encountered this challenge while working with a telecommunications company that needed to categorise infrastructure locations, like antennas and service centres, into clear geographic hierarchies. Their existing data lacked consistent formatting, which highlighted the many problems of working with unreliable, unstructured location data.

 

 

An AI-Driven Approach to Location Classification

 

In this use case, we looked at how to solve these challenges by combining the natural language processing capabilities of LLMs with structured address data obtained through the Google Maps API. Whilst our test dataset is focused on Spanish hospitals, the methodology can be used for many kinds of infrastructure facilities that require accurate geographic identification. We explored whether modern AI tools can automate the classification of such facilities into standardised geographic categories, using only the name of each location as the input in cases where more detailed geographic information is unavailable.

 

 

Initial Attempt Using LLMs

 

Our first attempt involved using an LLM via the OpenAI API, specifically GPT-4o, to extract location details from hospital names. The prompt instructed the model to return a structured JSON response containing four levels of location information: Autonomous Community, Province, Town, and Street.
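A prompt of this kind could be assembled and its reply parsed roughly as in the sketch below. The prompt wording, the JSON key names, and the helpers `build_prompt` and `parse_location` are illustrative assumptions rather than the exact ones used in the project, and the model call itself is left out so that the snippet runs offline.

```python
import json

# Hypothetical JSON keys for the four levels described above.
LEVELS = ("autonomous_community", "province", "town", "street")


def build_prompt(hospital_name: str) -> str:
    """Build an instruction asking the model for a four-level JSON answer."""
    keys = ", ".join(f'"{k}"' for k in LEVELS)
    return (
        "You are given the name of a hospital in Spain.\n"
        f"Return only a JSON object with the keys {keys}.\n"
        'For "street", give the street name only, without the number.\n'
        "Use null for any level you cannot determine.\n"
        f"Hospital name: {hospital_name}"
    )


def parse_location(raw_reply: str) -> dict:
    """Parse the model's JSON reply, keeping only the expected keys."""
    data = json.loads(raw_reply)
    # Missing keys become None so every record has the same shape.
    return {k: data.get(k) for k in LEVELS}
```

With the OpenAI Python client, the string returned by `build_prompt` would typically be sent as a user message through `client.chat.completions.create(model="gpt-4o", ...)`, and the text of the reply passed to `parse_location`.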

 

 

Despite the model's natural language processing capabilities, in many cases this method failed to classify the hospitals accurately, primarily because their names do not provide enough geographic context to pinpoint precise locations. For example, hospitals in different towns or provinces may have similar names, creating ambiguity; furthermore, the model might struggle to recognise less well-known or unique place names.

 

The following table presents several examples of this: one notable case is the “Centro de Rehabilitación Psicosocial Santo Cristo de los Milagros”, which the model incorrectly places in Jerez de la Frontera, although it is actually located in Huesca, a thousand kilometres away! Another example is the “Instituto de Enfermedades Neurológicas de Castilla-La Mancha”; whilst assigning it to the right town, the model fails to identify the correct province and omits the street. Similarly, the “Hospital Universitario de Jerez de la Frontera” is accurately classified in terms of location, but the street name is missing from the response. In one case, the model was unable to find any location at all for the “Casal de Curació”.

 

On a more positive note, there are four examples in which the model correctly classified all four levels of location information. However, in three of those cases, it included the street number in the “Street” field, despite the prompt specifying that only the street name (not number) should be returned:

 

| Hospital Name | Autonomous Community | Province | Town | Street | Result |
| --- | --- | --- | --- | --- | --- |
| Institut Català D’Oncologia – Hospital Duran i Reynals | Cataluña | Barcelona | L’Hospitalet de Llobregat | Avinguda Granvia de l’Hospitalet, 199-203 | CORRECT |
| Centro de Rehabilitación Psicosocial Santo Cristo de los Milagros | Andalucía | Cádiz | Jerez de la Frontera | Calle Santo Cristo de los Milagros | INCORRECT (wrong location) |
| Instituto de Enfermedades Neurológicas de Castilla-La Mancha | Castilla-La Mancha | Toledo | Guadalajara | None | INCORRECT (wrong province, incomplete classification) |
| Casal de Curació | None | None | None | None | INCORRECT (empty classification) |
| Hospital Universitario de Jerez de la Frontera | Andalucía | Cádiz | Jerez de la Frontera | None | INCORRECT (incomplete classification) |
| Hospital Universitario Reina Sofía | Andalucía | Córdoba | Córdoba | Avenida Menéndez Pidal | CORRECT |
| Hospital Medimar Internacional | Comunidad Valenciana | Alicante | Alicante | Avenida de Denia, 103 | CORRECT |
| Clínica Planas | Cataluña | Barcelona | Barcelona | Carrer de Pere II de Montcada, 16 | CORRECT |
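Because the model sometimes returned the street number even though the prompt asked for the name only, a small post-processing check can help to flag or clean such values. The following is a minimal sketch of one possible check; it was not part of the original experiment, the helper names are hypothetical, and the pattern only covers the "name, number" format seen in the results above.

```python
import re

# Matches a trailing ", 16" or ", 199-203" style house number.
_TRAILING_NUMBER = re.compile(r",\s*\d[\d\-–]*\s*$")


def has_street_number(street):
    """Return True if the Street value ends with a house number."""
    if not street:
        return False
    return _TRAILING_NUMBER.search(street) is not None


def strip_street_number(street):
    """Remove a trailing house number, leaving other values unchanged."""
    if not street:
        return street
    return _TRAILING_NUMBER.sub("", street).strip()
```

Applied to the results above, `has_street_number` flags the three entries that include a number, while "Avenida Menéndez Pidal" passes unchanged.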

 

 

Google Maps API: The Next Step Forward

 

After realising the limitations of relying only on hospital names, we integrated the Google Maps API to obtain more structured location data. The API returned a complete address for each hospital, and these addresses were then passed to the LLM to extract the hierarchical location details.
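The two-step flow can be sketched as follows. The article does not state which Google Maps endpoint was used, so this example assumes the Places API "Find Place from Text" web service; the helper names and the prompt wording are hypothetical, and `lookup_address` needs network access and a valid API key to run.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: Places API, Find Place from Text.
FIND_PLACE_URL = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json"


def build_find_place_url(name: str, api_key: str) -> str:
    """Build a request that asks only for the formatted address."""
    params = {
        "input": name,
        "inputtype": "textquery",
        "fields": "formatted_address",
        "key": api_key,
    }
    return f"{FIND_PLACE_URL}?{urlencode(params)}"


def first_address(payload: dict) -> Optional[str]:
    """Return the first candidate's address, or None if nothing was found."""
    if payload.get("status") != "OK":
        return None
    candidates = payload.get("candidates") or []
    return candidates[0].get("formatted_address") if candidates else None


def lookup_address(name: str, api_key: str) -> Optional[str]:
    """Call the API over HTTP and return the first formatted address."""
    with urlopen(build_find_place_url(name, api_key), timeout=10) as resp:
        return first_address(json.load(resp))


def build_address_prompt(name: str, address: str) -> str:
    """Hypothetical prompt giving the LLM the address as extra context."""
    return (
        "Extract the Autonomous Community, Province, Town and Street "
        "(name only, no number) as a JSON object.\n"
        f"Hospital name: {name}\n"
        f"Address: {address}"
    )
```

Keeping the URL building and response parsing separate from the HTTP call makes it possible to test the parsing on saved responses without spending API quota.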

 

 

This refined approach led to significantly improved results, as the addresses offered enough context for the LLM to extract location information accurately. With access to precise address data, the model was able to identify the “Autonomous Community”, “Province”, “Town”, and “Street” for each hospital.

 

When applied to the same dataset used in the previous attempt, the combination of the Google Maps API and the LLM produced accurate classifications in all eight cases, demonstrating the effectiveness of this combined method:

 

| Hospital Name | Google Maps Address | Autonomous Community | Province | Town | Street |
| --- | --- | --- | --- | --- | --- |
| Institut Català D’Oncologia – Hospital Duran i Reynals | Avinguda de la Granvia de l’Hospitalet, 199, 08908 L’Hospitalet de Llobregat, Barcelona, Spain | Cataluña | Barcelona | L’Hospitalet de Llobregat | Avinguda de la Granvia de l’Hospitalet |
| Centro de Rehabilitación Psicosocial Santo Cristo de los Milagros | Carretera de Arguis, Km 2.5, 22006 Huesca, Spain | Aragón | Huesca | Huesca | Carretera de Arguis |
| Instituto de Enfermedades Neurológicas de Castilla-La Mancha | P.º Estación, 2, 19001 Guadalajara, Spain | Castilla-La Mancha | Guadalajara | Guadalajara | P.º Estación |
| Casal de Curació | Carrer de Maria Vidal, 48, 08340 Vilassar de Mar, Barcelona, Spain | Cataluña | Barcelona | Vilassar de Mar | Carrer de Maria Vidal |
| Hospital Universitario de Jerez de la Frontera | Ctra. Trebujena, s/n, 11407 Jerez de la Frontera, Cádiz, Spain | Andalucía | Cádiz | Jerez de la Frontera | Ctra. Trebujena |
| Hospital Universitario Reina Sofía | Av. Menéndez Pidal, s/n, Poniente Sur, 14004 Córdoba, Spain | Andalucía | Córdoba | Córdoba | Av. Menéndez Pidal |
| Hospital Medimar Internacional | Avinguda de Dénia, 78, 03016 Alacant, Alicante, Spain | Comunidad Valenciana | Alicante | Alicante | Avinguda de Dénia |
| Clínica Planas | Carrer de Pere II de Montcada, 16, Sarrià-Sant Gervasi, Barcelona | Cataluña | Barcelona | Barcelona | Carrer de Pere II de Montcada |

 

 

Broader Applications and Implications

 

Whilst this case focuses on healthcare facilities, the underlying methodology is applicable to any dataset involving physical locations where only name-based data is available. Accurate location classification can offer significant value to governments, NGOs, and logistics providers across various domains. Here are some more examples:

 

  • In healthcare infrastructure planning, this method can be used to identify service gaps in hospital or clinic coverage, as well as to understand regional healthcare capacities.
  • In the education sector, it helps to map the distribution of schools and universities to assess educational equity across regions.
  • In logistics and supply chain design, this classification allows organisations to categorise warehouses and distribution centres in ways that optimise routes and ensure efficient regional coverage.
  • In emergency planning and response, it enables the quick identification and analysis of critical infrastructure locations, essential for effective action during natural disasters, health crises, or other large-scale emergencies.

 

This approach transforms place names into clean, structured location data that can be easily used by mapping tools, Geographic Information Systems (GIS), and regional datasets. It also serves as an input for dashboards, data visualisations, and advanced spatial analysis models.
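As a small illustration of how the structured output feeds downstream analysis, the sketch below counts the eight classified hospitals from our test by Autonomous Community and by Province. The tuple layout and variable names are our own choices for this example; in practice the records would more likely be held in a dataframe or a GIS layer.

```python
from collections import Counter

# (autonomous_community, province, town) for the eight hospitals classified above.
records = [
    ("Cataluña", "Barcelona", "L’Hospitalet de Llobregat"),
    ("Aragón", "Huesca", "Huesca"),
    ("Castilla-La Mancha", "Guadalajara", "Guadalajara"),
    ("Cataluña", "Barcelona", "Vilassar de Mar"),
    ("Andalucía", "Cádiz", "Jerez de la Frontera"),
    ("Andalucía", "Córdoba", "Córdoba"),
    ("Comunidad Valenciana", "Alicante", "Alicante"),
    ("Cataluña", "Barcelona", "Barcelona"),
]

# Facility counts at two levels of the hierarchy.
by_community = Counter(community for community, _, _ in records)
by_province = Counter(province for _, province, _ in records)

for community, count in by_community.most_common():
    print(f"{community}: {count}")
```

Counts like these can be joined to regional population or boundary datasets to highlight areas with little or no coverage.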

 

When formal addresses are unavailable, as with remote or restricted locations, latitude and longitude coordinates from the Google Maps API can be used to identify the area, and LLMs can infer hierarchical location details (like country or region) from these coordinates by recognising geographic patterns. However, relying solely on coordinates also presents limitations: although they provide precise points on the map, they lack semantic context, making it harder for models to consistently assign them to the correct geographic hierarchy. This can be especially problematic for locations near administrative boundaries, where even slight variations in coordinates might lead to misclassifications.
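One way to reduce this risk is to avoid asking the LLM to infer the hierarchy from raw coordinates, and instead read the structured components returned by reverse geocoding. The sketch below shows this alternative, which is not the LLM-based inference described above. It assumes the Geocoding API's `latlng` parameter and `address_components` field, and the mapping of component types to Spanish administrative levels is our assumption and should be verified against real responses.

```python
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

# Assumed mapping from Geocoding API component types to the Spanish hierarchy.
TYPE_TO_LEVEL = {
    "administrative_area_level_1": "autonomous_community",
    "administrative_area_level_2": "province",
    "locality": "town",
    "route": "street",
}


def build_reverse_geocode_url(lat: float, lng: float, api_key: str) -> str:
    """Build a reverse-geocoding request for a latitude/longitude pair."""
    params = {"latlng": f"{lat},{lng}", "key": api_key}
    return GEOCODE_URL + "?" + urlencode(params)


def hierarchy_from_components(components: list) -> dict:
    """Map a result's address_components onto the four hierarchy levels."""
    levels = {level: None for level in TYPE_TO_LEVEL.values()}
    for comp in components:
        for comp_type in comp.get("types", []):
            level = TYPE_TO_LEVEL.get(comp_type)
            # Keep the first match for each level.
            if level and levels[level] is None:
                levels[level] = comp.get("long_name")
    return levels
```

The function would be applied to `results[0]["address_components"]` of the JSON response; any level that the response does not contain stays `None`, which makes incomplete classifications easy to spot.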

 

 

Conclusion

 

LLM-based location classification offers a flexible, scalable way to standardise geographic information. With tools like the Google Maps API, we can overcome the limitations of partial, unstructured, or inconsistent input data and obtain more structured and accurate location data. When full address information is available, LLMs can extract detailed location hierarchies effectively: in our test, all eight hospitals were classified correctly at every level.

 

This method provides a practical way to transform raw location names into useful geographic data. As the integration of external databases and geospatial tools improves, the approach should become more accurate and reliable, which makes it a valuable tool for classifying locations.

 

Want to learn more about how we use AI to solve real-world data challenges? Interested in applying this method to your organisation’s infrastructure, planning, or location intelligence workflows? Here at ClearPeaks we work with customers across sectors to help them clean, structure, and get more value from their data using advanced AI and geospatial tools. Whether you have a clear use case or just want to explore what’s possible, we’d love to hear from you. Contact us today!

Carme V
carme.vinas@clearpeaks.com