Methodology — how we collect and verify the data.
This page describes plainly which sources we use, how often we refresh them, which checks we apply, what we cannot guarantee, and how we respond to error reports. If you spot a mistake or something is unclear, write to contact@hartafarmacii.ro.
1. Data sources
1.1 Pharmacy locations
The primary source for locations is OpenStreetMap (OSM), queried through the Overpass API with a filter on amenity=pharmacy within Romania's administrative boundaries. OSM data is distributed under the ODbL licence, which allows redistribution and use with attribution; we state that attribution explicitly in the footer of every page and in the public GeoJSON endpoint.
OSM data is supplemented with the public store-locators of the main chains (Catena, Dr. Max, Help Net, Dona, Farmacia Tei, Mattca, Spring Farma) — these are a secondary source for opening hours, phone numbers and chain identification. We use them to enrich, not to replace, the OSM coordinates.
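The OSM query described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the production pipeline: the endpoint URL (a public Overpass instance), the function names, the timeout values and the shape of the attribution field are all assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a public Overpass instance; any Overpass endpoint works the same way.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"
ATTRIBUTION = "© OpenStreetMap contributors, ODbL"


def build_query(country_iso: str = "RO", timeout_s: int = 180) -> str:
    """Overpass QL: every node/way/relation tagged amenity=pharmacy in the country."""
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'area["ISO3166-1"="{country_iso}"][admin_level=2]->.country;\n'
        'nwr["amenity"="pharmacy"](area.country);\n'
        "out center;"
    )


def fetch_elements(query: str) -> list:
    """POST the query and return the raw Overpass elements (network call)."""
    body = urllib.parse.urlencode({"data": query}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=body, timeout=300) as resp:
        return json.load(resp)["elements"]


def to_feature(element: dict):
    """Convert one Overpass element to a GeoJSON point feature, or None."""
    # Nodes carry lat/lon directly; ways and relations carry a computed "center".
    point = element if element["type"] == "node" else element.get("center")
    if not point or "lat" not in point or "lon" not in point:
        return None
    properties = {"osm_id": f'{element["type"]}/{element["id"]}'}
    properties.update(element.get("tags", {}))
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [point["lon"], point["lat"]]},
        "properties": properties,
    }


def to_geojson(elements: list) -> dict:
    """Wrap the converted features and carry the ODbL attribution with the data."""
    features = [f for f in map(to_feature, elements) if f is not None]
    return {"type": "FeatureCollection", "attribution": ATTRIBUTION, "features": features}
```

Calling to_geojson(fetch_elements(build_query())) would produce the collection; chain store-locator data would then be merged into the feature properties without touching the coordinates.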
1.2 Product prices
Prices come from public online-pharmacy websites. The preferred mechanism is parsing JSON-LD schema.org/Product blocks, a structured format that most retailers publish to ease indexing in Google Shopping. Where JSON-LD is unavailable, we use site-specific HTML parsers built on CSS selectors.
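To illustrate the JSON-LD route, here is a minimal sketch using only the Python standard library. The class and function names are hypothetical, and the handling of "@graph" wrappers and offer lists reflects common schema.org usage rather than any particular retailer's markup.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: skip it and fall back to HTML parsing


def iter_nodes(block):
    """A block may be one object, a list, or an object wrapping an "@graph" list."""
    if isinstance(block, list):
        for item in block:
            yield from iter_nodes(item)
    elif isinstance(block, dict):
        yield block
        yield from iter_nodes(block.get("@graph", []))


def extract_offers(html: str) -> list:
    """Return name, GTIN, price and currency for every Product found in the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    results = []
    for block in collector.blocks:
        for node in iter_nodes(block):
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Product" not in types:
                continue
            offers = node.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            if not isinstance(offers, dict):
                offers = {}
            results.append({
                "name": node.get("name"),
                "gtin": node.get("gtin13") or node.get("gtin"),
                "price": offers.get("price"),
                "currency": offers.get("priceCurrency"),
            })
    return results
```

An empty result from extract_offers is the signal to fall back to the site-specific CSS-selector parser.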
Chains we currently extract prices from: Catena, Dr. Max (with the Richter sub-brand), Farmacia Tei, Help Net, Dona (when our crawler isn't blocked), Mattca, Spring Farma, plus independent online pharmacies (Minifarm, Farmacia Dav, Farmacia Ardealul). Effective coverage varies over time, depending on the availability of the public feeds.
1.3 Rx references (CANAMED)
For prescription drugs (Rx) we use the CANAMED list as a reference — the maximum retail price approved by the Ministry of Health, published monthly on the ministry's website. CANAMED is not a sale price — it is the cap above which a pharmacy cannot sell. We display it strictly as a benchmark. The list is imported monthly (systemd timer on the VPS).
2. Refresh cadence
- Locations: checked against OSM daily; our index is updated only when a diff detects changes. Locations are very stable (pharmacies don't move daily).
- Opening hours: updated manually on report or when a chain publishes an update on its store-locator. For 24/7 pharmacies we have a dedicated page that clearly marks when we cannot confirm 24/7 status.
- Prices: cadence varies per chain. Some scrapers run daily (sites with readily available JSON-LD that don't aggressively block crawlers), others weekly (Catena/Tei on listing pages). The last-update timestamp is stored per offer in the database and exposed on the product page.
- CANAMED: monthly import.
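The diff step for locations can be pictured as comparing two snapshots keyed by OSM id. The sketch below is an assumption about how such a comparison might be done; the snapshot layout (a dict from OSM id to tag dict) and the function names are hypothetical.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two {osm_id: properties} snapshots and list what changed."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


def has_changes(diff: dict) -> bool:
    """True when at least one pharmacy was added, removed or modified."""
    return any(diff.values())
```

With this shape, the daily job would rewrite the index only when has_changes returns True.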
3. Data quality
3.1 Product ↔ price matching
The hardest part of a pharmaceutical price comparator is matching: deciding with confidence that "Nurofen 200mg pack of 24" on one site is the same product as "NUROFEN 200 MG 24 FILM-COATED TABLETS" on another site. We use three signals in order of trust:
- GTIN / EAN-13 code — when present in JSON-LD, this is a deterministic match.
- INN (International Non-proprietary Name) + strength + pharmaceutical form + pack size — used as secondary match.
- Normalised brand name + pack size — fallback when the first two are missing.
When two sources disagree on pack details (e.g. one says "24 tabs", the other "24 capsules"), we raise a collision flag: offers stay separate until manual review. We prefer a false negative (two duplicate entries) over a false positive (wrong price displayed under the wrong product).
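The three-tier matching and the collision flag can be sketched as below. This is an illustration under stated assumptions, not the actual matcher: the Offer fields, the normalisation rule, and the choice to treat differing GTINs as a non-match and a form mismatch as the collision trigger are all hypothetical simplifications.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    brand: str
    pack_size: Optional[int] = None
    gtin: Optional[str] = None
    inn: Optional[str] = None
    strength: Optional[str] = None
    form: Optional[str] = None


def norm(text: Optional[str]) -> Optional[str]:
    """Lowercase and drop everything except letters and digits ("200 MG" -> "200mg")."""
    return re.sub(r"[^a-z0-9]+", "", text.lower()) if text else None


def match(a: Offer, b: Offer) -> tuple:
    """Return (decision, signal); decision is "match", "no_match" or "collision"."""
    # Tier 1: GTIN / EAN-13 is deterministic when both sides publish it.
    if a.gtin and b.gtin:
        return ("match" if a.gtin == b.gtin else "no_match", "gtin")

    # Tier 2: INN + strength + form + pack size, used only when all are known.
    key_a = (norm(a.inn), norm(a.strength), a.pack_size)
    key_b = (norm(b.inn), norm(b.strength), b.pack_size)
    if None not in key_a and None not in key_b and a.form and b.form:
        if key_a != key_b:
            return ("no_match", "inn")
        if norm(a.form) == norm(b.form):
            return ("match", "inn")
        # Same substance, strength and count but a different form:
        # keep the offers separate and queue them for manual review.
        return ("collision", "inn")

    # Tier 3: normalised brand name + pack size as a last resort.
    brand_a = norm(a.brand)
    if brand_a and brand_a == norm(b.brand) and a.pack_size is not None \
            and a.pack_size == b.pack_size:
        return ("match", "brand")
    return ("no_match", "brand")
```

Returning the signal alongside the decision makes it possible to show or audit how confident each merge was; only "match" results would be merged, so a collision never puts a price under the wrong product.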
3.2 Automatic checks
- Price within a reasonable range (RON 0.10 – 5,000) — out-of-range prices are flagged and not displayed.
- Detection of identical prices across multiple sources → usually indicates an upstream error which we flag.
- Heuristics that detect when a site has changed structure (sudden 0% match rates) → alert.
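The three checks above could look roughly like this. The price bounds come from the list; everything else (function names, the minimum of three sources for the identical-price check, the 5% floor for a "healthy" previous match rate) is an assumed value chosen for illustration.

```python
from collections import Counter
from decimal import Decimal

MIN_PRICE = Decimal("0.10")  # RON, lower bound from the range check
MAX_PRICE = Decimal("5000")  # RON, upper bound from the range check


def price_in_range(price: Decimal) -> bool:
    """Out-of-range prices are flagged and hidden rather than displayed."""
    return MIN_PRICE <= price <= MAX_PRICE


def suspicious_identical(prices_by_source: dict, min_sources: int = 3) -> set:
    """Prices repeated across at least min_sources sources (threshold is assumed)."""
    counts = Counter(prices_by_source.values())
    return {price for price, n in counts.items() if n >= min_sources}


def structure_changed(matched: int, fetched: int, previous_rate: float,
                      healthy_floor: float = 0.05) -> bool:
    """Alert when the match rate drops to 0% after a previously healthy run."""
    rate = matched / fetched if fetched else 0.0
    return rate == 0.0 and previous_rate > healthy_floor
```

Using Decimal instead of float avoids rounding surprises at the range boundaries, for example when a price is exactly RON 0.10.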
4. Limitations declared openly
- The displayed price may differ from the price in the physical pharmacy. The prices we scrape come from chains' online stores, which aren't always in sync with the on-shelf price in a physical store. Before purchasing, verify the price directly at the pharmacy.
- Coverage is not 100%. Some chains (currently Dona, Ardealul) block crawlers or require Cloudflare challenges, which reduces coverage. We mark visibly when we have no data for a chain on a given product.
- Reimbursed Rx prices — we don't show them. The actual price a patient pays for a reimbursed drug depends on prescription status (50% reimbursement, 90%, free for some conditions), the prescribed INN, and the dispensing pharmacy. These prices are not fixed public prices — showing one figure would be misleading. CANAMED is the only absolute public benchmark, and that's what we display.
- Incomplete data for small pharmacies. For independent pharmacies not listed in OSM and without a website with a store-locator, data may be missing.
- Latency to appearance. A new pharmacy appears in our index after it enters OSM — usually a few weeks. We can speed this up on report.
5. Logo policy
We use chain logos under descriptive fair use — for brand identification in comparison lists and on chain pages. We do not imply endorsement, partnership, or authorisation. If a chain requests takedown at contact@hartafarmacii.ro, we remove the logo within 48h and replace it with a neutral text placeholder.
6. AI policy
HartaFarmacii uses AI only in assistive pipelines, not for publishing auto-generated content without review. Specifically:
- Editorial articles (patient guides, symptom and condition pages) are written by hand from cited public sources (ANMDMR, EMA, patient leaflets). A language model may assist with fluency, but every medical fact is verified against the primary source before publication.
- We do not publish mass-generated, "spun" SEO content from LLMs without review.
- Common Crawl / scraped pages are not republished on HartaFarmacii — we use data only to compute displayed prices.
7. Conflict of interest
Stated explicitly: zero conflict of interest. We are not affiliated, we don't receive sales commissions, and we don't have contracts with the listed chains. Revenue comes 100% from:
- contextual Google AdSense (web) and Google AdMob (iOS) ads, shown only after consent;
- an iOS "Premium Lifetime" in-app purchase (RON 4.99, one-time payment) that disables ads.
No revenue source conditions price display order or which article appears on the site. If you suspect a conflict, write to us and we'll clarify.
8. How to report an error
Single channel: contact@hartafarmacii.ro. Turnaround times:
- For urgent corrections (wrong price on an indexed page) — response in 24-48h.
- For non-urgent corrections (incorrect schedule, invalid phone, wrong chain) — corrected within 7 working days.
- For logo / pharmacy takedown — 48h.
Internal process: we receive the email → confirm in 24h → verify the source (usually back at the original public source) → apply the correction in the database → confirm the result by reply.
9. Versions and lifecycle
This methodology page was last reviewed on 2026-05-01 and is updated whenever we add a new source, change a refresh cadence, or discover a new limitation we want to declare. The full code repository is private but open to audit on request.