You have 3,000 people on your website right now. Two of them are ready to buy. Your Google Analytics dashboard will never tell you which two.
This is the anonymous traffic problem. 97% of B2B visitors never fill out a form. Your best-fit prospects browse your pricing page, check your integrations, maybe scroll through a case study, and then leave. By the time your SDRs see a lead, those visitors are three days deep into evaluating a competitor.
The fix isn't another form. It's visitor identification that runs in real time, paired with an AI chat that can tell the difference between a student doing research and a VP of Sales about to sign a contract.
This post walks through exactly how that works. I'll show you the real architecture: how we identify a visitor in under 100 milliseconds, what our AI chat does before it says hello, and the 4 actions it can take once the visitor is identified. No marketing abstractions. A real trace.
How to identify website visitors: the basic mechanics
Website visitor identification means resolving an anonymous browser session into a known company or person. There are three data paths, and a good inbound agent uses all of them.
- IP-to-company resolution. Every visitor has an IP address. Services like Clearbit, 6sense, and Warmly's own reverse-lookup graph map that IP to a company. Accuracy is roughly 60-80% depending on the vendor and the ISP. Consumer ISPs (Comcast, Verizon residential) are useless. Corporate networks are gold.
- Cookie stitching. If the visitor has been to any other site in your identity provider's network, they have a cookie. The provider (LiveIntent, FiveByFive, RB2B, and a few others) returns a hashed email. You enrich that into a full person record.
- First-party capture. When someone fills a form, provides an email in chat, or clicks an email link with a tracking parameter, you capture them directly and backfill their session history.
Most vendors only do one of the three. Single-source identification caps out around 40% visitor coverage. Stacking all three gets you into the 70-80% range at the company level and 30-50% at the person level. Those are the real numbers. Anyone quoting higher is lying or counting wrong.
What happens when a visitor lands on your site
Here's the actual sequence when someone loads your pricing page. Every number below is measured off our production pipeline.
Milliseconds 0-100: Identify
The visitor loads the page. A tiny JavaScript tag (gzipped under 20KB) fires to our session server, opens a WebSocket, and creates a session record. Metrics get tagged with OpenTelemetry for tracing.
Our backend runs an IP-to-company lookup against a waterfall of providers. The first hit wins. For this visitor, we get back acme-supply.com with confidence 0.94. (Fictional example; real traces live inside our customer workspaces.)
At the same moment, we check our cookie graph. Has this browser been identified on another Warmly-powered site in the last 90 days? Yes. We have an email on file. Now we have a person, not just a company.
Total time: 87 milliseconds.
Milliseconds 100-400: Decide
Once identification lands, the session fires an onSignalHit event into a BullMQ Pro queue with exponential backoff and 3 retries. The inbound workflow trigger picks it up and runs the gates.
Gate 1: Domain blocklist. Is this domain on the customer's do-not-engage list? Competitors, existing customers they're already talking to, companies with a "do not contact" flag in Salesforce. If yes, exit immediately. Log domain_block_listed.
Gate 2: Data quality tolerance. Is the session's firmographic data within acceptable bounds? Missing company name, bogus IP geography, known bot user-agents all trigger rejection. Log data_quality_not_met.
Gate 3: Segment match. Does the visitor match any active workflow's audience rules? Tier 1 ICP, intent score above 150, on the pricing page, new hire signal in the last 30 days. If no workflow applies, the agent does nothing. Silence is a valid outcome.
This visitor passes all three gates. A workflow matches: "Tier 1 visitors on pricing page get immediate AI chat."
Milliseconds 400-2000: The AI chat starts
The inbound agent initializes an agentic conversation. We use LangChain's tool-calling agent pattern on top of OpenAI (GPT-4o-mini by default, with automatic escalation to a larger model for complex accounts). State is held in Redis with a 90-minute TTL so the conversation can resume across page loads.
Before the agent speaks, it pulls visitor context into the system prompt:
- Company name, industry, employee count, tech stack (from enrichment)
- ICP tier (Tier 1, Tier 2, etc.)
- Intent score breakdown (which signals are firing)
- Any prior conversations or email threads
- Current page path and URL parameters
- Organization-specific brand voice, product info, and qualification criteria
Armed with that context, the agent picks an opening line. Not a canned greeting. A specific one.
For our Acme Supply visitor, the opener reads: "Hey, saw you're looking at pricing. Quick heads up that we have a wholesale distribution starter plan that might fit better than what's on this page. Want me to pull it up?"
Not "Hi! How can I help you today?" That one is where AI chatbots go to die.
Milliseconds 2000+: The conversation loop
Each turn of the conversation runs up to 3 iterations of the tool-calling agent. Available tools include:
ask_question: send a message to the visitor
provide_info: answer a product question with grounded content
capture_email: qualify and identify the visitor by email
book_meeting: route to the right rep's calendar via LeanData or native routing
qualify_lead: score the lead against the customer's ICP rules
transfer_to_human: hand off to a live rep with full context
end_conversation: gracefully wrap up when the visitor is done
The agent streams tokens back to the widget via Socket.IO as it generates. The visitor sees the response word by word, not a "typing..." indicator that sits there for 4 seconds.
If the agent gets stuck or the LLM times out, a fallback message fires: "I'm having trouble right now. Let me connect you with a team member." That handoff is routed through the same rep-assignment logic a human qualification would trigger.
The 4 actions an inbound agent can take
This is the part of visitor identification most tools miss. Identifying the visitor is step one. The hard part is deciding what to do once you know who they are.
Our inbound workflow engine can execute four distinct actions, chosen based on visitor context and customer policy.
| Action |
What it does |
When it fires |
| Show popup |
Renders a targeted overlay with copy tailored to the visitor's segment |
Moderate intent, no prior engagement, customer prefers passive prompts |
| Send to webhook |
Posts the full session context to the customer's endpoint (Zapier, Workato, custom) |
Customer runs their own routing logic or wants to enrich a CDP |
| LeanData BookIt |
Pulls a calendar link from the customer's LeanData routing engine and renders a booking button or redirect |
High intent, Tier 1 account, customer uses LeanData |
| Assign to rep |
Matches the visitor to the right rep (based on territory, account ownership, round-robin) and opens chat with that rep's name and avatar |
High intent, known account owner, customer prefers human-in-the-loop |
Most "AI chatbot for website" tools only do one of these. They always open a chat. They always ask for an email. They always treat every visitor the same. That's the chatbot era. It was a mistake.
Why real-time matters
The difference between identifying a visitor in 100 milliseconds and identifying them in 5 seconds isn't cosmetic. It's the difference between starting a conversation and losing one.
B2B website sessions average 47 seconds. If your tool takes 5 seconds to identify, 5 more seconds to decide, and 5 more to load a chat bubble, you've used a third of the visit on plumbing. Half the visitors have already bounced. The ones who stay are staring at a chat popup that feels like a trap because it loaded suspiciously late.
Sub-second visitor identification changes the surface area of what's possible. You can personalize the hero section in real time. You can rewrite the pricing CTA for the specific company. You can send a Slack alert to the AE before the visitor has scrolled past the fold.
Most importantly: you can decide to do nothing. The most premium action is often restraint. A Tier 1 prospect reading a case study doesn't want a chat popup. They want to read. The right inbound agent knows that and waits.
Why most AI website chatbots don't work
Most "AI website chatbot" products fail for three reasons, and none of them are the LLM.
They don't actually identify visitors. They start talking to everyone the same way because they have no context to do otherwise. The "AI" is just a template engine with good grammar.
They aren't connected to real tools. The chatbot can answer product questions but can't book a meeting, trigger a webhook, check a CRM, or route to a rep. It's a brochure with a typing cursor.
They don't know when to stop. They ask for emails on page 1. They fire popups on every visit. They interrupt pricing-page reads. They treat engagement volume as the success metric instead of conversion quality.
An inbound agent is different because the chatbot is one tool out of many, not the whole product. The agent decides whether to chat, show a popup, send a webhook, pull a calendar, or stay silent. The LLM is the decision-maker, not the decoration.
Where our inbound agent still falls short
Spare you the "we pioneered" routine. Here's what we actually still get wrong.
The first 48 hours of a new deployment are rough. When we spin up a new customer, the agent doesn't yet know their brand voice, their objection patterns, or their product positioning in depth. Our onboarding pipeline ingests the customer's website, docs, and past chat transcripts, but the first two days of chats read a little generic. By day 3, the voice locks in. Day 1 feels like a competent junior AE. Day 7 feels like someone who works there.
Deeply technical product questions still trip us up. If a senior engineer asks about our rate-limit behavior on a specific webhook, the agent does the right thing and hands off to a human. That's the design. But there's a real gap between "can confidently answer 80% of prospect questions" and "replaces your solution engineer." We're in the first camp. Anyone selling you the second is selling you vapor.
Returning visitors who already got AI chat want to talk to a human. Our chat UX makes the handoff clear when a rep is online. When no rep is available, the fallback to "I'll get a human to follow up over email" feels worse than the first chat. We're working on better async handoffs. Not solved.
None of these are reasons to skip an inbound agent. They're reasons to set honest expectations about where it excels (the 80%) and where it doesn't (the long tail).
How to set this up
If you're building visitor identification into your B2B site, the rough order of operations:
- Start with one identification source. Pick the one most likely to work for your traffic mix. For B2B with lots of corporate IPs, use IP-to-company. For consumer-adjacent, use a cookie graph provider.
- Capture first-party data aggressively. Form fills, email clicks with tracking, chat capture. Every captured email enriches every future session on the same browser.
- Define segments before tooling. "Tier 1 account on pricing page" is a segment. "Someone who visited twice this week" is a segment. Map segments to actions before you pick a vendor.
- Pick a tool that supports all four action types. If it only does chat, you're buying a chatbot. Make sure it can popup, webhook, book, and assign.
- Measure conversion quality, not conversation volume. Number of meetings booked. Pipeline created. Close rate on identified-visitor-sourced deals. Chat volume is a vanity metric.
- Add the AI chat layer last. The agent is the top of the stack. Get identification and routing right first, then bolt on the conversational layer.
If you want to skip steps 1 through 4 and see the whole thing running on your own traffic, that's what we do at Warmly. Book 20 minutes with our team and we'll pull a live trace of your visitors during the call. Real IPs. Real companies. Real decisions.
Related reading
FAQ
How do you identify anonymous website visitors?
By stitching three data paths: IP-to-company resolution, cookie-based identity providers (LiveIntent, FiveByFive, RB2B, etc.), and first-party capture from forms, email links, and chat. Consensus across the three gets you roughly 70-80% coverage at the company level.
What is a reverse IP lookup?
Reverse IP lookup is the process of mapping a visitor's IP address to the company that owns it. Services like Clearbit Reveal, 6sense, and Warmly maintain databases of IP-to-company mappings. Accuracy depends heavily on the network: corporate office IPs hit 80%+, residential ISPs are essentially unusable.
What is an AI inbound agent?
An AI inbound agent is an autonomous software agent that identifies website visitors in real time, decides what action to take based on context (chat, popup, webhook, meeting booking, or nothing), and executes without waiting for a human to click a button. It's different from a chatbot because chatting is one of many tools it can use, not the only tool.
How fast can you identify a website visitor?
Sub-100 milliseconds for the identification itself (IP lookup plus cookie stitching). Most production systems run end-to-end from page load to decision in 400-2,000 milliseconds. If your tool takes 5+ seconds, the visitor is already scrolling away.
What's the difference between a popup and an AI chat?
A popup is a one-way interruption. An AI chat is a two-way conversation. An agentic inbound system can use either, depending on context. High-intent visitors get chat. Moderate-intent visitors sometimes do better with a targeted popup. Low-intent visitors often get nothing at all.
Can AI website chatbots actually book meetings?
Yes, when they're integrated with a routing engine like LeanData or the customer's native CRM. The chatbot qualifies the visitor, pulls the right rep's calendar link via API, and renders a booking button inline. The handoff is seamless. The rep sees the full conversation context when the meeting lands on their calendar.
Does website visitor identification work in a cookieless future?
Partially. IP-to-company resolution doesn't require cookies. First-party email capture doesn't require cookies. What breaks in a cookieless world is third-party cookie-based person-level identification, which is already degraded in most browsers. Company-level identification is durable. Person-level needs to move to first-party.
How does visitor identification integrate with outbound?
A well-designed inbound system writes back to the same context graph the outbound agent reads from. When an identified visitor leaves without converting, the outbound system picks them up and drops them into an email sequence or an ad audience. Inbound and outbound share state, not silos.
Last Updated: April 2026