Semantic Discovery

Enable AI agents to find products by intent and context, not just keywords.

Overview

Traditional e-commerce search relies on keyword matching: if a product doesn't contain the exact words a user types, it won't appear in results. Semantic Discovery instead matches on the meaning behind a query.

When a buyer agent asks for "shoes for running a marathon in the rain," semantic search interprets this as a request for waterproof performance running shoes, even if those exact words don't appear in any product description.

Hyperfold uses vector embeddings to represent both products and queries in a high-dimensional space where semantically similar items are close together.
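
To make "close together" concrete, here is a minimal sketch that ranks items by cosine similarity, a common closeness measure for embeddings. The source does not state which distance metric Hyperfold uses, so treat the metric as an assumption. The three-dimensional vectors are invented toy values, not real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: list[float], products: dict[str, list[float]]) -> list[str]:
    """Return product names ordered from most to least similar to the query."""
    return sorted(
        products,
        key=lambda name: cosine_similarity(query_vec, products[name]),
        reverse=True,
    )


# Toy vectors, invented for illustration only.
query = [0.9, 0.8, 0.1]  # stands in for "shoes for running a marathon in the rain"
products = {
    "desk_lamp": [0.1, 0.0, 0.9],
    "waterproof_runner": [0.8, 0.9, 0.2],
}
ranking = rank_by_similarity(query, products)
```

In this toy data the running shoe points in nearly the same direction as the query, so it ranks ahead of the lamp.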

Vector Embeddings

Every product in your catalog is transformed into a dense vector representation that captures its semantic meaning:

json
{
  "product_id": "prod_aero_x2",
  "name": "AeroRun X2 Marathon Shoe",
  "description": "Professional running shoe with Gore-Tex waterproofing",
  "semantics": {
    "category": "apparel/footwear/running",
    "usage_context": ["marathon", "trail", "wet_conditions"],
    "visual_tags": ["blue", "reflective", "mesh_upper", "chunky_sole"],
    "attributes": {
      "weight_g": 280,
      "heel_drop_mm": 8,
      "waterproof": true,
      "breathable": true
    },
    "vibe_tags": ["professional", "performance", "outdoor", "serious_runner"]
  },
  "embedding": [0.023, -0.156, 0.891, ...], // 1536-dim vector
  "similar_products": ["prod_storm_gt", "prod_trail_king"],
  "frequently_bought_with": ["prod_socks_wp", "prod_insoles_pro"]
}

How Embeddings Are Generated

When you import a product, Hyperfold uses Vertex AI's multimodal embedding model to:

  1. Analyze product images to extract visual features (color, style, shape)
  2. Process text descriptions to understand attributes and use cases
  3. Combine visual and textual features into a unified embedding
  4. Store the embedding in Vertex AI Vector Search for fast retrieval
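
Step 3 can be pictured with a small sketch. The source does not describe how visual and textual features are combined, and a multimodal model may do this internally. The weighted average below is only one simple, commonly used fusion approach, and `fuse_embeddings` and `image_weight` are hypothetical names.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so neither modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse_embeddings(
    image_vec: list[float],
    text_vec: list[float],
    image_weight: float = 0.5,  # hypothetical tuning knob, not a Hyperfold setting
) -> list[float]:
    """Weighted average of two unit-length vectors, re-normalized to unit length."""
    if len(image_vec) != len(text_vec):
        raise ValueError("image and text vectors must share a dimension")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")
    img = l2_normalize(image_vec)
    txt = l2_normalize(text_vec)
    mixed = [image_weight * i + (1.0 - image_weight) * t for i, t in zip(img, txt)]
    return l2_normalize(mixed)


# Toy 2-dimensional inputs, invented for illustration only.
fused = fuse_embeddings([1.0, 0.0], [0.0, 1.0])
```

Normalizing before and after the average keeps the result on the unit sphere, which is convenient when retrieval compares vectors by angle.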

The following comparison shows where keyword matching misses relevant products that semantic search finds:

text
# Traditional Keyword Search
Query:   "blue running shoes waterproof"
Results: Exact matches for "blue" AND "running" AND "shoes" AND "waterproof"
Problem: Misses "navy marathon trainers with Gore-Tex"

# Semantic Vector Search
Query:   "shoes for running a marathon in the rain"
Results: Products matching the *intent* and *context*
Finds:   "AeroRun X2" (waterproof, marathon-optimized)
         "StormRunner GT" (wet-condition specialty)
         "TrailKing WP" (outdoor, water-resistant)

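The keyword failure mode in the comparison above can be reproduced in a few lines. This is a minimal sketch of strict AND matching on whole words, not a description of any particular search engine. The first product title is invented for the example; the second comes from the comparison.

```python
def keyword_match(query: str, title: str) -> bool:
    """Return True only if every query word appears as a whole word in the title."""
    title_words = set(title.lower().split())
    return all(word in title_words for word in query.lower().split())


query = "blue running shoes waterproof"
titles = [
    "blue waterproof running shoes",         # invented title that shares all four words
    "navy marathon trainers with Gore-Tex",  # relevant, but shares no query words
]
matches = [title for title in titles if keyword_match(query, title)]
```

Only the first title survives, even though the second describes a product a buyer would likely accept. Embedding-based retrieval is meant to close this gap.
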
Testing Semantic Search

Use the CLI to test how your catalog responds to natural language queries:

bash
# Test semantic search via CLI
$ hyperfold search "cozy jacket for a rainy wedding"
> [Vector] Generating embedding for query...
> [Search] Querying Vertex AI Vector Search...
> [Results] 5 products found (semantic confidence: 0.91)

RANK  PRODUCT                 CONFIDENCE  PRICE
1     Elegant Rain Trench     0.94        $189
2     Waterproof Blazer       0.91        $245
3     All-Weather Sport Coat  0.88        $165
4     Classic Raincoat        0.85        $129
5     Water-Resistant Parka   0.79        $99

> [Insight] Top results emphasize "formal" + "waterproof"
> Query interpreted as: formal event, wet weather

Search Configuration

Tune search behavior for your use case:

Parameter        Default  Description
min_confidence   0.7      Minimum similarity score to include in results
max_results      10       Maximum number of products to return
diversify        true     Reduce duplicate categories in results
boost_in_stock   true     Prioritize available inventory
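
As a rough illustration of how these four parameters could interact, the sketch below applies them as a post-processing step over scored hits. The parameter names and defaults come from the table. `Hit`, `apply_config`, `per_category_cap`, the order in which the settings are applied, and the product categories are all assumptions made for the example, not documented Hyperfold behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchConfig:
    min_confidence: float = 0.7  # defaults taken from the table above
    max_results: int = 10
    diversify: bool = True
    boost_in_stock: bool = True


@dataclass(frozen=True)
class Hit:
    name: str
    category: str  # hypothetical field used for diversification
    confidence: float
    in_stock: bool


def apply_config(hits: list[Hit], config: SearchConfig, per_category_cap: int = 2) -> list[Hit]:
    """Filter, order, diversify, and truncate hits (one possible interpretation)."""
    kept = [h for h in hits if h.confidence >= config.min_confidence]
    if config.boost_in_stock:
        # In-stock items first, then by descending confidence.
        kept.sort(key=lambda h: (not h.in_stock, -h.confidence))
    else:
        kept.sort(key=lambda h: -h.confidence)
    if config.diversify:
        seen: dict[str, int] = {}
        diverse = []
        for h in kept:
            if seen.get(h.category, 0) < per_category_cap:
                diverse.append(h)
                seen[h.category] = seen.get(h.category, 0) + 1
        kept = diverse
    return kept[: config.max_results]


hits = [
    Hit("Elegant Rain Trench", "coats", 0.94, True),
    Hit("Classic Raincoat", "coats", 0.85, False),
    Hit("Water-Resistant Parka", "coats", 0.79, True),
    Hit("Waterproof Blazer", "blazers", 0.91, True),
    Hit("Umbrella", "accessories", 0.55, True),
]
default_names = [h.name for h in apply_config(hits, SearchConfig())]
```

With the defaults, the umbrella falls below `min_confidence`, and the out-of-stock raincoat is dropped because two in-stock coats already fill the assumed per-category cap.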

Catalog Enrichment

Marketing-heavy product descriptions are optimized for humans, not AI agents. The catalog optimize command rewrites descriptions to be fact-dense and machine-readable:

bash
# Optimize catalog for agent readability
$ hyperfold catalog optimize --target="gpt-4o-buyer"
> [Job] Started Batch Job: job_opt_221
> [Progress] Processed 150/1500 SKUs...

# Example transformation:
BEFORE: "Our amazing jacket will keep you dry!
         Perfect for ANY occasion! Buy now!
         ⭐⭐⭐⭐⭐"
AFTER:  "Water-resistant blazer. Fabric: 65% wool,
         35% polyester with DWR coating. Weight: 450g.
         Suitable for: formal events, light rain.
         Care: Dry clean only."

> [Diff] SKU 102: Removed 12 marketing phrases
> [Diff] SKU 102: Added explicit weight (450g) and fabric composition
> [Diff] SKU 102: Added care instructions
> [SUCCESS] Optimization complete. Agent readability +47%

Catalog optimization creates a parallel "agent-readable" version of your descriptions. Your original marketing copy remains unchanged for human-facing channels.
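
One way to picture the parallel-copy idea is a record that holds both versions and picks one per channel. This is a sketch under assumptions: `ProductCopy`, `description_for`, and the `"agent"` channel name are hypothetical and do not reflect Hyperfold's actual storage model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductCopy:
    sku: str
    marketing_description: str               # original copy, never overwritten
    agent_description: Optional[str] = None  # filled in by an optimization pass


def description_for(copy: ProductCopy, channel: str) -> str:
    """Serve the agent-readable text to agents when it exists, otherwise the original."""
    if channel == "agent" and copy.agent_description:
        return copy.agent_description
    return copy.marketing_description


blazer = ProductCopy(
    sku="102",
    marketing_description="Our amazing jacket will keep you dry! Perfect for ANY occasion! Buy now!",
    agent_description=(
        "Water-resistant blazer. Fabric: 65% wool, 35% polyester with DWR coating. "
        "Weight: 450g. Suitable for: formal events, light rain. Care: Dry clean only."
    ),
)
```

Falling back to the marketing copy means SKUs that have not been optimized yet still return a description on every channel.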

Vibe Matching

Beyond explicit attributes, Hyperfold captures the intangible "vibe" of products: the aesthetic, mood, and lifestyle they represent.

Query: "minimalist scandinavian desk lamp"

Matches products with clean lines, neutral colors, and a modern aesthetic, even if they're not explicitly tagged "scandinavian."

Query: "cozy autumn vibes sweater"

Matches chunky knits in warm colors (rust, mustard, forest green), reflecting the seasonal aesthetic the query implies.

Vibe tags are automatically generated during product import using multimodal analysis of images and descriptions.

Ready to import products? See the Product Import guide.